Monarch geneset OGS2.0

DPOGS210214
Transcript: DPOGS210214-TA (2763 bp)
Protein: DPOGS210214-PA (920 aa)
Genomic position: DPSCF300196: 670548-686508
RNAseq coverage: 1549x (Rank: top 8%)
Annotation
Heliconius: HMEL015776 | 0.0 | 59.34%
Bombyx: BGIBMGA002539-TA | 0.0 | 86.13%
Drosophila: Vha100-1-PE | 0.0 | 68.52%
EBI UniRef50: UniRef50_Q0IFY3 | 0.0 | 70.37% | Vacuolar proton atpases n=49 Tax=Metazoa RepID=Q0IFY3_AEDAE
NCBI RefSeq: XP_002054489.1 | 0.0 | 70.01% | GJ22780 [Drosophila virilis]
NCBI nr blastp: gi|307213911 | 0.0 | 73.23% | Vacuolar proton translocating ATPase 116 kDa subunit a isoform 1 [Harpegnathos saltator]
NCBI nr blastx: gi|307213911 | 0.0 | 73.23% | Vacuolar proton translocating ATPase 116 kDa subunit a isoform 1 [Harpegnathos saltator]
Group
Gene Ontology:
 GO:0015991 | 0 | ATP hydrolysis coupled proton transport
 GO:0033177 | 0 | proton-transporting two-sector ATPase complex, proton-transporting domain
 GO:0015078 | 0 | hydrogen ion transmembrane transporter activity
KEGG pathway: dmo:Dmoj_GI22777 | 0.0
 K02154 (ATPeVI, ATP6N1A) maps-> Collecting duct acid secretion
    Oxidative phosphorylation
    Lysosome
    Phagosome
    Vibrio cholerae infection
    Epithelial cell signaling in Helicobacter pylori infection
InterPro domain: [2-920] IPR002490 | 0 | ATPase, V0/A0 complex, 116kDa subunit
Orthology group: MCL10092 | Multiple-copy universal gene

Nucleotide sequence:

>DPOGS210214-TA
ATGGGGTCGTTATTTCGAAGTGAGGAAATGACTCTGTGTCAACTATTTCTGCAGAGTGAAGCAGCCTATGCATGCGTGTCTGAACTTGGTGAACTGGGGCTTGTTCAGTTCCGCGATTTGAACCCAGACGTGAACGCGTTCCAACGGAAGTTCGTCAATGAAGTGCGCCGTTGCGATGAGATGGAGAGGAAGCTGCGCTACCTGGAGAAGGAGATCAGGAGAGACGGCATCCCCATGCTGGAGATACCGGGGGAGGTGCCCGAGGCGCCGCAGCCCAGGGAGATGATCGACCTTGAGGCTACGTTTGAGAAACTCGAAAATGAACTCCGGGAAGTCAATCAGAACGCTGAGGCGTTGAAGAGGAACTACTTGGAACTGACGGAGCTGAAGCACATACTGAGGAAGACGCAGGTGTTCTTCGACGAGATGGCGGACCCGTCGCGGGAGGAGGAACAAGTCACCCTCCTGGGGGAGGAGGGGCTGATGGCGGGAGGGCAAGCGCTCAAGCTGGGGTATGCAGTATATATAAGTGCTATCAGCTACGGACCGCTCCCGACTCCTGCGACCGCTCGGTCAGGGCCGGTCCTAGCTCGGTATTATTTTTGCAGCGCTTGTACTGTGACGCTGATACGGGAGTCTACCGGTCACCACTCAGGCAAGCATTGGAGACAGCCGCACGAGGGTGGTGCCAATACCACTGAGTCAATGACCCGGGCTCTGATATCCGACGATCCGAACAGACATATGGGACAGGTCCAACTAGGTTTCGTTGCCGGAGTTATTCTCCGTGAGAGAATTCCTGCCTTTGAGCGTATGCTGTGGCGTGCGTGTCGCGGTAACGTCTTCCTGAGGCAGGCCGAGATCGACACGCCTCTAGAGGACCCGTCATCGTCGGACCAGGTGTACAAGTCGGTGTTCATAATCTTCTTCCAAGGAGACCAGCTCAAGACCCGCGTGAAGAAGATCTGTGAAGGTTTCCGCGCCACCTTATACCCGTGCCCGGAGGCTCCCGCCGACCGTCGGGAAATGGCCATGGGGGTCATGACCAGGATCGAAGATCTTAACACGGTGTTGGGTCAGACCCAGGACCACCGTCACCGCGTGTTGGTCGCCGCTGCCAAGAACATAAAGAACTGGTTCGTGAAGGTGCGCAAGATTAAGGCCATCTATCACACCTTGAACCTGTTCAACCTGGACGTGACCCAGAAGTGTCTCATCGCCGAGTGCTGGGTCCCCGCCCTGGACATGGAGACCATACAGTTGGCCCTACGGAGAGGAACGGAGCGCAGCGGCAGTTCGGTCCCGCCGATCCTGAACCGCATGGACACGTCCGAGCCGCCGCCGACCTACAACCGCACTAACAAGTTCACCTCCGCCTTCCAGCACCTCATATACGCCTACGGTGTCGCCACCTACCGGGAGGTCAACCCCGCTCCGTACACCATAATCACGTTTCCGTTCCTGTTCGCCGTGATGTTCGGTGACCTGGGTCACGGGGCGCTCATGGCCGCCTTCGGCTTCTGGATGTGTTACAAGGAGAAGCCGCTGCAGGCCAAGAGGATCGACAGCGAGATCTGGACCATCTTCTTCGGCGGGCGCTACATCATCTTGCTGATGGGCCTGTTCTCCATGTACACGGGCATCATCTACAACGACATCTTCTCTAAGAGTCTCAACATCTTCGGCTCCTCGTGGGTCAACAACTACAACGAGTCCACTCTCCTCACCAACAAGGACCTCCAGCTCAACCCCGACTCCGAGGACTACTTGCAGACGCCCTACCCCTTCGGCATAGATCCTGTGTGGCAGCTGGCGGAGGCTAACAAGATCATCTTCATGAACGCCTACAAGATGAAGATCTCCATCATCATCGGCGTCTTCCACATGTTGTTCGGAGTCTGCCTCTCGCTGTGGAACCATCTGTACTTCAAGCGCCGCATCTCGATATACGTGGAGTTCGTCCCTCAGATCTTTTTCCTCACGCTGCTGTTCTTCTACAT
GGTGCTGCTGATGTTCATCAAGTGGACCTCCTACGGCCCGACCCCCGGGCACTTCGGAGACGAGGCCTACGTGAAGACCAGCGGCTTCTGCGCGCCGTCCATCCTGATCACCTTCATCAACATGATGCTGTTCAAGACGGACGAGAACACGCGGCCGCAGTGCGACGACACCATGTACGCCGGACAGATAGGACTCCAGAAGCTGTTCGTCATACTGGCCCTGATGTGCGTGCCTGTGATGTTGTTCGGGAAGCCGTACTTCATCAGGAAGGAGCAGAAGTTACGCGCTGCGCAAGGTCACCAGAGCATCGAGGCGAGCGCTGAGAACGGCACGGCCGGCGGAGCGCCCGTCCCCGCTCACGACCACGGCGACGAGGACATCACCGAGGTGTTCATACACCAGGCCATCCACACCATCGAGTACGTGCTGGGGAGCGTCTCGCACACGGCGTCCTACCTGCGACTGTGGGCGCTGTCTCTGGCGCACGCTCAGCTGGCCGAGGTCGCCTGGAACATGTTGCTGAGGAAGGGTCTCATGTCTCCCAGCTACGAGGGCGGCATCTTCCTGTACATCGTGTTCGCGGGCTGGGCCGCCATCTCCGTCTCCATCCTGGTGCTGATGGAGGGCCTGTCCGCCTTCCTGCACACACTGCGTCTGCATTGGGTGGAGTTCCAGAGTAAGTTCTACGCGGGCGAGGGTTACCTCTTCATGCCGTTCTCGTTCGAGATCATTCTGGACTCGGCGGGTCAGGCCGAGGAGTAA

Protein sequence:

>DPOGS210214-PA
MGSLFRSEEMTLCQLFLQSEAAYACVSELGELGLVQFRDLNPDVNAFQRKFVNEVRRCDEMERKLRYLEKEIRRDGIPMLEIPGEVPEAPQPREMIDLEATFEKLENELREVNQNAEALKRNYLELTELKHILRKTQVFFDEMADPSREEEQVTLLGEEGLMAGGQALKLGYAVYISAISYGPLPTPATARSGPVLARYYFCSACTVTLIRESTGHHSGKHWRQPHEGGANTTESMTRALISDDPNRHMGQVQLGFVAGVILRERIPAFERMLWRACRGNVFLRQAEIDTPLEDPSSSDQVYKSVFIIFFQGDQLKTRVKKICEGFRATLYPCPEAPADRREMAMGVMTRIEDLNTVLGQTQDHRHRVLVAAAKNIKNWFVKVRKIKAIYHTLNLFNLDVTQKCLIAECWVPALDMETIQLALRRGTERSGSSVPPILNRMDTSEPPPTYNRTNKFTSAFQHLIYAYGVATYREVNPAPYTIITFPFLFAVMFGDLGHGALMAAFGFWMCYKEKPLQAKRIDSEIWTIFFGGRYIILLMGLFSMYTGIIYNDIFSKSLNIFGSSWVNNYNESTLLTNKDLQLNPDSEDYLQTPYPFGIDPVWQLAEANKIIFMNAYKMKISIIIGVFHMLFGVCLSLWNHLYFKRRISIYVEFVPQIFFLTLLFFYMVLLMFIKWTSYGPTPGHFGDEAYVKTSGFCAPSILITFINMMLFKTDENTRPQCDDTMYAGQIGLQKLFVILALMCVPVMLFGKPYFIRKEQKLRAAQGHQSIEASAENGTAGGAPVPAHDHGDEDITEVFIHQAIHTIEYVLGSVSHTASYLRLWALSLAHAQLAEVAWNMLLRKGLMSPSYEGGIFLYIVFAGWAAISVSILVLMEGLSAFLHTLRLHWVEFQSKFYAGEGYLFMPFSFEIILDSAGQAEE-
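
The transcript and protein records above appear to be related by a straightforward translation: 2763 bp is 3 x (920 + 1), that is, 920 codons plus one stop codon, and the protein sequence ends in "-" where the transcript ends in TAA. The following is a minimal sketch of that relationship, assuming the standard genetic code (NCBI translation table 1) and assuming that "-" is this record's stop symbol. The names `CODON_TABLE` and `translate` are illustrative and are not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence read in frame from its first base.

    Stop codons are written as `stop_symbol` (assumed to be "-" here,
    matching the trailing dash of the protein record).
    """
    cds = "".join(cds.split()).upper()  # drop line breaks inside the sequence
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 12 nucleotides of DPOGS210214-TA, copied from the record.
print(translate("ATGGGGTCGTTATTTCGAAGTGAGGAAATG"))  # MGSLFRSEEM
print(translate("GCCGAGGAGTAA"))                    # AEE-

# Length consistency: 920 residues plus one stop codon.
print(3 * (920 + 1) == 2763)                        # True
```

Running `translate` on the full nucleotide sequence (with the line break removed) and comparing the result with the DPOGS210214-PA entry is a quick way to check that a copied record has not been truncated.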