Monarch geneset OGS2.0

DPOGS201994
Transcript        DPOGS201994-TA  2535 bp
Protein           DPOGS201994-PA  844 aa
Genomic position  DPSCF300060 + 183559-193476
RNAseq coverage   6334x (Rank: top 2%)
Annotation
Heliconius        HMEL002628        0.0     82.22%
Bombyx            BGIBMGA010403-TA  0.0     79.04%
Drosophila        CG31075-PA        2e-179  64.77%
EBI UniRef50      UniRef50_Q7Q165   8e-173  59.07%  AGAP009944-PA n=3 Tax=Eukaryota RepID=Q7Q165_ANOGA
NCBI RefSeq       NP_001040198.1    0.0     79.04%  mitochondrial aldehyde dehydrogenase [Bombyx mori]
NCBI nr blastp    gi|114051966      0.0     79.04%  mitochondrial aldehyde dehydrogenase [Bombyx mori]
NCBI nr blastx    gi|114051966      0.0     79.04%  mitochondrial aldehyde dehydrogenase [Bombyx mori]
Group
Gene Ontology     GO:0008152  5.1e-180  metabolic process
                  GO:0055114  5.1e-180  oxidation-reduction process
                  GO:0016491  5.1e-180  oxidoreductase activity
                  GO:0016620  1.1e-68   oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor
KEGG pathway      tca:659438  0.0
                  K00128 (E1.2.1.3) maps->
                      1,2-Dichloroethane degradation
                      Arginine and proline metabolism
                      Glycolysis / Gluconeogenesis
                      Propanoate metabolism
                      Limonene and pinene degradation
                      Tryptophan metabolism
                      Lysine degradation
                      Valine, leucine and isoleucine degradation
                      Pyruvate metabolism
                      beta-Alanine metabolism
                      Fatty acid metabolism
                      3-Chloroacrylic acid degradation
                      Glycerolipid metabolism
                      Ascorbate and aldarate metabolism
                      Histidine metabolism
InterPro domain   [374-834]  IPR015590  5.1e-180  Aldehyde dehydrogenase domain
                  [366-843]  IPR016161  1.3e-179  Aldehyde/histidinol dehydrogenase
                  [8-268]    IPR016162  3.6e-107  Aldehyde dehydrogenase, N-terminal
                  [625-809]  IPR016163  1.1e-68   Aldehyde dehydrogenase, C-terminal
Orthology group   MCL10890  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201994-TA
ATGGCCAACGTAGAAATCAAATATACAAAGTTGTTTATCAACAATGAGTTCGTGGATGCCGTGAGCAAAAAAACATTCCCCACCATCAATCCTCAAGATGAATCCGTTATATGCCAAGTTGCTGAGGCCGATAAGGCTGATATTGACTTGGCAGTGTCTGCAGCTGTCAAGGCATTCCATCGTTACTCCGAATGGAGGAAGCTTGACGCGTCAGAAAGGGGCAGATTACTTCTCAAATTAGCTGACCTAATTGAGAGAGACGCCAACTATCTGGGTAAATTGGAAACATTGGACAATGGAAAACCAGTCGCCCAAGCCATTGGAGAGGCCATATGGTCCACTAATATAATAAGATATTATGCCGGAAAAGCTGATAAGATATTGGGAAACACTATTCCAGCAGACGGCGAAGTTCTCTCTATGACACTCAAGGAACCAGTTGGAGTTTGCGGTCAAATTCTACCATGGAATTACCCGATTCCGATGTTCGTGTGGAAAATAGCTCCAGCTCTTGCAGCAGGTTGCACAGTGGTTGTGAAACCGGCAGAACAAACTCCACTCACCGCTTTAGCTTTAGCTGCGTTGGTTAAAGAGGCTGGTTTTCCCCCCGGCGTGGTTAATGTTCTCCCCGGCTATGGACCTACAGCTGGCGCAGCCCTTACAAGCCATCCCCAGGTTGACAAAATGGCCTTCACTGGATCCACTGAAGTTGGCCGTATCATCATGAAAGGCGCTAGCGAGGTCAATCTGAAACGTGTCACCCTTGAATTGGGAGGAAAAAGCCCTTTGGTCGTCTTCAATGATGCTGATGTTGATAAGGCTGCCGAAATCGCTCATCGTGCTGCTTTCGCTAACGCTGGTCAATGTTGTGTTGCTGGTACAAGAACATACGTCCAATCTGGTATCTACGATAAGTTTGTTGCGAAAGCTGCTGAGATAGCTAAGAAGAGATCCGTTGGTAACCCCTACACAGATGTCCAACAGGGACCACAGTTGTTTATCAACAATGAGTTCGTGGATGCCGTGAGCAAAAAAACATTCCCTACCATCAATCCTCAAGATGAATCCGTTATATGCCAAGTTGCTGAGGCCGATAAGTTGTTTATCAACAATGAGTTCGTGGATGCCGTGAGCAAAAAAACATTCCCCACCATCAATCCTCAAGATGAATCCGTTATATGCCAAGTTGCTGAGGCCGATAAGGCTGATATTGACTTGGCAGTGTCTGCAGCTGTCAAGGCATTCCATCGTTACTCCGAATGGAGGAAGCTTGACGCGTCAGAAAGGGGCAGATTACTTCTCAAATTAGCTGACCTAATTGAGAGAGACGCCAACTATCTGGGTAAATTGGAAACATTGGACAATGGAAAACCAGTCGCCCAAGCCATTGGAGAGGCCATATGGTCCACAAATATAATAAGATATTATGCCGGAAAAGCTGATAAGATATTGGGAAACACTATTCCAGCAGACGGCGAAGTTCTCTCTATGACACTCAAGGAACCAGTTGGAGTTTGCGGTCAAATTCTACCATGGAACTACCCAATTCCGATGTTCGTGTGGAAAATAGCTCCAGCTCTTGCAGCAGGTTGCACAGTGGTTGTGAAACCAGCGGAACAAACTCCACTCACCGCTTTAGCTTTAGCTGCATTGGTTAAAGAGGCTGGTTTTCCCCCCGGCGTGGTTAATGTTCTTCCCGGCTATGGACCTACAGCTGGCGCAGCCCTTACAAGCCATCCCCAGGTTGACAAAATGGCCTTCACTGGATCCACTGAAGTTGGCCGTATCATCATGAAAGGCGCTAGCGAGGTCAACCTGAAACGTGTCACCCTTGAATTGGGAGGAAAAAGCCCTTTGGTCGTCTTCAATGATGCTGATGTTGATAAGGCTGCCGAAATCGCTCATCGTGCTGCTTTCGCTAACGCTGGTCAATGTTGTGTTGCTGGTACAAGAACATACGTCCAATCTGGTATCTACGATAAGTTTGTTGCGAAAGCTGC
TGAGATAGCTAAGAAGAGATCCGTTGGTAACCCCTACACAGATGTCCAACAGGGACCACAGATCGACGATGAAATGTTCACAAAAGTGATGGGTTACATCAACGCTGGTAAGAAGGAAGGCGCGAAGTGTGTTGCCGGAGGTGATCGACACGGTAAAGTCGGCTTCTTCGTTCAACCGACTGTGTTTGCTGACGTTACTGACAATATGAAAATAGCCAGAGAAGAAATCTTTGGTCCAGTTCAAAGCATCTTAAAATTCGAGACATTCGAGGAAGTCATTGATCGTGCGAACGATTCCAACTACGGTCTCGGAGCTGGAGTCATCACCAACGACATCACATTAGCTATGGCTTTTGTGAGACACGTCCGTGCTGGCAGTGTTTGGGTTAACACTTACGAGCACGTTGCCCCACAAACACCATTCGGCGGTTTTAGGGAATCGGGTATTGGTCGTGAATTGGGTGAGGATGGAATCATGCAGTACTTGGAGAACAAGACCGTTACAATTAATCTACCAAAAGCACCCATTGCATAA

Protein sequence:

>DPOGS201994-PA
MANVEIKYTKLFINNEFVDAVSKKTFPTINPQDESVICQVAEADKADIDLAVSAAVKAFHRYSEWRKLDASERGRLLLKLADLIERDANYLGKLETLDNGKPVAQAIGEAIWSTNIIRYYAGKADKILGNTIPADGEVLSMTLKEPVGVCGQILPWNYPIPMFVWKIAPALAAGCTVVVKPAEQTPLTALALAALVKEAGFPPGVVNVLPGYGPTAGAALTSHPQVDKMAFTGSTEVGRIIMKGASEVNLKRVTLELGGKSPLVVFNDADVDKAAEIAHRAAFANAGQCCVAGTRTYVQSGIYDKFVAKAAEIAKKRSVGNPYTDVQQGPQLFINNEFVDAVSKKTFPTINPQDESVICQVAEADKLFINNEFVDAVSKKTFPTINPQDESVICQVAEADKADIDLAVSAAVKAFHRYSEWRKLDASERGRLLLKLADLIERDANYLGKLETLDNGKPVAQAIGEAIWSTNIIRYYAGKADKILGNTIPADGEVLSMTLKEPVGVCGQILPWNYPIPMFVWKIAPALAAGCTVVVKPAEQTPLTALALAALVKEAGFPPGVVNVLPGYGPTAGAALTSHPQVDKMAFTGSTEVGRIIMKGASEVNLKRVTLELGGKSPLVVFNDADVDKAAEIAHRAAFANAGQCCVAGTRTYVQSGIYDKFVAKAAEIAKKRSVGNPYTDVQQGPQIDDEMFTKVMGYINAGKKEGAKCVAGGDRHGKVGFFVQPTVFADVTDNMKIAREEIFGPVQSILKFETFEEVIDRANDSNYGLGAGVITNDITLAMAFVRHVRAGSVWVNTYEHVAPQTPFGGFRESGIGRELGEDGIMQYLENKTVTINLPKAPIA-
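
The lengths given at the top of this record can be cross-checked against each other. The sketch below does that arithmetic under two assumptions that the record does not state: that the listed transcript is coding sequence only (it begins with ATG and ends with the stop codon TAA as displayed), and that the genomic coordinates are 1-based and inclusive. The variable and function names are illustrative and not part of any database API.

```python
# Values copied from the record header.
transcript_bp = 2535
protein_aa = 844
genomic_start, genomic_end = 183559, 193476


def codons_including_stop(cds_bp: int) -> int:
    """Return the number of codons in a CDS, counting the stop codon.

    Assumes the sequence is coding sequence only (no UTRs).
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3


codons = codons_including_stop(transcript_bp)

# The stop codon is not translated, so the protein is one residue shorter.
expected_aa = codons - 1

# Assumption: coordinates are 1-based and inclusive.
genomic_span = genomic_end - genomic_start + 1

print(f"codons (with stop): {codons}")
print(f"expected protein length: {expected_aa} aa (record lists {protein_aa} aa)")
print(f"genomic span: {genomic_span} bp")
```

Under these assumptions the 2535 bp transcript corresponds to 845 codons, which matches the listed 844 aa protein plus one stop codon. The genomic span of 9918 bp is larger than the transcript, as expected for a gene with introns.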