Monarch geneset OGS2.0

DPOGS208887
Transcript        DPOGS208887-TA  1446 bp
Protein           DPOGS208887-PA  481 aa
Genomic position  DPSCF300009 - 1072194-1081770
RNAseq coverage   3563x (Rank: top 3%)
Annotation
Heliconius      HMEL002628        2e-146  57.17%
Bombyx          BGIBMGA002457-TA  0.0     68.72%
Drosophila      CG31075-PA        1e-147  54.51%
EBI UniRef50    UniRef50_Q7Q165   5e-151  54.37%  AGAP009944-PA n=3 Tax=Eukaryota RepID=Q7Q165_ANOGA
NCBI RefSeq     XP_001864975.1    1e-162  60.33%  aldehyde dehydrogenase [Culex quinquefasciatus]
NCBI nr blastp  gi|170058564      2e-161  60.33%  aldehyde dehydrogenase [Culex quinquefasciatus]
NCBI nr blastx  gi|114051966      1e-160  59.88%  mitochondrial aldehyde dehydrogenase [Bombyx mori]
Group
Gene Ontology  GO:0008152  5.4e-175  metabolic process
               GO:0055114  5.4e-175  oxidation-reduction process
               GO:0016491  5.4e-175  oxidoreductase activity
               GO:0016620  1.2e-73   oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor
KEGG pathway  tca:659438  1e-156
              K00128 (E1.2.1.3) maps->
                  1,2-Dichloroethane degradation
                  Arginine and proline metabolism
                  Glycolysis / Gluconeogenesis
                  Propanoate metabolism
                  Limonene and pinene degradation
                  Tryptophan metabolism
                  Lysine degradation
                  Valine, leucine and isoleucine degradation
                  Pyruvate metabolism
                  beta-Alanine metabolism
                  Fatty acid metabolism
                  3-Chloroacrylic acid degradation
                  Glycerolipid metabolism
                  Ascorbate and aldarate metabolism
                  Histidine metabolism
InterPro domain  [5-475]    IPR016161  5.4e-175  Aldehyde/histidinol dehydrogenase
                 [16-472]   IPR015590  2.1e-172  Aldehyde dehydrogenase domain
                 [7-261]    IPR016162  6.5e-93   Aldehyde dehydrogenase, N-terminal
                 [262-447]  IPR016163  1.2e-73   Aldehyde dehydrogenase, C-terminal
Orthology group  MCL44364 Lepidoptera specific

Nucleotide sequence:

>DPOGS208887-TA
ATGGCTCCGCAAATTAAATATACGAAAATTTTTATCAACAATTCCTGGGTAGACTCGGTCAGTGGAAAGACATTCCAAACTATAAATCCTCACGATGGATCAGTCAATGCCGAGGCAGATGTGGATGCAGCTGTCGGAGCAGCTAAAAGTGCATTCCACCGCAACTCTGAATGGCGTCTGATGGACCCGTCGGAAAGAGTGAAGCTTTTGAACAAATGGGCTGATCTCGTAAATCGGGATATAGATTACCTTATAAAATTGGAAACATTAGATAACGGTATCGTGGTACAAACCAATCAAAGATTTATGTCAGTGGCTGTTAATGCTATACGTTACAACGCCAGTTGGGCTGATAAGATTCAAGGAACTACGATACCCGTGGACGGTGAAGCGTTTTCCTACACACTGAAGCAACCAGTTGGTGTATGCGCTATAATCATACCATGGAATGCGCCGGTCTTGTTTTTCTGCAGTAAAGTATCAGCGGCTTTAGCTGCAGGCTGCACCGTAGTAGTGAAGCCGGCAGAACAGACTCCTTTAACAGCGCTGGCGCTGGCTTCTCTGGTCGCGGAGGCTGGGATTCCACCAGGTGTTGTGAATGTGGTGCCTGGGTATGGGGAGACAGCAGGAGCGGCTCTAACACATCACCCTGATGTCGCACATATATCGTTCACGGGATCTTTACAGGTGGGTAAGATAATCCAACAGGCGGCAGGCGCCAACAATCTCAAGCGTGTCCAACTTGAGCTAGGCGGGAAAAGTCCTCTCGTTGTTATGAACGATGCAGACTTGGATGCTGCGGTGCAGTTTGCTGCTCTCGGGGTTTTTACCAATCAAGGACAAATGTGTATAGCTGCTTCCCGTCTTTTTGTGCAATCAGGAATTTACGACGAATTTGTTAAAAGAGCTTCCGAATTTGCAAAGAGTCTTGTTGTTGGTAAACCACTAGACCTCAAAACACAGCACGGTCCTCAGATTGATGAAAACTTAATGAATAGGGTGTTAGGTTACATCGAAAAAGGAGTATCCGAAGGTGCAAAGCTTTTGACTGGCGGAAAAAGAATTGGAAAAACTGGTTATTATGTTGAGCCTACCGTCTTTTCTGATGTCACGGATGATATGACCATCGCTGTAGAAGAAATTTTCGGTCCGGTCCAAAACATCTTAAAGTTCGAAACATTTGAAGAAGTTATTGAACGTGCTAACGCTACCAACTATGGTTTGGCGGCTGGGATATTTACAAGCTCTGTCGAAACTGCTCTACAGTTTAGCAAACATATTGAAGCAGGAATTGTTTGGGTGAATACTTATTTACATTTTGGAAGTCAGCTACCATTCGGTGGTTTCAAGGACTCCGGGATTGGCAGAGAAAATGGACCCAACGGAGTGGAAGCTTACTTGGAACTCAAAACAGTAATAATGAAACTTTCGAAGAAGTTGCAATAA

Protein sequence:

>DPOGS208887-PA
MAPQIKYTKIFINNSWVDSVSGKTFQTINPHDGSVNAEADVDAAVGAAKSAFHRNSEWRLMDPSERVKLLNKWADLVNRDIDYLIKLETLDNGIVVQTNQRFMSVAVNAIRYNASWADKIQGTTIPVDGEAFSYTLKQPVGVCAIIIPWNAPVLFFCSKVSAALAAGCTVVVKPAEQTPLTALALASLVAEAGIPPGVVNVVPGYGETAGAALTHHPDVAHISFTGSLQVGKIIQQAAGANNLKRVQLELGGKSPLVVMNDADLDAAVQFAALGVFTNQGQMCIAASRLFVQSGIYDEFVKRASEFAKSLVVGKPLDLKTQHGPQIDENLMNRVLGYIEKGVSEGAKLLTGGKRIGKTGYYVEPTVFSDVTDDMTIAVEEIFGPVQNILKFETFEEVIERANATNYGLAAGIFTSSVETALQFSKHIEAGIVWVNTYLHFGSQLPFGGFKDSGIGRENGPNGVEAYLELKTVIMKLSKKLQ-
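
Consistency check (illustrative sketch, not part of the original record): the transcript length and protein length listed above can be cross-checked by translating the coding sequence with the standard genetic code. The snippet assumes NCBI translation table 1 applies and that the trailing "-" in the protein record marks the stop codon; the function name `translate` and the variables `head` and `tail` are hypothetical names introduced here. Only the first 24 and last 12 nucleotides of DPOGS208887-TA are used, copied from the sequence above.

```python
# Standard genetic code (NCBI translation table 1), with codons
# enumerated in T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 24 nt and last 12 nt of DPOGS208887-TA, copied from the record.
head = "ATGGCTCCGCAAATTAAATATACG"
tail = "AAGTTGCAATAA"

print(translate(head))  # MAPQIKYT
print(translate(tail))  # KLQ*

# 481 residues plus one stop codon, three nucleotides each.
expected_bp = (481 + 1) * 3
print(expected_bp)  # 1446
```

The translated ends agree with the start (MAPQIKYT) and end (KLQ followed by the stop marker) of DPOGS208887-PA, and 482 codons account for the 1446 bp listed for the transcript.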