Monarch geneset OGS2.0

DPOGS202918
Transcript: DPOGS202918-TA (1653 bp)
Protein: DPOGS202918-PA (550 aa)
Genomic position: DPSCF300126 + 421541-425066
RNAseq coverage: 56x (Rank: top 69%)
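
The genomic position above can be compared with the transcript length. The sketch below assumes the field reads as scaffold, strand, and 1-based inclusive start-end coordinates. That reading is an assumption, not something the record states, and the variable names are illustrative only.

```python
import re

# Genomic position string as listed in the record.
position = "DPSCF300126 + 421541-425066"

# Assumed layout: <scaffold> <strand> <start>-<end>
match = re.fullmatch(r"(\S+) ([+-]) (\d+)-(\d+)", position)
scaffold, strand = match.group(1), match.group(2)
start, end = int(match.group(3)), int(match.group(4))

# Assuming 1-based inclusive coordinates, the locus spans this many bases.
genomic_span = end - start + 1

# Transcript length as listed in the record.
transcript_length = 1653

# Bases inside the genomic span that are not part of the transcript.
non_transcript_bases = genomic_span - transcript_length

print(scaffold, strand, genomic_span, non_transcript_bases)
```

If the coordinates do cover the whole gene model, a genomic span longer than the transcript would be consistent with the presence of introns, although the record itself does not list the exon structure.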
Annotation
Heliconius        HMEL014582        2e-35  31.73%
Bombyx            BGIBMGA004155-TA  1e-22  27.96%
Drosophila        (no hit listed)
EBI UniRef50      UniRef50_D7CU76   9e-10  22.59%  Aldehyde Dehydrogenase n=37 Tax=cellular organisms RepID=D7CU76_TRURR
NCBI RefSeq       XP_001662561.1    2e-06  26.47%  hypothetical protein AaeL_AAEL012427 [Aedes aegypti]
NCBI nr blastp    gi|297623083      3e-09  22.59%  unnamed protein product [Truepera radiovictrix DSM 17093]
NCBI nr blastx    gi|297623083      1e-07  22.13%  unnamed protein product [Truepera radiovictrix DSM 17093]
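
The hits above can be ranked by E-value to see which database gives the strongest match. This is a minimal sketch. It assumes that each row reads as database, hit identifier, E-value, and a percentage (taken here to be percent identity, which the record does not label). The tuple layout is illustrative only.

```python
# (database, hit identifier, E-value, percentage) as read from the record.
# The column meanings are an assumption; the record does not label them.
hits = [
    ("Heliconius", "HMEL014582", 2e-35, 31.73),
    ("Bombyx", "BGIBMGA004155-TA", 1e-22, 27.96),
    ("EBI UniRef50", "UniRef50_D7CU76", 9e-10, 22.59),
    ("NCBI RefSeq", "XP_001662561.1", 2e-06, 26.47),
    ("NCBI nr blastp", "gi|297623083", 3e-09, 22.59),
    ("NCBI nr blastx", "gi|297623083", 1e-07, 22.13),
]

# A smaller E-value means a more significant match, so sort ascending.
ranked = sorted(hits, key=lambda hit: hit[2])
best = ranked[0]

for database, identifier, evalue, percent in ranked:
    print(f"{database:15s} {identifier:18s} {evalue:8.0e} {percent:6.2f}%")
```

Under this reading, the two lepidopteran hits rank first, which agrees with the Lepidoptera-specific orthology group assignment listed further down in the record.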
Group
Gene Ontology:
    GO:0008152  8.3e-06  metabolic process
    GO:0055114  8.3e-06  oxidation-reduction process
    GO:0016491  8.3e-06  oxidoreductase activity
KEGG pathway: tra:Trad_0840  5e-10
K00128 (E1.2.1.3) maps to:
    1,2-Dichloroethane degradation
    Arginine and proline metabolism
    Glycolysis / Gluconeogenesis
    Propanoate metabolism
    Limonene and pinene degradation
    Tryptophan metabolism
    Lysine degradation
    Valine, leucine and isoleucine degradation
    Pyruvate metabolism
    beta-Alanine metabolism
    Fatty acid metabolism
    3-Chloroacrylic acid degradation
    Glycerolipid metabolism
    Ascorbate and aldarate metabolism
    Histidine metabolism
InterPro domain: [214-402]  IPR016161  8.3e-06  Aldehyde/histidinol dehydrogenase
Orthology group: MCL21016, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202918-TA
ATGGAGGGTATTAAGTTCGACGAAGCAAATGTTAGCACAAAACTAGCGGAGGATTATTTGAAGAAAAATCCAATATTTAAATCAAAAGGATTCGCCGACAGTGACAGATTCACGTTAAACGATCACAAAAACACAATATACTATACTAATTATACGAGAAACCTCGGGGCTGTGGTGTTGACTAAAGATGTAAGTGACCTTCCATTAAAATTGATGAACCTGGCCAAGAAAATTGAGGAGAATGTCGAACTTTTCTGCCAATTGGAACTGTTACTGCGAAAGATCCCCTTAGAAGATACAGAGCGCATAGATTTGAAACTCATGACACAAAACTTTATGTATCACAGTTCGTTAAAAAGCTCCAAGGAAGCGACGAGCACAGCTCGCCAGATTACGTTCGAGTCGGCCACGCCGCTGGTGGCGCTGGGTTTCATCAGCCAGGTGCTGATGGCTCACATGCCGAAGGTCGTCATTGAATGCACCAAGGAAACAGCTCCCGTCACCACCCTCTTCATTGAACTTTGTCACCAAGTCGGATTGGAGGGAAAGGTACTTCTAGCGATTTTACCTGAAGTTATAGAAGGCGGTTCCTCCAAAGTGACGATGGATTTTACGAAAATGGGTGAATTTCTGAACGGCTGCGTGGGAGTTGTGAGCGGGAAAAGTGACATCGACTCCGCCGTTGAGTACTTCTTGGACGCGTCCAGTCGATATCCCTGGGCGTTGAGGAAGATTTTTGTTCAAGAAAATGCTGTGGAAAGATTCACCACCACCATGACCTGGAAGGAGGAGAAGCAGATGGAAGTAACATCAGCCGGGAAAAGCAAAAGCGTTACGGACCGCGAATTATCATACTTCTTTGGGGAAAAAACGTTTCTGATGAAGCCGGGTCGAGATAGCACTGAACATGACAATAATGTTGTCATCCTTGAGGCTTTTAGGACGGTGAAGGAACTGATAGGTCTGCTTGCGAACGAGAAGCCGTTCGCGCTCTCCATCTGGTGCAGTGATATTTCAGAGACGAATGAGCTCGCCCACAACGTAGACGCCAGCATCGTGTGGGTCAACGACTTCGCCAACTTCGAAGGACCACCTCGCTCTTCGCAAGCCTTCTTTTCCCTCATCGATATTTACTTCAGTTCTCAGCATATCGAACAATTTCCTGAAATGGCTGAATTGACGAAACTCAAAGAATCGTGGTTGGAACTCAGCGTTGAGCAAAGAAGAGCTGTCGTAAGAGATGCATTGGCAAAAATAGATAGGAACCAGTCCAAGAAAATGTTAGATGTCCTTGATGATATGACGACGGAAATGGAAAGTTTCGTTTACCTCACCAAGAATATGATAGCGACTGGCATTGAACTACAACCGCAAGCGATGATGCCGAGCGCGATGTACGACCACGGACTGGAATCGTCGATAATGTCTTACATTATAAAGGGTGGTGCCATCCTGCTACACATACCTCCGCGTGATCTGCCAGTAAAAAGCAAAGATATTTCTTACATGTTCTATGATAACTTACGGAACATGGCCGGTCCCGTACTATTTCTAGAGAAAGAGTATAAAATCGGTGACGTGTCTGTTATAAGGCATCCAATTAAAAGATATAAAGTCATTTGGACCAATTTCGGAACGATATTCGCTAATTAA

Protein sequence:

>DPOGS202918-PA
MEGIKFDEANVSTKLAEDYLKKNPIFKSKGFADSDRFTLNDHKNTIYYTNYTRNLGAVVLTKDVSDLPLKLMNLAKKIEENVELFCQLELLLRKIPLEDTERIDLKLMTQNFMYHSSLKSSKEATSTARQITFESATPLVALGFISQVLMAHMPKVVIECTKETAPVTTLFIELCHQVGLEGKVLLAILPEVIEGGSSKVTMDFTKMGEFLNGCVGVVSGKSDIDSAVEYFLDASSRYPWALRKIFVQENAVERFTTTMTWKEEKQMEVTSAGKSKSVTDRELSYFFGEKTFLMKPGRDSTEHDNNVVILEAFRTVKELIGLLANEKPFALSIWCSDISETNELAHNVDASIVWVNDFANFEGPPRSSQAFFSLIDIYFSSQHIEQFPEMAELTKLKESWLELSVEQRRAVVRDALAKIDRNQSKKMLDVLDDMTTEMESFVYLTKNMIATGIELQPQAMMPSAMYDHGLESSIMSYIIKGGAILLHIPPRDLPVKSKDISYMFYDNLRNMAGPVLFLEKEYKIGDVSVIRHPIKRYKVIWTNFGTIFAN-
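
The two sequences can be checked against each other by translating the transcript. The sketch below assumes the standard genetic code and writes the stop codon as "-", following the trailing "-" in the protein record. The function name translate is illustrative and not part of any database tool. Only the first and last codons of the transcript are used here to keep the example short.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1, writing stop codons as '-'."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        protein.append("-" if amino_acid == "*" else amino_acid)
    return "".join(protein)


# First eight codons and last three codons of DPOGS202918-TA.
print(translate("ATGGAGGGTATTAAGTTCGACGAA"))
print(translate("GCTAATTAA"))

# Length check: 550 residues plus one stop codon, three bases each.
expected_transcript_length = (550 + 1) * 3
print(expected_transcript_length)
```

The listed lengths are consistent under this reading: 550 amino acids plus one stop codon account for all 1653 bases of the transcript.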