Monarch geneset OGS2.0

DPOGS209785
Transcript: DPOGS209785-TA, 1524 bp
Protein: DPOGS209785-PA, 507 aa
Genomic position: DPSCF300117 - 1043738-1045261
RNAseq coverage: 1833x (Rank: top 7%)
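
The lengths and coordinates in this record can be cross-checked against each other. The following is a minimal sketch, assuming the genomic coordinates are 1-based and inclusive and that the gene is a single exon; neither is stated in the record. The function name `span_length` is illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1

# Values copied from the record.
transcript_bp = 1524
protein_aa = 507
genomic_bp = span_length(1043738, 1045261)

# Under the single-exon assumption, the genomic span equals the transcript length.
spans_match = genomic_bp == transcript_bp

# 507 codons plus one stop codon should account for the whole transcript.
cds_matches = (protein_aa + 1) * 3 == transcript_bp

print(genomic_bp, spans_match, cds_matches)
```

Both checks hold for the values given here, which is consistent with an intronless coding sequence that ends in a stop codon.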
Annotation
Heliconius       HMEL012164         0.0      91.29%
Bombyx           BGIBMGA008010-TA   0.0      88.17%
Drosophila       Alas-PA            6e-151   51.76%
EBI UniRef50     UniRef50_B4J662    9e-153   53.50%   GH20180 n=3 Tax=Drosophila RepID=B4J662_DROGR
NCBI RefSeq      XP_312882.4        2e-161   58.02%   AGAP003184-PA [Anopheles gambiae str. PEST]
NCBI nr blastp   gi|270006907       4e-160   54.80%   hypothetical protein TcasGA2_TC013340 [Tribolium castaneum]
NCBI nr blastx   gi|270006907       2e-155   54.80%   hypothetical protein TcasGA2_TC013340 [Tribolium castaneum]
Group
Gene Ontology    GO:0033014   8.5e-169   tetrapyrrole biosynthetic process
                 GO:0003870   8.5e-169   5-aminolevulinate synthase activity
                 GO:0030170   8.5e-169   pyridoxal phosphate binding
                 GO:0003824   5.3e-86    catalytic activity
                 GO:0016769   1e-66      transferase activity, transferring nitrogenous groups
                 GO:0009058   1e-66      biosynthetic process
KEGG pathway     aga:AgaP_AGAP003184   7e-161
                 K00643 (E2.3.1.37, ALAS) maps-> Porphyrin and chlorophyll metabolism
                                                 Glycine, serine and threonine metabolism
InterPro domain  [75-473]    IPR010961   8.5e-169   Tetrapyrrole biosynthesis, 5-aminolevulinic acid synthase
                 [69-475]    IPR015424   7.2e-103   Pyridoxal phosphate-dependent transferase, major domain
                 [137-353]   IPR015421   5.3e-86    Pyridoxal phosphate-dependent transferase, major region, subdomain 1
                 [119-464]   IPR004839   1e-66      Aminotransferase, class I/classII
                 [354-470]   IPR015422   7.7e-29    Pyridoxal phosphate-dependent transferase, major region, subdomain 2
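
The bracketed ranges in the InterPro rows can be turned into domain lengths and compared with the 507 aa protein. This is a minimal sketch, assuming the ranges are 1-based, inclusive residue positions on DPOGS209785-PA (the record does not state the convention). The helper names `parse_range` and `domain_lengths` are illustrative.

```python
import re

PROTEIN_LENGTH = 507  # aa, from the record

# (range, InterPro accession) pairs copied from the record.
DOMAINS = [
    ("[75-473]", "IPR010961"),
    ("[69-475]", "IPR015424"),
    ("[137-353]", "IPR015421"),
    ("[119-464]", "IPR004839"),
    ("[354-470]", "IPR015422"),
]

def parse_range(text: str) -> tuple[int, int]:
    """Parse a '[start-end]' string into a (start, end) pair of integers."""
    match = re.fullmatch(r"\[(\d+)-(\d+)\]", text)
    if match is None:
        raise ValueError(f"not a '[start-end]' range: {text!r}")
    return int(match.group(1)), int(match.group(2))

def domain_lengths(domains: list[tuple[str, str]]) -> dict[str, int]:
    """Map each accession to its length, assuming inclusive coordinates."""
    lengths = {}
    for text, accession in domains:
        start, end = parse_range(text)
        lengths[accession] = end - start + 1
    return lengths

lengths = domain_lengths(DOMAINS)
for accession, n in lengths.items():
    share = 100 * n / PROTEIN_LENGTH
    print(f"{accession}: {n} aa ({share:.1f}% of the protein)")
```

Under that assumption, subdomain 1 (137-353) and subdomain 2 (354-470) are adjacent and both fall inside the major domain (69-475).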
Orthology group  MCL11432   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209785-TA
ATGCCCTGTCCGTTTTTAGGATCAATGAATCAAACCTTTTTAAGGAATTACAGCAGCGTACTGCTAAAGCAGTATGGAAACTACTGCCCTATACTTTCAAGAAACTTTCGCTCTCTGGGCGTAGATGAAACGAAATGCCCATTTATTCAAAAGAACTCCATTATTTCGGAAGCGCCTAAAGAAATGACTGAAGATATTGTAGACAGCGCTGCACCAACTTATCAATATGAAAATTTCTTTAGCAAACAGATCAATGCTAAAAAAAATGATTACTCGTATCGAGTATTTAGAAAGGTGTCACGACTGGCGGCCGAGGGCGCGTATCCGCAGGCATTGGAAGGGTCTGACAACCGTCGCGTGACCGTGTGGTGCGCCAACGACTACCTCGGAGCATCGCGTCACCCCGTTGTCCAGGATGCTGCAATTTCTGCCATTAGATCCTACGGAACCGGAGCGGGAGGCACTCGTAACATCGCTGGTAACTCACAGATGACTGAAAAACTAGAACATGAGATAGCCAAACTCCATAAAAAACCCGCAGCTTTAATATTTAGTTCCTGTTTCGTTGCCAATGATGCGACTCTCTCTACATTAGCGAAAATACTACCAGGATGTATCGTTTACTCCGACGCGGGTAACCATGCATCCATGATACAGGGTATAAGGAACAGCCGGGCTCCCAAACATATATTCAGGCACAACGACCCCACCCACCTTAGACAATTGTTAGCCGAATCTCCTGCGGGCGTACCGAAGCTAGTCGTATTTGAAACTGTGCATTCCATGAGTGGAGCGATATGTCCCTTAGAAGAAATGTGTAACATAGCCCACGAGTACGGCGCCTTGACATTCGTAGACGAAGTCCACGCTGTGGGATTATATGGGAAGCATGGAGCAGGTATCGGGGAAGAGAGGGGAGTCGAAGATCATATAGACATCGTGTCCGGTACTTTGGGTAAAGCGTACGGTAACGTTGGCGGATATATCGCGGGTTCATCACTCCTGATAGACACCGTTAGGTCTCTGGCGCCTGGATTCATATTCACCACGGCGCTGCCGCCTCCGATCTTGGCCGGGTCTTTAGCTGCGATAAGACTTCTAGCCAGCGAGGAAGGGAGATCGTTACGAGCGAAACATCAAGCTATCGTCCGCTATCTCAAGCTCTCGCTTCTGATCGCTGGTCTGCCGCAGATGCCGTCAGTGAGTCACATAGTCCCTGTACCCATCACCGGGGCGGACAAAGTGGCGTTGGTGGCGGAGTCGCTGATGAAGCGAGGCCACTACGTGCAAGCCATCAACTATCCGACGGTAGCCCGAGGAGAGGAGCGCCTACGTTTCGCTCCCGGCCCCTACCACACGCCGGGAATGATAGACAACCTCGTCACTGCCCTCATCGAGTCATTCCACGAGAACAATATTAGCTTTAACCAGTTCATGGTCAACGGAGCCTGCAGGGAATGCAGCATGGAGTATAAAGTAGACATCGCTTACGAGGAGCCCTACAAGTACCCGATAGCTGTATAA

Protein sequence:

>DPOGS209785-PA
MPCPFLGSMNQTFLRNYSSVLLKQYGNYCPILSRNFRSLGVDETKCPFIQKNSIISEAPKEMTEDIVDSAAPTYQYENFFSKQINAKKNDYSYRVFRKVSRLAAEGAYPQALEGSDNRRVTVWCANDYLGASRHPVVQDAAISAIRSYGTGAGGTRNIAGNSQMTEKLEHEIAKLHKKPAALIFSSCFVANDATLSTLAKILPGCIVYSDAGNHASMIQGIRNSRAPKHIFRHNDPTHLRQLLAESPAGVPKLVVFETVHSMSGAICPLEEMCNIAHEYGALTFVDEVHAVGLYGKHGAGIGEERGVEDHIDIVSGTLGKAYGNVGGYIAGSSLLIDTVRSLAPGFIFTTALPPPILAGSLAAIRLLASEEGRSLRAKHQAIVRYLKLSLLIAGLPQMPSVSHIVPVPITGADKVALVAESLMKRGHYVQAINYPTVARGEERLRFAPGPYHTPGMIDNLVTALIESFHENNISFNQFMVNGACRECSMEYKVDIAYEEPYKYPIAV-
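
The protein record corresponds to a direct translation of the nucleotide record, with the terminal stop codon written as `-`. The following is a minimal sketch of that translation, assuming the standard genetic code (NCBI translation table 1) and a coding sequence that starts in frame at position 1. The function name `translate` and the `stop_symbol` parameter are illustrative, and only the first and last codons of the transcript are used to keep the example short.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # KeyError on ambiguous bases such as N
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)

# First eight and last four codons of DPOGS209785-TA, copied from the record.
first = translate("ATGCCCTGTCCGTTTTTAGGATCA")
last = translate("ATAGCTGTATAA")
print(first, last)
```

The two fragments translate to `MPCPFLGS` and `IAV-`, matching the start and end of DPOGS209785-PA. Passing the full 1524 bp sequence to `translate` should reproduce the 507 residues plus the trailing `-`.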