Monarch geneset OGS2.0

DPOGS206880
Transcript        DPOGS206880-TA  2784 bp
Protein           DPOGS206880-PA  927 aa
Genomic position  DPSCF300001 - 2087847-2095504
RNAseq coverage   286x (Rank: top 38%)
Annotation
Heliconius      HMEL010497        0.0  85.11%
Bombyx          BGIBMGA014201-TA  0.0  82.94%
Drosophila      CG8665-PA         0.0  62.02%
EBI UniRef50    UniRef50_B0WX46   0.0  64.78%  10-formyltetrahydrofolate dehydrogenase n=15 Tax=Coelomata RepID=B0WX46_CULQU
NCBI RefSeq     XP_969916.1       0.0  68.06%  PREDICTED: similar to aldehyde dehydrogenase [Tribolium castaneum]
NCBI nr blastp  gi|383859222      0.0  68.60%  PREDICTED: cytosolic 10-formyltetrahydrofolate dehydrogenase-like [Megachile rotundata]
NCBI nr blastx  gi|383859222      0.0  68.60%  PREDICTED: cytosolic 10-formyltetrahydrofolate dehydrogenase-like [Megachile rotundata]
Group
Gene Ontology
GO:0008152  6.5e-177  metabolic process
GO:0055114  6.5e-177  oxidation-reduction process
GO:0016491  6.5e-177  oxidoreductase activity
GO:0016620  5.8e-69   oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor
GO:0009058  3.4e-51   biosynthetic process
GO:0016742  3.4e-51   hydroxymethyl-, formyl- and related transferase activity
GO:0003824  1.5e-14   catalytic activity
GO:0000036  6e-07     acyl carrier activity
GO:0048037  5.2e-06   cofactor binding
KEGG pathway
xtr:496436  0.0
K00289 (E1.5.1.6, FTHFD)  maps-> One carbon pool by folate
InterPro domain
[455-922]  IPR015590  6.5e-177  Aldehyde dehydrogenase domain
[429-927]  IPR016161  4.4e-174  Aldehyde/histidinol dehydrogenase
[446-711]  IPR016162  2.1e-105  Aldehyde dehydrogenase, N-terminal
[712-898]  IPR016163  5.8e-69   Aldehyde dehydrogenase, C-terminal
[15-217]   IPR002376  3.4e-51   Formyl transferase, N-terminal
[219-334]  IPR011034  1.5e-14   Formyl transferase, C-terminal-like
[222-333]  IPR005793  4.3e-11   Formyl transferase, C-terminal
[344-415]  IPR009081  6e-07     Acyl carrier protein-like
[356-413]  IPR006163  5.2e-06   Phosphopantetheine-binding
Orthology group   MCL12052  Multiple-copy universal gene
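
The lengths and coordinates in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes that the genomic coordinates are 1-based and inclusive and that the transcript length includes the stop codon; the record states neither. All variable names are hypothetical.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 2784
PROTEIN_AA = 927
GENOMIC_START, GENOMIC_END = 2087847, 2095504

# InterPro hits as (start, end, accession), in protein coordinates.
DOMAINS = [
    (455, 922, "IPR015590"),
    (429, 927, "IPR016161"),
    (446, 711, "IPR016162"),
    (712, 898, "IPR016163"),
    (15, 217, "IPR002376"),
    (219, 334, "IPR011034"),
    (222, 333, "IPR005793"),
    (344, 415, "IPR009081"),
    (356, 413, "IPR006163"),
]

# A coding transcript should split into whole codons.
codons, leftover = divmod(TRANSCRIPT_BP, 3)

# Assumption: coordinates are 1-based and inclusive, hence the +1.
genomic_span = GENOMIC_END - GENOMIC_START + 1

# Every domain should fall inside the protein.
out_of_range = [d for d in DOMAINS if not 1 <= d[0] <= d[1] <= PROTEIN_AA]

print(codons, leftover)  # 928 0
print(genomic_span)      # 7658
print(out_of_range)      # []
```

The 928 codons are consistent with 927 residues plus one stop codon. A genomic span of 7658 bp for a 2784 bp transcript suggests that the gene contains introns, although the record does not list exon boundaries.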

Nucleotide sequence:

>DPOGS206880-TA
ATGCCACCAGTAGCTGTGCCCGAGGAGCCTTCCAAGAAAAAGCTCCGTGTAGCCATAATAGGTCAGAGTACTTTCGCTGCTGAGGTGTTCAAATTACTACAGAGAGATGGTCATGAAGTGGTCGGTGTATTCACAGTACTCGATAAAGGAAATCGAGAAGACCCGTTGGCCACGATCGCAGCCCAGAACGGTAAACCAGTGTTCAAATACAAGACGTGGAGGGTGAAAGGACAAGTTATACCGGGAATATTAGAGGAGTACAAATCTGTAAACGCCGACATCAACGTTCTACCGTTCTGTACTCAATTTATTCCCATGGAAGTTATTCTGCATCCGAAATATCAGAGTATCTGCTACCATCCCAGCATCTTACCTAGGCATAGAGGAGCGTCGTCTATCAATTGGACCCTTATCGAAGGTGACACCACCTGCGGTCTCACTATCTTCTGGGCAGATGACGGCTTGGACACTGGGCCCATTTTACTACAGAGAAGTTTTCCTTGCACTATTGATGACACTGTTGACTCACTGTACAATAAATATTTATATCCTGAGGGCATCAAAGCATTGGCAGAATCTGTGAATATGGTGGCTAATGGAGTAGCTCCGCGGATAAAACAAACCGAAGAGGGAGCTACATACGATCCAGCACTCTTTAAACCAGAAACACATCAGATTGATTGGTCTAAAGGTGGTCTCGCTTTACACAATTTTATACGTGGTCTGGATTCATCCCCTGGCGCTACCACCTTCATAAAACCACAAAACAAAGACGGCGTTGATAAAACCAGTGATGCCAACATTGAAATTAAGTTCTTTGGTTCCTCATTGTGGGAAGCGGAGTACGAAACAGAGGGTGATAAATTATTTATCACAGGATTAAACAAACCAGCCGTGGTACACGCTGATGGTTTATTAATAACAGCTAATGATGGAATTAAGCTTAACATTCAGAGGTTGAAAGTAAACGGTAAGATGATTAATGCCCAAAACTTTTATAAAGGCAGTGAGAACAAAGTCTCCCTTGATTTAACTGCGGAAGAGAAACAGTTCATAGAAAAAGCACGTGATGTTTGGAAAGCAATATTAAGAATCGAAATAGAAAACGATACTGACTTCTTCGCTTCTGGAGCAGGCTCCATGGACGTTGTCAGATTAGTAGAAGAAATAAAGGATATATCAGAACTCGAATTACAAAATGAAGATATTTACATGAACACAACATTTGAAGATTTTTATAATGCAGCTATACTTAAACAAAGAGGTGGTTCAGGTAGCAAGGAAGTTATTTACGACGGGGTAGAGATGGAAATTAATAAAATGAAAATTAAATTTCCGACGCAACTATTCATCAATGGAGAATTCGTTAATTCTGACGGCGGAAAAACAACAGCTATAGTAAATCCCACAGATGAATCTGTTATATGCAAAGTTCAAGCTGCCACAGTATCTGATGTCGACAGGGCTGTTAAAGCTGCTGAGAAGGCCTTCGGAGAAGGAGAATGGTCCAAAATCAGCGCAAGGGAAAGAGGACAGTTATTATTCAAGTTGGCGGATCTAATGGAGCAGCATAAAGAAGAATTAGCCACAATCGAGTCAATAGATTCAGGAGCAGTTTACACTCTAGCACTAAAAACTCACGTGGGCATGTCCATCGAGACATGGAGATATTTCGCCGGCTGGTGTGACAAGATTACAGGTTCAACCATTCCTATTAGCCATGCAAGACCAAATAAAAATCTGACGTTGACAAAAAGAGAACCCATTGGTGTCTGCGGACTGATCACTCCTTGGAATTATCCATTGATGATGTTATCCTGGAAAATGGCCGCTTGCTTGGCAGCTGGCAACACTGTCGTTATGAAACCAGCAGCGGTATGCCCACTCACCGCACTCAAATTCGCTGAGTTGTGCGTGCTAGCCGGCATTCCACCGGGAGTTGTTAATATTGTAACGGGAAGCGGAGCCCTGGCAGGACAAGCCCTTGCTGATCATCCTCG
TATCAGGAAGCTTGGATTTACTGGCAGTACTGAAATTGGACAAACTATTATGAAGTCTTGTGCAGCATCAAATTTGAAAAAGGTGTCCTTAGAACTGGGAGGCAAATCTCCATTGATCATCTTTGAAGATTGTGATCTCGATAAAGCAGTTAAAAATGGTATGGCATCAGTATTTTTCAACAAGGGTGAGAATTGCATAGCAGCCGGTCGTTTATTCGTGGAAGAGAAAATACACGACGAGTTTGTTAGACGTGTTGTGGAAGAAACCAAGAAAATGAGCATCGGAGATCCATTAAACAGAGGAACTGCTCATGGCCCACAAAACCACAAAGCCCATATGGATAAACTTATATCGTACGTTGAGACAGGAGTAAAGGAAGGCGCAAAACTGGTTTACGGTGGAAAACGCCTAGATAGACCAGGATACTTCTTCCAACCGACTATATTTACTGATGTCACCGATAATATGGTCATTGCTAAAGAGGAATCTTTTGGACCCATTATGATCATTAGCAAATTTAGCAGCAATAACCTGGATGAAGTGATCCGTCGTGCAAACAACACTGAATATGGGCTAGCGAGCGGCGTATTCACGAAAGACGTTTCACGTGCACTGCGCGTCGCTGAGCGCGTGGAGGCTGGTACCGTCTTCGTGAACACATACAATAAGACCGATGTCGCGGCGCCGTTCGGCGGATTCAAACAGAGTGGTTTTGGAAAGGATCTAGGTCAAGAAGCTCTTAATGAATACCTCAAGACTAAATGTATTACTATAGAATATTGA

Protein sequence:

>DPOGS206880-PA
MPPVAVPEEPSKKKLRVAIIGQSTFAAEVFKLLQRDGHEVVGVFTVLDKGNREDPLATIAAQNGKPVFKYKTWRVKGQVIPGILEEYKSVNADINVLPFCTQFIPMEVILHPKYQSICYHPSILPRHRGASSINWTLIEGDTTCGLTIFWADDGLDTGPILLQRSFPCTIDDTVDSLYNKYLYPEGIKALAESVNMVANGVAPRIKQTEEGATYDPALFKPETHQIDWSKGGLALHNFIRGLDSSPGATTFIKPQNKDGVDKTSDANIEIKFFGSSLWEAEYETEGDKLFITGLNKPAVVHADGLLITANDGIKLNIQRLKVNGKMINAQNFYKGSENKVSLDLTAEEKQFIEKARDVWKAILRIEIENDTDFFASGAGSMDVVRLVEEIKDISELELQNEDIYMNTTFEDFYNAAILKQRGGSGSKEVIYDGVEMEINKMKIKFPTQLFINGEFVNSDGGKTTAIVNPTDESVICKVQAATVSDVDRAVKAAEKAFGEGEWSKISARERGQLLFKLADLMEQHKEELATIESIDSGAVYTLALKTHVGMSIETWRYFAGWCDKITGSTIPISHARPNKNLTLTKREPIGVCGLITPWNYPLMMLSWKMAACLAAGNTVVMKPAAVCPLTALKFAELCVLAGIPPGVVNIVTGSGALAGQALADHPRIRKLGFTGSTEIGQTIMKSCAASNLKKVSLELGGKSPLIIFEDCDLDKAVKNGMASVFFNKGENCIAAGRLFVEEKIHDEFVRRVVEETKKMSIGDPLNRGTAHGPQNHKAHMDKLISYVETGVKEGAKLVYGGKRLDRPGYFFQPTIFTDVTDNMVIAKEESFGPIMIISKFSSNNLDEVIRRANNTEYGLASGVFTKDVSRALRVAERVEAGTVFVNTYNKTDVAAPFGGFKQSGFGKDLGQEALNEYLKTKCITIEY-
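
The protein sequence can be related back to the transcript by translating it codon by codon. The following is a minimal sketch, assuming the standard genetic code applies and that the trailing `-` in the protein record marks the stop codon. The function and constant names are hypothetical and are not part of any database tooling.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; "-" marks a stop codon,
# following the convention that the record above appears to use.
AMINO_ACIDS = (
    "FFLLSSSSYY--CC-W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First 18 and last 15 nucleotides of DPOGS206880-TA.
print(translate("ATGCCACCAGTAGCTGTG"))  # MPPVAV
print(translate("ACTATAGAATATTGA"))     # TIEY-
```

Both fragments agree with the start and end of DPOGS206880-PA. The sketch raises a `KeyError` on any character other than A, C, G, or T, so ambiguous bases such as N would need separate handling.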