Monarch geneset OGS2.0

DPOGS215066
Transcript        DPOGS215066-TA  1770 bp
Protein           DPOGS215066-PA  589 aa
Genomic position  DPSCF300208 + 508141-510134
RNAseq coverage   29x (Rank: top 76%)
Annotation
Heliconius        HMEL002015              0.0     61.49%
Bombyx            BGIBMGA005545-TA        2e-121  55.76%
Drosophila        CG9518-PA               1e-89   33.70%
EBI UniRef50      UniRef50_UPI00022C901E  4e-121  40.91%  UPI00022C901E related cluster n=1 Tax=unknown RepID=UPI00022C901E
NCBI RefSeq       XP_968381.2             5e-117  39.51%  PREDICTED: similar to CG6142 CG6142-PA [Tribolium castaneum]
NCBI nr blastp    gi|383860926            9e-126  43.41%  PREDICTED: glucose dehydrogenase [acceptor]-like [Megachile rotundata]
NCBI nr blastx    gi|383860926            9e-126  43.69%  PREDICTED: glucose dehydrogenase [acceptor]-like [Megachile rotundata]
Group
Gene Ontology     GO:0016614  6.2e-86  oxidoreductase activity, acting on CH-OH group of donors
                  GO:0008812  6.2e-86  choline dehydrogenase activity
                  GO:0050660  6.2e-86  flavin adenine dinucleotide binding
                  GO:0055114  6.2e-86  oxidation-reduction process
                  GO:0006066  6.2e-86  alcohol metabolic process
KEGG pathway      dpo:Dpse_GA21849  3e-89
                  K00108 (E1.1.99.1, betA, CHDH) maps-> Glycine, serine and threonine metabolism
InterPro domain   [1-566]    IPR012132  6.2e-86  Glucose-methanol-choline oxidoreductase
                  [44-306]   IPR000172  1.4e-38  Glucose-methanol-choline oxidoreductase, N-terminal
                  [414-555]  IPR007867  2.6e-35  Glucose-methanol-choline oxidoreductase, C-terminal
Orthology group   MCL16224  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215066-TA
ATGACGCATATATACCTCTTCCAGTTCTTTATATTTAGTATATGTGTAAATTTTTTCTCAATATTCACTTGGTTAATTTACTACAGCGGTTTATTAACATCGAATTTGTTTCAAGATATCAAATCAGAGTATGATTATATTATAGTTGGATCTGGTACTGCTGGCTCCCTGATAGCACATAGACTAGCAGAAACCAACTATACATATATAGTAATTGAAGCCGGAGGTTTTGGTCATCACTTTCATGACATCCCAGCATTTGGACCATTACTTCTTGGATCGCACTTCGACTGGGGGTTTGAGACTGTTCCCCAGGACAATGCATGTCTTGCTATGGATGGCCATAAATGCAAACTCTCACAGGGAAAAATATTTGGGGGTTCATCCAAAATGAATAACATGATTCATGTTAGAGGGAATATATCTCATTATGTTGATTGGTTTCATGGGAAATATACTAAAGGGTATATCGAAGAACAGTTTAATTATATAGAGTCTGAAATATTTAATTTGAGTCCTCTCCAATATGATAGTAATTTAGGTAATGCTATATTAAATGCCACCAAGGAGTTAGGTTACAAGGAGATAAAAGATTTTGGTAATGGCTTTAAGAAATCAACTATTACCCAATATAATGGCAAAAGATGGACAACATCACATAACTTACAATTAGATCAGACAAATGTTTTGACTCATGTATTTATTGAAAAACTTTTGATAGAAAAATCTAAGTGTATTGGTGTCCAGACAAGAAATACAAAAATTCTAGCAAGGAAAGGTGTAATTCTTAGTGCAGGGACGATAAATTCAGCAAAAATACTTCAGCTTTCTGGCATAGGGCCGTCCGAGCTATTACACTCTCTTAATATACCCATTGTAAAGGACTTGCCAGTTGGAAAAAACCTTCAAGACCATATCGGTACTGGTTTAGACTTAGTGTTGTTTGATGAACCACAGTCAATTACAGCTTCAGATATTATGGATCCAATAAATGTTGTTCAATATTTCTATAGTGGTAAAGGACCTTTGACTACTCCGGGTTGTGAAGTTGTCGGTTTCATTTCTACGAAAAACGAAGAAATCCCTGACATACAGTTCATGGTGTTGCCAGTTGGCATAACTTCAGATAGAGGTTCACATTTGAGGAGGAACCTCGGTATTTCCGATGAAATATGGAAAAATTATTTCGAAAAAGTATTTCATAAGCACGCAGCAACATTCTTTCCGATAATACTTCATCCTAAAAGTAAAGGCGAAGTAAAAATACAAAGTAAAAATTCAAATGTACCGCCTCTTATTAATCCAAAATATTTATCTGACGAAAATGATATCAGAAGTCTGGTAGAAGGTGTTAAGTTTGTCATTAAATTATTAAAAACCGAATCGCTTAAAGTCATGAGCGCTCACATGAACGATACACCATTCCCGAGTTGCAAGAAATATAAAATATTTTCTGATTTATATCTTAAATGCTATGTTCAACATTTGACTTTAACGAGCTACCATCCTGTGGGTACATGTTCCATGGGGCTGCCAGAGTCCATAAATACAGTAGTAGACACTTCGTTTAGATTGTTAGGAGTAAAAAACTTGTATGTTGTCGATGGTTCAGTGCTACCAACCCTACCAAGCGGTAATATTAACGCCGCCATTGCTATGATGGGTAACATATTTTTTGAAAATGTAATACTTAACAATATTGATAGAATTGAAAATTGCCAAAGAATTTTCTTAGTGGAATTGCTGCAAAGCGTATGCTTAGCTAGATAA

Protein sequence:

>DPOGS215066-PA
MTHIYLFQFFIFSICVNFFSIFTWLIYYSGLLTSNLFQDIKSEYDYIIVGSGTAGSLIAHRLAETNYTYIVIEAGGFGHHFHDIPAFGPLLLGSHFDWGFETVPQDNACLAMDGHKCKLSQGKIFGGSSKMNNMIHVRGNISHYVDWFHGKYTKGYIEEQFNYIESEIFNLSPLQYDSNLGNAILNATKELGYKEIKDFGNGFKKSTITQYNGKRWTTSHNLQLDQTNVLTHVFIEKLLIEKSKCIGVQTRNTKILARKGVILSAGTINSAKILQLSGIGPSELLHSLNIPIVKDLPVGKNLQDHIGTGLDLVLFDEPQSITASDIMDPINVVQYFYSGKGPLTTPGCEVVGFISTKNEEIPDIQFMVLPVGITSDRGSHLRRNLGISDEIWKNYFEKVFHKHAATFFPIILHPKSKGEVKIQSKNSNVPPLINPKYLSDENDIRSLVEGVKFVIKLLKTESLKVMSAHMNDTPFPSCKKYKIFSDLYLKCYVQHLTLTSYHPVGTCSMGLPESINTVVDTSFRLLGVKNLYVVDGSVLPTLPSGNINAAIAMMGNIFFENVILNNIDRIENCQRIFLVELLQSVCLAR-
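
The lengths reported in this record can be cross-checked against the two sequences. The Python sketch below is illustrative only and is not part of the database. It assumes the standard genetic code (NCBI translation table 1), that the transcript shown is coding sequence only (it starts with ATG and ends with TAA), and that the genomic coordinates are 1-based and inclusive. The function names `translate`, `expected_protein_length`, and `genomic_span` are hypothetical helpers introduced for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in
# T, C, A, G order for each of the three positions. Assumed, not stated
# in the record.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol`; the record above ends the
    protein sequence with "-", so that is the default here.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(cds_bp):
    """Number of residues encoded by a CDS, excluding the final stop codon."""
    return cds_bp // 3 - 1


def genomic_span(start, end):
    """Length of a region given 1-based inclusive coordinates (assumed)."""
    return end - start + 1


# First six codons of DPOGS215066-TA, and its final codon.
print(translate("ATGACGCATATATACCTC"))   # MTHIYL
print(translate("TAA"))                  # -
print(expected_protein_length(1770))     # 589
print(genomic_span(508141, 510134))      # 1994
```

Under these assumptions, the 1770 bp transcript corresponds to 590 codons, that is, 589 residues plus the stop codon, which agrees with the 589 aa reported for DPOGS215066-PA. The genomic span of 1994 bp exceeds the transcript length by 224 bp, which would be consistent with intronic sequence, although the record itself does not list the exon structure.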