Monarch geneset OGS2.0

DPOGS207059
Transcript        DPOGS207059-TA   1782 bp
Protein           DPOGS207059-PA   593 aa
Genomic position  DPSCF300001 + 2234733-2236514
RNAseq coverage   204x (Rank: top 47%)
Annotation
Heliconius      HMEL010191              0.0     82.34%
Bombyx          BGIBMGA013005-TA        0.0     77.98%
Drosophila      CG9521-PA               5e-162  49.41%
EBI UniRef50    UniRef50_UPI00015B4670  5e-167  47.22%  UPI00015B4670 related cluster n=3 Tax=unknown RepID=UPI00015B4670
NCBI RefSeq     XP_972532.1             0.0     58.14%  PREDICTED: similar to glucose dehydrogenase [Tribolium castaneum]
NCBI nr blastp  gi|91085219             0.0     58.14%  PREDICTED: similar to glucose dehydrogenase [Tribolium castaneum]
NCBI nr blastx  gi|91085219             0.0     58.14%  PREDICTED: similar to glucose dehydrogenase [Tribolium castaneum]
Group
Gene Ontology    GO:0016614  6.2e-174  oxidoreductase activity, acting on CH-OH group of donors
                 GO:0008812  6.2e-174  choline dehydrogenase activity
                 GO:0050660  6.2e-174  flavin adenine dinucleotide binding
                 GO:0055114  6.2e-174  oxidation-reduction process
                 GO:0006066  6.2e-174  alcohol metabolic process
KEGG pathway     dpo:Dpse_GA21849  6e-156
                 K00108 (E1.1.99.1, betA, CHDH) maps-> Glycine, serine and threonine metabolism
InterPro domain  [1-590]    IPR012132  6.2e-174  Glucose-methanol-choline oxidoreductase
                 [25-320]   IPR000172  9.3e-83   Glucose-methanol-choline oxidoreductase, N-terminal
                 [433-577]  IPR007867  4.7e-39   Glucose-methanol-choline oxidoreductase, C-terminal
Orthology group  MCL10159  Insect specific
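
The lengths reported in this record can be cross-checked against each other. The sketch below is only an illustration: it assumes the genomic coordinates are 1-based and inclusive, and that the transcript consists of the coding sequence plus one stop codon. Neither assumption is stated in the record itself.

```python
# Values copied from the record above.
start, end = 2234733, 2236514   # genomic position on DPSCF300001, + strand
transcript_bp = 1782            # length of DPOGS207059-TA
protein_aa = 593                # length of DPOGS207059-PA

# Assumption: coordinates are 1-based and inclusive,
# so the span length is end - start + 1.
span_bp = end - start + 1

# Assumption: the transcript is coding sequence only,
# i.e. one codon per residue plus one stop codon.
cds_bp = (protein_aa + 1) * 3

print(span_bp, transcript_bp, cds_bp)
```

Under these assumptions all three numbers agree, which would be consistent with a gene model whose transcript is a single uninterrupted coding region.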

Nucleotide sequence:

>DPOGS207059-TA
ATGAATTTTTTACAAGAAGGTACGAATCAACGTGACAATGAACCACCCGACCAAGTTAATTTGTTGACGGAGTACGACTTCATTGTTGTTGGTGCGGGAACAGCTGGGTGCGTTGTGGCTAACCGATTAACAGAATTAAAGGACGTGAAAGTTCTACTCTTAGAAGCTGGAGTTAATGAGAACTACGTTATGGACATACCAATTCTAGCAAATTATCTGCAGTTCACTGAAGCGAACTGGGGATACAAGACGAAACCCTCGAAAAAATATTGTGCAGGTTTCGAAAATCAGCAATGTAATTGGCCACGCGGAAAAGTTGTCGGTGGATCAAGTGTCCTAAATTATATGATATACACACGAGGGGCTGCAGATGATTATAACAATTGGGCATCAAAAGGTAATGAAGGCTGGGGATGGGACGATGTACTGGATTATTTCAAAAAAATTGAAAATTACAACATACCAGCCTTTGACGATCCTAAATATCACGGCCATGACGGCCATGTTAATGTAGAGTATGCACCATTTCGTACAACAAAAGGAAAAGCTTGGGTTAAAGGGGCCCAAGAATTAGGCTTTAAGTATAATGATTACAATGGACAAAATCCAAGTGGTGTCTCTTTCCTACAACTGTCTATGAAGAACGGAACAAGGCACAGTTCCAGTCGAGCATATCTTCATCCTATAAAGAAAAGAAATAATTTACACGTATCTAAAGTGAGCATGGCTACGAGATTACTGTTCGATACAACAAAAACTCGTGTAATTGGAGTCGAATTCGAGAAACGAGGAAAGCGCTATAAAATATTAGCAAAAAAAGAGATCATTGTATCGGCTGGTGCAATCAATTCACCTCAACTCCTCATGTTATCAGGAATAGGCCCTAAAAAGCATTTAGAGTCACTAAATATTCCAGTTGTAAAAGATTTACCTGTAGGATATAATCTAATGGACCACATTGCCGCCGGTGGACTCCAATTTATTGTTCAACAACAAAACCTCAGTCTGTCTACTGGTTATATTTTAAACCATTTAGAATTGGTATTTAAGTGGATGCGGAATCATAAAGGACCGTTGTCTGTGCCTGGTGGTTGCGAAGCATTAGTATTTTTGGATTTAAAAGATAGATTTAACGTGAGCGGCTGGCCGGACTTAGAACTGCTTTTTATAAGTGGGGGATTAAATTCAGATCCTTTGTTAAGAAGAAATTTTGGTTTCGATGAACAAATATTCACAGACACCTATACAGCTCTAGGTAATAATGAAGTTTTTATGGTTTTTCCAATGTTGATGAGACCAAAATCAAGAGGCAGGGTAATGTTACAAAACAGAAATCCAAAGTCACATCCGATATTAATCCCAAATTACTTTGATGATCCAGAAGATTTGCAAAAAATTGTGGAAGGCATCAAAGTGGCAATTGAGATAACTCGTCAACCGTCAATGAAAAAGATACAAACGAAATTATATGACGTTCCTATCGCTGACTGTCTGAAGTATGGGCCTTTCGGCAGTGACGAGTACTTCGCGTGTCAAGCACAAATGTTCACTTTTACAATTTACCATCAAAGTGGGAGTTGTAAAATGGGTGTCAAAAGTGATCCTACAGCGGTTGTAGATCCTAGACTAAGAGTACATGGTATAGAAAATCTAAGAGTAATCGATGCTAGTATAATGCCAGAAATTGTTTCAAGTCATACAAATGCCCCAACATTCATGATAGCAGAAAAGGGCGCAGACATGATTAAAGAAGACTGGGGGAGAAAATCGCAGAACATGTAA

Protein sequence:

>DPOGS207059-PA
MNFLQEGTNQRDNEPPDQVNLLTEYDFIVVGAGTAGCVVANRLTELKDVKVLLLEAGVNENYVMDIPILANYLQFTEANWGYKTKPSKKYCAGFENQQCNWPRGKVVGGSSVLNYMIYTRGAADDYNNWASKGNEGWGWDDVLDYFKKIENYNIPAFDDPKYHGHDGHVNVEYAPFRTTKGKAWVKGAQELGFKYNDYNGQNPSGVSFLQLSMKNGTRHSSSRAYLHPIKKRNNLHVSKVSMATRLLFDTTKTRVIGVEFEKRGKRYKILAKKEIIVSAGAINSPQLLMLSGIGPKKHLESLNIPVVKDLPVGYNLMDHIAAGGLQFIVQQQNLSLSTGYILNHLELVFKWMRNHKGPLSVPGGCEALVFLDLKDRFNVSGWPDLELLFISGGLNSDPLLRRNFGFDEQIFTDTYTALGNNEVFMVFPMLMRPKSRGRVMLQNRNPKSHPILIPNYFDDPEDLQKIVEGIKVAIEITRQPSMKKIQTKLYDVPIADCLKYGPFGSDEYFACQAQMFTFTIYHQSGSCKMGVKSDPTAVVDPRLRVHGIENLRVIDASIMPEIVSSHTNAPTFMIAEKGADMIKEDWGRKSQNM-
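
The relationship between the two sequences above can be checked by translating the nucleotide sequence with the standard genetic code. The following is a minimal sketch, not the pipeline used to build the geneset. The `translate` helper and the `head`/`tail` variables are hypothetical names introduced here, and the two fragments are the first 21 and last 12 nucleotides of DPOGS207059-TA as listed in the record.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate a nucleotide string in frame 0, ignoring trailing partial codons."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


# Fragments copied from the start and end of the nucleotide sequence above.
head = "ATGAATTTTTTACAAGAAGGT"
tail = "CAGAACATGTAA"

print(translate(head))
print(translate(tail))
```

The translated fragments correspond to the first and last residues of DPOGS207059-PA. Note that this sketch writes the stop codon as `*`, whereas the record marks the end of the protein with `-`.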