Monarch geneset OGS2.0

DPOGS207061
Transcript: DPOGS207061-TA, 1806 bp
Protein: DPOGS207061-PA, 601 aa
Genomic position: DPSCF300001 + 2251898-2255640
RNAseq coverage: 10x (Rank: top 84%)
Annotation
Heliconius       HMEL010194        0.0     75.83%
Bombyx           BGIBMGA013009-TA  0.0     74.88%
Drosophila       CG6142-PA         1e-148  45.66%
EBI UniRef50     UniRef50_D2A3N8   3e-166  52.26%  Putative uncharacterized protein GLEAN_15725 n=4 Tax=Endopterygota RepID=D2A3N8_TRICA
NCBI RefSeq      XP_972797.2       1e-166  52.25%  PREDICTED: similar to glucose dehydrogenase [Tribolium castaneum]
NCBI nr blastp   gi|270009090      1e-165  52.26%  hypothetical protein TcasGA2_TC015725 [Tribolium castaneum]
NCBI nr blastx   gi|270009090      2e-162  52.26%  hypothetical protein TcasGA2_TC015725 [Tribolium castaneum]
Group
Gene Ontology:
  GO:0016614  1.6e-158  oxidoreductase activity, acting on CH-OH group of donors
  GO:0008812  1.6e-158  choline dehydrogenase activity
  GO:0050660  1.6e-158  flavin adenine dinucleotide binding
  GO:0055114  1.6e-158  oxidation-reduction process
  GO:0006066  1.6e-158  alcohol metabolic process
KEGG pathway:
  dme:Dmel_CG9518  2e-145
  K00108 (E1.1.99.1, betA, CHDH) maps-> Glycine, serine and threonine metabolism
InterPro domain:
  [1-602]    IPR012132  1.6e-158  Glucose-methanol-choline oxidoreductase
  [37-332]   IPR000172  3.5e-75   Glucose-methanol-choline oxidoreductase, N-terminal
  [446-589]  IPR007867  3.1e-37   Glucose-methanol-choline oxidoreductase, C-terminal
Orthology group: MCL10024, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207061-TA
ATGCTGAAAACTCCAATGTATCAGTTCGAAAAGACCACGCCAAATATATTTGCATCGTTCAAAGATAACTACGAACTGCCAAAAGAATTCAAAGGCCCTTTGAAGGAATACGACTTCATTGTAGTAGGAGCAGGATCTGCAGGGAGTGTACTGGCTTCGAGACTTAGTGAAGGAAAACAAGCCTCAGTACTACTTTTAGAGGCTGGCCAAGGAGAAGCTATCCTTACAGGAGTGCCCATTCTGGCACCAATGTTACAACGAACTAATTACGTATGGCCTTACCTCATGGAGTATCAACCAGGAGTATGCATGGGTATGGAAAACGGGCGTTGTTTCTGGCCGCGAGGGAAAGCAGTCGGTGGCACAAGCGTCGTCAACTATATGATTTACACAAGAGGATTCAAGGAAGACTGGGACAGAATAGCCGCTAAAGGCAATTATGGATGGTCATACGACGACGTTATCCCGTACTACATAAAATCCGAGAGAGCAAAACTTCGTGGATTAAACAAATCCCCGTGGCACGGGAAAGATGGCGAGTTGAGCGTAGAGGATGTACCTTTTAGATCGAAACTATCAAAAGCATTTATGGATGCTGCAAAATTATTAGGACAGAGACAAGTCGACTATAACAGCCCAGACAGCTTTGGCTCGAGTTACATTCAAGCAACAATAAGTAAAGGAATACGAGCGAGTAGCGCGAGAGCATTTCTTCACAACAATAAGAAAAGAAAGAACCTCCACATCTTGACAAACAGTAGGGTGACAAGAATTATTATAGATCCATACACAAAAACAGCCATCGGTGTGGAGTTCCAAAGGGAGGGGAAAATGTACAATATTACAGCTAAAAAGGAAGTCATACTTAGTGCTGGACCCATCGAATCGCCACATTTGCTCATGTTATCAGGGATAGGACCCAGGGAGCATCTTCAAAGCATGGGAATTAATGTGATACAAGATCTTAGAGTTGGAGAGACTCTATATGACCATATATCTTTCCCGGCTTTAGCATTTACTTTAAACGCGACGAGATTGACTTTAGTAGAAAGAAAACTTGCCACGTTGGATAATGTTGTCCAGTACACACAGTATGGAGACGGACCGATGTCTTCTTTGGCTGGAGTAGAAACTTTAGGATATATTAAAACAGAACTATCTGATGAACCTGGTGATTATCCTGACATTGAACTCTTAGGTAGCTGCGCCTCTCTGGCGTCAGACGAAGGCGATGTAGTAGCTCGGGGAATAAGAATCGCTGATTGGCTATACAATGACGTCTACAGACCTATAGAAAATGTCGAAAGTTTCACAATACTGTTTATGCTTTTACATCCGAAATCTAAAGGGCACTTAAAGTTAAAATCGAAAAATCCATTTGAACAACCAAATCTCTATGGCAACTATTTAACACACCCTAAAGATGTAGCGACCATGATTGCAGCTATTCGATACATATTACGATTAGTAGACACCCCGCCATATCAAAAATATGGCGCTACATTACATACTAAAAAATTCCCTAATTGTATGTCATACCAATTTAACAGTGACGCTTATTGGGAGTGTGCTATTAGAACGGTGACGTCAACACTTCACCACCAAATCGCGACATGTAAAATGGGCCCCCCGCAAGACCCCGAAGCAGTTGTGGACCCCGAATTGCGAGTTTATGGAATAAAAAAATTACGAGTTATAGACTCAGGGGTTATACCTCAGACAATAGTAGCACACACTAACGCACCCGCTATTATGATAGGGGAGAAGGGTGCGGATTTAATAAAACGTACATGGGGTCTGCTCTAG

Protein sequence:

>DPOGS207061-PA
MLKTPMYQFEKTTPNIFASFKDNYELPKEFKGPLKEYDFIVVGAGSAGSVLASRLSEGKQASVLLLEAGQGEAILTGVPILAPMLQRTNYVWPYLMEYQPGVCMGMENGRCFWPRGKAVGGTSVVNYMIYTRGFKEDWDRIAAKGNYGWSYDDVIPYYIKSERAKLRGLNKSPWHGKDGELSVEDVPFRSKLSKAFMDAAKLLGQRQVDYNSPDSFGSSYIQATISKGIRASSARAFLHNNKKRKNLHILTNSRVTRIIIDPYTKTAIGVEFQREGKMYNITAKKEVILSAGPIESPHLLMLSGIGPREHLQSMGINVIQDLRVGETLYDHISFPALAFTLNATRLTLVERKLATLDNVVQYTQYGDGPMSSLAGVETLGYIKTELSDEPGDYPDIELLGSCASLASDEGDVVARGIRIADWLYNDVYRPIENVESFTILFMLLHPKSKGHLKLKSKNPFEQPNLYGNYLTHPKDVATMIAAIRYILRLVDTPPYQKYGATLHTKKFPNCMSYQFNSDAYWECAIRTVTSTLHHQIATCKMGPPQDPEAVVDPELRVYGIKKLRVIDSGVIPQTIVAHTNAPAIMIGEKGADLIKRTWGLL-
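
The transcript and protein lengths in this record (1806 bp and 601 aa) fit a complete coding sequence: 1806 / 3 = 602 codons, that is, 601 residues plus one stop codon. The sketch below shows one way to check this and to translate a stretch of the nucleotide sequence. It assumes the standard genetic code and a transcript with no untranslated regions. The function names `translate` and `expected_protein_length` are hypothetical helpers, not part of any database tooling. The `stop_symbol` argument is there because the protein sequence above appears to mark the stop with `-` rather than `*`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="*"):
    """Translate a coding sequence in frame 0, rendering stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(cds_length_bp):
    """Residue count for a complete CDS that ends in exactly one stop codon."""
    codons, remainder = divmod(cds_length_bp, 3)
    if remainder:
        raise ValueError("CDS length is not a multiple of 3")
    return codons - 1


# Lengths taken from the record: 1806 bp transcript, 601 aa protein.
print(expected_protein_length(1806))               # 601

# First six and last four codons of DPOGS207061-TA, copied from the sequence above.
print(translate("ATGCTGAAAACTCCAATG"))             # MLKTPM
print(translate("GGTCTGCTCTAG", stop_symbol="-"))  # GLL-
```

Both translated fragments match the corresponding ends of DPOGS207061-PA. To check the whole entry, pass the full nucleotide string to `translate` in the same way.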