Monarch geneset OGS2.0

DPOGS215384
Transcript: DPOGS215384-TA, 2814 bp
Protein: DPOGS215384-PA, 937 aa
Genomic position: DPSCF300088 - 446149-460819
RNAseq coverage: 325x (Rank: top 35%)
Annotation
Heliconius      HMEL017419        1e-179  89.06%
Bombyx          BGIBMGA012397-TA  3e-168  88.06%
Drosophila      CG11859-PA        4e-142  72.33%
EBI UniRef50    UniRef50_Q7PXQ6   3e-142  73.58%  AGAP001526-PA n=14 Tax=Eukaryota RepID=Q7PXQ6_ANOGA
NCBI RefSeq     NP_001037592.1    2e-164  79.13%  sorbitol dehydrogenase [Bombyx mori]
NCBI nr blastp  gi|95103082       1e-163  79.42%  sorbitol dehydrogenase [Bombyx mori]
NCBI nr blastx  gi|95103082       6e-162  79.42%  sorbitol dehydrogenase [Bombyx mori]
Group
Gene Ontology
GO:0005524  5.2e-56  ATP binding
GO:0004674  5.2e-56  protein serine/threonine kinase activity
GO:0003824  3.4e-52  catalytic activity
GO:0005488  3.5e-35  binding
GO:0006468  5.2e-33  protein phosphorylation
GO:0016772  4.2e-29  transferase activity, transferring phosphorus-containing groups
GO:0055114  7.4e-29  oxidation-reduction process
GO:0016491  7.4e-29  oxidoreductase activity
GO:0008270  1.3e-24  zinc ion binding
KEGG pathway 
InterPro domain
[301-476]  IPR011032  7.5e-59  GroES-like
[66-289]   IPR000687  5.2e-56  RIO kinase
[108-279]  IPR018934  3.4e-52  RIO-like kinase
[468-591]  IPR016040  3.5e-35  NAD(P)-binding domain
[9-91]     IPR015285  5.2e-33  RIO2 kinase, winged helix, N-terminal
[94-273]   IPR011009  4.2e-29  Protein kinase-like domain
[326-436]  IPR013154  7.4e-29  Alcohol dehydrogenase GroES-like
[476-605]  IPR013149  1.3e-24  Alcohol dehydrogenase, C-terminal
Orthology group: MCL11197, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215384-TA
ATGGGTAAACTCGATGTAGCAATATTGCGATATCTAACACCGGAAGATTTTCGCGTTCTTACTGCAGTGGAAATGGGTATGAAGAATCACGAACTGGTTCCAGGCTCTTTAGTGGCTTCTATAGCCAATTTGCGACACGGCGGAGTCCACAAACTGATGAAAGATCTTTGTAAACATAGATTGTTGACATACGAACGTGGTAAACACTATGATGGCTACCGTCTTACAAATGCTGGTTATGATTACTTGGCTCTTAAAGCTTTAACAAACAGAAAGGTTATTGCCTCTTTTGGTAACCAAATTGGTGTTGGAAAGGAATCTAACATTTACACCGTAGCTGATGAAGACAGAAACCCGCTATGTTTGAAACTTCATAGGCTGGGCAGAACATGCTTCCGGAACATCAAAGACAAGAGAGACTATCATGCACACCGTAACCGGGCCTCATGGCTGTATCTCTCCCGCATATCAGCTACCAAGGAATTTGCATACATGAAGGCTTTATATGACCGTGGCTTCCCAGTACCAAAACCCATTGATTTTAACAGACACTGTGTTGTCATGCAGCTGGTTGGAGGGGGACCTTTAACCCATGTGTCAGCAGTAGATGATGTGGAAGCACTTTATGACGAGTTAATGAACCTTATAGTGAGACTCGGCAACTGTGGTGTCATTCATGGTGACTTCAACGAGTTCAACATCATGATAGATGAAGAAGGACACCCCATCATCATAGACTTCCCTCAGATGATCTCCACAATGCATCCCAATGCGGAACTATACTTTGACCGAGATGTTCAATGTGTCCGAGCGTTCTTTAAGAAGAGGTTTGGATACGAATCCAGTCTGTACCCCAAGTTCTCTGAACTGGAGCGAGATGAAGACTTGGATCGAGAGGTGGCTTGTTCAGGGTACCGGAAGGAACATGATCACCAGCTGTTACAGGAACAAACGCCGATTCCTGAAATTGGCGACGATGAGGTTCTTCTTCGTATGGACTGTGTGGGCATTTGCGGCTCCGACGTCCACTACTGGCAGGGCGGCAGCTGTGGACATTTTGTATTGAAGGATCCTATGATTATGGGACACGAGGCCTCTGGCGTCGTCGCGAAGGTGGGGGGAAACGTTAAAAATCTGTGTGTGGGCGATCGTGTGGCTATTGAACCGGGTGTTCCGTGCCGCTACTGCGAGTTCTGTAAGACTGGACGGTACCACCTCTGCCCTGACATACAGTTCTGCGCCACGCCTCCCGTCCACGGAAACCTCTCCAGATACTACAAACACGCCGCGGACTTTTGTTACAAATTGCCAGATCATGTCTCTATGGAGGAAGGTGCTTTGTTGGAACCTCTATCAGTGGGAATCCACGCGTGTCGTCGCGGAGGCGTCACAGCTGGGGACTTCGTGCTGATACTAGGAGCTGGTCCCATAGGCCTCGTCACACTCCTCGCAGCCAGGGCCATGGGCGCCAGCAAGATCGTGATCACAGACATCTTGGAGTCTCGGCTGGAGACAGCCCGTGCGTTGGGCGCGGATCATACGTTGTTGGTGTCTCGTGACTCCAACGAGGCGGACCTGGTCCGAGCACTCCACGACCTCCTGGGGGCGCACCCCGATGTGTCCGTGGACGCCAGCGGAGCGCCCGCTACCGTGCGACTCGCGCTACTGGCCACTAAGTCAGGGGGTTGTGCTGTCCTGGTCGGTATGGGCAGCCCTGAGGTCACCCTGCCTCTGGCCGGGGCCATGGCGCGAGAGGTCGACATCAGAGGCATCTTCAGATACGTCAACGAATACCCCATCGCTCTATCGCTGGTGTCGAGCGGTCAGATCAACCTGAAGCCGCTGGTGACGCACCACTTCTCACTGGAGGAGACCTTGGAGGCCTACGAGGTCGCGCGGAGAGGAGCCGGCATCAAGGTCATGATACACGTCCAGCCGAGGGATGCCAACAACAAAGTGGGGGGAAACGTTAAAAATCTGTGTGTGGGCGATCGTGTGGC
TATTGAACCGGGTGTTCCGTGCCGCTACTGCGAGTTCTGTAAGACTGGACGGTACCACCTCTGTCCTGACATACAGTTCTGCGCCACGCCTCCCGTCCACGGAAACCTCTCCAGATACTACAAACACGCCGCGGACTTCTGTTACAAATTGCCAGATCATGTCTCTATGGAGGAAGGTGCTTTGTTGGAACCTCTATCAGTGGGAATCCACGCGTGTCGTCGCGGAGGCGTCACAGCTGGGGACTTCGTGCTGATACTAGGAGCTGGTCCCATAGGCCTCGTCACACTCCTCGCAGCCAGGGCCATGGGCGCCAGCAAGATCGTGATCACAGACATCTTGGAGTCTCGGCTGGAGACAGCCCGTGCGTTGGGCGCGGATCATACGTTGTTGGTGTCTCGTGACTCCAACGAGGCGGACCTGGTCCGAGCACTCCACGACCTCCTGGGGGCGCACCCCGATGTGTCCGTGGACGCCAGCGGAGCGCCCGCTACCGTGCGACTCGCGCTACTGGCCACTAAGTCAGGGGGTTGTGCTGTCCTGGTCGGTATGGGCAGCCCTGAGGTCACCCTGCCTCTGGCCGGGGCCATGGCGCGAGAGGTCGACATCAGAGGCATCTTCAGATACGTCAACGAATACCCCATCGCTCTATCGCTGGTGTCGAGTGGTCAGATCAACCTGAAGCCGCTGGTGACGCACCACTTCTCACTGGAGGAGACCTTGGAGGCCTACGAGGTCGCGCGGAGAGGAGCCGGCATCAAGGTCATGATACACGTCCAGCCGAGGGATGCCAACAACAAAGTCAAATTCCAATGA

Protein sequence:

>DPOGS215384-PA
MGKLDVAILRYLTPEDFRVLTAVEMGMKNHELVPGSLVASIANLRHGGVHKLMKDLCKHRLLTYERGKHYDGYRLTNAGYDYLALKALTNRKVIASFGNQIGVGKESNIYTVADEDRNPLCLKLHRLGRTCFRNIKDKRDYHAHRNRASWLYLSRISATKEFAYMKALYDRGFPVPKPIDFNRHCVVMQLVGGGPLTHVSAVDDVEALYDELMNLIVRLGNCGVIHGDFNEFNIMIDEEGHPIIIDFPQMISTMHPNAELYFDRDVQCVRAFFKKRFGYESSLYPKFSELERDEDLDREVACSGYRKEHDHQLLQEQTPIPEIGDDEVLLRMDCVGICGSDVHYWQGGSCGHFVLKDPMIMGHEASGVVAKVGGNVKNLCVGDRVAIEPGVPCRYCEFCKTGRYHLCPDIQFCATPPVHGNLSRYYKHAADFCYKLPDHVSMEEGALLEPLSVGIHACRRGGVTAGDFVLILGAGPIGLVTLLAARAMGASKIVITDILESRLETARALGADHTLLVSRDSNEADLVRALHDLLGAHPDVSVDASGAPATVRLALLATKSGGCAVLVGMGSPEVTLPLAGAMAREVDIRGIFRYVNEYPIALSLVSSGQINLKPLVTHHFSLEETLEAYEVARRGAGIKVMIHVQPRDANNKVGGNVKNLCVGDRVAIEPGVPCRYCEFCKTGRYHLCPDIQFCATPPVHGNLSRYYKHAADFCYKLPDHVSMEEGALLEPLSVGIHACRRGGVTAGDFVLILGAGPIGLVTLLAARAMGASKIVITDILESRLETARALGADHTLLVSRDSNEADLVRALHDLLGAHPDVSVDASGAPATVRLALLATKSGGCAVLVGMGSPEVTLPLAGAMAREVDIRGIFRYVNEYPIALSLVSSGQINLKPLVTHHFSLEETLEAYEVARRGAGIKVMIHVQPRDANNKVKFQ-
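
Length check:

The transcript and protein lengths listed in this record can be cross-checked with simple codon arithmetic. The sketch below assumes that the transcript is coding sequence only (no UTRs) and that the trailing "-" in the protein sequence marks a single stop codon; neither is stated in the record. The function name expected_cds_length is hypothetical and used only for illustration.

```python
def expected_cds_length(protein_aa: int, include_stop: bool = True) -> int:
    """Return the number of nucleotides needed to encode protein_aa residues.

    One extra codon is added for the stop codon when include_stop is True.
    """
    codons = protein_aa + (1 if include_stop else 0)
    return 3 * codons


# Lengths as listed in the record above.
transcript_bp = 2814
protein_aa = 937

# Assumption: the transcript is CDS only and ends with one stop codon.
print(expected_cds_length(protein_aa) == transcript_bp)
```

Under these assumptions, 937 residues plus one stop codon give 938 codons, or 2814 nucleotides, which agrees with the listed transcript length.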