Monarch geneset OGS2.0

DPOGS204829
Transcript: DPOGS204829-TA | 3504 bp
Protein: DPOGS204829-PA | 1167 aa
Genomic position: DPSCF300221 + 327377-332913
RNAseq coverage: 35x (Rank: top 74%)
Annotation
Heliconius: HMEL014392 | 0.0 | 52.24%
Bombyx: BGIBMGA001568-TA | 2e-107 | 55.29%
Drosophila: CG14476-PE | 2e-12 | 24.28%
EBI UniRef50: UniRef50_E3X5T8 | 9e-49 | 25.59% | Putative uncharacterized protein n=1 Tax=Anopheles darlingi RepID=E3X5T8_ANODA
NCBI RefSeq: XP_968946.1 | 3e-62 | 27.76% | PREDICTED: similar to neutral alpha-glucosidase ab [Tribolium castaneum]
NCBI nr blastp: gi|91076606 | 7e-61 | 27.76% | PREDICTED: similar to neutral alpha-glucosidase ab [Tribolium castaneum]
NCBI nr blastx: gi|91076606 | 1e-60 | 27.76% | PREDICTED: similar to neutral alpha-glucosidase ab [Tribolium castaneum]
Group
Gene Ontology:
  GO:0004553 | 9.6e-40 | hydrolase activity, hydrolyzing O-glycosyl compounds
  GO:0005975 | 9.6e-40 | carbohydrate metabolic process
  GO:0030246 | 8.2e-05 | carbohydrate binding
  GO:0003824 | 8.2e-05 | catalytic activity
KEGG pathway: gga:416462 | 5e-22
  K12316 (GAA) maps to:
    Starch and sucrose metabolism
    Galactose metabolism
    Lysosome
InterPro domain:
  [584-1046] IPR000322 | 9.6e-40 | Glycoside hydrolase, family 31
  [591-958] IPR017853 | 2e-10 | Glycoside hydrolase, superfamily
Orthology group: MCL18463 | Insect specific

Nucleotide sequence:

>DPOGS204829-TA
ATGATTCCCCACGCCAAAACTATGGAGGAAGCTTTGGACTCTGAAATAAGATATCAGAACCTTAAAATTATTAACCATATAACATGGTACGACCGAATTCTATTAAATCGACCAATAAAAGTCGTACTTGCTTTATTTTTGGCAGCGGTCGTCAGTCCCTTGTTGTTATACAGATACTCTTTCTTCGTCAGTATAAACCTGCCACCATCTGACGGGTTTTCTATAGGATCATGTTTAATTCCTCGATCGACAAGACTTCCCTGTGGTAATGGATCCAGCTTACAAGAGCATTGTCACAGCCAGTGCTGCTTTGATCTCAATCTCCATGTCTGCTATCATCGTCTACCCTCAAGGTTTTCCTATATAATGAATCAGCCTTGGAATGAAAATATAACATTGTCGCCTCGGATAGCCACAGAGCCTTACGCTTTTCAGAACAGTATACCTGCAGTCAGACTGTCCGTTGATGAAGTAACAGCCACACATATGACTCTCACGTTTTATAATTCGAGAAATATATCTCTTAATGGAAGGAGACTCCAAAACAAGGAGTATTCGTATACAGTCACGTCGCCGGAGTTGAATATCGTCGTGAGAGCATCAAATAGAACAATTTTTAACACAGCAAGTGGACCCTTCATAGCATCGGATAATATTTGGGAGATGTCCTTTATATTAACCAACGAAATGATGTACGGATTGGGCGAAATACCGTTAAAAAAGAACAATACAAAAGTGATATACAGTCACAAAGGTGGATTTAGTTCCGTGCCACTGATATTCGCTAAGCTTAACAACTCCTATCACGGATTACTATTCGATGCTAATGATCCCACAGAAATCTTCATCTCATTGGAAAATCACGTCGTTGTTCGGAGTATCACGAACTTCGGTTTAAAGTTTCACTTGTTTTCTGGACCGGAGCCGAAAGACATCATGAAAGACGTTATGGCCATAACTGGGAAATACAAAAAGTTGGAATATTGGATGCTGGGCGTTCATATTTGCAGTGAAGTTCAAGGTTTGGAGTTGAATGCATTTTTAAAAAATGCAACAGCTGAAAGGATGCCATTTGATAGTCACTGTGGTGTCCAACCTATTGTGTTTACTAGTGATCAATGTAACAGCAATGACATAAATAACATCGACGCTATCAATGCTGGTTCTAAATTGCTCGAAACCGCTCAGAAAAAATTCGTACCTCACGTTTCCCCTTACCCTTGGAATGAAAATATAACATTGTCGCCTCGGATAGCCACAGAGCCTTACGCTTTTCAGAACAGTATACCTGCAGTCAGACTGTCCGTTGATGAAGTAACAGCCACACATATGACTCTCACGTTTTATAATTCGAGAAATATATCTCTTAATGGAAGGAGACTCCAAAACAAGGAGTATTCGTATACAGTCACGTCGCCGGAGTTGAATATCGTCGTGAGAGCATCAAATAGAACAATTTTTAACACAGCAAGTGGACCCTTCATAGCATCGGATAATATTTGGGAGATGTCCTTTATATTAACCAACGAAATGATGTACGGATTGGGCGAAATACCGTTAAAAAAGAACAATACAAAAGTGATATACAGTCACAAAGGTGGATTTAGTTCCGTGCCACTGATATTCGCTAAGCTTAACAACTCCTATCACGGATTACTATTCGATGCTAATGATCCCACAGAAATCTTCATCTCATTGGAAAATCACGTCGTTGTTCGGAGTATCACGAACTTCGGTTTAAAGTTTCACTTGTTTTCTGGACCGGAGCCGAAAGACATCATGAAAGACGTTATGGCCATAACTGGGAAATACAATAAGTTGGAATATTGGATGCTGGGCGTTCATATTTGCAGTGAAGTTCAAGGTTTGGAGTTGAATGCATTTTTAAAAAATGCAACAGCTGAAAGGATGCCATTTGATAGTCACTGTGGTGTCCAACCTATTGTGTTTACTAGTGATCAATGTAACAGCAATGACATAAATAACATCGACGCTATCAA
TGCTGGTTCTAAATTGCTCGAAACCGCTCAGAAAAAATTCGTACCTCACGTTTCCCCTTACATTCGTTATGAAATAAAAAATGACACAGATATTCAAAACACGACCACATTTACTGAATATAACGTAAGCTGTGAAATTATGCCGCATTTTGATAAATTAATGTATCGAACTCCAAATGCTCATGAGGTGTACACCGGGGAAATCAATGATTTTGCAGTCATATATCCTAACTACGAGGACGCTCCACCAGAATTTCTAGAGAGTTTATGGGCTTATAACAAAAAAATTGATGGTATTGTGCTCGAAAACAATTTTCCCTTAGACGAAAAAGAGAAAGATCTGGAAGAAATGTCTTTATACCTGCCTTATTTTAGTCAGCACTTTAAAAATGCGTTTAACTATACGCCACCATGGAACTTAACACTGGCTGATTATAACCAAAGCTACCTCTTCCAACATAACAGATATGGCAACAATTTTGTAGATGCTTTCATAAAAAGGTCCAACGATATTCCTGTCTGGTCGAGCAGTCTCTGGCTAAATTCTGGGACTAATATAAACAGACAAAGTATTAATGCTTCCTGGCTTAATCTTAATAATGAACTGGTAAACGCAGCTCTAGGAGGGGTATCTGGGCATTGGCTATGGTCATCGCCGATATGTGGGGATACAGAATATTTTAATCCGGAAACCCAAACGAACCTTTGCATTAAATGGTACCTAGCAGCAGTTTACTTACCAATTGTGAAAATACATTCCAAAGTGATCCCAAGACATCCTACTGCTTTCGTGGGTACTCATAAGACTTTGGCTATAGAGGCAATAGGTAGAAGATACAGTCTGTTGCCATATTATTACACTGTGCTCCAAGAAGGACCTTTACTGAGACCTATGTTCTATCAATATCCGGCATCACAAGCAATACGAGATTTAAGCTCTCAGTTTAATGTTGGTGATAGTCTTCTCATAGCTCCCAATTTACTGCCTCTTCAAAGTCATGTTCAAATTCGGAAACCTCCGGGCTCCTGGTATGAACTGTGGAGTGGCACCAAATTGCAAGGTCAGGAAGGTGACCTACTTACATTATCTACCACGGATGCTGACCTCATGACTTTTATCAAGGGAGGTTCTGTGATATTAATACAGAAGAAAACAGAGTTGTCCGCTTCTGATACATTGCTTACTGAATTTAACGCAATAATTGCTTTGGAATGTATCGAGGAAAACGTGTGCTCGGCGTCAGGGAAACAATTTGTCACTGACGGTCTAACATTGGTGTTCGAGGCTAATGCTCAAAATATGACGATATCTGCTATTGGTAACGATTTCATGCCTATGTGCGATTTCAACTCTGGCACATGGGGCTACGACATCAAACTCTATAGTATCTATGGTTTACCAGATGAGATTAACAATATGGATAATCAGAGGCAAGTGAGTCAATTCACAGATTTGTGCAATTTAGAATACGGCGACAACATCGTTATAAAATTTCTCACTTAA

Protein sequence:

>DPOGS204829-PA
MIPHAKTMEEALDSEIRYQNLKIINHITWYDRILLNRPIKVVLALFLAAVVSPLLLYRYSFFVSINLPPSDGFSIGSCLIPRSTRLPCGNGSSLQEHCHSQCCFDLNLHVCYHRLPSRFSYIMNQPWNENITLSPRIATEPYAFQNSIPAVRLSVDEVTATHMTLTFYNSRNISLNGRRLQNKEYSYTVTSPELNIVVRASNRTIFNTASGPFIASDNIWEMSFILTNEMMYGLGEIPLKKNNTKVIYSHKGGFSSVPLIFAKLNNSYHGLLFDANDPTEIFISLENHVVVRSITNFGLKFHLFSGPEPKDIMKDVMAITGKYKKLEYWMLGVHICSEVQGLELNAFLKNATAERMPFDSHCGVQPIVFTSDQCNSNDINNIDAINAGSKLLETAQKKFVPHVSPYPWNENITLSPRIATEPYAFQNSIPAVRLSVDEVTATHMTLTFYNSRNISLNGRRLQNKEYSYTVTSPELNIVVRASNRTIFNTASGPFIASDNIWEMSFILTNEMMYGLGEIPLKKNNTKVIYSHKGGFSSVPLIFAKLNNSYHGLLFDANDPTEIFISLENHVVVRSITNFGLKFHLFSGPEPKDIMKDVMAITGKYNKLEYWMLGVHICSEVQGLELNAFLKNATAERMPFDSHCGVQPIVFTSDQCNSNDINNIDAINAGSKLLETAQKKFVPHVSPYIRYEIKNDTDIQNTTTFTEYNVSCEIMPHFDKLMYRTPNAHEVYTGEINDFAVIYPNYEDAPPEFLESLWAYNKKIDGIVLENNFPLDEKEKDLEEMSLYLPYFSQHFKNAFNYTPPWNLTLADYNQSYLFQHNRYGNNFVDAFIKRSNDIPVWSSSLWLNSGTNINRQSINASWLNLNNELVNAALGGVSGHWLWSSPICGDTEYFNPETQTNLCIKWYLAAVYLPIVKIHSKVIPRHPTAFVGTHKTLAIEAIGRRYSLLPYYYTVLQEGPLLRPMFYQYPASQAIRDLSSQFNVGDSLLIAPNLLPLQSHVQIRKPPGSWYELWSGTKLQGQEGDLLTLSTTDADLMTFIKGGSVILIQKKTELSASDTLLTEFNAIIALECIEENVCSASGKQFVTDGLTLVFEANAQNMTISAIGNDFMPMCDFNSGTWGYDIKLYSIYGLPDEINNMDNQRQVSQFTDLCNLEYGDNIVIKFLT-
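
Consistency check (illustrative):

The lengths in this record can be cross-checked: 3504 bp should equal the 1167 codons of the protein plus one stop codon, and the transcript should translate to the listed protein, with the stop shown as "-" at the end of the protein sequence. The following is a minimal Python sketch of that check. It assumes the standard genetic code, and the names `translate` and `lengths_consistent` are hypothetical helpers introduced here, not part of any MonarchBase tool. Only the first and last codons of the transcript are used, to keep the example short.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon.

    Trailing bases that do not fill a whole codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly the protein's codons plus one stop."""
    return transcript_bp == 3 * (protein_aa + 1)


# First 11 codons and last 5 codons of DPOGS204829-TA, copied from the record.
first_codons = "ATGATTCCCCACGCCAAAACTATGGAGGAAGCT"
last_codons = "AAATTTCTCACTTAA"

print(translate(first_codons))          # MIPHAKTMEEA
print(translate(last_codons))           # KFLT*
print(lengths_consistent(3504, 1167))   # True
```

Both fragments agree with the start (MIPHAKTMEEA) and end (KFLT followed by the stop) of DPOGS204829-PA.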