Monarch geneset OGS2.0

DPOGS212394
Transcript        DPOGS212394-TA  3288 bp
Protein           DPOGS212394-PA  1095 aa
Genomic position  DPSCF300019 + 1049799-1060768
RNAseq coverage   569x (Rank: top 22%)
Annotation
Heliconius        HMEL006624        0.0    70.83%
Bombyx            BGIBMGA012077-TA  0.0    57.90%
Drosophila        CG14476-PE        5e-63  26.50%
EBI UniRef50      UniRef50_E0V9V3   0.0    47.73%  Alpha glucosidase, putative n=5 Tax=Arthropoda RepID=E0V9V3_PEDHC
NCBI RefSeq       XP_969694.1       0.0    48.20%  PREDICTED: similar to acid alpha-glucosidase [Tribolium castaneum]
NCBI nr blastp    gi|91079350       0.0    48.20%  PREDICTED: similar to acid alpha-glucosidase [Tribolium castaneum]
NCBI nr blastx    gi|91079350       0.0    48.20%  PREDICTED: similar to acid alpha-glucosidase [Tribolium castaneum]
Group
Gene Ontology     GO:0004553  3e-151   hydrolase activity, hydrolyzing O-glycosyl compounds
                  GO:0005975  3e-151   carbohydrate metabolic process
                  GO:0030246  2.6e-36  carbohydrate binding
                  GO:0003824  2.6e-36  catalytic activity
                  GO:0008152  3.1e-07  metabolic process
KEGG pathway      tca:658192  0.0
                  K12316 (GAA) maps-> Starch and sucrose metabolism
                                      Galactose metabolism
                                      Lysosome
InterPro domain   [222-1094]  IPR000322  0        Glycoside hydrolase, family 31
                  [505-878]   IPR017853  4.5e-91  Glycoside hydrolase, superfamily
                  [264-504]   IPR011013  2.6e-36  Glycoside hydrolase-type carbohydrate-binding
                  [240-289]   IPR000519  1.4e-10  P-type trefoil
                  [723-789]   IPR013785  3.1e-07  Aldolase-type TIM barrel
Orthology group   MCL10955 Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212394-TA
ATGCCTAAGATACCTTACAAACCTGAAAAGGGTCGTGAAGAAGACGAGGATTATGAAATTGTATCTTTTGAAGACTTTTGCGATAAGCCTCCAGGAACGAGTACCGATTTATTGACTTTAAATGACAATATTAACTATCGACTTCATTATGAAACCGATAAGGCTTCAAATTTTCCTGAGGCCGGTGAACCATCAACTCGAGATATTGGGATGCACACGGAAACTTCTTTAAACGGTCCCAAAAATGCGTCGTACAAAAGAAAATTGTCCTTTGCACCGTTCGGCAGAACCGATAAAAATAGCCGCGGTCCATTATTCAGTGGAGTTATACCGAAACCTAGAAATGAAGGGGAATCTAGGGAACACAGATACGAACGTTTCTCTCCGCGTGGCGGTTGGCTAGCGCGTACCTGGGAACAGCTAGGGAATTTGTTACCTGGTATGCTAGCTACGGCGTTGTTGAGCGCGTTATGCGTGGGCGCATGGTGGGCGGTGGGCGGAGCTCTAAGCGGCAGCTGGGGAGATGATCATTACAGACGTCTCTACGAACGCGCTCATCCAGATGACATAAAAAAACCACTGTCGCCTGTAATTGAGAAGATTATTCCGACTGAGAACCGCTACCACGACCACAACAATTTGTCAACCAAGAACAAGAACATCACAGAAGCGTCTTATAAGAAGGATAATAAGAAAAGTGACTACGGCGATTTGGATCATCAGTGTGGTGATGTATCAGACAGCATGAGGTTCGACTGTCATCCGCAGGGGGGCGCCAGTGAGGAAGCTTGTACCAAACGGGGTTGTTGTTGGGGGGCGACCGCTGTGCAGGGTGCTCCATACTGCTACTACCCTAAACACTACCCGAGCTACCGCTTCATGAACAGCACAGAGAACAAGCACAGCATGACCGTGTACTACGCTCACGGTCTGGATACAGGGTACCCTGGACAGTGGGGAACTGTCATGGTGACCTTCAACTACCTGGCCGATGATGTCCTGCAGATTAAGATGACCGACGCTAACAACAAGAGGTTCGAACCCCCGTACCCCGAGGTGCCGGTGGTGTCGGGGCGGGTCACCAGTCTCCAGTACCGAGTGCTGGTGGACAGTGCCGCCGTCGGCTTCAAGGTCATCAGGACTGAGGACAACGTCACCATAGTCGACACTCAGAACGTGGGTGGTTTGATATTATCGGAGAAATTCCTTCAGTTATCGTCAGTACTGCCCACGGACCACGTGTACGGCTTGGGAGAAAAACAGGCGCCGCTCCTGAACAACTTCAATTGGAACACCTTCACGCTGTTCAACAGCGACATGCCGCCCATAGAGAATAAAAGTCTCTATGGGACTCATCCTTTTTATCTGGCCTTGGAGAGAAATGGGAAGAGTCATGGGATGCTCCTGTTGAATTCGAATGCTATGGACATAGTCCTCCAACCGTCTCCGGCTATAACGTACCGCGCCGTGGGCGGCGTCCTAGACTTCCTGGTGATGATGGGTCCTTCGCCCTCGCAAGTTGTATCTCAACTCACGAGCCTTATAGGCAGACCCTTCATGCCGCCGTACTGGGCGCTCGGATTCCATCTGTGCAAATACGACTACGGCAGCCTCAACACCACCCGCCAGGTCATGCAGAGGAACATCGACGCCGGGATACCGTTGGACGCCCAGTGGAATGACCTGGACTACATGAGCACTGCGAACGACTTCACGTACGACAAGAAGAAGTATGAAGGCCTGCCGCAGTTCGTTGACGACTTGCATCAGAAAGGAATGCACTACGTAGTGCTCGTCGATCCCGGGGTGAGTGCGTCCGAAACTCCGGGCAGCTACCCGCCTTTCGACCGGGGATTGGAAATGGACGTGTTCGTGAAGAACTCCACCGACCAGCCCTTCGTGGGGAAGGTCTGGAATCCAAAGTCGACGGTGTGGCCGGACTTCACCAACCCCAACGCGTCGGTTTACTGGAAGGAAATGTTGGAGGAGTTTTATAAGCT
GGTTAAATTCGACGGAGTGTGGATCGATATGAACGAGCCGTCCAACTTCCTGTCAGGGTCTATGTACGGTGAATGTGACCCCGAGGACCTTCCCTATACCCCCGCGGAGACTCCTCAGGAGGGTCTCAAGTATAAGACCCTGTGTATGGACGCCAAGCATTACGCGGGGAAGCATTACGACGTGCACAACGTCTACGCCATGGCGGAGGCCGTGGTCACATTCAATGCTATGCGTGAGGTCCGTGGTAAGCGTCCGTTGGTGTTGTCTCGAGCGTCCAGTCCCGGCCTGGGAAGAGTTGCTGCCCACTGGAGTGGAGACGTCTACAGCAAATGGCACGACCTCAAGATGTCTATACCCGCCCTGCTGAGTTTTAGCTTGTTCGGCGTGCCACTGATGGGTTCCGACATCTGTGGCTTCATCGGCGATACTTCTGAAGAGCTTTGCAAGAGATGGATGCAGCTTGGAGCTTTCTATCCATTCTCACGGAACCATAATTCCAATGAAGCCAAGCCCCAGGATCCCGTGGCCATGGGAGCGGGCGTGGTGCGAGCGAGTAGAAATGCGCTCCGCACGAGGTACCGCATGCTGCCATACTATTACACACTCTTCTGGAAGGCCCACGTGGCGGGGGAAACGGTCGCCAGGCCGCTGTTCATGGAGTTCCCATCTCTGAGTAAAGTCCACTCAATCGATGAGCAGTTCATGTTGGGTCCGCACGTGTTGGTGAGCCCTATACTCATCCCCGGTAACTCGACCACGGCGTTGTTCCCCTCCACCACTTGGTACAGCTTCCTGGATGGAAGATACCTGGCCAGAGACCGATGGATGGAAATCGGAGAAGGGGATATCATATCCATCAGGGCGGGTGCGATCCTTCCACTCCAAGAGCCGCCGTCCAAGGGACCCGTGAACACGGTCGTGAGCCGCAGCGGCCCTCTCCAGCTGTTGGTGGTTCCCGATAAAGAAGGAGCGGCTCACGGGCAGCTCTACTGGGACGACGGAGACAGCATCAATACCTATGAAGAGAAAAAGTATAGCCACATCGATTTCATTGTGAAGAACAATGAGCTACAGAATATAGTACAGTGGTGGGGATATGGGGTTCCATCTCTCAACTCTATCTCCATACTGGGGATGAAGCCCTTAAAGTCCTTGACCATCAACGACATCCCCACCAAATACACATATATTAACAAAACCCAAGTGGTTACTATCTCTTCCATAAATCTGCCATTAGATAAAACATTTCGTGTAAAATGGACCTACCAAAAAACAGGAAAAATATAA

Protein sequence:

>DPOGS212394-PA
MPKIPYKPEKGREEDEDYEIVSFEDFCDKPPGTSTDLLTLNDNINYRLHYETDKASNFPEAGEPSTRDIGMHTETSLNGPKNASYKRKLSFAPFGRTDKNSRGPLFSGVIPKPRNEGESREHRYERFSPRGGWLARTWEQLGNLLPGMLATALLSALCVGAWWAVGGALSGSWGDDHYRRLYERAHPDDIKKPLSPVIEKIIPTENRYHDHNNLSTKNKNITEASYKKDNKKSDYGDLDHQCGDVSDSMRFDCHPQGGASEEACTKRGCCWGATAVQGAPYCYYPKHYPSYRFMNSTENKHSMTVYYAHGLDTGYPGQWGTVMVTFNYLADDVLQIKMTDANNKRFEPPYPEVPVVSGRVTSLQYRVLVDSAAVGFKVIRTEDNVTIVDTQNVGGLILSEKFLQLSSVLPTDHVYGLGEKQAPLLNNFNWNTFTLFNSDMPPIENKSLYGTHPFYLALERNGKSHGMLLLNSNAMDIVLQPSPAITYRAVGGVLDFLVMMGPSPSQVVSQLTSLIGRPFMPPYWALGFHLCKYDYGSLNTTRQVMQRNIDAGIPLDAQWNDLDYMSTANDFTYDKKKYEGLPQFVDDLHQKGMHYVVLVDPGVSASETPGSYPPFDRGLEMDVFVKNSTDQPFVGKVWNPKSTVWPDFTNPNASVYWKEMLEEFYKLVKFDGVWIDMNEPSNFLSGSMYGECDPEDLPYTPAETPQEGLKYKTLCMDAKHYAGKHYDVHNVYAMAEAVVTFNAMREVRGKRPLVLSRASSPGLGRVAAHWSGDVYSKWHDLKMSIPALLSFSLFGVPLMGSDICGFIGDTSEELCKRWMQLGAFYPFSRNHNSNEAKPQDPVAMGAGVVRASRNALRTRYRMLPYYYTLFWKAHVAGETVARPLFMEFPSLSKVHSIDEQFMLGPHVLVSPILIPGNSTTALFPSTTWYSFLDGRYLARDRWMEIGEGDIISIRAGAILPLQEPPSKGPVNTVVSRSGPLQLLVVPDKEGAAHGQLYWDDGDSINTYEEKKYSHIDFIVKNNELQNIVQWWGYGVPSLNSISILGMKPLKSLTINDIPTKYTYINKTQVVTISSINLPLDKTFRVKWTYQKTGKI-
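
The transcript and protein in this record can be checked against each other: 1095 codons plus one stop codon give 3288 bp, and the protein sequence marks the stop with a trailing "-". The following is a minimal sketch of that check, assuming the standard genetic code (NCBI translation table 1). The names `CODON_TABLE`, `translate`, and `expected_cds_length` are illustrative and not part of the database; only the first and last few codons of the transcript are used here to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` ("-" matches the
    convention used in the protein sequence of this record).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_cds_length(protein_length):
    """Length in bp of a CDS encoding the protein plus one stop codon."""
    return (protein_length + 1) * 3


# First ten and last five codons of DPOGS212394-TA, copied from the record.
print(translate("ATGCCTAAGATACCTTACAAACCTGAAAAG"))  # MPKIPYKPEK
print(translate("ACAGGAAAAATATAA"))                 # TGKI-
print(expected_cds_length(1095))                    # 3288
```

Applied to the full transcript, the same function should reproduce the DPOGS212394-PA sequence if the record is internally consistent; that full comparison is not run here.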