Monarch geneset OGS2.0

DPOGS210178
Transcript: DPOGS210178-TA (1308 bp)
Protein: DPOGS210178-PA (435 aa)
Genomic position: DPSCF300393 + 77847-81649
RNAseq coverage: 6x (Rank: top 87%)

Annotation
  Heliconius: HMEL014468; 7e-108; 49.27%
  Bombyx: BGIBMGA014144-TA; 1e-105; 48.56%
  Drosophila: CG9701-PA; 3e-75; 38.12%
  EBI UniRef50: UniRef50_D9HQ54; 4e-87; 59.68%; Seminal fluid protein HACP047 (Fragment) n=1 Tax=Heliconius erato RepID=D9HQ54_9NEOP
  NCBI RefSeq: XP_970224.1; 7e-82; 41.28%; PREDICTED: similar to beta-glucosidase [Tribolium castaneum]
  NCBI nr blastp: gi|299930653; 1e-86; 59.68%; seminal fluid protein HACP047 [Heliconius erato]
  NCBI nr blastx: gi|364023613; 6e-93; 43.80%; seminal fluid protein CSSFP031 [Chilo suppressalis]

Group:

Gene Ontology
  GO:0004553; 5e-133; hydrolase activity, hydrolyzing O-glycosyl compounds
  GO:0005975; 5e-133; carbohydrate metabolic process
  GO:0043169; 4.6e-108; cation binding
  GO:0003824; 4.6e-108; catalytic activity

KEGG pathway
  tca:664577; 8e-74
  K05350 (bglB) maps to:
    Starch and sucrose metabolism
    Phenylpropanoid biosynthesis
    Cyanoamino acid metabolism

InterPro domain
  [23-401] IPR001360; 5e-133; Glycoside hydrolase, family 1
  [23-400] IPR013781; 4.6e-108; Glycoside hydrolase, subgroup, catalytic core
  [23-400] IPR017853; 3.3e-98; Glycoside hydrolase, superfamily

Orthology group: MCL34826 (Lepidoptera specific)
Genotypes for resequenced monarchs and outgroup Danaus species
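
The genomic position field of this record (DPSCF300393 + 77847-81649) can be split into scaffold, strand, and coordinates with a few lines of Python. The sketch below is illustrative only: the function name `parse_position` is hypothetical, and it assumes the coordinates are 1-based and inclusive, which the record does not state.

```python
import re

def parse_position(text):
    """Split a 'SCAFFOLD STRAND START-END' string into its four parts.

    Hypothetical helper; the field layout is inferred from this record only.
    """
    match = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return scaffold, strand, int(start), int(end)

scaffold, strand, start, end = parse_position("DPSCF300393 + 77847-81649")

# Assumption: coordinates are inclusive, so the span includes both endpoints.
span = end - start + 1
print(scaffold, strand, span)
```

Under that assumption the locus spans 3803 bp, which is longer than the 1308 bp transcript. This would be consistent with the gene containing introns, although the record itself does not list an exon structure.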

Nucleotide sequence:

>DPOGS210178-TA
ATGTACTATAATCCGTGTATGATCTGCTCTGTTATTTGTAAAATATATACGTCTATACGTCTATCATCAGGTAAGGGTCCCAGCGTTTGGGACGATTACGTCCACGAGAATCGTGTGAAAATTAAGGATAATTCAAACGGAGATGTCGCGGCTGATTCCTACCATTTGTGGAAAGAAGACATAAAGATAACAAAGGAATTGGGTCTGCACTTTTATCGTTTCTCAATAAACTGGCCAAGGATTCTGCCAACTGGTTTTTCAAATAAAATAAACAAAGCTGGTGTGAAATATTACAATGAACTTATAGATGGTCTTGTGAGTGCTGGTGTTGAACCTGTCGTCACTCTCTATCATTGGGAGACGCCTATTATAATCCACAAACTTGGTGGGTGGACAAATCCTTTGATAGTGAAATGGTTTGCACATTACGCCAGAATCGTGTTTTCCCTTTTCGGTGACAGAGTTAAAACCTGGATAACAATAAATGAAGCGAACGTTCAATGTGATTATTTTTACAACTCTGGAATATTCATTACTGCTAAGGAAGATGTCTTTGCACCATTTCTGTGCAATAAACACATTTTAATGGCGCATGCGCATGCGTACAGGATATATGAAAAAGAGTTTAAACCTAAGTATGGAGGGAGTGTATCTTTGGCTAATAATTTTCTGTGGCTGGACCCATACATCTCGAATCACGAAGAACTTGCTGAGCTCGGCAGAGAACACGCGATTGGGAGATATTCCCATCCAATCTATTCCAAAAAGGGTGGTTGGCCTCCCCTACTAGAAAAAGTCCTACTGGAGTATAGTTTGAAACAAGGATACAAGGAATCCAGATTACCAACATTTACGAAACAAGAGAAGGAATTTGTAAGAGGCACGGCTGATTTTTACGGCGTGAACTATTATACGTCTAATTTGATCAGGCCAATTAAACCCGGCGAAGATCCCGGATATTTCTTCATAACAGGAGTACCGGAACTGAACGCCATTTTGGTACATCCGAATAACACTTGGTATGGGGCTCTAGATATATTACCGGTGTATCCGCTAGGTCTACGCCGCTCATTGTCTTGGTTGAAGAAAAGCTACGGTGATATCGATATTCTTATAACAGAATGTGGATTCTCAACCGCAGGATACGATCTCAAAGATTACAAAAGAACTAACTTCTACAGAGACCACTTAGAACAGGCGATAGTAAATTTGGTCTGTACGAAGTTAACTTTGAAGATCCTAAAAGAAGAAGGACTCCGAGAAACTCAGCACATTACTATTCGTGTGTGGCGAAAAATAGATCATTAA

Protein sequence:

>DPOGS210178-PA
MYYNPCMICSVICKIYTSIRLSSGKGPSVWDDYVHENRVKIKDNSNGDVAADSYHLWKEDIKITKELGLHFYRFSINWPRILPTGFSNKINKAGVKYYNELIDGLVSAGVEPVVTLYHWETPIIIHKLGGWTNPLIVKWFAHYARIVFSLFGDRVKTWITINEANVQCDYFYNSGIFITAKEDVFAPFLCNKHILMAHAHAYRIYEKEFKPKYGGSVSLANNFLWLDPYISNHEELAELGREHAIGRYSHPIYSKKGGWPPLLEKVLLEYSLKQGYKESRLPTFTKQEKEFVRGTADFYGVNYYTSNLIRPIKPGEDPGYFFITGVPELNAILVHPNNTWYGALDILPVYPLGLRRSLSWLKKSYGDIDILITECGFSTAGYDLKDYKRTNFYRDHLEQAIVNLVCTKLTLKILKEEGLRETQHITIRVWRKIDH-
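
The two sequences above can be checked against each other by translating the transcript with the standard genetic code. The following is a minimal sketch, not the pipeline used to build this geneset: the `translate` function is a hypothetical helper, it assumes the transcript is a coding sequence in frame from its first base, and it writes the stop codon as "-" to match the protein record. Only the first 30 and last 15 bases of DPOGS210178-TA are used, to keep the example short.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stop codons become stop_symbol.

    Raises KeyError on ambiguous bases such as N (not handled in this sketch).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(CODON_TABLE[c].replace("*", stop_symbol) for c in codons)

# First 30 and last 15 bases of DPOGS210178-TA, copied from the record.
head = translate("ATGTACTATAATCCGTGTATGATCTGCTCT")
tail = translate("AAAATAGATCATTAA")
print(head, tail)  # MYYNPCMICS KIDH-

# Reported lengths: 435 residues plus one stop codon account for 1308 bases.
assert 3 * (435 + 1) == 1308
```

Both fragments agree with the start (MYYNPCMICS) and end (KIDH-) of DPOGS210178-PA, and the reported lengths are consistent with a 435-residue protein followed by a single stop codon.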