Monarch geneset OGS2.0

DPOGS210283
Transcript: DPOGS210283-TA  1449 bp
Protein: DPOGS210283-PA  482 aa
Genomic position: DPSCF300216 + 248100-250912
RNAseq coverage: 30x (Rank: top 76%)
Annotation
Heliconius: HMEL003519  0.0  70.04%
Bombyx: BGIBMGA010536-TA  2e-149  53.47%
Drosophila: CG9701-PA  3e-123  44.51%
EBI UniRef50: UniRef50_G6DAN3  2e-127  49.23%  Glycoside hydrolase n=10 Tax=Obtectomera RepID=G6DAN3_DANPL
NCBI RefSeq: XP_001850321.1  4e-128  45.53%  glycoside hydrolase [Culex quinquefasciatus]
NCBI nr blastp: gi|364023585  9e-138  50.42%  seminal fluid protein CSSFP001 [Chilo suppressalis]
NCBI nr blastx: gi|364023613  7e-139  50.84%  seminal fluid protein CSSFP031 [Chilo suppressalis]
Group
Gene Ontology
 GO:0004553  5.2e-201  hydrolase activity, hydrolyzing O-glycosyl compounds
 GO:0005975  5.2e-201  carbohydrate metabolic process
 GO:0043169  9e-153  cation binding
 GO:0003824  9e-153  catalytic activity
KEGG pathway: tca:664577  2e-109
 K05350 (bglB) maps to:
    Starch and sucrose metabolism
    Phenylpropanoid biosynthesis
    Cyanoamino acid metabolism
InterPro domain
 [2-477]  IPR001360  5.2e-201  Glycoside hydrolase, family 1
 [3-465]  IPR013781  9e-153  Glycoside hydrolase, subgroup, catalytic core
 [2-479]  IPR017853  7e-146  Glycoside hydrolase, superfamily
Orthology group: MCL10040  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210283-TA
ATGAGGAAACTACCAAACGGGTTAAAGATAGGGGTTGCTACGGCATCTTATCAAATCGAAGGGGGTTGGAATGCTGGCGATAAAACACCGAGTATTTGGGACACTCATTGTCACAAAGAACCATGCCCCGTAAAAGATAATACCAGTGGAGATGACACTTGCGAATCATTCAAATACTACAAACGTGATCTGGAGATGATAAAATTTTTGGGATTCCACTTCTACAGATTTTCCATATCATGGCCAAGATTACTTCCTGACGGATTTACAAACAGAATCAGCGAGACCGGCCGCGAATACTACAATAATCTGATCAACGGATTACTTGAAAATAATATTGAACCGATAATTACTTTGTATCATTGGGATCTGCCGCAGACACTTCAAGAACTCGGAGGTTGGAGCAATCCTCTCATAGTGGACTGGTTCGGTGACTACGCCGCTGTCGCATACCAACTTTTTGGAGACAGAGTTAAAACCTGGATAACGATCAATGAACCGAAACAAATCGGTGTTTTCGGTTACGGAATGACCAGAATGGCTCCAGCCCTAAATATATCCGGGATAGCAGATTATATAGCTGCTAAAAATATGGTGTTAGCACATGCCCGAGCCTGGCATATATACGATAAACAATTTAGATCTACCCAAAAAGGAACATGCGGCATCACCATAGCAACCGATTTTCGTGTCGGACTATCTGACTCTCGTGATGATGTCGAAGCTGGTCTCGACGCTATGGATTTTGAAGTAGGATTATACAGCCATCCTATATTCACATCAAAGGGTGGTTTTCCTGAACGAGTTATCCAAAGAGTAGCAGAAAAAAGTAAAGAACAAGGTTACACTAGAAGTCGACTGCCAGATTTTAGTGACGAAGAAATTGAGTACGCTAAAGGAACCAGTGATTTTTATGGCTTCAATCATTATTCGACGAAATTTTTCACAAGGGACACTTACACGCCTGGAAAACATCCAATACCCTCGTATGATGATGATATTGGTGCAGATTTTACTTACTTGGACTATGAAAAAGGTGCAGTGCCTCATGTCACAGTAATTCCACACGGAATCAGAAAAGCCTTGAAATGGGTGAAAGAAAACTGTAACAATCCACCAATAATGATAACCGAGAATGGTTTCGCCACTTTTGGCGGTTTGGAAGATATGGATAGAATATTCTATTTTAGGAAATATCTTTACTCGATTTTGGACGCCATTGAAATTGACGGCTGCAATGTTACGTCATATACAGTGTGGAGTTTAATGGACAATTTTGAATGGGATAGTGGATTAAGTGTTAAATTTGGACTATTCGAAGTCGATTTTGAGGATGAAAAGAAGACCAGAACGGCAAGATTGTCGGCTTTGTGGTTTAAAAGACTCATAAAGACAAAATGTCTAGATCTGGAACACATACCGGAAATGGAAGAGAAAATCCACTTTTAA

Protein sequence:

>DPOGS210283-PA
MRKLPNGLKIGVATASYQIEGGWNAGDKTPSIWDTHCHKEPCPVKDNTSGDDTCESFKYYKRDLEMIKFLGFHFYRFSISWPRLLPDGFTNRISETGREYYNNLINGLLENNIEPIITLYHWDLPQTLQELGGWSNPLIVDWFGDYAAVAYQLFGDRVKTWITINEPKQIGVFGYGMTRMAPALNISGIADYIAAKNMVLAHARAWHIYDKQFRSTQKGTCGITIATDFRVGLSDSRDDVEAGLDAMDFEVGLYSHPIFTSKGGFPERVIQRVAEKSKEQGYTRSRLPDFSDEEIEYAKGTSDFYGFNHYSTKFFTRDTYTPGKHPIPSYDDDIGADFTYLDYEKGAVPHVTVIPHGIRKALKWVKENCNNPPIMITENGFATFGGLEDMDRIFYFRKYLYSILDAIEIDGCNVTSYTVWSLMDNFEWDSGLSVKFGLFEVDFEDEKKTRTARLSALWFKRLIKTKCLDLEHIPEMEEKIHF-
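
Length consistency check:

The lengths reported in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the database. The function names are hypothetical, and it assumes that the transcript shown is the coding sequence including its terminal stop codon (the nucleotide sequence ends in TAA and the protein ends in "-"), and that the genomic coordinates are 1-based and inclusive.

```python
def expected_protein_length(cds_bp: int, has_stop_codon: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases.

    Assumption: the CDS is in frame, so its length is a multiple of 3.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if has_stop_codon else codons


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range (an assumption here)."""
    return end - start + 1


# Values copied from the record above.
transcript_bp = 1449
protein_aa = 482
genomic_start, genomic_end = 248100, 250912

print(expected_protein_length(transcript_bp) == protein_aa)  # True
print(span_length(genomic_start, genomic_end))                # 2813
```

Under these assumptions, 1449 bp corresponds to 483 codons, that is 482 residues plus one stop codon, which matches the reported protein length. The genomic span of 2813 bp is longer than the transcript. Non-coding sequence such as introns would account for the difference, although the record does not state the gene structure.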