Monarch geneset OGS2.0

DPOGS201333
Transcript        DPOGS201333-TA  1272 bp
Protein           DPOGS201333-PA  423 aa
Genomic position  DPSCF300176 + 625198-629843
RNAseq coverage   6x (Rank: top 87%)
Annotation
Heliconius        HMEL012399        2e-130  57.11%
Bombyx            BGIBMGA010811-TA  9e-132  50.67%
Drosophila        CG9701-PA         4e-61   49.79%
EBI UniRef50      UniRef50_G6DAN3   0.0     100.00%  Glycoside hydrolase n=10 Tax=Obtectomera RepID=G6DAN3_DANPL
NCBI RefSeq       XP_001183226.1    3e-90   43.40%   PREDICTED: similar to lactase-phlorizin hydrolase [Strongylocentrotus purpuratus]
NCBI nr blastp    gi|115710020      6e-89   43.40%   PREDICTED: similar to lactase-phlorizin hydrolase [Strongylocentrotus purpuratus]
NCBI nr blastx    gi|115710020      5e-90   43.40%   PREDICTED: similar to lactase-phlorizin hydrolase [Strongylocentrotus purpuratus]
Group
Gene Ontology     GO:0004553  4.9e-179  hydrolase activity, hydrolyzing O-glycosyl compounds
                  GO:0005975  4.9e-179  carbohydrate metabolic process
                  GO:0043169  7.6e-72   cation binding
                  GO:0003824  7.6e-72   catalytic activity
KEGG pathway      ate:Athe_0458  1e-63
                  K05350 (bglB) maps-> Starch and sucrose metabolism
                                       Phenylpropanoid biosynthesis
                                       Cyanoamino acid metabolism
InterPro domain   [29-416]  IPR001360  4.9e-179  Glycoside hydrolase, family 1
                  [29-417]  IPR017853  3.6e-133  Glycoside hydrolase, superfamily
                  [30-189]  IPR013781  7.6e-72   Glycoside hydrolase, subgroup, catalytic core
Orthology group 
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201333-TA
ATGTTCGGCGCTGTGAGATTAGTATTCCTGGCATTTATATTGGCTGTACTTGCAAATAGCAAAAAAATCTCTCGACATGAAGCGAGAAAAATACCCGACCACTTACTTTTCGGAGCTGCTACGGCATCGTACCAAATAGAAGGCACTTGGAATGAAGACGGAAAATCTGAAAATATATGGGATCGCGTATCACACAGGGAACCTTGTGTTGTCGACAACTGCGACACAGGTGACCTTGCCGATGATTCGTATCATCAATATAAGCGTGATGTGGAAATGATGCGGGAACTAGGTCTCGACTTCTATAGGTTCTCTCTCTCCTGGACGAGAATATTACCAACGAGTTTTCCAGACCAAATAAATGAAAAAGGAGTACAATATTATAATAATTTGATAAATGAGATGCTCAAATACAACATACAACCCATGGTGACTCTTTATCACTGGGATTTACCTCAGAAGTTGCAAGATCTGGGAGGATGGACCAATCCCCATATCGTTGATTGGTTTACCGATTACTCCAGAGTAGTGTTCCAGTTATTTGGAGATAGGGTTAAGTATTGGTGGGGACTGTATGCAAATCCAATATTTTCCGAATTTGGGGACTATCCAGCAGTCATGAAAGATAGAATAGCAGCAAAGAGTAAGAAACAAGGATTTCCAAGATCGCGATTACCACAATTCACTCCTGAAGAAATAGATTTAATTAAAGGAAGTTCGGATTTCATTGGATTAAATCATTATACTACTAACATTGTTTATAGGAACGAATCTGTTTATGGATATTATAGTTCGCCATCTTTTTATGATGATATTGAAGTAATAAGTTATCAAGATAGTTCCTGGGAGTCAGCTGCTTCCAACTGGTTAAAGAGTGTACCCTGGGGATTCTATAAGTTATTAACAAAAATACGAGAGGACTACAACAACCCGCCAGTTTTCATCACTGAGAATGGATTCTCAACCCGAGGTGGTCTAATTGACGACGACCGCGTAAAGTATTACAGAACATACATAGATGCTATGCTCGATGCTATTGAAGATGGATCAGATATAAGAGTTTATGCAGCGTGGAGTTTGATGGACAATTTCGAATGGATGAGGGGATACAGCGAACGTTTCGGACTGTACGAGGTGGACTACGAGAGTCCTGACCGCACCCGAACTCCTCGCAAGTCTGCTTACGTATACAAAGAGATGCTGCGCACACGAACACTGGACTATCATTATGAACCTGATATGAGCTTGGGAATGAATGTCGATGATAATTAA

Protein sequence:

>DPOGS201333-PA
MFGAVRLVFLAFILAVLANSKKISRHEARKIPDHLLFGAATASYQIEGTWNEDGKSENIWDRVSHREPCVVDNCDTGDLADDSYHQYKRDVEMMRELGLDFYRFSLSWTRILPTSFPDQINEKGVQYYNNLINEMLKYNIQPMVTLYHWDLPQKLQDLGGWTNPHIVDWFTDYSRVVFQLFGDRVKYWWGLYANPIFSEFGDYPAVMKDRIAAKSKKQGFPRSRLPQFTPEEIDLIKGSSDFIGLNHYTTNIVYRNESVYGYYSSPSFYDDIEVISYQDSSWESAASNWLKSVPWGFYKLLTKIREDYNNPPVFITENGFSTRGGLIDDDRVKYYRTYIDAMLDAIEDGSDIRVYAAWSLMDNFEWMRGYSERFGLYEVDYESPDRTRTPRKSAYVYKEMLRTRTLDYHYEPDMSLGMNVDDN-
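
Consistency check: the 1272 bp transcript corresponds to 424 codons, that is 423 residues plus a terminal stop codon, which the protein record writes as "-". The following is a minimal Python sketch of that check, assuming the standard genetic code applies to this nuclear gene. The names CODON_TABLE and translate are illustrative helpers and are not part of the database.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: the standard table applies to this transcript).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence in frame 0; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First 30 nt and last 12 nt of DPOGS201333-TA, copied from the record above.
head = "ATGTTCGGCGCTGTGAGATTAGTATTCCTG"
tail = "GATGATAATTAA"

print(translate(head))  # MFGAVRLVFL
print(translate(tail))  # DDN*
print(1272 // 3 - 1)    # 423 residues once the stop codon is excluded
```

Passing the full nucleotide sequence to translate should give the 423 residues of DPOGS201333-PA followed by "*" where the record shows "-".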