Monarch geneset OGS2.0

DPOGS204061
Transcript    DPOGS204061-TA    1674 bp
Protein    DPOGS204061-PA    557 aa
Genomic position    DPSCF300200 - 94882-98724
RNAseq coverage    6x (Rank: top 87%)
Annotation
Heliconius    HMEL013136    0.0    77.23%
Bombyx    BGIBMGA010812-TA    0.0    66.33%
Drosophila    CG9701-PA    5e-130    46.77%
EBI UniRef50    UniRef50_O61594    0.0    68.51%    Beta-glucosidase n=5 Tax=Obtectomera RepID=O61594_SPOFR
NCBI RefSeq    XP_557100.2    7e-139    48.95%    AGAP006426-PA [Anopheles gambiae str. PEST]
NCBI nr blastp    gi|2970687    0.0    68.51%    beta-glucosidase precursor [Spodoptera frugiperda]
NCBI nr blastx    gi|2970687    0.0    66.73%    beta-glucosidase precursor [Spodoptera frugiperda]
Group
Gene Ontology    GO:0004553    6.7e-215    hydrolase activity, hydrolyzing O-glycosyl compounds
    GO:0005975    6.7e-215    carbohydrate metabolic process
    GO:0043169    2e-164    cation binding
    GO:0003824    2e-164    catalytic activity
KEGG pathway    tca:664577    1e-117
 K05350 (bglB) maps-> Starch and sucrose metabolism
    Phenylpropanoid biosynthesis
    Cyanoamino acid metabolism
InterPro domain    [72-548]    IPR001360    6.7e-215    Glycoside hydrolase, family 1
    [74-538]    IPR013781    2e-164    Glycoside hydrolase, subgroup, catalytic core
    [73-551]    IPR017853    4.1e-154    Glycoside hydrolase, superfamily
Orthology group    MCL10040    Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204061-TA
ATGTTGAGGGTGAGGTCGTTGATCGTTGGGTCGGATGGTTTGAGACCTAGAGTAGAAGCAGAATTATCGTCGATTGATTGTTTCGCGTTATCCCTGATTTGCGTGTTTTCATCGTCCCTGTCTTTGTCAGAACTTTCTGCTTTAAGGTCTTCGCTTTATTGGGCGAAGCCAATGGTTGCTTTGGAAATCTTGTTAATCGGCAGCCATGCTCAAGAAAGAAGATTTCCTGAGGACTTCATGTTCGGGGCTGCCACATCAGCATATCAGATAGAAGGAGGATGGAGCGCTGATGACAAAGGAGAGAATATATGGGATCGTTTGACTCACACCAAACCTAACGTAATCAAGGATGTGAGCAATGGTGATGTTGCAGCCGACACATACAATAACTACAAACGTGATGTGGAGATGATGAGGGAGTTGGGGCTAGATGCTTACAGGTTCTCTCTCTCCTGGTCTAGAATACTACCCAATGGCCTGGCCAACAAAGTCAGCGATGCCGGGGTTGAGTTTTACAACAACTATATAGATGAAATGATCAAATACGGTATAAAGCCCATGGTCACTCTGTACCACTGGGACTTGCCACAGAAGTTACAAGATTTGGGAGGATTCATGAATCCATTATTCCCCGAGTGGTTTGAAGATTACGCCCGGGTGGTCTTTGAAAAGTTTGGAGACAGAGTCAAGCACTGGATTACTTTCAATGAACCCAGAGAAATCTGTTTCGAAGGCTATGGTTCAGCAACCAAAGCGCCTATCCTAAATGCAACCGACGTCGGTGTTTATTACTGTGCCAAAAATCTGGTTATGGGTCACGCTAGAGCTTATTACGCATATGTCAATGACTTCAAGCCGAGCCAAGAAGGTGTCTGTGGTATCACAATAAGTGTGAATTGGTTCGGGGCGTTGACAGATTCCGAGGAAGATCAATTTGCTGCCGAAATGAAGAGACAAGCAGAATGGGGGCTCTATGCTGAACCTATTTTCTCTGAAGAGGGTGGGTTTCCTAAGGAATTAGCAGAAATTGTGGCCAAAAAAAGCGCTGAACAGGGTTATCCTCAATCTCGTATGCCAGCATTCTCTGATGAAGAGAAGGATTTCGTAAAGGGCGCTTTTGATTTCTTTGGAGTAAATCATTACTCAGGCAGCTTTGTATCTGCAACTGAATATAAGACTAACCACCCAGTGCCGTCTTTATATGATGATGTTGATGTTGGAAGCTACACTCCGCCGGAGTGGCCAAAATCTGCTTCTTCGTGGTTAGTTCAAGCACCAAACAGTGTTTACAATGCCCTCACTCACCTTCACAAGAAGTACAACGGTCCCATACTCTACATCACGGAGAACGGCTGGTCCTCGTCTCCGGAAGCTGATATCCTTGATGATGATAGGATTAGATACTACCGAGCGGCTTTGAACAGTGTGCTCGATACCTTGGAGGCTGGAGTGGATCTACGAGGGTACATGGCATGGAGTCTGATGGACAACTTTGAGTGGAATGCTGGTTACACAGAACTTCTTGGCCTGTACCGTGTCAACTTCTCGGACCCAGGTCGTGAGAGAACTCCTCGTAAGTCAGCCTTCGTTTACAAACAGATCATCAAGAGTCGGATGATTGATGAAGAATATGAACCTGATACCCTGGACATGACCATTGATGAAGGGAACTGA

Protein sequence:

>DPOGS204061-PA
MLRVRSLIVGSDGLRPRVEAELSSIDCFALSLICVFSSSLSLSELSALRSSLYWAKPMVALEILLIGSHAQERRFPEDFMFGAATSAYQIEGGWSADDKGENIWDRLTHTKPNVIKDVSNGDVAADTYNNYKRDVEMMRELGLDAYRFSLSWSRILPNGLANKVSDAGVEFYNNYIDEMIKYGIKPMVTLYHWDLPQKLQDLGGFMNPLFPEWFEDYARVVFEKFGDRVKHWITFNEPREICFEGYGSATKAPILNATDVGVYYCAKNLVMGHARAYYAYVNDFKPSQEGVCGITISVNWFGALTDSEEDQFAAEMKRQAEWGLYAEPIFSEEGGFPKELAEIVAKKSAEQGYPQSRMPAFSDEEKDFVKGAFDFFGVNHYSGSFVSATEYKTNHPVPSLYDDVDVGSYTPPEWPKSASSWLVQAPNSVYNALTHLHKKYNGPILYITENGWSSSPEADILDDDRIRYYRAALNSVLDTLEAGVDLRGYMAWSLMDNFEWNAGYTELLGLYRVNFSDPGRERTPRKSAFVYKQIIKSRMIDEEYEPDTLDMTIDEGN-