Monarch geneset OGS2.0

DPOGS204060
Transcript: DPOGS204060-TA, 1266 bp
Protein: DPOGS204060-PA, 421 aa
Genomic position: DPSCF300200 - 102654-104589
RNAseq coverage: 23x (Rank: top 78%)
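The lengths in the fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the genomic coordinates are 1-based and inclusive (the record does not say), and the variable names are hypothetical.

```python
# Values copied from the record above.
transcript_bp = 1266
protein_aa = 421
start, end = 102654, 104589  # ASSUMPTION: 1-based, inclusive coordinates

# A coding sequence should divide evenly into codons.
assert transcript_bp % 3 == 0
codons = transcript_bp // 3

# 421 residues plus one stop codon accounts for every codon.
has_stop_codon = codons == protein_aa + 1

# Genomic span under the inclusive-coordinate assumption.
genomic_span = end - start + 1

# Bases inside the span that are not in the transcript.
extra_bp = genomic_span - transcript_bp

print(codons, has_stop_codon, genomic_span, extra_bp)
# prints: 422 True 1936 670
```

Under these assumptions the locus spans 1936 bp, of which 670 bp are absent from the 1266 bp transcript. That is consistent with the gene containing one or more introns, although the record itself does not list the exon structure.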
Annotation
Heliconius      HMEL013136        0.0     81.84%
Bombyx          BGIBMGA010812-TA  1e-180  68.41%
Drosophila      CG9701-PA         3e-103  45.54%
EBI UniRef50    UniRef50_O61594   0.0     70.92%  Beta-glucosidase n=5 Tax=Obtectomera RepID=O61594_SPOFR
NCBI RefSeq     XP_001850321.1    6e-113  47.57%  glycoside hydrolase [Culex quinquefasciatus]
NCBI nr blastp  gi|364023583      0.0     73.87%  seminal fluid protein CSSFP016 [Chilo suppressalis]
NCBI nr blastx  gi|364023583      0.0     73.87%  seminal fluid protein CSSFP016 [Chilo suppressalis]
Group
Gene Ontology
GO:0004553  5.1e-179  hydrolase activity, hydrolyzing O-glycosyl compounds
GO:0005975  5.1e-179  carbohydrate metabolic process
GO:0043169  2.3e-137  cation binding
GO:0003824  2.3e-137  catalytic activity
KEGG pathway
tca:664577  1e-95
K05350 (bglB) maps to: Starch and sucrose metabolism; Phenylpropanoid biosynthesis; Cyanoamino acid metabolism
InterPro domain
[1-412]  IPR001360  5.1e-179  Glycoside hydrolase, family 1
[1-402]  IPR013781  2.3e-137  Glycoside hydrolase, subgroup, catalytic core
[1-415]  IPR017853  2.7e-125  Glycoside hydrolase, superfamily
Orthology group: MCL10040, Multiple-copy universal gene

Nucleotide sequence:

>DPOGS204060-TA
ATGATGAGGGAGTTGGGGCTAGATGCTTACAGGTTCTCTCTCTCCTGGTCTAGAATACTACCCAATGGCCTGGCCAACAAAGTCAGCGATGCCGGGGTTGAGTTTTACAACAACTATATAGATGAAATGATCAAATACGGTATAAAGCCCATGGTCACTCTGTACCACTGGGACTTGCCACAGAAGTTACAAGATTTGGGAGGATTCATGAATCCATTATTCCCCGAGTGGTTTGAAGATTACGCCCGGGTGGTCTTTGAAAAGTTTGGAGACAGAGTCAAGCACTGGATTACTTTCAATGAACCCAGAGAAATCTGTTTCGAAGGCTATGGTTCAGCAACCAAAGCGCCTATCCTAAATGCAACCGACGTCGGTGTTTATTACTGTGCCAAAAATCTGGTTATGGGTCACGCTAGAGCTTATTACGCATATGTCAATGACTTCAAGCCGAGCCAAGAAGGTGTCTGTGGTATCACAATAAGTGTGAATTGGTTCGGGGCGTTGACAGATTCCGAGGAAGATCAATTTGCTGCCGAAATGAAGAGACAAGCAGAATGGGGGCTCTATGCTGAACCTATTTTCTCTGAAGAGGGTGGTTTTCCTAAGGAATTAGCTGAAATTGTGGCCAAAAAAAGCGCTGAACAGGGTTATCCTCGATCTCGTATGCCAGAATTCTCTGATGAAGAGAAGGATTTCGTAAAAGGCACTGCTGACTTTTTAGGAGTAAATCATTACACAGCCGGCTTAGTATCTGCAACTGAATATAAGACTCACCACCCAGTGCCGTCTTTATATGATGATATTGATGTAGGAAGCTACACTCCGCCGGAGTGGCCAAAATCTGCTTCATCTTGGTTAAAATTAGCACCAAACAGTATTTACAATGCCCTCACTCACCTTCACAAGAAGTACAACGGTCCCATATTCTACATCACGGAGAACGGCTGGTCCTCGCCTCCGGAAGCTGATATCCTTGATGATGACAGGATTAGATACTACCGAGCGGCTTTGAACAGTGTGCTCGATACCTTGGAGGCTGGAGTGGATCTACGGGGGTACATGGCATGGAGTCTGATGGACAACTTTGAGTGGATGGAGGGTTACACGGAACGTTTTGGGCTGTACCGCGTTAACTTCTCGGACCCAGGTCGTGAGAGAACTCCTCGTAAGTCAGCCTTCGTTTACAAACAGATCATCAAGAGTCGGATGATTGATGAAGAATATGAACCTGATACCCTGGACATGACCATTGATGAAGGAAACTGA

Protein sequence:

>DPOGS204060-PA
MMRELGLDAYRFSLSWSRILPNGLANKVSDAGVEFYNNYIDEMIKYGIKPMVTLYHWDLPQKLQDLGGFMNPLFPEWFEDYARVVFEKFGDRVKHWITFNEPREICFEGYGSATKAPILNATDVGVYYCAKNLVMGHARAYYAYVNDFKPSQEGVCGITISVNWFGALTDSEEDQFAAEMKRQAEWGLYAEPIFSEEGGFPKELAEIVAKKSAEQGYPRSRMPEFSDEEKDFVKGTADFLGVNHYTAGLVSATEYKTHHPVPSLYDDIDVGSYTPPEWPKSASSWLKLAPNSIYNALTHLHKKYNGPIFYITENGWSSPPEADILDDDRIRYYRAALNSVLDTLEAGVDLRGYMAWSLMDNFEWMEGYTERFGLYRVNFSDPGRERTPRKSAFVYKQIIKSRMIDEEYEPDTLDMTIDEGN-
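
The protein record corresponds to a codon-by-codon translation of the transcript, with the stop codon shown as "-". A minimal sketch of that translation follows, assuming the standard genetic code; the function name `translate` and the choice of "X" for unrecognized codons are illustrative and not part of the database. The two test strings are the first 15 and last 12 bases of DPOGS204060-TA.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence, writing stop codons as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        # Codons with ambiguous bases (e.g. N) are reported as X.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


print(translate("ATGATGAGGGAGTTG"))  # prints: MMREL
print(translate("GAAGGAAACTGA"))     # prints: EGN-
```

Both outputs match the start (MMREL) and end (EGN-) of the DPOGS204060-PA sequence. Passing the full transcript to the same function gives a way to check the whole record.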