Monarch geneset OGS2.0

DPOGS210278
Transcript: DPOGS210278-TA (1557 bp)
Protein: DPOGS210278-PA (518 aa)
Genomic position: DPSCF300216 + 174796-177463
RNAseq coverage: 124x (Rank: top 57%)
Annotation
Heliconius        HMEL016981          0.0      66.67%
Bombyx            BGIBMGA014178-TA    0.0      70.00%
Drosophila        CG9701-PA           2e-136   47.28%
EBI UniRef50      UniRef50_G6DAN3     6e-160   55.68%   Glycoside hydrolase n=10 Tax=Obtectomera RepID=G6DAN3_DANPL
NCBI RefSeq       XP_557100.2         3e-153   50.60%   AGAP006426-PA [Anopheles gambiae str. PEST]
NCBI nr blastp    gi|364023593        0.0      72.94%   seminal fluid protein CSSFP021 [Chilo suppressalis]
NCBI nr blastx    gi|364023593        0.0      72.94%   seminal fluid protein CSSFP021 [Chilo suppressalis]
Group
Gene Ontology
    GO:0004553   4e-235     hydrolase activity, hydrolyzing O-glycosyl compounds
    GO:0005975   4e-235     carbohydrate metabolic process
    GO:0043169   6.7e-170   cation binding
    GO:0003824   6.7e-170   catalytic activity
KEGG pathway
    tca:664577   6e-131
    K05350 (bglB) maps to:
        Starch and sucrose metabolism
        Phenylpropanoid biosynthesis
        Cyanoamino acid metabolism
InterPro domain
    [13-515]   IPR001360   4e-235     Glycoside hydrolase, family 1
    [26-489]   IPR013781   6.7e-170   Glycoside hydrolase, subgroup, catalytic core
    [25-502]   IPR017853   6.4e-161   Glycoside hydrolase, superfamily
Orthology group: MCL10040 (Multiple-copy universal gene)

Nucleotide sequence:

>DPOGS210278-TA
ATGAAGCGTTTTCTAGCTTGCTGCATTCTATTACTTATGATAAATGATCCCGTATTATCTAAGAGGTCTATACGCAAGTTCCCTAAGGGATTCAAGTTTGGAGCATCGACGGCTGCCTATCAGATCGAAGGAGGTTGGAATGAAGATGGCAAAGGTATTTCCATTTGGGACGTAGCTACACATATGGAAACTACACCAATCCGCGATGGAAGCAATGGTAATATCGCAGCAGATTCCTACCATTTATATAAAAAGGATGTAGAAATATTGAAAGAACTTGGTGTAGATTTCTATCGTTTTTCCGTATCATGGACCAGAATACTCCCCCAGGGTTTTTCCAATTACATCAATCAAGCTGGCATCAATTATTACAATAATTTAATAAACGAATTGATCCAAAATAATATTGTACCCTTCTTGACAATTTATCACTGGGATTTGCCCCAAGAGCTGCAGAAGTTGGGTGGTTGGACCAACCCTTATATTATTGATGTTTTTGCTGACTATGCCAAAATCCTTTTCGATCACTTCGGTGATAGAGTCAAATTTTGGATAACAATTAACGAACCGAAACAAATATGCTACGAAGGATATGGATCAGATTTGAAAGCTCCACTCGTTAATATGACTGGGATAGCAGAGTATATGTGCGCCAAGAATGTTTTGCTGGCTCATGCTAAAGTTTATCGCATATACGATGAGGAGTATAGAAAGAAGCAGAACGGTAAAATTGGAATATCTATCAGCTGTACGTGGTATGAACCAGCTTCTGATACAATCGATGATCACCAAGCTGCTTTAGACGCGAGACAATTCGATTGGGGTCAATACGCTCATCCGATATTCTCAAAAGAAGGGGACTTTCCGCATGAACTTAAACACAACGTGGCGGCGAAGAGTGCGGAACAGGGATATTCATATTCACGTCTCCCGGAACTGTCGGCTTCTGAAGTTGCATTTATTAGAGGCACGTCTGATTTCTTTGGAATGAACACTTATACAACGAAGATGGCTTATAGGGATGCGTCTGTTGATGGAATGTTCCCCGTGCCATCGTACAGAGATGACATGGGGTCCGTCCTCGTCAAGGATCCCACTTGGCCGCAGGCGCAGTCTTCTTGGTTACAGGAAGTTCCCTGGGGATTTCATAAATTACTCAAAGAGGTCAATAAATTGTACGACAATCCGCCGGTTTATATCACAGAAAACGGCTGGTCAAGTTCCGGTGGTCTACTTGACGAAGATCGGATACAATTCTTGAGAAATTATCTGAACGCATTACTAGACGCTTTAGACGAAGGGTGCAATATAAAAGCATATACAGTATGGAGTCTGATAGATAACTTTGAATGGTTAAACGGATACACAGAAAAATTTGGACTATACGAAGTAGAATTTTCGTCTCCAGATCGTACTAGAACACCCAGGAAATCAGCTTTTATATACAAAGAGATTATACGATCCAGAATTTTGGATCCGAATTTTGAACCTGAAAAATATGTAGAAGAAAGAAAAGATAGTCAAGAAAAAGAAAAGTTTGACAGTGATTTATATTAA

Protein sequence:

>DPOGS210278-PA
MKRFLACCILLLMINDPVLSKRSIRKFPKGFKFGASTAAYQIEGGWNEDGKGISIWDVATHMETTPIRDGSNGNIAADSYHLYKKDVEILKELGVDFYRFSVSWTRILPQGFSNYINQAGINYYNNLINELIQNNIVPFLTIYHWDLPQELQKLGGWTNPYIIDVFADYAKILFDHFGDRVKFWITINEPKQICYEGYGSDLKAPLVNMTGIAEYMCAKNVLLAHAKVYRIYDEEYRKKQNGKIGISISCTWYEPASDTIDDHQAALDARQFDWGQYAHPIFSKEGDFPHELKHNVAAKSAEQGYSYSRLPELSASEVAFIRGTSDFFGMNTYTTKMAYRDASVDGMFPVPSYRDDMGSVLVKDPTWPQAQSSWLQEVPWGFHKLLKEVNKLYDNPPVYITENGWSSSGGLLDEDRIQFLRNYLNALLDALDEGCNIKAYTVWSLIDNFEWLNGYTEKFGLYEVEFSSPDRTRTPRKSAFIYKEIIRSRILDPNFEPEKYVEERKDSQEKEKFDSDLY-
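
The lengths reported in this record can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original record. The helper names are hypothetical, and it assumes that the transcript is a complete coding sequence that includes its stop codon (the nucleotide sequence above ends in TAA) and that the genomic coordinates are 1-based and inclusive.

```python
def expected_protein_length(cds_length_bp: int, has_stop_codon: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of the given length.

    Hypothetical helper: assumes the CDS starts in frame at the first base.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon is part of the transcript but encodes no residue.
    return codons - 1 if has_stop_codon else codons


def genomic_span(start: int, end: int) -> int:
    """Return the length of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


# Values taken from the record above.
print(expected_protein_length(1557))   # 518, matching the reported protein length
print(genomic_span(174796, 177463))    # 2668
```

Under these assumptions, the 1557 bp transcript corresponds to 519 codons, that is, 518 residues plus the stop codon, which agrees with the reported 518 aa. The genomic span of 2668 bp is longer than the transcript, which would be consistent with intronic sequence in the gene model.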