Monarch geneset OGS2.0

DPOGS201330
Transcript        DPOGS201330-TA  1455 bp
Protein           DPOGS201330-PA  484 aa
Genomic position  DPSCF300176 + 600122-602177
RNAseq coverage   12x (Rank: top 83%)
Annotation
Heliconius      HMEL012399        0.0     69.53%
Bombyx          BGIBMGA010537-TA  0.0     62.81%
Drosophila      CG9701-PA         6e-140  48.86%
EBI UniRef50    UniRef50_G6DAN3   8e-170  62.08%  Glycoside hydrolase n=10 Tax=Obtectomera RepID=G6DAN3_DANPL
NCBI RefSeq     XP_316461.3       8e-147  52.82%  AGAP006425-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|364023585      1e-180  61.95%  seminal fluid protein CSSFP001 [Chilo suppressalis]
NCBI nr blastx  gi|364023613      0.0     61.28%  seminal fluid protein CSSFP031 [Chilo suppressalis]
Group
Gene Ontology  GO:0004553  2e-228    hydrolase activity, hydrolyzing O-glycosyl compounds
               GO:0005975  2e-228    carbohydrate metabolic process
               GO:0043169  1.8e-168  cation binding
               GO:0003824  1.8e-168  catalytic activity
KEGG pathway  tca:664577  2e-126
              K05350 (bglB) maps-> Starch and sucrose metabolism
                                   Phenylpropanoid biosynthesis
                                   Cyanoamino acid metabolism
InterPro domain  [2-474]  IPR001360  2e-228    Glycoside hydrolase, family 1
                 [3-465]  IPR013781  1.8e-168  Glycoside hydrolase, subgroup, catalytic core
                 [2-477]  IPR017853  3.6e-159  Glycoside hydrolase, superfamily
Orthology group  MCL10040  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201330-TA
ATGAGGAAATTTCCGGCGAGATTAATTTTTGGAACAGCGACAGCCTCTTATCAAGTGGAAGGTGCCTGGGACGTCGACGGAAAATCGGAAAATATCTGGGACCATTTAACTCATACAAATCCTTGCAAAGTGCTGGACTGCTCTAATGGTGATGTCGCTGACAACTCTTATTATCTCTATAAAAGAGATGTGGAAATGATGCGCGAGTTAGGACTCGACACTTACAGGTTTTCTATCTCTTGGACCAGAATCCTTCCTACTGGTTTTCCAGATTACATCAATAAAGCTGGAGTAGCATATTACAACAACTTAATTGATGAAATGCTAAAATATAACATTCAGCCGATAGTAACTTTATACCATTGGGACCTACCACAGAAAATACAAGAGATGGGAGGCTGGACGAATAGTGAAATTGTTAATTGGTTTGGAGACTACGCACGAGTTATATTTAATTTTTTTGGTGATAGAGTAAAATATTTTATCACTATTAATGAACCTCATCAAATTTGCGAGTTTGGCTATGGAAAAGATATATTTGCACCAGCATTAAAGATACAAGGTATAGCTGACTATTTATGCATGAAGAATGTACTATTAGGTCACGCTAGAGCTTATCACATTTATGATAAAGAATTTCGGGTGAATCAAAATGGAAAAATATTCATTACAATAAACGCCGAATGGCATCAACCCAAAACAGTAAATGACGAGGAAGCAGCCCGGGATGCTAGACAATTTTATTGGGAGGTTTATGCTCATCCAATATTTTCAAAAAGTGGAAATTTTCCTCCGGAAATGATAAAGAGGATAGCGGATAAAAGTGCTGCACAAGGTTTTCTCAGATCCAGATTACCAGAATTATCTAGAGCGGAAGTTAAATTTGTACATGGAACCTCTGATTTCTTTGGACTGAATCATTATTCAACAAGTATTGTCTATAGAAATGAGAGCGCACCTGAAATTCATCCTGTACCATCATTCGGTGACGATCTGGATATAATAGCATATCAGTTACCCGAATGGAAAATTGGAAGTTCAAATTTTACTAAGTACGTTCCATGGGGCTTTCGGTCATTATTTAACTACATCAGCCATCAATACGGAAATCCACCTATCTTGGTGACTGAGAACGGATTTGCAACAAATGGTGGTATTATCGACGAAGACCGAGTGACATACTTCAGAGGCTACTTGAACGCTGTCTTAGATGCCATCGACGATGGTGTTGATATAAGAGGTTATATTGCCTGGAGTCTCATGGATAATTTCGAGTGGTCAAAAGGATACACTGAACGCTTCGGTCTGTATGAAGTCGACTACAACGACCCAAACCGTACTCGCACGCCTCGCAAGTCCGCTTATGTACTGAAGGAGATTATAAGGACACGATCTATTGATCCCAACTATGAACCTGACATGAGCCAACCCCTGACCATTGATGATGGACTCTAA

Protein sequence:

>DPOGS201330-PA
MRKFPARLIFGTATASYQVEGAWDVDGKSENIWDHLTHTNPCKVLDCSNGDVADNSYYLYKRDVEMMRELGLDTYRFSISWTRILPTGFPDYINKAGVAYYNNLIDEMLKYNIQPIVTLYHWDLPQKIQEMGGWTNSEIVNWFGDYARVIFNFFGDRVKYFITINEPHQICEFGYGKDIFAPALKIQGIADYLCMKNVLLGHARAYHIYDKEFRVNQNGKIFITINAEWHQPKTVNDEEAARDARQFYWEVYAHPIFSKSGNFPPEMIKRIADKSAAQGFLRSRLPELSRAEVKFVHGTSDFFGLNHYSTSIVYRNESAPEIHPVPSFGDDLDIIAYQLPEWKIGSSNFTKYVPWGFRSLFNYISHQYGNPPILVTENGFATNGGIIDEDRVTYFRGYLNAVLDAIDDGVDIRGYIAWSLMDNFEWSKGYTERFGLYEVDYNDPNRTRTPRKSAYVLKEIIRTRSIDPNYEPDMSQPLTIDDGL-
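
The transcript and protein lengths in this record (1455 bp and 484 aa) can be cross-checked against each other, and the protein can be reproduced from the nucleotide sequence. The following is a minimal sketch, assuming the transcript is a complete coding sequence translated with the standard genetic code and that the trailing "-" in the protein sequence stands for the stop codon. The names `translate` and `expected_protein_length` are hypothetical helpers introduced for illustration, not part of any database tooling. Only the first ten and last five codons of the sequence above are used, to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons
# enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; unknown codons become 'X'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


def expected_protein_length(transcript_bp: int) -> int:
    """Residue count for a complete CDS: codon count minus the stop codon."""
    return transcript_bp // 3 - 1


# First 30 and last 15 nucleotides of DPOGS201330-TA, copied from the record.
first_codons = translate("ATGAGGAAATTTCCGGCGAGATTAATTTTT")
last_codons = translate("GATGATGGACTCTAA")

print(first_codons)                   # start of the protein
print(last_codons)                    # end of the protein, "*" is the stop
print(expected_protein_length(1455))  # residues implied by a 1455 bp CDS
```

Under these assumptions, the translated ends agree with the start (MRKFPARLIF) and end (DDGL followed by the stop) of DPOGS201330-PA, and 1455 bp corresponds to 485 codons, that is, 484 residues plus the stop codon.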