Monarch geneset OGS2.0

Gene: DPOGS201335
Transcript: DPOGS201335-TA  1272 bp
Protein: DPOGS201335-PA  423 aa
Genomic position: DPSCF300176  +  643238-648124
RNAseq coverage: 9x (Rank: top 85%)
Annotation
Heliconius: HMEL012399  2e-158  65.05%
Bombyx: BGIBMGA010536-TA  3e-137  59.41%
Drosophila: CG9701-PA  5e-104  47.45%
EBI UniRef50: UniRef50_G6DAN3  3e-163  81.23%  Glycoside hydrolase n=10 Tax=Obtectomera RepID=G6DAN3_DANPL
NCBI RefSeq: XP_001237813.1  1e-113  54.96%  AGAP006424-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|364023585  9e-139  60.75%  seminal fluid protein CSSFP001 [Chilo suppressalis]
NCBI nr blastx: gi|364023613  2e-139  60.92%  seminal fluid protein CSSFP031 [Chilo suppressalis]
Group
Gene Ontology:
GO:0004553  2.3e-170  hydrolase activity, hydrolyzing O-glycosyl compounds
GO:0005975  2.3e-170  carbohydrate metabolic process
GO:0043169  2e-118  cation binding
GO:0003824  2e-118  catalytic activity
KEGG pathway: tca:664577  7e-95
 K05350 (bglB) maps to:
    Starch and sucrose metabolism
    Phenylpropanoid biosynthesis
    Cyanoamino acid metabolism
InterPro domain:
[17-379]  IPR001360  2.3e-170  Glycoside hydrolase, family 1
[30-377]  IPR013781  2e-118  Glycoside hydrolase, subgroup, catalytic core
[29-417]  IPR017853  4.8e-115  Glycoside hydrolase, superfamily
Orthology group: MCL10040  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201335-TA
ATGATCGGCCTTTTGGGTTACACTTACCTGGCTACAATTTTGGTCGTAGTGGTCAATAGCAAAAACACGTCTAAACATGAAGCAAGGAAACTACCCGACGACTTACTTTTCGGAGCATCTACGGCGTCGTACCAAATAGAAGGGGCTTGGAACCTAGACGGTAAATCTGAAAGTACTTGGGATCGGTTATCACACCATCAACCTTGTGTTATTCACAACTGTGACACGGGCGATATCGCCGCTGATTCCTATCATCAATACAAGCGAGACGTGGAAATGATACGGGAACTAGGCCTCGACTTTTACAGGTTCTCTCTCTCCTGGACGAGAATATTACCAACGAGCTTTCCAGATCAAATCAATGAAAAAGGAGTGCAATATTACAATAATTTGATAAATGAGATGCTCAAATACAACATACAACCCATGGTGACCATTTATCACTTTGATTTACCTCAAAAGTTGCAAGATCTGGGAGGATGGAACAATCCCCATATAGTTGATTGGTTTACCGATTATTCAAGAGTAGTTTTTGAGTTGTTTGGAGACAGAGTTAAGTATTGGATATCTTTTAATGAACCTCGAGAGATATGTGCTCATTCAACCCTAGAACCAGCACTAAGTTCATCTTATAGTGTTTCTGGATATGCTAATTACATGTGTGCCAAAAATCTGCTAGTAGCACATGCTAACGTCTACCATTTGTACAACAATGAATTTCGTAAAGTCCAAGGTGGTCAAGTCGGTATAACAATAAGTTCCGCGTGGTATGAACCTGAATCAGAAAAGGATATAGAAGCTGCTGAAGATATCATACAATTCGAGATGGGAATTTATGCAAATCCGATATTTTCGGAATCTGGAGATTATCCGTCAATCGTGAAAGAAAGGATAGCAGCAAAAAGTAAGGAACAAGGATTTCCAAGATCACGATTACCACAATTCACTCCAGAGGAAGTTGATTTAATTAAAGGAAGCTACGACTTCTTTGGGTTGAATCATTATACTACTTATATGGTTTATAGAAATGAATCAGTATATGGACATTATAGTTCTCCATCTTTTGATGATGATATCGAAGTGATAAGTTATCAAGACGATTCCTGGGATTCAGGTGCTTCATTGTGGATGAAGGTGGACTACGAGAGTCCTGAACGCACCCGCACTCCTCGCAAGTCTGCTTACGTGTACAAAGAGTTGCTGCGCACACGAACACTGGACTATCATTATGAACCTGACATGAGCTTGGGAATGCATGTCGATGATAATTAA

Protein sequence:

>DPOGS201335-PA
MIGLLGYTYLATILVVVVNSKNTSKHEARKLPDDLLFGASTASYQIEGAWNLDGKSESTWDRLSHHQPCVIHNCDTGDIAADSYHQYKRDVEMIRELGLDFYRFSLSWTRILPTSFPDQINEKGVQYYNNLINEMLKYNIQPMVTIYHFDLPQKLQDLGGWNNPHIVDWFTDYSRVVFELFGDRVKYWISFNEPREICAHSTLEPALSSSYSVSGYANYMCAKNLLVAHANVYHLYNNEFRKVQGGQVGITISSAWYEPESEKDIEAAEDIIQFEMGIYANPIFSESGDYPSIVKERIAAKSKEQGFPRSRLPQFTPEEVDLIKGSYDFFGLNHYTTYMVYRNESVYGHYSSPSFDDDIEVISYQDDSWDSGASLWMKVDYESPERTRTPRKSAYVYKELLRTRTLDYHYEPDMSLGMHVDDN-
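
The lengths reported in this record can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the record: that the transcript consists solely of coding sequence with the terminal stop codon included (the nucleotide sequence shown does begin with ATG and end with TAA), and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical helpers, not part of any database tool.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a CDS that includes its stop codon (assumed)."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1  # the stop codon encodes no residue


# Values copied from the record above.
transcript_bp = 1272
protein_aa = 423
genomic_bp = span_length(643238, 648124)

# Genomic bases not accounted for by the transcript; presumably introns,
# although the record does not list the exon structure.
non_transcript_bp = genomic_bp - transcript_bp

print(expected_protein_length(transcript_bp) == protein_aa)
print(genomic_bp, non_transcript_bp)
```

Under these assumptions the 1272 bp transcript corresponds to 424 codons, that is 423 residues plus a stop, which agrees with the reported protein length. The genomic interval is considerably longer than the transcript, which is what one would expect for a gene with introns.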