Monarch geneset OGS2.0

DPOGS204062
Transcript | DPOGS204062-TA | 1497 bp
Protein | DPOGS204062-PA | 498 aa
Genomic position | DPSCF300200 - 86091-90644
RNAseq coverage | 512x (Rank: top 24%)
Annotation
Heliconius | HMEL012399 | 3e-145 | 50.83%
Bombyx | BGIBMGA010735-TA | 5e-173 | 59.50%
Drosophila | CG9701-PA | 2e-125 | 44.68%
EBI UniRef50 | UniRef50_G9F9H2 | 0.0 | 61.84% | Seminal fluid protein CSSFP020 n=1 Tax=Chilo suppressalis RepID=G9F9H2_9NEOP
NCBI RefSeq | XP_001850321.1 | 1e-137 | 47.53% | glycoside hydrolase [Culex quinquefasciatus]
NCBI nr blastp | gi|364023591 | 0.0 | 61.84% | seminal fluid protein CSSFP020 [Chilo suppressalis]
NCBI nr blastx | gi|364023591 | 0.0 | 61.96% | seminal fluid protein CSSFP020 [Chilo suppressalis]
Group
Gene Ontology | GO:0004553 | 1.1e-213 | hydrolase activity, hydrolyzing O-glycosyl compounds
Gene Ontology | GO:0005975 | 1.1e-213 | carbohydrate metabolic process
Gene Ontology | GO:0043169 | 4.6e-168 | cation binding
Gene Ontology | GO:0003824 | 4.6e-168 | catalytic activity
KEGG pathway | tca:664577 | 1e-118
KEGG pathway | K05350 (bglB) | maps-> Starch and sucrose metabolism; Phenylpropanoid biosynthesis; Cyanoamino acid metabolism
InterPro domain | [13-483] | IPR001360 | 1.1e-213 | Glycoside hydrolase, family 1
InterPro domain | [14-474] | IPR013781 | 4.6e-168 | Glycoside hydrolase, subgroup, catalytic core
InterPro domain | [13-487] | IPR017853 | 2.2e-156 | Glycoside hydrolase, superfamily
Orthology group | MCL30775 | Lepidoptera specific

Nucleotide sequence:

>DPOGS204062-TA
ATGTTTCAAGTTGAAGGCTGGTCGGATCTCAAGGTTCGAAGATTCCCCGATGGCTTTTTGTTTGGCGCGGGGACGTCGGCTTATCAGGTCGAAGGGGCGTGGAATGAAGATGGAAAAGGTGAAAGCATCTGGGACAAATACCTCCACGATAACCCAGACATTATATCCGATGGCAGAAATGGTGATGTAGCATCCAACTCCTACCACCAGTACAAGAGAGATGTGGAAATGTTGAGGGAATTGGGTGTGGACTACTACAGGTTCTCAATATCCTGGAGCAGAGTATTGCCTAGAGGATTCTCGAATGAAATAAATGAAAAAGGTCTCGAATACTACGACAAATTGATAGATGAATTATTGAAATACAACATAAAGCCAATGATAACTTTATACCACTTTGATTTGCCACAAACTCTCCAAGACTTTGGAGGTTGGGCCAATCCGCTGTCAACAGAATGGTTTGAAGATTATGCGGCTGTGATCTTTAAGGCATTCGCTCACAAGGTTCCTTATTGGATAACCGTCAATCAGCCAAATTCCATATGCGTGGAAGGTTATGGTCAAGGTTTGATGGCACCAGCGATCAGCTCGAGTGGAATCGGTGATTACATGTGTATAAAGAATGTGCTGGTGGCACATGCGAGGGCATACAGGTTATATGAGAGGGAATATAAAAAGAAATTTAAGGGATCAGTTGGCATAGCGCTTGCATTAAACTGGGCAGACCCCGTCAATAACAGCACAAAAAATGTCGAAGCTACGGACGTTTACAGAGAATTTATGATCGGTCTCTACATGCATCCCATATGGTCGAAAGATGGTGGGTTCCCTAAAATGGTCAAAGAAAGAGTCCATCAGAACAGCATAAAGCAAGGATTCAAGAAATCTAGACTGCCTGCCCTTAGCAAGGAAGAAGTTACTCTTTTGAAAGGGTCCTCAGACTTCGTGGGAGTGAATCATTATACAACTGTCCTAGTGAAGAGCACGGACAGGGGGATGTCAGCGCCATCTTTCGATGACGACGTTCACGTGGAGCTCACCTACAGGCCGGAGTGGAAGAACGCCACATCTAGCTGGCTGAAGAGCGTGCCCTACGGTATATACAGGGTGTGCGTATATCTCAATACAAAGTACGACTACCCTCAAATGTTTGTGACGGAGCACGGCTGGTCCACGAGGCCAGGGTTGAAGGATGACACGAGGGTTGAGAACCTGAGGCTGTACCTGAAGGCTATACTGTTTGCTATAGAAGATGGCACGGACTTGAAAGGTTACACCACATGGAGCCTAATGGATAATGTGGAGTGGGTCGCTGGAACCAGTGAAAGATTCGGTCTTTATGAAGTAGACTTCGAATCAGAGGATAAAAATAGAACAGCGAGATTGTCAGCTCTGGTGTATAAACGAATCATAGACAAGAGGATCGTTGAAGACGATTATAAACCGAACAATTTAAAAATGTCGATAACTAACAGAAATGTTAAGACGGAACTTTGA

Protein sequence:

>DPOGS204062-PA
MFQVEGWSDLKVRRFPDGFLFGAGTSAYQVEGAWNEDGKGESIWDKYLHDNPDIISDGRNGDVASNSYHQYKRDVEMLRELGVDYYRFSISWSRVLPRGFSNEINEKGLEYYDKLIDELLKYNIKPMITLYHFDLPQTLQDFGGWANPLSTEWFEDYAAVIFKAFAHKVPYWITVNQPNSICVEGYGQGLMAPAISSSGIGDYMCIKNVLVAHARAYRLYEREYKKKFKGSVGIALALNWADPVNNSTKNVEATDVYREFMIGLYMHPIWSKDGGFPKMVKERVHQNSIKQGFKKSRLPALSKEEVTLLKGSSDFVGVNHYTTVLVKSTDRGMSAPSFDDDVHVELTYRPEWKNATSSWLKSVPYGIYRVCVYLNTKYDYPQMFVTEHGWSTRPGLKDDTRVENLRLYLKAILFAIEDGTDLKGYTTWSLMDNVEWVAGTSERFGLYEVDFESEDKNRTARLSALVYKRIIDKRIVEDDYKPNNLKMSITNRNVKTEL-
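
Consistency check (sketch):

The transcript and protein lengths listed above can be cross-checked against each other: 1497 bp is 499 codons, that is 498 amino acids plus one stop codon. The following is a minimal Python sketch of that check, assuming the standard genetic code and following this record's convention of writing the stop codon as "-". The function name `translate` and the short test fragments (the first and last 18 bases of DPOGS204062-TA) are chosen for illustration only and are not part of the database.

```python
# Standard genetic code, built in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO
    )
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` (this record uses "-").
    Trailing bases that do not fill a codon are ignored.
    """
    cds = cds.upper()
    residues = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First and last 18 bases of the transcript shown above.
head = translate("ATGTTTCAAGTTGAAGGC")
tail = translate("GTTAAGACGGAACTTTGA")

# Length bookkeeping: codons in the transcript minus the stop codon.
transcript_bp = 1497
protein_aa = transcript_bp // 3 - 1
```

Applied to the full nucleotide sequence, `translate` should reproduce the protein sequence listed above, including the trailing "-".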