Monarch geneset OGS2.0

DPOGS205116
Transcript          DPOGS205116-TA    2163 bp
Protein             DPOGS205116-PA    720 aa
Genomic position    DPSCF300172 + 36792-47790
RNAseq coverage     1810x (Rank: top 7%)
Annotation
Heliconius        HMEL004540          3e-94     44.10%
Bombyx            BGIBMGA005899-TA    0.0       78.50%
Drosophila        Hexo1-PB            1e-152    59.55%
EBI UniRef50      UniRef50_P49010     0.0       78.25%    Chitooligosaccharidolytic beta-N-acetylglucosaminidase n=10 Tax=Endopterygota RepID=HEXC_BOMMO
NCBI RefSeq       NP_001037466.1      0.0       78.25%    chitooligosaccharidolytic beta-N-acetylglucosaminidase precursor [Bombyx mori]
NCBI nr blastp    gi|62722476         0.0       80.40%    beta-N-acetylglucosaminidase [Choristoneura fumiferana]
NCBI nr blastx    gi|62722476         0.0       80.40%    beta-N-acetylglucosaminidase [Choristoneura fumiferana]
Group
Gene Ontology     GO:0043169    4.5e-128    cation binding
                  GO:0005975    4.5e-128    carbohydrate metabolic process
                  GO:0003824    4.5e-128    catalytic activity
                  GO:0004553    1.4e-97     hydrolase activity, hydrolyzing O-glycosyl compounds
                  GO:0004563    1.4e-23     beta-N-acetylhexosaminidase activity
KEGG pathway      aag:AaeL_AAEL004661    6e-153
                  K12373 (HEX) maps->
                      Lysosome
                      Glycosaminoglycan degradation
                      Amino sugar and nucleotide sugar metabolism
                      Glycosphingolipid biosynthesis - globo series
                      Other glycan degradation
                      Glycosphingolipid biosynthesis - ganglio series
InterPro domain   [334-719]    IPR013781    4.5e-128    Glycoside hydrolase, subgroup, catalytic core
                  [336-691]    IPR017853    3.2e-113    Glycoside hydrolase, superfamily
                  [336-677]    IPR015883    1.4e-97     Glycoside hydrolase, family 20, catalytic core
                  [75-207]     IPR015882    1.4e-23     Acetylhexosaminidase, subunit a/b
                  [165-185]    IPR001540    7.5e-22     Glycoside hydrolase, family 20
Orthology group   MCL16459    Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205116-TA
ATGTTTGCTGGTTGCGCTACAGAGTATGTGACGTCATGTTTTGCGTCAATCAGCGTGAGGTCTGAGGAACATCCGCCATGGAAGTGGTCCTGTGAGGATGGAGCGTGTAAGAAGATGAAACATGAACCGGGTAGTTCGGGATCAGCGCTGTCACTCGAGGCCTGCAAGATGTTCTGCAACGAATACGGTCTTCTGTGGCCCAGGCCGACGGGTTATACCGACTTAGGTCGATTCCTGTCCAAGATCAATATAAATAATATAGAATTACAAATCGAGAGGCGTGGAAGATCAGATAATCTGATGAACGCTGCCGGTCAGCGTTTCAAGAAGTTGGTGTCACTCGCAGTGCCATCCGGGATAATACCAAAAGCGACCGGGAAATCCGTCTATGTGTACCTGGTTAACGAAAAACCCGACGTCACGGACTTCTCAATCGATTTCGACGAGAGCTACAGGTTGAGAGTGTCCCCGGAGTCTAACGACCGTATAAACGCTACAATCGTTGGGAATAACTTCTTCGGTATCAGACACGGTCTGGAGACGCTATCGCAGCTCATCGTCCACGATGAGATTAAAAATCATTTGTTGATGGTCCGTGACGTCACAATCGATGATAAACCCGTGTACCCTTACAGAGGAGTGTTGTTGGACACCGCGAGGAACTATTTCTCTATCGACTCCATCAAAGAGACTATCGAGGCCATGAGTAGCGTGAAACTGAACACTTTCCACTGGCACATCACAGACAGCCAAAGTTTCCCCTTCGTATCCAAGAGACGGCCAGAACTCACTAAATACGGAGCTTACAGTCCCAGTAAAGCTGAACCTTGGGCGTCGTACTGCGTGGAACCTCCGTGCGGTCAACTGAACCCTACCAAGGAGGAGTTGTACGATGTTCTACAAGACATCTACACGGATATGGCCGTTTTAATTTCTTCGTGTTCGTTTAGTAATGAGAGTCCCTTACAGATGGTCCGTGACGTCACAATCGATGATAAACCCGTGTACCCTTACAGAGGAGTGTTGTTGGACACCGCGAGGAACTATTTCTCTATCGACTCCATCAAAGAGACTATCGAGGCCATGAGTAGCGTGAAACTGAACACTTTCCACTGGCACATCACAGACAGCCAAAGTTTCCCCTTCGTATCCAAGAGACGGCCAGAACTCACTAAATACGGAGCTTACAGTCCCAGTAAAATCTACACTGAAGAGATGATCCGTGATGTGGTGGAGTTCGCTCGTGTCCGCGGAGTCCGAGTGCTGCCCGAGTTTGACGCTCCAGCACACGTGGGCGAGGGCTGGCAGGAGACAGACCTCACTGTTTGCTTCAAGGCTGAACCTTGGGCGTCGTACTGCGTGGAACCTCCGTGCGGTCAATTGAACCCTACCAAGGAGGAGCTGTACGATGTTCTACAAGACATCTACACGGATATGGCCGATGTTTTCCCGTCGGACCTCTTCCACATGGGTGGAGACGAGGTGTCGGAGCGCTGCTGGAACTCGTCGCGCCAGGTGCAGCAGTTTATGGAGGAGAACCGCTGGGGACTGGACAAGGCCAGCTATTTACAACTGTGGAACTACTTCCAGAATAAAGCCCAAGATAGGGTGTACAAGGCATTTGGTAAAAGGATCCCACTGATTCTATGGACCAGCACGCTAACTGATTACAGTCACGTCGACAAGTTCTTAAACAAAGACGATTACATTATTCAAGTGTGGACTACTGGCGAAGACCCTCAAATATCAGGTCTCCTGCAGAAGGGTTATCGTCTCATCATGTCCAACTACGACGCCCTGTATTTCGACTGTGGTTTCGGTGCTTGGGTTGGAACTGGCAACAACTGGTGCTCTCCGTACATCGGATGGCAGAAAGTTTATGAAAATAGTCCTAAACAGATGGCGAGAGACCACCAAGATCAAATCCTAGGTGGTGAAGCAGCGCTGTGGTCTGAGCAGTCTGACTCAGCGACCCTGGACAGTCGCCTGTGGCCGCGGGC
CGCCGCCCTCGCTGAGAGGTTGTGGGCGGAGCCCGCGACCAGCTGGAGGGAGGCCGAGCGGCGGATGTTGAACGTACGCGAGCGTCTCGTCCGTAAAGGCATCAAAGCGGAGTCCCTGGAGCCCGAGTGGTGCTATCAGAACGACGGCTACTGCTACGCCTGA

Protein sequence:

>DPOGS205116-PA
MFAGCATEYVTSCFASISVRSEEHPPWKWSCEDGACKKMKHEPGSSGSALSLEACKMFCNEYGLLWPRPTGYTDLGRFLSKININNIELQIERRGRSDNLMNAAGQRFKKLVSLAVPSGIIPKATGKSVYVYLVNEKPDVTDFSIDFDESYRLRVSPESNDRINATIVGNNFFGIRHGLETLSQLIVHDEIKNHLLMVRDVTIDDKPVYPYRGVLLDTARNYFSIDSIKETIEAMSSVKLNTFHWHITDSQSFPFVSKRRPELTKYGAYSPSKAEPWASYCVEPPCGQLNPTKEELYDVLQDIYTDMAVLISSCSFSNESPLQMVRDVTIDDKPVYPYRGVLLDTARNYFSIDSIKETIEAMSSVKLNTFHWHITDSQSFPFVSKRRPELTKYGAYSPSKIYTEEMIRDVVEFARVRGVRVLPEFDAPAHVGEGWQETDLTVCFKAEPWASYCVEPPCGQLNPTKEELYDVLQDIYTDMADVFPSDLFHMGGDEVSERCWNSSRQVQQFMEENRWGLDKASYLQLWNYFQNKAQDRVYKAFGKRIPLILWTSTLTDYSHVDKFLNKDDYIIQVWTTGEDPQISGLLQKGYRLIMSNYDALYFDCGFGAWVGTGNNWCSPYIGWQKVYENSPKQMARDHQDQILGGEAALWSEQSDSATLDSRLWPRAAALAERLWAEPATSWREAERRMLNVRERLVRKGIKAESLEPEWCYQNDGYCYA-