Monarch geneset OGS2.0

DPOGS214332
Transcript        DPOGS214332-TA  1905 bp
Protein           DPOGS214332-PA  634 aa
Genomic position  DPSCF300020 - 511205-513565
RNAseq coverage   372x (Rank: top 32%)
Annotation
Heliconius      HMEL004540        0.0     84.57%
Bombyx          BGIBMGA003990-TA  0.0     77.01%
Drosophila      fdl-PB            8e-150  42.18%
EBI UniRef50    UniRef50_B1P868   0.0     79.69%  Beta-N-acetylglucosaminidase n=1 Tax=Spodoptera frugiperda RepID=B1P868_SPOFR
NCBI RefSeq     NP_001165928.1    0.0     77.01%  fused lobes [Bombyx mori]
NCBI nr blastp  gi|168812595      0.0     79.69%  beta-N-acetylglucosaminidase [Spodoptera frugiperda]
NCBI nr blastx  gi|168812595      0.0     79.69%  beta-N-acetylglucosaminidase [Spodoptera frugiperda]
Group
Gene Ontology
GO:0043169  6.1e-100  cation binding
GO:0005975  6.1e-100  carbohydrate metabolic process
GO:0003824  6.1e-100  catalytic activity
GO:0004553  1.7e-79   hydrolase activity, hydrolyzing O-glycosyl compounds
GO:0004563  4.7e-48   beta-N-acetylhexosaminidase activity
KEGG pathway
tca:656027  7e-169
 K12373 (HEX) maps-> Lysosome
    Glycosaminoglycan degradation
    Amino sugar and nucleotide sugar metabolism
    Glycosphingolipid biosynthesis - globo series
    Other glycan degradation
    Glycosphingolipid biosynthesis - ganglio series
InterPro domain
[247-633]  IPR013781  6.1e-100  Glycoside hydrolase, subgroup, catalytic core
[249-604]  IPR017853  1.7e-88   Glycoside hydrolase, superfamily
[249-591]  IPR015883  1.7e-79   Glycoside hydrolase, family 20, catalytic core
[205-225]  IPR001540  4.7e-48   Glycoside hydrolase, family 20
[126-247]  IPR015882  2.2e-11   Acetylhexosaminidase, subunit a/b
Orthology group   MCL14551  Insect specific

Nucleotide sequence:

>DPOGS214332-TA
ATGAAGTCGTGGGGGGAAACGTTATGGCGGAGTGCCTCCTCGCACTTGTACCGAGTCGGCAGGTTGCGGCGAGCTCTGCTGCTCTTGGCTGCCGCTGCCTGTACTGCTGCGGGCCTTCTCTATTGGAGACAACAGACGGACGACTCTGCTAGACGACCGCTACACTTGTACGCTGGTGTAGAGCCACAGTGGTCTTGGGTATGTCGGAACGATCGTTGTGAGCGACTCCTAGCATCAGAGACCTCTATACTTCAGTCACTTCCTACGTGCAACATGCTCTGCGCGTCCACTCAATTATGGCCGCAGCCAACCGGCCCCGTCAGTCTCGCTACGGCCTCAGTTCATGTCAGATCCAGCGGCTTCTCACTACAAGTCATATCCTCTCCATCAAGAGAAGTGACAGAAAACCTCAACGATGCCTTCCAATTAATGCGCGACGACTTGAAAATTCTGGAGAAAAACGCGGGCGTAGAGAACAGGAGATCAGATAGTGGAACTCCCCGTGAAGTTGTTGTAAGGGTCGCTGTGAACGGCAGCGCTGATCCACGCATGCGACAAGACACCGATGAAACCTACAAGCTCTCTCTCAGACCGTCGGGGAAGTCCCTCGTCGCTGATATAACAGCGCATTCCTTCTGTGGAGCTCGGCACGGCTTTGAAACTCTGTCCCAACTAGTGTGGTTGGATCCTTACGCTGAATCTCTCTTAATACTCGAAGCTGCCACCGTGGACGACGGCCCTCGGTTTAGATATCGTGGTTTGTTATTGGATACAGCCAGGAATTTCTTCCCCGTAACTGACATATTGCGTACAATCGATGCTATGGGAGCGTGCAAGCTGAACACGTTCCATTGGCATGTGAGTGACTCGCAGTCCTTTCCTTTGAGACTGAACAGCGCTCCTCAACTAGCTCAGCACGGAGCTTATGGGCCTGGTGCTATATACACGACTGACGATGTAAGGGCTATAGTACGCCGAGCTAGATTGAGAGGAATACGTGTCTTGATAGAAGTAGATGCGCCGGCGCATGTTGGACGAGCGTGGTCGTGGGGCCCTCCTGCTGGGTTAGGACACTTAGCGCATTGTGTTGAAGTAGAACCTTGGAGTACTTATTGTGGTGAACCGCCTTGTGGGCAATTAAACCCACGAAATCCACACGTTTACTCACTTCTTGAACAGATTTATGCCGAAATCATTCAACTGACCGAAGTGGACGATATCTTCCATTTAGGCGGGGACGAGGTCTCGGAGCGGTGTTGGGCTCAACACTTTAACGACACGGATCCCATGGAGTTATGGTTTGAGTTCACTCGTCGCGCCATGTCCTCCCTCGAACGTGCCAATGGCGGTAAACTGCCAGATCTAACGTTACTGTGGTCTTCTCGGCTAACTCATACACCGTACCTGGAACGTTTAGATAAGAAGAGACACGGCGTGCAGGTGTGGGGCTCGTCCCGGTGGCCGGAATCTCGCGCGGTATTGGACGCGGGCTACAGAACGATCATATCTCACGTAGACGCTTGGTACTTAGACTGCGGCTTCGGGTCCTGGCGAGATAGTTCCGACGGTCACTGTGGACCTTACCGGTCTTGGCAGCAAATTTACGAGCACAGACCCTGGATAGAGGAAATGCCGGCCATGTCTACTGGAGTCGAACCATGGCAAGTGGAAGGCGGCGCGACGTGTCAGTGGACGGAACAGCTGGGTTCCGGAGGTTTGGATGCTAGAGTGTGGCCGAGGACTGCGGCGGTCGCGGAGCGTCTCTGGTCGGACCGCGCCGAGGGCGCCACCGCCGACGTCTACCTGCGACTCGACACACAACGATCACGACTCCTAGATAAAGGGATCCAAGCCGCTCCTCTCTGGCCGCGGTGGTGCTCTCACAACCCTCACGCCTGCCTTTAG

Protein sequence:

>DPOGS214332-PA
MKSWGETLWRSASSHLYRVGRLRRALLLLAAAACTAAGLLYWRQQTDDSARRPLHLYAGVEPQWSWVCRNDRCERLLASETSILQSLPTCNMLCASTQLWPQPTGPVSLATASVHVRSSGFSLQVISSPSREVTENLNDAFQLMRDDLKILEKNAGVENRRSDSGTPREVVVRVAVNGSADPRMRQDTDETYKLSLRPSGKSLVADITAHSFCGARHGFETLSQLVWLDPYAESLLILEAATVDDGPRFRYRGLLLDTARNFFPVTDILRTIDAMGACKLNTFHWHVSDSQSFPLRLNSAPQLAQHGAYGPGAIYTTDDVRAIVRRARLRGIRVLIEVDAPAHVGRAWSWGPPAGLGHLAHCVEVEPWSTYCGEPPCGQLNPRNPHVYSLLEQIYAEIIQLTEVDDIFHLGGDEVSERCWAQHFNDTDPMELWFEFTRRAMSSLERANGGKLPDLTLLWSSRLTHTPYLERLDKKRHGVQVWGSSRWPESRAVLDAGYRTIISHVDAWYLDCGFGSWRDSSDGHCGPYRSWQQIYEHRPWIEEMPAMSTGVEPWQVEGGATCQWTEQLGSGGLDARVWPRTAAVAERLWSDRAEGATADVYLRLDTQRSRLLDKGIQAAPLWPRWCSHNPHACL-