Monarch geneset OGS2.0

DPOGS210840
Transcript        DPOGS210840-TA  1395 bp
Protein           DPOGS210840-PA  464 aa
Genomic position  DPSCF300027 + 88615-93878
RNAseq coverage   245x (Rank: top 42%)
Annotation
Heliconius      HMEL021305        2e-55   48.87%
Bombyx          BGIBMGA003916-TA  9e-140  53.00%
Drosophila      CG7985-PA         4e-72   34.06%
EBI UniRef50    UniRef50_E0VEV0   1e-89   37.26%  Putative uncharacterized protein n=1 Tax=Pediculus humanus corporis RepID=E0VEV0_PEDHC
NCBI RefSeq     XP_001607502.1    5e-112  41.47%  PREDICTED: similar to hexosaminidase (glycosyl hydrolase family 20, catalytic domain) containing [Nasonia vitripennis]
NCBI nr blastp  gi|156549076      1e-110  41.47%  PREDICTED: hexosaminidase D-like [Nasonia vitripennis]
NCBI nr blastx  gi|383865625      9e-112  42.00%  PREDICTED: hexosaminidase D-like [Megachile rotundata]
Group
Gene Ontology     GO:0043169  1.8e-29  cation binding
                  GO:0005975  1.8e-29  carbohydrate metabolic process
                  GO:0003824  1.8e-29  catalytic activity
                  GO:0004553  9.7e-16  hydrolase activity, hydrolyzing O-glycosyl compounds
KEGG pathway      tca:661188  2e-105
                  K04678 (SMURF) maps-> Ubiquitin mediated proteolysis
                                        Endocytosis
                                        TGF-beta signaling pathway
InterPro domain   [1-322]   IPR017853  3e-43    Glycoside hydrolase, superfamily
                  [2-341]   IPR013781  1.8e-29  Glycoside hydrolase, subgroup, catalytic core
                  [48-225]  IPR015883  9.7e-16  Glycoside hydrolase, family 20, catalytic core
Orthology group   MCL17769  Insect specific

Nucleotide sequence:

>DPOGS210840-TA
ATGCAGAACAGGATTGTGCATTTCGATTTAAAAGGTGCTCCCCCTAAACTGTGCTATTTAGAAAAGATCTTCAAGATAATAAAGAAATGGGGCGCTACTGGTGTTTTATTAGAGTGGGAGGACACATTCCCATATAGCGGAGAGCTGGTAGATATTGGCAGTGTCCTTGGTTGCGGTGGTGACGGCATGTATTCTATGGATGAAGTGCGACAAATACTACAGTTAGCTAGGAACTGTGGACTTGAAGTCATTCAACTAATTCAAACTATTGGACACATGGAGTTTGTTCTGAAGCACCCTTTGTTCCAAGATCTCAGGGAATTGCCATATTCTCCGGCTGTTTTGTGTCCATCACAGCACCGTTCTCAATTACTAGTGAGAGAGATGTTGAGGCAGGTTTTGGAGGTACAGCCGGATGCTAGATATATACACATTGGGGCAGATGAGGTTTGGCACAGAGGGGAATGTGAACTTTGTAAATATAAAGCATCAACGAACGAACACAAATTACACTCAATTTATTTAGAACACATACGAGATTTAGCCTTATTTATAAAGCAGTTGAGACCGGATTTGATTGTTCTCATGTGGGATGACATGCTGCGGTCTATAAGTGTAGATGTATTGAAAAATTACAGCCTGGGTGAGTTAGTTCAGCCAGTGGTGTGGAACTACAGTCCGCTGCATTTGTTCCATGTTGAAGTGCAATTATGGACATGTTACAGTCAGGTGTTCCCAAGTGTTTGGGCTGCTTCAGCTTACAAAGGAGCCAGCGGAAGTTGTGAGATCTGGCCGGTGGTATCCCGTTACGCCAGCAACCAACAAGCCTGGTTGAAGACAGTCAAAGAGTATTCCTCGGCTGTTAACTTTGTTGGAGTCGTCCTTACTGGTTGGTCGAGGTTCGATCATTACGCCACTCTATGTGAACTGTTGCCGCCATCTTTGCCAAGTTTGTCTATCTGTCTGAAGATGTGGATGACTATGGACGAATGTTTTGACAACTCGGAGTCGTTGCCGCTGGAGGAGTGGCCGGGAGTAGAACTCGCACTCAGCATACGAAACTTCGCTTCGTTGAGGGAACGCGCGCATAACGTCATGTACAGAGAGCTCGTTCCCACGTGGCTGAACCCCTGGCAGCTGCAGCACGCGTACACCAGCCCCATACAACTACGTGGCATCGTGGCTACTATGACGCAAATAATAGCGGATATAAAGGCGATACATAGCGAACTTCTAACGCAATTTCCTTTATATACGGGGGAGAGGAGTGCTCAGGAGTGGCTCGGCTCTCTGGTGACGCCTTTGTTGAGGAAGGTTACGGAGGTACACGACGTAGCTGCTATAAGGACGGACATGCAGGCCGGGGTCACACCGGGGATGACAGCCACTCGTTAA

Protein sequence:

>DPOGS210840-PA
MQNRIVHFDLKGAPPKLCYLEKIFKIIKKWGATGVLLEWEDTFPYSGELVDIGSVLGCGGDGMYSMDEVRQILQLARNCGLEVIQLIQTIGHMEFVLKHPLFQDLRELPYSPAVLCPSQHRSQLLVREMLRQVLEVQPDARYIHIGADEVWHRGECELCKYKASTNEHKLHSIYLEHIRDLALFIKQLRPDLIVLMWDDMLRSISVDVLKNYSLGELVQPVVWNYSPLHLFHVEVQLWTCYSQVFPSVWAASAYKGASGSCEIWPVVSRYASNQQAWLKTVKEYSSAVNFVGVVLTGWSRFDHYATLCELLPPSLPSLSICLKMWMTMDECFDNSESLPLEEWPGVELALSIRNFASLRERAHNVMYRELVPTWLNPWQLQHAYTSPIQLRGIVATMTQIIADIKAIHSELLTQFPLYTGERSAQEWLGSLVTPLLRKVTEVHDVAAIRTDMQAGVTPGMTATR-