Monarch geneset OGS2.0

DPOGS214376
Transcript: DPOGS214376-TA  1233 bp
Protein: DPOGS214376-PA  410 aa
Genomic position: DPSCF300020 + 882111-885750
RNAseq coverage: 171x (Rank: top 51%)
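The lengths listed above can be cross-checked with simple arithmetic. This is a minimal sketch. It assumes the transcript is a complete coding sequence with a single stop codon and no UTRs, and that the genomic coordinates are 1-based and inclusive. The record states neither, and the function names are illustrative only.

```python
def expected_cds_length(protein_aa: int, stop_codons: int = 1) -> int:
    """Nucleotides needed to encode `protein_aa` residues plus stop codons.

    Assumption: the transcript contains only coding sequence (no UTRs).
    """
    return 3 * (protein_aa + stop_codons)


def genomic_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


cds_bp = expected_cds_length(410)        # protein length from the record
span_bp = genomic_span(882111, 885750)   # coordinates from the record

print(cds_bp)   # 1233
print(span_bp)  # 3640
```

Under these assumptions the expected coding length equals the listed 1233 bp. The genomic span is longer than the transcript, which would be consistent with introns, although the record does not list the exon structure.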
Annotation
Heliconius: HMEL014237  3e-120  76.47%
Bombyx: BGIBMGA004000-TA  2e-137  61.83%
Drosophila: CG7997-PA  6e-145  57.14%
EBI UniRef50: UniRef50_E0VBI5  2e-124  53.63%  Alpha-N-acetylgalactosaminidase, putative n=5 Tax=Bilateria RepID=E0VBI5_PEDHC
NCBI RefSeq: NP_001040191.1  7e-164  66.01%  alpha-N-acetylgalactosaminidase [Bombyx mori]
NCBI nr blastp: gi|114051916  1e-162  66.01%  alpha-N-acetylgalactosaminidase precursor [Bombyx mori]
NCBI nr blastx: gi|114051916  4e-163  66.17%  alpha-N-acetylgalactosaminidase precursor [Bombyx mori]
Group
Gene Ontology:
  GO:0008152  1.4e-115  metabolic process
  GO:0003824  1.4e-115  catalytic activity
  GO:0004553  2.6e-52  hydrolase activity, hydrolyzing O-glycosyl compounds
  GO:0005975  2.6e-52  carbohydrate metabolic process
  GO:0043169  2.8e-18  cation binding
KEGG pathway: der:Dere_GG22272  1e-143
  K01189 (GLA) maps->
    Galactose metabolism
    Lysosome
    Glycerolipid metabolism
    Sphingolipid metabolism
    Glycosphingolipid biosynthesis - globo series
InterPro domain:
  [16-306]  IPR013785  1.4e-115  Aldolase-type TIM barrel
  [18-305]  IPR017853  2.3e-99  Glycoside hydrolase, superfamily
  [20-39]  IPR002241  2.6e-52  Glycoside hydrolase, family 27
  [307-399]  IPR013780  2.8e-18  Glycosyl hydrolase, family 13, all-beta
  [51-132]  IPR000111  1.7e-12  Glycoside hydrolase, clan GH-D
Orthology group: MCL17256  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214376-TA
ATGGGAACTCATTTAATCGCCATTTTTGCAATAATACCATATGTCTTGGCTCTCGATAATGGACTAGCGCTCACTCCGCCAATGGGGTGGTTGACCTGGCAGCGATTTCGATGTATAACAGATTGCGATAAATATCCAAATGAGTGTATAAGTGAATCTCTCATTAAACGGATGGCAGACATTATGGTCAACGAGGGATATTCCCACGCTGGGTACAAATACGTCGGCATCGACGACTGTTGGCTCGAGAAAACACGTGACGCAAACGGTCGATTGGTTCCCGATAGGAAACGGTTTCCGAACGGTATGAAGGCTGTCGCAGATTATCTGCATGATCTCGGTTTAAAATTCGCGTTATACCAGGATTACGGTACAAAAACCTGCGCTGGTTACCCCGGGGTACTAGGGCATGAGGCTGTTGACGTTCAGACATTCGCCGAATGGGAAGTGGATTATATTAAATTAGACGGATGTAATGTCAACGTTTCCAAGATGGACACCGGTTATCCGGAATTTGGAAAATTGATGAATGAAAGCGGTCGGCCCATGGTATACTCATGTAGCTGGCCAGCGTATCAGAATAAACCTGATTATGCATCGATATCGAAGCACTGTAACATGTGGCGTAACTGGGACGATATCCAGGACTCGTGGGCTTCACTCACCACGATCATGAGCTGGTTTGCGGAAAAACAGGAAGAAATCGCCAAATACGCCGGACCCGGAAGATGGAATGACCCGGATATGTTGCTCATAGGAAATTTTGGATTATCACTGGACCAGGCGAGAGTTCAAATGGCCGTGTGGTCGATACTGGCCGCCCCACTGCTCATGAGTGTAGATCTGGCCACCATCCGACCGGAGTTTAAGGAGGTGTTGCTTAACAAAGACATCATAGCCATAGATCAAGACGAGCTGGGCAAGCAAGGGTTAATGGTGTGGAATAAAGCGAAATGCGAGATCTGGACACGCGAATTAGTGGACGGTATAGCGGTAGCGTTTGTCAGTAAAAGAGATGATGGAGCGCCTCACACTGTTGATGTTACAACTGAGGATATGAAAATACCACCGACGACGTATCATATACAGGATCTGTACAAAGATGGACATAATTTCAAATTTGATTGCAAAGGAAACTTCACAACCAGAATCAATCCGTCAGGCGTCAGATTCTACAAGTTCATCCCCATAAAAGGCAATGAGGTTGATAGCCCTTCTATCACCTATATATAG

Protein sequence:

>DPOGS214376-PA
MGTHLIAIFAIIPYVLALDNGLALTPPMGWLTWQRFRCITDCDKYPNECISESLIKRMADIMVNEGYSHAGYKYVGIDDCWLEKTRDANGRLVPDRKRFPNGMKAVADYLHDLGLKFALYQDYGTKTCAGYPGVLGHEAVDVQTFAEWEVDYIKLDGCNVNVSKMDTGYPEFGKLMNESGRPMVYSCSWPAYQNKPDYASISKHCNMWRNWDDIQDSWASLTTIMSWFAEKQEEIAKYAGPGRWNDPDMLLIGNFGLSLDQARVQMAVWSILAAPLLMSVDLATIRPEFKEVLLNKDIIAIDQDELGKQGLMVWNKAKCEIWTRELVDGIAVAFVSKRDDGAPHTVDVTTEDMKIPPTTYHIQDLYKDGHNFKFDCKGNFTTRINPSGVRFYKFIPIKGNEVDSPSITYI-
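The protein sequence above can be related to the nucleotide sequence by translating codons in frame. This is a minimal sketch that assumes the standard genetic code (NCBI translation table 1) and writes the stop codon as "-", mirroring the trailing "-" in this record. The `translate` helper is illustrative and is not part of any database tooling. Only the first 30 and last 12 nucleotides of DPOGS214376-TA are used, to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(nt: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame nucleotide string into one-letter amino acids."""
    nt = nt.upper()
    if len(nt) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    residues = []
    for i in range(0, len(nt), 3):
        aa = CODON_TABLE[nt[i:i + 3]]
        # The table marks stops as "*"; emit the record's symbol instead.
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


first_30_nt = "ATGGGAACTCATTTAATCGCCATTTTTGCA"  # start of DPOGS214376-TA
last_12_nt = "ACCTATATATAG"                     # end of DPOGS214376-TA

print(translate(first_30_nt))  # MGTHLIAIFA
print(translate(last_12_nt))   # TYI-
```

Both fragments agree with the start (MGTHLIAIFA) and end (TYI-) of DPOGS214376-PA. Passing the full nucleotide sequence to the same function would give a whole-sequence comparison.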