Monarch geneset OGS2.0

DPOGS213239
Transcript: DPOGS213239-TA (369 bp)
Protein: DPOGS213239-PA (122 aa)
Genomic position: DPSCF300124, 389993-390553
RNAseq coverage: 214x (Rank: top 46%)
Annotation
Heliconius      HMEL004705        2e-62  94.87%
Bombyx          BGIBMGA009518-TA  2e-51  76.42%
Drosophila      CG5731-PA         2e-40  65.18%
EBI UniRef50    UniRef50_E0VBI5   1e-41  69.44%  Alpha-N-acetylgalactosaminidase, putative n=5 Tax=Bilateria RepID=E0VBI5_PEDHC
NCBI RefSeq     XP_001606799.1    3e-44  83.51%  PREDICTED: similar to ENSANGP00000020847 [Nasonia vitripennis]
NCBI nr blastp  gi|332025551      2e-49  88.30%  Alpha-N-acetylgalactosaminidase [Acromyrmex echinatior]
NCBI nr blastx  gi|322799781      3e-47  77.06%  hypothetical protein SINV_01689 [Solenopsis invicta]
Group
Gene Ontology
GO:0008152  3.2e-31  metabolic process
GO:0003824  3.2e-31  catalytic activity
GO:0004553  1.3e-11  hydrolase activity, hydrolyzing O-glycosyl compounds
GO:0005975  1.3e-11  carbohydrate metabolic process
KEGG pathway
ame:725899  5e-41
K01204 (NAGA) maps-> Lysosome; Glycosphingolipid biosynthesis - globo series
InterPro domain
[24-117]  IPR013785  3.2e-31  Aldolase-type TIM barrel
[24-116]  IPR017853  2.1e-30  Glycoside hydrolase, superfamily
[26-45]   IPR002241  1.3e-11  Glycoside hydrolase, family 27
[31-118]  IPR000111  2.1e-07  Glycoside hydrolase, clan GH-D
Orthology group: MCL25617 Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213239-TA
ATGGGGTCATGGAAACTGTGCGTGTGGATATTGTTTGTGGTTGTGTGTATCCTTGGGAAGATATCAGGTCTTGATAATGGGTTGGCATTGACCCCTCCCATGGGATGGCTGGCTTGGGAGAGGTTCAGATGTAACACGGACTGTAAAAATGATCCTGATAATTGTATAAGCGATCGTCTGTTCAGAACGATGACGGATATTCTGGTAGCCGAGGGTTATGCTGCAGCTGGTTACGAATACGTGAACGTGGACGACTGCTGGCCTGAAAGAGAGAGAGATCCGAGGGGGAAGCTTGTACCTGACAGAGAACGCTTCCCTTATGGAATGAAGAGCCTGTCTGATTACGTGAGTAATATATATATAAATTAA

Protein sequence:

>DPOGS213239-PA
MGSWKLCVWILFVVVCILGKISGLDNGLALTPPMGWLAWERFRCNTDCKNDPDNCISDRLFRTMTDILVAEGYAAAGYEYVNVDDCWPERERDPRGKLVPDRERFPYGMKSLSDYVSNIYIN-
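
The transcript and protein entries above can be cross-checked against each other: a 122 aa protein plus one stop codon corresponds to 3 x 123 = 369 bp of coding sequence, which agrees with the transcript length listed in the record. The following is a minimal Python sketch of that check. It assumes the standard genetic code applies and that the trailing "-" in the protein sequence marks the stop codon; the function names `translate_cds` and `expected_cds_length` are hypothetical and are not part of any database tool.

```python
from itertools import product

# Standard genetic code (assumed here), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol`; "-" is assumed to match the
    convention used for the protein sequence in this record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_cds_length(protein_length):
    """CDS length in bp for a protein of the given length plus one stop codon."""
    return 3 * (protein_length + 1)


if __name__ == "__main__":
    # First nine codons of DPOGS213239-TA, copied from the record above.
    print(translate_cds("ATGGGGTCATGGAAACTGTGCGTGTGG"))  # MGSWKLCVW
    # 122 aa listed for DPOGS213239-PA.
    print(expected_cds_length(122))  # 369
```

Passing the full DPOGS213239-TA sequence to `translate_cds` and comparing the result with the DPOGS213239-PA line extends the same check to the whole record.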