Monarch geneset OGS2.0

DPOGS206993
Transcript        DPOGS206993-TA    1380 bp
Protein           DPOGS206993-PA    459 aa
Genomic position  DPSCF300001 + 843194-847745
RNAseq coverage   1139x (Rank: top 11%)

Annotation
Source           Hit                E-value   Identity   Description
Heliconius       HMEL014371         0.0       80.26%
Bombyx           BGIBMGA012952-TA   0.0       68.79%
Drosophila       CG18278-PA         2e-117    45.95%
EBI UniRef50     UniRef50_Q17CP8    3e-126    47.79%     Sulfatase n=4 Tax=Endopterygota RepID=Q17CP8_AEDAE
NCBI RefSeq      XP_001649229.1     6e-127    47.79%     sulfatase [Aedes aegypti]
NCBI nr blastp   gi|157106230       1e-125    47.79%     sulfatase [Aedes aegypti]
NCBI nr blastx   gi|157106230       7e-126    47.79%     sulfatase [Aedes aegypti]

Group

Gene Ontology
GO:0008449   4.7e-146   N-acetylglucosamine-6-sulfatase activity
GO:0030203   4.7e-146   glycosaminoglycan metabolic process
GO:0005764   1.8e-145   lysosome
GO:0008152   2.8e-82    metabolic process
GO:0003824   2.8e-82    catalytic activity
GO:0008484   9.2e-57    sulfuric ester hydrolase activity

KEGG pathway
aag:AaeL_AAEL004460   2e-126
K01137 (GNS) maps-> Lysosome
                    Glycosaminoglycan degradation

InterPro domain
[1-458]     IPR012251   4.7e-146   N-acetylglucosamine-6-sulfatase
[1-424]     IPR015981   1.8e-145   N-acetylglucosamine-6-sulfatase, eukaryotic
[373-417]   IPR017849   2.8e-82    Alkaline phosphatase-like, alpha/beta/alpha
[13-423]    IPR017850   8.1e-76    Alkaline-phosphatase-like, core domain
[11-419]    IPR000917   9.2e-57    Sulfatase

Orthology group   MCL12683   Single-copy universal gene

Nucleotide sequence:

>DPOGS206993-TA
ATGAACCCAATGCAGAGTGTCAAACGGTTCATCGGAAACGAAGGAACCACGTTTACGAACTCTTACGTTACTTCACCAATATGCTGTCCAAGTAGAGCCAGCTTCCTAACAGGGCTTCACGTCCACAACCATATGACGTGGAATAACAGCATCAGCGGCGGTTGTTATAGTCGTGTTTGGAGGAAATTCGAAAAACGCACTTTCGCCACGGCACTAAAGGACGCGGGATACAATACGTTTTACGCGGGAAAATATTTAAACGAGTACGGCGTCCATGCGTCTGGCGGTCCTGAACAAGTTCCTCCGGGCTGGTCAGAGTGGCACGGACTCGTTGGAAACTCTGTGTATTACAACTACACTATATCTAATAATGGTGTACCAACATTTTCAACAGATCTATATCTTACTGATATAATACGTGATCTAAGTTTGAATTATATCGAGAATCAAACTGAGTCGCGTCCTTTCCTGATGGTTTTGGCGCCGCCTGCACCCCACCAGCCCGCGACACCGGCTGAGAGACACCGCGGCGTCTACGACAACACCACAGTACTAAAAACGCCGAACTTTAACATAGCTGACGATAACAAACATTGGCTCATAAGAATGCCACCTTCGCCTCTACCGGAAAAAATTATGCCTGAATTAGACAGAGTTTACCGTTCGAGGTGGGAGAGTCTGTTGGCTGTCGATGAAATGGTAGCTGATGTGGTAGAATCATTGGACTCAAGTGGCCTCTTGCAGAACACATATCTAATATTCACATCGGATAATGGTTATCATATTGGTCAGTTCTCGCAAGTGTATGATAAACGGCAGCCCTACGAGGCGGATGTCAAAGTCCCGTTGCTCATACGTGGACCAACATTCCCCAGGAACTACACTGACAGTCAGCCGGTATTGAACATTGACATAGCTCCAACTATTATGGCATTGGCTGGTTTGTCCCCGCCGAGGACTATGGACGGAAGACAGATAACGGTCGCTCAGGAAGTAGAGAGATACATGCTGGTAGAATACTACGGAGAGGGCAGAGACGACTCAGTAGATCCAAGCTGCCCTTGGAAATATGACAGCGAACATCTAGCGCAATGTTATCCCCAATACGATTGCAAGTGCCAGGACGCTAGGAACAACACATTCGCTTGCTTGAGACACATTTCGCAACGATTCAACATGAAATACTGCAGCTTCGCAGACTCTGAGAACTTCACAGAAATGTATGACTTGAGCACAGACTTGTATGAACTGGACAACATAGTGGACAAAGTTCTACCATCTATAAAACACTGGTACAAATTAACTCTGTCCCAAATGCTGACGTGCAAAGGATACAAGAATTGCGATAACCCTTTGGAGAACCCTAAAGTTTATGGCTGA

Protein sequence:

>DPOGS206993-PA
MNPMQSVKRFIGNEGTTFTNSYVTSPICCPSRASFLTGLHVHNHMTWNNSISGGCYSRVWRKFEKRTFATALKDAGYNTFYAGKYLNEYGVHASGGPEQVPPGWSEWHGLVGNSVYYNYTISNNGVPTFSTDLYLTDIIRDLSLNYIENQTESRPFLMVLAPPAPHQPATPAERHRGVYDNTTVLKTPNFNIADDNKHWLIRMPPSPLPEKIMPELDRVYRSRWESLLAVDEMVADVVESLDSSGLLQNTYLIFTSDNGYHIGQFSQVYDKRQPYEADVKVPLLIRGPTFPRNYTDSQPVLNIDIAPTIMALAGLSPPRTMDGRQITVAQEVERYMLVEYYGEGRDDSVDPSCPWKYDSEHLAQCYPQYDCKCQDARNNTFACLRHISQRFNMKYCSFADSENFTEMYDLSTDLYELDNIVDKVLPSIKHWYKLTLSQMLTCKGYKNCDNPLENPKVYG-
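
The transcript and protein lengths listed in this record agree with each other: 1380 bp is 460 codons, i.e. 459 amino acids plus one stop codon. The sketch below shows one way to check that relationship by translating a coding sequence with the standard genetic code. It assumes the standard code applies and that the trailing "-" in the protein sequence marks the stop codon; the function name `translate` and the `stop_symbol` parameter are illustrative and not part of the record.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (assumed to match the record's
    trailing "-"); codons with unexpected characters become "X". Any
    incomplete trailing codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    residues = (CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))
    return "".join(residues).replace("*", stop_symbol)


# First nine codons and last three codons of DPOGS206993-TA, copied from above.
print(translate("ATGAACCCAATGCAGAGTGTCAAACGG"))
print(translate("TATGGCTGA"))
```

Passing the full DPOGS206993-TA sequence to `translate` and comparing the result with DPOGS206993-PA would extend the same check to the whole record.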