DPGLEAN15540 in OGS1.0

New model in OGS2.0  DPOGS206993
Genomic Position  scaffold1:+ 442472-447134
CDS Length  1491
Paired RNAseq reads  2387
Single RNAseq reads  5530
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA012952 (0.0)
Best Drosophila hit  CG18278 (2e-112)
Best Human hit  N-acetylglucosamine-6-sulfatase precursor (2e-99)
Best NR hit (blastp)  sulfatase [Aedes aegypti] (3e-135)
Best NR hit (blastx)  sulfatase [Aedes aegypti] (9e-128)
GeneOntology terms
GO:0006044 N-acetylglucosamine metabolic process
GO:0008449 N-acetylglucosamine-6-sulfatase activity
GO:0005764 lysosome
GO:0030203 glycosaminoglycan metabolic process
InterPro families
IPR012251 N-acetylglucosamine-6-sulfatase
IPR000917 Sulfatase
IPR015981 N-acetylglucosamine-6-sulfatase, eukaryotic
IPR017850 Alkaline-phosphatase-like, core domain
IPR017849 Alkaline phosphatase-like, alpha/beta/alpha
Orthology group  MCL12965

Nucleotide sequence:

ATGTATAGTTATATCTTACTTGTGTTATTTTTTGCTCCGTTTAGTGCGTGTCAGGATAAA
CCGAATTTCGTAGTGATTCTAACCGACGATCAGGATGTAGTTTTGGATGGTATGAACCCA
ATGCAGAGTGTCAAACGGTTCATCGGAAACGAAGGAACCACGTTTACGAACTCTTACGTT
ACTTCACCAATATGCTGTCCAAGTAGAGCCAGCTTCCTAACAGGGCTTCACGTCCACAAC
CATATGACGTGGAATAACAGCATCAGCGGCGGTTGTTATAGTCGTGTTTGGAGGAAATTC
GAAAAACGCACTTTCGCCACGGCACTAAAGGACGCGGGATACAATACGTTTTACGCGGGA
AAATATTTAAACGAGTACGGCGTCCATGCGTCTGGCGGTCCTGAACAAGTTCCTCCGGGC
TGGTCAGAGTGGCACGGACTCGTTGGAAACTCTGTGTATTACAACTACACTATATCTAAT
AATGGTGTACCAACATTTTCAACAGATCTATATCTTACTGATATAATACGTGATCTAAGT
TTGAATTATATCGAGAATCAAACTGAGTCGCGTCCTTTCCTGATGGTTTTGGCGCCGCCT
GCACCCCACCAGCCCGCGACACCGGCTGAGAGACACCGCGGCGTCTACGACAACACCACA
GTACTAAAAACGCCGAACTTTAACATAGCTGACGATAACAAACATTGGCTCATAAGAATG
CCACCTTCGCCTCTACCGGAAAAAATTATGCCTGAATTAGACAGAGTTTACCGTTCGAGG
TGGGAGAGTCTGTTGGCTGTCGATGAAATGGTAGCTGATGTGGTAGAATCATTGGACTCA
AGTGGCCTCTTGCAGAACACATATCTAATATTCACATCGGATAATGGTTATCATATTGGT
CAGTTCTCGCAAGTGTATGATAAACGGCAGCCCTACGAGGCGGATGTCAAAGTCCCGTTG
CTCATACGTGGACCAACATTCCCCAGGAACTACACTGACAGTCAGCCGGTATTGAACATT
GACATAGCTCCAACTATTATGGCATTGGCTGGTTTGTCCCCGCCGAGGACTATGGACGGA
AGACAGATAACGGTCGCTCAGGAAGTAGAGAGATACATGCTGGTAGAATACTACGGAGAG
GGCAGAGACGACTCAGTAGATCCAAGCTGCCCTTGGAAATATGACAGCGAACATCTAGCG
CAATGTTATCCCCAATACGATTGCAAGTGCCAGGACGCTAGGAACAACACATTCGCTTGC
TTGAGACACATTTCGCAACGATTCAACATGAAATACTGCAGCTTCGCAGACTCTGAGAAC
TTCACAGAAATGTATGACTTGAGCACAGACTTGTATGAACTGGACAACATAGTGGACAAA
GTTCTACCATCTATAAAACACTGGTACAAATTAACTCTGTCCCAAATGCTGACGTGCAAA
GGATACAAGAATTGCGATAACCCTTTGGAGAACCCTAAAGTTTATGGCTGA

Protein sequence:

MYSYILLVLFFAPFSACQDKPNFVVILTDDQDVVLDGMNPMQSVKRFIGNEGTTFTNSYV
TSPICCPSRASFLTGLHVHNHMTWNNSISGGCYSRVWRKFEKRTFATALKDAGYNTFYAG
KYLNEYGVHASGGPEQVPPGWSEWHGLVGNSVYYNYTISNNGVPTFSTDLYLTDIIRDLS
LNYIENQTESRPFLMVLAPPAPHQPATPAERHRGVYDNTTVLKTPNFNIADDNKHWLIRM
PPSPLPEKIMPELDRVYRSRWESLLAVDEMVADVVESLDSSGLLQNTYLIFTSDNGYHIG
QFSQVYDKRQPYEADVKVPLLIRGPTFPRNYTDSQPVLNIDIAPTIMALAGLSPPRTMDG
RQITVAQEVERYMLVEYYGEGRDDSVDPSCPWKYDSEHLAQCYPQYDCKCQDARNNTFAC
LRHISQRFNMKYCSFADSENFTEMYDLSTDLYELDNIVDKVLPSIKHWYKLTLSQMLTCK
GYKNCDNPLENPKVYG
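
Consistency check (illustrative sketch):

The fields above can be cross-checked against each other. The sketch below is not part of the record. It assumes the standard nuclear genetic code (NCBI translation table 1) and 1-based inclusive genomic coordinates, and all names in it (translate, FIRST_CDS_LINE, and so on) are hypothetical. It translates the first and last lines of the nucleotide sequence and derives the expected protein length and genomic span from the reported values.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated
# with bases in the order T, C, A, G. Assumed here, not stated in the record.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    seq = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First and last lines of the nucleotide sequence, copied from the record.
# Each full line holds 60 bases, so every line starts in frame.
FIRST_CDS_LINE = "ATGTATAGTTATATCTTACTTGTGTTATTTTTTGCTCCGTTTAGTGCGTGTCAGGATAAA"
LAST_CDS_LINE = "GGATACAAGAATTGCGATAACCCTTTGGAGAACCCTAAAGTTTATGGCTGA"

# Values reported in the record.
CDS_LENGTH = 1491
GENOMIC_START, GENOMIC_END = 442472, 447134

# The CDS length includes the stop codon, so the protein is one residue shorter.
expected_protein_length = CDS_LENGTH // 3 - 1
# Span on the scaffold, assuming 1-based inclusive coordinates.
genomic_span = GENOMIC_END - GENOMIC_START + 1

print(translate(FIRST_CDS_LINE))
print(translate(LAST_CDS_LINE))
print(expected_protein_length, genomic_span)
```

The two translated fragments agree with the start (MYSYILLVLFFAPFSACQDK) and end (GYKNCDNPLENPKVYG) of the protein sequence, and the expected length of 496 residues matches the eight full lines of 60 residues plus the final 16. The genomic span exceeds the CDS length, which would be consistent with introns or other non-coding sequence within the locus, although the record itself does not list exon boundaries.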