DPGLEAN19329 in OGS1.0

New model in OGS2.0  DPOGS204292
Genomic Position  scaffold388:- 52661-61493
CDS Length  3354
Paired RNAseq reads  2815
Single RNAseq reads  7395
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA007570 (1e-167)
Best Drosophila hit  sulfated (1e-110)
Best Human hit  extracellular sulfatase Sulf-2 isoform a precursor (7e-74)
Best NR hit (blastp)  PREDICTED: similar to CG6725-PA [Nasonia vitripennis] (2e-149)
Best NR hit (blastx)  AGAP003374-PA [Anopheles gambiae str. PEST] (8e-115)
GeneOntology terms
GO:0008449 N-acetylglucosamine-6-sulfatase activity
GO:0007389 pattern specification process
GO:0018741 alkyl sulfatase activity
GO:0008152 metabolic process
GO:0005783 endoplasmic reticulum
GO:0009986 cell surface
GO:0005795 Golgi stack
InterPro families
IPR017850 Alkaline-phosphatase-like, core domain
IPR017849 Alkaline phosphatase-like, alpha/beta/alpha
IPR000917 Sulfatase
Orthology group  MCL10950

Nucleotide sequence:

ATGAATGGAAAGAGAATAAAACACGGAGATGATTATAATAAAGATTATTATCCGGATCTA
ATAGCGAATGATTCGATAGCGTTCTTGCGTGCTTCAAAGCGAAGATTTTCAAGAAAACCG
GTCCTCCTCGTGATGTCTTTCCCCGCACCTCATGGACCCGAGGATTCAGCTCCGCAGTAC
TCTCATCTCTTCTTTAATGTTACAACCCATCACACACCAACTTACGATATGGCGCCAAAT
CCAGATAAACAATGGATCCTGCGAGTGACAGAGAAAATGAAACCTATTCATAGACAGTTC
ACGGACCTGTTAATGACAAAGCGTTTGCAGACTTTGCAAAGTGTTGATGTGGCTGTGGAA
CGAGTGTACCAGGAGCTTAAGGCTCTCGGGGAGTTAGATAACACCTATCTGGTGTACACA
TCAGATCACGGATACCACCTTGGACAGTTCGGACTGGTTAAGGGCAAGAGCTTTCCCTTC
GAATTCGATATAAGAGTGCCGTTTTTAGTACGCGGCCCGGGAGTCGAACCTGGAACTGTC
GTGGACGATATAATTCTCAACATCGATCTGGCGCCCACATTTCTGGATATGGGAGGAGTT
CAGCCCCCGCCTCATATGGACGGCAGGTCGCTGCTGCCGCTGCTGCAGCCACGGAGGCGA
CGAGCGACAGCACATTGGCCAGATACATTCCTAGTCGAGAGCTCTGGACGTCGCGAGACC
CAAGCTCATTTAATGGAAGAACGTTTGCGAGCACAAAAATACAGTAAAGAAATGAATGCA
AGAACAACGACTATTATGCCGCTACAGTCGTCGTCCGAGAGCGGAGACTTCGAGGACGAG
TCTGACGATGACTTCCTGGAACTTGATGATATTATGCCCCTACAGTCGTCGTCCGAGAGC
GGAGACTTCGAGGATGAGTCTGATGATGACTTCCTGGAACTTGATGATGACGAAGATGAT
GAGGACAATGAGAGCACTGAGGATACATCGAACAAATCAAATCAACCTCTCATATCAAAT
GAAAGTCACAATCCCATACTGGAGGCGAGTCTCGATAAGATTCTTGGAGGTGACGGTGCT
GTCAATAATCAATATAATTACCTCAGCCAATCAGAAATGGATGTCATTAATGGGAAGGCA
GCACGTATAGCGGCTGAATGTTCCAAAGCTGAACTCCGGGCTCCCTGCTCCGTCGGACGG
AAGTGGAAATGTGTGCTTGTTAATGGACGATGGAGGAAACACAAATGTAAATATGAGGAT
ATAACTATTCCACAACCGAAAATGAGCACAAAGAAATGTGCTTGTTTCACTCCAAGTGGC
CTTGTTTATACAAGACTGGAAACAGATGGTACAATCGCTAGACGACCCGCAGATTTACAG
AAAGATAATAACACAAGATCACGGAGGTCTACAGATAATGATGTATTTGAACCGAACACT
GTGGACACAATTCTTGAGGAAAATCCTAGTATTGGACATCTAAGTTTTAACAATGAGCCT
ATTGATGAAATAGAGAAGAGGAACATTGAAAACAAAGTCGATAAACTCATTAGGGAAACT
GAAGCTTTCCTCGAGGCGTACGAACGAACCAAAGATAATATAGATCATAAGAGAAGTAAG
AGGCGTGCTCAGCATTGGGGTCACAAACACAAACCACACAAAAACGACCCATTGTTGAAC
ATGAATGAATCGTCTCTAGAATGTAAGATAGATAAAGACGGCACTGTTAATTGTTCGCAA
GTTATATACAATGATTTGAAAGCTTGGCACACCAACAGACTGAGTCTAGAAGACCAAATA
AGAGAATTGAAAACAAAGTTGGAAGACTTAAAAGAAATTAAGAGGCATTTAAAAATAAGC
AAACCTGTTGTCGAAGTACAAACGGTAACGCCATCGTACGTCAACACGCATTTACACAAT
AAAACACAAACACCTGATAGCACGAAGGACAGCTTTAGGAGATCACGTTTCCATAGAATT
AAAACCAAGCACAGGAACAGCACAGTGATTGATAAAAAATTCAGACAACTCAACGAATAC
ATCCTACCGACTGTGAACGGTCACACCAGAGACGACATATTTAACACTCAACTCAGGAAC
GAAACGTCCACAGAAGCAGTCGTAAAACAATTGAGTACAATTGATCTGGTCGAAATTGAT
TCCAATCAGACATACGTTTTAGGAAAATTACCAAAACCACAAGTAACTACAATCATAACC
GAAAGTAACTTCTATGATCAGAATTTTGCAGCGGAGAAGAGCACGCAGCAAGCAATTACC
ACGTCGACTGATGATACAGCGACTATATACAGCGATATTTCCGGGATAAATAACACATCC
AGTAAAACACCACAAACGGAAAAACCGACGAGCCCGAGTCAAGAAACTTCCACGGACATT
CTGTCCACATTGCAGTATTACAGTTCCGAAGCTAATAAAGTAATATTGACAATGACAACG
ACACCAACTCCTGTGACAAGACGGACAACAGCATCTCATCAAACATATAACCGGACATAC
CACACAAAACCATCGAACAGACCAAAGTCATCGTCTCTAGGACCAACAAGATTCGATGCG
TCGGAATATGAACAAAGAAATCCTAATAAAGGAAATTCTAACAATCACGGAGTATTCAGC
AAGCCGATGGACGTGTTCCAAAGAAGATTACATCCTTTGTTTATAGAGAATGAGGATAAA
CATGTCTGTTACTGTGAAGAGAGTCGCAAAATGAAACCAGTAGGTAACTCGTATTTGGAA
GCCACTCAAAGAGCCAGAGAGGAACGAAGGAAATTGAAAGAACAGAGATTGAGAAAGAAG
CTTAGGAAAGCGAAGAAGAAGGCGGAATTGGAAAGGTTATGTGAATCAGAGCGTATGAAT
TGCTTCCGACACGACAATGACCATTGGCGCACAGCCCCGCTATGGACCGCCGGACCTTTC
TGTTTCTGTATGAGCGCCTCAAACAATACATACAATTGTGTGAGAACTATTAACTCGACC
CACAACCTGCTCTACTGTGAGTTCGTCACTGGTTTGATAACGTACTACAATCTGCGTATA
GATCCGTTTGAAACACAAAACAGAGTTAAATATTTATCGTCAGCTGAAAAGGAATATTTC
CACAATCAGTTGCAACAGCTTTTGACATGTCGGGGACCGTCGTGTAGAAGATTCTCGCAT
TCAAATGTTGGAGGTATTAAAGATGATGTCAGCAGACGGACTGAAGATGACCAACTCATG
TATAGAGGGGAGCCAATTGGTTACAGTGAAAGGGCATGGCGATGGAGTGGCTATGGTCGT
AGATATGCAAGAGCCAGAGAGTTGCACCGGCGTCGACATACCGCGGCCTTCTAG

Protein sequence:

MNGKRIKHGDDYNKDYYPDLIANDSIAFLRASKRRFSRKPVLLVMSFPAPHGPEDSAPQY
SHLFFNVTTHHTPTYDMAPNPDKQWILRVTEKMKPIHRQFTDLLMTKRLQTLQSVDVAVE
RVYQELKALGELDNTYLVYTSDHGYHLGQFGLVKGKSFPFEFDIRVPFLVRGPGVEPGTV
VDDIILNIDLAPTFLDMGGVQPPPHMDGRSLLPLLQPRRRRATAHWPDTFLVESSGRRET
QAHLMEERLRAQKYSKEMNARTTTIMPLQSSSESGDFEDESDDDFLELDDIMPLQSSSES
GDFEDESDDDFLELDDDEDDEDNESTEDTSNKSNQPLISNESHNPILEASLDKILGGDGA
VNNQYNYLSQSEMDVINGKAARIAAECSKAELRAPCSVGRKWKCVLVNGRWRKHKCKYED
ITIPQPKMSTKKCACFTPSGLVYTRLETDGTIARRPADLQKDNNTRSRRSTDNDVFEPNT
VDTILEENPSIGHLSFNNEPIDEIEKRNIENKVDKLIRETEAFLEAYERTKDNIDHKRSK
RRAQHWGHKHKPHKNDPLLNMNESSLECKIDKDGTVNCSQVIYNDLKAWHTNRLSLEDQI
RELKTKLEDLKEIKRHLKISKPVVEVQTVTPSYVNTHLHNKTQTPDSTKDSFRRSRFHRI
KTKHRNSTVIDKKFRQLNEYILPTVNGHTRDDIFNTQLRNETSTEAVVKQLSTIDLVEID
SNQTYVLGKLPKPQVTTIITESNFYDQNFAAEKSTQQAITTSTDDTATIYSDISGINNTS
SKTPQTEKPTSPSQETSTDILSTLQYYSSEANKVILTMTTTPTPVTRRTTASHQTYNRTY
HTKPSNRPKSSSLGPTRFDASEYEQRNPNKGNSNNHGVFSKPMDVFQRRLHPLFIENEDK
HVCYCEESRKMKPVGNSYLEATQRAREERRKLKEQRLRKKLRKAKKKAELERLCESERMN
CFRHDNDHWRTAPLWTAGPFCFCMSASNNTYNCVRTINSTHNLLYCEFVTGLITYYNLRI
DPFETQNRVKYLSSAEKEYFHNQLQQLLTCRGPSCRRFSHSNVGGIKDDVSRRTEDDQLM
YRGEPIGYSERAWRWSGYGRRYARARELHRRRHTAAF