New model in OGS2.0 | DPOGS204292 |
---|---|
Genomic Position | scaffold388:52661-61493 (- strand) |
CDS Length | 3354 |
Paired RNAseq reads | 2815 |
Single RNAseq reads | 7395 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA007570 (1e-167) |
Best Drosophila hit | sulfated (1e-110) |
Best Human hit | extracellular sulfatase Sulf-2 isoform a precursor (7e-74) |
Best NR hit (blastp) | PREDICTED: similar to CG6725-PA [Nasonia vitripennis] (2e-149) |
Best NR hit (blastx) | AGAP003374-PA [Anopheles gambiae str. PEST] (8e-115) |
GeneOntology terms | GO:0008449 N-acetylglucosamine-6-sulfatase activity; GO:0007389 pattern specification process; GO:0018741 alkyl sulfatase activity; GO:0008152 metabolic process; GO:0005783 endoplasmic reticulum; GO:0009986 cell surface; GO:0005795 Golgi stack |
InterPro families | IPR017850 Alkaline-phosphatase-like, core domain; IPR017849 Alkaline phosphatase-like, alpha/beta/alpha; IPR000917 Sulfatase |
Orthology group | MCL10950 |
Nucleotide sequence:
ATGAATGGAAAGAGAATAAAACACGGAGATGATTATAATAAAGATTATTATCCGGATCTA
ATAGCGAATGATTCGATAGCGTTCTTGCGTGCTTCAAAGCGAAGATTTTCAAGAAAACCG
GTCCTCCTCGTGATGTCTTTCCCCGCACCTCATGGACCCGAGGATTCAGCTCCGCAGTAC
TCTCATCTCTTCTTTAATGTTACAACCCATCACACACCAACTTACGATATGGCGCCAAAT
CCAGATAAACAATGGATCCTGCGAGTGACAGAGAAAATGAAACCTATTCATAGACAGTTC
ACGGACCTGTTAATGACAAAGCGTTTGCAGACTTTGCAAAGTGTTGATGTGGCTGTGGAA
CGAGTGTACCAGGAGCTTAAGGCTCTCGGGGAGTTAGATAACACCTATCTGGTGTACACA
TCAGATCACGGATACCACCTTGGACAGTTCGGACTGGTTAAGGGCAAGAGCTTTCCCTTC
GAATTCGATATAAGAGTGCCGTTTTTAGTACGCGGCCCGGGAGTCGAACCTGGAACTGTC
GTGGACGATATAATTCTCAACATCGATCTGGCGCCCACATTTCTGGATATGGGAGGAGTT
CAGCCCCCGCCTCATATGGACGGCAGGTCGCTGCTGCCGCTGCTGCAGCCACGGAGGCGA
CGAGCGACAGCACATTGGCCAGATACATTCCTAGTCGAGAGCTCTGGACGTCGCGAGACC
CAAGCTCATTTAATGGAAGAACGTTTGCGAGCACAAAAATACAGTAAAGAAATGAATGCA
AGAACAACGACTATTATGCCGCTACAGTCGTCGTCCGAGAGCGGAGACTTCGAGGACGAG
TCTGACGATGACTTCCTGGAACTTGATGATATTATGCCCCTACAGTCGTCGTCCGAGAGC
GGAGACTTCGAGGATGAGTCTGATGATGACTTCCTGGAACTTGATGATGACGAAGATGAT
GAGGACAATGAGAGCACTGAGGATACATCGAACAAATCAAATCAACCTCTCATATCAAAT
GAAAGTCACAATCCCATACTGGAGGCGAGTCTCGATAAGATTCTTGGAGGTGACGGTGCT
GTCAATAATCAATATAATTACCTCAGCCAATCAGAAATGGATGTCATTAATGGGAAGGCA
GCACGTATAGCGGCTGAATGTTCCAAAGCTGAACTCCGGGCTCCCTGCTCCGTCGGACGG
AAGTGGAAATGTGTGCTTGTTAATGGACGATGGAGGAAACACAAATGTAAATATGAGGAT
ATAACTATTCCACAACCGAAAATGAGCACAAAGAAATGTGCTTGTTTCACTCCAAGTGGC
CTTGTTTATACAAGACTGGAAACAGATGGTACAATCGCTAGACGACCCGCAGATTTACAG
AAAGATAATAACACAAGATCACGGAGGTCTACAGATAATGATGTATTTGAACCGAACACT
GTGGACACAATTCTTGAGGAAAATCCTAGTATTGGACATCTAAGTTTTAACAATGAGCCT
ATTGATGAAATAGAGAAGAGGAACATTGAAAACAAAGTCGATAAACTCATTAGGGAAACT
GAAGCTTTCCTCGAGGCGTACGAACGAACCAAAGATAATATAGATCATAAGAGAAGTAAG
AGGCGTGCTCAGCATTGGGGTCACAAACACAAACCACACAAAAACGACCCATTGTTGAAC
ATGAATGAATCGTCTCTAGAATGTAAGATAGATAAAGACGGCACTGTTAATTGTTCGCAA
GTTATATACAATGATTTGAAAGCTTGGCACACCAACAGACTGAGTCTAGAAGACCAAATA
AGAGAATTGAAAACAAAGTTGGAAGACTTAAAAGAAATTAAGAGGCATTTAAAAATAAGC
AAACCTGTTGTCGAAGTACAAACGGTAACGCCATCGTACGTCAACACGCATTTACACAAT
AAAACACAAACACCTGATAGCACGAAGGACAGCTTTAGGAGATCACGTTTCCATAGAATT
AAAACCAAGCACAGGAACAGCACAGTGATTGATAAAAAATTCAGACAACTCAACGAATAC
ATCCTACCGACTGTGAACGGTCACACCAGAGACGACATATTTAACACTCAACTCAGGAAC
GAAACGTCCACAGAAGCAGTCGTAAAACAATTGAGTACAATTGATCTGGTCGAAATTGAT
TCCAATCAGACATACGTTTTAGGAAAATTACCAAAACCACAAGTAACTACAATCATAACC
GAAAGTAACTTCTATGATCAGAATTTTGCAGCGGAGAAGAGCACGCAGCAAGCAATTACC
ACGTCGACTGATGATACAGCGACTATATACAGCGATATTTCCGGGATAAATAACACATCC
AGTAAAACACCACAAACGGAAAAACCGACGAGCCCGAGTCAAGAAACTTCCACGGACATT
CTGTCCACATTGCAGTATTACAGTTCCGAAGCTAATAAAGTAATATTGACAATGACAACG
ACACCAACTCCTGTGACAAGACGGACAACAGCATCTCATCAAACATATAACCGGACATAC
CACACAAAACCATCGAACAGACCAAAGTCATCGTCTCTAGGACCAACAAGATTCGATGCG
TCGGAATATGAACAAAGAAATCCTAATAAAGGAAATTCTAACAATCACGGAGTATTCAGC
AAGCCGATGGACGTGTTCCAAAGAAGATTACATCCTTTGTTTATAGAGAATGAGGATAAA
CATGTCTGTTACTGTGAAGAGAGTCGCAAAATGAAACCAGTAGGTAACTCGTATTTGGAA
GCCACTCAAAGAGCCAGAGAGGAACGAAGGAAATTGAAAGAACAGAGATTGAGAAAGAAG
CTTAGGAAAGCGAAGAAGAAGGCGGAATTGGAAAGGTTATGTGAATCAGAGCGTATGAAT
TGCTTCCGACACGACAATGACCATTGGCGCACAGCCCCGCTATGGACCGCCGGACCTTTC
TGTTTCTGTATGAGCGCCTCAAACAATACATACAATTGTGTGAGAACTATTAACTCGACC
CACAACCTGCTCTACTGTGAGTTCGTCACTGGTTTGATAACGTACTACAATCTGCGTATA
GATCCGTTTGAAACACAAAACAGAGTTAAATATTTATCGTCAGCTGAAAAGGAATATTTC
CACAATCAGTTGCAACAGCTTTTGACATGTCGGGGACCGTCGTGTAGAAGATTCTCGCAT
TCAAATGTTGGAGGTATTAAAGATGATGTCAGCAGACGGACTGAAGATGACCAACTCATG
TATAGAGGGGAGCCAATTGGTTACAGTGAAAGGGCATGGCGATGGAGTGGCTATGGTCGT
AGATATGCAAGAGCCAGAGAGTTGCACCGGCGTCGACATACCGCGGCCTTCTAG
Protein sequence:
MNGKRIKHGDDYNKDYYPDLIANDSIAFLRASKRRFSRKPVLLVMSFPAPHGPEDSAPQY
SHLFFNVTTHHTPTYDMAPNPDKQWILRVTEKMKPIHRQFTDLLMTKRLQTLQSVDVAVE
RVYQELKALGELDNTYLVYTSDHGYHLGQFGLVKGKSFPFEFDIRVPFLVRGPGVEPGTV
VDDIILNIDLAPTFLDMGGVQPPPHMDGRSLLPLLQPRRRRATAHWPDTFLVESSGRRET
QAHLMEERLRAQKYSKEMNARTTTIMPLQSSSESGDFEDESDDDFLELDDIMPLQSSSES
GDFEDESDDDFLELDDDEDDEDNESTEDTSNKSNQPLISNESHNPILEASLDKILGGDGA
VNNQYNYLSQSEMDVINGKAARIAAECSKAELRAPCSVGRKWKCVLVNGRWRKHKCKYED
ITIPQPKMSTKKCACFTPSGLVYTRLETDGTIARRPADLQKDNNTRSRRSTDNDVFEPNT
VDTILEENPSIGHLSFNNEPIDEIEKRNIENKVDKLIRETEAFLEAYERTKDNIDHKRSK
RRAQHWGHKHKPHKNDPLLNMNESSLECKIDKDGTVNCSQVIYNDLKAWHTNRLSLEDQI
RELKTKLEDLKEIKRHLKISKPVVEVQTVTPSYVNTHLHNKTQTPDSTKDSFRRSRFHRI
KTKHRNSTVIDKKFRQLNEYILPTVNGHTRDDIFNTQLRNETSTEAVVKQLSTIDLVEID
SNQTYVLGKLPKPQVTTIITESNFYDQNFAAEKSTQQAITTSTDDTATIYSDISGINNTS
SKTPQTEKPTSPSQETSTDILSTLQYYSSEANKVILTMTTTPTPVTRRTTASHQTYNRTY
HTKPSNRPKSSSLGPTRFDASEYEQRNPNKGNSNNHGVFSKPMDVFQRRLHPLFIENEDK
HVCYCEESRKMKPVGNSYLEATQRAREERRKLKEQRLRKKLRKAKKKAELERLCESERMN
CFRHDNDHWRTAPLWTAGPFCFCMSASNNTYNCVRTINSTHNLLYCEFVTGLITYYNLRI
DPFETQNRVKYLSSAEKEYFHNQLQQLLTCRGPSCRRFSHSNVGGIKDDVSRRTEDDQLM
YRGEPIGYSERAWRWSGYGRRYARARELHRRRHTAAF
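
Consistency check (illustrative sketch, not part of the original record): the nucleotide and protein sequences above can be cross-checked by translating the CDS. The snippet assumes the standard genetic code (NCBI translation table 1) and an unambiguous A/C/G/T sequence. The function name `translate` and the variable names are hypothetical. For brevity only the first and last lines of the nucleotide sequence are translated; pasting the full CDS should work the same way.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First and last lines of the nucleotide sequence in this record.
first_line = "ATGAATGGAAAGAGAATAAAACACGGAGATGATTATAATAAAGATTATTATCCGGATCTA"
last_line = "AGATATGCAAGAGCCAGAGAGTTGCACCGGCGTCGACATACCGCGGCCTTCTAG"

print(translate(first_line))  # MNGKRIKHGDDYNKDYYPDL
print(translate(last_line))   # RYARARELHRRRHTAAF*

# The listed CDS length (3354 nt) includes the stop codon,
# so the expected protein length is 3354 / 3 - 1 residues.
cds_length = 3354
print(cds_length // 3 - 1)    # 1117
```

The two translated fragments match the start and end of the protein sequence above, and the expected length of 1117 residues agrees with the listed protein (18 lines of 60 residues plus a final line of 37).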