DPGLEAN01002 in OGS1.0

New model in OGS2.0  DPOGS204835
Genomic Position  scaffold135:- 96684-103946
CDS Length  3393
Paired RNAseq reads  1092
Single RNAseq reads  2886
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA011732 (0.0)
Best Drosophila hit  CG4998, isoform B (2e-122)
Best Human hit  PREDICTED: LOW QUALITY PROTEIN: serine protease 42 (5e-28)
Best NR hit (blastp)  CLIP-domain serine protease subfamily A (AGAP006954-PA) [Anopheles gambiae str. PEST] (0.0)
Best NR hit (blastx)  PREDICTED: similar to CLIP-domain serine protease subfamily A (AGAP006954-PA) [Tribolium castaneum] (0.0)
GeneOntology terms
  GO:0004252 serine-type endopeptidase activity
  GO:0006508 proteolysis
InterPro families
  IPR001254 Peptidase S1/S6, chymotrypsin/Hap
  IPR018114 Peptidase S1/S6, chymotrypsin/Hap, active site
  IPR009003 Peptidase cysteine/serine, trypsin-like
  IPR001314 Peptidase S1A, chymotrypsin-type
Orthology group  MCL14121
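
The genomic position in the record above can be unpacked into scaffold, strand, and coordinates. The following is a minimal sketch, not part of the original record: the function name `parse_position` is hypothetical, the character after the colon is assumed to be the strand, and the coordinates are assumed to be 1-based and inclusive.

```python
import re

def parse_position(value):
    """Split a 'scaffold:strand start-end' string into its parts.

    The layout is inferred from this record only (assumption).
    """
    m = re.fullmatch(r"\s*(\S+?):([+-])\s*(\d+)-(\d+)\s*", value)
    if m is None:
        raise ValueError(f"unrecognised position string: {value!r}")
    return m.group(1), m.group(2), int(m.group(3)), int(m.group(4))

scaffold, strand, start, end = parse_position("scaffold135:- 96684-103946")

# Assumes 1-based, inclusive coordinates.
span = end - start + 1
print(scaffold, strand, span)  # scaffold135 - 7263
```

Under these assumptions the model spans 7263 bp on the minus strand, which is longer than the 3393 nt CDS, as would be expected if the gene model contains introns.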

Nucleotide sequence:

ATGGACCCCATGATAATGCGTCGGAAATGGCGCGTCGCGTTCCACAAGCGCCTCAAAGAG
CCATCAGATCCCCATGGATGGGGCCCTAAAGCCGATGTTAAGCCGGGTAATGGGCGCAAA
CGTAATAATTTACCGCGGCTTGGGTCGAAGCAGGAGAAGATCATGCGCTGGTTTATATGT
TTCTGTGTGCTGCTATTATCTATAGCGAGTACATCAGCAGACTGGTCATGGGGTGGAGAC
GATTCTGATAAAAAAGATGAACCATCCAATGCTAAACCAACTGATCTATTGCAAGGCGAG
GAGATAGATGTCGCACAAGCCAAAAACTTCAATTCGAATGGAACTATTCTTGATGATATC
GTAGATGAATTAGTCAGCAATAAGCAAGGCAGAAGTTTGAGCGGATTTGACGATGTGTAC
AGTGACCCCACCATCAAGGAAGCGTTAGACTCGGGTGACGATAGCGAAGCAAGAAATTTG
ATAAAGGGCCGACTGTGTACCTTGGGATTAATACAATGCGAAGATGATGACACTCAAGAA
AAGAGATTTTTATCACCAGACGAACTCATTTATGCTCAACCCGTTGACATTAAGCCTATT
GGCAAACCTATCGCTTCCATACCTGTCCGTGGACCTCCAAGAGCTTATGGACCGCCAAAA
CCTATGTTGTACCCCCCACGCCCCCCGAAGATTCCATTAAAGAGGCCTGGATATGGAAAT
GTACGCCCTGGATTTTCGGAGAAGTATGGAGTAGCCGGTAATAACTATCAATTTTCACAA
AGCAGCGGCTCGTTTAATGGATATGAAGCTAATTACGTAACAAAACCACCAAGTTTTGCA
AACAATGAACCTTACAATTTTGAACATTCAAAACCAACCTACAACAAAATACCCTCAGGA
AGCAACAATATTAAATCTGAATCTATTGTACAGCAACACGTTCATCACCACTATGTTCAC
GATGATTCTAATAAAGAACCTAAGGTTATAATCAAATCTGTGGCTATACCAGTTGGATCT
GTAGGCCATTTAGCATCTCAAGTAAACACTCAATCATCATCCAACATCCTAACAGCATCT
GGAGGAGATTTTAACACATTTAATTCAGGAGGATTCAAACCTATGACAGGCGGTTTTTCT
CCCAGTAGCAAACCTGTCTACGAAACTGATACTATTTATGGATCTCAATACAGCCACAAT
AATTATAACAAGGGCAGTTCTAACGTTTTCAATCAAGGCTTACCAAATCAATTCGGTAGC
AATACTTTTGAAGAGCAAAAATATGGCAATTCGCTAGGTTCTTATGCATCTCAGAATGAG
TTTTATAAAAAAGAACTAAATGTAGGTTCCACCGGCAACTTATACACCCAAGGTCCAGCA
ACATTTTCACAAAATAATTTGTACCAACAAAATCAACATGAAGCCAAAGCTCAAGGTTTT
GAATGTGTATGTGTAAAATACGACCAATGTCCAAGCCAAGAAATTATTGGGCGCAGAGAC
GACTTGTATTTACCAATTGATCCGCGTAACAAAGGTTCCGAAATATTAGCATTAACTGAA
GAACAACTAGATGGAGTCAATAAAACTTCTGAGGAGATCAATGTCAGTCAAAATTCCACT
GAGGCAAAGAAAATTAGTAAACGAGACGTTGATGAGGCCAAAACTAAAGATGCTGCAAAG
GAAATTGAACCGCGCCTTCTGGGGTTAGCTGGATATGGAGGCAATGGCGGTAATAGCGAC
AAAAAAGTGCAGCCCACGTTTGGTGTTTCTTTTGGGTTGCCCCAGCCCTCCCATAGTTAT
CCCATTAATCCTTTCAACTCAAATCCTTTACACAATCCATACGGGCCCGCTCTAAATGGA
GGCGGTCTTAATTTAGGCTTGGTCTCAGTTAATCCACTTTTAGCTGTTCAAGTGACGAAA
AATGATTACGGGGAAAAGGTAGTAAAACCTTTTGTAAATTTGCATGTGACTCCAAATGAA
CACGTAGTTAATAAACTGGGTCATATATTCCACGAAAAAAAGCAATACCTTTTGAATAAA
CATGAACACTACCATCATTATAACCCCCATCCTTACCAGCCTTATCCTCATAGGCCATAT
GTCCCGCATCCTGTAGGTTATTCAGACCATTATGGTTTACATAGGCCACATTCTTACTCC
CCACATTTTCCGCATTACGATGCTGGTCACTATCGCGTTAATCCATACAATGCACCTTCT
GATAATGATGACTATTATGATGATGATGATGATAACAGTTACAACGCTGCTATAAGCTAT
AATGATCAAAATTACAATTTTGCCAAATCTGCTCAAACTAAAGAGAGCAATGGTCAAAAT
GATAATTATGCAAATAGATATTCATATTCGCGTTCCCTAACTATTCCTTCACAATCTGGG
GCAAACCAGAAAAGCCAGACTGTAAGATTCCCAACAAACAGAAGGAAAAGGGAAGCCTCT
CTAGCATCTGAAAAGATTAGTATACAAGAGCGTCAAGGTTATTTCGGTGGACCATCAATT
CCCCAATGTAATCAAAATCAAGTTTGTTGCCGCCGGCCACTTAGACCGCAGGCATCAAAT
CGCGGTCAGTGTGGTATCAGACATTCCCAGGGAATCAATGGTAGAATAAAGACTCCATCA
TACATCGACGGGGATAGTGAATTCGGAGAGTATCCTTGGCAGGCTGCTATTTTGAAGAAG
GACCCTAAAGAATCAGTTTACGTTTGTGGAGGCACACTTATTGATGGACTTCATATTATG
ACGGCGGCTCATTGCATCAAATCATACAAAGGATTCGAGCTGAGAGTTCGTCTAGGTGAA
TGGGACGTTAACCACGATGTTGAATTTTACCCATACATTGAACGAGATGTTATATCTGTT
CATGTACACCCACAATACTACGCCGGCACATTAGACAACGACCTTGCTATTTTAAAATTA
GAGCATCCAGTTGATTGGACCAAATATCCTCATATAAGTCCTGCATGTCTCCCTGATAAA
TACACCGATTACGCTGGACAAAGATGTTGGACAACTGGTTGGGGCAAGGATGCGTTTGGA
TCTAACGGAAAGTACCAAAATATTCTTAAGGAAGTAGATGTACCAATTCTACCCCATGGT
CAATGCCAACAACAATTAAGACAAACTCGTTTGGGCTACAACTATGAGTTAAATCCTGGT
TTTGTCTGTGCCGGTGGCGAAGATGGCAAAGATGCATGCAAAGGGGACGGTGGTGGCCCA
TTGGTTTGCGAGCGCAGCGGAACCTGGCAACTTGTAGGTGTTGTGTCTTGGGGAATCGGA
TGCGGTCAGGCTGGTGTACCAGGAGTTTACGTAAAAGTAGCTCATTATTTGGACTGGATC
TCTCAAGTGACTGGGAAATTTTCCCAATTCTAA

Protein sequence:

MDPMIMRRKWRVAFHKRLKEPSDPHGWGPKADVKPGNGRKRNNLPRLGSKQEKIMRWFIC
FCVLLLSIASTSADWSWGGDDSDKKDEPSNAKPTDLLQGEEIDVAQAKNFNSNGTILDDI
VDELVSNKQGRSLSGFDDVYSDPTIKEALDSGDDSEARNLIKGRLCTLGLIQCEDDDTQE
KRFLSPDELIYAQPVDIKPIGKPIASIPVRGPPRAYGPPKPMLYPPRPPKIPLKRPGYGN
VRPGFSEKYGVAGNNYQFSQSSGSFNGYEANYVTKPPSFANNEPYNFEHSKPTYNKIPSG
SNNIKSESIVQQHVHHHYVHDDSNKEPKVIIKSVAIPVGSVGHLASQVNTQSSSNILTAS
GGDFNTFNSGGFKPMTGGFSPSSKPVYETDTIYGSQYSHNNYNKGSSNVFNQGLPNQFGS
NTFEEQKYGNSLGSYASQNEFYKKELNVGSTGNLYTQGPATFSQNNLYQQNQHEAKAQGF
ECVCVKYDQCPSQEIIGRRDDLYLPIDPRNKGSEILALTEEQLDGVNKTSEEINVSQNST
EAKKISKRDVDEAKTKDAAKEIEPRLLGLAGYGGNGGNSDKKVQPTFGVSFGLPQPSHSY
PINPFNSNPLHNPYGPALNGGGLNLGLVSVNPLLAVQVTKNDYGEKVVKPFVNLHVTPNE
HVVNKLGHIFHEKKQYLLNKHEHYHHYNPHPYQPYPHRPYVPHPVGYSDHYGLHRPHSYS
PHFPHYDAGHYRVNPYNAPSDNDDYYDDDDDNSYNAAISYNDQNYNFAKSAQTKESNGQN
DNYANRYSYSRSLTIPSQSGANQKSQTVRFPTNRRKREASLASEKISIQERQGYFGGPSI
PQCNQNQVCCRRPLRPQASNRGQCGIRHSQGINGRIKTPSYIDGDSEFGEYPWQAAILKK
DPKESVYVCGGTLIDGLHIMTAAHCIKSYKGFELRVRLGEWDVNHDVEFYPYIERDVISV
HVHPQYYAGTLDNDLAILKLEHPVDWTKYPHISPACLPDKYTDYAGQRCWTTGWGKDAFG
SNGKYQNILKEVDVPILPHGQCQQQLRQTRLGYNYELNPGFVCAGGEDGKDACKGDGGGP
LVCERSGTWQLVGVVSWGIGCGQAGVPGVYVKVAHYLDWISQVTGKFSQF
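
The CDS length and the protein sequence in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only: `expected_protein_length` is a hypothetical helper, and it assumes the reported CDS length counts the terminal stop codon (the nucleotide sequence above ends in `TAA`).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

def expected_protein_length(cds_length, has_stop=True):
    """Return the number of residues encoded by a CDS of the given length."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length // 3
    # The stop codon does not encode a residue.
    return codons - 1 if has_stop else codons

cds_length = 3393    # "CDS Length" from the record
last_codon = "TAA"   # final three bases of the nucleotide sequence
protein_length = expected_protein_length(
    cds_length, has_stop=last_codon in STOP_CODONS
)
print(protein_length)  # 1130
```

A 3393 nt CDS corresponds to 1131 codons, so 1130 residues once the stop codon is excluded, which agrees with the length of the protein sequence listed above.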