New model in OGS2.0 | DPOGS204835  |
---|---|
Genomic Position | scaffold135:- 96684-103946 |
CDS Length | 3393 |
Paired RNAseq reads   | 1092 |
Single RNAseq reads   | 2886 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA011732 (0.0) |
Best Drosophila hit   | CG4998, isoform B (2e-122) |
Best Human hit | PREDICTED: LOW QUALITY PROTEIN: serine protease 42 (5e-28) |
Best NR hit (blastp)   | CLIP-domain serine protease subfamily A (AGAP006954-PA) [Anopheles gambiae str. PEST] (0.0) |
Best NR hit (blastx)   | PREDICTED: similar to CLIP-domain serine protease subfamily A (AGAP006954-PA) [Tribolium castaneum] (0.0) |
GeneOntology terms    | GO:0004252 serine-type endopeptidase activity; GO:0006508 proteolysis |
InterPro families    | IPR001254 Peptidase S1/S6, chymotrypsin/Hap; IPR018114 Peptidase S1/S6, chymotrypsin/Hap, active site; IPR009003 Peptidase cysteine/serine, trypsin-like; IPR001314 Peptidase S1A, chymotrypsin-type |
Orthology group | MCL14121 |

Nucleotide sequence:
ATGGACCCCATGATAATGCGTCGGAAATGGCGCGTCGCGTTCCACAAGCGCCTCAAAGAG
CCATCAGATCCCCATGGATGGGGCCCTAAAGCCGATGTTAAGCCGGGTAATGGGCGCAAA
CGTAATAATTTACCGCGGCTTGGGTCGAAGCAGGAGAAGATCATGCGCTGGTTTATATGT
TTCTGTGTGCTGCTATTATCTATAGCGAGTACATCAGCAGACTGGTCATGGGGTGGAGAC
GATTCTGATAAAAAAGATGAACCATCCAATGCTAAACCAACTGATCTATTGCAAGGCGAG
GAGATAGATGTCGCACAAGCCAAAAACTTCAATTCGAATGGAACTATTCTTGATGATATC
GTAGATGAATTAGTCAGCAATAAGCAAGGCAGAAGTTTGAGCGGATTTGACGATGTGTAC
AGTGACCCCACCATCAAGGAAGCGTTAGACTCGGGTGACGATAGCGAAGCAAGAAATTTG
ATAAAGGGCCGACTGTGTACCTTGGGATTAATACAATGCGAAGATGATGACACTCAAGAA
AAGAGATTTTTATCACCAGACGAACTCATTTATGCTCAACCCGTTGACATTAAGCCTATT
GGCAAACCTATCGCTTCCATACCTGTCCGTGGACCTCCAAGAGCTTATGGACCGCCAAAA
CCTATGTTGTACCCCCCACGCCCCCCGAAGATTCCATTAAAGAGGCCTGGATATGGAAAT
GTACGCCCTGGATTTTCGGAGAAGTATGGAGTAGCCGGTAATAACTATCAATTTTCACAA
AGCAGCGGCTCGTTTAATGGATATGAAGCTAATTACGTAACAAAACCACCAAGTTTTGCA
AACAATGAACCTTACAATTTTGAACATTCAAAACCAACCTACAACAAAATACCCTCAGGA
AGCAACAATATTAAATCTGAATCTATTGTACAGCAACACGTTCATCACCACTATGTTCAC
GATGATTCTAATAAAGAACCTAAGGTTATAATCAAATCTGTGGCTATACCAGTTGGATCT
GTAGGCCATTTAGCATCTCAAGTAAACACTCAATCATCATCCAACATCCTAACAGCATCT
GGAGGAGATTTTAACACATTTAATTCAGGAGGATTCAAACCTATGACAGGCGGTTTTTCT
CCCAGTAGCAAACCTGTCTACGAAACTGATACTATTTATGGATCTCAATACAGCCACAAT
AATTATAACAAGGGCAGTTCTAACGTTTTCAATCAAGGCTTACCAAATCAATTCGGTAGC
AATACTTTTGAAGAGCAAAAATATGGCAATTCGCTAGGTTCTTATGCATCTCAGAATGAG
TTTTATAAAAAAGAACTAAATGTAGGTTCCACCGGCAACTTATACACCCAAGGTCCAGCA
ACATTTTCACAAAATAATTTGTACCAACAAAATCAACATGAAGCCAAAGCTCAAGGTTTT
GAATGTGTATGTGTAAAATACGACCAATGTCCAAGCCAAGAAATTATTGGGCGCAGAGAC
GACTTGTATTTACCAATTGATCCGCGTAACAAAGGTTCCGAAATATTAGCATTAACTGAA
GAACAACTAGATGGAGTCAATAAAACTTCTGAGGAGATCAATGTCAGTCAAAATTCCACT
GAGGCAAAGAAAATTAGTAAACGAGACGTTGATGAGGCCAAAACTAAAGATGCTGCAAAG
GAAATTGAACCGCGCCTTCTGGGGTTAGCTGGATATGGAGGCAATGGCGGTAATAGCGAC
AAAAAAGTGCAGCCCACGTTTGGTGTTTCTTTTGGGTTGCCCCAGCCCTCCCATAGTTAT
CCCATTAATCCTTTCAACTCAAATCCTTTACACAATCCATACGGGCCCGCTCTAAATGGA
GGCGGTCTTAATTTAGGCTTGGTCTCAGTTAATCCACTTTTAGCTGTTCAAGTGACGAAA
AATGATTACGGGGAAAAGGTAGTAAAACCTTTTGTAAATTTGCATGTGACTCCAAATGAA
CACGTAGTTAATAAACTGGGTCATATATTCCACGAAAAAAAGCAATACCTTTTGAATAAA
CATGAACACTACCATCATTATAACCCCCATCCTTACCAGCCTTATCCTCATAGGCCATAT
GTCCCGCATCCTGTAGGTTATTCAGACCATTATGGTTTACATAGGCCACATTCTTACTCC
CCACATTTTCCGCATTACGATGCTGGTCACTATCGCGTTAATCCATACAATGCACCTTCT
GATAATGATGACTATTATGATGATGATGATGATAACAGTTACAACGCTGCTATAAGCTAT
AATGATCAAAATTACAATTTTGCCAAATCTGCTCAAACTAAAGAGAGCAATGGTCAAAAT
GATAATTATGCAAATAGATATTCATATTCGCGTTCCCTAACTATTCCTTCACAATCTGGG
GCAAACCAGAAAAGCCAGACTGTAAGATTCCCAACAAACAGAAGGAAAAGGGAAGCCTCT
CTAGCATCTGAAAAGATTAGTATACAAGAGCGTCAAGGTTATTTCGGTGGACCATCAATT
CCCCAATGTAATCAAAATCAAGTTTGTTGCCGCCGGCCACTTAGACCGCAGGCATCAAAT
CGCGGTCAGTGTGGTATCAGACATTCCCAGGGAATCAATGGTAGAATAAAGACTCCATCA
TACATCGACGGGGATAGTGAATTCGGAGAGTATCCTTGGCAGGCTGCTATTTTGAAGAAG
GACCCTAAAGAATCAGTTTACGTTTGTGGAGGCACACTTATTGATGGACTTCATATTATG
ACGGCGGCTCATTGCATCAAATCATACAAAGGATTCGAGCTGAGAGTTCGTCTAGGTGAA
TGGGACGTTAACCACGATGTTGAATTTTACCCATACATTGAACGAGATGTTATATCTGTT
CATGTACACCCACAATACTACGCCGGCACATTAGACAACGACCTTGCTATTTTAAAATTA
GAGCATCCAGTTGATTGGACCAAATATCCTCATATAAGTCCTGCATGTCTCCCTGATAAA
TACACCGATTACGCTGGACAAAGATGTTGGACAACTGGTTGGGGCAAGGATGCGTTTGGA
TCTAACGGAAAGTACCAAAATATTCTTAAGGAAGTAGATGTACCAATTCTACCCCATGGT
CAATGCCAACAACAATTAAGACAAACTCGTTTGGGCTACAACTATGAGTTAAATCCTGGT
TTTGTCTGTGCCGGTGGCGAAGATGGCAAAGATGCATGCAAAGGGGACGGTGGTGGCCCA
TTGGTTTGCGAGCGCAGCGGAACCTGGCAACTTGTAGGTGTTGTGTCTTGGGGAATCGGA
TGCGGTCAGGCTGGTGTACCAGGAGTTTACGTAAAAGTAGCTCATTATTTGGACTGGATC
TCTCAAGTGACTGGGAAATTTTCCCAATTCTAA

Protein sequence:
MDPMIMRRKWRVAFHKRLKEPSDPHGWGPKADVKPGNGRKRNNLPRLGSKQEKIMRWFIC
FCVLLLSIASTSADWSWGGDDSDKKDEPSNAKPTDLLQGEEIDVAQAKNFNSNGTILDDI
VDELVSNKQGRSLSGFDDVYSDPTIKEALDSGDDSEARNLIKGRLCTLGLIQCEDDDTQE
KRFLSPDELIYAQPVDIKPIGKPIASIPVRGPPRAYGPPKPMLYPPRPPKIPLKRPGYGN
VRPGFSEKYGVAGNNYQFSQSSGSFNGYEANYVTKPPSFANNEPYNFEHSKPTYNKIPSG
SNNIKSESIVQQHVHHHYVHDDSNKEPKVIIKSVAIPVGSVGHLASQVNTQSSSNILTAS
GGDFNTFNSGGFKPMTGGFSPSSKPVYETDTIYGSQYSHNNYNKGSSNVFNQGLPNQFGS
NTFEEQKYGNSLGSYASQNEFYKKELNVGSTGNLYTQGPATFSQNNLYQQNQHEAKAQGF
ECVCVKYDQCPSQEIIGRRDDLYLPIDPRNKGSEILALTEEQLDGVNKTSEEINVSQNST
EAKKISKRDVDEAKTKDAAKEIEPRLLGLAGYGGNGGNSDKKVQPTFGVSFGLPQPSHSY
PINPFNSNPLHNPYGPALNGGGLNLGLVSVNPLLAVQVTKNDYGEKVVKPFVNLHVTPNE
HVVNKLGHIFHEKKQYLLNKHEHYHHYNPHPYQPYPHRPYVPHPVGYSDHYGLHRPHSYS
PHFPHYDAGHYRVNPYNAPSDNDDYYDDDDDNSYNAAISYNDQNYNFAKSAQTKESNGQN
DNYANRYSYSRSLTIPSQSGANQKSQTVRFPTNRRKREASLASEKISIQERQGYFGGPSI
PQCNQNQVCCRRPLRPQASNRGQCGIRHSQGINGRIKTPSYIDGDSEFGEYPWQAAILKK
DPKESVYVCGGTLIDGLHIMTAAHCIKSYKGFELRVRLGEWDVNHDVEFYPYIERDVISV
HVHPQYYAGTLDNDLAILKLEHPVDWTKYPHISPACLPDKYTDYAGQRCWTTGWGKDAFG
SNGKYQNILKEVDVPILPHGQCQQQLRQTRLGYNYELNPGFVCAGGEDGKDACKGDGGGP
LVCERSGTWQLVGVVSWGIGCGQAGVPGVYVKVAHYLDWISQVTGKFSQF