DPGLEAN18066 in OGS1.0

New model in OGS2.0  DPOGS206449
Genomic Position  scaffold901:- 1411-5457
CDS Length  1215
Paired RNAseq reads  344
Single RNAseq reads  1134
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA005377 (1e-143)
Best Drosophila hit  Ssl1 (3e-116)
Best Human hit  general transcription factor IIH subunit 2 (9e-97)
Best NR hit (blastp)  TFIIH basal transcription factor complex p44 subunit [Culex quinquefasciatus] (4e-139)
Best NR hit (blastx)  TFIIH basal transcription factor complex p44 subunit [Culex quinquefasciatus] (4e-134)
Gene Ontology terms
GO:0005675 holo TFIIH complex
GO:0016251 general RNA polymerase II transcription factor activity
GO:0006367 transcription initiation from RNA polymerase II promoter
GO:0006281 DNA repair
GO:0045449 regulation of transcription
GO:0008270 zinc ion binding
InterPro families
IPR007198 Ssl1-like
IPR004595 TFIIH C1-like, C-terminal
IPR002035 von Willebrand factor, type A
IPR012170 TFIIH basal transcription factor complex, subunit SSL1
Orthology group  MCL10910

Nucleotide sequence:

ATGGCAGATGATGAGCAGGATCCTAAAGAATATAGATGGGAGACGGGTTATGAGAAAACT
TGGGAAGCTATCAAGGAAGATGAAGATGGTTTGGTGGAAGGCCTTGTCGCCGAATTCGCA
CAAAAAGCAGCTCGAAAAGCTATTGCTCCCAAACGTGGGCCGGTTCGTCTTGGTATGATG
AGACATTTGCTGGTGGCAATAGACTGTTCCGAAGCCATGACATCTCAGGACTTAAAGCCT
ACGAGATTTCTTTGCACCTTAAAGCTTCTTGAAAAATTTGTGGAAGAATTTTTTGATCAA
AATCCTTTAAGTCAACTTGGTATAGTTACAATGAAAAATAAAAGGGCTGAAAAAATTACA
GAATTATCAGGAAATGTGAGAAAGCATATCAAAGCTGTTCAAGGTCTGTCAAATCTTGCA
TTAACGGGAGAGCCCTCTCTGCAGAACACTCTGGAGTTAGCAGGAAGAACATTGAGGCCG
TTACCGGGACATGCATCAAGGGAGCTCCTGGTTTTATTTGGTTCACTTACAACATGTGAC
CCAGGAGATATAACTACCACCATACAGACTTTAAAAACGGATGGCATTAGATGTTCCATA
ATTGGTCTTGCAGCGGAGGTCAGGATATGTAAGAAATTGTGCCAAGATACTGGTGGTGAA
TACGGGGTGGTATTAGATGATGTGCATTATCGCTCGTTGCTGTTAGAACAGACCTCCCCT
CCAGCCAGAGCGCGCGCTCTTGACGCAGGTCTTGTTAAAATGGGCTTTCCTCATACACCT
CAACCCTCATCTGACCAGCCCGAGTCAGATCCACCGATAACATTATGCATGTGTCACCTT
GAGGAGGGCGAGGGTGTGGTTGGCGAGGGTCATCTTTGTCCTCAATGTCGCAGCAAGTAC
TGTTCACTGCCAGCACAGTGTCGAACATGTGGACTGACACTCGCGTCAGCACCACATCTA
GCCAGGTCATACCACCATCTATTCCCTGTGGAGCCTTTCGAGGAGTTACCAAATGAGGGT
CAAGCTCAATTCTGTTTTGGCTGCCTCAGATCATTCACGGAAAATGATAAACAGATATAT
CGTTGTCGTCGTTGTACCGAATTCTATTGCTGGGAGTGTGAAGGTGTGGTGTCAAGTACA
CTTCACGTTTGTGGTTCCTGTGCCTCCAGACCGCAGCTCTACCAGAGACTACCAACGGCT
GAAGGGAGAGACTGA

Protein sequence:

MADDEQDPKEYRWETGYEKTWEAIKEDEDGLVEGLVAEFAQKAARKAIAPKRGPVRLGMM
RHLLVAIDCSEAMTSQDLKPTRFLCTLKLLEKFVEEFFDQNPLSQLGIVTMKNKRAEKIT
ELSGNVRKHIKAVQGLSNLALTGEPSLQNTLELAGRTLRPLPGHASRELLVLFGSLTTCD
PGDITTTIQTLKTDGIRCSIIGLAAEVRICKKLCQDTGGEYGVVLDDVHYRSLLLEQTSP
PARARALDAGLVKMGFPHTPQPSSDQPESDPPITLCMCHLEEGEGVVGEGHLCPQCRSKY
CSLPAQCRTCGLTLASAPHLARSYHHLFPVEPFEELPNEGQAQFCFGCLRSFTENDKQIY
RCRRCTEFYCWECEGVVSSTLHVCGSCASRPQLYQRLPTAEGRD
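
Consistency check (illustrative):

The nucleotide and protein sequences above can be cross-checked by translating the CDS. The minimal Python sketch below assumes the standard genetic code, since the record does not state which translation table was used. The names `CODON_TABLE` and `translate` are hypothetical helpers, not part of the database. It is applied only to the first 18 and last 18 nucleotides of the CDS; a CDS length of 1215 nt corresponds to 404 codons plus one stop codon, which matches the 404-residue protein listed above.

```python
# Standard genetic code (assumed), built from the compact TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 18 nt of the CDS listed above.
cds_start = "ATGGCAGATGATGAGCAG"
# Last 18 nt of the CDS listed above, including the TGA stop codon.
cds_end = "GCTGAAGGGAGAGACTGA"

print(translate(cds_start))  # MADDEQ
print(translate(cds_end))    # AEGRD

# Length bookkeeping: 1215 nt = 405 codons = 404 residues + 1 stop codon.
print(1215 // 3 - 1)         # 404
```

Running the same function over the full 1215-nt sequence should reproduce the full protein sequence if the standard code assumption holds.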