DPGLEAN14392 in OGS1.0

New model in OGS2.0  DPOGS210520
Genomic Position  scaffold296:+ 47589-52955
CDS Length  1812
Paired RNAseq reads  353
Single RNAseq reads  1890
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA012623 (0.0)
Best Drosophila hit  marionette (1e-167)
Best Human hit  general transcription factor IIH subunit 4 (4e-132)
Best NR hit (blastp)  PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum] (0.0)
Best NR hit (blastx)  PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum] (0.0)
Gene Ontology terms
GO:0016251 general RNA polymerase II transcription factor activity
GO:0005675 holo TFIIH complex
GO:0006367 transcription initiation from RNA polymerase II promoter
GO:0006355 regulation of transcription, DNA-dependent
GO:0006281 DNA repair
InterPro families  IPR004598 Transcription factor Tfb2
Orthology group  MCL10752

Nucleotide sequence:

ATGACGGAGGCGCGGTTTCTACGAGCGCCCGAGATCTCACATCGATCCCTCCGCGGGCAT
ATCGAAAGCGAAACACTAGATCGAACCGAGTCGTACACTGTTTTCGTGTTTCTAAGACGA
TTATTCACGGAGCACATATACACAATGTCCCGTTCCGAAGGATATACACAAAATGGTGCT
GAGCGAAACGGAATAGCACTGTACTCGACGAGTTGTCTATGCGCGACTGACGGAAGGCGT
CTACTGTATCCGCCATCTGCCGACGAGAATGAATGCGGTGAACTCGTCGCCTGGAGGTAC
TGTGTGCGTGTGTTTGTACACGGCCGTGACGTGAAAAGTAATTTGACTATTGCATTGAAT
ATAAGTAAAGTGTTCCAAACAATGTCAGAGTCATCTAAGTCAAAGTCTTCTTCAAGTACA
AATCCTTCCCCGACACTACAATGCAAGGATCTTCACGAATATTTGAAAAGTCGCTCGCCT
CAGTTCCTCGAAACCTTATACAATTATCCCACAATATGCTTAGCGGTATATCGAGAACTG
CCTGAGTTAGCCCGACATTTTGTGATCCGATTGCTGTTTGTAGAACAACCTGTCCCTCAA
GCAGTGGTTGCTTCTTGGGTCACACAGACTCATGCCAAAGAGCAACACAAAGCTTGCGAG
GCTCTGTCGGAGCTGTCTGTGTGGCAGGAGGCTCCTATCCCTGGAGGTCTGCCGGGATGG
ATGCTTTCACAGTCTTTTAAGAAGAATCTGAAAGTTGCTCTTCTTGGAGGCGGTCGTCCA
TGGAGCATGTCATCCTCTCTGGAACCGGATGGCAAGGCTCGTGATGTGTCATTCCTGGAC
GCGTACGCTCTTGAACGCTGGGAATGTGTTCTGCATTACATGGTGGGATCAACACAGACC
GAGGGGATCAGTGCTGATGCTGTCAGGATACTGCTACAGGCCGGACTCATGAACAGAGAT
GCAGAAGATGGCACAGCTGTTATAACCAGAGCTGGATTCCAGTTTCTGCTCCTCAGCACA
GCTAAACAGGTATGGCTGTTTCTCCAACACTACCTGCACACGGCGGAAAAACGCAGTCTC
AGCGCCGCGGAGTGCCTCGCCTTCCTGTACCAGCTCAGCTTCAGCACACTCGGCAAGGAT
TACAGCACGGAGGGTATGAGTAACAACATGCTGGTGTTCCTCCAGCACCTGAGGGAGTTT
GGGCTCGTCTACCAGAGGAAGCGTAAAGCGGGTCGATTCTACCCGACCCGTCTGGCGCTG
AACATCACGTGTGTGAAGGACGGCGTGGCTCCCCTGCAGACGGCCGCCAGCTCGGGCTAC
ATCATCGCGGAGACCAACTACCGGGTGTACGCCTACACCACCAGCGCCCTGCAGGTGGCG
CTGCTGGGACTCTTCACTGAACTAGTCTACAGGTTCCCTAACGTGGTGGTAGGCGTCCTG
ACCCGCGAGTCCGTGCGCGCTGCCCTCCGCGGCGGTATCTCCGCCCAGCAGATCATCACC
TACCTGGAGCAGCACTCTCACCCCCAGATGCTGAAATCCGACCAGGGCGGCATCAGGTCG
TCGTCGTCGCTGCCGCCGACCGTCCTCGACCAGATCCGTCTGTGGGAGTCGGAGAGGAAC
CGCTTCACGTACACGGAGGGCGTCGTCTACAACCAGTTCCTGTCCCAGGCCGAGTTCAAC
GTGCTGAGGGACTACGGCCGCTCGTCCGGCGCGCTAGTGTGGGCGGCGGACAGGACGCGC
ACCATGGTGGTGGCCAGGGCGGCGCACGACGACGTCAAGAGATACTGGAAGAGATACTCC
AAGGCCACTTAG

Protein sequence:

MTEARFLRAPEISHRSLRGHIESETLDRTESYTVFVFLRRLFTEHIYTMSRSEGYTQNGA
ERNGIALYSTSCLCATDGRRLLYPPSADENECGELVAWRYCVRVFVHGRDVKSNLTIALN
ISKVFQTMSESSKSKSSSSTNPSPTLQCKDLHEYLKSRSPQFLETLYNYPTICLAVYREL
PELARHFVIRLLFVEQPVPQAVVASWVTQTHAKEQHKACEALSELSVWQEAPIPGGLPGW
MLSQSFKKNLKVALLGGGRPWSMSSSLEPDGKARDVSFLDAYALERWECVLHYMVGSTQT
EGISADAVRILLQAGLMNRDAEDGTAVITRAGFQFLLLSTAKQVWLFLQHYLHTAEKRSL
SAAECLAFLYQLSFSTLGKDYSTEGMSNNMLVFLQHLREFGLVYQRKRKAGRFYPTRLAL
NITCVKDGVAPLQTAASSGYIIAETNYRVYAYTTSALQVALLGLFTELVYRFPNVVVGVL
TRESVRAALRGGISAQQIITYLEQHSHPQMLKSDQGGIRSSSSLPPTVLDQIRLWESERN
RFTYTEGVVYNQFLSQAEFNVLRDYGRSSGALVWAADRTRTMVVARAAHDDVKRYWKRYS
KAT
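
Consistency check (illustrative):

The CDS Length field and the two sequences above are linked by ordinary codon arithmetic: a complete coding sequence should be a multiple of three bases long, start with ATG, end in a stop codon, and encode one residue per codon apart from the stop. The sketch below is a minimal way to check that. The function names are hypothetical and not part of the database, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
# Hypothetical helpers for sanity-checking a CDS record; not part of any database API.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard nuclear genetic code


def expected_protein_length(cds_length: int) -> int:
    """Residue count for a CDS of the given length that ends in a stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1  # the terminal stop codon encodes no residue


def looks_like_complete_cds(cds: str) -> bool:
    """True if the sequence is in frame, starts with ATG, and ends in a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


print(expected_protein_length(1812))  # 603
```

For this record, a CDS Length of 1812 implies 603 residues. Passing the full nucleotide sequence (line breaks included) to `looks_like_complete_cds` would check the frame, the ATG start, and the terminal TAG.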