New model in OGS2.0 | DPOGS210520  |
---|---|
Genomic Position | scaffold296:+ 47589-52955 |
CDS Length (nt) | 1812 |
Paired RNAseq reads   | 353 |
Single RNAseq reads   | 1890 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA012623 (0.0) |
Best Drosophila hit   | marionette (1e-167) |
Best Human hit | general transcription factor IIH subunit 4 (4e-132) |
Best NR hit (blastp)   | PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum] (0.0) |
Best NR hit (blastx)   | PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum] (0.0) |
GeneOntology terms    | GO:0016251 general RNA polymerase II transcription factor activity; GO:0005675 holo TFIIH complex; GO:0006367 transcription initiation from RNA polymerase II promoter; GO:0006355 regulation of transcription, DNA-dependent; GO:0006281 DNA repair |
InterPro families   | IPR004598 Transcription factor Tfb2 |
Orthology group | MCL10752 |

Nucleotide sequence:
ATGACGGAGGCGCGGTTTCTACGAGCGCCCGAGATCTCACATCGATCCCTCCGCGGGCAT
ATCGAAAGCGAAACACTAGATCGAACCGAGTCGTACACTGTTTTCGTGTTTCTAAGACGA
TTATTCACGGAGCACATATACACAATGTCCCGTTCCGAAGGATATACACAAAATGGTGCT
GAGCGAAACGGAATAGCACTGTACTCGACGAGTTGTCTATGCGCGACTGACGGAAGGCGT
CTACTGTATCCGCCATCTGCCGACGAGAATGAATGCGGTGAACTCGTCGCCTGGAGGTAC
TGTGTGCGTGTGTTTGTACACGGCCGTGACGTGAAAAGTAATTTGACTATTGCATTGAAT
ATAAGTAAAGTGTTCCAAACAATGTCAGAGTCATCTAAGTCAAAGTCTTCTTCAAGTACA
AATCCTTCCCCGACACTACAATGCAAGGATCTTCACGAATATTTGAAAAGTCGCTCGCCT
CAGTTCCTCGAAACCTTATACAATTATCCCACAATATGCTTAGCGGTATATCGAGAACTG
CCTGAGTTAGCCCGACATTTTGTGATCCGATTGCTGTTTGTAGAACAACCTGTCCCTCAA
GCAGTGGTTGCTTCTTGGGTCACACAGACTCATGCCAAAGAGCAACACAAAGCTTGCGAG
GCTCTGTCGGAGCTGTCTGTGTGGCAGGAGGCTCCTATCCCTGGAGGTCTGCCGGGATGG
ATGCTTTCACAGTCTTTTAAGAAGAATCTGAAAGTTGCTCTTCTTGGAGGCGGTCGTCCA
TGGAGCATGTCATCCTCTCTGGAACCGGATGGCAAGGCTCGTGATGTGTCATTCCTGGAC
GCGTACGCTCTTGAACGCTGGGAATGTGTTCTGCATTACATGGTGGGATCAACACAGACC
GAGGGGATCAGTGCTGATGCTGTCAGGATACTGCTACAGGCCGGACTCATGAACAGAGAT
GCAGAAGATGGCACAGCTGTTATAACCAGAGCTGGATTCCAGTTTCTGCTCCTCAGCACA
GCTAAACAGGTATGGCTGTTTCTCCAACACTACCTGCACACGGCGGAAAAACGCAGTCTC
AGCGCCGCGGAGTGCCTCGCCTTCCTGTACCAGCTCAGCTTCAGCACACTCGGCAAGGAT
TACAGCACGGAGGGTATGAGTAACAACATGCTGGTGTTCCTCCAGCACCTGAGGGAGTTT
GGGCTCGTCTACCAGAGGAAGCGTAAAGCGGGTCGATTCTACCCGACCCGTCTGGCGCTG
AACATCACGTGTGTGAAGGACGGCGTGGCTCCCCTGCAGACGGCCGCCAGCTCGGGCTAC
ATCATCGCGGAGACCAACTACCGGGTGTACGCCTACACCACCAGCGCCCTGCAGGTGGCG
CTGCTGGGACTCTTCACTGAACTAGTCTACAGGTTCCCTAACGTGGTGGTAGGCGTCCTG
ACCCGCGAGTCCGTGCGCGCTGCCCTCCGCGGCGGTATCTCCGCCCAGCAGATCATCACC
TACCTGGAGCAGCACTCTCACCCCCAGATGCTGAAATCCGACCAGGGCGGCATCAGGTCG
TCGTCGTCGCTGCCGCCGACCGTCCTCGACCAGATCCGTCTGTGGGAGTCGGAGAGGAAC
CGCTTCACGTACACGGAGGGCGTCGTCTACAACCAGTTCCTGTCCCAGGCCGAGTTCAAC
GTGCTGAGGGACTACGGCCGCTCGTCCGGCGCGCTAGTGTGGGCGGCGGACAGGACGCGC
ACCATGGTGGTGGCCAGGGCGGCGCACGACGACGTCAAGAGATACTGGAAGAGATACTCC
AAGGCCACTTAG

Protein sequence:
MTEARFLRAPEISHRSLRGHIESETLDRTESYTVFVFLRRLFTEHIYTMSRSEGYTQNGA
ERNGIALYSTSCLCATDGRRLLYPPSADENECGELVAWRYCVRVFVHGRDVKSNLTIALN
ISKVFQTMSESSKSKSSSSTNPSPTLQCKDLHEYLKSRSPQFLETLYNYPTICLAVYREL
PELARHFVIRLLFVEQPVPQAVVASWVTQTHAKEQHKACEALSELSVWQEAPIPGGLPGW
MLSQSFKKNLKVALLGGGRPWSMSSSLEPDGKARDVSFLDAYALERWECVLHYMVGSTQT
EGISADAVRILLQAGLMNRDAEDGTAVITRAGFQFLLLSTAKQVWLFLQHYLHTAEKRSL
SAAECLAFLYQLSFSTLGKDYSTEGMSNNMLVFLQHLREFGLVYQRKRKAGRFYPTRLAL
NITCVKDGVAPLQTAASSGYIIAETNYRVYAYTTSALQVALLGLFTELVYRFPNVVVGVL
TRESVRAALRGGISAQQIITYLEQHSHPQMLKSDQGGIRSSSSLPPTVLDQIRLWESERN
RFTYTEGVVYNQFLSQAEFNVLRDYGRSSGALVWAADRTRTMVVARAAHDDVKRYWKRYS
KAT
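
The relationship between the nucleotide and protein sequences above can be checked with a short script. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) and that the CDS is given 5' to 3' in frame, as it appears to be here. The names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons enumerated in TCAG order.
# Assumption: the record uses the standard nuclear code.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame CDS; '*' marks a stop codon, 'X' an unknown codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


# First and last lines of the nucleotide sequence listed above.
first_line = "ATGACGGAGGCGCGGTTTCTACGAGCGCCCGAGATCTCACATCGATCCCTCCGCGGGCAT"
last_line = "AAGGCCACTTAG"

print(translate(first_line))  # MTEARFLRAPEISHRSLRGH
print(translate(last_line))   # KAT*
```

Under these assumptions, the 1812-nt CDS corresponds to 604 codons, that is, 603 residues plus the terminal stop codon, which should match the length of the protein sequence listed above.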