New model in OGS2.0 | DPOGS200103  |
---|---|
Genomic Position | scaffold1123:13867-30758 (- strand) |
CDS Length | 3306 |
Paired RNAseq reads   | 7968 |
Single RNAseq reads   | 18427 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA004554 (3e-130) |
Best Drosophila hit   | Thd1 (2e-97) |
Best Human hit | G/T mismatch-specific thymine DNA glycosylase (4e-58) |
Best NR hit (blastp)   | PREDICTED: similar to Thd1 [Tribolium castaneum] (2e-116) |
Best NR hit (blastx)   | PREDICTED: similar to Thd1 [Tribolium castaneum] (1e-103) |
GeneOntology terms    | GO:0008263 pyrimidine-specific mismatch base pair DNA N-glycosylase activity; GO:0006289 nucleotide-excision repair; GO:0003690 double-stranded DNA binding; GO:0006298 mismatch repair; GO:0006974 response to DNA damage stimulus |
InterPro families    | IPR020478 AT hook-like; IPR015637 DNA glycosylase, G/T mismatch; IPR005122 Uracil-DNA glycosylase-like; IPR017956 AT hook, DNA-binding motif |
Orthology group | MCL11280 |
Nucleotide sequence:
ATGACGCAGCATCACCAACACCCGCATCAACAACTACATCATAACCAATACGCTAAGAGA
GACGATGAGATATATAACGGTGTGAAACGGGAGTGCGAGGATCCGTACTCGTTCGTCGAG
GAGGAGGCGATGTGCGCGATGCTCGGGCAACAGCACCACCTGCAGCACCTGCAGCATCAC
GACCAGCACCACCACCAGCACATGCAGATCCACCCGCAGCAGATGATGCTCAACCAGCCC
AAGAAAAGGGGCCGCAAGAAGAAAATAAAAGATGAAAACGGGTTGGAACTTAAAGTGGAC
GGAGTTTTAGACAGCGTGTCGGGTATCGTGCGTCCGGTGAAGGAGCGGAAGAAACATGAC
CGGTTCAACGGGATGAGTGAGGAGGAGGTGTCGCGACGCACACTGCCCGACCACCTCGCG
GAGAACCTCGACATCATCATTATCGGTATCAACCCGGGTTTGTTCGCCGCCTACAAGGGT
CATCACTACGCTGGTCCCGGGAACCACTTCTGGAAATGTCTCTACCTATCCGGACTCACG
CGGGAACAGATGAGCGCTGACGAGGATTACAAGCTTCTAAACTTTGGCATCGGCTTCACG
AACATGGTGTCCCGTCCTACCAAAGGCTCAGCGGATCTCACGAGGAGGGAGATCAAGGAA
GGATCCGCCATTTTGTTGGAGAAGCTTCAGACCTTCCGACCCAAGGTGGCCGTGTTCAAC
GGGAAATTGATCTACGAAGTGTTCTCCGGGAAGAAGGACTTCTGTTTCGGGAAACAACCC
GACTGTATCGCCGGGACTAACACTTACATGTGGGTGATGCCGTCGTCGTCGGCTCGTTGC
GCCCAGTTGCCGCGGGCCGCCGACAAGGTCCCGTTCTACGCGGCCTTGAAGAAGTTCAGG
GACTACCTGAACGGTCTCCTGCCGCGGCTGGACGACGCCGAGCTGGTCTTCCCCGACACC
ACCTCCAGGCGGCCGCACGAGGAGATGGAGATACGCAGACTGACGATGGAGCCCGAGCCG
GGCGACACCATTATACTAGAGGACGGCACGGAAGTCCCGCTCAAGAAGAAACGCGGCCGA
CCCAAGAAGGTGAAGTTGGAGAACGGGGAGACGGTGCCCGCGGTCCCCCGGGCGCCGCGG
CAGCCCCGGCCTCCGCCCTCCATGGAGACCGGGGACCAGCCCGCTAAGAAGAAGAGAGGC
CGCCCCAAGAAAATAAGACCCGAGGAGCAGCAGTTTCTCCTGCAGCAACAGCAACAACAA
CAGCAGCAGCAGCAACAGCAACAACAACAACAGCAACAGCAAAACAATTCGATGCTGCAG
CCGCAGCTATCGTCCATGACACAGCTGCCGCACGAGCAGTTCCTCCACAACTCTAGCGGG
GACTTCCAGCAGATGTCGTCGCCGTTGGGAGTGGGCGGTGTGGGCGTGGGCGTGGGCAAC
GTGGGCGTCGGCAGTGTGGGGGGAGTGAGCAACGTCAGCAGTGGTGTCATGTACGGTGTG
CAGCATCAGCACCAGCAGCAGATGTCCGACTCGTCGCCGTACTACCAACCTAACAATAGC
GGCGGTATGGATTCTCCGTTAGACGTGGGCGGAGGTCTGGGCATGTCCCGTGGTTACGGT
TCACCTGGCGGCGTGGGCGTGGGCGGCGTGGGCTTCGCGGCGTCACCACGGCACGCGCAC
TCGTACGCGTCACCGCGCTCACAGCCATACTCACCTGGACCACAGAGACTCGCTGCTACG
CCGCAACCACAGCAACAGTTTCCATCGAGCCCGGCGGCATTTTCGGCGCCATCGCCGCCA
CGTGCTGGGTACGGTTCACCCGGAGTGGGCGGTGTGGGCGTCGTGGGCGGTGTGGGGGGT
GTGGGCGGATCGCGCGGGTTCGCGGCTCGTTCCCCACTGTATGCGAGCTCCCCGGCCGCT
TACCGCCAGCAGCCGAGCCCAGCCGCCCAGCCCCAGCCAAGGTTTACACACGATGGGATG
CCTTTCGCTAGAGACACACAGAGCGGTTCAGTATCAACTTCCGGCGGGGGTTTCTCGTGT
TCCCCCGGCGTGGCGGGCGTGGTGGGGGTGGGGGTGGTGGGGGGTAGTACCCCGTTCCCG
GCGGCGTCCCCCGCCGCTCACTCGTACACCCCGTCTCCAGCCCACACGCCGTACTCGCAC
CACTCGTCCCCAGCGCCGGCCCCAGCCCCTCACACGCCGTATGATTCACATCATTTTGCC
AATCAGGGAAGTGGATCGAGCGGCTCGGGCGGCGGCTCGGGCTCGGGCTACGGTGCAGAG
TTGTCTAGCGACATCGGTGCGGCGATATCGTCCCCGGCGCCCGTGTCGCCAGCCTGCGCC
ACCCTGGACTTCGAGCCGCCCCGTGATGACTCGCCAATGGGCAGCACAGACATGCATCCG
GGCAGCAACAGCAACTCCTCGCTGTCCGACTACAATAAGCAGAGTAATCCGGGCGGAGGC
GAGATGTCGCCGGCCGGTGTGGGCGCGGGCTCGTTCGGGGGCGCTCTGTACGACGACACG
AGACTAGCGTACAGCGACAAACCCGACTATCATTACCAGGAACAAGGCAATGGTGTGGGA
GACAGCCCTCGGTTAGATCAATTACATCAACATCCCTCCATGTATCCTAGCAATTTTAAC
AGGTCAACGCCTACCGGAGACAGCGACTCGGGCTTCGGCCGCGGCTCGTTCCGGGCGCCC
GAGTTACACCACCCTCCGCCTGCAGGTTCCCCCAGTGAGTACAGCGGCGGAGGCGAGTCT
GGTAACGGAACTCCTAAAAGCAAATCACAGGACGTGGCTTCCAAGTCGCTATCGGGACTG
GAGTCGCTCGTTGACCAGATACCTTCCATAGCGGACGGCCCGGCGGGCGTGGGCGGTGTG
GGCGGCGTGGGCGCGGCCGTCGGCAGCGTGGGCGAACAGGGCAGCGCCCCACCAGTGCCC
TCGCTGCCAGAGTACACGCCAGCGTTATACCCGCCATACCCGGCGTACGGCGCACCGGCA
TACGGAAATAACAGCTACGGCGCTCCGTTTGTCGGTTACGGCGGTGGTTGGGGCACCCAG
CTGATGCGGCCGGCGCCGGGCTACTTACCGGACTGGCAGTACGGGTACGGTCCGCCCGCG
TACGCCTCATACAACTCACCGTACTACAACGGATATCCGGGACCGCCGCCCGCGCACCAC
CAACAGACTCACTACCTGTCCCCGCCGCTGTTGGAGCTCCACAAAAGCGGCGAGCACGCG
GCCGCCGTGTCTGCGGTGCCCGCGGTCCCCTCCGTGCCCTCCGTCGGCTTCGGGGGCTTC
TGTTAG
Protein sequence:
MTQHHQHPHQQLHHNQYAKRDDEIYNGVKRECEDPYSFVEEEAMCAMLGQQHHLQHLQHH
DQHHHQHMQIHPQQMMLNQPKKRGRKKKIKDENGLELKVDGVLDSVSGIVRPVKERKKHD
RFNGMSEEEVSRRTLPDHLAENLDIIIIGINPGLFAAYKGHHYAGPGNHFWKCLYLSGLT
REQMSADEDYKLLNFGIGFTNMVSRPTKGSADLTRREIKEGSAILLEKLQTFRPKVAVFN
GKLIYEVFSGKKDFCFGKQPDCIAGTNTYMWVMPSSSARCAQLPRAADKVPFYAALKKFR
DYLNGLLPRLDDAELVFPDTTSRRPHEEMEIRRLTMEPEPGDTIILEDGTEVPLKKKRGR
PKKVKLENGETVPAVPRAPRQPRPPPSMETGDQPAKKKRGRPKKIRPEEQQFLLQQQQQQ
QQQQQQQQQQQQQQNNSMLQPQLSSMTQLPHEQFLHNSSGDFQQMSSPLGVGGVGVGVGN
VGVGSVGGVSNVSSGVMYGVQHQHQQQMSDSSPYYQPNNSGGMDSPLDVGGGLGMSRGYG
SPGGVGVGGVGFAASPRHAHSYASPRSQPYSPGPQRLAATPQPQQQFPSSPAAFSAPSPP
RAGYGSPGVGGVGVVGGVGGVGGSRGFAARSPLYASSPAAYRQQPSPAAQPQPRFTHDGM
PFARDTQSGSVSTSGGGFSCSPGVAGVVGVGVVGGSTPFPAASPAAHSYTPSPAHTPYSH
HSSPAPAPAPHTPYDSHHFANQGSGSSGSGGGSGSGYGAELSSDIGAAISSPAPVSPACA
TLDFEPPRDDSPMGSTDMHPGSNSNSSLSDYNKQSNPGGGEMSPAGVGAGSFGGALYDDT
RLAYSDKPDYHYQEQGNGVGDSPRLDQLHQHPSMYPSNFNRSTPTGDSDSGFGRGSFRAP
ELHHPPPAGSPSEYSGGGESGNGTPKSKSQDVASKSLSGLESLVDQIPSIADGPAGVGGV
GGVGAAVGSVGEQGSAPPVPSLPEYTPALYPPYPAYGAPAYGNNSYGAPFVGYGGGWGTQ
LMRPAPGYLPDWQYGYGPPAYASYNSPYYNGYPGPPPAHHQQTHYLSPPLLELHKSGEHA
AAVSAVPAVPSVPSVGFGGFC
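
Consistency check (illustrative): the CDS length of 3306 and the protein sequence above can be cross-checked by translating the nucleotide sequence. The sketch below is not part of the original record. It assumes the standard genetic code (NCBI translation table 1) and a CDS given in frame from its start codon; the helper names `translate` and `expected_protein_length` are hypothetical. Only the first and last lines of the nucleotide sequence are used as sample input.

```python
from itertools import product

# Assumption: standard genetic code (NCBI translation table 1).
# Amino acids are listed for codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame CDS; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(cds_length: int) -> int:
    """Residue count for a complete CDS that ends in a single stop codon."""
    return cds_length // 3 - 1


# First and last lines of the nucleotide sequence listed above.
first_line = "ATGACGCAGCATCACCAACACCCGCATCAACAACTACATCATAACCAATACGCTAAGAGA"
print(translate(first_line))          # MTQHHQHPHQQLHHNQYAKR
print(translate("TGTTAG"))            # C*
print(expected_protein_length(3306))  # 1101
```

The first 20 translated residues match the start of the protein sequence, the final codon is a TAG stop, and 3306 nucleotides correspond to the 1101 residues listed (18 lines of 60 plus a final line of 21).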