DPGLEAN08885 in OGS1.0

New model in OGS2.0  DPOGS200103
Genomic Position  scaffold1123:- 13867-30758
CDS Length  3306
Paired RNAseq reads  7968
Single RNAseq reads  18427
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA004554 (3e-130)
Best Drosophila hit  Thd1 (2e-97)
Best Human hit  G/T mismatch-specific thymine DNA glycosylase (4e-58)
Best NR hit (blastp)  PREDICTED: similar to Thd1 [Tribolium castaneum] (2e-116)
Best NR hit (blastx)  PREDICTED: similar to Thd1 [Tribolium castaneum] (1e-103)
GeneOntology terms
GO:0008263 pyrimidine-specific mismatch base pair DNA N-glycosylase activity
GO:0006289 nucleotide-excision repair
GO:0003690 double-stranded DNA binding
GO:0006298 mismatch repair
GO:0006974 response to DNA damage stimulus
InterPro families
IPR020478 AT hook-like
IPR015637 DNA glycosylase, G/T mismatch
IPR005122 Uracil-DNA glycosylase-like
IPR017956 AT hook, DNA-binding motif
Orthology group  MCL11280

Nucleotide sequence:

ATGACGCAGCATCACCAACACCCGCATCAACAACTACATCATAACCAATACGCTAAGAGA
GACGATGAGATATATAACGGTGTGAAACGGGAGTGCGAGGATCCGTACTCGTTCGTCGAG
GAGGAGGCGATGTGCGCGATGCTCGGGCAACAGCACCACCTGCAGCACCTGCAGCATCAC
GACCAGCACCACCACCAGCACATGCAGATCCACCCGCAGCAGATGATGCTCAACCAGCCC
AAGAAAAGGGGCCGCAAGAAGAAAATAAAAGATGAAAACGGGTTGGAACTTAAAGTGGAC
GGAGTTTTAGACAGCGTGTCGGGTATCGTGCGTCCGGTGAAGGAGCGGAAGAAACATGAC
CGGTTCAACGGGATGAGTGAGGAGGAGGTGTCGCGACGCACACTGCCCGACCACCTCGCG
GAGAACCTCGACATCATCATTATCGGTATCAACCCGGGTTTGTTCGCCGCCTACAAGGGT
CATCACTACGCTGGTCCCGGGAACCACTTCTGGAAATGTCTCTACCTATCCGGACTCACG
CGGGAACAGATGAGCGCTGACGAGGATTACAAGCTTCTAAACTTTGGCATCGGCTTCACG
AACATGGTGTCCCGTCCTACCAAAGGCTCAGCGGATCTCACGAGGAGGGAGATCAAGGAA
GGATCCGCCATTTTGTTGGAGAAGCTTCAGACCTTCCGACCCAAGGTGGCCGTGTTCAAC
GGGAAATTGATCTACGAAGTGTTCTCCGGGAAGAAGGACTTCTGTTTCGGGAAACAACCC
GACTGTATCGCCGGGACTAACACTTACATGTGGGTGATGCCGTCGTCGTCGGCTCGTTGC
GCCCAGTTGCCGCGGGCCGCCGACAAGGTCCCGTTCTACGCGGCCTTGAAGAAGTTCAGG
GACTACCTGAACGGTCTCCTGCCGCGGCTGGACGACGCCGAGCTGGTCTTCCCCGACACC
ACCTCCAGGCGGCCGCACGAGGAGATGGAGATACGCAGACTGACGATGGAGCCCGAGCCG
GGCGACACCATTATACTAGAGGACGGCACGGAAGTCCCGCTCAAGAAGAAACGCGGCCGA
CCCAAGAAGGTGAAGTTGGAGAACGGGGAGACGGTGCCCGCGGTCCCCCGGGCGCCGCGG
CAGCCCCGGCCTCCGCCCTCCATGGAGACCGGGGACCAGCCCGCTAAGAAGAAGAGAGGC
CGCCCCAAGAAAATAAGACCCGAGGAGCAGCAGTTTCTCCTGCAGCAACAGCAACAACAA
CAGCAGCAGCAGCAACAGCAACAACAACAACAGCAACAGCAAAACAATTCGATGCTGCAG
CCGCAGCTATCGTCCATGACACAGCTGCCGCACGAGCAGTTCCTCCACAACTCTAGCGGG
GACTTCCAGCAGATGTCGTCGCCGTTGGGAGTGGGCGGTGTGGGCGTGGGCGTGGGCAAC
GTGGGCGTCGGCAGTGTGGGGGGAGTGAGCAACGTCAGCAGTGGTGTCATGTACGGTGTG
CAGCATCAGCACCAGCAGCAGATGTCCGACTCGTCGCCGTACTACCAACCTAACAATAGC
GGCGGTATGGATTCTCCGTTAGACGTGGGCGGAGGTCTGGGCATGTCCCGTGGTTACGGT
TCACCTGGCGGCGTGGGCGTGGGCGGCGTGGGCTTCGCGGCGTCACCACGGCACGCGCAC
TCGTACGCGTCACCGCGCTCACAGCCATACTCACCTGGACCACAGAGACTCGCTGCTACG
CCGCAACCACAGCAACAGTTTCCATCGAGCCCGGCGGCATTTTCGGCGCCATCGCCGCCA
CGTGCTGGGTACGGTTCACCCGGAGTGGGCGGTGTGGGCGTCGTGGGCGGTGTGGGGGGT
GTGGGCGGATCGCGCGGGTTCGCGGCTCGTTCCCCACTGTATGCGAGCTCCCCGGCCGCT
TACCGCCAGCAGCCGAGCCCAGCCGCCCAGCCCCAGCCAAGGTTTACACACGATGGGATG
CCTTTCGCTAGAGACACACAGAGCGGTTCAGTATCAACTTCCGGCGGGGGTTTCTCGTGT
TCCCCCGGCGTGGCGGGCGTGGTGGGGGTGGGGGTGGTGGGGGGTAGTACCCCGTTCCCG
GCGGCGTCCCCCGCCGCTCACTCGTACACCCCGTCTCCAGCCCACACGCCGTACTCGCAC
CACTCGTCCCCAGCGCCGGCCCCAGCCCCTCACACGCCGTATGATTCACATCATTTTGCC
AATCAGGGAAGTGGATCGAGCGGCTCGGGCGGCGGCTCGGGCTCGGGCTACGGTGCAGAG
TTGTCTAGCGACATCGGTGCGGCGATATCGTCCCCGGCGCCCGTGTCGCCAGCCTGCGCC
ACCCTGGACTTCGAGCCGCCCCGTGATGACTCGCCAATGGGCAGCACAGACATGCATCCG
GGCAGCAACAGCAACTCCTCGCTGTCCGACTACAATAAGCAGAGTAATCCGGGCGGAGGC
GAGATGTCGCCGGCCGGTGTGGGCGCGGGCTCGTTCGGGGGCGCTCTGTACGACGACACG
AGACTAGCGTACAGCGACAAACCCGACTATCATTACCAGGAACAAGGCAATGGTGTGGGA
GACAGCCCTCGGTTAGATCAATTACATCAACATCCCTCCATGTATCCTAGCAATTTTAAC
AGGTCAACGCCTACCGGAGACAGCGACTCGGGCTTCGGCCGCGGCTCGTTCCGGGCGCCC
GAGTTACACCACCCTCCGCCTGCAGGTTCCCCCAGTGAGTACAGCGGCGGAGGCGAGTCT
GGTAACGGAACTCCTAAAAGCAAATCACAGGACGTGGCTTCCAAGTCGCTATCGGGACTG
GAGTCGCTCGTTGACCAGATACCTTCCATAGCGGACGGCCCGGCGGGCGTGGGCGGTGTG
GGCGGCGTGGGCGCGGCCGTCGGCAGCGTGGGCGAACAGGGCAGCGCCCCACCAGTGCCC
TCGCTGCCAGAGTACACGCCAGCGTTATACCCGCCATACCCGGCGTACGGCGCACCGGCA
TACGGAAATAACAGCTACGGCGCTCCGTTTGTCGGTTACGGCGGTGGTTGGGGCACCCAG
CTGATGCGGCCGGCGCCGGGCTACTTACCGGACTGGCAGTACGGGTACGGTCCGCCCGCG
TACGCCTCATACAACTCACCGTACTACAACGGATATCCGGGACCGCCGCCCGCGCACCAC
CAACAGACTCACTACCTGTCCCCGCCGCTGTTGGAGCTCCACAAAAGCGGCGAGCACGCG
GCCGCCGTGTCTGCGGTGCCCGCGGTCCCCTCCGTGCCCTCCGTCGGCTTCGGGGGCTTC
TGTTAG

Protein sequence:

MTQHHQHPHQQLHHNQYAKRDDEIYNGVKRECEDPYSFVEEEAMCAMLGQQHHLQHLQHH
DQHHHQHMQIHPQQMMLNQPKKRGRKKKIKDENGLELKVDGVLDSVSGIVRPVKERKKHD
RFNGMSEEEVSRRTLPDHLAENLDIIIIGINPGLFAAYKGHHYAGPGNHFWKCLYLSGLT
REQMSADEDYKLLNFGIGFTNMVSRPTKGSADLTRREIKEGSAILLEKLQTFRPKVAVFN
GKLIYEVFSGKKDFCFGKQPDCIAGTNTYMWVMPSSSARCAQLPRAADKVPFYAALKKFR
DYLNGLLPRLDDAELVFPDTTSRRPHEEMEIRRLTMEPEPGDTIILEDGTEVPLKKKRGR
PKKVKLENGETVPAVPRAPRQPRPPPSMETGDQPAKKKRGRPKKIRPEEQQFLLQQQQQQ
QQQQQQQQQQQQQQNNSMLQPQLSSMTQLPHEQFLHNSSGDFQQMSSPLGVGGVGVGVGN
VGVGSVGGVSNVSSGVMYGVQHQHQQQMSDSSPYYQPNNSGGMDSPLDVGGGLGMSRGYG
SPGGVGVGGVGFAASPRHAHSYASPRSQPYSPGPQRLAATPQPQQQFPSSPAAFSAPSPP
RAGYGSPGVGGVGVVGGVGGVGGSRGFAARSPLYASSPAAYRQQPSPAAQPQPRFTHDGM
PFARDTQSGSVSTSGGGFSCSPGVAGVVGVGVVGGSTPFPAASPAAHSYTPSPAHTPYSH
HSSPAPAPAPHTPYDSHHFANQGSGSSGSGGGSGSGYGAELSSDIGAAISSPAPVSPACA
TLDFEPPRDDSPMGSTDMHPGSNSNSSLSDYNKQSNPGGGEMSPAGVGAGSFGGALYDDT
RLAYSDKPDYHYQEQGNGVGDSPRLDQLHQHPSMYPSNFNRSTPTGDSDSGFGRGSFRAP
ELHHPPPAGSPSEYSGGGESGNGTPKSKSQDVASKSLSGLESLVDQIPSIADGPAGVGGV
GGVGAAVGSVGEQGSAPPVPSLPEYTPALYPPYPAYGAPAYGNNSYGAPFVGYGGGWGTQ
LMRPAPGYLPDWQYGYGPPAYASYNSPYYNGYPGPPPAHHQQTHYLSPPLLELHKSGEHA
AAVSAVPAVPSVPSVGFGGFC
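
The nucleotide and protein sequences above can be cross-checked against each other and against the stated CDS length. The following is a minimal sketch, assuming the CDS is translated with the standard genetic code (NCBI translation table 1). The function name `translate_cds` is hypothetical and not part of this record or of any database tool.

```python
# Standard genetic code (NCBI translation table 1), built from the
# conventional T, C, A, G base ordering. Assumption: the CDS in this record
# uses the standard nuclear code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon.

    Whitespace and line breaks are ignored, so a sequence pasted in
    60-column lines can be passed in directly. Codons that contain
    anything other than A, C, G, T are reported as 'X'.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


# First 18 nucleotides and last 9 nucleotides of the sequence listed above.
print(translate_cds("ATGACGCAGCATCACCAA"))  # MTQHHQ
print(translate_cds("TTCTGTTAG"))           # FC*
```

Under this assumption, the listed CDS length of 3306 nucleotides corresponds to 1102 codons: 1101 amino acids plus the terminal TAG stop codon, which agrees with the length of the protein sequence shown.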