New model in OGS2.0 | DPOGS201295  |
---|---|
Genomic Position | scaffold2389:- 2144-6057 |
CDS Length | 3312 |
Paired RNAseq reads   | 1214 |
Single RNAseq reads   | 2856 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA003045 (2e-136) |
Best Drosophila hit   | mutagen-sensitive 201, isoform C (3e-68) |
Best Human hit | DNA repair protein complementing XP-G cells (3e-44) |
Best NR hit (blastp)   | hypothetical protein TcasGA2_TC013347 [Tribolium castaneum] (2e-175) |
Best NR hit (blastx)   | mutagen-sensitive 201 [Nasonia vitripennis] (6e-141) |
GeneOntology terms    | GO:0006284 base-excision repair; GO:0000014 single-stranded DNA specific endodeoxyribonuclease activity; GO:0006289 nucleotide-excision repair; GO:0005634 nucleus; GO:0003697 single-stranded DNA binding |
InterPro families    | IPR020045 5'-3' exonuclease, C-terminal subdomain; IPR006084 DNA repair protein (XPGC)/yeast Rad; IPR001044 Xeroderma pigmentosum group G protein; IPR006085 XPG N-terminal; IPR006086 XPG/RAD2 endonuclease; IPR008918 Helix-hairpin-helix motif, class 2; IPR019974 XPG conserved site |
Orthology group | MCL13146 |
Nucleotide sequence:
ATGGGAGTGACTGGCCTGTGGAGACTTATCGAGCCGGCCGGAAAACCCGTGCCTGTTGAA
ACATTAGAGAATAAAGTTTTAGCAGTCGATATTTCAATATGGTTGCATCAGATGGTGAAA
GGCTATCAGGATGCTAAAGGAGCTCCTCTACCGAATGCTCATCTCATTGGGTTGTTTCAA
AGACTATGTAAATTACTATATTTTAGAATCAAGCCAGTTTTTGTATTTGATGGTGGATTT
CCAGATCTTAAAAGAGAGACAATTGCTAAAAGACAAGACAACAAAACGAAATATAATTCT
GCATCAGAAAAATTAAAGAGAGAAATCACTTTGCTTTTGGGCAAAAAAACTGCAATTGGT
TCATTGCTAGGAAAACAGATTTCTCCCACTAAGAATAAACAACCTCAGGCTAACGATGAC
ATCTTTAAACTACCGGAATTACCAGAAAAGGGTGCATATTCTGAATCTGAGTCTGAAGAT
GAACAAGATTCAAGTGCATCAACAGTGGACTTACACTCTGTAGATTTTGACTCAGATAAA
TTTAAAAATATGCCCATAAAAGAAAAATATGACCTTCTAATTGAACTGAAGGAAACAAGG
AAAATGAATTCTTGGGGCAAAATAAACACATTACCTAAGAAAAGTGACAACTTTTCGGAT
TTTCAAATGCAGAGATTGCTAAAACGGAGGCAACTACAAGAATGTTTGGAAGAAACTGAA
AAAGAGATGGGTGATGAGGGCATGTCCTTAAACGAATTGGAATCATTGCTCAATGAAGAG
GGCATAGACACCAAAATAGAGAGCTTGCCGAGCCGCAGAATAGCGTCAAATAATACAACG
AGATATCTACTTATAAGTAATGTAAGGCAGGCGTTGGAAAATGCTAAGAGAAAGGAAGAG
GCTGCCCAACAAGCACAAATGCAAATATCAGCGACAACTGAAGTAGAAAAACATGATACC
AAAGATATTCAGAAAAATGATGAATGCGACGATGATTTAGAAAAGGCTATTAAAATGTCG
CTGGAATGTGTGGAAGAGGCCGACACGAGCGCTTGTACGTCAAAAACTGATGAGTCCTGG
ACATCTTTCCTCACAGAATCTGATTATTCAGATGACGAAGACGAAGATGGATTCGCCCCT
CCTGATATGACGTCAGCAAAATCATACATCAAACAATATACCGACTTAAACTACAAAGTC
ATTGATAACATTGTTGCCGCAAAACAAAAAGAAAAAAATAAACCTTCTAAACCAAGAGTT
AATGAAATCATTGAAGAGCTCGCACAAGAAAAGACTATTATAGAAGATGAGATTGAGTTA
ATGTCTAGTGATGATGACAAAGATAAGTGTGAAGTAGTAGAAAAAGAAAACGATAAAGAT
AAGTCATGTGCTGTAGAAGCGGTAATGGAAAATGAAGTCATAAGTAAAAAGTGTGAATTG
GAAGAAAGCAATCATGAAGAGGCTTCTATAATATGTGTTGAGAAATCTATTGCTGATGTT
ATTTCACTTGATTCAAGCTTAGGAGAAGCGGAAACAGATAACGATGAAGTTAAAGTTGAA
CCTGCGAAAGATAATATTAAATTAGAAAGTTCAAGCTCCAGTGAGGATGACTTCGAAGAT
GTTTCTGATGAAGAGACCGAGTCTAAGAAACCAGTTGTTACATTAACCCTCAATATGGGC
AATACGATCGAAGATGATATATTTGCCGATATATTTGAAAGTAAAGCTGGTGAAAAATGT
TTAGCACCTAATATAAAGGAAGAGCAAGAAATAAAGGAAGATGTTAGCAAACACGAGAAC
GAAGTGAAAGATATAATTAAATCAAATGTCACAGCTGAAGTTGAAACAACTTCAAAAAGT
ATAGAAACAACTTCAAAAAATATAGAAACATGCTTGAATGAAGAAAGTGAAAAACAGAGA
AATGAAGATCAAAAAGATGTAGTGGATAGAGACCAAACAGTATCAGAGGAGCTTGCTACA
ACACAAAAGGCAGTGATACCAGAAAGACAAAAAATATCAGCTGAAGAACTTAACACTATG
GTAACGGAAATTGAGAATGAGGAACAGTTACTGCTTCAAGAAAAGGGCAAATTAGATCGT
ATTGGTCGCAACATAACTGAACAGATGACCAAAGAGGCACAGGAACTGCTTCAAATCTTC
GGCATCCCGTACATCGTAGCTCCGATGGAGGCTGAAGCGCAGTGCGCGGTGTTGGAGGCT
CTCAAACTTACCGACGGCACCATAACAGATGACAGTGACATTTGGCTGTTCGGGGGCAGA
ACGGTGTACAAAAACTTTTTCAATCAGAAAAAACATGTCTTGCAATTTTTGCGGGAACGA
ATTGAAAAATCGTTTAATTTGAGTCGTGAGAAGTTGGTGCTGCTGGCTCTGCTCGTGGGA
AGCGATTACACAGTCGGAGTTACTGGCGTGGGACCTGTGACCGCTTTGGAGATTTTGGCT
TCATTTCCTTTTAACAAGAAAAAAACAATAGCTGAAGACGCAAAATTCACCGATTATCAA
GAAATTGTAGCGGGACTGCAAGATTTTAAGAAATGGGTGAAGGCGGGGAGAAGAACGGAT
AATGTTAGTTTGAAGAAGAAACTAAAAAATGTCAGCCTCTCGGATGACTTTCCCAGTGTG
AGGGTAGTTCAAGCATATTTCGAACCGAATGTAGAAAAGAGCAGTGAAAAGTTCTCCTGG
GGCGATCCAGACATCACTGAGTTAAGAGAGTACGCGAGGGCCAAGTTTGGGTGGTCCCAA
CACAAGTTGGATGAAATAATAAAACCAGTCATTAAAAGGATGCAAGAAAATAAGACTCAG
AAAACTGTCCACGACTATTTTAAAAAGAAACTCGTGTTGGATTCTTTGGAGGACCAGATG
AGTAAGAGAGTCAAAGCTGCAATACAAAAGATGGGAACCGAGGCTTCAGGAGAAGAATTA
AATGCACCTGAAAAACCGAAACCCAAAAGAGCAAGAAAAAGATCTACAAATCAAGATGCT
ACAAAGCCAGGACCTTCCAAGGAAAAACGTAAAAGGCGCAACGATGAAACCAAACAGATA
AGCGAAGTTATACAAGCGGCGCCAGCAAACGATCCAGATAAGACATTTGATATAGCAGTT
CCAAAGACGGATAGGTATCAGGAAATTATACCGCAGAGAGAGAGGGACAGGAAATGCATT
TTAGAAAATAAGTTAAAGGCTATAGAGGTGTTCCGCAAGAGTAAGATTGATCCTAAAAAG
AAAACGACGAAGAGGCGGCTTCCTGTACCCAAAGAAAAAGCTGATCTCTCAGAGAGTAGT
GACACAGATTAA
Protein sequence:
MGVTGLWRLIEPAGKPVPVETLENKVLAVDISIWLHQMVKGYQDAKGAPLPNAHLIGLFQ
RLCKLLYFRIKPVFVFDGGFPDLKRETIAKRQDNKTKYNSASEKLKREITLLLGKKTAIG
SLLGKQISPTKNKQPQANDDIFKLPELPEKGAYSESESEDEQDSSASTVDLHSVDFDSDK
FKNMPIKEKYDLLIELKETRKMNSWGKINTLPKKSDNFSDFQMQRLLKRRQLQECLEETE
KEMGDEGMSLNELESLLNEEGIDTKIESLPSRRIASNNTTRYLLISNVRQALENAKRKEE
AAQQAQMQISATTEVEKHDTKDIQKNDECDDDLEKAIKMSLECVEEADTSACTSKTDESW
TSFLTESDYSDDEDEDGFAPPDMTSAKSYIKQYTDLNYKVIDNIVAAKQKEKNKPSKPRV
NEIIEELAQEKTIIEDEIELMSSDDDKDKCEVVEKENDKDKSCAVEAVMENEVISKKCEL
EESNHEEASIICVEKSIADVISLDSSLGEAETDNDEVKVEPAKDNIKLESSSSSEDDFED
VSDEETESKKPVVTLTLNMGNTIEDDIFADIFESKAGEKCLAPNIKEEQEIKEDVSKHEN
EVKDIIKSNVTAEVETTSKSIETTSKNIETCLNEESEKQRNEDQKDVVDRDQTVSEELAT
TQKAVIPERQKISAEELNTMVTEIENEEQLLLQEKGKLDRIGRNITEQMTKEAQELLQIF
GIPYIVAPMEAEAQCAVLEALKLTDGTITDDSDIWLFGGRTVYKNFFNQKKHVLQFLRER
IEKSFNLSREKLVLLALLVGSDYTVGVTGVGPVTALEILASFPFNKKKTIAEDAKFTDYQ
EIVAGLQDFKKWVKAGRRTDNVSLKKKLKNVSLSDDFPSVRVVQAYFEPNVEKSSEKFSW
GDPDITELREYARAKFGWSQHKLDEIIKPVIKRMQENKTQKTVHDYFKKKLVLDSLEDQM
SKRVKAAIQKMGTEASGEELNAPEKPKPKRARKRSTNQDATKPGPSKEKRKRRNDETKQI
SEVIQAAPANDPDKTFDIAVPKTDRYQEIIPQRERDRKCILENKLKAIEVFRKSKIDPKK
KTTKRRLPVPKEKADLSESSDTD
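
The nucleotide and protein sequences in this record can be cross-checked against each other and against the stated CDS length. The following is a minimal Python sketch of such a check, assuming the standard genetic code (NCBI translation table 1). The function name `translate` and the way the codon table is built are illustrative choices, not part of the database's own tooling.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS to protein, dropping one trailing stop codon if present."""
    cds = "".join(cds.split()).upper()  # remove line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein[:-1] if protein.endswith("*") else protein


# First and last lines of the nucleotide sequence in this record.
first_line = "ATGGGAGTGACTGGCCTGTGGAGACTTATCGAGCCGGCCGGAAAACCCGTGCCTGTTGAA"
last_line = "GACACAGATTAA"

print(translate(first_line))  # MGVTGLWRLIEPAGKPVPVE
print(translate(last_line))   # DTD

# A 3312-nt CDS holds 1104 codons: 1103 residues plus the stop codon.
print(3312 // 3 - 1)          # 1103
```

Passing the full nucleotide sequence to `translate` should reproduce the protein sequence listed above, which is 1103 residues long and ends in `DTD`.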