Monarch geneset OGS2.0

DPOGS207004
Transcript: DPOGS207004-TA, 978 bp
Protein: DPOGS207004-PA, 325 aa
Genomic position: DPSCF300001 + 1046476-1048060
RNAseq coverage: 15x (Rank: top 81%)
Annotation
Heliconius: HMEL008666, 3e-139, 71.56%
Bombyx: BGIBMGA012924-TA, 4e-114, 72.93%
Drosophila: Ogg1-PA, 4e-75, 45.83%
EBI UniRef50: UniRef50_D6WPN3, 1e-97, 55.41%, Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WPN3_TRICA
NCBI RefSeq: XP_968299.1, 3e-98, 55.41%, PREDICTED: similar to N-glycosylase/DNA lyase [Tribolium castaneum]
NCBI nr blastp: gi|91086671, 5e-97, 55.41%, PREDICTED: similar to N-glycosylase/DNA lyase [Tribolium castaneum]
NCBI nr blastx: gi|91086671, 5e-97, 55.59%, PREDICTED: similar to N-glycosylase/DNA lyase [Tribolium castaneum]
Group
Gene Ontology:
  GO:0006281  2.1e-48  DNA repair
  GO:0003824  2.1e-48  catalytic activity
  GO:0003684  2.5e-31  damaged DNA binding
  GO:0008534  2.5e-31  oxidized purine base lesion DNA N-glycosylase activity
  GO:0006289  2.5e-31  nucleotide-excision repair
  GO:0003677  4.8e-25  DNA binding
  GO:0006284  2.6e-23  base-excision repair
KEGG pathway: tca:656697, 7e-98
  K03660 (OGG1) maps to Base excision repair
InterPro domain:
  [120-236]  IPR011257  2.1e-48  DNA glycosylase
  [237-304]  IPR023170  1.8e-32  Helix-turn-helix, base-excision DNA repair, C-terminal
  [5-125]    IPR012904  2.5e-31  8-oxoguanine DNA glycosylase, N-terminal
  [1-119]    IPR012294  4.8e-25  Transcription factor TFIID, C-terminal/DNA glycosylase, N-terminal
  [130-297]  IPR003265  2.6e-23  HhH-GPD domain
Orthology group: MCL14573, Multiple-copy universal gene
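
The InterPro domain coordinates in this record can be sanity-checked against the stated protein length of 325 aa. The following is only a minimal Python sketch: it assumes the bracketed coordinates are 1-based, inclusive residue positions (the record does not state the convention), and the names `DOMAINS`, `domain_length`, and `out_of_bounds` are hypothetical helpers introduced for illustration.

```python
# Coordinates copied from the InterPro rows of this record.
# Assumption: positions are 1-based and inclusive.
PROTEIN_LENGTH = 325  # aa, from the Protein field

DOMAINS = [
    ("IPR011257", 120, 236),
    ("IPR023170", 237, 304),
    ("IPR012904", 5, 125),
    ("IPR012294", 1, 119),
    ("IPR003265", 130, 297),
]


def domain_length(start, end):
    """Number of residues in a 1-based inclusive interval."""
    return end - start + 1


def out_of_bounds(domains, protein_length):
    """Return the IDs of domains that do not fit inside 1..protein_length."""
    return [
        ipr
        for ipr, start, end in domains
        if start < 1 or end > protein_length or start > end
    ]


if __name__ == "__main__":
    for ipr, start, end in DOMAINS:
        print(ipr, start, end, domain_length(start, end))
    print("out of bounds:", out_of_bounds(DOMAINS, PROTEIN_LENGTH))
```

Under that assumption every listed domain fits within the protein, so the check returns an empty list for this record.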

Nucleotide sequence:

>DPOGS207004-TA
ATGGCTTGGAATAAAATAAATTGTTGTCAGCGAGAATTGCAATTGCTTGGTACACTTAACGGAGGTCAAAGTTTTAGGTGGAATTATAATAAAGACACAAATGAATGGAAAGGCGTTTTTTCAAGAACCTTATGGAAGTTACGGCAACGAGACGATTTTTTGGAATATCAAGTTTTAGGATCTCTACTCATTAAATCAAAAGAAAATAATTCTGTTAAAGTAGATTTTGCGGATATGCTTACAAAATATTTTAGGTTAGATTTCAACTTAAAAGACCACTATAAAGTATGGTCAGATAAAGATGAACTTTTTAAATCTGCCTGTACAAAGTTCTATGGAATAAGAATGCTAAATCAGGAGCCTGTAGAAAATCTTTTTTCGTTTATCTGCAGCCAGAACAATCATATTTCCAGGATATCCAGCCTGGTTGAAAAACTCTGCATCTATTATGGTGATGAAATTTGTCAGTTTGAAGGAGTGACATATTATGCTTTTCCTGATGTGGAAAAGCTTATGGACATAAAAGTGGAATCTAAATTAAGAGAACTAGGTTTTGGTTATAGAGCCAAATTTATTCAAAAATCAGCAGCTCAGATTGTAGAGTGGGGAGGAGACGAATGGTTTAAAAGATTAAAGGATATGAAATACAAGGACGCCCGACAGGAACTTATAAAATTGTGTGGAATCGGACCTAAAGTCGCTGACTGTATATGCCTGATGTCATTGAATCATCTAGAGGCACTTCCTGTTGACACGCACGTGTATCAAATAGCTGCCACAAACTATCTCCCACACTTGAAAGGTAAAAAAAGTGTCACAGAAAAAATTTATACTGAAATAGGCGACCACTTTAGAAGTTTGTATGGAGATAAAGCAGGATGGGCACATACTGTGCTCTTCTGTGCTGATTTAAAAAAATTTCAACAAGATGACTCAAATGAGGATGTCGTTAAAAGTAAAAGAAAAAAGAAAAAATAA

Protein sequence:

>DPOGS207004-PA
MAWNKINCCQRELQLLGTLNGGQSFRWNYNKDTNEWKGVFSRTLWKLRQRDDFLEYQVLGSLLIKSKENNSVKVDFADMLTKYFRLDFNLKDHYKVWSDKDELFKSACTKFYGIRMLNQEPVENLFSFICSQNNHISRISSLVEKLCIYYGDEICQFEGVTYYAFPDVEKLMDIKVESKLRELGFGYRAKFIQKSAAQIVEWGGDEWFKRLKDMKYKDARQELIKLCGIGPKVADCICLMSLNHLEALPVDTHVYQIAATNYLPHLKGKKSVTEKIYTEIGDHFRSLYGDKAGWAHTVLFCADLKKFQQDDSNEDVVKSKRKKKK-
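
The transcript and protein lengths given in this record are consistent with each other if the transcript is a coding sequence running from the start codon through the stop codon: 3 × (325 + 1) = 978. The sketch below illustrates that check and a plain codon-by-codon translation. It is an illustration under stated assumptions, not the pipeline used to build the gene set: it assumes the standard nuclear genetic code, and it writes the stop codon as `-` to match the protein sequence shown above. `translate` and `expected_transcript_length` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest). Assumption: the standard nuclear
# code applies to this transcript.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon, without stopping early.

    Stop codons are written as stop_symbol ('-' matches this record).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def expected_transcript_length(protein_aa):
    """Length in bp of a CDS encoding protein_aa residues plus one stop codon."""
    return 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First 27 nt and last 36 nt of DPOGS207004-TA, copied from the record.
    print(translate("ATGGCTTGGAATAAAATAAATTGTTGT"))
    print(translate("GATGTCGTTAAAAGTAAAAGAAAAAAGAAAAAATAA"))
    print(expected_transcript_length(325))
```

Passing the full DPOGS207004-TA sequence to `translate` and comparing the result with DPOGS207004-PA would extend the same check to the whole record.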