Monarch geneset OGS2.0

DPOGS211165
Transcript: DPOGS211165-TA (2808 bp)
Protein: DPOGS211165-PA (935 aa)
Genomic position: DPSCF300007 + 269287-274264
RNAseq coverage: 1647x (Rank: top 8%)
Annotation
Heliconius    HMEL017217    0.0    82.86%
Bombyx    BGIBMGA003151-TA    0.0    80.07%
Drosophila    Top1-PC    0.0    63.54%
EBI UniRef50    UniRef50_P30189    0.0    63.54%    DNA topoisomerase 1 n=60 Tax=Bilateria RepID=TOP1_DROME
NCBI RefSeq    XP_971195.2    0.0    61.25%    PREDICTED: similar to conserved hypothetical protein [Tribolium castaneum]
NCBI nr blastp    gi|343175107    0.0    77.46%    topoisomerase 1B [Spodoptera exigua]
NCBI nr blastx    gi|343175107    0.0    79.12%    topoisomerase 1B [Spodoptera exigua]
Group
Gene Ontology
GO:0003677    8.8e-266    DNA binding
GO:0003917    8.8e-266    DNA topoisomerase type I activity
GO:0005694    8.8e-266    chromosome
GO:0006265    8.8e-266    DNA topological change
GO:0003918    8.1e-94    DNA topoisomerase (ATP-hydrolyzing) activity
KEGG pathway 
InterPro domain
[532-907]    IPR013499    8.8e-266    DNA topoisomerase I, C-terminal, eukaryotic-type
[373-602]    IPR008336    7.2e-107    DNA topoisomerase I, DNA binding, eukaryotic-type
[603-935]    IPR011010    1.3e-94    DNA breaking-rejoining enzyme, catalytic core
[604-836]    IPR013500    8.1e-94    DNA topoisomerase I, catalytic core, eukaryotic-type
[758-933]    IPR014727    4.6e-76    DNA topoisomerase I, catalytic core, alpha/beta subdomain, eukaryotic-type
[491-603]    IPR013030    2.7e-70    DNA topoisomerase I, DNA binding, mixed alpha/beta motif, eukaryotic-type
[604-751]    IPR014711    8.1e-70    DNA topoisomerase I, catalytic core, alpha-helical subdomain, eukaryotic-type
[532-541]    IPR001631    8.1e-51    DNA topoisomerase I, C-terminal
[404-490]    IPR013034    1.2e-32    DNA topoisomerase I, domain 1
[812-882]    IPR009054    1.2e-17    DNA topoisomerases I, dispensable insert, eukaryotic-type
Orthology group: MCL11290 (Multiple-copy universal gene)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211165-TA
ATGAGTGTTGAAAATCCAGCCAGTGATGATGGTGATTCAGGAAAAGTAAATGGAAATAAAAGTGAACACGTGAACGGCATTCAAAACGGATACAGCAGCTCCGAGAAGCACAAAAGCAGCCACAAATCATCTAGTAAAGATAAACATCGCGACAAAGAACGTGACCACAAAAGCTCGAAACATAGCAGCAGTTCAAGCAGAGACAAAGATAAGGATCGTCATTCAAGCAATAAAAATCATAGCAGCTCCCACAAGTCATCCAGCAGAGATAAAGAACGGGATAGGGAACACAGAGACGACAAACATAAGGATAGAGAAAGATCTGATAGAGAGAGAAGTGATAAAGATAGGGACAGGCACAAGAGTGACAAAGACAGACATAGGGAGAGGAGTGAAAAAGATAAAGATAGAAGTGAAAAAGATAAACATAAATCCAGTAATGGTGAGAAGAAGTCTTCAAAGGAACATTCATCTTCATCTAGGGACAAAGATAAATCAAAGGATACAGATAAACACAGGGACCATGATAAAGACAGAAGTTACAAAGAAAAACATAGAAGCGATAAAGACAGACACTCCAGCTCAAAGGAGAAACACAGTTCTAGTAAAAGTAAGGAAAAAAGTAGTTCTGAAAAAAATGACAAAGTCAAATTGGAAGAGGAGTATCGGGATTCATCGATGAAACAGGAATACATGGAGGTTGACGAACCAACAGTAAAGAGGGAAATGAAATCTGAAAGCGATGATGGCTATGGGGGGGCTCTTAATACGACTGTATCATCATGTGACTATTCACTATCACAGTTCAAAGATGAACCTTTGTCGGAGATGCCCCTTGAGGAAGACAGTGCATCTGGCGGAGAGGAAGATGTACCACTGTTAGAGCGTAAGGCAATCAAAAAAAGAGCTATCAGTGAGAGTGAAGAAGACACACCACTGTTACAACGGAAGAAACAGAAAAAGAAAGTGAAGAAAGAAAATTATGATGACTACGATGATGAAGAATCGCAACAGAAAAAGAAGGCGAAGAAAACGAAATCCACAAAAAGCATCAAGACTGAAGCTGATGATGGCCCGAGCCCCACCAAACGGAAGAAAAAAGAAAATGAAGAACAGGAAGTCTGGAAATGGTGGGAAGAATCAAAAACGGACGATGGAACTAAGTGGCATTTTCTTGAACACAAAGGTCCCCTGTTTGCACCTCTATACGAACCGCTGCCAGAAAACGTTAAATTTCGTTATGACGGTAAAATAGTGCGGCTGTCGCAGGACAGCGAAGAAGTGGCCGGTTTCTACGCCCGTATGTTAGACCACGACTACACTACTAAGACTGTGTTCAATACCAACTTTTTCAATGATTGGCGCAAAGTCATGACAAATGAAGAAGCGAAATTAATTAAAGATCTCTCAAAATGTGACTTTAAAGAAATGCAAACATATTTCCAAAGTGTGTCAGAAAAGAATAAGAATCGCAGTAAAGAAGAGAAAGCAGCACTCAAAGCAAAGAATGAGGAAATCCAAAAGGAATATGGTTTTTGTACTATTGATGGACATAAAGAAAAAATTGGTAATTTTAGAATAGAGCCACCCGGCCTCTTTAGGGGTAGAGGTGAACATCCCAAGATGGGAATGTTGAAGAGGCGTGTGATGCCAGAAGACGTAATAATTAACTGTTCAAAAGACAGTAAGATACCAAAACCACCGGCTGGTCACAAATGGAAGGAAGTTAGACACGACAACACTGTAACATGGTTGGCATCATGGACAGAAAATGTTCAGCAGCAAGCCAAGTATGTCATGTTAAATCCCAGCTCCAAATTGAAGGGCGAAAAGGATTGGCAGAAATATGAAACGGCAAGAAATTTGCACAAATGTATCGATAAAATTAGAGAAACATATAAATCAGATTGGAAAGCTAAAGAGATGCGCGTCCGTCAACGTGCTGTGGCTTTGTATTTCATTGATAGACTGGCTTTAAGAGCAGGTAATGAGAAGGATGA
TGACCAAGCTGATACAGTCGGTTGTTGTTCCCTCCGCGTTGAGCACATTGAATTGCACAAAGAGAAAGATGGAAAGGAATTTGTGGTTGTGTTTGATTTCCTCGGTAAAGACTCTATTAGATATTACAATGAGGTGCCAGTAGAAAAACGTGTTTTTAAGAATCTCGAGATTTTCATGGAAAATAAAAAGGATAGTGATGATCTGTTTGACAGATTGAACACGCAGACTCTGAATGAACATTTAAAAGAATTGATGCCAGGGCTGACCGCTAAAGTTTTCCGTACCTACAACGCGTCCATAACGTTACAAAGACAACTGGAAGAGCTCACCGACCCCGATGCAACCATACCTGAGAAAATATTAGCTTATAACCGGGCAAATCGAGCCGTCGCCATACTTTGTAACCATCAGCGCGCGGTCCCCAAAGGTCATTCAAAGTCAATGGAAGCATTGAAAGAAAAAATTCAAGCTAAAAGAGACCAGGTTGATGAGGCCGAGGCTGATTATAGAGATGCAGCGAAGGCAGCTAAACGAGGCTCGGTAAAAGAAAAGTTAGCTTGTGACAAGAAGAAAAAAGCGCTAGAGCGGTTAAAGGAGCAGTTAAAGAAATTGGAGCTCCAAGAAACAGATCGTGATGAAAACAAAACAATAGCCCTCGGAACCTCCAAACTCAACTACCTTGATCCGAGGATCTCAGTGAGCTGGTGCAAGAAACACGGTGTACCAATTGAAAAAATATACAATAAAACGCAACGTGATAAATTCCGATGGGCTATTGACATGGCCGGGCCCGACTACATTTTCTAG

Protein sequence:

>DPOGS211165-PA
MSVENPASDDGDSGKVNGNKSEHVNGIQNGYSSSEKHKSSHKSSSKDKHRDKERDHKSSKHSSSSSRDKDKDRHSSNKNHSSSHKSSSRDKERDREHRDDKHKDRERSDRERSDKDRDRHKSDKDRHRERSEKDKDRSEKDKHKSSNGEKKSSKEHSSSSRDKDKSKDTDKHRDHDKDRSYKEKHRSDKDRHSSSKEKHSSSKSKEKSSSEKNDKVKLEEEYRDSSMKQEYMEVDEPTVKREMKSESDDGYGGALNTTVSSCDYSLSQFKDEPLSEMPLEEDSASGGEEDVPLLERKAIKKRAISESEEDTPLLQRKKQKKKVKKENYDDYDDEESQQKKKAKKTKSTKSIKTEADDGPSPTKRKKKENEEQEVWKWWEESKTDDGTKWHFLEHKGPLFAPLYEPLPENVKFRYDGKIVRLSQDSEEVAGFYARMLDHDYTTKTVFNTNFFNDWRKVMTNEEAKLIKDLSKCDFKEMQTYFQSVSEKNKNRSKEEKAALKAKNEEIQKEYGFCTIDGHKEKIGNFRIEPPGLFRGRGEHPKMGMLKRRVMPEDVIINCSKDSKIPKPPAGHKWKEVRHDNTVTWLASWTENVQQQAKYVMLNPSSKLKGEKDWQKYETARNLHKCIDKIRETYKSDWKAKEMRVRQRAVALYFIDRLALRAGNEKDDDQADTVGCCSLRVEHIELHKEKDGKEFVVVFDFLGKDSIRYYNEVPVEKRVFKNLEIFMENKKDSDDLFDRLNTQTLNEHLKELMPGLTAKVFRTYNASITLQRQLEELTDPDATIPEKILAYNRANRAVAILCNHQRAVPKGHSKSMEALKEKIQAKRDQVDEAEADYRDAAKAAKRGSVKEKLACDKKKKALERLKEQLKKLELQETDRDENKTIALGTSKLNYLDPRISVSWCKKHGVPIEKIYNKTQRDKFRWAIDMAGPDYIF-
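
The lengths and sequences in this record can be cross-checked with a short script. What follows is a minimal sketch, not part of the original record. It assumes the standard genetic code, that the transcript listed here is coding sequence only (it starts with ATG and ends with TAG), and that the genomic coordinates are 1-based and inclusive. The helper names (`translate`, `expected_cds_length`, `genomic_span`) are hypothetical, and the two sequence excerpts are the first 21 and last 15 nucleotides of DPOGS211165-TA copied from above.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumption: the
# record uses the standard code; '*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str, stop_symbol: str = "-") -> str:
    """Translate a nucleotide string in frame 0; the stop codon is written
    as stop_symbol, matching the trailing '-' in the protein record."""
    codons = (nt[i:i + 3] for i in range(0, len(nt) - len(nt) % 3, 3))
    return "".join(CODON_TABLE[c].replace("*", stop_symbol) for c in codons)


def expected_cds_length(protein_aa: int, has_stop: bool = True) -> int:
    """Coding-sequence length in bp for a protein of protein_aa residues."""
    return 3 * (protein_aa + (1 if has_stop else 0))


def genomic_span(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record above.
TRANSCRIPT_BP = 2808
PROTEIN_AA = 935
GENOMIC_START, GENOMIC_END = 269287, 274264

# Excerpts of DPOGS211165-TA: first 7 codons and last 5 codons.
HEAD_NT = "ATGAGTGTTGAAAATCCAGCC"
TAIL_NT = "GACTACATTTTCTAG"

head_aa = translate(HEAD_NT)
tail_aa = translate(TAIL_NT)
cds_bp = expected_cds_length(PROTEIN_AA)
span_bp = genomic_span(GENOMIC_START, GENOMIC_END)

print(head_aa)                  # MSVENPA
print(tail_aa)                  # DYIF-
print(cds_bp == TRANSCRIPT_BP)  # True: 3 * (935 + 1) = 2808
print(span_bp - TRANSCRIPT_BP)  # 2170 bp of the genomic span not in the transcript
```

The translated excerpts match the start (MSVENPA...) and end (...DYIF-) of DPOGS211165-PA, and the 935 residues plus one stop codon account for all 2808 bp of the transcript. Under the coordinate assumption above, the gene spans 4978 bp on the scaffold, so about 2170 bp of that span is absent from the transcript, presumably intronic.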