Monarch geneset OGS2.0

DPOGS213238
Transcript        DPOGS213238-TA  2547 bp
Protein           DPOGS213238-PA  848 aa
Genomic position  DPSCF300124 - 440499-449598
RNAseq coverage   198x (Rank: top 47%)
Annotation
Heliconius      HMEL013421        0.0  93.04%
Bombyx          BGIBMGA009517-TA  0.0  81.72%
Drosophila      Top3beta-PA       0.0  64.15%
EBI UniRef50    UniRef50_O95985   0.0  63.92%  DNA topoisomerase 3-beta-1 n=100 Tax=Eumetazoa RepID=TOP3B_HUMAN
NCBI RefSeq     XP_001606777.1    0.0  71.86%  PREDICTED: similar to prokaryotic DNA topoisomerase [Nasonia vitripennis]
NCBI nr blastp  gi|383857261      0.0  71.95%  PREDICTED: DNA topoisomerase 3-beta-1-like [Megachile rotundata]
NCBI nr blastx  gi|307184245      0.0  71.95%  DNA topoisomerase 3-beta-1 [Camponotus floridanus]
Group
Gene Ontology   GO:0003677     1.5e-95  DNA binding
                GO:0005694     1.5e-95  chromosome
                GO:0006265     1.5e-95  DNA topological change
                GO:0003916     1.5e-95  DNA topoisomerase activity
                GO:0003917     1.4e-12  DNA topoisomerase type I activity
KEGG pathway    nvi:100117184  0.0
                K03165 (TOP3)  maps-> Homologous recombination
InterPro domain  [1-847]    IPR000380  0         DNA topoisomerase, type IA
                 [2-634]    IPR023405  1.2e-149  DNA topoisomerase, type IA, core domain
                 [169-581]  IPR013497  1.5e-95   DNA topoisomerase, type IA, central
                 [288-544]  IPR003602  1.6e-74   DNA topoisomerase, type IA, DNA-binding
                 [488-627]  IPR013824  8.3e-43   DNA topoisomerase, type IA, central region, subdomain 1
                 [145-241]  IPR003601  8.4e-34   DNA topoisomerase, type IA, domain 2
                 [297-420]  IPR013826  5.3e-27   DNA topoisomerase, type IA, central region, subdomain 3
                 [3-137]    IPR006171  3.5e-19   Toprim domain
Orthology group  MCL12894  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213238-TA
ATGAAGACAGCATTAATGGTGGCTGAAAAGCCGTCCCTGGCTCAAAATCTAGCAAATATTCTCAGTAATGGAAAATGCAATACCAACAAGGGCTCTAATTCAGCTTGCGCAGTTCATGAGTGGACAGGTACCTTCAAAAACGAACCTGTGAAATTTAAAATGACTTCAGTGTGTGGTCATGTGATGAGCTTAGATTTCACTGGCAAATATAATAATTGGGATAAAGTAGATCCCGTTGAACTGTTCATATGTCCTACAGAGAAGAAGGAAGCAATGCCAAGACTTAGGATTCCCGCTTTCCTAGCACAGGAGGCTAGAGGATGTGATTATCTCATTCTTTGGTTGGATTGTGATAAAGAAGGGGAAAATATATGTTTTGAGGTTATGTCCTGCGTTCAAAACTACATGAAAGGTGACGTATACTCACCAGCAGTGACATTTCGGGCGCGATTTTCAGCCATCACAGATAAAGATATTAAAACAGCCATGATGAATCTGGTTAGACCAAATGAAAGCGAATCTCGAAGTGTTGACGCCAGACAGGAACTAGATTTGCGTATCGGATGTGCCTTCACGAGATTCCAGACGAAGTATTTTCAAGGTCGCTACGGTGATTTGGACGCGTCTCTCATATCGTACGGTCCCTGCCAGACTCCGACACTCGGATTCTGTGTCCAACGCCACGATGACATCCAGACCTTCAAACCGGAAACCTATTGGGTGTTGAGAGTGACCGCCTCCACCTCCGAGGGCAGAGAGCTCCCGCTTGAATGGAAACGTGTCAGGAGCTTCGAAAAGGACATAGCTAACATGTTTCTGGTCGGCATCAAGGAATTCAAAGAGGCCACAGTTGTTAATATCCAAGCTAAAGAGAAGATAAAGTCCAGACCGACCGCTCTCAACACTGTTGAGTTGATGAGGGTGGCCAGTGCTGGTCTCGGTATGGGACCACATCACGCTATGCAGATTGCTGAACGTCTGTACACTCAAGGTTATATATCATATCCTAGAACAGAGACGACTAGTTATGGAGAGAATTTTGATCTCATTGGTAGTCTTCGTCAACAACAGAATTCTAACAAGTGGGGTTCTGAGGTACGAGCTTTACTGGCTAATGGTATCAATAAGCCCAAGAAGGGCCACGACGCGGGTGACCATCCACCGATCACTCCTATGAAGCCTGCCTCCGAATCCGAGCTGGAGGGTGACATGTGGCGTATATACGACTACATCACGCGGCATTTCATAGCGACACTGTCGCGCGACTGCCGCTACCTCAGCACGACCCTTACCTTCAGCGTGGGCTCCGAGACGTTCTATTACACTGGCAATACTCTGGTCGACGCTGGCTACACTGAGATCATGCATTGGCAGGCTTTCGGTAAGGATGAGTTCGTCCCAGTACTGAAGGTGGACGAGGTGCTTCGGGCACACGACCACCGCCTCGTGGAGTGTCAGACCTCGCCCCCGGACTACCTCACCGAGTCTGAGGTGATAACTCTGATGGAGAAGCACGGGATCGGCACGGACGCGTCCATACCTGTCCACATCAATAACATCTGTCAGAGGAACTACGTGAGCGTCGGCAGCGGGCGGCGGCTCGTGCCCACCAGCCTGGGCGTCGTGCTCGTACATGGATATCAGAAGATCGACCCGGAGCTAGTGTTACCGACGATGCGATCGGCCGTCGAGGAACAGCTCAACCTCATCGCAATCGGTCGAGCCGATTTCCACGCGGTGTTGACTCACACCACGGAGATCTTCAGGCGGAAGTTCCAATACTTCGTGAGGTCCATAGAGGCCATGGACCAACTGTTCGAGGTCAGCTTTTCGTCGCTCAAGACCAGCGGCAAGGCGCTGTCCCGCTGCGGCAAGTGCAGGAGATACATGAGATACATACAGGCGAAGCCCGCCCGCCTGCACTGCTCCCACTGTGACGACACCTACACGCTGCCCCAGCACGGCACGGTCCGCATTTACCGCGAGCTGAAGTGTCCTCT
GGACGACTTCGAGCTGCTGTCCTGGTCCACCGGCAGCAAAGGGAAGAGCTTCCCGCTCTGCCCTTACTGCTACAATCACCCACCATTCAGGGATATGAAGAAGGGCTTCGGCTGTAACTCCTGCACTCACCCCACTTGTCCCTACGGCGTGAACTCCACCGGCGTCTCCGGCTGTGTCGAATGTGATGGAGTTTTAGTTTTGGATCCCTCGGCGCCGAAGTGGAAGCTGGCGTGTAACCGTTGTGACGTCATCATAAACGTGTTCGAGGACGCGAGCCGCGTGTCCGTGTGCGAGGCGGCGTGCGCGTGCGGCGCTCAGTTAGTGTGCGTCGAGTACCGCGCCGAGCGGACCAAGCTGCCGGCCGCGCTCACCGAGATGACCGCCTGCCTTTACTGCGAGCCGGCTTTCAGCGCGCTTGTGGAGAAGCATCGTGCGGTGGCGCCCCGGAGCGGAGGATCGCGAGGACGGAGCGCCAGGGGCAGAGGGAAACATCGCAACAAACAACCCAAAGACAAAATGGCCCAATTAGCGGCGTATTTCGTATAA

Protein sequence:

>DPOGS213238-PA
MKTALMVAEKPSLAQNLANILSNGKCNTNKGSNSACAVHEWTGTFKNEPVKFKMTSVCGHVMSLDFTGKYNNWDKVDPVELFICPTEKKEAMPRLRIPAFLAQEARGCDYLILWLDCDKEGENICFEVMSCVQNYMKGDVYSPAVTFRARFSAITDKDIKTAMMNLVRPNESESRSVDARQELDLRIGCAFTRFQTKYFQGRYGDLDASLISYGPCQTPTLGFCVQRHDDIQTFKPETYWVLRVTASTSEGRELPLEWKRVRSFEKDIANMFLVGIKEFKEATVVNIQAKEKIKSRPTALNTVELMRVASAGLGMGPHHAMQIAERLYTQGYISYPRTETTSYGENFDLIGSLRQQQNSNKWGSEVRALLANGINKPKKGHDAGDHPPITPMKPASESELEGDMWRIYDYITRHFIATLSRDCRYLSTTLTFSVGSETFYYTGNTLVDAGYTEIMHWQAFGKDEFVPVLKVDEVLRAHDHRLVECQTSPPDYLTESEVITLMEKHGIGTDASIPVHINNICQRNYVSVGSGRRLVPTSLGVVLVHGYQKIDPELVLPTMRSAVEEQLNLIAIGRADFHAVLTHTTEIFRRKFQYFVRSIEAMDQLFEVSFSSLKTSGKALSRCGKCRRYMRYIQAKPARLHCSHCDDTYTLPQHGTVRIYRELKCPLDDFELLSWSTGSKGKSFPLCPYCYNHPPFRDMKKGFGCNSCTHPTCPYGVNSTGVSGCVECDGVLVLDPSAPKWKLACNRCDVIINVFEDASRVSVCEAACACGAQLVCVEYRAERTKLPAALTEMTACLYCEPAFSALVEKHRAVAPRSGGSRGRSARGRGKHRNKQPKDKMAQLAAYFV-
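
Length check:

The transcript length (2547 bp) and protein length (848 aa) in this record agree with each other if the transcript is entirely coding and ends in a single stop codon. The nucleotide sequence ends in TAA and the protein sequence ends in "-", which fits that reading. A minimal sketch of the arithmetic follows. The variable names are illustrative and not part of the record, and treating the genomic coordinates as 1-based and inclusive is an assumption.

```python
# Values copied from the record for DPOGS213238.
transcript_bp = 2547
protein_aa = 848
start, end = 440499, 449598  # positions on scaffold DPSCF300124

# Assumption: the whole transcript is coding sequence ending in one stop codon.
codons, remainder = divmod(transcript_bp, 3)
coding_codons = codons - 1  # exclude the terminal stop codon

# Assumption: coordinates are 1-based and inclusive.
genomic_span = end - start + 1

print("leftover bases:", remainder)
print("coding codons:", coding_codons, "(record lists", protein_aa, "aa)")
print("genomic span (bp):", genomic_span)
```

Under these assumptions the genomic span is longer than the transcript, which would be expected if the gene contains introns.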