Monarch geneset OGS2.0

DPOGS207474
Transcript: DPOGS207474-TA (3222 bp)
Protein: DPOGS207474-PA (1073 aa)
Genomic position: DPSCF300051 + 188876-195787
RNAseq coverage: 305x (Rank: top 37%)
Annotation
Source          Hit                E-value   Identity   Description
Heliconius      HMEL005244         7e-171    53.37%
Bombyx          BGIBMGA000946-TA   0.0       52.09%
Drosophila      mus210-PC          3e-132    55.84%
EBI UniRef50    UniRef50_D7F7K9    0.0       51.83%     Nucleotide excision repair protein n=2 Tax=Obtectomera RepID=D7F7K9_BOMMO
NCBI RefSeq     NP_001177140.1     0.0       51.83%     nucleotide excision repair protein [Bombyx mori]
NCBI nr blastp  gi|298160921       0.0       51.83%     nucleotide excision repair protein [Bombyx mori]
NCBI nr blastx  gi|298160921       0.0       50.70%     nucleotide excision repair protein [Bombyx mori]
Group
Gene Ontology    GO:0005634   2.1e-197   nucleus
                 GO:0003684   2.1e-197   damaged DNA binding
                 GO:0006289   2.1e-197   nucleotide-excision repair
KEGG pathway     tca:664448   3e-157
                 K10838 (XPC) maps-> Nucleotide excision repair
InterPro domain  [143-1052]   IPR004583   2.1e-197   DNA repair protein Rad4
                 [936-1010]   IPR018328   1.1e-29    DNA repair protein Rad4, DNA-binding domain 3
                 [700-815]    IPR018325   6.6e-21    DNA repair protein Rad4, transglutaminase-like domain
                 [820-871]    IPR018326   1.5e-18    DNA repair protein Rad4, DNA-binding domain 1
                 [873-929]    IPR018327   7.2e-12    DNA repair protein Rad4, DNA-binding domain 2
Orthology group  MCL13327     Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207474-TA
ATGCCAACGACAAGAAAGAAAGTAATTAAAACCGTATACAAAGATGAGGAAGATAGCAATGAAGGAGGGGATTTCAGCGATTCTGGCTCTGATGCAGTAATAACCGAACAGTCCAGTTCTGAAGAGGACGGCGATGAGAAATCACATCATTCTTCCGACGACGATTTTATAAGCAAAAAACCGAAATCCAAAGGCAGAAATCAGGATTCTAAAGTAAGAAAGAAGAAAACGAAATTTACTAAAACATTTCTAGAAAGAATCTCTAAACAAGAATCTGATGAACAAGTTGAGGCACCACCAATTTCTGTCAAAGATCTAACCGAAGCAGATAAGCTTTTACCATCATTCCTTAATCTTTCTGAAAGTGAGAGTCATTCTGTGATACCGCAACAAGGATTACAGCTGGTAGTTGATATTCCGGGAATGGTTAAGAAAAAAACTAAAAAGCTCGACGTTGAAATGATGCTAAAGAGGAAAATGAACCGAGTGAAGAAACAATATCAAGTTTTCATGCACAAGGTTCATGTGTTGTGTTGGTTGGGACATGGTAATTATGTCAGCCAGGTCCTAAACGATCAAGAGGTGTTGGCTGAAGCATTGTCTTTGGTACCATCCAAAGAATGCTATCCCGGAGAAAGAGTGGATATGAAATATGTCGAACAAATAACCACATGGTATAAAGATAAAGTGAGTATACATCAGGACAAACATGAGAACAAATTTAGACCCAAAGCCCCACCGTTAAAGTCTATTCTTCTGCAACAAATGAAGAAACGGGTGTTCAGCACGAAAAAATATATGGTTTTCGTCTTTGTTTCCATGTTGCGTAGTCTAGGACTACAGTGCCGTGTTATGTTTAATTTCGTTACATTGCCTATAAGGCCACCCGTTTCGGAATTGTGTTCACTCTCAACCAAAGTGAAGGGCACAGAAGATGCAAAGAAAAACGACCAGAAAACAAAATCACCAAGAAAATCAACTAAGAGTAAAAGCAAGAGAGATGTTATTCCACAGTTGGATGGAAATTATGACGTAATTGAAAGTGATGATGGAAATATTATGCAAGTTGATGGTGGTGATGATACAACCACAGCTAGAACGAGAAGACAACGGCTTTCATTAAGAAAAGTGAAACAAGCAAATGATGTTAAGAAACCAGAGGAGGTTATCAGCCCAACTAAAATAACAAAAAAGAATCTCAGTTTGAAAATAGAAAATAAGACTGTGGGTGAAACAAAGCGAAAACCAACGACAAGGACCAAAAGAAACTTGAGATTGCAATCAAAAAATACAAAAACAACAATACATGAAACTAAACAGTCAGAAAGTCTTTCTAATAAAAAAGATGATATAAAAGCCACTAAAAGTAAACGAAAAATATTAAGTTTAAATCTATCAGTTACCAAAAACGACCAAGATAACAAAACCAAAAGTACATCAAAAACGATAGTTAATAAATGCTCTGCAAACAGAATTACTAGAGCAAACATTACTTCACTCAACGAATCAAACGCAATACTTACTAAGACTTTATCAAAACAATCGTCCTTAGATAAAGTTCCTAAGATCATCCTTACAGACATAAACGATCAAACTGTATCGAGTAAATTCTTTGAGAAGTCGCCCACCAAAAGAACGTCAAGAAAACGATCACAAACAACGGAACCAAAGAAATCGCCGAATGAAATGTCAAATGCACGAACGAGAAGTGCACACGCAACAGAAAGCAAGTATTTTGCTCCTGAAACCGATAAAAGTCCCGCCAAAAGATGTAGAACAACTAGAAAAATTGATTCTGATGATTCAAAAAGAGTAAGTCATAGAGATCTCGCGAAAAAGAATGTCCAAGATCTCCAATCTCCAAAGATCTCCAAACCCAAAAATGATGTCACGAAAGATCTCGTTCACATTATCAAGGGAAGGGTAAAGGAGGCCAAAACGGATGCAAAAAAACGTATTGTAAAAGGAAAAGAAAAACATGAATCTGATTCTGATAGCGA
TCATCTGGCCGTTGAATCTCCAGCTCCCCGTAAATCTGAGAGCGACGAAGACTTTAAAGTGGAAAAAGTTACACCTAAACAAAAGAAGCCAGTTAAGAAAATAGACCGATGTGTTATATCAGCAGATGATGAGATGCCTTTGAATAAAATTAATGTGTGGTGCGAGATATATGTTGAAGAATTAGAAGAGTGGGTTCCCGTTGACGTTGTTAGAGGCATAGTTCATTCTGCCAATGAATTATATAGTCGTTCGACACACCCTGTATCATACATTGTTGGTTGGGACAACAATAATTACTTAAAAGATCTGACAAGACGCTACGTGCCATATTGGAACACAGTTACACGTAAACTGAGAGTTGATCCTGGATGGTGGGAAGAAGCGATAAAGCCGTGGTTGGGACCAAAAACCGCCAGGGACAGGGAAGAGGATGAAAGATTGCACAGAATGCAACTAGAAGCGCCATTGCCCAAAGTTATATCCGAATACAAAAACCACCCTCTATATGTGTTGAAACGTCATCTCCTCAAGTTCGAAGCCATATATCCGCCTGATGCTGAAACCCTTGGCTTCGTTCGCGGGGAGCCCGTTTATCCGAGGGATTGTGTTTACATTTGCAAGTCGAGGGATGTTTGGATCAAGGATGCCAAAGTAGTTAAACTCGGAGAACAGCCATACAAGATAGTTAGAGCTCGTCCCAAATATATAAGAGCCACAAACACCTTTATAACTGATCGACCTCTGGAAATCTTTGGGCCATGGCAGACACAGGATTATGAACCTCCGACTGCAGAAAATGGAATTGTTCCACGGAATCCTTATGGAAATGTTGAATTGTTCAAAAAATGCATGCTACCGAAAGGCACTGTCCATATCAATTTACCAGGTTTACAACGAGTTGCTAAGAAATTGAATATTGATTGTGCTCCAGCGTTAACAGGATTTGACTGCAATGGTGGCTATGTCCACCCTGTATATGAAGGCTTTGTAGTCTGTGAAGAGTTTGAAAAGGTTCTCACGGAAGCTTGGCTTCAGGATCAAGAAGAGTTGGAACGTAAAGAACAGGAAAAAGTAGAAACCCGAGTGTACGGAAACTGGAAGCGGCTTATAAGAGGACTTATCATAAAAGAACGACTAAAAGCCAAATATGGATTTGCAGAGCCCAGCACATCTCAGGATAAAAAGAAGAAAGGCCCAAAACTTGTTGTGAAGAAAAAATAA

Protein sequence:

>DPOGS207474-PA
MPTTRKKVIKTVYKDEEDSNEGGDFSDSGSDAVITEQSSSEEDGDEKSHHSSDDDFISKKPKSKGRNQDSKVRKKKTKFTKTFLERISKQESDEQVEAPPISVKDLTEADKLLPSFLNLSESESHSVIPQQGLQLVVDIPGMVKKKTKKLDVEMMLKRKMNRVKKQYQVFMHKVHVLCWLGHGNYVSQVLNDQEVLAEALSLVPSKECYPGERVDMKYVEQITTWYKDKVSIHQDKHENKFRPKAPPLKSILLQQMKKRVFSTKKYMVFVFVSMLRSLGLQCRVMFNFVTLPIRPPVSELCSLSTKVKGTEDAKKNDQKTKSPRKSTKSKSKRDVIPQLDGNYDVIESDDGNIMQVDGGDDTTTARTRRQRLSLRKVKQANDVKKPEEVISPTKITKKNLSLKIENKTVGETKRKPTTRTKRNLRLQSKNTKTTIHETKQSESLSNKKDDIKATKSKRKILSLNLSVTKNDQDNKTKSTSKTIVNKCSANRITRANITSLNESNAILTKTLSKQSSLDKVPKIILTDINDQTVSSKFFEKSPTKRTSRKRSQTTEPKKSPNEMSNARTRSAHATESKYFAPETDKSPAKRCRTTRKIDSDDSKRVSHRDLAKKNVQDLQSPKISKPKNDVTKDLVHIIKGRVKEAKTDAKKRIVKGKEKHESDSDSDHLAVESPAPRKSESDEDFKVEKVTPKQKKPVKKIDRCVISADDEMPLNKINVWCEIYVEELEEWVPVDVVRGIVHSANELYSRSTHPVSYIVGWDNNNYLKDLTRRYVPYWNTVTRKLRVDPGWWEEAIKPWLGPKTARDREEDERLHRMQLEAPLPKVISEYKNHPLYVLKRHLLKFEAIYPPDAETLGFVRGEPVYPRDCVYICKSRDVWIKDAKVVKLGEQPYKIVRARPKYIRATNTFITDRPLEIFGPWQTQDYEPPTAENGIVPRNPYGNVELFKKCMLPKGTVHINLPGLQRVAKKLNIDCAPALTGFDCNGGYVHPVYEGFVVCEEFEKVLTEAWLQDQEELERKEQEKVETRVYGNWKRLIRGLIIKERLKAKYGFAEPSTSQDKKKKGPKLVVKKK-
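
Consistency check (illustrative sketch):

The transcript and protein records above can be checked against each other. A 1073 aa protein plus one stop codon accounts for 3 x 1074 = 3222 bp, which matches the transcript length. The Python sketch below is not part of the original record. It assumes the standard genetic code (NCBI translation table 1) and writes the stop codon as "-", as in the protein record above. The function name translate and its stop_symbol parameter are hypothetical, chosen only for this illustration.

```python
# Standard genetic code (NCBI translation table 1), with codons ordered
# by base T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 0, codon by codon.

    Stop codons are written as stop_symbol; trailing bases that do not
    fill a complete codon are ignored.
    """
    cds = cds.upper()
    residues = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Length bookkeeping for this record: 1073 residues plus one stop codon.
expected_bp = 3 * (1073 + 1)

# First seven codons of DPOGS207474-TA and the final six codons.
start_peptide = translate("ATGCCAACGACAAGAAAGAAA")
end_peptide = translate("GTTGTGAAGAAAAAATAA")
```

Applied to the full DPOGS207474-TA sequence, with the FASTA line break removed, the same function should reproduce the DPOGS207474-PA record if the annotation is consistent.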