Monarch geneset OGS2.0

DPOGS208542
Transcript        DPOGS208542-TA   2634 bp
Protein           DPOGS208542-PA   877 aa
Genomic position  DPSCF300064 + 871798-877892
RNAseq coverage   200x (Rank: top 47%)
Annotation
Heliconius        HMEL008754         0.0      76.90%
Bombyx            BGIBMGA005180-TA   4e-108   61.98%
Drosophila        mei-9-PA           0.0      50.17%
EBI UniRef50      UniRef50_B4NP89    0.0      48.95%   GK17500 n=7 Tax=Diptera RepID=B4NP89_DROWI
NCBI RefSeq       XP_002011208.1     0.0      50.17%   GI16132 [Drosophila mojavensis]
NCBI nr blastp    gi|195133562       0.0      50.17%   GI16132 [Drosophila mojavensis]
NCBI nr blastx    gi|157126764       0.0      49.78%   DNA repair endonuclease xp-f / mei-9 / rad1 [Aedes aegypti]
Group
Gene Ontology     GO:0006281   4.3e-250   DNA repair
                  GO:0004519   4.3e-250   endonuclease activity
                  GO:0003676   4.3e-250   nucleic acid binding
                  GO:0003677   2.8e-51    DNA binding
                  GO:0006259   2.8e-51    DNA metabolic process
                  GO:0004518   2.8e-51    nuclease activity
KEGG pathway      dmo:Dmoj_GI16132   0.0
                  K10848 (ERCC4, XPF)   maps-> Nucleotide excision repair
InterPro domain   [93-854]    IPR006167   4.3e-250   DNA repair protein
                  [628-775]   IPR020819   2.8e-51    DNA repair nuclease, XPF-type/Helicase
                  [626-768]   IPR011335   2.6e-37    Restriction endonuclease, type II-like
                  [630-710]   IPR006166   5.9e-27    ERCC4 domain
                  [782-864]   IPR010994   7.3e-13    RuvA domain 2-like
Orthology group   MCL11567   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208542-TA
ATGGATGGAAATCAAAATATGGAAGAGTTATCTCAAGATATTGAGACAAAATATCCTCTACTTAATTATGAAAAACAAATTTTTACAGACATATTTGAAAACGATGCCCTCTTAATTATGGCCAAAGGACTAAGTTACAATAGCATTGTTTCCAATTTATTGTGGGTTTATAAGGATCCGGGAAATTTAGTATTAATAATAAATAGCAGCGACCATGAAGAAAAATTTTTCAGTGAGAAATTCAATTTATCAGTTTTACCAAATTTGGGGTCGGAGAGGGAAAAAGCTTATTTAGAGGGTGGCATATATTTTGTGTCGACAAGAATACTAGTAGTTGATTTATTAAAAAAGCGAGTACCGGTAACCCACATAACGGGGATCATTGTGTTGAGGGCTCACACAGTACTGGAGTCATGTCAAGAGGCTTTCGTTCTCAGGTTATTTAGACAAAATAATAAGACAGGTTTTATAAAAGCATTTTCAAACTCACCTATATCATTCACGTTTGGCTATCATCAAGTGGAAAAAGTTATGAGAGCTGTTTTCGTCAAAGAATTATATTTATGGCCCAGATTCCATGGTTTAGTCATTAAAAGTCTTAAAAGTAGGCAAGCTGAGGTGGAAGAGATACATGTTTCGTTGACGCCGAATATGTTACAAATACAAACGTGTCTTCTAGACATCATGAATTATACAGTGAAACAACTTAAGAGCATTAACAGAAATCTGGACATGCAAGAAATTACTACAGAAAATTGCATAACCAAAAAGTTCCATAAAATATTACAATCCCAACTCGACTGTGTGTGGCATCAGCTCAGCAACAGAACAAAGGAATACATACAGGATTTGAAAGTTTTAAGGACTTTGATAGTTAATCTAATCCACGAGGATTCTGTATCCTTCTATTGGCTGGTCAGTAAGTATCGTACATCAGAGTATGCTCAGGTCAACTCGGGTTGGATTCTGTTGGACTCGGCTGAAAGGCTGTTCAAGGTAGCAAGATCTAGAGTCTTCAACGGCAATGAATTTGATCCTGAACACTGTCCTAAATGGAAGTGCTTGAGTGATTTATTAAAAGTTGAAATACCAGAGGAAGTGAAAAAGAAGAATAACGGGTCGCTTAATAATACCAAAGTGCTTATACTGTGTGAGAACAATAAGACGTGTTTCCAACTCAACAACGTCTTGACTATGGGGGCCAATAGGTATCTATTCTATAATGCATTCCGAAGAGACATACAGATTACATCCGTTTCATCCAAATATAAAAATCTCACAGAAGAATTTCCAACAAAATCTGACAAAGATCGAGTTAAAGACAATAAAGATGATAGTAATTTGGATGAAGAATTGGATGAAGTCAAATCAAATTATATGTTAACTTTGACACAGGCTATGGAAATGAATAAAAATACAACAGAAGACAGTATGTTTGAACCCATCTCACAGTTGGCTAACTTAGATCTCACTCAATTGGAAAATGAAAAACCTTTGATATGTATACAAACGTTCAAGCAGAATGGAAACCATTTCTCGTTAGAGCAGACGTTAGAGGCGTTGAGGCCGGAATACATCATACTGTATCAGAGCGACGTGTCGGCCGTGCGACAGATAGAACTGTATGAATGCAAGAAAAAACCCGAGGAGCCGAAGTCTAAAATCTATTTCCTTATACACGACAAAACCGTGGAGGAACAGTCGTATCTGACGTCACTGAGGAGGGAGAAGCAGGCGTTTGAAATGCTCATTCAAGCTAAGAGTGTGATGGTGGTACCTTCGTATCAGGACGGGAGAACTGATGAGTATTTTAATTTAAATGTCGAAGAAAACGAATCGGCCGTAAACACGAGGAAAGCAGGCGGTCAAGTGTCGTCAGTGGCGCCGCGTGTTATAGTGGACATGCGTGAGTTCCGTTCCGACCTGCCAGCGCTTCTTCACCGCCGTGGGATCAACATAGATCCCGTCACCATCGCGATAGGCGACTATATTCTGACGCC
GGATATATGCGTAGAGAGGAAATCAATCTCGGATTTGATTGGATCGCTCAACTCCGGTCGTTTGTACACACAATGTACACAGATGTGTCGGAACTACTCCAGACCCATACTACTTATTGAGTTTGATCAGAATAAACCGTTCAATTTACAGGGTAACTTTGTTGTATCTACGGATATATCGGGTGCTGACATACAGCAAAAATTACAGCTACTTACAATACATTTCCCGCGTCTAAAGCTAGTGTGGTCGCCGAGCCCTTATGCTACAGCTGAACTATTCTATGAACTCAAAGAAGGTAGGAAGAACCCTAACGTAGACGAAGTTGTAGCGCTCAGTGGTGAAAACACAGCTGATGATATGAACTACGAGAGATATAATATCGTTGTTCACGATTTCGTCCAGAAACTCCCAGGAGTCACGTCCAAGAATATATCCAGGATCATGAACAGGGGCATTTCGTTGGACCATCTAATTACCCTGACACAGGATCAATTACAGGAAATAGTAGAAAATAAGAACGAAGCTGAAATAATATATTCAGTATTACATGTAAAAGCAAAGCCAGTTGACTCGGACAAAACAAAGCCTTTTGGTAAAAGGAAGTTGGGGGGGAAATTTAAAAGCAATAAATGA

Protein sequence:

>DPOGS208542-PA
MDGNQNMEELSQDIETKYPLLNYEKQIFTDIFENDALLIMAKGLSYNSIVSNLLWVYKDPGNLVLIINSSDHEEKFFSEKFNLSVLPNLGSEREKAYLEGGIYFVSTRILVVDLLKKRVPVTHITGIIVLRAHTVLESCQEAFVLRLFRQNNKTGFIKAFSNSPISFTFGYHQVEKVMRAVFVKELYLWPRFHGLVIKSLKSRQAEVEEIHVSLTPNMLQIQTCLLDIMNYTVKQLKSINRNLDMQEITTENCITKKFHKILQSQLDCVWHQLSNRTKEYIQDLKVLRTLIVNLIHEDSVSFYWLVSKYRTSEYAQVNSGWILLDSAERLFKVARSRVFNGNEFDPEHCPKWKCLSDLLKVEIPEEVKKKNNGSLNNTKVLILCENNKTCFQLNNVLTMGANRYLFYNAFRRDIQITSVSSKYKNLTEEFPTKSDKDRVKDNKDDSNLDEELDEVKSNYMLTLTQAMEMNKNTTEDSMFEPISQLANLDLTQLENEKPLICIQTFKQNGNHFSLEQTLEALRPEYIILYQSDVSAVRQIELYECKKKPEEPKSKIYFLIHDKTVEEQSYLTSLRREKQAFEMLIQAKSVMVVPSYQDGRTDEYFNLNVEENESAVNTRKAGGQVSSVAPRVIVDMREFRSDLPALLHRRGINIDPVTIAIGDYILTPDICVERKSISDLIGSLNSGRLYTQCTQMCRNYSRPILLIEFDQNKPFNLQGNFVVSTDISGADIQQKLQLLTIHFPRLKLVWSPSPYATAELFYELKEGRKNPNVDEVVALSGENTADDMNYERYNIVVHDFVQKLPGVTSKNISRIMNRGISLDHLITLTQDQLQEIVENKNEAEIIYSVLHVKAKPVDSDKTKPFGKRKLGGKFKSNK-
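
Consistency check (sketch):

The transcript and protein records above can be checked against each other: 2634 bp is 878 codons, which is 877 amino acids plus one stop codon. The following is a minimal Python sketch of that check, assuming the standard genetic code and assuming that the trailing "-" in the protein record marks the stop codon. The names CODON_TABLE and translate are illustrative only and are not part of the database.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (64 codons in total).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1.

    Whitespace and line breaks are removed first, so a sequence that is
    wrapped over several lines can be passed in directly. Stop codons are
    written as `stop_symbol` (assumed here to match the record's "-").
    Codons containing non-ACGT characters are written as "X".
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 27 and last 12 nucleotides of DPOGS208542-TA, copied from above.
    print(translate("ATGGATGGAAATCAAAATATGGAAGAG"))
    print(translate("AGCAATAAATGA"))
    # Length relation between the two records: 877 residues plus one stop.
    print(2634 == 3 * (877 + 1))
```

Passing the full nucleotide sequence (both wrapped lines together) to translate should reproduce the DPOGS208542-PA record if the assumptions above hold.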