Monarch geneset OGS2.0

DPOGS214746
Transcript: DPOGS214746-TA  3387 bp
Protein: DPOGS214746-PA  1128 aa
Genomic position: DPSCF300022 + 678318-687773
RNAseq coverage: 735x (Rank: top 18%)
Annotation
Heliconius: HMEL012076  0.0  88.67%
Bombyx: BGIBMGA004740-TA  0.0  78.99%
Drosophila: pic-PA  0.0  62.34%
EBI UniRef50: UniRef50_Q16531  0.0  64.14%  DNA damage-binding protein 1 n=82 Tax=Coelomata RepID=DDB1_HUMAN
NCBI RefSeq: XP_001607743.1  0.0  68.39%  PREDICTED: similar to DNA repair protein xp-e [Nasonia vitripennis]
NCBI nr blastp: gi|307186138  0.0  68.97%  DNA damage-binding protein 1 [Camponotus floridanus]
NCBI nr blastx: gi|307205760  0.0  69.12%  DNA damage-binding protein 1 [Harpegnathos saltator]
Group
Gene Ontology: GO:0005634  2.3e-81  nucleus
               GO:0003676  2.3e-81  nucleic acid binding
KEGG pathway: nvi:100114761  0.0
              K10610 (DDB1) maps-> Ubiquitin mediated proteolysis
                                   Nucleotide excision repair
InterPro domain: [780-1087] IPR004871  2.3e-81  Cleavage/polyadenylation specificity factor, A subunit, C-terminal
Orthology group: MCL13132 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214746-TA
ATGGCTTATCATTACGTAGTTACCGCACAGAAGCCTACAGCAGTTATATCATGTATCACAGGAAATTTTACATCACCTACGGATCTGAACCTTCTAGTGGCGAAGGTGTCTCGCCTGGAGATGTACCTAGTAACTCCAGAGGGACTGAGACCTATGAAGGAGGTTGGGCTGTATGGGAGGGTGGCTAAGATGAAATTATTTAGACCACCGTATGAGCAAAAAGATTTAGTATTCATACTGACGGCTCGTTACAATGCTATGATACTGGAATGGAGGACAGGGGCTAACGGGGAGCTGGAGGTAGTCACCAGAGCTCATGGCAATGTTGCCGACCGTATCGGCAAACCATCGGAGAACGGAATTCTGGCAGTCATAGACCCACAAGCCAGAGTGATCGGACTCAGGCTATATGATGGATTATTTAAAATAATACCACTGGATAAAGATTCTACTGAGCTCAAAGCTGCTAGTTTAAGATTAGAAGAGCTGAATGTGTACGACTTAGAATTTCTGCACGGATGCTCAAATCCAACATTAATTTTAATTCATCAGGATCTCAATGGAAGACATATTAAGACCCATGAGATTAATTTAAGGGACAAAGAATTCATGAAGATACCATGGAAGCAGGACAATGTGGAGACAGAGGCTTCAATTCTCATTCCAGTTCCAAGTCCACTTGGTGGTGCTATTGTGATTGGTCAAGAATCTATTGTGTATCATGACGGACAAAGTTATGTAGCAGTTGCACCGCCACAGATAAAGACCCCTATCAACTGCTACTGCCGCGTGGACGTTCGCGGTCTGCGCTACTTGCTGGGCGACATCGCCGGCCGCCTATTCATGCTGTTGTTGGAACTGTCGGAGCGAGATGGCACAGCCTCTGTCAGGGACCTCAAAGTTGAACTGCTCGGTGATATCCCGATACCCGAGTGTATGACTTATTTGGACAACGGCGTGGTGTTCGTGGGGTCTCGCTTGGGGGACAGCGCCCTGGTCCGGCTGGCCGCGGTGAGGGACGACGCCTCGCAGTACGTGCAGCCCATGGAGACCTTTACCAGCCTCGCGCCCATCGTCGACATGTGCGTCGTGGACCTCGAGCGCCAGGGACAGAACCAACTCATCACGTGCTCCGGTGCGTTCAAGATGGGTTCGCTGCGTATAATACGGAACGGGATCGGCATCCAGGAGCAGGCGTCCATAGACCTGCCCGGCATCAAGGGCATGTGGGCGCTCACACTCGGCCAGGGACCGCACCACGACACCCTCGTACTGTCCTTCGTGGGACAGACTCGTGTGCTGACTCTAAACGGCGAGGAGGTGGAGGAGACAGAGATAAAGGGTTTCGTGTCGGACAGACAGACATTCTTCACCGGGAACGTGTGCCACGACCAGCTGATCCAGGTCACCGACGAGGGTATACGACTCATAGGACGCGGGCCGGGTGGCTGGAACGGAGTCGCCGCCTGGGCCCCCGCAGGCCGAGCGGTGTCCGTGGTGTCGTGTGGAGAAACGCGGGCCGTGGCCGCCGCTGGGCTGAGGATATACCTCGTGGCCATAAAACAGGGGGCGCTGGAATTGATTTCTGAGGTGTGCATGAACGAGGAGGTGGCCTGCCTGGACCTGGGCCCGGGAGGCGAGGAGGCCCTGCTGGGTGTTGGGCTATGGACTGATATATCCGTCAGAGTGCTCAAGTTACCGGACCTCCGACCACTCCACACGGAGAAACTCTCTGGAGAGATAATCCCGCGCTCTCTTCTCATCTGTGTGTTGGAGGGCGTGTGTTATTTGCTGTGCGCGTTGGGTGACGGCTCTATGTTCTACTTCACCGTAGACCCGGACAGCGGAGTGCTCACCAACAAGAAGAAGGTCACACTTGGCACGCAGCCCACAGTACTCAGGAGCTTCAGATCGCTGTCAACGACCAACATCTTCGCGTGCTCTGATCGTCCAACAGTTATATTTTCGTCCAACCACAAGTTGGTTTTCTCCAACGTTAA
TCTCAAGGAAGTGGCCCATATGTGTTCACTCAACGCCGTGGCTTATCCCGACAGCTTGGCTCTAGCCACGGACAGCACAGTGACCATCGGTACCATAGATGAAATACAGAAGCTGCACATCCGAACCGTGCCCCTGGGGGAGACGCCCAGACGCATCGCGTACCAAGAAGCTTCGCAGACGTTCGGCGTGATCACGATGCGCGTGGACAAGGTGGAGTGGACGGGCGGGTGCGGCTCGCTGGTGCGGCCCTCGGCCTCCACGGCCGCCGCTTCCGCCTCGGCCGCCGCCCCGCCCTCCAAGCACGCGCCCGCCCCGCTCGACCTCGAGCTCCACAACCTGCTCATACTGGACCACCACACCTTCGAGGTCCTCCACGCTCATCAACTGCTGGCCAACGAGTTCGCCATGTCGCTAGTGTCGTGCAAGCTGGCCGACGATCCCAACCACTACTACGCTGTGGGCACCGCCATACTCAACCCCGAGGAGTCGGAACCCAAACAGGGGAGGATTCTCTTATTCCACTGGTGCGAAGGAAAACTCACTCAAGTTGCTGAAAAAGAAATCAAAGGAGGTTGTTACACGTTGGTGGAGTTCAATGGAAAGTTACTAGCATCCATAAATAGCACTGTTAGATTATTTGAATGGACTTCGGAGAAGGAGTTGAGATTAGAATGCAGTCACTTCAACAATATTGTGGCCCTGTACCTCAAAGTCAAGGGCGACTTCATACTTGTGGGAGATCTCATGAGGTCCATGTCTTTGTTGCAGTACAAGCAGATGGAGGGTTCCTTTGAAGAGATAGCTCGTGACTACAGCCCCAACTGGATGACGGCCGTCGAGATCCTAGATGACGACACCTTCCTCGGGGCCGAGAACAGCTTCAACCTCTTTGTATGCCAAAAAGACAGCGCGGCCACGACCGATGAAGAGAGGCAGCAGATGGGCTACATGGGTCAGTTCCACGTCGGTGACATGGTGAACGTGATGAGGAGGGGCGCTCTGGTCGCTCAACTCGCAGACACCGCCGCGCCCGTCGCCCGACCCGTCCTGCTGGCTACCGTCTCCGGCGCTATATGTCTGGTTGTGCAATTATCACAGGAACTATTTGATTTCCTTCACCAACTAGAAGAGAGGCTCACACACACCATTAAATCGGTGGGCAAGATCCCTCACTCGTTCTGGAGATCCTTCAACACTGATATCAAAACTGAACCAGCCGAAGGGTTCATCGACGGTGACCTGATAGAAAGTTTCTTAGATCTCTCCAGAGACATGCAGCAAGAAACCCTGCAAGGATTACAGATTGACGACGGCGGTGGCATGATGAGAGATGCCACAGTTGATGATCTCATCAAAATAGTGGAGGATCTCACCAGGATACATTAG

Protein sequence:

>DPOGS214746-PA
MAYHYVVTAQKPTAVISCITGNFTSPTDLNLLVAKVSRLEMYLVTPEGLRPMKEVGLYGRVAKMKLFRPPYEQKDLVFILTARYNAMILEWRTGANGELEVVTRAHGNVADRIGKPSENGILAVIDPQARVIGLRLYDGLFKIIPLDKDSTELKAASLRLEELNVYDLEFLHGCSNPTLILIHQDLNGRHIKTHEINLRDKEFMKIPWKQDNVETEASILIPVPSPLGGAIVIGQESIVYHDGQSYVAVAPPQIKTPINCYCRVDVRGLRYLLGDIAGRLFMLLLELSERDGTASVRDLKVELLGDIPIPECMTYLDNGVVFVGSRLGDSALVRLAAVRDDASQYVQPMETFTSLAPIVDMCVVDLERQGQNQLITCSGAFKMGSLRIIRNGIGIQEQASIDLPGIKGMWALTLGQGPHHDTLVLSFVGQTRVLTLNGEEVEETEIKGFVSDRQTFFTGNVCHDQLIQVTDEGIRLIGRGPGGWNGVAAWAPAGRAVSVVSCGETRAVAAAGLRIYLVAIKQGALELISEVCMNEEVACLDLGPGGEEALLGVGLWTDISVRVLKLPDLRPLHTEKLSGEIIPRSLLICVLEGVCYLLCALGDGSMFYFTVDPDSGVLTNKKKVTLGTQPTVLRSFRSLSTTNIFACSDRPTVIFSSNHKLVFSNVNLKEVAHMCSLNAVAYPDSLALATDSTVTIGTIDEIQKLHIRTVPLGETPRRIAYQEASQTFGVITMRVDKVEWTGGCGSLVRPSASTAAASASAAAPPSKHAPAPLDLELHNLLILDHHTFEVLHAHQLLANEFAMSLVSCKLADDPNHYYAVGTAILNPEESEPKQGRILLFHWCEGKLTQVAEKEIKGGCYTLVEFNGKLLASINSTVRLFEWTSEKELRLECSHFNNIVALYLKVKGDFILVGDLMRSMSLLQYKQMEGSFEEIARDYSPNWMTAVEILDDDTFLGAENSFNLFVCQKDSAATTDEERQQMGYMGQFHVGDMVNVMRRGALVAQLADTAAPVARPVLLATVSGAICLVVQLSQELFDFLHQLEERLTHTIKSVGKIPHSFWRSFNTDIKTEPAEGFIDGDLIESFLDLSRDMQQETLQGLQIDDGGGMMRDATVDDLIKIVEDLTRIH-
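
Consistency check (sketch):

The record lists a 3387 bp transcript and a 1128 aa protein, and the protein sequence ends in "-" where the transcript ends in the stop codon TAG. A minimal Python sketch of how these figures can be cross-checked is given below. It assumes the transcript is coding sequence only (no UTRs) and that the standard genetic code applies; the function names `read_fasta_sequence`, `translate`, and `lengths_consistent` are hypothetical helpers written for this illustration, not part of any database tooling.

```python
# Standard genetic code (NCBI translation table 1), built from the
# conventional T, C, A, G ordering of the three codon positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def read_fasta_sequence(text):
    """Join all sequence lines of a single-record FASTA string, skipping the header."""
    lines = (line.strip() for line in text.splitlines())
    return "".join(line for line in lines if line and not line.startswith(">"))


def translate(cds):
    """Translate a coding sequence codon by codon.

    Stop codons are written as '-' to match the notation used in this record.
    Codons containing ambiguous bases (e.g. N) raise KeyError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(CODON_TABLE[codon].replace("*", "-") for codon in codons)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript holds exactly protein_aa codons plus one stop codon.

    Assumption: the transcript contains no untranslated regions.
    """
    return transcript_bp == 3 * (protein_aa + 1)
```

For this record, `lengths_consistent(3387, 1128)` holds because 3 x (1128 + 1) = 3387. Passing the full DPOGS214746-TA sequence through `read_fasta_sequence` and `translate` should reproduce the DPOGS214746-PA sequence, including the trailing "-", if the assumptions above are met.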