Monarch geneset OGS2.0

DPOGS215089
Transcript  DPOGS215089-TA  2367 bp
Protein  DPOGS215089-PA  788 aa
Genomic position  DPSCF300187 + 223586-232116
RNAseq coverage  235x (Rank: top 43%)
Annotation
Heliconius  HMEL010535  0.0  89.02%
Bombyx  BGIBMGA007191-TA  0.0  89.20%
Drosophila  Xpd-PA  0.0  76.84%
EBI UniRef50  UniRef50_E3XFQ7  0.0  70.26%  Putative uncharacterized protein n=2 Tax=Anopheles darlingi RepID=E3XFQ7_ANODA
NCBI RefSeq  XP_970844.1  0.0  78.95%  PREDICTED: similar to Xeroderma pigmentosum D CG9433-PA [Tribolium castaneum]
NCBI nr blastp  gi|91079234  0.0  78.95%  PREDICTED: similar to Xeroderma pigmentosum D CG9433-PA [Tribolium castaneum]
NCBI nr blastx  gi|91079234  0.0  79.05%  PREDICTED: similar to Xeroderma pigmentosum D CG9433-PA [Tribolium castaneum]
Group
Gene Ontology  GO:0016817  1.9e-262  hydrolase activity, acting on acid anhydrides
GO:0004003  2.1e-120  ATP-dependent DNA helicase activity
GO:0016818  2.1e-120  hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides
GO:0005634  1.8e-91  nucleus
GO:0003677  1.8e-91  DNA binding
GO:0005524  1.8e-91  ATP binding
GO:0006289  1.8e-91  nucleotide-excision repair
GO:0006139  6.2e-69  nucleobase, nucleoside, nucleotide and nucleic acid metabolic process
GO:0008026  6.2e-69  ATP-dependent helicase activity
GO:0003676  6.2e-69  nucleic acid binding
KEGG pathway  tca:659447  0.0
K10844 (ERCC2, XPD)  maps-> Nucleotide excision repair
InterPro domain  [36-736]  IPR013020  1.9e-262  DNA helicase (DNA repair), Rad3 type
[36-308]  IPR006554  2.1e-120  Helicase-like, DEXD box c2 type
[36-53]  IPR001945  1.8e-91  Xeroderma pigmentosum group D protein
[570-714]  IPR006555  6.2e-69  Helicase, ATP-dependent, c2 type
[297-441]  IPR010643  6.8e-51  Domain of unknown function DUF1227
[100-283]  IPR010614  2.5e-50  DEAD2
Orthology group  MCL10922  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS215089-TA
ATGGCTTACTGGTATATTTCCCTTATGACTACATATATCCAGAACAGTATGCCTATATGCTGGAACTTAAACGAGCCCTTGATGCTAAGGTTGACCGTTGATGGTTTACTGGTATATTTCCCTTATGACTACATATATCCAGAACAATATGCCTATATGCTGGAACTTAAACGAGCCCTTGATGCTAAGGGCCACGGATTACTTGAAATGCCTTCAGGGACTGGTAAAACTATATCCTTGTTATCGCTTATTGTGGCTTACATGATACAAAACCCACATCACGTCAGAAAACTCATCTATTGTTCCCGAACTGTACCTGAAATAGAAAAAGTCTTAGAGGAACTTAAGAATCTTATAAAATATTATGAAAAGTCTCAAGGTGAGAAGCCGAGCTTGACGGGCGTTGTGCTCAGTTCAAGGAAAAACTTGTGCATACATCCAGAGGTATCAAGAGAGCGTGAGGGGAAGCTGGTTGATGGGAAATGTCATTCGCTAACGGCCAGTTACATCAGAGACAGACACGAACAGGACCCTTCAGTGCCCATATGTCAATTCTATGAGGGTTTTAACCGTGAGGGTCGCGAGTCCATGCTGCCGTATGGAGTGTACACTATGGATGACCTCAAACAATACGGAGCTGACAGGAACTGGTGCCCCTACTTCCTGTCTAGATTCGCTATAATCCACGCTGAGATAGTTGTGTACTCGTACCACTACTTATTAGATCCTAAGATAGCTGAAGTGGTATCAAAAGAACTGAACAAGGAGGCTGTGGTGGTGTTCGATGAGGCACATAATATAGATAATGTTTGTATCGACTCTCTAAGTGTGAAGATCACGAGGCGGACTATCGATAAGAGCACGCAAGCACTACAGACGCTAGAAAAAGCTGTGTCACAATTAAAACAAGAGGACGAGGCGCGCCTGGCGCTGGAGTACGAGCAGATGGTGGAGGGTCTGAGGGAGGCGGCGCAGCTGAGGGACAGTGACGTCATACTGGGCAACCCTGTACTACCTGATGAACTGCTCAACGAGGTGGTCCCTGGCAACATCAGGAACGCGGTCCACTTCCTCGGGTTCTTGAAGCGGTTCATAGAATACTTGAAGACGAGGCTGCGGATACAGCACGTGGTGCAGGAGTCGCCGGCCGGTTTCTTAAAGGACGTGTCGTCTCGCGTGTGTATCGAGCGCAAGCCTCTCCGTTTCGTGTCGTCGCGGCTCCAGACCCTGATGAAGACCCTCCAGATCCCGGACCCCTCGAACTTCGGCTCCTTAACACTAGTGGCGCACCTGGCGACGCTCGTGTCCACGTACACCAAGGGCTTCGTCATCATCATAGAGCCCTTCGATGACAAAACCCCGACCGTCTCCAATCCAATACTACACTTCTCATGTATGGACTCGTCGATAGCCATGCGGCCAGTGTTCGGTAGATTTCAAACTGTCATCATCACTTCCGGTACGCTATCTCCCCTGGACATGTATCCCAAGATCCTGGACTTTAACCCCGTAGTAATGAGCTCCTTCACTATGACGCTCGCCCGACCTTGCATACTGCCCATGATAGTGTCCAAAGGTAGCGACCAAGTGGCGATTTCTTCAAAGTACGAGACACGAGAAGACGTCGCGGTGATAAGGAACTACGGACAACTACTAGTAGAGATATCAGCCTGCGTGCCGGACGGGGTGGTGTGCTTCTTCACTTCGTATCTGTACCTGGAGAGCGTGGTCGGAGCTTGGTATGATCAGGGTGTCGTCGCCAATTTACAGAAACACAAGCTGCTGTTTATCGAGACGCAGGACTCGGCGGAGACCAGCTTCGCCTTAATAAACTACATTAAGGCGTGCGAGAGCGGTCGTGGGGCGGTGTTGCTATCGGTGGCGCGCGGCAAGGTCTCGGAGGGAGTGGACTTCGACCATCACCTCGGACGGGCGGTCCTCATGTTCGGGATACCTTACGTGTTCACTCAGAGCAGGATATTAAAGGCCCGTCTAGAGTA
CCTGAGAGATCAGTTCCAGATCCGTGAGAACGATTTCCTAACGTTCGACGCGATGCGTCACGCGGCTCAGTGTGTTGGCCGAGCGTTGAGAGGCAAGACGGACTACGGTATAATGATATTCGCTGACAAGCGCTTCAGTCGCTCGGACAAGAGAAGTAAGCTACCGCGGTGGATACAAGAACATCTGAGGGACTCGCTCTGCAACCTCAGTACCGAGGAAGCCGTACAGATAAGTAAGCGTTGGCTCCGCCAGATGTCGCAGCCGTTCAGCCGCGAGGACCAGCTGGGAGTGTCGCTGTTGACGCTCCAGCAGTTACAGAGCAAGGAGCAGCAGGAGAAGATCGAGAAGCAGGTCCTCCAGAAGTAG

Protein sequence:

>DPOGS215089-PA
MAYWYISLMTTYIQNSMPICWNLNEPLMLRLTVDGLLVYFPYDYIYPEQYAYMLELKRALDAKGHGLLEMPSGTGKTISLLSLIVAYMIQNPHHVRKLIYCSRTVPEIEKVLEELKNLIKYYEKSQGEKPSLTGVVLSSRKNLCIHPEVSREREGKLVDGKCHSLTASYIRDRHEQDPSVPICQFYEGFNREGRESMLPYGVYTMDDLKQYGADRNWCPYFLSRFAIIHAEIVVYSYHYLLDPKIAEVVSKELNKEAVVVFDEAHNIDNVCIDSLSVKITRRTIDKSTQALQTLEKAVSQLKQEDEARLALEYEQMVEGLREAAQLRDSDVILGNPVLPDELLNEVVPGNIRNAVHFLGFLKRFIEYLKTRLRIQHVVQESPAGFLKDVSSRVCIERKPLRFVSSRLQTLMKTLQIPDPSNFGSLTLVAHLATLVSTYTKGFVIIIEPFDDKTPTVSNPILHFSCMDSSIAMRPVFGRFQTVIITSGTLSPLDMYPKILDFNPVVMSSFTMTLARPCILPMIVSKGSDQVAISSKYETREDVAVIRNYGQLLVEISACVPDGVVCFFTSYLYLESVVGAWYDQGVVANLQKHKLLFIETQDSAETSFALINYIKACESGRGAVLLSVARGKVSEGVDFDHHLGRAVLMFGIPYVFTQSRILKARLEYLRDQFQIRENDFLTFDAMRHAAQCVGRALRGKTDYGIMIFADKRFSRSDKRSKLPRWIQEHLRDSLCNLSTEEAVQISKRWLRQMSQPFSREDQLGVSLLTLQQLQSKEQQEKIEKQVLQK-
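
The lengths and coordinates listed for DPOGS215089 can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the record itself. It assumes the 2367 bp transcript is a pure coding sequence (start codon through stop codon, no UTRs) and that the genomic coordinates are 1-based and inclusive. The function names `cds_length_matches` and `parse_position` are hypothetical, introduced only for illustration.

```python
def cds_length_matches(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if the transcript length equals 3 * (protein length + 1).

    Assumption: the transcript is coding sequence only, so its length is
    three bases per amino acid plus one three-base stop codon.
    """
    return transcript_bp == 3 * (protein_aa + 1)


def parse_position(text: str):
    """Split a 'scaffold strand start-end' string into its parts.

    Assumption: coordinates are 1-based and inclusive, so the genomic
    span is end - start + 1.
    """
    scaffold, strand, coords = text.split()
    start, end = (int(x) for x in coords.split("-"))
    return scaffold, strand, start, end, end - start + 1


# Values copied from the record above.
print(cds_length_matches(2367, 788))
print(parse_position("DPSCF300187 + 223586-232116"))
```

Under these assumptions, 788 amino acids plus a stop codon account for all 2367 bases, and the gene spans 8531 bp on the scaffold, which leaves room for introns between the coding exons.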