Monarch geneset OGS2.0

DPOGS208731
Transcript: DPOGS208731-TA, 4398 bp
Protein: DPOGS208731-PA, 1465 aa
Genomic position: DPSCF300043 + 233089-238470
RNAseq coverage: 185x (Rank: top 49%)
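The lengths in the header can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes the transcript is coding sequence ending in a single stop codon and that the scaffold coordinates are 1-based and inclusive; neither assumption is stated in the record.

```python
# Values copied from the record header.
transcript_bp = 4398
protein_aa = 1465
start, end = 233089, 238470  # scaffold DPSCF300043, + strand

# Assumption: the transcript is CDS only, with one stop codon after the last residue.
expected_cds_bp = (protein_aa + 1) * 3

# Assumption: coordinates are 1-based and inclusive.
genomic_span_bp = end - start + 1

# Bases in the genomic span that are not in the transcript (for example introns).
non_transcript_bp = genomic_span_bp - transcript_bp

print(expected_cds_bp == transcript_bp)  # True
print(genomic_span_bp)                   # 5382
print(non_transcript_bp)                 # 984
```

Under these assumptions the transcript and protein lengths agree, and 984 bp of the genomic span lie outside the transcript.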
Annotation
Heliconius | HMEL015249 | 0.0 | 79.57%
Bombyx | BGIBMGA003396-TA | 0.0 | 76.97%
Drosophila | DNApol-alpha180-PA | 0.0 | 45.30%
EBI UniRef50 | UniRef50_E0VHS7 | 0.0 | 46.39% | DNA polymerase n=1 Tax=Pediculus humanus corporis RepID=E0VHS7_PEDHC
NCBI RefSeq | XP_002425588.1 | 0.0 | 46.39% | DNA polymerase alpha catalytic subunit, putative [Pediculus humanus corporis]
NCBI nr blastp | gi|242009635 | 0.0 | 46.39% | DNA polymerase alpha catalytic subunit, putative [Pediculus humanus corporis]
NCBI nr blastx | gi|242009635 | 0.0 | 46.21% | DNA polymerase alpha catalytic subunit, putative [Pediculus humanus corporis]
Group
Gene Ontology:
GO:0003887 | 0 | DNA-directed DNA polymerase activity
GO:0003677 | 0 | DNA binding
GO:0006260 | 0 | DNA replication
GO:0000166 | 0 | nucleotide binding
GO:0006139 | 1.5e-117 | nucleobase, nucleoside, nucleotide and nucleic acid metabolic process
GO:0003676 | 1.5e-117 | nucleic acid binding
GO:0001882 | 8.8e-31 | nucleoside binding
KEGG pathway: phu:Phum_PHUM210770 | 0.0
K02320 (POLA1) maps-> Purine metabolism
    DNA replication
    Pyrimidine metabolism
InterPro domain:
[41-1210] | IPR004578 | 0 | DNA-directed DNA polymerase, family B, pol2
[776-1225] | IPR006134 | 2e-120 | DNA-directed DNA polymerase, family B, multifunctional domain
[529-997] | IPR006172 | 1.5e-117 | DNA-directed DNA polymerase, family B
[324-797] | IPR012337 | 1.9e-67 | Ribonuclease H-like
[363-700] | IPR006133 | 9.4e-35 | DNA-directed DNA polymerase, family B, exonuclease domain
[1249-1427] | IPR015088 | 8.8e-31 | Zinc finger, DNA-directed DNA polymerase, family B, alpha
[953-1075] | IPR023211 | 8.3e-25 | DNA polymerase, palm domain
Orthology group: MCL12211 Single-copy universal gene
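The InterPro rows above give each domain as a [start-end] range on the protein. As a rough sketch, the ranges can be parsed and checked against the 1465 aa protein length. The accessions below are read as `IPR` plus six digits, with the remaining characters taken as the score column. The helper name `parse_range` is made up for this example.

```python
import re

PROTEIN_LENGTH = 1465  # aa, from the record header

# (range, accession) pairs copied from the InterPro domain rows.
rows = [
    ("[41-1210]", "IPR004578"),
    ("[776-1225]", "IPR006134"),
    ("[529-997]", "IPR006172"),
    ("[324-797]", "IPR012337"),
    ("[363-700]", "IPR006133"),
    ("[1249-1427]", "IPR015088"),
    ("[953-1075]", "IPR023211"),
]

def parse_range(text):
    """Turn '[41-1210]' into the integer pair (41, 1210)."""
    match = re.fullmatch(r"\[(\d+)-(\d+)\]", text)
    if match is None:
        raise ValueError(f"not a [start-end] range: {text!r}")
    return int(match.group(1)), int(match.group(2))

domains = [(acc, *parse_range(rng)) for rng, acc in rows]

# Every domain should lie inside the protein.
all_fit = all(1 <= s <= e <= PROTEIN_LENGTH for _, s, e in domains)

# List domains from N-terminus to C-terminus with their lengths in residues.
for acc, s, e in sorted(domains, key=lambda d: d[1]):
    print(acc, s, e, e - s + 1)
print("all within protein:", all_fit)
```

Sorting by start position makes the overlaps between the family B polymerase domains easier to see.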
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208731-TA
ATGGCAGAGTCCTTAGCGACTTCCAGGGCCAAACGGCAAAAAGTTGACAAAACTGGTCGTTTGTCGGCTTTAGAAAAATTAAAACAACTAAAGGGGAAAGGCTCTAAACATAAATACGATGTGGATGAATTGGAGAATGTGTATGACTTAGTTGATGAGTCTGAATATAGTAATCGGGTACTACAGCGACAAGAAGACGACTGGATTGAAGATGATGGGACAGGATATGTGGAGGATGGTCGAGAAATATTTGATGATGATGAGATTGATGATACTTATGTGGCAAAAGATAACAAAGAAACAGGCAGAGGCCACAAAAGGAAAGCTAAGGTTGCTCCCCAGCCGGCTGGGAAAGGAAATATAAGAAATCTCATAGGTGCTATGCCAAATAAAAAGAAAGAGGATGCCAAAATATCAGATGACAATATTTTATCAGATATCATGTCAGATTTAGATGGTACTACATCATCTATAGCAAAGCAAAAGCTTGTTGTTCCGAAAAAAAACATAGTTGACTCAAGTAAGAGAGACGCCCAGAATTATTTTAAAAATTTGTCATCGTCTGTTAAAAAACCGACACCACCTGTAAAAAAGGAAGAAACTGTTGTTGTGGATGTTGATAATGTTGAAAAGCCAAAGAAATCTACATGGTTAATAGACAAAGAAATAAAAAAAGAAGTAGAAGAGTCTCCTAGTCAAATTATAGAAGAATTTGCAACTCAAGATATAGACTTTGGTGACGATTTCAGCAATGATGCGGAACTACCACAAGTTCCTATTAAAAAAGAAGCTACAATGACTGACATATTACAAGATGTCGCTGAAGATTTTAAAGAAGATTTTGATATATTTTCAGTAAAGACAGAGCCCAAAGAATTGAAGGCCTTATCAAGTAATTGGTCCCAGAATGATAATCAAGTTGTTGTCAACATACAGAGTGATGTACAGTTGCCTTTACAGAAGAATAAGGATGGGGACCAAGTGTTGAAGTTTTATTGGCTCGATGCTTGGGAAGATAAATATGTTAAGCCGGGAGTTGTGTATTTGTTTGGAAAAGTATATGTTAATCCATCCAAGAAGAAAGAGGGTTGTGCTTCATGCTGCTTAGTGGTGAAAAATGTCAACCGACAATTGTTTCTTCTTCCAAGAGAATATAAATTAGATCCCATAACCTTGGAGGCGACAGATCAAGAGGTAACCATGATGGATGTTTATGAAGAATTCAACAACTGTGTTGCTAGTGAAATCGGTTTAAAAGAATTCAAGTCAAGGAAAGTTACTAAAAACTATTGCTTCAATTTACCAGACATACCCTCCCAATGTGATTACTTGGAGGTCAAGTATTCGGCTACTTTTCCAATACCACCAGTAGGAAAGAAATATTTGACATTTTCACATATATTTGGTACCAACACATCTTCTCTAGAGAGCTTCCTGCTTGATAGGAAGATCAAGGGTCCATGTTGGTTGGAGGTGAAAAATGCTGAGAGTGTTCAAGCCAAAGTGTCTTGGTGCAAATTAGAAGCATCATGTGATAAAATGGAGAATGTGGCCGTCATAAGAAATGACAGTGATTTAGAACCACCTCCTATTGTTATTGCTACTTTAAATATGAGGACAGTTAGTGATCCAAAAACAAGTAAGACAAAAATCTTGATGATGAGTTGTCTAGCTCACAACTCTTTCCCTATACACAAGCCGCCACCGAATCCACCTTTCCATCAACACTTTTGTGTGATGACAAAATGCAATGACATGTGGCCAATTGATTTGAAACAGCAAATGCAACAATATAGAGCGACAAAATTAACGAAATGTGACAACGAGAGGGAGCTCCTCAACTACTTCATGGTACAGTTTTGGAAGCTTGATCCGGATTTAGTTGTTGGTCATGACTTGCAAGGATTCCAGCAGGATTTGCTCATAGGCAACATATTAGACCTGCGTATTCCGAACTGGTCTCGACTCGGTCGCTTAAAGAGATCAGTGGCTCCGCAAAA
AAAATTCGCAGCGAGAAGTGCTTTTCTTGGGAGACTAGTCTGCGATATAAAACTATCTGCGATGGAACTTATAAGGGCACGGAGCTTCGATCTAGATTCTCTGTGTGTTAGTGTTCTGAAAATGAAAGAAGGGGAGAGAATTGACGTATCGATCGAAGACTTGCCCCGATATAATGAAAGTTCAAGTGACCTTTTACAATTAGTGTCACTAAGTATGCAGGATGCTTCATACATTCTGAAGATAATGTGCGAACTCAATGTGATACCACTGGCTCTGCAAATAACGCAAATAGCGGGCAATATAATGTCCAGAACTTTGATGGGCGGACGGTCGGAAAGGAATGAGTTTTTGTTACTGCACGCTTTCACTGAAAAGAACTACATTGTGCCTGATAAAATATATGGAAAGAAGGCCGACGGTGACGATGACGAGCAGGACGAAGCCGGAAATGTATCAAAGAAACAAGCTAAGAAGAAAGCGGCGTACGCTGGAGGCCTGGTGCTCGACCCCAAGAAAGGCTTCTACGATAAACTCATACTTCTAATGGACTTCAACTCGTTATATCCAAGTATCATTCAAGAATATAATATTTGTTTCACGACGATCAAAAGAAAGAACGGCGCTTCATCAGATGATGACATCAATAACCTGGTTCTGCCCGCTCCCAATACGGAATTCGGAGTACTGCCCACACAGATAAGAAAACTAGTTGAAAGCAGACGGGAAGTAAAAAGACTAATGAAATCACCGGACCTTGCTTCCGAACTGTACATGCAATACAATATTCGGCAAATGGCGTTGAAGCTCACTGCAAACTCTATGTATGGCTGTCTCGGCTTTACACATTCTAGGTTCTATGCAAAACCTTTAGCTGCTTTAGTCACTATGAAGGGTAGAGAGATTCTCATGGACACCAAGGAAATTGTTCAGAAACTAAATTATGAAGTGGTCTACGGTGATACCGACAGTTTGATGATCAACACCAACTGTTTGGACTACGATTACGTGTTTAAGATAGGCAACGACTTGAAAAGAGAAATCAATAAGAAGTACAAACAGATCGAATTAGATATTGATGGAGTATTTAAATATCTACTTCTCTTAAAGAAGAAGAAATATGCTGCTGTAGTGGTCAGTAAGAGCAAAAGTGGTGAATTCATTTATAACCAAGAGCACAAAGGCTTAGATATAGTCAGGAGAGATTGGTCGCAGTTAGCCGCAGAGGCCGGAAAATTTATCCTAACGCAAATTCTTTCCGAGCAGACGGCTGACGAAAGACTAGAAAGTATACAGAATCATTTAAACAAATTGAAAGAAGATTTAGTTAACAGCAAAATGCCTTTATCGCTATTGACAATAACTAAGCAATTAACCAAAAATCCTAACGAATACGCAGATAAGAACAACCAGCCGCACGTCCAGGTAGCTCTGAGATTGAACAGCAAAAATAGCAGACGTTTTAAAAAGGGCGACATAGTTCCGTATATAATCTGTGAGGACGGCACAGCGAATAGTGCGACGCAGAGAGCTTATCATATAGAAGAATTGAAAAATTCCGAACATCTTAGCGTCGACTACAAATATTACTTGGCCCATCAATTACACCCCGTCATATCTCGTATATGCGAACCCATCGAGGGTTTGGATCCGGCTCGAGTAGCGGACTGCCTCGGCCTGGATCCCTCCGGCTACCGTCAGATAACAAAGAAAGAAATCTCCAATACAGATACATACGAAGTAGAGAACGATAAAGAAAAATACAGATATTGCAAAGAATTCACGTTCATATGTGTCAACGAGCAATGTAGAACTGAGAACAGAATACGAGACACGTTCAGGCAGGTGGAGAAGGAGAGCGTCACGTTCCTGGAGCGGTGTCAGAACGAGAAGTGCGCTGTTAAACCTATCGACTACTTAGCGTGTATACAGAATCAGCTGTCGTTACAAATGCGCCAGTATCACAGCGAGTATTATACAGGGTGGTTGGCGTGCGAGGACC
CCGCGTGCGGGTACCGCTCGCCGCGACTGCCGCAGACCTTCGCCGCAGGATATCCGCTCTGCAGGCTGTGCGAGAAAGGCGTCATGTTCCGGGAGTACACGGAGAAAGACCTGTATTTACAAATAAACTTCTTTTTGTTTCTGTTTGATGTTAACAAGCATAATACGACAAAAACAAAAATAAGCCCTAATATTTTGTCAGCCTTCCAAGTCTTGAAAGTGATGGTTGAAGAGGTCCTTGCGAACTCGGCGTACGCTATTATAAACTTGTCAAAATTGTTCAGATTTTTTGGTGTCGATAACAGAGGCGGGAATAATATTAAGTCTGAAGATCTCGAAATCGACATTCTTCCAGAGACAGAACACATAGACGCCCTACTGGAATTGGGGACTTACTGA

Protein sequence:

>DPOGS208731-PA
MAESLATSRAKRQKVDKTGRLSALEKLKQLKGKGSKHKYDVDELENVYDLVDESEYSNRVLQRQEDDWIEDDGTGYVEDGREIFDDDEIDDTYVAKDNKETGRGHKRKAKVAPQPAGKGNIRNLIGAMPNKKKEDAKISDDNILSDIMSDLDGTTSSIAKQKLVVPKKNIVDSSKRDAQNYFKNLSSSVKKPTPPVKKEETVVVDVDNVEKPKKSTWLIDKEIKKEVEESPSQIIEEFATQDIDFGDDFSNDAELPQVPIKKEATMTDILQDVAEDFKEDFDIFSVKTEPKELKALSSNWSQNDNQVVVNIQSDVQLPLQKNKDGDQVLKFYWLDAWEDKYVKPGVVYLFGKVYVNPSKKKEGCASCCLVVKNVNRQLFLLPREYKLDPITLEATDQEVTMMDVYEEFNNCVASEIGLKEFKSRKVTKNYCFNLPDIPSQCDYLEVKYSATFPIPPVGKKYLTFSHIFGTNTSSLESFLLDRKIKGPCWLEVKNAESVQAKVSWCKLEASCDKMENVAVIRNDSDLEPPPIVIATLNMRTVSDPKTSKTKILMMSCLAHNSFPIHKPPPNPPFHQHFCVMTKCNDMWPIDLKQQMQQYRATKLTKCDNERELLNYFMVQFWKLDPDLVVGHDLQGFQQDLLIGNILDLRIPNWSRLGRLKRSVAPQKKFAARSAFLGRLVCDIKLSAMELIRARSFDLDSLCVSVLKMKEGERIDVSIEDLPRYNESSSDLLQLVSLSMQDASYILKIMCELNVIPLALQITQIAGNIMSRTLMGGRSERNEFLLLHAFTEKNYIVPDKIYGKKADGDDDEQDEAGNVSKKQAKKKAAYAGGLVLDPKKGFYDKLILLMDFNSLYPSIIQEYNICFTTIKRKNGASSDDDINNLVLPAPNTEFGVLPTQIRKLVESRREVKRLMKSPDLASELYMQYNIRQMALKLTANSMYGCLGFTHSRFYAKPLAALVTMKGREILMDTKEIVQKLNYEVVYGDTDSLMINTNCLDYDYVFKIGNDLKREINKKYKQIELDIDGVFKYLLLLKKKKYAAVVVSKSKSGEFIYNQEHKGLDIVRRDWSQLAAEAGKFILTQILSEQTADERLESIQNHLNKLKEDLVNSKMPLSLLTITKQLTKNPNEYADKNNQPHVQVALRLNSKNSRRFKKGDIVPYIICEDGTANSATQRAYHIEELKNSEHLSVDYKYYLAHQLHPVISRICEPIEGLDPARVADCLGLDPSGYRQITKKEISNTDTYEVENDKEKYRYCKEFTFICVNEQCRTENRIRDTFRQVEKESVTFLERCQNEKCAVKPIDYLACIQNQLSLQMRQYHSEYYTGWLACEDPACGYRSPRLPQTFAAGYPLCRLCEKGVMFREYTEKDLYLQINFFLFLFDVNKHNTTKTKISPNILSAFQVLKVMVEEVLANSAYAIINLSKLFRFFGVDNRGGNNIKSEDLEIDILPETEHIDALLELGTY-
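The nucleotide sequence above is wrapped over several lines, and the protein sequence ends in `-`. A minimal sketch of reading such FASTA text follows. The function `read_fasta` is a hypothetical helper, not part of any library. The input is a toy record built from the first three codons of the transcript with a `TAA` stop appended for illustration. Treating a trailing `-` or `*` as a stop marker is an assumption.

```python
def read_fasta(text):
    """Return {header: sequence}, joining sequence lines that were wrapped."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

# Toy input, not the real record: a 12 bp sequence wrapped over two lines.
toy = """>toy-TA
ATGGCA
GAGTAA

>toy-PA
MAE-
"""

records = read_fasta(toy)
cds = records["toy-TA"]
# Assumption: a trailing '-' or '*' marks the stop codon and is not a residue.
protein = records["toy-PA"].rstrip("-*")

print(cds)      # ATGGCAGAGTAA
print(protein)  # MAE
print(len(cds) == (len(protein) + 1) * 3)  # True
```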