Monarch geneset OGS2.0

DPOGS203534
Transcript: DPOGS203534-TA (2529 bp)
Protein: DPOGS203534-PA (842 aa)
Genomic position: DPSCF300055 + 116229-127217
RNAseq coverage: 330x (Rank: top 35%)
Annotation
Heliconius      HMEL005760        2e-109  65.37%
Bombyx          BGIBMGA009181-TA  4e-115  67.72%
Drosophila      DNA-ligI-PA       2e-109  63.29%
EBI UniRef50    UniRef50_B4INA3   1e-107  63.29%  GM13668 n=1 Tax=Drosophila sechellia RepID=B4INA3_DROSE
NCBI RefSeq     XP_001606591.1    3e-111  66.20%  PREDICTED: similar to ENSANGP00000010547 [Nasonia vitripennis]
NCBI nr blastp  gi|307187908      5e-110  65.85%  DNA ligase 1 [Camponotus floridanus]
NCBI nr blastx  gi|307187908      8e-105  65.85%  DNA ligase 1 [Camponotus floridanus]
Group
Gene Ontology:
  GO:0006281  9e-87    DNA repair
  GO:0005524  9e-87    ATP binding
  GO:0006260  9e-87    DNA replication
  GO:0003910  9e-87    DNA ligase (ATP) activity
  GO:0006310  9e-87    DNA recombination
  GO:0003677  2.4e-56  DNA binding
KEGG pathway:
  nvi:100122984  9e-111
  K10747 (LIG1) maps->
    Base excision repair
    DNA replication
    Mismatch repair
    Nucleotide excision repair
InterPro domain:
  [561-823]  IPR000977  9e-87    DNA ligase, ATP-dependent
  [364-514]  IPR012308  2.4e-56  DNA ligase, ATP-dependent, N-terminal
  [681-826]  IPR012340  3e-54    Nucleic acid-binding, OB-fold
  [680-827]  IPR016027  7.2e-51  Nucleic acid-binding, OB-fold-like
  [560-672]  IPR012310  4.5e-28  DNA ligase, ATP-dependent, central
  [697-808]  IPR012309  1.1e-25  DNA ligase, ATP-dependent, C-terminal
Orthology group: MCL12338 (Single-copy universal gene)

Nucleotide sequence:

>DPOGS203534-TA
ATGTCCCAAAAGAGTATTAAATCGTTTTTTAAAATAACTCCTAAGAAAACTGAAGTTATAACCGAAGCAGCGAATCAGGAACGTAATGCATCGCCATCCAATACAAGTATCAATAGTGAAACTGACAGTCCAAATGGAAATGCAAAGAGAGGTAAAAGATTGAGATCAAGTAGCAGTGAACATGAATCTGGAGAGGGTAAGAAGATTCCATCTCCAAGTAGTTCAGAGAAGAAAAAGAAAGTCAAACGTCAGAGAATTGAAAGTTCGGAGAGCGAAACAGAAAATACAGTAGAGGAAGAGAAAATGGAGGTGAAACTTGAAAATAACAATCCACCCAAAACTTACGCCTCACCGAAGGCCAAGAAAATGAATGAGAAGAAAATAAAGATAGAGAAGGAGCGATCACCAGAAAGTAACGATAGAAAAACAGAAGTGAAAAGTCCATCTCCTGTTAAGATGCCAAAGAATAAAGCGAATGGTAACATTATGAGTTCATTTGTAAAGATTGAGAGGCCGGACACAAAGAAAGATAAAGAAAATATTACAGACGCAGATAAAGATAACTCTGACATAGTTAAAGGTATGCGTAATGCATCGCCATCCAATACAAGTATCAATAGTGGAACTGACAATCCAAATGGAAATGCAAAGAGAGGTAAAAGATTGAGATCAAGTAGCAGTGAACATGAATCTGGAGAGGTTAAGAAGATTCCATCTCCAAGTAGTTCAGAGAAGAAAAAGAAAGTCAAACGTCAGAGAATTGAGAGTTCGGAGAGCGAAACAGAAAATACAGTAGAGGAAGAGAAAATTGAGGTGAAACTTGAAAATAACAATCCACCCAAAACCTACGCCTCACCGAAGGCCAAGAAAATGAATGAGAAGAAAATAAAGATAGAGAAGGAGCGATCACCAGAAAGTAATGATAGAAAAACAGAAATGAAAAGTCGGTCTCCTGTTAAGATGCCAAAGAATAAAGCGAATGGTAACATTATGAGTTCATTTGTAAAGATTGAGAGGCCGGACGCAAAGAAAGATAAAGAAAATATTACAGACGCAGATAAAGATAACTCTGATATCGTTAAAGAAGTTGATTACAATCCCGGTAAAACGAAATACAATCCGATCAAAGATGCCTGCTGGAAGAAAAGTCAAGATGTACCATATCTGGCGTTAGCAAAGACCCTAGAAGTCATAGAAGCGACGTCTGCTAGACTTAAAATGGTGGAGATATTAAGTAATTACTTCAAGTCGGTCATAGCATTGACTCCAGAGGATCTTCTGCCTAGCATATATCTGTGTTTGAACCAACTAGCACCAGCGTATAAGAGTCTTGAATTAGGTATAGCTGAGACATACTTGATGAAGGCCGTGGGTCAGTGTACAGGGCGGACCCTCGCACAGATGAAGGCGGCTGCACAGCGCTGCGGGGACCTGGGTCTGGTGGCGGAGCAGGCTCGCGCTACACAGAGGACGATGTTCGCTCCCCCGCCCCTCACCGTGAGGAAAGTTATTACGGCGCTCAGAGACGTGGCCGCTATGACGGAAATAAAAGACTGCAGTCTCTGGAAAATGGACGAAGCATTTTCTGGAGATTTCCGGCCAGGCGTCTGTCAACAAAAAAATTGGGAAAATCCAATCGCTTTATGTTGCATGCAGACATTCAGAAGCCAGATATCTGATCAGGTGTTATCGACTCGCAAACGGAAGGACGCGTCCGAGGACCAGATCAAGGTGCAGGTGTGTGTGTTCGTGTTCGACCTGCTGTACCTCAACGGAGAGGCGCTCGTCAGGGAAGACCTGGAGAAAAGGAGGGAGCTGTTGAGGCAGCACTTCAATGAGGTCGAAGGTGAATGGCAGTTCGCGGTGAGCCGTGACTGTACCGACGAGGAGGAGGTGGCTCAGTTCCTGCAGGAGTCTGTGAAGGCATCCTGTGAGGGTCTCATGGTGAAGGCGCTCCGGGGAGAGAATGCGCGCTATGACATAGCCAGGAGGTCGCACAA
CTGGCTGAAGTTAAAGAAGGACTATCTGGAGGGCGTGGGCGACTCCGTGGACGCGGTGGTGATCGGCGCTTATCACGGGCGGGGGAAGAGGACGGGCGTGTACGGCGGGTTCCTGCTGGCGTGCTACGACCCCGCTCACGAACAGTACCAGTCGCTCTGCAAGATAGGCACCGGCTTCTCCGACGAGGACCTGCGCACGCTCAGCGACACGCTCGCCGAACACGTCGTAGACGGACCCAGGAGCTACTACTTGTTCGACTCGAGCCACTCCCCGGACGTGTGGTTCTCTCCGTCGTGTGTGTGGGAGGTGCGCTGTGCGGACCTGTCCCTGTCCCCGGCTCACCGCGCCGCTCTGGGCCTCGTACATGACAGTAAAGGAATCAGTCTGCGGTTCCCGAGGTTCATCCGTGTCCGTGACGACAAGTCCGCGGAGCTGGCGACCTCCGCGGAACAGATCGCAGAGCTCTACCTCCGGCAGGACCAGGTCAAGAACACCACCAACAACAACCAGAGAGACGACTTCTACTGA

Protein sequence:

>DPOGS203534-PA
MSQKSIKSFFKITPKKTEVITEAANQERNASPSNTSINSETDSPNGNAKRGKRLRSSSSEHESGEGKKIPSPSSSEKKKKVKRQRIESSESETENTVEEEKMEVKLENNNPPKTYASPKAKKMNEKKIKIEKERSPESNDRKTEVKSPSPVKMPKNKANGNIMSSFVKIERPDTKKDKENITDADKDNSDIVKGMRNASPSNTSINSGTDNPNGNAKRGKRLRSSSSEHESGEVKKIPSPSSSEKKKKVKRQRIESSESETENTVEEEKIEVKLENNNPPKTYASPKAKKMNEKKIKIEKERSPESNDRKTEMKSRSPVKMPKNKANGNIMSSFVKIERPDAKKDKENITDADKDNSDIVKEVDYNPGKTKYNPIKDACWKKSQDVPYLALAKTLEVIEATSARLKMVEILSNYFKSVIALTPEDLLPSIYLCLNQLAPAYKSLELGIAETYLMKAVGQCTGRTLAQMKAAAQRCGDLGLVAEQARATQRTMFAPPPLTVRKVITALRDVAAMTEIKDCSLWKMDEAFSGDFRPGVCQQKNWENPIALCCMQTFRSQISDQVLSTRKRKDASEDQIKVQVCVFVFDLLYLNGEALVREDLEKRRELLRQHFNEVEGEWQFAVSRDCTDEEEVAQFLQESVKASCEGLMVKALRGENARYDIARRSHNWLKLKKDYLEGVGDSVDAVVIGAYHGRGKRTGVYGGFLLACYDPAHEQYQSLCKIGTGFSDEDLRTLSDTLAEHVVDGPRSYYLFDSSHSPDVWFSPSCVWEVRCADLSLSPAHRAALGLVHDSKGISLRFPRFIRVRDDKSAELATSAEQIAELYLRQDQVKNTTNNNQRDDFY-
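
The lengths quoted in this record are mutually consistent: 842 amino acids plus one stop codon make 843 codons, and 3 x 843 = 2529 bp. A minimal Python sketch of that check is given below. The function names are hypothetical and not part of any MonarchBase tooling. The sketch assumes that the trailing "-" in the protein record marks the stop codon and that the transcript is a complete coding sequence.

```python
def fasta_lengths(text):
    """Return {record_id: sequence_length} for FASTA-formatted text.

    Sequence lines may be wrapped over several lines; they are summed
    until the next '>' header.
    """
    lengths = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def protein_residues(seq):
    """Count residues, ignoring a trailing stop marker.

    Assumption: '-' or '*' at the end of the sequence denotes the stop codon.
    """
    return len(seq.rstrip("-*"))


def expected_cds_length(protein_aa, has_stop_codon=True):
    """Nucleotides needed to encode protein_aa residues (plus an optional stop codon)."""
    return 3 * (protein_aa + (1 if has_stop_codon else 0))


# Figures quoted in this record: 842 aa protein, 2529 bp transcript.
assert expected_cds_length(842) == 2529
```

To check the sequences themselves, pass the two FASTA records above to `fasta_lengths` and compare the transcript length with `expected_cds_length` applied to the protein length, after subtracting the stop marker that `fasta_lengths` counts as a character.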