Monarch geneset OGS2.0

DPOGS215661
Transcript: DPOGS215661-TA, 4989 bp
Protein: DPOGS215661-PA, 1662 aa
Genomic position: DPSCF300041 - 1330157-1339068
RNAseq coverage: 48x (Rank: top 71%)

Annotation
Heliconius       HMEL014128               0.0      59.60%
Bombyx           BGIBMGA003556-TA         0.0      64.50%
Drosophila       CG8950-PA                1e-87    31.79%
EBI UniRef50     UniRef50_UPI00020624AF   4e-134   34.02%   UPI00020624AF related cluster n=1 Tax=unknown RepID=UPI00020624AF
NCBI RefSeq      XP_001946729.1           1e-129   33.33%   PREDICTED: similar to DNA ligase IV [Acyrthosiphon pisum]
NCBI nr blastp   gi|328721431             2e-133   34.02%   PREDICTED: DNA ligase 4-like [Acyrthosiphon pisum]
NCBI nr blastx   gi|328721431             2e-134   34.02%   PREDICTED: DNA ligase 4-like [Acyrthosiphon pisum]

Group:

Gene Ontology
GO:0006281   1.7e-86   DNA repair
GO:0005524   1.7e-86   ATP binding
GO:0006260   1.7e-86   DNA replication
GO:0003910   1.7e-86   DNA ligase (ATP) activity
GO:0006310   1.7e-86   DNA recombination
GO:0005488   7.5e-29   binding
GO:0003677   1.6e-24   DNA binding
GO:0005622   1.6e-11   intracellular

KEGG pathway
gga:418764   6e-132
K10777 (LIG4, DNL4)   maps-> Non-homologous end-joining

InterPro domain
[861-1380]    IPR000977   1.7e-86   DNA ligase, ATP-dependent
[1026-1235]   IPR012310   6.4e-40   DNA ligase, ATP-dependent, central
[633-739]     IPR011990   7.5e-29   Tetratricopeptide-like helical
[1243-1385]   IPR016027   9.7e-25   Nucleic acid-binding, OB-fold-like
[795-987]     IPR012308   1.6e-24   DNA ligase, ATP-dependent, N-terminal
[1245-1381]   IPR012340   4.6e-20   Nucleic acid-binding, OB-fold
[1584-1662]   IPR001357   1.6e-11   BRCT
[1261-1366]   IPR012309   5.6e-10   DNA ligase, ATP-dependent, C-terminal

Orthology group: MCL11519 Multiple-copy universal gene
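
The InterPro domain coordinates listed above can be ordered along the protein and converted to lengths. The following is a minimal Python sketch, not part of the original record. It assumes the bracketed ranges are 1-based, inclusive residue positions (the record does not state this), and the names `DOMAINS`, `domain_length` and `domains_by_position` are illustrative only.

```python
# Domain hits copied from the InterPro section of the record.
# Assumption: coordinates are 1-based and inclusive.
PROTEIN_AA = 1662

DOMAINS = [
    (861, 1380, "IPR000977", "DNA ligase, ATP-dependent"),
    (1026, 1235, "IPR012310", "DNA ligase, ATP-dependent, central"),
    (633, 739, "IPR011990", "Tetratricopeptide-like helical"),
    (1243, 1385, "IPR016027", "Nucleic acid-binding, OB-fold-like"),
    (795, 987, "IPR012308", "DNA ligase, ATP-dependent, N-terminal"),
    (1245, 1381, "IPR012340", "Nucleic acid-binding, OB-fold"),
    (1584, 1662, "IPR001357", "BRCT"),
    (1261, 1366, "IPR012309", "DNA ligase, ATP-dependent, C-terminal"),
]


def domain_length(start: int, end: int) -> int:
    """Number of residues covered by an inclusive [start, end] range."""
    return end - start + 1


def domains_by_position(domains):
    """Return the domain hits sorted by start, then end coordinate."""
    return sorted(domains, key=lambda d: (d[0], d[1]))


ordered = domains_by_position(DOMAINS)
lengths = {acc: domain_length(start, end) for start, end, acc, _ in DOMAINS}
all_within_protein = all(1 <= s <= e <= PROTEIN_AA for s, e, _, _ in DOMAINS)

for start, end, acc, name in ordered:
    print(f"{start:>5}-{end:<5} {lengths[acc]:>4} aa  {acc}  {name}")
```

Under that coordinate assumption, the tetratricopeptide-like region is the most N-terminal hit and the BRCT hit ends at the last residue of the 1662 aa protein.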

Nucleotide sequence:

>DPOGS215661-TA
ATGGATGTGGATAAAGATCTTACTAATAAATTTTTAAGTGGTGAGATGTCTTTCTCCCAATACTCTAGTGAATGGTATAGTGGAGAAGAGGATGAAGATGAAGATGAGCCAGAGGAATCCAAAAAATATGAAGAAGAAGCTGAAATGTCTACCACAGTTTCAAAGAGAGGTCTTAAACGACAATCCAAGTTCCGTCGCCTCTTTCCTGCATTATCTGGTCTTATGGGAGAAGCAAATATAAGGCTTGCCAGGGGTGATAGTGAAATGGCTGAACGTATGTGCCATGAAATAATCAAACAACAACCCACAGCGGCTGAACCATATCAAACCTTAGCACAAATATACGAACATGATCCCAATAAATCATTGCAGTTTTCTTTGCTTGCTGCACATTTGAGTTTTACAGACAAAAGTGAATGGTGGAGACTCGCTGCATTATGTAGACAGAGAAGTGATTATAAACAGGAAATGGTCTGTTACACTCAGGCTATAAAATCTGAGCCACAAAATTTAGAGACACACTTGAAAAGGCTAGAGTTGTTGTCAGAATTAGAAAAACTACCGGACTTTCCCGTTAATTCACTGAAAGTATCTAAGGTGAAATGTTATCACAAAATTGTACGTTCCTTAGGACCTAGTGATGCTGAAACAATTATGAAGTATGCCAAAATGGCTGCAACTTTATATCACAACAGCACCGAAGTTGAACAAGCAGTTGAAGTGATGGGTATTGCATATAAAAAATGCTTTTCATTATTTACATTGGAGGATATTAATATGTATTTGGAGCTGTTAATTACTCAAAAGCAGTTCACCAAATGTATTGAAGTATTTGTTTCAAGTATAGGTGTGGAAATTGAAGCTGAAATTCAAACAGTGAAAAATGCTAATGGTGATATTGAAGAACAAACACACTACCTTAATTGTGTTATACCCAATAACTTAGCTATAGATTTGAAAAGTAAACTATTGGTGTGCTTTATACATTTAGGAGCACTTAATTTGGTCCAATCATTGCTTAATGATTTTTTGAGCAGTGATGTTGAAAAAGCTGGAGATCTCTATATGGATATAGAAGAAGCATTTTCAGCTGTTGGTCATTATGAGATGGCTATAAAATTATTGGAGCCTCTAATTAAAAATACTAGCTTTGATTTAGGAGCTGTATGGCTTAAATATGCAGATTGCCTGAACAAGTTGGGAAGACATGATGATGCTATAGAATCATATTACAAAGTGTTAAAGCATGTGCCACAACACGCTGACGCGAGGCGAAAGCTGTTTACAATTCTAGAAAACAAAGGAAGAATTGATGACGCTTTGAACATTCTACAGCAGGATTACAAATTTGTCGTCAGCGCTCATCTACTGTTTGATCATTGTCAATACTTAAAGAAATATAATAGAATGTTGAAATATTTGGAGGTAGGTGAAGCTCTGCTATCTCGAGATTTGACAAAATTTAGACATCCCGAAGAATTAAGAATCGCTTGTAGGACAAAGGGTGTGGTAGAACTTATTTACAATTTTCGATCTATGAGAGGCGAAAGTCCTTATCATAAGGATGATTTGCAATTTGAAGAAGAATCTTTTAGCCTTTTACCTAGCGAAGAATTCTTAATGTTCAAAGAACTTTTAAGTATAGCAAAGGAACACAAAATTTATAGCGTTTTACAAAGATTAACATTTATGGGTATGATATCGAAGGGTTTGTCGCATTACCGACCAGAAATGGAATTTTATTGCTTTCAAGCGTGTCTTCTTAATCGAGACTTCCCGAATGCTTGTCGATTTGTTAAAGACTTTTCTCTAAAATATTCCGGACCACGATCCTTTAATTTGCTAAGCTTCATCCTTAATTCTTTGGACGAAAACACTCACGGAAAATTCTTATCGCGACTGTTTCAAAAAGATTTTAATATTGTTAAAAATCTTTTTTTGGGTAATAATTTCCTAGTATCTGGAAGATATCTTGTTGCTCTGAAATACTTTCTTGAATA
TTACGAGCAGTGCAGAGAGCCTCTGTCAGCGTTACTTATAGCTGTTACTATATTGGCCATGGCAGCTCAGAGAACAGTGGACAGACATCATAATTTAATTTTACAGGGCTTATCTAAGTGCGATCAGGAAGCGTATTACAATATAGGGCGAGCTTATCAAATGTTGAGTATTAATAATCTGGCCATTGAATACTATGAGCGAGCCCTGGCGTGCCCTCCTCTTGCTCAATGTGAAGAACATGGCGTTATAGATTTAACTAAAGAAACTGCTTACAATCTGTATATATTGTATAAAGACCAATCCCCAGAATTAGCTCGGCGATATATGTTTAAAGTATTGTTTGAAATGAATACAACTGGTGTAATTACTCCAGCGGATGATGTATTATTCGGTGATCTATGTTCATTGTTAGAACAATTGCAGAAGAAGAAGAAACACAGACCAGAACAAAATAAACTATTATCAAATTTTGTTGACCAATTTAGGTTAAAATTAGCTAACACGCAGGGCTCCAAGAACTCTACTTTTTTTCCCATATTGAGGTTGCTCTTGCCAAGTTGTGATCGGGAACGTGGTCCCTACAACCTTAAAGAAACCAGACTAAGTACTTTATTGGTAAAAGTACTGTCTCTCAATAAAGAGTCGACAGATGCGAAACAACTGATACATTTTAGTTCTTCAAATAACTCAGTTCTAGATAGCGACTTCCCTGGTGTCGCGTTTTACGTTATAAAGAAAAGAGTTGGTCAGAATAATTCAGTATTGACAGTCAGAGAGATCAATGAGATACTTAACTCTGTTGCAACTGTAGATAATGTTCATAAAACTCCATTGGATGAAATTTTTAGTTATGCTTTAAAAAAACTGACTGCCATCGAATTCAAATGGCTTCTGAGAATAATATTAAAGGATTTAAAATTAAGTATGAGTGCAGATCGAATCTTGGGGATTTTCCATCCAGATGCCCCAGAGGTCTTCAAGAACTGCAGCAGTATTTTAAAGGTGTGCGAAGAATTAGAAGATGGCGACACTCGACCATCAGAACTGGGCGTCAATTTGTTCTACGCTGTAAGACCAATGCTGTCTGAGAGGTTGGACATCACACACATACACGTCTTGGATAAGACGAAGACCTACTGTATGGAGGAGAAGTTTGATGGTGAGAGATTCCAGATGCACATGGATAACAACGTATTTGAATACTTTTCACGGAAAGGTTTCAAGTACTCCAAAAACTATGGGCAAAGTTACGACTCCGGCATGTTAACGCCGTATTTGAAGGATATTTTTGCTCCTGAGGCGAGGAATTTCATTCTTGACGGTGAAATGATGGGTTGGCACAAAATAGATAATTATTTCGGATGCAAAGCGATGTCATACGATGTTAAGAAAATCACAGAGAACAGTTCGTTCCGCCCTTGCTTTTGCGTGTTTGATATTCTATATTATAACGACAGACCACTCATCGGCTCGCCAGATAAGGGCGGTTTACCTTTACGGGAACGACTCAAAATACTCGACGATCTATTCATAGACAAGCGAGGTGTTATAGAACATAGCAAGCGAAAAATTATCAAAGAAAGTTCAGAAGTTGTGGACGCCGTCAACGATGCCATAGACAATCAGGACGAGGGTATTGTAGTTAAAGATATAAATTCATACTACATCGCTAACAAAAGAAACGCTGGCTGGTACAAAATAAAACCGGAGTATACGGACGACACCATGAATGACCTAGACCTGGTGGTGGTTGGTGCTGATGAAGCCACCAACAAAAGACAGGGGCGTGCCAAAAGTTTCTATGTCGCGTGTGGGGATAACAATGATGGCGACCCTGTCTGGACCTGCATTGGCCGCGTGTCTAACGGACTGAAGCACGAGGAGAAGGAACGCGTTTGTTCATTACTTGAACGGAACTGGTGTATGTATAGGAAAAAACCTCCGCCTCCCTGTCTGCGCTTCGGCAAAGACAAGCCGGACTTCTGGATACTTCCAGAAC
ATTCTATCGTATTGCAGGTGCGTGCCACCGAGCTGTTAAGCGTTGGGGACTCACACGTGCTGCGATTCCCGCGCGTGGAAGATATAAGATCAGACAAGCCGGTCGATGACGTGTGCACAATACACGAACTTAGACAACTGGCTGTGAGCAGAAGCCCGGTCAGTAAGCTAAGTACAAAGCGCGTAAACGAATCGCAAATAGATCAAAACTATATTAAAACACGCAAGCGCGGTCTGTCTAAGACCGTCCAAGTAGCGGAAAAATTCCGCACAAAGACGATTGGAGACGTGCAAGTTATATCACGAGCTTTGTTTGGGAAGAAACTTTGTGTGTTGTCGGATGACGAGGATTGTAAGAAAACGGAATTGAAACGCGTCATAGAGTCCCACGGAGGGAGACACGTTGAGAACCCAGGTTCAGATACTTGGTGCTGTGTAGTGGGAACTATAACACCGCGAGCCCGTAGACTCATAGAGACACAAGACCTAGACATCATTAGCACAGCCTGGCTCAGAAGCCTACCAGCGACAGACGACCCGTGTCAACTGTCGCCATTGGACATGCTATCAATCAAACCCGAAACGAAGCTCAAACTGAGCCTAGACTACGACCCCTTCGGTGATAGTTACAAGGATGAAATAGATGAAAAAACATTGAAGAAACTGCTGGACAAAATGGATTCGGAGTTCCCGTTGTATCCAACTTTAAAAGAAAAAGTCTGTCTGGATAAACAATTATTCGGCGCCAACAATCCTTACTCATTTTTGAGGAATTGTTTCATTCACGTTATTGACAATTCGCTTTACGAAACTATGGCGTCCTTTTTCGGAGCCAAAATCTGTTCTCTCGATGACGTCAGACTGACGCACGTCGTTATGTCAAAAGACGCGAATGTCAAAATAGATAAAGGAATTCTAGTGTCGGATGGATGGTTGGAAGAATGTTTTAACAAAAGGAGTTTTGTTCCTGTCGATGATTATCTAATTTAA

Protein sequence:

>DPOGS215661-PA
MDVDKDLTNKFLSGEMSFSQYSSEWYSGEEDEDEDEPEESKKYEEEAEMSTTVSKRGLKRQSKFRRLFPALSGLMGEANIRLARGDSEMAERMCHEIIKQQPTAAEPYQTLAQIYEHDPNKSLQFSLLAAHLSFTDKSEWWRLAALCRQRSDYKQEMVCYTQAIKSEPQNLETHLKRLELLSELEKLPDFPVNSLKVSKVKCYHKIVRSLGPSDAETIMKYAKMAATLYHNSTEVEQAVEVMGIAYKKCFSLFTLEDINMYLELLITQKQFTKCIEVFVSSIGVEIEAEIQTVKNANGDIEEQTHYLNCVIPNNLAIDLKSKLLVCFIHLGALNLVQSLLNDFLSSDVEKAGDLYMDIEEAFSAVGHYEMAIKLLEPLIKNTSFDLGAVWLKYADCLNKLGRHDDAIESYYKVLKHVPQHADARRKLFTILENKGRIDDALNILQQDYKFVVSAHLLFDHCQYLKKYNRMLKYLEVGEALLSRDLTKFRHPEELRIACRTKGVVELIYNFRSMRGESPYHKDDLQFEEESFSLLPSEEFLMFKELLSIAKEHKIYSVLQRLTFMGMISKGLSHYRPEMEFYCFQACLLNRDFPNACRFVKDFSLKYSGPRSFNLLSFILNSLDENTHGKFLSRLFQKDFNIVKNLFLGNNFLVSGRYLVALKYFLEYYEQCREPLSALLIAVTILAMAAQRTVDRHHNLILQGLSKCDQEAYYNIGRAYQMLSINNLAIEYYERALACPPLAQCEEHGVIDLTKETAYNLYILYKDQSPELARRYMFKVLFEMNTTGVITPADDVLFGDLCSLLEQLQKKKKHRPEQNKLLSNFVDQFRLKLANTQGSKNSTFFPILRLLLPSCDRERGPYNLKETRLSTLLVKVLSLNKESTDAKQLIHFSSSNNSVLDSDFPGVAFYVIKKRVGQNNSVLTVREINEILNSVATVDNVHKTPLDEIFSYALKKLTAIEFKWLLRIILKDLKLSMSADRILGIFHPDAPEVFKNCSSILKVCEELEDGDTRPSELGVNLFYAVRPMLSERLDITHIHVLDKTKTYCMEEKFDGERFQMHMDNNVFEYFSRKGFKYSKNYGQSYDSGMLTPYLKDIFAPEARNFILDGEMMGWHKIDNYFGCKAMSYDVKKITENSSFRPCFCVFDILYYNDRPLIGSPDKGGLPLRERLKILDDLFIDKRGVIEHSKRKIIKESSEVVDAVNDAIDNQDEGIVVKDINSYYIANKRNAGWYKIKPEYTDDTMNDLDLVVVGADEATNKRQGRAKSFYVACGDNNDGDPVWTCIGRVSNGLKHEEKERVCSLLERNWCMYRKKPPPPCLRFGKDKPDFWILPEHSIVLQVRATELLSVGDSHVLRFPRVEDIRSDKPVDDVCTIHELRQLAVSRSPVSKLSTKRVNESQIDQNYIKTRKRGLSKTVQVAEKFRTKTIGDVQVISRALFGKKLCVLSDDEDCKKTELKRVIESHGGRHVENPGSDTWCCVVGTITPRARRLIETQDLDIISTAWLRSLPATDDPCQLSPLDMLSIKPETKLKLSLDYDPFGDSYKDEIDEKTLKKLLDKMDSEFPLYPTLKEKVCLDKQLFGANNPYSFLRNCFIHVIDNSLYETMASFFGAKICSLDDVRLTHVVMSKDANVKIDKGILVSDGWLEECFNKRSFVPVDDYLI-
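
The lengths reported in the record can be cross-checked against each other. The sketch below is an illustration rather than part of the original entry. It assumes the transcript consists only of coding sequence including one stop codon (the trailing `-` in the protein sequence), and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical.

```python
# Values copied from the record.
TRANSCRIPT_BP = 4989
PROTEIN_AA = 1662
GENOMIC_START, GENOMIC_END = 1330157, 1339068


def expected_cds_length(protein_aa: int, has_stop: bool = True) -> int:
    """Coding length in bp: three bases per residue, plus an optional stop codon."""
    return 3 * (protein_aa + (1 if has_stop else 0))


def genomic_span(start: int, end: int) -> int:
    """Span in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end - start + 1


cds_bp = expected_cds_length(PROTEIN_AA)
span_bp = genomic_span(GENOMIC_START, GENOMIC_END)
non_transcript_bp = span_bp - TRANSCRIPT_BP

print(cds_bp == TRANSCRIPT_BP)  # True: 3 * (1662 + 1) = 4989
print(span_bp)                  # 8912
print(non_transcript_bp)        # 3923
```

The genomic span being longer than the transcript would be consistent with a gene model that contains introns, although the record itself does not list exon boundaries.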