Monarch geneset OGS2.0

DPOGS210622
Transcript        DPOGS210622-TA    1356 bp
Protein           DPOGS210622-PA    451 aa
Genomic position  DPSCF300168 + 439182-443130
RNAseq coverage   14x (Rank: top 82%)
Annotation
Heliconius        HMEL009167        5e-97   66.40%
Bombyx            BGIBMGA013481-TA  6e-64   53.54%
Drosophila        (no hit listed)
EBI UniRef50      UniRef50_E2AZD3   3e-40   32.08%   MutS protein-like protein 5 n=5 Tax=Formicidae RepID=E2AZD3_CAMFO
NCBI RefSeq       XP_001122595.1    8e-46   33.98%   PREDICTED: similar to mutS homolog 5 isoform c [Apis mellifera]
NCBI nr blastp    gi|307167669      1e-39   32.08%   MutS protein-like protein 5 [Camponotus floridanus]
NCBI nr blastx    gi|307167669      3e-40   32.08%   MutS protein-like protein 5 [Camponotus floridanus]
Group
Gene Ontology     GO:0005524   2.2e-16   ATP binding
                  GO:0006298   2.2e-16   mismatch repair
                  GO:0030983   2.2e-16   mismatched DNA binding
KEGG pathway      (none listed)
InterPro domain   [224-347]   IPR007696   2.2e-16   DNA mismatch repair protein MutS, core
                  [38-235]    IPR007860   4.1e-06   DNA mismatch repair protein MutS, connector
Orthology group   MCL26079    Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210622-TA
ATGACTGTGGAGAAAAGTAATGTTACAGAGTTCACAAGTGGACGTACGGAAGAAAGATATGTGATTGGAAAATGCCACAGAAACACTGAGAAAGGGGAAACAGCAGAGGGTATGCAAGAAACCACCGAGGAAACTGATAATGAAGAGTGTATACTAACAGTGTGCTGTCGAGCGGGCAGGATGGGAGCCGCGACCTATACTGTAGGGACCGGAGAGTTACACATACTGGAAGAGATAGTGGACCGTCCGCCGGATCATCAGCTGTTCCAGGCGTTGTTCCACCAGACGGAGCCGGTCAGGGTGGTGGTGGACGGCAAGGCTCAGGGCTCCTTCATCGCCGTCCTCAAGAAACTGGTGTTCAGCGACGACACTGAAGCCAAGTGCAAGCTCGACATGGTTTCCGCCAAGGAATACAATTTCGAAGCATGTAAGCGACGTATTTTCTCCCTCTCCCTCCCCCACGAGCCCGCCAACTGTTCCGACGAAGAGAGAACTCTGTTCGTGCGGACGGTGGTAGACTTTTCCCAGACTCAGACCGTCCACGCGCTGGGTGCCATGCTCAGGTACCTGGACCTGAACTGGTCCAACTGGAGCATGAACCTCCACAGCAGACCGGAATATCTCAGCTTGAAGAGAATATCTTTACAAGACATCGTATCGATAGACGAGGACACGTACAAAGGACTCCAGATCTTCAGTTCTCTGTCCCACCCCAGCGGCTTCAAGAGAGGAGTTCGAGGGACCAACAAGGAAGGACTCAGTCTGTTTCAGCTGTTTAGCAGATGTTCCTCCAAAGTTGGTCATCGACGTATGAGAGTGTTCCTGCGACACCCAACGACGGACCTTAAAATTTTAAAGAGACGACAACAAGCAATCGCCTTCTTCATGAGACCGCAGAGTGACTCCCTGTTCAGGAATATATGCGCGTCTCTGAGATTTGTGAAAAACGTCAATGGTATCCTCACTAAAATAAAAGCATTATCAGCTAAACCGTATCAATGGAAATCCTTGTACAACTGCCACTCTTATCACTCCTACATCCCGTCTAACACACTTCACCCATCTCCTCTCTTCACCTCTCTATCCACCAACATCAACACAACAACAACATCTACTCTCCTAACTTTCTCACTCACATTCCTATCTGCCCTTTCTTGTCACTCCACATATCCATCTCAAAATCTTATTAATGTCAACTTTCCCCAATCTCCGTTCTTCTCGTCTAGTCTTCAAGTCATAGTCACATTCTATAGCGAAAACGATGTCGTCGACAAAAAAGTTACACCAGGACACCTTCTGATTAACGGCTGTCATGGTATCAATTACCAACAGGAAGAGAAGAGGGCTTTAGGCTGA

Protein sequence:

>DPOGS210622-PA
MTVEKSNVTEFTSGRTEERYVIGKCHRNTEKGETAEGMQETTEETDNEECILTVCCRAGRMGAATYTVGTGELHILEEIVDRPPDHQLFQALFHQTEPVRVVVDGKAQGSFIAVLKKLVFSDDTEAKCKLDMVSAKEYNFEACKRRIFSLSLPHEPANCSDEERTLFVRTVVDFSQTQTVHALGAMLRYLDLNWSNWSMNLHSRPEYLSLKRISLQDIVSIDEDTYKGLQIFSSLSHPSGFKRGVRGTNKEGLSLFQLFSRCSSKVGHRRMRVFLRHPTTDLKILKRRQQAIAFFMRPQSDSLFRNICASLRFVKNVNGILTKIKALSAKPYQWKSLYNCHSYHSYIPSNTLHPSPLFTSLSTNINTTTTSTLLTFSLTFLSALSCHSTYPSQNLINVNFPQSPFFSSSLQVIVTFYSENDVVDKKVTPGHLLINGCHGINYQQEEKRALG-
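
Consistency check (illustrative sketch, not part of the original record): the transcript and protein records above can be cross-checked by translating the nucleotide sequence. The snippet below assumes the standard genetic code and assumes that the trailing "-" in the protein record marks the stop codon. The function name `translate` and the short test fragments (the first 30 and last 24 bases of DPOGS210622-TA) are chosen here for illustration only.

```python
# Minimal sketch: translate a coding sequence with the standard genetic code.
# Assumption: the record uses the standard code and writes the stop codon as "-".

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard genetic code).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon, in frame from position 0."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 bases and last 24 bases of the transcript listed above.
head = "ATGACTGTGGAGAAAAGTAATGTTACAGAG"
tail = "GAAGAGAAGAGGGCTTTAGGCTGA"

print(translate(head))
print(translate(tail))

# Length bookkeeping: 1356 bp is 452 codons, i.e. 451 residues plus one stop.
print(1356 // 3 - 1)
```

Passing the full DPOGS210622-TA sequence to `translate` and comparing the result with the DPOGS210622-PA record would extend the same check to the whole entry.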