Monarch geneset OGS2.0

DPOGS210623
Transcript        DPOGS210623-TA   1329 bp
Protein           DPOGS210623-PA   442 aa
Genomic position  DPSCF300168 + 443598-446378
RNAseq coverage   9x (Rank: top 85%)
Annotation
Heliconius       HMEL009167         1e-133   58.06%
Bombyx           BGIBMGA013481-TA   2e-50    61.76%
Drosophila       spel1-PA           7e-23    25.53%
EBI UniRef50     UniRef50_E0VH19    3e-71    35.73%   Putative uncharacterized protein n=1 Tax=Pediculus humanus corporis RepID=E0VH19_PEDHC
NCBI RefSeq      XP_002425413.1     5e-72    35.73%   conserved hypothetical protein [Pediculus humanus corporis]
NCBI nr blastp   gi|242009271       1e-70    35.73%   conserved hypothetical protein [Pediculus humanus corporis]
NCBI nr blastx   gi|242009271       3e-68    35.50%   conserved hypothetical protein [Pediculus humanus corporis]
Group
Gene Ontology    GO:0005524   4.6e-32   ATP binding
                 GO:0006298   4.6e-32   mismatch repair
                 GO:0030983   4.6e-32   mismatched DNA binding
KEGG pathway
InterPro domain  [194-310]   IPR000432   4.6e-32   DNA mismatch repair protein MutS, C-terminal domain
                 [1-193]     IPR007696   1.8e-19   DNA mismatch repair protein MutS, core
Orthology group  MCL10692   Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210623-TA
ATGTGTGAAAATACCAAAAACGAATATTTGGAACAATTGGCGTCTTTCGACACGAACAAGCTGTATGAGATGGCGCTCTATATGAATAGGATCATAGACTTCGATGCTTCCAAGACTGAAGGCAAATTCACGGTGAAGGCGGGCGTCGACTCCGAGCTCGACATCAAGAAACAAACAATGGCGAGTCTTCACGGCCTGATGACTGAGACGGCGAAGGTGGAGCTGGAGCGGCTGCCGAGCTACATCGAAGAGTGCACCATGATGTACATGCCGCATCTGGGGTACCTGCTGGGGGTCCGGGTGTGGAGGGACGACCTCGCGCCCGGGGACAAGGAGCTGCGACACATGAGGTTCATGTTTCAGAACAACGACTACATACACTACAAGAGCAAAGGATGCGAGGAGCTGGACGTGTTGCTGGGCGACACGTACCCGGAGGTGGCGGCTCACGAGACACGCATCATGATGAGGCTGACGGCCGTACTGCTGGAACACCTGCACACTCTCACCGACCTCGTGGACCGCTGCGCAGAGCTCGACTGTGTGATGACGATATCGAAGGTGTGCAAGGAGTACAGCTTCGTCCAGCCCTCGCTGACCGCGGAGAAGAGGCTGGTGATACAGCAGGGCCGGCACCCGCTGCTGCTGGCGGCCGGCGACCCCGCCGTCCCCAACGACCTGCGCTGCTCCGAGCACCGCGGGTACATCAAGATCATCAGCGGACCCAACTCCAGCGGGAAGTCCGTCTACATCAGGCAGACCGGTCTGATAGTGTACTTGGCGCACATCGGAAGCTTCGTGCCGGCGGAAAGCGCCACCGTAGGCATCGTTACTCACATATTCTCTCGGATTCAATGCACGGAGAGTATCGCCACGCACATGTCCGCCTTCCTGATAGACCTCCGGCAGTTGTTAATCTTAGCACGTGCTAGTTCTCGACGTTGGTTTAACCTGGTAGCTGCCCTCGCCCGCTACCATGTTCCTAATCATCAGATCATCAACTTACACTTCCAGATGTCGCTAGCCGTCCGCACGTGTTCCAGCCGGTCGCTGGTGCTGGTGGAGGAGGCGGGGGCGGGCACGGCGGCGGCTGGAGGCCTGGCGCTCCAGGCGGCCGCGATACACTCGCTGACGTCACTCTCCCCCTTCACCCTACTAGCCACACACAGCGACCTTCGACCATACGTCATAGACAACACACGAGTCACGTTCATGGTGAGGATGAGGGACACAACAACGACCTGGCGCCGCCCATCAGGGGCCACTGATATGAATCAGGGGACAATAGATTCTATTATAATGTACCTTCTTCATCCTCGTACAATCTGA

Protein sequence:

>DPOGS210623-PA
MCENTKNEYLEQLASFDTNKLYEMALYMNRIIDFDASKTEGKFTVKAGVDSELDIKKQTMASLHGLMTETAKVELERLPSYIEECTMMYMPHLGYLLGVRVWRDDLAPGDKELRHMRFMFQNNDYIHYKSKGCEELDVLLGDTYPEVAAHETRIMMRLTAVLLEHLHTLTDLVDRCAELDCVMTISKVCKEYSFVQPSLTAEKRLVIQQGRHPLLLAAGDPAVPNDLRCSEHRGYIKIISGPNSSGKSVYIRQTGLIVYLAHIGSFVPAESATVGIVTHIFSRIQCTESIATHMSAFLIDLRQLLILARASSRRWFNLVAALARYHVPNHQIINLHFQMSLAVRTCSSRSLVLVEEAGAGTAAAGGLALQAAAIHSLTSLSPFTLLATHSDLRPYVIDNTRVTFMVRMRDTTTTWRRPSGATDMNQGTIDSIIMYLLHPRTI-
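
The transcript and protein lengths listed above are consistent with each other: 442 residues plus one stop codon gives (442 + 1) x 3 = 1329 bp. The following is a minimal Python sketch for checking that relationship and for translating fragments of the nucleotide sequence. It assumes the standard genetic code (the record does not state which translation table was used) and writes the stop codon as "-", as the protein sequence above does. The names `translate` and `expected_cds_length` are hypothetical helpers, not part of any database tooling.

```python
# Standard genetic code, listed in TCAG order.
# Assumption: the record does not say which translation table was used.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence, one codon at a time."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    # The record marks the stop codon with "-" rather than "*".
    return protein.replace("*", stop_symbol)


def expected_cds_length(protein_length):
    """Length in bp of a coding sequence: residues plus one stop codon."""
    return (protein_length + 1) * 3


if __name__ == "__main__":
    # First 24 bases and last 18 bases of DPOGS210623-TA, copied from above.
    print(translate("ATGTGTGAAAATACCAAAAACGAA"))
    print(translate("CATCCTCGTACAATCTGA"))
    print(expected_cds_length(442))
```

Passing the full DPOGS210623-TA sequence to `translate` and comparing the result with DPOGS210623-PA would extend the same check to the whole record.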