Monarch geneset OGS2.0

DPOGS208174
Transcript        DPOGS208174-TA  2598 bp
Protein           DPOGS208174-PA  865 aa
Genomic position  DPSCF300207 - 162129-171026
RNAseq coverage   141x (Rank: top 55%)
Annotation
Heliconius      HMEL015720        6e-140  54.13%
Bombyx          BGIBMGA010261-TA  0.0     73.96%
Drosophila      Msh6-PA           2e-149  54.43%
EBI UniRef50    UniRef50_E2BJ16   1e-149  53.57%  Probable DNA mismatch repair protein Msh6 n=7 Tax=Formicidae RepID=E2BJ16_HARSA
NCBI RefSeq     XP_001600292.1    2e-157  55.11%  PREDICTED: similar to DNA mismatch repair protein muts [Nasonia vitripennis]
NCBI nr blastp  gi|383847693      7e-157  53.69%  PREDICTED: probable DNA mismatch repair protein Msh6-like [Megachile rotundata]
NCBI nr blastx  gi|383847693      1e-150  54.67%  PREDICTED: probable DNA mismatch repair protein Msh6-like [Megachile rotundata]
Group
Gene Ontology   GO:0005524  5.6e-108  ATP binding
                GO:0006298  5.6e-108  mismatch repair
                GO:0030983  5.6e-108  mismatched DNA binding
KEGG pathway    nvi:100115613  5e-157
                K08737 (MSH6) maps to:
                    Colorectal cancer
                    Pathways in cancer
                    Mismatch repair
InterPro domain  [69-848]   IPR015536  1.3e-254  DNA mismatch repair protein MutS-homologue MSH6
                 [634-832]  IPR000432  5.6e-108  DNA mismatch repair protein MutS, C-terminal domain
                 [377-589]  IPR007696  1.6e-28   DNA mismatch repair protein MutS, core
                 [196-309]  IPR016151  1.7e-18   DNA mismatch repair protein MutS, N-terminal
                 [205-264]  IPR007695  5.9e-15   DNA mismatch repair protein MutS-like, N-terminal
                 [452-542]  IPR007861  3.8e-14   DNA mismatch repair protein MutS, clamp
                 [257-347]  IPR007860  2.6e-09   DNA mismatch repair protein MutS, connector
Orthology group  MCL13947  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208174-TA
ATGTCAAAACGTAATTCAAATCCTGGTGCGAATACACTCTTCAATTATTTTACTAAAACTCCGCCTTGCAATAAAAAACTAAAACCAAGTGAGGATTCTGAAGCCGATAATGTTTTAAATTCCCCCGTGAGCAGTAAAAAAGGGAATAAAACTGAAAGCAAGAAGCGAGAAAGACAAGCGACACCTTCACCCGATCCGAGAAAGAGTGATAGTGAAGACGACGTTCCAGTTGTTGTTAAAAAGAGGAAAAGAATTAGACTTAATCCAGTTGACTCCGATGACTCTGATATTGAAAACAAAGTAGATAATAAGATTGGTTCACCAGAGGATAAAGTTTCTTTATCCACCAGGAAGTTGCAGGATAATTTCACTTTTGGATCTCCTAAGAGTGCATCACCTAAAGTCACCAAGACTAAAAAAAATCCAAATGAAGCTCAGCCCACCATCAAAGAAGAACCAACGTCCCAATACACAGAGGATGGTAACTGGGTTCATTGTAAACTGGATTGGCTGAAACCGGAGAAAATCAGAGATGCTACGAAAAGGAAACCAGATCATCCCGACTATGATCCCAGCACTTTATATGTTCCGCCGGACTTTATGAAGAGTCAGACACCAGCTCACAGGCAATGGTGGGAAATGAAGTCTAAGTACTATGACTGCGTATTGTTCTTCAAAGTTGGAAAATTCTATGAGCTGTATCACATGGACGCCGCTGTTGGGGTCAATGAGCTCGGATTCTCTTATATGAAGGAATACAATGGCGTTAGCAAGTACGGCGTTTGTTTCGTAGATACGACGACAGGACAGTTCTACATCGGTCAGTTTGAGGATGACAAACATTCATCTCGTCTCCTCACCACCGTTGCACATTATCCGCCAGCTTTAATTGTATTCGATCGTAAAACAACAAGTGCTCGTACAAGTAGACTGCTGTCAACGCATTGTCACAGCGCGAGACGTGAACCCACTACACTGTGGGCTCCCGAAAAGACTTTGAAGATTCTAGCTGAGAAATATTATAAAACTGACGGCGACGGAAAATGGCCTACCGGGATTACGCCTTTCCTACACGAGGAGCAAAAATGTCATCCGGACTCCAGAGCTATATTTTATGAAGAAAAAACTTATTCGAAGAGAAAAGTATTGGATTTCATACTATTGTTGAACGGGTTCACGTCTATATTGAAGCTGGTTGACTTATTCTCCGATGTGGATGCAGAGTTACTGAAGAAATTAACCCAATTTGCTCCGGAAGGCAGATTTCCTGATTATAGAGATACTTTGAAATTTTTCAAGGAGGGTTTCAACCAACAAGAGGCGGAGAAAGAAGGTCGTATACTACCTGGTAGCGGTGTTGACGCAGACTACGACAACACTATACAACTCATACAGAACATACAGGATGAATTGAAGGAATACTTGAGTGAGCAGGAGAGATACTTCAAATGTCGGTTAACGTATGTTGGAAGTGATAAGAAACGTTATCAAATAGAAGTTCCACAGAGCGCAGCGGGGAAGGCAAATTCTGATTATCATCTAGAAGGTGCTAGGAAAGGATTCAAGAGATATTCAACAGTTGAAACAAAGGATCTGCTGGCGCGAATGATAGCCGCCGAGGAAAAGAAAAGTAACGTACTGAAAGATCTTAGCAGACGGATGTTCGAGAAGTTCTCATCGCATCAGCACCAGTGGGAAATGGCCACCAAATGTGTCGCCACTATCGATATATTGTTAGCATTCACAGAGTTCGCTAGGCAACAGACTGGGGATATCTGTCTACCGGAAATCACGTACAATAAGGACCAAGAGCCCTACATAGACATAGTGGAGGGTCGCCACCCGTGTATTTCTATACCAGAGTTCATTCCTAATGATACGAGGCTGGGTGTTGACAACCCTCGCCTGCTGCTGCTGACTGGTCCCAACATGGGCGGCAAGTCTACACTCATGAGACAAGTCGGACTCCTCACCGTGTTAGCGCATCTGGGCTGCCACGT
ACCAGCTTCAGAATGTCGTCTGAGTGTGTGTGACCGTATCTTCACCAGACTGGGGGCCTCGGATGATATTCTGTCCGGTCAGTCGACGTTTTTGGTTGAAATGAATGAGACAGCGGCCATAGTGAAGCACGCGACCAAACACTCGCTGGTACTACTGGATGAATTAGGTCGCGGTACATCTACATACGATGGTACGTGCATCGCGTGGTCAGTATGCTGGTGGCTGGCTGGCCGGTCGTGTCGCACGCTGTTCTCAACTCACTATCACTCGCTAGTCCATCACCTGGCTGATCATCCCGCCGTACTTTTAGGACATATGGCGTGCATGGTAGAGACCGACGAATCTGCCCCGGATGGTGACCATATACCGGAGGAAACGATAACCTTTTTGTACAAACTCTCCCCCGGTGCCTGTCCGAAGTCATACGGCTTCAACGCGGCGCGGCTAGCGGGGATCCCCCGGGAAATAACGCAACGCGCACACACGATATCACGCAACCTGGAGAGCGAGGCGACGTGTGTACGCGCCTTTAGAGATGTCATCAAAACGGACAACGCGGCTGAGTTGAGGAAAATATTGTCAGCCCTGACCATATAA

Protein sequence:

>DPOGS208174-PA
MSKRNSNPGANTLFNYFTKTPPCNKKLKPSEDSEADNVLNSPVSSKKGNKTESKKRERQATPSPDPRKSDSEDDVPVVVKKRKRIRLNPVDSDDSDIENKVDNKIGSPEDKVSLSTRKLQDNFTFGSPKSASPKVTKTKKNPNEAQPTIKEEPTSQYTEDGNWVHCKLDWLKPEKIRDATKRKPDHPDYDPSTLYVPPDFMKSQTPAHRQWWEMKSKYYDCVLFFKVGKFYELYHMDAAVGVNELGFSYMKEYNGVSKYGVCFVDTTTGQFYIGQFEDDKHSSRLLTTVAHYPPALIVFDRKTTSARTSRLLSTHCHSARREPTTLWAPEKTLKILAEKYYKTDGDGKWPTGITPFLHEEQKCHPDSRAIFYEEKTYSKRKVLDFILLLNGFTSILKLVDLFSDVDAELLKKLTQFAPEGRFPDYRDTLKFFKEGFNQQEAEKEGRILPGSGVDADYDNTIQLIQNIQDELKEYLSEQERYFKCRLTYVGSDKKRYQIEVPQSAAGKANSDYHLEGARKGFKRYSTVETKDLLARMIAAEEKKSNVLKDLSRRMFEKFSSHQHQWEMATKCVATIDILLAFTEFARQQTGDICLPEITYNKDQEPYIDIVEGRHPCISIPEFIPNDTRLGVDNPRLLLLTGPNMGGKSTLMRQVGLLTVLAHLGCHVPASECRLSVCDRIFTRLGASDDILSGQSTFLVEMNETAAIVKHATKHSLVLLDELGRGTSTYDGTCIAWSVCWWLAGRSCRTLFSTHYHSLVHHLADHPAVLLGHMACMVETDESAPDGDHIPEETITFLYKLSPGACPKSYGFNAARLAGIPREITQRAHTISRNLESEATCVRAFRDVIKTDNAAELRKILSALTI-
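
The lengths and sequences in this record can be cross-checked against each other. The sketch below is illustrative only and is not part of the original record. It assumes the listed transcript is coding sequence only (it begins with ATG and ends with TAA), that the standard genetic code applies, and that the trailing "-" in the protein sequence marks the stop codon. The function names `cds_matches_protein` and `translate` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def cds_matches_protein(transcript_bp: int, protein_aa: int, has_stop: bool = True) -> bool:
    """Check that a CDS of transcript_bp bases can encode protein_aa residues.

    Assumes the transcript contains no untranslated regions.
    """
    if transcript_bp % 3 != 0:
        return False
    expected_codons = protein_aa + (1 if has_stop else 0)
    return transcript_bp // 3 == expected_codons


def translate(cds: str, stop_symbol: str = "*") -> str:
    """Translate a coding sequence codon by codon using the standard code."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Lengths from the record: 2598 bp transcript, 865 aa protein plus one stop codon.
print(cds_matches_protein(2598, 865))

# First ten codons of DPOGS208174-TA, compared with the start of DPOGS208174-PA.
print(translate("ATGTCAAAACGTAATTCAAATCCTGGTGCG"))

# Last three codons of DPOGS208174-TA, using "-" for the stop as in the record.
print(translate("ACCATATAA", stop_symbol="-"))
```

To check the whole entry, join the wrapped nucleotide lines into one string before passing it to `translate`, then compare the result with the protein sequence.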