Monarch geneset OGS2.0

DPOGS207757
Transcript: DPOGS207757-TA (1860 bp)
Protein: DPOGS207757-PA (619 aa)
Genomic position: DPSCF300042, 383615-396013
RNAseq coverage: 962x (Rank: top 13%)
Annotation
Database         Best hit          E-value  Identity  Description
Heliconius       HMEL011955        0.0      71.33%
Bombyx           BGIBMGA005310-TA  0.0      69.49%
Drosophila       Mmp2-PB           3e-67    49.10%
EBI UniRef50     UniRef50_D6WYY9   2e-127   46.89%    Matrix metalloproteinase 2 n=9 Tax=Coelomata RepID=D6WYY9_TRICA
NCBI RefSeq      XP_969495.1       1e-127   46.89%    PREDICTED: similar to matrix metalloproteinase [Tribolium castaneum]
NCBI nr blastp   gi|350398989      6e-137   53.42%    PREDICTED: matrix metalloproteinase-15-like [Bombus impatiens]
NCBI nr blastx   gi|350398989      2e-156   54.10%    PREDICTED: matrix metalloproteinase-15-like [Bombus impatiens]
Group
Gene Ontology
GO ID       E-value  Term
GO:0031012  9.3e-48  extracellular matrix
GO:0006508  9.3e-48  proteolysis
GO:0004222  9.3e-48  metalloendopeptidase activity
GO:0008270  9.3e-48  zinc ion binding
GO:0008237  9.9e-40  metallopeptidase activity
KEGG pathway 
InterPro domain
Position   InterPro ID  E-value  Description
[371-570]  IPR000585    3.6e-65  Hemopexin/matrixin
[56-206]   IPR024079    3.6e-60  Metallopeptidase, catalytic domain
[57-202]   IPR001818    9.3e-48  Peptidase M10, metallopeptidase
[37-203]   IPR006026    9.9e-40  Peptidase, metallopeptidase
[69-84]    IPR021190    8.5e-34  Peptidase M10A, matrix metallopeptidase
[526-566]  IPR018487    3e-09    Hemopexin/matrixin, repeat
Orthology group: MCL12459 (Insect specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207757-TA
ATGAAGCGACTCCGGGGTGACAACTTTTCGATGGTTCCGTTAAATCTTATTGCGCGTTTAGGTCTCGCATTGTTGGCGTGCATCACGAGGGAAGTAATCCCATCGGGTTCCGCCGGATATATCTCCCAGCGCGGATCTACAACTAATCGTCTGTCACAAGTTCGCCGACCATCGCAACTGGATCCTCACGGCACGAGGAGTGTGCTGGCTCGTGCCCTCGACATATGGGAACAAGCCTCCAGGCTCACCTTCACGGAGATCAACTCGGATGAAGCTGACATTGTTGTTTCCTTCGCCAAACGCTACCACGATGACGCCTATCCTTTCGACGGCCGTGGCTCCGTGTTGGCTCACGCCTTCTTCCCTGGAACTGGTCGAGGAGGTGACGCTCACTTTGACGACGACGAACTGTGGCTCTTGAGACCGAACAACGACGATGAGGAAGGTACGTCGCTGTTTGCCGTGGCCGTTCACGAATTCGGACACTCGTTGGGCTTGAGCCACAGTTCCGTCAAAGGAGCACTTATGTTCCCCTGGTACCAAGGCTTCCAACCAAACTTCGTTTTACCGGAAGATGACAGGAATGGTATTCAACAGATGTATGGTCCGAAGGTAAAGAAGACTTGGGCGAAGATACCTTATTATAGACCAGCTGAAACGCCACCAACCACAACGACTACTACCACCACCACCACAACAACAACCAGAAGACCATACATTCATCAACATCACCCAGAACGACATCCGAACCATCGTCCTTACACACCATACCCTCGTCCTCCAAATAGAAATCCAGTATATTACCCCGAACGTCCCACATTACCCGACAGACCTCATCACCCTGAAAGAAATTACCCCGATCGTAACCCCTACTACCCTGATCGTCGTCGTTATAACACTACGGAAGAACATCCCAGAAGAACGAACCACAATCACTATCCTCGACCAACAGAGACCACGACGCACGCTACAACTTATCGCCCACGTTACCCGCAGTCCCGACCAGAATATCCGAGCCATCCAAGACAGAATTACCCAACTGACCCGAGTCAAGATTACCCCAAAAGAAAACCCACTTATCCGGTTAAAACGACTACCATCAAACCGACCCCACCTGCCGACAAACCCGACACATGCGACACTAGTTACGACGCTATATCTCTAATACGTAATGAGCTCTTCATTTTTAAGAATAAGTATCATTGGAGGATTGGAGCCGACAGACGGTACACGGGCTATCCAATCGAGATTACTAGAATGTGGACCGGTTTACCAAAGAATTTAACTCACGTGGATGCCGTTTATGAAAGACCTGATCGGAAAATAGCTTTCTTTATTGGTAAGGTTAACATGAACAAACTAAGTGTATACTTGATGCCGGGATATCCCAAGAACCTCGCTCAACTCGGCTTACCGGAGAGCCTTGAGAAGTTAGACGCGGCCATGGTTTGGGGCTACAACGGAAAAACTTACTTCTACAGTGGCACCATGTACTGGAAATTCGACGAAGATCTGGGACGAGTTGAACTCGACTACCCTCGTGATATGGCTATGTGGAAGGGAGTTGGATACAACATAGATAGCGTGTTCCAATGGAAAGACGGCAAGACATACTTCTTCAAAGGCAAAGGTTTTTGGAAGTTCAACGATCTACAGATGCGAGTCGAGAACGAACGTCAAACCCCGTCCGCGCCTATCTGGATGTCATGTCCTATCGAACGGACGGGACGACGAGCGCCGTTCAGAGCACTCCCAGCCCCGGGATCAACACTACGCTCTCCCAGCCGAGCCACTCTCAACAAACATAGTCTACTGCCGTATCTAGCTTCTCTAATTATTATACTAGCCCGCTCTCTTTAG

Protein sequence:

>DPOGS207757-PA
MKRLRGDNFSMVPLNLIARLGLALLACITREVIPSGSAGYISQRGSTTNRLSQVRRPSQLDPHGTRSVLARALDIWEQASRLTFTEINSDEADIVVSFAKRYHDDAYPFDGRGSVLAHAFFPGTGRGGDAHFDDDELWLLRPNNDDEEGTSLFAVAVHEFGHSLGLSHSSVKGALMFPWYQGFQPNFVLPEDDRNGIQQMYGPKVKKTWAKIPYYRPAETPPTTTTTTTTTTTTTRRPYIHQHHPERHPNHRPYTPYPRPPNRNPVYYPERPTLPDRPHHPERNYPDRNPYYPDRRRYNTTEEHPRRTNHNHYPRPTETTTHATTYRPRYPQSRPEYPSHPRQNYPTDPSQDYPKRKPTYPVKTTTIKPTPPADKPDTCDTSYDAISLIRNELFIFKNKYHWRIGADRRYTGYPIEITRMWTGLPKNLTHVDAVYERPDRKIAFFIGKVNMNKLSVYLMPGYPKNLAQLGLPESLEKLDAAMVWGYNGKTYFYSGTMYWKFDEDLGRVELDYPRDMAMWKGVGYNIDSVFQWKDGKTYFFKGKGFWKFNDLQMRVENERQTPSAPIWMSCPIERTGRRAPFRALPAPGSTLRSPSRATLNKHSLLPYLASLIIILARSL-
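
Length check (illustrative):

The transcript and protein lengths listed for this record can be cross-checked with simple arithmetic: a coding sequence of 1860 bp corresponds to 620 codons, that is, 619 amino acids plus one stop codon (the trailing "-" in the protein sequence). The sketch below is a minimal Python example and is not part of the OGS2.0 pipeline. The function name `cds_is_consistent` and its parameters are hypothetical, and it assumes the transcript contains only coding sequence (no UTRs), which appears to hold here because the nucleotide sequence starts with ATG and ends with TAG.

```python
def cds_is_consistent(transcript_bp: int, protein_aa: int, has_stop: bool = True) -> bool:
    """Return True if a coding-sequence length matches a protein length.

    Assumes the transcript is pure coding sequence (no UTRs).
    `has_stop` says whether the transcript includes a terminal stop codon.
    """
    # A complete coding sequence must be a whole number of codons.
    if transcript_bp % 3 != 0:
        return False
    codons = transcript_bp // 3
    # The stop codon is counted in the nucleotide length but not in the protein.
    expected_codons = protein_aa + (1 if has_stop else 0)
    return codons == expected_codons


# Values taken from this record: 1860 bp transcript, 619 aa protein.
print(cds_is_consistent(1860, 619))
```

For a transcript that carries untranslated regions, the same check would have to be applied to the annotated coding region only.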