Monarch geneset OGS2.0

DPOGS207023
Transcript | DPOGS207023-TA | 3336 bp
Protein | DPOGS207023-PA | 1111 aa
Genomic position | DPSCF300001 + 1447486-1465391
RNAseq coverage | 247x (Rank: top 42%)
Annotation
Heliconius | HMEL015624 | 0.0 | 50.20%
Bombyx | BGIBMGA012964-TA | 0.0 | 78.78%
Drosophila | CG2025-PA | 0.0 | 36.46%
EBI UniRef50 | UniRef50_UPI00022CA9CB | 0.0 | 42.42% | UPI00022CA9CB related cluster n=3 Tax=unknown RepID=UPI00022CA9CB
NCBI RefSeq | XP_001599332.1 | 0.0 | 43.90% | PREDICTED: similar to metalloendopeptidase [Nasonia vitripennis]
NCBI nr blastp | gi|345478824 | 0.0 | 42.80% | PREDICTED: nardilysin-like [Nasonia vitripennis]
NCBI nr blastx | gi|345478824 | 0.0 | 42.90% | PREDICTED: nardilysin-like [Nasonia vitripennis]
Group
Gene Ontology
GO:0046872 | 5e-76 | metal ion binding
GO:0003824 | 5e-76 | catalytic activity
GO:0006508 | 4.9e-35 | proteolysis
GO:0004222 | 4.9e-35 | metalloendopeptidase activity
GO:0008270 | 1.2e-16 | zinc ion binding
KEGG pathway 
InterPro domain
[326-572] | IPR011249 | 5e-76 | Metalloenzyme, LuxS/M16 peptidase-like, metal-binding
[121-320] | IPR011237 | 9.9e-61 | Peptidase M16, core
[118-245] | IPR011765 | 4.9e-35 | Peptidase M16, N-terminal
[277-458] | IPR007863 | 1.2e-16 | Peptidase M16, C-terminal
Orthology group | MCL10523 | Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207023-TA
ATGTCTAAAAGAAATATCTTCCACAGAACACCTAAATTTAAACCTGAAGTGAAAATGGCTAGTGCCCGTCAGAAACAATCTACCAGTATGAATAGCAAGAAGGTGAAAGTTGAAGTTCTTCCTGAACCCATCAAGTCTACATCTGATAAAAAACTATATAAGACAATAAAGCTAGAAAATGGATTGACAGCACTACTCATATCAGATCCCAGCCGACAGTTTGTACCAGAGGAACTGAGTTCTAGCGAAGAAGAATCCAGCGGTACTGATGAAAGCTCAGGACTCGAAAGTGACAGTGGCAAGTCTGGTGGCAGTGACCAACACGGAACTAAGAAGAGAGGGGACTCGGACGAAGAGAAACTAGCAGCTTGCGCGTTATGCGTCGGTGTTGGAAGCTACAGTGACCCTCACGACATTCAAGGGTTGGCCCACTTCGTTGAGCACATGGTGTTTATGGGAAGCGAGAGGTATCCAAAGGAGAATGAATTTGACGCTTTCATTAAGAAAAAAGGGGGTTCGGACAACGCGTCCACGGACTGCGAATTGACGACATTCTACTTTGAGATTCAAGAGAAACATCTTCCGCACGCTATGGACATGTTCAGCCAGTTCTTCGTGAGCCCGCTCATGATGAAGGAGGCCATGCAAAGGGAACGTGAGGCCATCGAATCGGAATTCGCGATCGCCTCTCCGTCCGACTCGAATCGTAAGGACCAGTTGTTGTCAAGTCTGTTCCCGGAGAACCACCCAGCTCGCACATTCACCTGGGGAAACCTGAAGAGTCTCAAGGAGGATATAGACGATGATAACAGACTTCACACTGCAGCTCATGAGTTCAGGAAGAGGCATTACAGCGCTCATAGGATGACTGTAGCGGTTCAGGCTCGCATGGACCTCGCATCACTGGAACAGTACGTGGTGAACACATTCGGTCAGATACCAACAAACAGGCTGCCACCAGAAGACTTCTCCGATTTCAAGTTCAGTCCACGGACCATTACACCGGAGTTCACCAGCATTTACTATGTGAAGCCGGTCAGCGATACTACTGAGGTCCATTTGACTTGGTGTATGCGGTCTCTACTGTCCGAATACGAGTCAAAGCCTCACCAGTACATATCATACCTACTGGGACACGAGGGCAAGGGCAGTTTGCTCTCTTATTTAAGAAAAAAGGTGTGGGCGTTGGCTATATATACTGGCAACTCTGAGAGCGGTATAGACTATACATCCATGTACAGTTTGTTCTCAACGCAAGTGGTGCTGACAGAAGACGGATTAGCAAATATTGACAAAGTGCTGGAAGCGATATTTTCATATATCAATATGCTTAAAAAGCTCGGACCATCTGAGAGGATCTATGACGAAATAAGGACAATAGAAGAGACCAGTTTCCGTTTCGACGAGGAATCTCAGCCGTCGGACTATGTGGAGACATTGTCGGAGAATATGCACTTCTTTCCGCCACAACATTACATAACAGGGGATCGCCTGTACTACAAATATGACCCTAAGGGTATTAAAAGCTTACTCGATCTCATGAGAGCAGACACTGTCAATATAATGATACTCAGCAACAAACATCCTAAGCCGATCAAATATGATAGTAAAGAGAAATGGTTTGGCACGGAGTACAAGAGGGAGGCTATAAACCCGGCGTGGTTGAAGAAATGGTTATCAGTCACGCCCTACAGCCAGTTCCACCTGCCGGAGAAGAACGTGTACATCACGACCAACTTTGATCTCATTCAACCAGCTAAACCATATTTAGAGGAAGCTGAACGTTTGGGGATAGATCTCATCAATAATTCAGCAAAAGATATACACAGGAAGGTAGCTGCGAACGAATTTACAAGCAAGGTCCTTAAACACGGCGAACTTATGGCCACCGTCAATAGATTCAGGCTCGACCAGCCAAACCTCCTTCGCAAGAACCGGCACATGGAGCTGTGGTATAAACCCGATTTTAAATTCCGTTTCCCAACAGCGCTGTTGTACTT
CTACTTCATAACACCATTAAGTCTCAAGTCTCCGAGAGAGGCTTGCCTACTTGATCTCTGGAGCGACGTGCTACAACAGGGACTTAAGGAAGACGTCTATCCCGCCAATATGGCGGATCTGACGCATTTGTTGTACGTCACCGACAGAGGTCTGACCCTGAAAATCTCTGGGTACAGTCAGAATCTTCACCTGGTTGTGTCTCTGATATCACGCGCGATGCGCGACTCTGCCCGCATGCCGCACGCTCTGTTCGAGGCTGTGCGCGACGTTCGTGCGAGGACCTACCATAACGTCCTAATCAAACCGCACAAACTGGCCAAGGATGTCCGTATGAGCCTTTTACTGGAGCCCTATATGTCGCCACGTGACAAGGCGACCTTCATACAGAACGTCACTTTGCCGGAACTACAGGACTTCACACAGAAGTTGCTCAATAAGATGTACCTACAGATTCTTGTGCAAGGTAACCTGGCTTGGCACGAGGCTGTGACTATATCAGAGAATGTTTTGAAAACAATAAAATGGGATGGACTAGAACCACACGAGATCCCTGACATCAAAGTTCACCAGTTACCACTTGGAGAGCGTAAAATCCGCGTGGCTAGCCTCAACCCGTCATCAACGAACAGTATCGTCACCAACTACTACCAGGGGGAGAGGAGCACGCCGCAGGAGGCCGCCGCGCTTGAAGTACTAATGATGCTGATGGAAGAACCAGTTTTCGATGCTCTTCGTACTAAGGAGCAGCTTGGATACAGCGTGTTCAGCATGATGCGTTACACCTTCGGCGTGTTGGGCTTCTCGATTACTGTTAACACTCAAGTCGACAAGTTCAGCGTATCCCATGTTGATCGTCGAGTGGAGGCGTTCCTCAAGAAGTTCGCTCGTGATGTGAAGAGGGGTGGGGAGAGGGCGCTGGCGGCGGCCAGGCACGCGCTGGTGCAGCTCAAACATACCGCTGACTACGAGCTCAAGGAAGAGGTTGAGAGAAACTGGCGCGAAATCCTGACCCAAGAATACCAGTACCAACGTCTATTTGTCGAGGCTGACGCCATAGAGAGAATCAAACTGTCTGATATCAAAAACTGGATAGATAACCACTTCCCCTCAGGAAACAGGTCGCAGTTCAGGAAACTATCAGTACAGGTCGTGGGTAACAAGCCGCAAGATGAAAGCGTGGACGGACCTAAAAAACTATCACTAATTTATTCCAATGCCAGCGAGAACAGCGGCGACCCCACAGAGAACGAAGCTGACTTCATCAAAAACATAGAAATATTCAAGACAGACCTGCCTCTCATAAATGTACCGAAAGTTGAATTAGCGCAATGTTAA

Protein sequence:

>DPOGS207023-PA
MSKRNIFHRTPKFKPEVKMASARQKQSTSMNSKKVKVEVLPEPIKSTSDKKLYKTIKLENGLTALLISDPSRQFVPEELSSSEEESSGTDESSGLESDSGKSGGSDQHGTKKRGDSDEEKLAACALCVGVGSYSDPHDIQGLAHFVEHMVFMGSERYPKENEFDAFIKKKGGSDNASTDCELTTFYFEIQEKHLPHAMDMFSQFFVSPLMMKEAMQREREAIESEFAIASPSDSNRKDQLLSSLFPENHPARTFTWGNLKSLKEDIDDDNRLHTAAHEFRKRHYSAHRMTVAVQARMDLASLEQYVVNTFGQIPTNRLPPEDFSDFKFSPRTITPEFTSIYYVKPVSDTTEVHLTWCMRSLLSEYESKPHQYISYLLGHEGKGSLLSYLRKKVWALAIYTGNSESGIDYTSMYSLFSTQVVLTEDGLANIDKVLEAIFSYINMLKKLGPSERIYDEIRTIEETSFRFDEESQPSDYVETLSENMHFFPPQHYITGDRLYYKYDPKGIKSLLDLMRADTVNIMILSNKHPKPIKYDSKEKWFGTEYKREAINPAWLKKWLSVTPYSQFHLPEKNVYITTNFDLIQPAKPYLEEAERLGIDLINNSAKDIHRKVAANEFTSKVLKHGELMATVNRFRLDQPNLLRKNRHMELWYKPDFKFRFPTALLYFYFITPLSLKSPREACLLDLWSDVLQQGLKEDVYPANMADLTHLLYVTDRGLTLKISGYSQNLHLVVSLISRAMRDSARMPHALFEAVRDVRARTYHNVLIKPHKLAKDVRMSLLLEPYMSPRDKATFIQNVTLPELQDFTQKLLNKMYLQILVQGNLAWHEAVTISENVLKTIKWDGLEPHEIPDIKVHQLPLGERKIRVASLNPSSTNSIVTNYYQGERSTPQEAAALEVLMMLMEEPVFDALRTKEQLGYSVFSMMRYTFGVLGFSITVNTQVDKFSVSHVDRRVEAFLKKFARDVKRGGERALAAARHALVQLKHTADYELKEEVERNWREILTQEYQYQRLFVEADAIERIKLSDIKNWIDNHFPSGNRSQFRKLSVQVVGNKPQDESVDGPKKLSLIYSNASENSGDPTENEADFIKNIEIFKTDLPLINVPKVELAQC-
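
The lengths reported in this record (3336 bp transcript, 1111 aa protein) agree if the listed transcript is taken to be the coding sequence including its terminal stop codon, which the trailing "-" in the protein sequence suggests. The following is a minimal sketch of that arithmetic check, not part of the database itself; the function names are hypothetical and chosen only for illustration.

```python
# Stop codons of the standard genetic code (assumed to apply here).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a CDS of cds_bp bases that ends with a stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1  # drop the terminal stop codon


def looks_like_complete_cds(seq: str) -> bool:
    """True if seq is a multiple of 3 long, starts with ATG, and ends with a stop codon."""
    seq = seq.upper()
    return len(seq) % 3 == 0 and seq.startswith("ATG") and seq[-3:] in STOP_CODONS


# Transcript length taken from the record above.
print(expected_protein_length(3336))  # 1111

# First and last three codons of DPOGS207023-TA, joined only for illustration.
print(looks_like_complete_cds("ATGTCTAAA" + "CAATGTTAA"))  # True
```

In practice the full FASTA sequence would be passed to `looks_like_complete_cds` rather than the shortened string used here.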