Monarch geneset OGS2.0

DPOGS208618
Transcript: DPOGS208618-TA; 3426 bp
Protein: DPOGS208618-PA; 1141 aa
Genomic position: DPSCF300052 + 619605-642191
RNAseq coverage: 200x (Rank: top 47%)
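
The lengths above can be cross-checked with simple arithmetic. The sketch below assumes the transcript length counts the coding sequence plus one stop codon, and that the genomic coordinates are 1-based and inclusive; neither convention is stated on this page, and the helper names are hypothetical.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3426
PROTEIN_AA = 1141
GENOMIC_START, GENOMIC_END = 619605, 642191


def cds_length(n_residues, with_stop=True):
    """Nucleotides needed to encode n_residues, optionally plus a stop codon."""
    return 3 * n_residues + (3 if with_stop else 0)


def span_length(start, end):
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


if __name__ == "__main__":
    # 3 * 1141 + 3 = 3426, which matches the listed transcript length.
    print(cds_length(PROTEIN_AA) == TRANSCRIPT_BP)
    # Genomic span covered by the gene model, introns included.
    print(span_length(GENOMIC_START, GENOMIC_END))
```

Under these assumptions the transcript length equals the coding length exactly, and the locus spans 22587 bp on the scaffold, far more than the 3426 bp transcript.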
Annotation
Heliconius: HMEL016585; 0.0; 77.52%
Bombyx: BGIBMGA005722-TA; 0.0; 81.17%
Drosophila: Nep3-PA; 0.0; 55.56%
EBI UniRef50: UniRef50_Q16Q77; 0.0; 60.17%; Endothelin-converting enzyme n=5 Tax=Neoptera RepID=Q16Q77_AEDAE
NCBI RefSeq: XP_317711.1; 0.0; 57.86%; AGAP007796-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|31226455; 0.0; 57.86%; AGAP007796-PA [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|31226455; 0.0; 57.86%; AGAP007796-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology: GO:0008237; 1.2e-121; metallopeptidase activity
GO:0006508; 1.2e-121; proteolysis
GO:0004222; 1.1e-62; metalloendopeptidase activity
KEGG pathway 
InterPro domain: [282-1141] IPR000718; 0; Peptidase M13, neprilysin
[375-771] IPR008753; 1.2e-121; Peptidase M13
[930-1141] IPR024079; 6.9e-100; Metallopeptidase, catalytic domain
[937-1140] IPR018497; 1.1e-62; Peptidase M13, neprilysin, C-terminal
Orthology group: MCL11436; Multiple-copy universal gene
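
The InterPro domain coordinates listed above can be summarized with a short sketch. It assumes the bracketed ranges are 1-based inclusive residue positions on the 1141 aa protein, and it reads each accession as a standard InterPro ID (IPR followed by six digits). The helper names are hypothetical.

```python
PROTEIN_LENGTH = 1141  # aa, from the record above

# (start, end, accession) for each InterPro hit in the annotation table.
DOMAINS = [
    (282, 1141, "IPR000718"),
    (375, 771, "IPR008753"),
    (930, 1141, "IPR024079"),
    (937, 1140, "IPR018497"),
]


def domain_length(start, end):
    """Residues in a domain, assuming 1-based inclusive coordinates."""
    return end - start + 1


def covered_residues(domains):
    """Number of distinct residues covered by at least one domain."""
    covered = set()
    for start, end, _accession in domains:
        covered.update(range(start, end + 1))
    return len(covered)


if __name__ == "__main__":
    for start, end, accession in DOMAINS:
        print(accession, domain_length(start, end), "aa")
    fraction = covered_residues(DOMAINS) / PROTEIN_LENGTH
    print(f"annotated fraction of protein: {fraction:.1%}")
```

All four hits fall within residues 282-1141, so under these assumptions the first 281 residues carry no InterPro annotation in this record.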
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208618-TA
ATGACGACTCCCGACTCTCCGTCCCAATCGACCGGGTCATCCGAGGACACCCCTACACCAGGACCAGAAAAGATAAATAAATTCAATTTCCTCAAACACGCTCATAAGGCGAGGAACAACCCGTTCCTCAAAGCACGACCGACACTCAACAGAGAAAGCTCTACAGCCGAATTATTTTCAAGAGAAAGCTCTAGTTCCGATTTCTTCCAGAATTCTAGGTTAAATCTATTTGATACAACACCGAAACAAGAAGCAAATGATGGCTATGGGAAGGGAGTTTCCTTCATAGGTTGGCATCAAAGAACGAAGTTGGAAAGAATTCTTCTCGTTATAACAGCTTTTCTTCTGGTGGCTATTCTTCTTACCACCTGTTTATTGCTGACCATCAAGGGACCGTCCTATATACCTGTGCCACCATGTTACAACCCTGAACCCAAAGAAGAGAGCAGCGTATGTTTATCAGGTTCGTGTATTTACACAGCCAGCGAGGTTATTAGAGCTTTGGATGAAACACAGGATCCCTGCGAGGATTTCTACGATTTCGCATGTGGTGGATGGTTGAAAAACAATCCCATACCAGAAGGGAAATCGAGCTGGGGTATATTCAGCAAGATTGAACTACAGAATCAACTAATTATTCGCTCGGCAATCGAAAAAGTTAACGTATCTGATAAGAACAGCGCTGAAACAAAAGCAAGAATATACTACGACGCGTGCATAGACGGGAATGAGACGATAGAGAAACTGGGTGAGAAACCTCTCATCAGTGTGATAAAGAAGCTGGGGGGATGGCATCTTGTAACGAACACGCTGGTTAAGCAGAGGAAATGGGATCTACAGAGACTCCTACAGGATGTTCAGAATACGTTAGTTGGAGTTTCCTTCATAGGTTGGCATCAAAGAACGAAGTTGGAAAGAATTCTTCTCGTTATAACAGCTTTTCTTCTGGTGGCTATTCTTCTTACCACCTGTTTATTGCTGACCATCAAGGGACCGTCCTATATACCTGTGCCACCATGTTACAACCCTGAACCCAAAGAAGAGAGCAGCGTATGTTTATCAGGTTCATGTATTTACACAGCCAGCGAGGTTATTAGAGCTTTGGATGAGACACAGGATCCCTGTGAGGATTTCTACGATTTCGCATGTGGTGGATGGTTGAAAAACAATCCCATACCAGAAGGGAAATCGAGCTGGGGTATATTCAGCAAGATTGAACTACAGAATCAGCTAATTATTCGCTCGGCAATCGAAAAAGTTAACGTATCTGATAAGAACAGCGCTGAAACAAAAGCAAGAATATACTACGACGCGTGCATAGACGGGAATGAGACGATAGAGAAACTGGGTGAGAAACCTCTCATCAGTGTGATAAAGAAGCTGGGGGGGTGGCATCTTGTAACGAACACGCTGGTTAAGCAGAGGAAATGGGATCTACAGAGACTCCTACAGGATGTTCAGAATACTTATAATCTGGGAGGATTCTTCAATTGGGCGGTTACGGAAGACGATAGAAATTCCTCAAAACACGTCATTGTGCTAGATCAAGGCGGACTAAATCTACCGACACGAGACAATTACCTAAACGCTACAGCCCACAAGAAAGTACTGGACGCCTATCTGGATTATATGACAAAGATATGCACATTACTGGGAGCTAACGAAACAGAAGCTAGGGCACAAATGTCGAAGGTTATACAGTTTGAGACGGAACTCGCGAATATAACCATCCCATCCGAGGATAGGAGGGATGAAGAGGGATTGTACAATCCGTATACCGTGAAGCAGTGGCAGAGGGAGGCGCCGTTCTTGAACTGGTCGATGTTCTTCAACGACGCCTTCAAACTCGTCAATAGGAGTATATCGGATAACGAGAGAATAGTTGTCTACGCGCCGGAATATTTCAGAAATTTAACAAGACTAGTAAGAAAATACAGCAAGAGTGAAGAGGATCAGAAAACACTGACGAGTTACATGATGTGGCAAGTGTCCCGTTCTTT
ATCGTCGTATTTGTCCAAATCTTTCCGTGACGCGACCAAAATATTGAGGAAGGCGCTGTTTGGATCCGAGGGCACCGAGGAGTCCTGGAGATACTGCGTCACGGATACCAACAACGCTGTTGGCTTCGCTGTCGGCGCGATGTTTGTGCGCGAAGTGTTCCATGGTGAGGCGAAGACTCAGGGCGAGATCATGATAGACAACATCCGAGCGGCTTTCAAGAAGAATTTGAAGAATCTCATCTGGATGGACGAAGAGACGAGAGATGCTGCGGAGATTAAGGCGGATGCTATCACTGATATGATAGGTTTCCCCGACTACATACTGAACAAAGACGAGCTGGACAAGCAGTACGAGGAGCTGGACGTAAGACCGAACAAGTACTTCGAGAACAACATCGCCTTCAACACGTACAGCCTGAAACATGATCTAAGGAAATTGGATAAACCCGTCAATAAAACTAAATGGGGCATGACACCGTCCACTGTGAACGCGTATTACACGCCCACCAAAAACCAGATAGTATTCCCCGCTGGTATTCTCCAACTGCCGTTCTATGATGGAGATAATCCCAAGAGCGTGAACTACGGAGCGATGGGCGTTGTTATGGGCCACGAGTTAACCCACGCGTTCGACGACCAAGGACGAGAATACGATAGATTCGGCAATTTGAACCGTTGGTGGAACAACGCTACCATAGCACGTTTCAAGCAAAGGACTCAATGCATTCAGAAACAGTATTCAACATACGAGATCGAAGGCCAGCATTTGAATGGAAAACAAACTCTCGGCATGACACCGTCCACTGTGAACGCGTATTACACGCCCACCAAAAACCAGATAGTATTCCCCGCTGGTATTCTCCAACTGCCGTTCTATGATGGAGATAATCCCAAGAGCGTGAACTACGGAGCGATGGGCGTTGTTATGGGCCACGAGTTAACCCACGCGTTCGACGACCAAGGACGAGAATACGATAGATTCGGCAATTTGAACCGTTGGTGGAACAACGCTACCATAGCACGTTTCAAGCAAAGGACTCAATGCATTCAGAAACAGTATTCAACATACGAGATCGAAGGCCAGCATTTGAATGGAAAACAAACTCTCGGCGAGAATATAGCAGACAACGGAGGTTTAAAGGCGTCGTTCCACGCTTATAAGGAGTACAGTAAAAACTCCAAAGTTAACCTCACTTTACCTGGATTGAAGTACAACCACAGACAATTGTTCTTCATATCTTTCGCTCAGGTATGGTGTTCAGCAATGACAAAGGAGTCGACGAAAATGCAAATCGAAAAGGACGATCACACCGTGGCCAAGTATAGAGTCATTGGACCAATATCGAACCTTCGAGAATTCTCTGAAGAATTCAATTGTCCCGTAGGAAGTAAAATGAACCCAAAACATAAATGCGAGGTATGGTAA

Protein sequence:

>DPOGS208618-PA
MTTPDSPSQSTGSSEDTPTPGPEKINKFNFLKHAHKARNNPFLKARPTLNRESSTAELFSRESSSSDFFQNSRLNLFDTTPKQEANDGYGKGVSFIGWHQRTKLERILLVITAFLLVAILLTTCLLLTIKGPSYIPVPPCYNPEPKEESSVCLSGSCIYTASEVIRALDETQDPCEDFYDFACGGWLKNNPIPEGKSSWGIFSKIELQNQLIIRSAIEKVNVSDKNSAETKARIYYDACIDGNETIEKLGEKPLISVIKKLGGWHLVTNTLVKQRKWDLQRLLQDVQNTLVGVSFIGWHQRTKLERILLVITAFLLVAILLTTCLLLTIKGPSYIPVPPCYNPEPKEESSVCLSGSCIYTASEVIRALDETQDPCEDFYDFACGGWLKNNPIPEGKSSWGIFSKIELQNQLIIRSAIEKVNVSDKNSAETKARIYYDACIDGNETIEKLGEKPLISVIKKLGGWHLVTNTLVKQRKWDLQRLLQDVQNTYNLGGFFNWAVTEDDRNSSKHVIVLDQGGLNLPTRDNYLNATAHKKVLDAYLDYMTKICTLLGANETEARAQMSKVIQFETELANITIPSEDRRDEEGLYNPYTVKQWQREAPFLNWSMFFNDAFKLVNRSISDNERIVVYAPEYFRNLTRLVRKYSKSEEDQKTLTSYMMWQVSRSLSSYLSKSFRDATKILRKALFGSEGTEESWRYCVTDTNNAVGFAVGAMFVREVFHGEAKTQGEIMIDNIRAAFKKNLKNLIWMDEETRDAAEIKADAITDMIGFPDYILNKDELDKQYEELDVRPNKYFENNIAFNTYSLKHDLRKLDKPVNKTKWGMTPSTVNAYYTPTKNQIVFPAGILQLPFYDGDNPKSVNYGAMGVVMGHELTHAFDDQGREYDRFGNLNRWWNNATIARFKQRTQCIQKQYSTYEIEGQHLNGKQTLGMTPSTVNAYYTPTKNQIVFPAGILQLPFYDGDNPKSVNYGAMGVVMGHELTHAFDDQGREYDRFGNLNRWWNNATIARFKQRTQCIQKQYSTYEIEGQHLNGKQTLGENIADNGGLKASFHAYKEYSKNSKVNLTLPGLKYNHRQLFFISFAQVWCSAMTKESTKMQIEKDDHTVAKYRVIGPISNLREFSEEFNCPVGSKMNPKHKCEVW-
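
The protein record corresponds to the transcript read in frame from its first base. The sketch below illustrates this on the first 18 and last 12 nucleotides of DPOGS208618-TA only, not on the full sequence. It assumes the standard genetic code and follows this page's convention of writing the stop codon as `-`. The function name `translate` is a hypothetical helper, not part of any database tooling.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, stop_symbol="-"):
    """Translate a DNA string in frame 0; stop codons become stop_symbol."""
    seq = nucleotides.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    residues = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    print(translate("ATGACGACTCCCGACTCT"))  # first six codons of the transcript
    print(translate("GAGGTATGGTAA"))        # last four codons, including the stop
```

The two fragments translate to `MTTPDS` and `EVW-`, matching the start and end of the DPOGS208618-PA record. To translate the whole transcript, join the sequence lines of the FASTA record into one string before calling `translate`.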