Monarch geneset OGS2.0

DPOGS204778
Transcript: DPOGS204778-TA; 3198 bp
Protein: DPOGS204778-PA; 1065 aa
Genomic position: DPSCF300231 + 536814-562611
RNAseq coverage: 419x (Rank: top 29%)
Annotation
Heliconius: HMEL002156; 1e-122; 35.16%
Bombyx: BGIBMGA013714-TA; 0.0; 65.10%
Drosophila: Nep1-PA; 2e-119; 32.56%
EBI UniRef50: UniRef50_D6W7L0; 0.0; 57.75%; Putative uncharacterized protein n=5 Tax=Neoptera RepID=D6W7L0_TRICA
NCBI RefSeq: XP_970993.1; 0.0; 56.40%; PREDICTED: similar to Endothelin-converting enzyme 1 [Tribolium castaneum]
NCBI nr blastp: gi|91092668; 0.0; 56.40%; PREDICTED: similar to Endothelin-converting enzyme 1 [Tribolium castaneum]
NCBI nr blastx: gi|91092668; 0.0; 57.24%; PREDICTED: similar to Endothelin-converting enzyme 1 [Tribolium castaneum]
Group
Gene Ontology: GO:0006508; 2e-216; proteolysis
    GO:0004222; 2e-216; metalloendopeptidase activity
    GO:0008237; 4.7e-81; metallopeptidase activity
KEGG pathway 
InterPro domain: [380-1065] IPR000718; 2e-216; Peptidase M13, neprilysin
    [849-1065] IPR024079; 2e-114; Metallopeptidase, catalytic domain
    [407-799] IPR008753; 4.7e-81; Peptidase M13
    [858-1062] IPR018497; 1.4e-58; Peptidase M13, neprilysin, C-terminal
Orthology group: MCL17430; Insect specific

Nucleotide sequence:

>DPOGS204778-TA
ATGTATTATATAATTTCCGCGAAGGCTTTAGCGCTGTATCATTACCCTGTGCAAGCGATCAGTGACATCAAACAAGCTATTGTTAGTTGTCGGTCTGCCGCCGCAGCGCTTGCTTCGCTAGCTCAAGCTCGCACTGTGCGCTGTTCGTGGGTGGTGATCGTATCCCTGGGCTTGTCCCTGGCGGTGTTGGCGGTGTACACCACGGTCGTGGTGGTGGTAGACCTTAAAGCACCCAAACCCTGCCTCACTGAAGTGTGTGTCAATACAGCGTCGAGAGTACTAGCGGCGTTAAACAAGAGCGTGGATCCCTGCGACGACTTCTACGAGTTCGCTTGTGGCGGGTGGATCGAGAAGAACCCTGTCCCGGAGTGGGCGACCTCCTGGGATCAGCTCGCCATCCTGCGAGAGAAACTGGTCACTGACCTGAGGGAACTGCTGGAAGACAAAAACGACCACGGCCTGCCTAAGAGCGTGCTTAAAGCTAAAGCCCTCTACCGCACTTGTATGGATGTTGACAAGCTAGAGGTGTACGGAACCGCGCCCATCACGGATCTGTTGCTACAACTAGGCCTTCCTCCAACGCCCCCTTCCGTGTCCAGTGATAACTTCTCGTGGGAGCAGGTGTCTGGGCGCGCCCGCAGGACTCTCGGTCTCAGTGTTCTGCTGAGCGTTCAGGTCGCTGAGGATGTGAGGAACACTAGCAGGAACAGGGTCGTGTTGGAGCAGGTATCTCCAGGGTTCAGCGATCGTTACCTGCGCCAGGCGGACAAGTTCTCGTTCGAGTTGGAGCAGTACCGGATCTACATCACGTCAATGATCAAAGCCTTCCATCCCGACACGGACGCGGAACGCTTCGCCGACGACATTATAGAATTCAGCAAGACTCTGGCTGGCATCATGACGCCGGTGGAGGTTCGTCGCAGCGGCACTCACCTGTTCCACGAGCTGAGTGTGACTCAACTGCTGGGAGGGAACGGAGCTCCTCCTGAATGGCACCAGGTATATTTTATGGTTGCCAAACAAGATGAAAAGAAAGCTCGCACTGTGCGCTGTTCGTGGGTGGTGATCGTATCCCTGGGCTTGTCCCTGGCGGTGCTGGCGGTGTACACCACGGTCGTGGTGGTGGTAGACCTTAAAGCACCCAAACCCTGCCTCACTGAAGTGTGCGTCAATACAGCGTCGAGAGTACTAGCAGCATTAAACAAGAGCGTGGATCCCTGCGACGACTTCTACGAGTTCGCATGTGGTGGGTGGATCGAGAAGAACCCTGTCCCGGAGTGGGCGACCTCCTGGGATCAGCTTGCCATCCTGCGAGAGAAACTGGTCACTGACCTGAGGGAACTGTTGGAAGACAAAAACGACCACGGCCTGCCTAAGAGCGTGCTCAAAGCTAAAGCCCTCTACCGCACTTGTATGGATGTTGACAAGCTAGAGGTGTACGGAACCGCGCCCATCACGGATCTGTTGCTACAACTAGGCCTTCCTCCAACGCCCCCTTCCGTGTCCAGTGATAACTTCTCGTGGGAGCAGGTGTCTGGGCGCGCCCGCAGGACTCTCGGTCTCAGTGTTCTGCTGAGCGTTCAGGTCGCTGAGGATGTGAGGAACACTAGCAGGAACAGGGTCGTGTTGGAGCAGGTATCTCCAGGGTTCAGCGATCGTTACCTGCGCCAGGCGGACAAGTTCTCGTTCGAGTTGGAGCAGTACCGGATCTACATCACGTCAATGATCAAAGCCTTCCATCCCGACACGGACGCGGAACGCTTCGCCGACGACATTATAGAATTCAGCAAGACTCTGGCTGGCATCATGACGCCGGTGGAGGTTCGTCGCAGCGGCACTCACCTGTTCCACGAGCTGAGTGTGACTCAGCTGCTGGGAGGGAACGGAGCTCCTCCTGAATGGCACCAGCACGACTGGCAGAAGTATATAGACCTGGTGTTCTCCAACACGAGCGTGTCTCTGACGGACGGCGACCGAGTCATCGTGATGGACCTGCCCTA
CCTGCACCGCCTGGCCGGCACGCTGGCTCGTACCGACCCACTCATCACAGAGCGCTTCCTGTGGTGGAGCGTGTTCTCGACCGTGGCTCCGATGACTCGCGCCATATTTCGGACCCTCGGGTTCGAGTTCAGCCGCGCGGCCTGGGGCCTGCGGGCCCGCGTCGACCGCCACAAGGCCTGCGCCGCCAACGTCAACGCCAACTACGGCCTCGCGCTCAGCTACCTCTACGTCAATAAACACTTCGATGAACACGAACGCGAAAAGGCTATAGAAATGATCGAGGACGTCCGCGAGTCGTTCGCGGAGGCGGCTCGCTCCCTGCCCTGGATGGACGACGGCACGCGGGACACGGCGCTGCACAAGCTGAGGGCCATACGGACCTTCGTGGGCTTCCCCGCCTGGCTCATGGACACACACAAGCTGGACCGACATTACGAACACGTGGAGGTGGTGGAGGGGAACCTGTTCGAGTCATACTTGAAGCTGACCTGGGCCACCGTCAAGAAGTCACTGGAGTCTCTGAGAGAGACGCCGGACAGGAACAGGTGGGTCGCGACCGCCACCACAGTCAATGCCTTCTATTCAGCAACACTTAATTCAGTCACATTCCCGGCTGGCATCTTACAACCACCTTTTTACGGAAATGGAATCGAGGCAATAAACTACGGATCCATCGGAGCCATCATGGGTCATGAAGTGACACACGGCTTCGACGATCAAGGTCGTCGGTACGATTCAGACGGCAATCTAGCGTCGTGGTGGTCACGGGAAACGCTGGAGCAGTACCAGGCGCGGGTGAGGTGCATCGTGGAGCAATACGACCAGTACGGCCTGCCGCAGCTGGCCGGGTATAACGTGCACGGGTTCAACACGCAGGGGGAAAATATCGCCGACAACGGGGGCCTGCGGGCCGCGCTCCGGGCTTACCGCAGGCACGAGGCGCGCGCCGGGCGGGCCGCCCTCCTGCCAGGTCTCCCGGGACACACTCCCACACAACTCTTCTTCCTCGGATTCGCCCAGATATGGTGCGGGAACTCCACTACGGGGGCGCTGAAATCGAAAATGGTGGAAGGCGTCCACAGTCCTAACAAAATAAGAGTCATAGGGACCTTGAGCAATTCCAAGGAGTTCTCAGAAGCTTGGAAATGTCCTCTGGGGTCTCCCATGAACCCAGAACACAAGTGCGTTTTGTGGTAA

Protein sequence:

>DPOGS204778-PA
MYYIISAKALALYHYPVQAISDIKQAIVSCRSAAAALASLAQARTVRCSWVVIVSLGLSLAVLAVYTTVVVVVDLKAPKPCLTEVCVNTASRVLAALNKSVDPCDDFYEFACGGWIEKNPVPEWATSWDQLAILREKLVTDLRELLEDKNDHGLPKSVLKAKALYRTCMDVDKLEVYGTAPITDLLLQLGLPPTPPSVSSDNFSWEQVSGRARRTLGLSVLLSVQVAEDVRNTSRNRVVLEQVSPGFSDRYLRQADKFSFELEQYRIYITSMIKAFHPDTDAERFADDIIEFSKTLAGIMTPVEVRRSGTHLFHELSVTQLLGGNGAPPEWHQVYFMVAKQDEKKARTVRCSWVVIVSLGLSLAVLAVYTTVVVVVDLKAPKPCLTEVCVNTASRVLAALNKSVDPCDDFYEFACGGWIEKNPVPEWATSWDQLAILREKLVTDLRELLEDKNDHGLPKSVLKAKALYRTCMDVDKLEVYGTAPITDLLLQLGLPPTPPSVSSDNFSWEQVSGRARRTLGLSVLLSVQVAEDVRNTSRNRVVLEQVSPGFSDRYLRQADKFSFELEQYRIYITSMIKAFHPDTDAERFADDIIEFSKTLAGIMTPVEVRRSGTHLFHELSVTQLLGGNGAPPEWHQHDWQKYIDLVFSNTSVSLTDGDRVIVMDLPYLHRLAGTLARTDPLITERFLWWSVFSTVAPMTRAIFRTLGFEFSRAAWGLRARVDRHKACAANVNANYGLALSYLYVNKHFDEHEREKAIEMIEDVRESFAEAARSLPWMDDGTRDTALHKLRAIRTFVGFPAWLMDTHKLDRHYEHVEVVEGNLFESYLKLTWATVKKSLESLRETPDRNRWVATATTVNAFYSATLNSVTFPAGILQPPFYGNGIEAINYGSIGAIMGHEVTHGFDDQGRRYDSDGNLASWWSRETLEQYQARVRCIVEQYDQYGLPQLAGYNVHGFNTQGENIADNGGLRAALRAYRRHEARAGRAALLPGLPGHTPTQLFFLGFAQIWCGNSTTGALKSKMVEGVHSPNKIRVIGTLSNSKEFSEAWKCPLGSPMNPEHKCVLW-
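
The stated lengths in this record are mutually consistent: 3198 bp = 3 x (1065 aa + 1 stop codon). The following is a minimal Python sketch of how the two FASTA records above could be parsed and checked against each other. The function names (`read_fasta`, `lengths_consistent`, `domain_slice`) are hypothetical helpers written for illustration, not part of any database tooling. It assumes that the trailing `-` in the protein record marks the stop codon and that the InterPro intervals are 1-based, inclusive residue positions; neither is stated explicitly in the record.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    parts: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Record ID is the first whitespace-delimited token after '>'
            current = line[1:].split()[0]
            parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return {name: "".join(chunks) for name, chunks in parts.items()}


def lengths_consistent(cds: str, protein: str) -> bool:
    """True if the CDS length equals 3 * (residue count + 1 stop codon).

    Assumption: a trailing '-' or '*' in the protein marks the stop codon.
    """
    residues = protein.rstrip("-*")
    return len(cds) == 3 * (len(residues) + 1)


def domain_slice(protein: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    return protein[start - 1:end]


# Toy input (not the real record): a wrapped CDS and its translation.
toy = [">toy-TA", "ATGAAA", "TAA", "", ">toy-PA", "MK-"]
records = read_fasta(toy)
toy_ok = lengths_consistent(records["toy-TA"], records["toy-PA"])
toy_domain = domain_slice("MKLVTD", 2, 4)
```

To apply it to this record, the text from `>DPOGS204778-TA` onward could be passed to `read_fasta` line by line; with the assumptions above, `domain_slice(protein, 380, 1065)` would return the region annotated as IPR000718.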