Monarch geneset OGS2.0

DPOGS210786
Transcript          DPOGS210786-TA    3234 bp
Protein             DPOGS210786-PA    1077 aa
Genomic position    DPSCF300027 - 1376173-1383689
RNAseq coverage     77x (Rank: top 65%)
Annotation
Heliconius        HMEL016942          2e-178    55.12%
Bombyx            BGIBMGA007097-TA    0.0       62.68%
Drosophila        Nep4-PA             0.0       52.40%
EBI UniRef50      UniRef50_E2C8U1     0.0       73.46%    Endothelin-converting enzyme 1 n=5 Tax=Formicidae RepID=E2C8U1_HARSA
NCBI RefSeq       XP_002430353.1      0.0       69.98%    hypothetical protein Phum_PHUM474680 [Pediculus humanus corporis]
NCBI nr blastp    gi|242019813        0.0       69.98%    hypothetical protein Phum_PHUM474680 [Pediculus humanus corporis]
NCBI nr blastx    gi|328782544        0.0       70.35%    PREDICTED: endothelin-converting enzyme 1 [Apis mellifera]
Group
Gene Ontology    GO:0008237    1.1e-127    metallopeptidase activity
                 GO:0006508    1.1e-127    proteolysis
                 GO:0004222    6.3e-60     metalloendopeptidase activity
KEGG pathway 
InterPro domain    [240-1077]    IPR000718    0           Peptidase M13, neprilysin
                   [861-1077]    IPR024079    8.2e-131    Metallopeptidase, catalytic domain
                   [261-811]     IPR008753    1.1e-127    Peptidase M13
                   [870-1075]    IPR018497    6.3e-60     Peptidase M13, neprilysin, C-terminal
Orthology group    MCL10214    Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210786-TA
ATGGCGCGCGTTGACTCAACACCCCAAACAGTTTCGATCAAACCGAAAAGTGCCAATTTTTGTAGATTTATAATTATTATTGTTTTAACATTATTCTGTGCTGTAGCCTATTTTATTTCGAGGAGACCTCTAGACACATTAGGAGATCCGTCTGATCAGTTTACTGTTAAAAACATCCCAAATATATATACACGTTTGGAAAGTGTTGATCCTGAGGAGAGGGAGGTGTTCCAAGGATTCCAAACATCATTCCTGCCGCCGGAGGAGGTGGTGTTATCTAGGAAAGGATTTGAAGATAGTGACGAAGTTTTAAACAGTAATGTGAAAAGCGATATATTTAAATGGGAAGAAAAAAGACGAGAAAATCCCACAAGGGTTAAACGAGGAAGCCATTCCACTTTCTCGACACAAGATTACGATCAAGAAACACCAACACAAAACCAACAAAATGTCGAAGATATTCAAGAAGATGACGACGATGGAAACGAAAAGGAAATGGTATATGGCAGTGACTCCGATCAACACGACGAGGAAAGAAATATGCAAAGTGTGGTGCATCCAGAGATTTACGTTGATCGTGGGCCGTTAGACGACGTGGAACAAGTTCGCCGGCAGCCAGAATTAGAGGAAGGTGAAATGGGTTCCGCGGCTCATTTGTATAAACCGGTTCGATCTTACACAGGGGTACACGCGTTTTGGAAGGGTCAGGGTGACAAAGAAACTATAAGACACACGCAGTCAAAAATAATGCGACAATACATGGATGCTGAAGCGGACCCTTGTCATGATTTTTATCAATATGCTTGTGGAAATTGGCCAACACTCAACCCAATACCAGCTGATAAAGCTGGTTACGATACATTTGAGATGTTGAGGGAAAACTTGGATACGGTATTGAAGGACATGTTGGAGTTCTCTAAAGATGAAGAGATCCCTAGCCAGTATCCGGGGCCACATCTAGATTTCAATGATAATTTAAAACATGCTATAAATTCGCAATGCTCACAAGAATCTCATGATATCGTTGATTATATTATTACAAATTCCAAAGAAATACTCAATTTGACTGAGAGAAAAAATACTTATGACATTAAAGACGAGAGCAAAACTGAAATAATTAATCGTATCAGACGATATCTTGATATGAAAACACGTGACATGAAGAAGTCTTCATTTAAAACGAAATTTAAATTACACGAGTATTTATTCATGAACAATAAAAAGGGAAAAAATTTAAGGAGACCCAAAAGACATACTGATAACAATGACACGCGGAACCAAAGCGAAAAGAAAAAACGCAATATAATATACGATAAAAACGGATCCAACAGAGAAACACATTTTAAACGTGGTAAAAGAAAAGAAACATTGGAGCAACTTTTGGAAAATCTTAAACAGAAATATGAATTACCCAAAAACGACCCAGCAAATGGCGACGCGGCATTGAAAGCTAGATTTTTATTTAAGTCTTGTATGAACCACGATATCTTGCAGAAAAGAGGCCACGTACCTCTTCTAGATCTACTTGATATTTTAGGAGGCTGGCCGATACTAAAACCCGGATGGGATTCAAAAAATTTCGACTGGTTGGAACTTATGGCAAAACTAAGGCTATATAATAATGACATTTTAATATCTGAATGGGTTGGACCAGATATAAAGAATTCAGATGAATTCGTTATACAGTTTGATCAAACGAGTCTAGGTTTGCCTACAAGAGATTATTTTCTACAAGAGTCTAACAAGGTATATTTAGAGGGTTATAGAGCATATTTGATAAAAATAGCAACTTTACTCGGAGGAAACATTGAGCATGTAAAAGAGAGTGCAGTAAAACTGATCGATTTCGAAATCAACCTTGCTAAAATAACTTCCGCCCCAGAAGACAGGCGAAACGTATCAGAACTCTACCGCCGCATGACACTCGCCAAGCTGGAAGGACTGGTCCCCGAGATCAAGTGGAGGAAATATTTGTGCATCGTGATGAACAGGACGATTGA
CTCAAGCGAAACTGTAGTACTGTTCGCTCTGTCGTACGTACGGCACTTAGTTCAATTGATAAAGAAGACGGATCCTAATACTTTATCAAATTACTTATTGTGGCGTTTCGTGAGACATCGTGTCAACAATCTGGATGATCGCTTCCAATCTGCGAAACAACAATTCTATTACATTTTATTTGGACGCGAACAAGCGCCGCCAAGGTGGAAGAACTGTATATCCCAAGTGAATTCAAATATGGGCATGGCATTAGGGTCAATGTTTGTTAGGAAATACTTTGACGAGATGAGCAAAAACGACACGATGACGATGACGAGGGAAATCCAACAGGCGTTCAGAGAGTTACTGCACATGACGGATTGGATTGATGAGGAGACAAAAAAACTAGCCGCCCATAAAGTCGACTCTATGATGCTCAGAATAGGCTACCCCGACTTCATTCTGAACAAGAAAGAGCTCGACGATCGTTATAAGGAAGTGCAAATACATCCAGATAAATATTTTGAGAATATACTGAATATACTTCAACATCTCACTAAAATGGAACAGTCGCGAATCGGCCAGCCTGTTAATAAGACACTATGGAATACAGCGCCGGCGGTCGTGAACGCTTATTACAGCCGTAATAAAAATCAGATCATGTTCCCCGCTGGGATCCTACAACCACCTTTCTACCATCGACACTTCCCGAGGTCGCTGAACTTTGGAGGCATCGGAGTGGTTATTGGTCACGAAATTACCCACGGGTTTGACGACAAGGGTCGTTTGTTTGACTGCGAGGGTAACCTGCACCGCTGGTGGTCTGATTCCGCCATCGAGGCATTCCATCGTCGAGCTCAGTGCCTCATCGACCAGTACGGACGATACGTAGTGCCAGAAGTCAATATGAAACTAGACGGTGTTAACACACAGGGTGAGAATATAGCCGACAATGGTGGCGTGAAGCAGGCGTTCCACGCTTACCAACGCTGGCTGCTACAGCACGGCGCCGTTGACGAGACGCTTCCAGAACTCAACCATACCAGCACGCAGTTGTTCTTTCTCAACTTCGCCCAGGTATGGTGTGGTGCAATGCGGCCGGAAGCTATGAGAAATAAATTAAAGACAGCTGTCCACTCTCCAGGAAGGTTCCGTGTAATTGGAACCCTTTCTAATTCCCTGGATTTCGCCAGAGAATTCCAATGTCCACCGGGATCGCCCATGAATCCGATTCATAAATGTAGTGTTTGGTAG

Protein sequence:

>DPOGS210786-PA
MARVDSTPQTVSIKPKSANFCRFIIIIVLTLFCAVAYFISRRPLDTLGDPSDQFTVKNIPNIYTRLESVDPEEREVFQGFQTSFLPPEEVVLSRKGFEDSDEVLNSNVKSDIFKWEEKRRENPTRVKRGSHSTFSTQDYDQETPTQNQQNVEDIQEDDDDGNEKEMVYGSDSDQHDEERNMQSVVHPEIYVDRGPLDDVEQVRRQPELEEGEMGSAAHLYKPVRSYTGVHAFWKGQGDKETIRHTQSKIMRQYMDAEADPCHDFYQYACGNWPTLNPIPADKAGYDTFEMLRENLDTVLKDMLEFSKDEEIPSQYPGPHLDFNDNLKHAINSQCSQESHDIVDYIITNSKEILNLTERKNTYDIKDESKTEIINRIRRYLDMKTRDMKKSSFKTKFKLHEYLFMNNKKGKNLRRPKRHTDNNDTRNQSEKKKRNIIYDKNGSNRETHFKRGKRKETLEQLLENLKQKYELPKNDPANGDAALKARFLFKSCMNHDILQKRGHVPLLDLLDILGGWPILKPGWDSKNFDWLELMAKLRLYNNDILISEWVGPDIKNSDEFVIQFDQTSLGLPTRDYFLQESNKVYLEGYRAYLIKIATLLGGNIEHVKESAVKLIDFEINLAKITSAPEDRRNVSELYRRMTLAKLEGLVPEIKWRKYLCIVMNRTIDSSETVVLFALSYVRHLVQLIKKTDPNTLSNYLLWRFVRHRVNNLDDRFQSAKQQFYYILFGREQAPPRWKNCISQVNSNMGMALGSMFVRKYFDEMSKNDTMTMTREIQQAFRELLHMTDWIDEETKKLAAHKVDSMMLRIGYPDFILNKKELDDRYKEVQIHPDKYFENILNILQHLTKMEQSRIGQPVNKTLWNTAPAVVNAYYSRNKNQIMFPAGILQPPFYHRHFPRSLNFGGIGVVIGHEITHGFDDKGRLFDCEGNLHRWWSDSAIEAFHRRAQCLIDQYGRYVVPEVNMKLDGVNTQGENIADNGGVKQAFHAYQRWLLQHGAVDETLPELNHTSTQLFFLNFAQVWCGAMRPEAMRNKLKTAVHSPGRFRVIGTLSNSLDFAREFQCPPGSPMNPIHKCSVW-