Monarch geneset OGS2.0

DPOGS206560
Transcript        DPOGS206560-TA  909 bp
Protein           DPOGS206560-PA  302 aa
Genomic position  DPSCF300108 - 677247-682011
RNAseq coverage   4562x (Rank: top 3%)
Annotation
Heliconius        HMEL004371        1e-110  64.73%
Bombyx            BGIBMGA013746-TA  2e-54   42.40%
Drosophila        CG9737-PA         5e-40   34.89%
EBI UniRef50      UniRef50_Q49QW0   4e-56   43.57%  Prophenol oxidase activating enzyme 3 n=5 Tax=Obtectomera RepID=Q49QW0_SPOLT
NCBI RefSeq       NP_001036832.1    3e-53   42.76%  prophenoloxidase activating enzyme [Bombyx mori]
NCBI nr blastp    gi|73913563       5e-58   41.18%  hemolymph proteinase 24 [Manduca sexta]
NCBI nr blastx    gi|73913563       5e-59   41.72%  hemolymph proteinase 24 [Manduca sexta]
Group
Gene Ontology     GO:0003824  5.9e-74  catalytic activity
                  GO:0004252  7.2e-67  serine-type endopeptidase activity
                  GO:0006508  7.2e-67  proteolysis
KEGG pathway      dpo:Dpse_GA15903  1e-36
                  K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain   [40-300]  IPR009003  5.9e-74  Peptidase cysteine/serine, trypsin-like
                  [50-295]  IPR001254  7.2e-67  Peptidase S1/S6, chymotrypsin/Hap
                  [76-91]   IPR001314  5.6e-11  Peptidase S1A, chymotrypsin-type
Orthology group   MCL34657  Lepidoptera specific

Nucleotide sequence:

>DPOGS206560-TA
ATGGAAGTCTTGATGAAGAGATTTTTTTTTCTATTTGCTTTGTTCGCCGCTGGGACTGAAACCTTTGGAGATTCATGCGATAAAATAGACAGGCCGCCTAAACAGGAGACGGGATGCTGCGGCCTTGACGTGGCAGATGACGATCTTAAAATCAGTGGTGATGCTGCTATCAACCAGTTTCCTTGGCTGGCTCTAATAGAGAATAAAGATGAAAGGATTACGTGTACCGGAGTCTTGATTAGTTCCCAGCACGTGTTAACAGCGGCGCAGTGTAATTACCACGGACCTGTCAAAGTACGTTTGGGTGAATATAATAAAACCCACTCGGGTCCTGACTGTGTGGTCGTCGATGGTAAGCAGGTGTGCAACGCTGGTGCAATCACCATCCCCATAGCCACGGTCAGATCGCACCCTGATTTCCAACGACCCAACCACAAGGGACACGATATAGCTCTTATCAAACTAGCCGATAAAGTCCCATTCAATGAATTCATTCGTCCCATATGTCTGCCGAACAGCGATGTATTGGTGTCACCACCTGCGAACCTTCGACTGGTGAACGCCGGCTGGGGCAGCAGACCTTCCAGCAGTTATGAAGACGATGTGGTCAAACATTACGTTCAGTTGCCGTATGTTGATAAGGAGACTTGTTCGAAGGTGTACAGTTCGTTGAAGGATACATCATATCAGATAAGGGATAGTCATATATGTGCTGGAGGTGAAAAAGGCTTAGATTCCTGCCGAGGTGACGGTGGCGGACCGCTTATGTACAAGGATAATGGTTTGTACACACTAGTGGGTCTAGTCAGCTTCGGCAAGGTACCTTGCGGTGTAGAAGGTGTCCCCAGTGTGTACACAAAAGTGTACTCCCACATCCCATGGATAGAGAGTCTCACGAGATCATTATAG

Protein sequence:

>DPOGS206560-PA
MEVLMKRFFFLFALFAAGTETFGDSCDKIDRPPKQETGCCGLDVADDDLKISGDAAINQFPWLALIENKDERITCTGVLISSQHVLTAAQCNYHGPVKVRLGEYNKTHSGPDCVVVDGKQVCNAGAITIPIATVRSHPDFQRPNHKGHDIALIKLADKVPFNEFIRPICLPNSDVLVSPPANLRLVNAGWGSRPSSSYEDDVVKHYVQLPYVDKETCSKVYSSLKDTSYQIRDSHICAGGEKGLDSCRGDGGGPLMYKDNGLYTLVGLVSFGKVPCGVEGVPSVYTKVYSHIPWIESLTRSL-
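
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) and that the trailing "-" in the protein sequence marks the stop codon. The function name `translate` and the short excerpts used as input are illustrative choices; the excerpts are the first four and last three codons of DPOGS206560-TA.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol` to match the "-" that ends
    the protein record above (an assumption about that record's notation).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(protein)


# Excerpts copied from the nucleotide record above.
first_codons = "ATGGAAGTCTTG"  # start of DPOGS206560-TA
last_codons = "TCATTATAG"      # end of DPOGS206560-TA

print(translate(first_codons))  # compare with the start of DPOGS206560-PA
print(translate(last_codons))   # compare with the end of DPOGS206560-PA

# Length check using the reported sizes: 302 residues plus one stop codon.
print(909 == 3 * (302 + 1))
```

Pasting the full nucleotide sequence into `translate` in place of the excerpts would give a complete comparison against the protein record.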