Monarch geneset OGS2.0

DPOGS215283
Transcript        DPOGS215283-TA  855 bp
Protein           DPOGS215283-PA  284 aa
Genomic position  DPSCF300110 - 281339-364365
RNAseq coverage   2x (Rank: top 92%)
Annotation
Heliconius       HMEL014518        4e-65  49.43%
Bombyx           BGIBMGA000840-TA  1e-82  59.76%
Drosophila       CG31954-PA        1e-33  36.55%
EBI UniRef50     UniRef50_Q5QBF4   3e-38  38.33%  Serine protease n=2 Tax=Culicoides sonorensis RepID=Q5QBF4_9DIPT
NCBI RefSeq      XP_317170.2       5e-40  35.83%  AGAP008296-PA [Anopheles gambiae str. PEST]
NCBI nr blastp   gi|60678799       3e-39  38.72%  MPA3 allergen [Periplaneta americana]
NCBI nr blastx   gi|60678799       6e-40  38.72%  MPA3 allergen [Periplaneta americana]
Group
Gene Ontology    GO:0003824  5.8e-68  catalytic activity
                 GO:0004252  4.3e-62  serine-type endopeptidase activity
                 GO:0006508  4.3e-62  proteolysis
KEGG pathway     ani:AN2366.2  1e-34
                 K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain  [53-283]  IPR009003  5.8e-68  Peptidase cysteine/serine, trypsin-like
                 [55-278]  IPR001254  4.3e-62  Peptidase S1/S6, chymotrypsin/Hap
                 [85-100]  IPR001314  8.6e-09  Peptidase S1A, chymotrypsin-type
Orthology group  MCL19872  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215283-TA
ATGGGAGGAGTGGATCTCATGGATAGCTTTTTGGGCAGAAATCATATACAAATGAGGTCTAAAAAGTGGTATTTGCGTTACTCATATCCATCCGACATTATACTACCAGGCAGAATTCCAGTTTGGGATACCAAGGAAGAAACATCGTTCGCGTCAGATGAAAGGATTGTAGGTGGAGAAGAAGCAAACATAGAGGACTACCCTCATCAAATTGGCTTTCTTTTTCAACAAAACAATAACACTTATTTTTGTGGTGGATTTATTATCAGCGAATATTACATATTGACTGCTGGCCATTGTTCTCAAAATGTTGATCCCACGACGGTTGTCCTTCGAGCTGGAAGCTCGTTTCGTAATAATGGAACTATAATACCAATTGATGAGATTGTCGCTCATCCAGAGTACGACAACCCGCCTTATGACAAAGACGTTGGTTTCATTCGAACCACTGAAGCCATGCAATTTAGCGACACGATGAAGCCCATACCACTAGTGTCAAGAAATGAAAAGTGCAGAAGCCAAGTTGCCATCAGTGGTTGGGGCCGGACATCACAAGGAGCATCTTCAATACCGCTGAGATTACGAGATGTGAAAGTACCCGTCGTTGGACATTCTAGATGCAAGAGATCATATCCTGATATATTGACCCGTAACATGATTTGTGTTGGTAATTACATATTAGGTGGAAAAGGACCATGTCAAGGTGACTCAGGTGGAGCAGTAGTTTACGACGGTAAAGCTTGTGGTATTGTTTCCTTTGGAAGAATATGTGCTCAACCCTTATCACCGAGTGTTTGTGCTGATATATCAGCAAAAGACATTAGAGATTTTATATTTGATAATACTGGTGTATGA

Protein sequence:

>DPOGS215283-PA
MGGVDLMDSFLGRNHIQMRSKKWYLRYSYPSDIILPGRIPVWDTKEETSFASDERIVGGEEANIEDYPHQIGFLFQQNNNTYFCGGFIISEYYILTAGHCSQNVDPTTVVLRAGSSFRNNGTIIPIDEIVAHPEYDNPPYDKDVGFIRTTEAMQFSDTMKPIPLVSRNEKCRSQVAISGWGRTSQGASSIPLRLRDVKVPVVGHSRCKRSYPDILTRNMICVGNYILGGKGPCQGDSGGAVVYDGKACGIVSFGRICAQPLSPSVCADISAKDIRDFIFDNTGV-
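
Consistency check (illustrative sketch, not part of the original record): the 855 bp transcript corresponds to 285 codons, i.e. 284 amino acids plus one stop codon, which the protein record appears to write as a trailing "-". The Python sketch below assumes the standard genetic code and a hypothetical helper named translate; it is shown on the first and last codons of DPOGS215283-TA rather than the full sequence.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt, stop_symbol="-"):
    """Translate a coding sequence in frame 1 (hypothetical helper).

    Stop codons are written as stop_symbol, which is assumed here to
    match the trailing "-" of the protein record.
    """
    nt = nt.upper()
    if len(nt) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (nt[i:i + 3] for i in range(0, len(nt), 3))
    return "".join(
        stop_symbol if CODON_TABLE[c] == "*" else CODON_TABLE[c] for c in codons
    )


# First six codons and last three codons of DPOGS215283-TA.
print(translate("ATGGGAGGAGTGGATCTC"))  # MGGVDL
print(translate("GGTGTATGA"))           # GV-

# Length check: 855 bp -> 285 codons -> 284 residues plus the stop.
print(855 // 3 - 1)                     # 284
```

Passing the full nucleotide sequence to translate should reproduce the protein sequence above if the record is internally consistent; that full comparison is left to the reader.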