Monarch geneset OGS2.0

DPOGS209861
Transcript        DPOGS209861-TA  1191 bp
Protein           DPOGS209861-PA  396 aa
Genomic position  DPSCF300510 - 25711-35367
RNAseq coverage   516x (Rank: top 24%)
Annotation
Heliconius      HMEL022597        6e-123  53.65%
Bombyx          BGIBMGA001745-TA  7e-105  61.77%
Drosophila      snk-PB            5e-55   34.90%
EBI UniRef50    UniRef50_G6D6G6   0.0     100.00%  Serine protease 7 n=2 Tax=Obtectomera RepID=G6D6G6_DANPL
NCBI RefSeq     XP_313589.2       4e-71   43.24%   CLIP-domain serine protease subfamily C (AGAP004318-PA) [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|364023605      1e-109  52.38%   seminal fluid protein CSSFP027 [Chilo suppressalis]
NCBI nr blastx  gi|364023605      3e-109  52.38%   seminal fluid protein CSSFP027 [Chilo suppressalis]
Group
Gene Ontology    GO:0004252  7.4e-82  serine-type endopeptidase activity
                 GO:0006508  7.4e-82  proteolysis
                 GO:0003824  4.5e-80  catalytic activity
KEGG pathway     dpo:Dpse_GA19543  7e-38
                 K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3)  maps-> Neuroactive ligand-receptor interaction
InterPro domain  [140-388]  IPR001254  7.4e-82  Peptidase S1/S6, chymotrypsin/Hap
                 [133-393]  IPR009003  4.5e-80  Peptidase cysteine/serine, trypsin-like
                 [172-187]  IPR001314  2.4e-13  Peptidase S1A, chymotrypsin-type
Orthology group  MCL16667  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209861-TA
ATGGAGGTGTTGTCCTTTTTTATCCTCGTCATCACACAGCTCGGATATTCGCAGAAACTTGGTGAGCTGTGCCTCAAGCAGAACACAAGGACCCTCGGACTATGCACGCCAGCAGCCAGATGCGAGTCAGCGAAACGGGAATACGCGCTCAATAGAATAATACCAACTTTCTGTGAGAGATTTGAGACCACAATTGTCGTTTGTTGCAGCGACAACACATTGTCTATTGGATCGGTCAATGTCCAACCTCCGACTCAACCGCTCAAAGCTCCGTCACCTGAACCCGTTCCAAGACCAACAAACAATGAGCTCCGAGTCAGTGAGAAAAAATGCGCTGAATATAGCAACGATGTGGTCGAAAACATCAACGTAACTGCCGGCATAAGCACGTCGAAGTGTGACTATAATAAACAGCATTTGATAGTGGGGGGAGAAGATGCTGAAAGGGGGGAATTTCCACACATGGCAGCGATTGGTTGGATAAACCTTGAAGGGACATATTCCTTCCTGTGTGGCGGGAGCCTGATCAGCCCCAACTTCGTGATGACGGCGGCGCACTGCTCCAAATACGCCAAGATGAGAATCCCAAGACCCGTCATCGTTAGACTTGGAGATCAGAACATTAAACCGGAGAAATTCGACGGCGCAAATCCCATTGACGTCAAAATCAAAAGTATACAAAACCACCCGCTGTACAAATCCCCGGGCAAATACAACGACATCATGCTGCTGGAATTGGAGACGGAAGTTAAATTCGAGGCGGCCATTAGACCTGCATGCTTGTGGAGTAAACCTGACTTCGAGGAGTACACCACAGCTGTAGCCACGGGCTGGGGTATTACTGACCCACGAACACAGCGGACTTCCGACGAGCTCAGGAAAGTATCGCTGTCGTTATTCAACAATTCATTTTGCAAAAGTGTTTTGACACCAAAAAGGAACCGCAACTGGCCGGATGGGTTCCGAGACACGCAGGTGTGCGCTGGGGAGTTACGAGGAGGAAAAGACACATGCCAGGGTGACTCAGGTTCTCCACTCCAAGTCGTCTCCAGACAGAATAAATGCATCTTCTACGTTGTAGCCGTCACCTCCTTCGGGCCAGGCTGTGCTCTCAAAGGGACACCGGCTATTTACACAAGGGTGTCCTCATATCTCGATTGGATTGAATCCGTCGTGTGGCCAGAGGGATAG

Protein sequence:

>DPOGS209861-PA
MEVLSFFILVITQLGYSQKLGELCLKQNTRTLGLCTPAARCESAKREYALNRIIPTFCERFETTIVVCCSDNTLSIGSVNVQPPTQPLKAPSPEPVPRPTNNELRVSEKKCAEYSNDVVENINVTAGISTSKCDYNKQHLIVGGEDAERGEFPHMAAIGWINLEGTYSFLCGGSLISPNFVMTAAHCSKYAKMRIPRPVIVRLGDQNIKPEKFDGANPIDVKIKSIQNHPLYKSPGKYNDIMLLELETEVKFEAAIRPACLWSKPDFEEYTTAVATGWGITDPRTQRTSDELRKVSLSLFNNSFCKSVLTPKRNRNWPDGFRDTQVCAGELRGGKDTCQGDSGSPLQVVSRQNKCIFYVVAVTSFGPGCALKGTPAIYTRVSSYLDWIESVVWPEG-
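
Consistency check on the reported lengths:

The record gives a 1191 bp transcript, a 396 aa protein, and a genomic span of 25711-35367 on DPSCF300510. The sketch below checks that these figures agree with each other. It is a minimal illustration and not part of the geneset pipeline: the function name summarize_lengths is hypothetical, the coordinates are assumed to be 1-based and inclusive, and the trailing "-" in the protein sequence is assumed to mark the stop codon.

```python
def summarize_lengths(transcript_bp, protein_aa, start, end):
    """Compare transcript, protein, and genomic lengths from a gene record.

    Assumes 1-based inclusive genomic coordinates and a transcript that
    consists of the coding sequence including its stop codon.
    """
    codons, remainder = divmod(transcript_bp, 3)
    genomic_span = end - start + 1
    return {
        "codons": codons,
        "in_frame": remainder == 0,
        # One codon is the stop, so codons should equal residues + 1.
        "matches_protein_plus_stop": codons == protein_aa + 1,
        "genomic_span_bp": genomic_span,
        # Bases inside the genomic span that are absent from the transcript
        # (for example, introns).
        "non_transcript_bp": genomic_span - transcript_bp,
    }


# Values taken from the record for DPOGS209861.
summary = summarize_lengths(1191, 396, 25711, 35367)
print(summary)
```

Under these assumptions, the transcript length is a multiple of three and corresponds to 396 residues plus one stop codon, while the genomic span is considerably longer than the transcript.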