Monarch geneset OGS2.0

DPOGS208602
Transcript: DPOGS208602-TA, 1140 bp
Protein: DPOGS208602-PA, 379 aa
Genomic position: DPSCF300052, 69118-78678
RNAseq coverage: 27x (Rank: top 77%)
Annotation
Heliconius: HMEL009225  2e-32  39.82%
Bombyx: BGIBMGA013383-TA  1e-51  45.41%
Drosophila: CG31954-PA  5e-27  35.86%
EBI UniRef50: UniRef50_D9HQ20  4e-46  44.78%  Seminal fluid protein HACP001 n=13 Tax=Heliconiini RepID=D9HQ20_9NEOP
NCBI RefSeq: XP_001660675.1  6e-32  36.68%  trypsin [Aedes aegypti]
NCBI nr blastp: gi|299930585  1e-45  44.78%  seminal fluid protein HACP001 [Heliconius erato]
NCBI nr blastx: gi|299930585  2e-47  44.98%  seminal fluid protein HACP001 [Heliconius erato]
Group
Gene Ontology:
  GO:0003824  2.9e-48  catalytic activity
  GO:0004252  6.9e-41  serine-type endopeptidase activity
  GO:0006508  6.9e-41  proteolysis
KEGG pathway: dme:Dmel_CG12350  2e-22
  K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain:
  [55-276]  IPR009003  2.9e-48  Peptidase cysteine/serine, trypsin-like
  [54-271]  IPR001254  6.9e-41  Peptidase S1/S6, chymotrypsin/Hap
  [77-92]  IPR001314  7.3e-09  Peptidase S1A, chymotrypsin-type
Orthology group: MCL26067, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208602-TA
ATGTTCTGCGCTGGATTCTCAGAGGGAAAAAAAGATGCTTGCCAGGGTGACTCAGGGGGTCCGGCAGTGGCGTCTGGACGTCTTCTGGGCATGGTGTCGTTCGGCTACGGGTGCGCCACCCCCGGCTCCTATGGCGTTTACAGCAAGGTTGCGAAAGTCAGTATCCTCGGAGGTCGTGACGTGACAATAGACCGTGTCCCGCACCACGTCAGCTATGGTGATATCTGCGGGGGAGTGCTCGTTCATAGAAAATGGGCCCTAACAGTCGCTCATTGTGGCACAGATAAAGATTACATAAGAGTGGGGAGTAGATACCGTTTAAAGGGTGTTCAGATTAAAATTAAGAAACATCTGATACACCCTCTATATCAAAAGGAGCACAGTTTTGATTTCGATGTACAGTTGCTGGGATTGTTCCGGGGTTTGAATTTTGGGAAAACTATCAAAGATATTGAAATCAGTGATGGAGGATGCGACAAATATATTGATATGTCTGGTTGGGGATATGGTGAGGAGAGGGGAGACTACAATCATATATTGCAACAAGTAAGAATCAAGTTGGTGGTATTCGAGTTATGTCAGATGGTTGATCAGCCCTGGTACAACGGAACGCTGACCGGGAGAATGTTCTGCGCGGGGGGAGGGAGGGGCGGAGGGCCAGACGGAGGGGGTGACGCTTGTCAGGGTGACAGCGGCGGTGGTGCGGTGTCTAAGAATCGTCTGGTTGGTCTCTCGTCGTTCGGGTACGGTTGCGGGCGCGGTGTACCCGGGGTCTACGTCAACGTATCCAATCCTGGTATTATATCATGGATCAGACAATACACCGGAGACTACAATCATATATTGCAACAAGTAAGAATCAAGTTGGTGGTATTCGAATTATGTCAGATGGTTGATCAGCCCTGGTACAACGGAACGCTGACCGGGAGAATGTTCTGCGCGGGGGGAGGGAGGGGCGGAGGGCCAGACGGAGGGGGTGACGCTTGTCAGGGTGACAGCGGCGGTGGTGCGGTGTCTAAGAATCGTCTGGTTGGTCTCTCGTCGTTCGGGTACGGTTGCGGGCGCGGTGTACCCGGGGTCTACGTCAACGTATCCAATCCTGGTATTATATCATGGATCAGACAATACACCGTACGGTAA

Protein sequence:

>DPOGS208602-PA
MFCAGFSEGKKDACQGDSGGPAVASGRLLGMVSFGYGCATPGSYGVYSKVAKVSILGGRDVTIDRVPHHVSYGDICGGVLVHRKWALTVAHCGTDKDYIRVGSRYRLKGVQIKIKKHLIHPLYQKEHSFDFDVQLLGLFRGLNFGKTIKDIEISDGGCDKYIDMSGWGYGEERGDYNHILQQVRIKLVVFELCQMVDQPWYNGTLTGRMFCAGGGRGGGPDGGGDACQGDSGGGAVSKNRLVGLSSFGYGCGRGVPGVYVNVSNPGIISWIRQYTGDYNHILQQVRIKLVVFELCQMVDQPWYNGTLTGRMFCAGGGRGGGPDGGGDACQGDSGGGAVSKNRLVGLSSFGYGCGRGVPGVYVNVSNPGIISWIRQYTVR-
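
Consistency check (illustrative): the listed transcript length (1140 bp) and protein length (379 aa) agree if the transcript is assumed to be coding sequence only, ending in a single stop codon. The record itself does not state this, and the helper name `expected_protein_length` is hypothetical. A minimal Python sketch of the arithmetic, which checks only the listed lengths and not the sequences themselves:

```python
def expected_protein_length(cds_bp: int, has_stop: bool = True) -> int:
    """Return the protein length implied by a coding sequence of cds_bp bases.

    Assumes the sequence is CDS only (no UTRs) and, when has_stop is True,
    that the final codon is a stop codon that is not translated.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if has_stop else codons


# Lengths as listed in the record: 1140 bp transcript, 379 aa protein.
print(expected_protein_length(1140) == 379)
```

Under that assumption, 1140 / 3 = 380 codons, and removing the stop codon leaves 379 residues, which matches the listed protein length.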