Monarch geneset OGS2.0

DPOGS203741
Transcript        DPOGS203741-TA  897 bp
Protein           DPOGS203741-PA  298 aa
Genomic position  DPSCF300010 - 264748-268070
RNAseq coverage   68x (Rank: top 67%)
Annotation
Heliconius      HMEL002955        1e-68   46.21%
Bombyx          BGIBMGA013383-TA  2e-48   40.61%
Drosophila      Try29F-PC         2e-38   40.52%
EBI UniRef50    UniRef50_D9HQ78   4e-100  59.04%  Seminal fluid protein HACP026 n=15 Tax=Nymphalidae RepID=D9HQ78_9NEOP
NCBI RefSeq     XP_001848608.1    8e-44   45.02%  trypsin-1 [Culex quinquefasciatus]
NCBI nr blastp  gi|299930701      1e-99   59.04%  seminal fluid protein HACP026 [Heliconius melpomene]
NCBI nr blastx  gi|299930701      6e-101  61.15%  seminal fluid protein HACP026 [Heliconius melpomene]
Group
Gene Ontology    GO:0004252  3.8e-72  serine-type endopeptidase activity
                 GO:0006508  3.8e-72  proteolysis
                 GO:0003824  3.7e-67  catalytic activity
KEGG pathway     dpo:Dpse_GA21879  7e-37
                 K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3)  maps-> Neuroactive ligand-receptor interaction
InterPro domain  [74-290]  IPR001254  3.8e-72  Peptidase S1/S6, chymotrypsin/Hap
                 [70-294]  IPR009003  3.7e-67  Peptidase cysteine/serine, trypsin-like
                 [97-112]  IPR001314  6.3e-14  Peptidase S1A, chymotrypsin-type
Orthology group  MCL26170  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203741-TA
ATGTGTCAAGTGCTCGTGTTTTTTGTGTTATTTTGTGTCGCGTCCAGTTTAGATCTAAATGTTAAAATTCCTCAAAACAAAGGACTTCTAACTGTGGAATGCATTAAAAGCTACTACAACACGGTCGTGCTGCGATCAATTCCTCCGCGTCATTATTACTACTACCACGAAGTCCTCGCTGGAAAAATCGATGACTCCATGAACGAAGAGATCGGTTTCCGAATAGTCGGCGGCGAAAAAATTTCTATCCAGGATGCTCCGTATCAGGTCCTTTATGGAAAATATTGCGGAGGCTCGCTGATTGCTCCCAACTGGGTTCTGACTGCAGCCCATTGTAAAACACATCACGCCTACGTCTACGCTGGCTCAACTTTCCGTGATGATACAAAACCTTACCTCATCTGTAGTCATTTCGTCCATCCTGGTTGGAATCGTTCTAGTCTTCACTCCCATGATTACGATTATCAATTGCTTTTGTTGGAACAGTCGATTCCGGTTAATTCCATGGCTAGACCGATAGCTATTGGATCAGTCAATGATATTCAGCCGGGGAATATGATCGCCGTCAGTGGGTGGGGACATTTGCAATATAAAGAGAGCTCTATGCAAAATATCTTACGACGTGTTAGTGTGCCGATAATGTCTTCAGAAGAATGCATGAATCTACCAGATGATGGCTATAAGAATATCACCGTCAGAATGTTTTGTGCTGGATATTTAGAAGGCACGAAGGATTCTTGTCAAGGTGATTCTGGAGGCCCAGCTGTGTACAATGGAAAGTTGGTTGGTCTGGTGTCGTATGGTTTTGGATGTGCTGGTAAAAATAGACCTGGAGTCTATACGAACATTCCTATATCGAGAGATTGGATCAGATCAGTGACATATCTTCCTTTGTAA

Protein sequence:

>DPOGS203741-PA
MCQVLVFFVLFCVASSLDLNVKIPQNKGLLTVECIKSYYNTVVLRSIPPRHYYYYHEVLAGKIDDSMNEEIGFRIVGGEKISIQDAPYQVLYGKYCGGSLIAPNWVLTAAHCKTHHAYVYAGSTFRDDTKPYLICSHFVHPGWNRSSLHSHDYDYQLLLLEQSIPVNSMARPIAIGSVNDIQPGNMIAVSGWGHLQYKESSMQNILRRVSVPIMSSEECMNLPDDGYKNITVRMFCAGYLEGTKDSCQGDSGGPAVYNGKLVGLVSYGFGCAGKNRPGVYTNIPISRDWIRSVTYLPL-
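
The transcript and protein lengths in this record fit together if the 897 bp transcript is read as a complete coding sequence: 897 / 3 = 299 codons, that is 298 amino acids plus one stop codon, which the protein sequence shows as a trailing "-". The sketch below illustrates that check. It assumes the standard genetic code (NCBI translation table 1) and that the transcript holds only coding sequence, neither of which the record states. The names `translate` and `expected_protein_length` are hypothetical helpers, not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in TCAG order. Using this table for the record is an assumption.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol, matching the "-" that ends
    the protein sequence in this record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for start in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[start:start + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(transcript_bp):
    """Residue count for a CDS-only transcript that ends in one stop codon."""
    codons, remainder = divmod(transcript_bp, 3)
    if remainder:
        raise ValueError("transcript length is not a multiple of 3")
    return codons - 1


# First six codons of DPOGS203741-TA followed by its terminal TAA stop codon.
print(translate("ATGTGTCAAGTGCTCGTG" + "TAA"))
print(expected_protein_length(897))
```

Passing the full DPOGS203741-TA sequence to `translate` gives a string that can be compared directly with DPOGS203741-PA. A mismatch would suggest a different genetic code, untranslated regions in the transcript, or a copy error.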