Monarch geneset OGS2.0

DPOGS204146
Transcript        DPOGS204146-TA  1218 bp
Protein           DPOGS204146-PA  405 aa
Genomic position  DPSCF300034 - 1160540-1165601
RNAseq coverage   221x (Rank: top 45%)
Annotation
Heliconius      HMEL016482        3e-118  72.70%
Bombyx          BGIBMGA005172-TA  2e-77   55.83%
Drosophila      Ser7-PA           5e-56   42.81%
EBI UniRef50    UniRef50_G9F9I4   1e-136  59.50%  Seminal fluid protein CSSFP032 n=1 Tax=Chilo suppressalis RepID=G9F9I4_9NEOP
NCBI RefSeq     XP_972679.1       4e-77   39.95%  PREDICTED: similar to hemolymph proteinase 5 [Tribolium castaneum]
NCBI nr blastp  gi|364023615      4e-136  59.50%  seminal fluid protein CSSFP032 [Chilo suppressalis]
NCBI nr blastx  gi|364023615      2e-142  58.44%  seminal fluid protein CSSFP032 [Chilo suppressalis]
Group
Gene Ontology    GO:0003824  7.5e-86  catalytic activity
                 GO:0004252  4.6e-73  serine-type endopeptidase activity
                 GO:0006508  4.6e-73  proteolysis
KEGG pathway
InterPro domain  [138-404]  IPR009003  7.5e-86  Peptidase cysteine/serine, trypsin-like
                 [145-399]  IPR001254  4.6e-73  Peptidase S1/S6, chymotrypsin/Hap
                 [176-191]  IPR001314  3.8e-15  Peptidase S1A, chymotrypsin-type
                 [27-79]    IPR022700  6.7e-10  Proteinase, regulatory CLIP domain
                 [26-80]    IPR006604  3.2e-07  Disulphide knot CLIP
Orthology group  MCL10599  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204146-TA
ATGTTGTCTGTTTGGTTTGTGTGTGTAGTGATCTTGATCTTTGTGCCCGCCGACTGTTTATATTCTGGTGAAAGCTGCACGGTCAACGGTCGACCTGGGATTTGTAGGTTGCTCTCACAATGTTCTCATTTAGTTAATGAGATCCAAGATGCTGGAACACCGATGCCGCCGTATCTGAGAAGAAAACTACAAAACCTCTCGTGTGGCTTCGACGATGACGAGCCTATGGTTTGCTGTATTTCTAACCCTGGAGATACGGGAGATCCAAATCATAATGGGTTGCTTAAACCTTATAATAATGACGATATTGATACAGGAAAAAATAGAGTTGATGATAAAAGTGGTGCAAATACAATAGACTCCATACCGGATATTCGCTATCACCCAAAACTAAATCTGCTACCAACGAATTGCGGTGTCATTGAAAATGACAGGATTTTCGGAGGAAATAGGACAAGGCTATTTGAAATGCCGTGGATGGTGCTACTGTCATACGACTCTCCTCGCGGTACAAAATTAAGTTGCGGTGGTACTATTATAACCAGACGGTACATCTTGACAGCAGCACACTGTGTGTCATTCCTGGGATCAAGACTTACATTACGTGACGTCATCCTTGGAGAGTACGACATTAGGTCTGACCCAGATTGTGAGAGGGTTGAAGGAGAAGTGTTTTGTGCACCAAGAGTTCGGAATGTATCTATAGATGAGACTATACCTCATCCGGGGTATTCTCCCACGAGGCTAAGAGATGATATTGCTTTAATAAGGCTCTCAGAACCAGTAGATTTCACCTTGGACAGCATGAAACCAATCTGCTTGCCGACGACACCAACATTGTTATCAGAGCAGCTGGAAGGTTTGCAGGGTGTAGTGGCAGGCTGGGGCACCACCGAGGATGGACTTCAGTCACCTGTGCTGCTCAGTGTTGATCTACCAATACTCACCAATTCGCAGTGCCAGTCGGTTTATCACGGATCGCTTCAAATTTACGATACTCAACTGTGCGCAGGAGGAGTTGTGGATAAAGACTCCTGTGGTGGTGATTCTGGAGGACCATTGATGTACCCTGGAAGAACACAATCTGTTGGAGTCAGATACGTTCAACGGGGCATAGTGTCTTACGGCTCCAAGCGTTGTGGGATTGGAGGATTACCTGGAGTATACACTAGAGTATCCTATTACATGAAATGGATTTTAGATAATATAAGAGACTAG

Protein sequence:

>DPOGS204146-PA
MLSVWFVCVVILIFVPADCLYSGESCTVNGRPGICRLLSQCSHLVNEIQDAGTPMPPYLRRKLQNLSCGFDDDEPMVCCISNPGDTGDPNHNGLLKPYNNDDIDTGKNRVDDKSGANTIDSIPDIRYHPKLNLLPTNCGVIENDRIFGGNRTRLFEMPWMVLLSYDSPRGTKLSCGGTIITRRYILTAAHCVSFLGSRLTLRDVILGEYDIRSDPDCERVEGEVFCAPRVRNVSIDETIPHPGYSPTRLRDDIALIRLSEPVDFTLDSMKPICLPTTPTLLSEQLEGLQGVVAGWGTTEDGLQSPVLLSVDLPILTNSQCQSVYHGSLQIYDTQLCAGGVVDKDSCGGDSGGPLMYPGRTQSVGVRYVQRGIVSYGSKRCGIGGLPGVYTRVSYYMKWILDNIRD-
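
Consistency check (illustrative sketch):

The transcript and protein lengths listed for this record (1218 bp and 405 aa) agree if the coding sequence carries one stop codon, since 3 x (405 + 1) = 1218. The Python sketch below shows one way such a check might be run on a FASTA pair like the one above. The function names are hypothetical and not part of any database tooling. It assumes the transcript is coding sequence only, and that a trailing "-" or "*" in the protein marks the stop.

```python
# Hypothetical helpers for checking a CDS/protein pair; not part of any
# database tooling. Assumes the transcript contains coding sequence only.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_cds_length(n_residues: int) -> int:
    """Length in bp of a CDS encoding n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


def cds_is_consistent(cds: str, protein: str) -> bool:
    """Return True if the CDS length and stop codon match the protein length."""
    cds = cds.strip().upper()
    # Drop a trailing stop marker ("-" or "*") from the protein, if present.
    residues = protein.strip().rstrip("-*")
    ends_with_stop = cds[-3:] in STOP_CODONS
    return (
        len(cds) % 3 == 0
        and ends_with_stop
        and len(cds) == expected_cds_length(len(residues))
    )
```

With the lengths listed for this record, expected_cds_length(405) returns 1218, which matches the transcript length given above.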