Monarch geneset OGS2.0

DPOGS210979
Transcript        DPOGS210979-TA   1113 bp
Protein           DPOGS210979-PA   370 aa
Genomic position  DPSCF300004 - 161961-171624
RNAseq coverage   184x (Rank: top 49%)
Annotation
Heliconius        HMEL025009         2e-94    70.50%
Bombyx            BGIBMGA006406-TA   2e-74    66.99%
Drosophila        CG11836-PH         6e-107   67.74%
EBI UniRef50      UniRef50_G9F9F2    1e-159   76.01%   Seminal fluid protein CSSFP002 n=1 Tax=Chilo suppressalis RepID=G9F9F2_9NEOP
NCBI RefSeq       XP_313874.4        3e-112   76.61%   AGAP004570-PA [Anopheles gambiae str. PEST]
NCBI nr blastp    gi|364023551       5e-159   76.01%   seminal fluid protein CSSFP002 [Chilo suppressalis]
NCBI nr blastx    gi|364023551       2e-157   75.68%   seminal fluid protein CSSFP002 [Chilo suppressalis]
Group
Gene Ontology     GO:0003824   3.8e-92   catalytic activity
                  GO:0004252   1.1e-86   serine-type endopeptidase activity
                  GO:0006508   1.1e-86   proteolysis
KEGG pathway 
InterPro domain   [123-363]   IPR009003   3.8e-92   Peptidase cysteine/serine, trypsin-like
                  [131-358]   IPR001254   1.1e-86   Peptidase S1/S6, chymotrypsin/Hap
                  [158-173]   IPR001314   1.1e-16   Peptidase S1A, chymotrypsin-type
Orthology group   MCL16467   Insect specific

Nucleotide sequence:

>DPOGS210979-TA
ATGTGCCCATGGAGAACAGTGGCTATATTTTGTGTGCTTCTTACATTAAATATTTCAGAGCCCAAGCAAATTAGCCTACAGGACGCCGTAAAAGATAACCATGTATCCACGGGCAACAGGACACAAAGATTTTTGTTCGACGCCATCTTCGGTCTTGAAGTGCCGATTTTGGATGATCCAAGCGTGGAGGAAGATGAAGAAGAGGAACCCCAAGTCAGGAATTGTTCCTGTGCATGTGGCCGAGCAAATCTACTTCCTAGAAAGATAGCTCTAACATTTCTCCCAGTAGTTGACGATTCAGTAGAAGTCGTAACACAGGCTTATCATCCTGCATTCAGTTCATTACACTTCACTTTCTCCTCAGAGTGTGGTGGGCCGAACCAAGAGAACCGCATAGTTGGTGGTATGCCAGCCGGCGTGAACCGGTATCCCTGGATGGCAAGGCTTGTTTATGATGGTCAGTTCCACTGTGGAGCATCCTTGCTAACCAAGGAGTATGTATTGACAGCCGCTCATTGTGTGCGTAAACTAAAACGCTCCAAAATCCGTGTCATTCTTGGCGACCATGATCAAACCATAACGTCTGAAAGTCCGGCCATAATGCGTGCTGTCACCGCCATAGTAAGACATCGAAGTTTCGACTCCGACTCATACAACAACGACATTGCATTGTTGAAACTACGGAAACCAGTGACCTTCTCAAAGATCATCAAACCAGTTTGCCTACCACCAGCAAGTATTGAACCATCTGGAAAAGAGGGAATCGTGGTAGGCTGGGGCCGTACTTCGGAGGGGGGTCAGCTACCTGCTGTGGTTCAGGAAGTCAGGGTGCCAATTCTATCACTGTCGCAGTGCCGCGGAATGAAGTATAGGGCTACAAGAATCACTAATAATAGATCGCTTTGCGCGGGGCGATCGTCAACTGACTCCTGTCAAGGAGACAGCGGAGGCCCCTTGCTCATACAACAGGGTGACAGATTCCAAATAGTCGGTATAGTGTCATGGGGCGTAGGTTGTGGAAGGCCGGGCTACCCCGGCGTGTACACTCGCATTACACGCTACTTACCGTGGTTGCGCGCTAATTTGAAAGACACGTGCCTGTGTGCTAATTAA

Protein sequence:

>DPOGS210979-PA
MCPWRTVAIFCVLLTLNISEPKQISLQDAVKDNHVSTGNRTQRFLFDAIFGLEVPILDDPSVEEDEEEEPQVRNCSCACGRANLLPRKIALTFLPVVDDSVEVVTQAYHPAFSSLHFTFSSECGGPNQENRIVGGMPAGVNRYPWMARLVYDGQFHCGASLLTKEYVLTAAHCVRKLKRSKIRVILGDHDQTITSESPAIMRAVTAIVRHRSFDSDSYNNDIALLKLRKPVTFSKIIKPVCLPPASIEPSGKEGIVVGWGRTSEGGQLPAVVQEVRVPILSLSQCRGMKYRATRITNNRSLCAGRSSTDSCQGDSGGPLLIQQGDRFQIVGIVSWGVGCGRPGYPGVYTRITRYLPWLRANLKDTCLCAN-