Monarch geneset OGS2.0

DPOGS203664
Transcript: DPOGS203664-TA | 1578 bp
Protein: DPOGS203664-PA | 525 aa
Genomic position: DPSCF300010 - 2499362-2506528
RNAseq coverage: 612x (Rank: top 21%)
Annotation
Heliconius: HMEL013355 | 0.0 | 69.25%
Bombyx: BGIBMGA003461-TA | 1e-172 | 64.84%
Drosophila: CG31728-PC | 9e-106 | 56.94%
EBI UniRef50: UniRef50_G9F9H4 | 2e-163 | 63.38% | Seminal fluid protein CSSFP022 (Fragment) n=1 Tax=Chilo suppressalis RepID=G9F9H4_9NEOP
NCBI RefSeq: XP_625051.1 | 4e-138 | 53.45% | PREDICTED: similar to CG31728-PA [Apis mellifera]
NCBI nr blastp: gi|364023595 | 7e-163 | 63.38% | seminal fluid protein CSSFP022 [Chilo suppressalis]
NCBI nr blastx: gi|364023595 | 3e-166 | 63.58% | seminal fluid protein CSSFP022 [Chilo suppressalis]
Group
Gene Ontology: GO:0003824 | 2.1e-95 | catalytic activity
    GO:0004252 | 4.4e-91 | serine-type endopeptidase activity
    GO:0006508 | 4.4e-91 | proteolysis
KEGG pathway 
InterPro domain: [269-525] | IPR009003 | 2.1e-95 | Peptidase cysteine/serine, trypsin-like
    [289-520] | IPR001254 | 4.4e-91 | Peptidase S1/S6, chymotrypsin/Hap
    [316-331] | IPR001314 | 6.1e-14 | Peptidase S1A, chymotrypsin-type
Orthology group: MCL15750 | Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203664-TA
ATGAAATTAAAAAATAAACTCATTATTTTGTTCATACATTTTTTAACTGACGTCTCAAGAAGTGATTTGACACACGACGATAAGACACTTGGATTTTCTGCTCATGATGGCGCCCTTTACACCGAAGATGCAGTTTTTATTAGTGCTGAACCGAATCGTATGAAACGTGAATCAAACAGAACAAGGGATGACAAACAACTATTATTTTTGCAAGCTAGACAAGCAAATGATGAAACCTTTGGAAGAGATTGTGAAACAACTATAGGGAAAAAGGGAATTTGCAAGAGCTTTCGCGACTGCTACCCATTATTTAAGATTGTGGATTTATCCGGTTACGATGGTTGGGTCATGGGTCACTATGACACATGTAGTTTTGTAAACAGGGAAAATTCAGAGCTATTCGGAGTGTGTTGTACTGAGCCCGTTGGCACACCGCCGCAGCAGGAACCAGACGTGCAACGACTTGGTGTTTTTAGACCTCCTTATCCGATTTCAATGAACAATTACCAACATCCATCACCTCTCTTACCTAAATGGATGAACATGAACGAGCCTCTTCATCGCCAGTTCTTTTCTCAATGGCCGCCAACTATACCACCACTACCCACACATCCACCGGACCACACAGCTCCAACACATCCGCCATCTATCGTTGCCGGTATTCCGACGACAACTAAACCGTCAAACGGCTTACCATCTACAACTTGGGGTACGAAACCTCCAGCAACGACAAAACAGACCTGGTCTCCTGCATATCCAACACAGCCAACAAAGCCAACCGGCCAACCGGGCGTGGATTCGTCGTGCGGAATTAAGAACGGACCACAGACCTACGGAAGTACGTATGAATCTCTTGACGAGGAGCGTATAGTGGGGGGTCATAACGCGGAGCTAAACGAGTGGCCATGGATAGTAGCGCTGTTCAATAATGGAAGACAATTCTGCGGAGGATCCCTCATAGACGATAGACATGTTTTAACAGCAGCTCATTGTGTAGCTCATATGACATCGTTGGATGTCGCTCGACTCACGGCGAGACTGGGAGACTACAACATACGGACGAACACAGAGACACAACACGTTGAGAGAAGAATCAAGAGAGTTGTCAGACATCGCGGTTTCGACATGAGGACATTATACAACGACGTAGCTGTTCTAACTTTAGACCAACCTGTGACTTTCACAAAAAACATTCGACCGGTTTGTCTTCCCGGAGGAGCCAGAGCTTATTCAGGACTAATAGCGACGGTAATAGGATGGGGAAGCTTGAGAGAAAGTGGTCCTCAACCGTCTATTCTACAAGAAGTGTCAATACCAATTTGGACTAACAACGAGTGTCGTCTCAAGTACGGCTCCGCGGCCCCTGGTGGGATCGTTGACCACATGCTGTGCGCTGGTAAAGCCAGTATGGATTCATGCAGTGGCGACAGCGGTGGACCTTTGATGGTGAATGAAGGCGGTCGTTGGACTCAAGTCGGCGTCGTGTCATGGGGTATCGGATGTGGTAAGGGTCAGTACCCTGGGGTCTACACACGAATCACCTCTTTCCTCCCCTGGATACAAAAGAACGCTAAGTGA

Protein sequence:

>DPOGS203664-PA
MKLKNKLIILFIHFLTDVSRSDLTHDDKTLGFSAHDGALYTEDAVFISAEPNRMKRESNRTRDDKQLLFLQARQANDETFGRDCETTIGKKGICKSFRDCYPLFKIVDLSGYDGWVMGHYDTCSFVNRENSELFGVCCTEPVGTPPQQEPDVQRLGVFRPPYPISMNNYQHPSPLLPKWMNMNEPLHRQFFSQWPPTIPPLPTHPPDHTAPTHPPSIVAGIPTTTKPSNGLPSTTWGTKPPATTKQTWSPAYPTQPTKPTGQPGVDSSCGIKNGPQTYGSTYESLDEERIVGGHNAELNEWPWIVALFNNGRQFCGGSLIDDRHVLTAAHCVAHMTSLDVARLTARLGDYNIRTNTETQHVERRIKRVVRHRGFDMRTLYNDVAVLTLDQPVTFTKNIRPVCLPGGARAYSGLIATVIGWGSLRESGPQPSILQEVSIPIWTNNECRLKYGSAAPGGIVDHMLCAGKASMDSCSGDSGGPLMVNEGGRWTQVGVVSWGIGCGKGQYPGVYTRITSFLPWIQKNAK-
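
The lengths and domain coordinates in this record can be cross-checked against each other. The following is a minimal sketch, not part of the original record. It assumes the transcript is a complete coding sequence that includes the stop codon (the nucleotide sequence above starts with ATG and ends with TGA) and that the InterPro ranges are 1-based, inclusive protein positions. The function names `expected_protein_length` and `domains_out_of_range` are hypothetical helpers introduced here for illustration.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 1578
PROTEIN_AA = 525
DOMAINS = {
    "IPR009003": (269, 525),
    "IPR001254": (289, 520),
    "IPR001314": (316, 331),
}


def expected_protein_length(cds_bp: int, has_stop: bool = True) -> int:
    """Return the protein length implied by a coding sequence of cds_bp bases.

    Assumes the sequence is in frame from its first base; the stop codon,
    if present, does not encode a residue.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if has_stop else codons


def domains_out_of_range(domains: dict, protein_len: int) -> list:
    """List domain IDs whose 1-based inclusive range falls outside the protein."""
    return [
        name
        for name, (start, end) in domains.items()
        if not (1 <= start <= end <= protein_len)
    ]


if __name__ == "__main__":
    print(expected_protein_length(TRANSCRIPT_BP) == PROTEIN_AA)
    print(domains_out_of_range(DOMAINS, PROTEIN_AA))
```

With the values listed in this record, the transcript length implies a 525-residue protein, and all three InterPro ranges fall within it.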