Monarch geneset OGS2.0

DPOGS200031
Transcript         DPOGS200031-TA  1137 bp
Protein            DPOGS200031-PA  378 aa
Genomic position   DPSCF300337 + 85048-87769
RNAseq coverage    213x (Rank: top 46%)
Annotation
Heliconius       HMEL013064        3e-99   64.98%
Bombyx           BGIBMGA012427-TA  3e-107  56.86%
Drosophila       snk-PB            4e-44   37.75%
EBI UniRef50     UniRef50_Q1HPQ6   8e-105  56.86%  Serine protease 7 n=4 Tax=Obtectomera RepID=Q1HPQ6_BOMMO
NCBI RefSeq      NP_001040537.1    2e-105  56.86%  serine protease 7 [Bombyx mori]
NCBI nr blastp   gi|56418399       8e-110  57.35%  hemolymph proteinase 9 [Manduca sexta]
NCBI nr blastx   gi|56418399       8e-111  57.94%  hemolymph proteinase 9 [Manduca sexta]
Group
Gene Ontology    GO:0003824  2.3e-72  catalytic activity
                 GO:0004252  2.3e-70  serine-type endopeptidase activity
                 GO:0006508  2.3e-70  proteolysis
KEGG pathway     dpo:Dpse_GA19543  2e-39
                 K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain  [104-374] IPR009003  2.3e-72  Peptidase cysteine/serine, trypsin-like
                 [114-369] IPR001254  2.3e-70  Peptidase S1/S6, chymotrypsin/Hap
                 [146-161] IPR001314  2e-11    Peptidase S1A, chymotrypsin-type
Orthology group  MCL21003  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200031-TA
ATGCATTTATGGGGATTCGCGACAATATTATTAGCGACAACATATAAAGGAGAAACTATTGAAAGGAAACCATCTTTAAAAGTGGACAAACCGTGGATCTGGTCGGAGTATAGCACTCGTCCTGTCAAGAAAAACTTCACTCAAAACTATGTACAGCCGCAACAGTCACCATTGATAATTTTACCGGAAATACCGAAACCTAATTTCATAAAAGCCGGTCGAAGGATTAGTGAATCAAAATGCCTAGAATACTTATGGGAACGTAGGAATCGTTTAGCATCTAAAGCTCCTGATGCTACACTTGAAATATTTGTTTTAGAACCAGGTTTCTTCCAAATAGCAGAAATTGGGGGAAGAGATGCTGAAGATGGCGAATTCCCCCATATGGGAGCTCTTGGTTGGAGGGCAGTTTTAGGTACTTGGATATTCAAATGTGGAAGTAGTCTAATAAGCTCGAAATTTACATTGACTGCTGCACACTGTTCCAAAATACCTTCAGATCCCACCATTGTAAATCAAGTTCCGGAGATAGTGAGACTGGGTGAAAAAAATATTATTGATATCTTTGCCAATGGTGTTCAACCTACGGATGCAAAAATTCTAAGAATAATTATTCATCCGTTATATTCTCCACCAAAAAAATACTACGACATAGCTTTGGTAGAGCTTGCATCCGAGTTGGTTTTTTCGAGTAACGTGCAACCCGTATGTCTGTGGAATAATTTCGACACCAGCAAACTCGGATCTAAAGTGTCGTCGACTGGTTGGGGTGTTGTTGATACAGCAACTAAAAAGACGTCGCCAATACTTCAGGCTATTGAAATCGAATTGATAGACTCTGGTAGATGTAATCAACTGTTGAGACACGCTTGTAATAGAAGATGGTGTGGACTTCAAGACCATCAGTTCTGTGCAGGAAAACTGGAAGGAGGCGTTGACGCTTGCCAGGGCGATTCTGGAGGTCCATTGCAAGTAAAAATTAATTTACCTGATGCTGGTGAAGGGACAATGCATTATTTGTTAGGCGTTACATCCTTTGGTATTGGATGTGCTCGTCCCAACCTTCCAGGTGTTTATACCAGGGTGTCATCATTTATTGACTGGATTGAAGACAACGTGTGGGGAAAACGTTTATAG

Protein sequence:

>DPOGS200031-PA
MHLWGFATILLATTYKGETIERKPSLKVDKPWIWSEYSTRPVKKNFTQNYVQPQQSPLIILPEIPKPNFIKAGRRISESKCLEYLWERRNRLASKAPDATLEIFVLEPGFFQIAEIGGRDAEDGEFPHMGALGWRAVLGTWIFKCGSSLISSKFTLTAAHCSKIPSDPTIVNQVPEIVRLGEKNIIDIFANGVQPTDAKILRIIIHPLYSPPKKYYDIALVELASELVFSSNVQPVCLWNNFDTSKLGSKVSSTGWGVVDTATKKTSPILQAIEIELIDSGRCNQLLRHACNRRWCGLQDHQFCAGKLEGGVDACQGDSGGPLQVKINLPDAGEGTMHYLLGVTSFGIGCARPNLPGVYTRVSSFIDWIEDNVWGKRL-
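
Consistency check (sketch):

The transcript and protein records above appear to be related by a plain translation of the coding sequence: 1137 bp corresponds to 379 codons, that is, 378 residues plus one stop codon, which the protein record writes as a trailing "-". The Python sketch below illustrates that relationship. It assumes the standard genetic code applies to this gene model; the helper name `translate` is hypothetical and is not part of the database.

```python
# Minimal sketch: translate a coding sequence with the standard genetic code.
# Assumption: the standard code applies; stop codons are written as "-"
# to match the convention used in the protein record above.

BASES = "TCAG"
# Amino acids in the conventional TCAG codon order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a nucleotide CDS (length divisible by 3) into protein."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First and last 21 nucleotides of DPOGS200031-TA, copied from the record.
cds_start = "ATGCATTTATGGGGATTCGCG"
cds_end = "GTGTGGGGAAAACGTTTATAG"

protein_start = translate(cds_start)
protein_end = translate(cds_end)
```

Applied to the full DPOGS200031-TA sequence, the same function can be compared against DPOGS200031-PA to confirm that the two records agree.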