Monarch geneset OGS2.0

DPOGS215119
Transcript: DPOGS215119-TA, 1170 bp
Protein: DPOGS215119-PA, 389 aa
Genomic position: DPSCF300139 + 360951-367277
RNAseq coverage: 98x (Rank: top 61%)
Annotation
Heliconius: HMEL007800  3e-86  42.39%
Bombyx: BGIBMGA009610-TA  5e-78  47.12%
Drosophila: CG11843-PA  1e-40  40.24%
EBI UniRef50: UniRef50_Q5MPB3  2e-78  40.38%  Hemolymph proteinase 21 n=5 Tax=Bombycoidea RepID=Q5MPB3_MANSE
NCBI RefSeq: XP_001655816.1  3e-52  34.57%  serine protease [Aedes aegypti]
NCBI nr blastp: gi|379699022  2e-78  40.82%  serine protease HP21 precursor [Bombyx mori]
NCBI nr blastx: gi|56418425  2e-79  43.14%  hemolymph proteinase 22, partial [Manduca sexta]
Group
Gene Ontology: GO:0003824  1.4e-75  catalytic activity
  GO:0004252  4.3e-64  serine-type endopeptidase activity
  GO:0006508  4.3e-64  proteolysis
KEGG pathway: dpo:Dpse_GA19543  1e-30
  K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain: [136-388] IPR009003  1.4e-75  Peptidase cysteine/serine, trypsin-like
  [143-383] IPR001254  4.3e-64  Peptidase S1/S6, chymotrypsin/Hap
  [173-188] IPR001314  3.5e-11  Peptidase S1A, chymotrypsin-type
Orthology group: MCL26343  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215119-TA
ATGACCATAACCTTGAAATTGCTGTTCGCCATCACAAGCATAGCGTTTTCTCCATTTTTGAGTAGTACTGAAATAGGGGATCCATGTCCTGGAGGCCATCCTGGGGTGTGTGCTGACATCTACAATTGCACGTCAGCTTTAATTAACATCAAACTTAGAAAACCACTGAGCGTGCCAAGGAAAATTACGACTACAACTACCGCTCCAGTCGCTAGAAAAAAAATAAAACGAGTCAGGTTCAGCCATTACAAGATACCGAGGAATTGCTCCCCTATAGCAAGCAACCTCACTGTTCCAAAAACAGGACAAAAGGCGTTCGACAAATGTTTGTACTATCAGGAAAAATACGTGTATCCCTGCCTGGATAGCCCTCTTCCGGGACAAGCCAAGGCGAGGGCGAATTACTGTACATGGAGTTCCGAAGGCCTGATCGTTAATGGAGAGAATGCTTCTCGCGGGGAGTTCCCACACATGGCACTGCTCGGTTTTGGAACGCGCAAAATAGAATGGAAATGTGGTGGAACTATAATCAGCGAGAGGTTTGTTCTTACTGCGGCACACTGCACCAAGACTGCTATACGGGGTTCTGTCACCAAAATCAAGATAGGAATCTTAAAATCGTCTGAGCCAGATACGGATTTCAACGTGTACAATGTTTTTAAGATTCACGTCCACGAAAATTATCATTCTCCACTAAAATACAATGACATAGCTTTGCTTGAGACGGACAGAGAGATGCTGCTAGGTCCAGAAGCATTCCCAGCATGTCTCAATGATGGGACTGAGGTCAGCGACACAGTCATAGTGTCTGGCTGGGGTCAAACCAGCACTACCAGGAGAATCATGTCCGACGTCTTGCAGAAGGCGTACTTGAAAAACTTCGATGAGTCCGAATGCCACAGCTATCACGAGGTTTACAGTCACAGAAACATGCCGGATGGAATAGATAGCGAGACGCAAATATGTTTCGGAAACAAAAACAATGCAAGTGATACTTGCGGAGGCGATAGCGGAGGTCCAGCCCAGATCAAACATCCGAAAGTGTACTGCATGTACTTGGTGATGGGTGTGACCTCTTTCGGACGATCCTGTGGGTTGCAGGGAGCGCCAGGAGTTTACACTAGAGTATCTCATTTTTTGCCCTGGATCGAGAGAACTGTGTGGCCATAA

Protein sequence:

>DPOGS215119-PA
MTITLKLLFAITSIAFSPFLSSTEIGDPCPGGHPGVCADIYNCTSALINIKLRKPLSVPRKITTTTTAPVARKKIKRVRFSHYKIPRNCSPIASNLTVPKTGQKAFDKCLYYQEKYVYPCLDSPLPGQAKARANYCTWSSEGLIVNGENASRGEFPHMALLGFGTRKIEWKCGGTIISERFVLTAAHCTKTAIRGSVTKIKIGILKSSEPDTDFNVYNVFKIHVHENYHSPLKYNDIALLETDREMLLGPEAFPACLNDGTEVSDTVIVSGWGQTSTTRRIMSDVLQKAYLKNFDESECHSYHEVYSHRNMPDGIDSETQICFGNKNNASDTCGGDSGGPAQIKHPKVYCMYLVMGVTSFGRSCGLQGAPGVYTRVSHFLPWIERTVWP-
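
The 1170 bp transcript and the 389 aa protein above are consistent with a plain codon-by-codon translation: 1170 / 3 = 390 codons, that is, 389 residues plus one stop codon, shown as the trailing "-" in the protein record. The following is a minimal sketch of that check. It assumes the standard genetic code (NCBI translation table 1), which this record does not state explicitly. The function name translate and its stop_symbol parameter are illustrative and not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons enumerated in TCAG order.
# Assumption: the record uses the standard code; the record does not say so.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol, matching the "-" that ends
    the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    # First 30 nt and last 12 nt of DPOGS215119-TA, copied from the record.
    print(translate("ATGACCATAACCTTGAAATTGCTGTTCGCC"))  # MTITLKLLFA
    print(translate("GTGTGGCCATAA"))                    # VWP-
```

Passing the full DPOGS215119-TA sequence to translate should, under the same assumption, give a 390-character string whose first 389 characters are the residues of DPOGS215119-PA.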