Monarch geneset OGS2.0

DPOGS207338
Transcript         DPOGS207338-TA   1269 bp
Protein            DPOGS207338-PA   422 aa
Genomic position   DPSCF300188 - 6960-57803
RNAseq coverage    540x (Rank: top 23%)
Annotation
Heliconius         HMEL005224         2e-129   58.06%
Bombyx             BGIBMGA009551-TA   2e-122   55.92%
Drosophila         CG40160-PD         2e-83    47.56%
EBI UniRef50       UniRef50_Q1HPQ5    4e-120   55.92%   Serine proteinase-like protein n=4 Tax=Obtectomera RepID=Q1HPQ5_BOMMO
NCBI RefSeq        NP_001040462.1     8e-121   55.92%   serine proteinase-like protein [Bombyx mori]
NCBI nr blastp     gi|21630233        4e-121   58.85%   serine proteinase-like protein 2 [Manduca sexta]
NCBI nr blastx     gi|21630233        6e-123   57.89%   serine proteinase-like protein 2 [Manduca sexta]
Group
Gene Ontology      GO:0003824   3.3e-71   catalytic activity
                   GO:0004252   7.5e-56   serine-type endopeptidase activity
                   GO:0006508   7.5e-56   proteolysis
KEGG pathway 
InterPro domain    [153-409]   IPR009003   3.3e-71   Peptidase cysteine/serine, trypsin-like
                   [164-405]   IPR001254   7.5e-56   Peptidase S1/S6, chymotrypsin/Hap
                   [195-210]   IPR001314   7.7e-09   Peptidase S1A, chymotrypsin-type
Orthology group    MCL18768    Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207338-TA
ATGGCCCGGCCGAGGCTGGAAAAGGGCGACCGCGTTCCGCCCACCGTCACGGAATCTGAAATGGAGGCGATTCTGGCTCTCCTCCAGAGTAAGAAGAGAGCACTGGGTTCGGACGGGGTGCAAGGGAGAATTTTCGCTCTCTCCCTAGTGCACCTCGGGGGGAGCCCTCAGGGAGCTATGCTGTTACCACTTCTATCTGTTTTGGCGGTGACGATGGCTTATCCCCCGACCACCGTGGACCCCTCGGTCCTGGTGGACGTCTTTGGAACCCCACCACCCACAGTCAGGCCCACGAACCTACCACCCAGGCTTGAGGACTACAATTTCAAACATCTTACCGTCCCCACCATTAGATTCGGCGTGAACGATGACTGCCAGGAGAGTGTGGAGAGATGCTGCAAGAAATCCAAGCCTTACGTGGAGCCGACGACCCCCGCACCCCCGACGGAGAGGGGTTGCGGGTACCGGAACCCCCACGGCATCGGCTTCACAATAACCGGCGGGAATGGTGACGAGGCCCAGTTCGGCGAGTTCCCGTGGGTGGTGGCCTTGCTGGACGCCGCTAACGGCAGCTACGTGGGGGTCGGGGTGCTCGTCCACCCCCTCGCGGTGATGACGGGGGCGCACGTGGTGTACAAATACCCGGCCGGCGGGCTAATAGCTCGGGCCGGTGAGTGGGATACGCAGACCACGAAAGAGCCCCTCAAGTCGCAGGAGCGAGTCGTGGAAGACATTGTTATAAAGGAAGGTTTCAACCCTAAGACTCTCCATAACGACATGGCCCTCCTCCGGCTCCAGCGACCCCTGGAGCTGGCGGCTCACATCAACGTCATCTGTCTCCCGGAACAAGACGAATCCTTTGAGCAGAGCCGGCACTGTGTGGCCAACGGTTGGGGGAAAAACGTCTTCGGTGGCAGAGGACGGTACGCTGTGATCCTGAAGCGCGTGGAGGTGGACATGGTGCCCTTCGACCAGTGCGCCAACCAGTTGAAGAGGACCCGCCTAGGGAGCCGCTTCCAGCTGCACAGGAGCTTCGTCTGCGCTGGTGGGGAAGAAGGAAGAGATACCTGTCAGGGCGACGGCGGCGCGCCCCTCGCTTGTCCGTTCGGTGAGAATCGGTACAAGCTGAGCGGTCTGGTCGCGTGGGGTATAGGGTGCGGGGAGAAGGACGTACCAGCGGCGTATGTCAGGGTGTCCATGTTCAGGAGGTGGACGGACGACGTCATGCTGTCCTGGGGCCTCAACACCGACGGATACACGGCCTACTGA

Protein sequence:

>DPOGS207338-PA
MARPRLEKGDRVPPTVTESEMEAILALLQSKKRALGSDGVQGRIFALSLVHLGGSPQGAMLLPLLSVLAVTMAYPPTTVDPSVLVDVFGTPPPTVRPTNLPPRLEDYNFKHLTVPTIRFGVNDDCQESVERCCKKSKPYVEPTTPAPPTERGCGYRNPHGIGFTITGGNGDEAQFGEFPWVVALLDAANGSYVGVGVLVHPLAVMTGAHVVYKYPAGGLIARAGEWDTQTTKEPLKSQERVVEDIVIKEGFNPKTLHNDMALLRLQRPLELAAHINVICLPEQDESFEQSRHCVANGWGKNVFGGRGRYAVILKRVEVDMVPFDQCANQLKRTRLGSRFQLHRSFVCAGGEEGRDTCQGDGGAPLACPFGENRYKLSGLVAWGIGCGEKDVPAAYVRVSMFRRWTDDVMLSWGLNTDGYTAY-
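
Consistency check (illustrative sketch, not part of the original record):

The listed lengths agree with each other: 1269 bp is 423 codons, that is 422 amino acids plus one stop codon. The trailing "-" in the protein sequence appears to mark that stop. The Python sketch below shows one way to check the protein against the transcript. It assumes the standard genetic code and that "-" stands for a stop codon; the function name `translate` and its `stop_symbol` parameter are hypothetical and introduced only for this example.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code, in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` (assumed here to be "-",
    matching the last character of the protein sequence in this record).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six and last three codons of DPOGS207338-TA, copied from the record.
print(translate("ATGGCCCGGCCGAGGCTG"))  # start of the transcript
print(translate("GCCTACTGA"))           # end of the transcript, including the stop
```

Passing the full DPOGS207338-TA sequence to `translate` and comparing the result with the DPOGS207338-PA sequence would extend the same check to the whole record.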