Monarch geneset OGS2.0

DPOGS214492
Transcript: DPOGS214492-TA (807 bp)
Protein: DPOGS214492-PA (268 aa)
Genomic position: DPSCF300122 + 233234-236797
RNAseq coverage: 150x (Rank: top 53%)
Annotation
  Source          Hit                E-value  Identity  Description
  Heliconius      HMEL013928         1e-57    47.31%
  Bombyx          BGIBMGA003580-TA   1e-50    43.44%
  Drosophila      CG16998-PA         4e-39    35.20%
  EBI UniRef50    UniRef50_Q4L1K0    9e-58    50.92%    Trypsin-like protein (Fragment) n=1 Tax=Sesamia nonagrioides RepID=Q4L1K0_9NEOP
  NCBI RefSeq     XP_001814174.1     3e-45    40.41%    PREDICTED: similar to Trypsin alpha [Tribolium castaneum]
  NCBI nr blastp  gi|51094378        3e-57    50.92%    trypsin-like protein precursor [Sesamia nonagrioides]
  NCBI nr blastx  gi|51094378        7e-56    50.92%    trypsin-like protein precursor [Sesamia nonagrioides]
Group
Gene Ontology
  GO:0003824  1.3e-83  catalytic activity
  GO:0004252  1.4e-83  serine-type endopeptidase activity
  GO:0006508  1.4e-83  proteolysis
KEGG pathway
  ani:AN2366.2  8e-41
  K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain
  [12-263]  IPR009003  1.3e-83  Peptidase cysteine/serine, trypsin-like
  [24-258]  IPR001254  1.4e-83  Peptidase S1/S6, chymotrypsin/Hap
  [51-66]   IPR001314  2e-14    Peptidase S1A, chymotrypsin-type
Orthology group: MCL19891 (Insect specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214492-TA
ATGTCGGTTGCGCGGGCGCAGGACGTCTTACAAGTAGCCGACTTCACGACACCCGACGTCCCAAATACAAGAATTATTGGAGGAAATAATACATCTATAGAGAAGTATCCTTACACAGTACAGATCCTGTATGATTCTCAACTGTCATGTGGAGGAGCCCTCATCACATCTCGTCACGTGCTCACCGCGGCACACTGCTTTGTGGCATCGAACGGCAAGATTGTGAGTCCGAGATACTTTACAGTGAGAGCTGGTTCCACTTATCTAGACAGAGGTGGTAATGTGTATAGCGTGTCAGCGATTACAGTCCACGGCTCATACAACACGCCGGTGCGCAGTAACGATATCGCTATAGTAACACTGAGCAAAGCGGTGAATTTCACGAGCAGTGTAAGGAGCGCGGTAATAGCGTCACCGGGGGCGGTGGTACCAGACAACGGATCGGTGGTGGCAGTGGGATGGGGACGAACTACGCTCGATGGACCTGAATCCCAGGTGCTCCAGGCGGTGGAGGTATTCAAAGTGTCTCAGCAGGAGTGCTCCAAGCGCTACGACGAGCTTCACCAATCTACCAACAGCCCCTTCCCCGTCACCGACGGCATGATATGCGCCGGATTACTTGACGTCGGAGGTAAAGACGCGTGTCAAGGCGATAGCGGCGGTCCGCTGATACATAGAGGGGTGGTAGTGGGGCTGAGCTCGTGGGGCTACAGCTGCGCGCAACCCTACTTCCCCGGAGTGTACACCCGAGTCGCTACATACGCCACCTGGATCAACAGCACGGTGAGCAGCACGCGCTCGCTATAG

Protein sequence:

>DPOGS214492-PA
MSVARAQDVLQVADFTTPDVPNTRIIGGNNTSIEKYPYTVQILYDSQLSCGGALITSRHVLTAAHCFVASNGKIVSPRYFTVRAGSTYLDRGGNVYSVSAITVHGSYNTPVRSNDIAIVTLSKAVNFTSSVRSAVIASPGAVVPDNGSVVAVGWGRTTLDGPESQVLQAVEVFKVSQQECSKRYDELHQSTNSPFPVTDGMICAGLLDVGGKDACQGDSGGPLIHRGVVVGLSSWGYSCAQPYFPGVYTRVATYATWINSTVSSTRSL-
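
The transcript and protein lengths listed in this record can be cross-checked against each other: 268 codons plus one stop codon give 807 nt. The Python sketch below illustrates that check and translates the first and last few codons of DPOGS214492-TA. It assumes the standard genetic code, that the trailing "-" in the protein sequence marks the stop codon, and that the genomic coordinates are 1-based and inclusive; none of these conventions is stated in the record itself. The helper name `translate` and the variables `head` and `tail` are illustrative only.

```python
from itertools import product

# Standard genetic code (assumed), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 36 nt and last 15 nt of DPOGS214492-TA, copied from the record.
head = "ATGTCGGTTGCGCGGGCGCAGGACGTCTTACAAGTA"
tail = "ACGCGCTCGCTATAG"

print(translate(head))  # MSVARAQDVLQV
print(translate(tail))  # TRSL*

# Length bookkeeping from the record's own numbers.
transcript_bp = 807
protein_aa = 268
genomic_span = 236797 - 233234 + 1  # assumes inclusive coordinates

print(transcript_bp == 3 * (protein_aa + 1))  # True
print(genomic_span, genomic_span - transcript_bp)  # 3564 2757
```

Under these assumptions, the genomic span is 2757 bp longer than the transcript, which would be consistent with intronic sequence, although the record does not list the exon structure.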