Monarch geneset OGS2.0

DPOGS204782
Transcript        DPOGS204782-TA  1044 bp
Protein           DPOGS204782-PA  347 aa
Genomic position  DPSCF300231 + 622091-624420
RNAseq coverage   28x (Rank: top 76%)
Annotation
Heliconius      HMEL017696        2e-132  84.00%
Bombyx          BGIBMGA013720-TA  9e-112  74.82%
Drosophila      epsilonTry-PA     6e-39   36.75%
EBI UniRef50    UniRef50_O97399   7e-41   36.84%  Trypsin n=1 Tax=Phaedon cochleariae RepID=TRYP_PHACE
NCBI RefSeq     XP_967253.1       4e-42   39.74%  PREDICTED: similar to putative trypsin-like proteinase [Tribolium castaneum]
NCBI nr blastp  gi|345482800      4e-41   36.30%  PREDICTED: trypsin-7 [Nasonia vitripennis]
NCBI nr blastx  gi|4530058        8e-43   41.92%  trypsin-like serine protease [Ctenocephalides felis]
Group
Gene Ontology    GO:0003824  1.2e-77  catalytic activity
                 GO:0004252  1.7e-70  serine-type endopeptidase activity
                 GO:0006508  1.7e-70  proteolysis
KEGG pathway     ani:AN2366.2  3e-38
                 K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain  [39-274]  IPR009003  1.2e-77  Peptidase cysteine/serine, trypsin-like
                 [49-269]  IPR001254  1.7e-70  Peptidase S1/S6, chymotrypsin/Hap
                 [76-91]   IPR001314  3.4e-14  Peptidase S1A, chymotrypsin-type
Orthology group  MCL24887  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204782-TA
ATGACTCGCCATCTACTTTGGCTTGTTCTTTATCTCGTACCGTTTGTGAACTTACAAGAGGCGGTTGATAGTCAAAAAGCAGAAGAGACGGAAAATGAAGTTCAAGATATCGAATTGGAGGCAGCTTGTCCCGAACTCAGCGGACAGATCATCGGAGGCCGACCGAGTTCAGTGTCGCGACATCCTTACCAAGTGTCCATGGTGCTGAACGGAAACTCCTTCTGTGGTGGCTTTATAATAAGCAGAGATTATGTGCTAACAGCGGCACATTGTGTCCAGAATGTAGCACCGCAAGCTGTTCGTCTGCGTGTGGGCAGCACTCGTCGTGACTCTGGAGGTCGTATTGTGGCCGTCTCCAATGTAACATGGCATGCGTCGTACGGCCAGCCGCAGTTCGACAACGATATAGCAGCCTTAAGACTGGCCCAGCCTCTGGTCTTCGGAGACTCCATACAACCCATTAGACTTCCGAGACCCAGACAACCTGTTCCTTTGGTCAGGCTGACGGTTACTGGATGGGGTCTCACTGCTTTGGGGGGACGTAGAATTCCCAGGATCATGATGGAGGCTAACGTTCCAGTAGTCCCTCACTGGCTCTGCCGACTGTCTTACGGAGACGCTCTCACAAATAACATGTTTTGTGGTGGACATTTTCTTATAGGAGGAGTTTCATCTTGTCAGGGTGACTCCGGAGGGCCGGCGGTGTTCCGCGGGACAGCTTTCGGCATCGTGTCATTCGCTCGGGGCTGCGCCCTGCCGCTATCACCGACTGTGTTCAGTAACATCGCTGCCCTGCGGGATTGGGTCACACAGAACACCGGAGTATACGAGATGTTTCTTATTGGATACGAGTTCAAGGTCCTGTCTCCAATCTTCACAATCCACTGGGGTCTGCAGGCGCGTCGGACCAGACCTCTGTGGAGGGAGAAACAAAACCAGAAGAACAGAAAACACTTCGAGACCTTTAAACGGGAACTGTACGCGAGGTACAGGAGGGACCCTCTCCATCTACTGAGGAGACCTCAGCAGGCGAAGAAAACGTAG

Protein sequence:

>DPOGS204782-PA
MTRHLLWLVLYLVPFVNLQEAVDSQKAEETENEVQDIELEAACPELSGQIIGGRPSSVSRHPYQVSMVLNGNSFCGGFIISRDYVLTAAHCVQNVAPQAVRLRVGSTRRDSGGRIVAVSNVTWHASYGQPQFDNDIAALRLAQPLVFGDSIQPIRLPRPRQPVPLVRLTVTGWGLTALGGRRIPRIMMEANVPVVPHWLCRLSYGDALTNNMFCGGHFLIGGVSSCQGDSGGPAVFRGTAFGIVSFARGCALPLSPTVFSNIAALRDWVTQNTGVYEMFLIGYEFKVLSPIFTIHWGLQARRTRPLWREKQNQKNRKHFETFKRELYARYRRDPLHLLRRPQQAKKT-
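
The transcript and protein records above can be cross-checked against each other. The following is a minimal Python sketch, not the database's own tooling: it assumes the standard nuclear genetic code applies, and the names `translate` and `expected_protein_length` are hypothetical helpers introduced only for this illustration. The stop codon is written as "-" to match the trailing "-" in the protein record.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# Assumption: the standard nuclear code applies to this transcript.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are rendered as `stop_symbol`. Ambiguous bases such as N
    are not handled and raise KeyError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(transcript_bp):
    """Residue count for a transcript that is all coding sequence
    and ends in a single stop codon (hypothetical helper)."""
    return transcript_bp // 3 - 1


# First 27 bases of DPOGS204782-TA, copied from the nucleotide record.
prefix = "ATGACTCGCCATCTACTTTGGCTTGTT"
print(translate(prefix))              # MTRHLLWLV
print(expected_protein_length(1044))  # 347
```

The translated prefix matches the first nine residues of DPOGS204782-PA, and 1044 bp corresponds to 348 codons, that is 347 residues plus one stop codon, which agrees with the listed protein length. Passing the full nucleotide record to `translate` would extend the same check to the whole sequence.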