Monarch geneset OGS2.0

DPOGS212233
Transcript: DPOGS212233-TA  3375 bp
Protein: DPOGS212233-PA  1124 aa
Genomic position: DPSCF300263 + 187102-209283
RNAseq coverage: 95x (Rank: top 62%)
Annotation
Heliconius: HMEL003606  6e-117  51.61%
Bombyx: BGIBMGA004445-TA  5e-83  56.55%
Drosophila: CG30371-PA  3e-57  33.33%
EBI UniRef50: UniRef50_Q6R558  2e-135  61.01%  Trypsin-like proteinase T2b n=7 Tax=Obtectomera RepID=Q6R558_OSTNU
NCBI RefSeq: NP_001155191.1  2e-115  55.44%  silk gland derived serine protease [Bombyx mori]
NCBI nr blastp: gi|67906825  7e-135  61.01%  trypsin-like proteinase T2b precursor [Ostrinia nubilalis]
NCBI nr blastx: gi|67906825  8e-138  62.08%  trypsin-like proteinase T2b precursor [Ostrinia nubilalis]
Group
Gene Ontology:
GO:0003824  6.2e-85  catalytic activity
GO:0004252  7.7e-79  serine-type endopeptidase activity
GO:0006508  7.7e-79  proteolysis
KEGG pathway 
InterPro domain:
[108-353]  IPR009003  6.2e-85  Peptidase cysteine/serine, trypsin-like
[115-345]  IPR001254  7.7e-79  Peptidase S1/S6, chymotrypsin/Hap
[756-870]  IPR000859  8.9e-17  CUB
[145-160]  IPR001314  6.8e-11  Peptidase S1A, chymotrypsin-type
Orthology group: MCL14738  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212233-TA
ATGAGCCCGAACTTCCCCCGTAGTTACCCAGCCGGGGTAGCCTGTCGATGGATTGTGCAGTGTCCTAACGGCTATCAATGCAGGGCTAGGTGTGGTTTAGGTATGCCACAGACTCCAACCTGCTCCATGGATCGCCTCTACATCTCCAGAACAGGAGACAGTGAGCTGAATTCAGCAGAATATCATTGTGGACGCGGATTAGTCAACGCTGTGTCTATTGATTCACGACTTACTGTTGGGCTAGTAACATCGACAAACAGTACTGGAGGTAGATTCAAGTGTCTTATCTCAACCCAACCGATTACCACTCCAACATGCAATTGTGGCTACAAGAAAACGAACCGAATCGTTGGTGGGGTAGAGACGAGACCCCACGAGTATCCCATGATGGCTGGTATAAGATTTTCGGATTTCGGTGAGGCCATCAAATGCGGGGCAGTTATTATTGATAGGAAGTATGTACTGACAGCTGCTCATTGTGTCGAAAACAAGAAATTGGATGAACTAACCGTGGTTGTTGGCGAACATAACGTGACCACCGACGTCGATACTCCCGATGGAAAGGTTTTACGAGTGAGAAAAGTCATAATTCATCCAAATTATAACCCTAACACACAGGATAATGATATTGCGATTATTCGAACAGCTGGCTCCATCGATTACAACGAAAATGTTGGCCCAGTCTGTCTTCCTTTTAAATACGTCAATCATGCTTTCAATGGATCTACAGTCACAATTTTAGGCTGGGGAACTACGACCTTTGGAGGGCCAAAACCAAAGGTGCTGATTAAGGCTAACGTCGAGGTTATTAGTCAGGAACGTTGCCAACGTAATATTTCTACTCTAACACCCCGGCAAATATGTACATACACGCCAGGAAAAGATTCCTGTCAGGACGATTCCGGTGGACCCTTACTATATACTGACTCGTACAATGGTCGTCTGTACAACGTAGGAGTGGTCAGCTTTGGTCAATTCTGCGCGTCAACTGAACCAGCCATCAATACGAGAGTCACTTCTTTCTTGAACTGGATTAGGAAGGAGGCACCAGCTAATTACTGTATGTATATCTTAGGTGCGCTTCTATTGTTCATACTGCAGGGTGCGCACGCGCAACAGCCAAATTGTGATTATGCAATCTCCGTTACCCCCCAGAAGAGTGAACCAATATCCAGCCCCGGATATCCTAATAGTTACCCACGTGGATCAGCCTGCCGTTGGATAGCCGTGTGTCCTTCCGGCTATAAATGTAAAGTCAATTGTGACGATGTCTATTTACCTGGGGGCCAAGATTGTCCGGTAGATCGCTTACTGATCTCAAAAACTGGTGACAGTCAATTGACATCAGCTGAGTATTATTGCGGACAAGGCTCGGTGTCTGCTATATCAACAGGAACACGGATAAGCATAGGTCTCGTCGTCTCAAATCGAAGCCCGGGAGGTAGATTTAGATGCGAAATTAGCGCCCAACGAGACAGTGCTTCAACTTGCAACTGTGGTTACAGGAAAGTGAGTCGTATCGTCGGAGGCGAAGAAACGAAGCCAAATGAATTTCCTATGATGGCTGGTATCATTTATTTGGGAGAAAACACTATCAAATGTGGTGCAGTCATCATTGATAATATATATGTACTGTCAGCCGCACATTGTGTTATCACTAAGGGAGTAAACGACATCGCTGTTGTGGTTGGCGAACATGACGTCAGGGTTGGAACTGATTCTCCAGACATCCAAGTATTCAGGGTGTCCAGGATCATTACTCATCCGAACTACAATCCCAACACATATGATAATGACATCGCGATAGTTAAAATACAAGGCGTCATCAGGTACAGTGAGAACGTTGGTCCAGTTTGTCTGCCATTCAGTTTTAAGGACGTCGACCTTTCTGGAGCCGTCGTAACAATTTTAGGGTGGGGTACTTTATTCCCTGGCGGCCCAAGCTCCAGTGTTCTCCGGAAGGTGAACGTGAACGTCATCAGTCAGTCGACATGTAGACA
AAATGTCTTATCACTAACTCCCCGACAAATTTGCACATTCACCAGAGGGAAAGACGCGTGTCAGGACGATTCCGGCGGTCCTCTTCTCTACCTGGATACTTTTACTGGTCGTCTGTTCAACGTGGGAATTGTCAGCTTCGGCCAGTTATGCGCCTCGAACAGTCCCGGCATCAACACACGAGTCACCGACTTCCTAGACTGGATTATGTTCGGATCAGGAGCTATTTTTTTGTTGGTTGTGGGCTATGCTCAGGCGCAAGATGCGAATTGCGATTTCTTTTTAAATGTTGCTGCCGGGAGAAGTTATCCTATCTCCAGTCCGAACTATCCATACAGCTACAGGCCAGGAGTAACTTGTCGCTGGATTGCACAATGCCCAAATGGATATAATTGCAGATTGGATTGCAGCGAAATAAATTTACCGCAGACGCAAAACTGCTACATGGATAGGTTACTAGTATCCAAGACAGGGGATAGTCAATTGGGGTCATCGGAATACCATTGTGGTTATGGCACATTGACAGCTGTCTCCGTAGCAAACAGGATAAGTGTTGGTCTCGTTACGTCGAGGTCCAGTCGCGGCGGCAGATTTACCTGCACTGTTACTGCCCAAGCGTCGTCAACCTGCAGCTGTGGTTACAGAAATAGTTATATCGTCGGCGGTGAGGAGACACGTCCTAATGAATATCCCATGATGGCTGGCATCGTGTATGTGGGAGAGAACACCATCAAATGCGGTGCAGTCATCATTGATAACGGATACGTATTGACAGCTGCTCACTGCGTCGTCGGCAAAAATCTCGGTGAACTCGCTGTGGTCGTTGGCGAACATGACGTCAGCACCGGAGCGGATTCGCCGTCCTTGCAAGTTTTCAGAGTTGCTTCGGTTATAATTCATCCTCAATTTAACTCGGATACATATGACAACGACATCGCCATCATACAGATATATGGCAGTATAGTGTACAGTCAGAAAGTAGGACCTGTCTGTCTGCCATTTAAGTTCATAAACGACGACTTCACCGGATCCAAAGTTACCATTTTGGGTTGGGGGACGACATTCCCCGGAGGTCCAACATCGAACGTGCTCCGGAAGGTGGACGTGAATGTCGTCAGTCAGGCTTCATGCAGCAGAAGTTATCCAAGTCTCAGTAACAATCAGATGTGTACATTTGCTCAAGGGAAAGATGCTTGTCAGGACGATTCTGGCGGTCCTCTGCTCTACCAGGACCCTTCGAACGGCCGTTTATACAGTGCTGGTATCGTTAGCTTCGGCCGATTCTGTGCCTCCAGTTATCCCGGGGTGAACACAAGAGTCACCTCATACCTCTATTGGATCCTTAACAACGCCCCGGCCAACTACTGCAATATTTAA

Protein sequence:

>DPOGS212233-PA
MSPNFPRSYPAGVACRWIVQCPNGYQCRARCGLGMPQTPTCSMDRLYISRTGDSELNSAEYHCGRGLVNAVSIDSRLTVGLVTSTNSTGGRFKCLISTQPITTPTCNCGYKKTNRIVGGVETRPHEYPMMAGIRFSDFGEAIKCGAVIIDRKYVLTAAHCVENKKLDELTVVVGEHNVTTDVDTPDGKVLRVRKVIIHPNYNPNTQDNDIAIIRTAGSIDYNENVGPVCLPFKYVNHAFNGSTVTILGWGTTTFGGPKPKVLIKANVEVISQERCQRNISTLTPRQICTYTPGKDSCQDDSGGPLLYTDSYNGRLYNVGVVSFGQFCASTEPAINTRVTSFLNWIRKEAPANYCMYILGALLLFILQGAHAQQPNCDYAISVTPQKSEPISSPGYPNSYPRGSACRWIAVCPSGYKCKVNCDDVYLPGGQDCPVDRLLISKTGDSQLTSAEYYCGQGSVSAISTGTRISIGLVVSNRSPGGRFRCEISAQRDSASTCNCGYRKVSRIVGGEETKPNEFPMMAGIIYLGENTIKCGAVIIDNIYVLSAAHCVITKGVNDIAVVVGEHDVRVGTDSPDIQVFRVSRIITHPNYNPNTYDNDIAIVKIQGVIRYSENVGPVCLPFSFKDVDLSGAVVTILGWGTLFPGGPSSSVLRKVNVNVISQSTCRQNVLSLTPRQICTFTRGKDACQDDSGGPLLYLDTFTGRLFNVGIVSFGQLCASNSPGINTRVTDFLDWIMFGSGAIFLLVVGYAQAQDANCDFFLNVAAGRSYPISSPNYPYSYRPGVTCRWIAQCPNGYNCRLDCSEINLPQTQNCYMDRLLVSKTGDSQLGSSEYHCGYGTLTAVSVANRISVGLVTSRSSRGGRFTCTVTAQASSTCSCGYRNSYIVGGEETRPNEYPMMAGIVYVGENTIKCGAVIIDNGYVLTAAHCVVGKNLGELAVVVGEHDVSTGADSPSLQVFRVASVIIHPQFNSDTYDNDIAIIQIYGSIVYSQKVGPVCLPFKFINDDFTGSKVTILGWGTTFPGGPTSNVLRKVDVNVVSQASCSRSYPSLSNNQMCTFAQGKDACQDDSGGPLLYQDPSNGRLYSAGIVSFGRFCASSYPGVNTRVTSYLYWILNNAPANYCNI-
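
Consistency check (illustrative):

The stated lengths in this record agree with each other if the trailing "-" in the protein sequence marks the stop codon: 3 x (1124 + 1) = 3375. The Python sketch below shows one way to run that check on FASTA text. The function names `read_fasta` and `cds_matches_protein` and the toy records are hypothetical, introduced only for illustration; they are not part of the database or of any library. The check assumes the transcript consists of coding sequence only, with no UTRs.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping record ID to sequence."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Record ID is the first whitespace-delimited token after '>'
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    # Join wrapped sequence lines into one string per record
    return {name: "".join(parts) for name, parts in records.items()}


def cds_matches_protein(cds, protein):
    """Return True if the CDS length equals 3 * (residues + 1 stop codon)."""
    # Assumption: a trailing '-' or '*' in the protein marks the stop codon
    residues = protein.rstrip("-*")
    return len(cds) == 3 * (len(residues) + 1)


# Toy records standing in for the full sequences (hypothetical IDs)
toy = """>toy-TA
ATGAGCCCG
TAA
>toy-PA
MSP-
"""

seqs = read_fasta(toy)
print(cds_matches_protein(seqs["toy-TA"], seqs["toy-PA"]))

# Lengths stated in this record: 1124 aa plus one stop codon
print(3 * (1124 + 1))
```

To apply it to this record, pass the DPOGS212233-TA and DPOGS212233-PA entries through `read_fasta` in place of the toy text.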