Monarch geneset OGS2.0

DPOGS209266
Transcript         DPOGS209266-TA   1857 bp
Protein            DPOGS209266-PA   618 aa
Genomic position   DPSCF300111 + 596896-600105
RNAseq coverage    815x (Rank: top 16%)
Annotation
Heliconius       HMEL008266         9e-126   73.65%
Bombyx           BGIBMGA007046-TA   6e-108   54.15%
Drosophila       MP1-PC             2e-40    40.00%
EBI UniRef50     UniRef50_A1IIA5    8e-39    36.30%   Prophenoloxidase-activating proteinase n=3 Tax=Obtectomera RepID=A1IIA5_SAMCR
NCBI RefSeq      XP_002102076.1     1e-40    40.51%   GD19715 [Drosophila simulans]
NCBI nr blastp   gi|73913564        8e-41    38.16%   prophenoloxidase activating proteinase-2 [Manduca sexta]
NCBI nr blastx   gi|73913564        7e-41    38.16%   prophenoloxidase activating proteinase-2 [Manduca sexta]
Group
Gene Ontology     GO:0003824            2.7e-70   catalytic activity
                  GO:0004252            8.3e-60   serine-type endopeptidase activity
                  GO:0006508            8.3e-60   proteolysis
KEGG pathway
InterPro domain   [360-617] IPR009003   2.7e-70   Peptidase cysteine/serine, trypsin-like
                  [364-612] IPR001254   8.3e-60   Peptidase S1/S6, chymotrypsin/Hap
                  [398-413] IPR001314   4e-13     Peptidase S1A, chymotrypsin-type
Orthology group   MCL25309              Lepidoptera specific

Nucleotide sequence:

>DPOGS209266-TA
ATGAGGAACCAAAGTCTGTGTATCACCTTCATTCTTTTTTGGATGAACATTGTGTTTATTGGCGCTGTACGAAAAGATATTTGTGACAAATGTGTCCGTATTTCTGATTGCCCAGCCTTTGCGAAAATGAATTCCCGACAACAACAGGCATGGCTTCAACAATTTCCTTGTAAAGGCCCAAGTGATAGCGAAAGACCCTCGATTTTTGGATTCTCACCCGTTGCGAAAGGAGACTATGTATGTTGTCCAAATTCTAATATTTGGGGAATAGATAATGGGTACCAAAACCAACATCAGAAGCCTGTTCCAATAAGACCTAATGAAAGGGGACATCGTAACTATCAAAGTCCTGATTTTAATAATCCAGGTGCATTTACTGGACAGCAAAACAATTTACCAGGAAATCCTGATTTTGAAAATGGTATGGGAGTTGATACAAAAATACCCCATCCATTTGGAAAACATCCAGGAACTTTTGGTGGAGACTTTAATGGAAACCCACAAAATGGACAAAATAATCAAGGCATATTTGACAATATGCCAAACTTAAATCTTCCGCAATTCCCTAACAATGGAAATAACTTTGGTGGCCAACCGCAAAACGGGCAATATCCAAATGGTCAATTTCCAAATAATAATCAGTTTCTCAATAGTAATTTCCCAAATAGTCAATTTCCAAATGGTCAATATCCAAGTTATCAATTTCCAAGCAGTCAATCTCCAAACAGCCATTTTCCGAATAATCAGTTCCCAAATGATCAATTTCCTAGCTTTTCTGGACAAACACAATATCCCAATAGTGGGGAACAAAATAGCGGGATTTTCAATCAAGGTGGATACCAACAATGTCCGTCTCATACAAACATGATTCCAGATCCTTCTGCAGGTTGTTGCGGAAAAGATGACTCTGATTCTGTAAGAATAACAGATTTACAAAAAGTCCTTAGCATGTATGCACCTGATAACTCCAATAGATATCCAAGGCCTAACTATTCTCCACGCCAAAAACCGCAAAGATATCCATACTATCAAAATAGGCAAAAACGATCCTTTGATCAAAATAACACATCGGATGATAGTCTAGAAGATAGAATAGCAGGTGGAAAAGAAACCGAATTAGATCAGTTTCCATGGACTGCTCTATTGAAGGTAACCTTCGATTATGGTAACAGAGAAGCTGCTTTTAGTTGCGGTGGTTCTCTGATAAGCCAACGATTTATCCTAACTGCTGGTCACTGCGTTTATGAATCTGGAGCAAAAGTATCAAGCGTTGAAATTACACTAGCTGAGTATGACAAAAGAACCTTTCCCAAAGACTGCATATCGGAAATGGGCGGAAGGCGAGAATGTATTGAAAATATAAGAATGTATTCGGAAAATATTATACATCATCCTGAATATGATGACGATCAGCTACATAATGATATTGCACTTATAAAAATTCGTGGATATGCTCCCTATACGCGTTTTATAAGGCCTATCTGCCTTCCGCCGTTAAATATCGATGACCCTGATTTATCAAACCTTCCCCTCTCTGTGGCGGGATGGGGTCGCAACGGTGCTTATGAAACTAATATCAAACAATCGACTGTAGTTCATTTGGTGCCCCATGACAAATGTTTGAAGTCATATCCTCAATTGACGTCTTCTCACCTATGTGCAGCCGGTCGCACCGGTGAAGATACTTGTAAAGGCGACTCAGGAGGTCCTTTAATGATGTTATATCGAGGAAATTATTATATTATTGGTGTTGTTAGTGGCAAAAGAGCTGACAGTCCATGTGGAACGTCAGTACCTTCACTTTACACGAATGTCTATCAATATGTACCTTGGATAACAAGTAGTTTAAGAAATTGA

Protein sequence:

>DPOGS209266-PA
MRNQSLCITFILFWMNIVFIGAVRKDICDKCVRISDCPAFAKMNSRQQQAWLQQFPCKGPSDSERPSIFGFSPVAKGDYVCCPNSNIWGIDNGYQNQHQKPVPIRPNERGHRNYQSPDFNNPGAFTGQQNNLPGNPDFENGMGVDTKIPHPFGKHPGTFGGDFNGNPQNGQNNQGIFDNMPNLNLPQFPNNGNNFGGQPQNGQYPNGQFPNNNQFLNSNFPNSQFPNGQYPSYQFPSSQSPNSHFPNNQFPNDQFPSFSGQTQYPNSGEQNSGIFNQGGYQQCPSHTNMIPDPSAGCCGKDDSDSVRITDLQKVLSMYAPDNSNRYPRPNYSPRQKPQRYPYYQNRQKRSFDQNNTSDDSLEDRIAGGKETELDQFPWTALLKVTFDYGNREAAFSCGGSLISQRFILTAGHCVYESGAKVSSVEITLAEYDKRTFPKDCISEMGGRRECIENIRMYSENIIHHPEYDDDQLHNDIALIKIRGYAPYTRFIRPICLPPLNIDDPDLSNLPLSVAGWGRNGAYETNIKQSTVVHLVPHDKCLKSYPQLTSSHLCAAGRTGEDTCKGDSGGPLMMLYRGNYYIIGVVSGKRADSPCGTSVPSLYTNVYQYVPWITSSLRN-
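
The lengths reported in this record (a 1857 bp transcript and a 618 aa protein) can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function name `cds_matches_protein` is hypothetical, and it assumes the transcript is entirely coding sequence, running from the start codon through a single stop codon. That assumption matches the sequence shown here, which begins with ATG and ends with TGA.

```python
def cds_matches_protein(transcript_bp: int, protein_aa: int, has_stop: bool = True) -> bool:
    """Check that a coding sequence length is consistent with a protein length.

    Assumes the transcript contains only coding sequence (no UTRs).
    If has_stop is True, one extra codon is expected for the stop codon.
    """
    codons, remainder = divmod(transcript_bp, 3)
    if remainder != 0:
        # A complete coding sequence must be a whole number of codons.
        return False
    expected_codons = protein_aa + (1 if has_stop else 0)
    return codons == expected_codons


# Lengths taken from the record for DPOGS209266-TA / DPOGS209266-PA.
print(cds_matches_protein(1857, 618))
```

For this record, 1857 bp is 619 codons: 618 amino acids plus one stop codon, which the protein sequence marks with the trailing "-".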