Monarch geneset OGS2.0

DPOGS213646
Transcript: DPOGS213646-TA, 1119 bp
Protein: DPOGS213646-PA, 372 aa
Genomic position: DPSCF300165 - 4125-7050
RNAseq coverage: 2x (Rank: top 92%)
Annotation
Heliconius       HMEL004595         2e-78    68.06%
Bombyx           BGIBMGA004622-TA   1e-92    66.67%
Drosophila       CG5367-PA          1e-71    39.25%
EBI UniRef50     UniRef50_A0FDR3    7e-133   60.54%   Cathepsin L-like proteinase n=3 Tax=Endopterygota RepID=A0FDR3_BOMMO
NCBI RefSeq      NP_001091806.1     1e-133   60.54%   cathepsin L-like proteinase [Bombyx mori]
NCBI nr blastp   gi|148298724       2e-132   60.54%   cathepsin L-like proteinase precursor [Bombyx mori]
NCBI nr blastx   gi|148298724       1e-133   60.86%   cathepsin L-like proteinase precursor [Bombyx mori]
Group
Gene Ontology     GO:0008234   1.5e-89   cysteine-type peptidase activity
                  GO:0006508   4.4e-63   proteolysis
KEGG pathway      mmu:13038    8e-57
                  K01371 (CTSK) maps-> Toll-like receptor signaling pathway; Lysosome
InterPro domain   [46-370]    IPR013128   1.5e-89   Peptidase C1A, papain
                  [144-371]   IPR000668   4.4e-63   Peptidase C1A, papain C-terminal
                  [51-111]    IPR013201   5.4e-11   Proteinase inhibitor I29, cathepsin propeptide
Orthology group   MCL17345    Insect specific

Nucleotide sequence:

>DPOGS213646-TA
ATGACTTCTTTACCGCTGTTTAAAAAAACTGCCAGAACTAGAGATTATCTGTCGCCGTTCAATGAAAGTGGAGTGCAAAGAAGTCGAGAGAAATGCAAATATAGAAAGGATTATATTGTTCGAGTTCCTGAAAATTTGCTGGACGTACACTGGGAGGAGTATAAAACGAAATACAATAAAAAGTATCAGAGCAGGTTTCACGAGAGGTCAGCGTTATCGACTTGGAGAAGCAATATGAAGAGGGTGGCGGGTCACAACCAGGAGTACCTCGCCGGCAAACAGGCGTACACACTGCACCTGAACCACTTCGGGGACTGGTCCATATTTAGTTACATCAAGCAGTTGTTGAAATTGATCAGAACCTTACCCCTGTTCGACCCGGCAGAGGACCGCCGCAGGACGACCTACCGGAACACCTTTGACACCCGACTGCCAGAGAGGGTGGACTGGCGGGAGAAGGGTTTCCGTCCTCGTTTGGAGGAGCAGTTCCACTGTGGAGCGTGCTACGCCTTCGCCATCACACACGCTGTTCAGGCTCAAGTCTACAAGCGACACGGAGACTGGAGGGAGTTGAGCCCGCAGCAGATAGTGGATTGTAGTTTTAAGGATGGCAACTTCGGCTGCGATGGAGGTTCCTTACAAGCGGCTCTGAGGTACGTGGCGAGGGATGGTCTCATCAGGGAGACATATTACCCTTACATCGGTCAAAGAGGCGCTTGTCATTACAACAGTGAGTCCGTGTCGGCCCGCGTCCGGCGCTGGTCCTCCCTCCCTCCGGGAGACGAAGCAGCCATGGAGCGAGCACTGGCCACGCTCGGACCACTAGCCGTCGCCGTCAATGCAGCGCCGTTCACCTTCCAGTTATACAGGTACCGGGGGGGGACTGAGCTCCGATTAAAACTCATACTCATTCCAGATCAAAGAAGTGGCATATACGATGATCCATTCTGCGTCTCCTGGCACCTCAACCACGCCATGTTGCTAGTGGGATACACGCCAGAGTATTGGATACTACTTAACTGGTGGGGAGAGCAGTGGGGGGAAAACGGATACATGAGGATAAAAAGAGGACTGAATATTTGCGGGGTGGCAAATATGGCGACTTATGTGGAACTATAA

Protein sequence:

>DPOGS213646-PA
MTSLPLFKKTARTRDYLSPFNESGVQRSREKCKYRKDYIVRVPENLLDVHWEEYKTKYNKKYQSRFHERSALSTWRSNMKRVAGHNQEYLAGKQAYTLHLNHFGDWSIFSYIKQLLKLIRTLPLFDPAEDRRRTTYRNTFDTRLPERVDWREKGFRPRLEEQFHCGACYAFAITHAVQAQVYKRHGDWRELSPQQIVDCSFKDGNFGCDGGSLQAALRYVARDGLIRETYYPYIGQRGACHYNSESVSARVRRWSSLPPGDEAAMERALATLGPLAVAVNAAPFTFQLYRYRGGTELRLKLILIPDQRSGIYDDPFCVSWHLNHAMLLVGYTPEYWILLNWWGEQWGENGYMRIKRGLNICGVANMATYVEL-