Monarch geneset OGS2.0

DPOGS214724
Transcript: DPOGS214724-TA, 2679 bp
Protein: DPOGS214724-PA, 892 aa
Genomic position: DPSCF300022 + 649-11573
RNAseq coverage: 1113x (Rank: top 11%)
Annotation
Heliconius: HMEL002762, E-value 0.0, 61.51%
Bombyx: BGIBMGA005131-TA, E-value 0.0, 52.99%
Drosophila: CG12163-PA, E-value 6e-77, 66.49%
EBI UniRef50: UniRef50_D6WPZ3, E-value 2e-105, 30.58%, Cystatin n=19 Tax=Eukaryota RepID=D6WPZ3_TRICA
NCBI RefSeq: XP_973607.2, E-value 3e-106, 30.58%, PREDICTED: similar to cathepsin F-like cysteine protease [Tribolium castaneum]
NCBI nr blastp: gi|260234113, E-value 0.0, 56.00%, cysteine proteinase inhibitor precursor [Manduca sexta]
NCBI nr blastx: gi|260234113, E-value 0.0, 56.56%, cysteine proteinase inhibitor precursor [Manduca sexta]
Group
Gene Ontology:
  GO:0008234, E-value 1.1e-159, cysteine-type peptidase activity
  GO:0006508, E-value 8e-99, proteolysis
  GO:0004869, E-value 6.7e-09, cysteine-type endopeptidase inhibitor activity
KEGG pathway: tca:662417, E-value 8e-106
  K01373 (CTSF), maps to Lysosome
InterPro domain:
  [492-892] IPR013128, E-value 1.1e-159, Peptidase C1A, papain
  [613-891] IPR000668, E-value 8e-99, Peptidase C1A, papain C-terminal
  [526-584] IPR013201, E-value 1.4e-10, Proteinase inhibitor I29, cathepsin propeptide
  [418-495] IPR000010, E-value 6.7e-09, Proteinase inhibitor I25, cystatin
Orthology group: MCL12457, Single-copy universal gene
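
The figures in this record can be cross-checked against each other. The following is a minimal Python sketch, not part of the database entry: the constant and function names are illustrative, and it assumes that the listed transcript is coding sequence only (no UTRs) and that the InterPro intervals are 1-based, inclusive protein coordinates.

```python
# Values copied from the record above; all names here are illustrative.
TRANSCRIPT_BP = 2679
PROTEIN_AA = 892

# InterPro hits as (start, end, accession).
# Assumption: coordinates are 1-based and inclusive on the protein.
DOMAINS = [
    (492, 892, "IPR013128"),  # Peptidase C1A, papain
    (613, 891, "IPR000668"),  # Peptidase C1A, papain C-terminal
    (526, 584, "IPR013201"),  # Proteinase inhibitor I29, cathepsin propeptide
    (418, 495, "IPR000010"),  # Proteinase inhibitor I25, cystatin
]


def cds_length_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly one codon per residue plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


def domain_lengths(domains):
    """Map each accession to its span in residues (inclusive coordinates)."""
    return {acc: end - start + 1 for start, end, acc in domains}


def domains_within_protein(domains, protein_aa):
    """True if every interval lies inside 1..protein_aa."""
    return all(1 <= start <= end <= protein_aa for start, end, _ in domains)


if __name__ == "__main__":
    print(cds_length_consistent(TRANSCRIPT_BP, PROTEIN_AA))
    print(domain_lengths(DOMAINS))
    print(domains_within_protein(DOMAINS, PROTEIN_AA))
```

Under these assumptions the 2679 bp transcript corresponds to 892 codons plus a stop codon, and all four domain intervals fall inside the 892-residue protein.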
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214724-TA
ATGCATATCGATGGTTGTCAGACACAATCATCCGATAAAATATTACATTGTAATGCGCAAGTGTGGGTTCAACCATGGATTCAATCATCTAAAGAAATTGAAATAAAGTGTGAAAAAAATAGTGAGGAAAGTAAAGGGGACTTAGAGAATGATAAGTTTTCATCTGATCGAATAAAGAGGCAAATTACGCATGATGAAGATGATATTGATGAAGACACAAAGTATTATTACGCTGATCGAGCTGTACATTACATAAATGAAAAAGAATCGACAAACAATTTGAATAAACTCATTACTATCCACGCGTTTGAAAGCAGTACTAATATGGGAGTCAACATGATAAAAATGTACATCGAAATAGGTTTAACTTATTGCTTAAGACATGAAGATGAAGCGGAACTACAGAACTGTGAGGAATTGTCTGGTATCTATCACAAACTTTGTTATGTTCGGTTGTGGCCATCACCCGATGATGAACTAGTAGTTCAAAGCTTGGCTGTGGTCTGCGACGACGAACGGGACTTCAAGAGCGTCACAGGCCTATCGATAACGAATCTTATCAAAGAAGCTGTTAAGGAATTAGAATCTTCGCCTAAAATTAAAAACAAGTTGGTTCACCTCGGTGAACCACACGTAGTACCTAGCCTGGATTCCCGTAAGCCCACGCAATTAAGTTTTATAGTGCGAGCTACAAATTGTTCCAAGTACGTAGATATTGAAAAAGACCGTTTCCAATGTTACATTGATAATTCTAGACTTCCTAAACCTTGCACATCAAGTATCTGGATGGCAGCCAACACAAAAAAAATAAGAAAAGTCACAACACGCTGCAGTAGATCATTACCGAATCGTAGCAGAAGATCGCTTTCGTTTGATACGACAAACACAACTTCCGACGAAAAACTTATCCAAGGTATGGTAAGGGAATCATTGGACAAGCTAGAAATGTCGTCGCTGTTAAACTATAAACAGAAGTTGCTACAGATTAACAGTTTTATGACTAATATAACCAGAGGAAGACTAACAACTATAGACTTTGATGTAGCTTATACGACTTGCTTAAAATACGAATGGGTCGATAATATGACTGCTTGTGAGATAATAGAGCACCTGCCCAGAAGACATTGCATATCACAGGTGAGGGAGCGGCTGTGGATACAAAATGGCAGAGAAATAACAGTGAACTGCGACGACGACGAAACGCCGCTAGAATCTCATATAGAGTATGAGACCGCTGATAACGGAATGGCTTTGGCTAACGAGGCTTTGAAGCACATCGAAGCTAAATATCCTCATCCAAATAAACAGAAAATTGTAAGAGTGTTTTCGTTGGAAAAACAGCAGGTTGCTGGGTTGCATTTTAGATTGAAATTAGAGGTAGGCATTACAGACTGTTTAGCTTTGAGTGCCAAGAAGGACTGTAAGCTAACAAAAAACATGTCAACAAATAAGTTCTGTCGAGTAAATATTTGGTTGCGTCCATGGTCTGAACATCCACCGCTTTATAGGGTGATATGTGACTATCAGGATGAAGCGTCACACGAGTTTTTCTTCGAAGTTCAAGCTGAACGTCTCTTTTCCGACTTCCTAACTACCTACATGCCGGATTACATCGATAATAAATCAGAAATGGTCAAAAGATACAACATATTTAAGGACAACGTTAAAAGAATACACGAATTAAATATCCACGAGCGTGGAACAGCAACTTACGGAGTTACTAGGTATTCGGATCTGACTTATGACGAGTTCGTATCAAAATATATGGGCCTTAAGACACATATGAGAAATGAGAATCTGATTCCGATGAGACAAGCGGACATCCCAGAGGTGGCTCTTCCTGAAAACTTTGATTGGCGCGAATATAATGCTGTCACTGAGGTCAAAGATCAAGGTTCCTGTGGAAGCTGCTGGGCGTTCAGTGTTACCGGTAATATAGAAGGTCAATATAAGATCCAGAACGACGAGCTGGTCTCTCTGTCGGAGCAAGAATTGGTAGA
CTGTGACAAACTGGACGACGGCTGCAACGGAGGCCTCCCAGACAACGCCTACAGGTACATTCTCAGTGCTAGCTATTTCTCGCAGCATCAGACTTGTGTATATGCAGGTGTATGTAATATAGAAGGTCAGTATAAGATCCAGAACGACGAGCTGGTCTCTCTGTCGGAGCAAGAATTGGTAGACTGTGACAAACTGGACGACGGCTGCAACGGAGGCCTCCCAGACAACGCCTACAGGGCTATAGAGCAGCTGGGCGGCTTAGAACTGGAATCTGATTACCCTTACGAGGGCGAGAACGACAAGTGCGTGTTCAACAAAACGATGTCCAAGGTCCAGATCAGCGGCGCCGTTAATATATCGTCCAACGAAACAGATATGGCTAAATGGCTCACACAGAACGGACCCATCTCTATTGGTATAAACGCTAATGCGATGCAGTTCTACATGGGGGGTATCTCACACCCGTGGAAGGTCCTCTGTAACCCCACCAACTTGGACCACGGCGTACTTATAGTGGGCTACGGAGTTAAGAACTACCCTCTCTTCCACAAGCGTCTGCCCTACTGGATTGTGAAGAATTCGTGGGGAAAGTCGTGGGGCGAGCAGGGCTACTACCGGGTGTACCGAGGCGACGGCACTTGCGGCGTCAATCAAATGGCCAGCTCCGCCGTCATATAG
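
Because the nucleotide record above wraps across more than one line, a reader has to join the sequence lines before using it. The sketch below shows one way to do that in Python and to run a basic sanity check on the result. The function names and the toy record are illustrative assumptions, not part of the database, and the check only tests length, start codon, and stop codon, not internal stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def looks_like_complete_cds(seq):
    """True if seq is a multiple of 3 long, starts with ATG, and ends with a stop codon."""
    return len(seq) % 3 == 0 and seq.startswith("ATG") and seq[-3:] in STOP_CODONS


if __name__ == "__main__":
    # Toy record (hypothetical), wrapped over two lines like the record above.
    toy = ">toy-TA\nATGCATATC\nGATTAG\n"
    records = read_fasta(toy)
    print(records)
    print(looks_like_complete_cds(records["toy-TA"]))
```

To apply it to DPOGS214724-TA, pass the full FASTA text of the record to `read_fasta` and compare the joined length with the 2679 bp given in the table.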

Protein sequence:

>DPOGS214724-PA
MHIDGCQTQSSDKILHCNAQVWVQPWIQSSKEIEIKCEKNSEESKGDLENDKFSSDRIKRQITHDEDDIDEDTKYYYADRAVHYINEKESTNNLNKLITIHAFESSTNMGVNMIKMYIEIGLTYCLRHEDEAELQNCEELSGIYHKLCYVRLWPSPDDELVVQSLAVVCDDERDFKSVTGLSITNLIKEAVKELESSPKIKNKLVHLGEPHVVPSLDSRKPTQLSFIVRATNCSKYVDIEKDRFQCYIDNSRLPKPCTSSIWMAANTKKIRKVTTRCSRSLPNRSRRSLSFDTTNTTSDEKLIQGMVRESLDKLEMSSLLNYKQKLLQINSFMTNITRGRLTTIDFDVAYTTCLKYEWVDNMTACEIIEHLPRRHCISQVRERLWIQNGREITVNCDDDETPLESHIEYETADNGMALANEALKHIEAKYPHPNKQKIVRVFSLEKQQVAGLHFRLKLEVGITDCLALSAKKDCKLTKNMSTNKFCRVNIWLRPWSEHPPLYRVICDYQDEASHEFFFEVQAERLFSDFLTTYMPDYIDNKSEMVKRYNIFKDNVKRIHELNIHERGTATYGVTRYSDLTYDEFVSKYMGLKTHMRNENLIPMRQADIPEVALPENFDWREYNAVTEVKDQGSCGSCWAFSVTGNIEGQYKIQNDELVSLSEQELVDCDKLDDGCNGGLPDNAYRYILSASYFSQHQTCVYAGVCNIEGQYKIQNDELVSLSEQELVDCDKLDDGCNGGLPDNAYRAIEQLGGLELESDYPYEGENDKCVFNKTMSKVQISGAVNISSNETDMAKWLTQNGPISIGINANAMQFYMGGISHPWKVLCNPTNLDHGVLIVGYGVKNYPLFHKRLPYWIVKNSWGKSWGEQGYYRVYRGDGTCGVNQMASSAVI-
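
The protein record ends with a `-` character, which appears to mark the translated stop codon, and the InterPro intervals listed earlier refer to residue positions in this sequence. The following Python sketch shows how the stop marker could be removed and a domain extracted by coordinates. The helper names are hypothetical, and 1-based inclusive coordinates are an assumption rather than something the record states.

```python
def strip_stop(protein):
    """Drop trailing stop markers ('-' in this record, '*' in some other tools)."""
    return protein.rstrip("-*")


def domain_slice(protein, start, end):
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(f"interval {start}-{end} is outside 1..{len(protein)}")
    return protein[start - 1:end]


if __name__ == "__main__":
    # First ten residues of the record plus the stop marker, used as a toy input.
    toy = "MHIDGCQTQS-"
    mature = strip_stop(toy)
    print(len(mature))
    print(domain_slice(mature, 3, 6))
```

With the full sequence, `domain_slice(strip_stop(seq), 418, 495)` would return the region annotated as the cystatin domain, provided the coordinate assumption holds.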