Monarch geneset OGS2.0

DPOGS211806
Transcript: DPOGS211806-TA, 984 bp
Protein: DPOGS211806-PA, 327 aa
Genomic position: DPSCF300031 - 693921-695523
RNAseq coverage: 99x (Rank: top 61%)
Annotation
Heliconius: HMEL002762; 4e-39; 39.29%
Bombyx: BGIBMGA005131-TA; 1e-50; 39.42%
Drosophila: CG12163-PA; 2e-41; 34.62%
EBI UniRef50: UniRef50_UPI000234FE62; 5e-50; 39.14%; UPI000234FE62 related cluster n=1 Tax=unknown RepID=UPI000234FE62
NCBI RefSeq: XP_002734978.1; 1e-48; 35.41%; PREDICTED: cysteine proteinase inhibitor-like [Saccoglossus kowalevskii]
NCBI nr blastp: gi|228861649; 4e-51; 36.94%; cathepsin [Euproctis pseudoconspersa nucleopolyhedrovirus]
NCBI nr blastx: gi|224083362; 2e-52; 41.81%; predicted protein [Populus trichocarpa]
Group
Gene Ontology: GO:0008234; 6.5e-110; cysteine-type peptidase activity
  GO:0006508; 8.4e-79; proteolysis
KEGG pathway: nvi:100123649; 3e-47
  K01373 (CTSF) maps-> Lysosome
InterPro domain: [1-327] IPR013128; 6.5e-110; Peptidase C1A, papain
  [117-327] IPR000668; 8.4e-79; Peptidase C1A, papain C-terminal
  [37-92] IPR013201; 1.4e-11; Proteinase inhibitor I29, cathepsin propeptide
Orthology group: MCL27835; Specific divergent

Nucleotide sequence:

>DPOGS211806-TA
ATGAAGCTCTTGTTTTTCGTCTTAATATTTGTCATTGGACACGTCGCGGGTTCTGATTCGGATCTGCCACCGAAAATATTTTACGATTTAAACGGCTCAGAGGACCTTTTCCAGAAATATGTGATTGAATATGACAAACACTATAACGAAGAAGAATATTGGGCGCATTATGAGATATTCAAGGATAATTTAGAAAAAATTAATGAACTCAACAAAAATAGTAATTCAACTGTATATGACATCAATCAATTTACCGATCTCAAGTTCGAAGAAGTTGCTAATACTTACATGGGAATGAGTTTAAAAATTGACGTCACGAATGTGAAAACTTACGAGCCTAAAGGTTTTGCTCCAGCTAGTTTAGACTACCGTAGAAAAGGTTGGGTTACTCCTATAAAAGATCAAGGTCACTGTAATACGTGTTATATTTTCAGCGCAGTTGGTGCCATCGAAGGCTGGCTGGCCAGGCGCACTGGAAGACTGATTTCTTTATCAGAACAAGAAGCATTAGATTGTGATATCTATCAAGATGGTTGTAAAACAGGTGGCTTTCCGGTGAATGCCATGAATGTAATTAGTCAACAAGGAGGATGTATGACAGAAGTAGACTATCCATATGAACAAAAAAAGGGTCAATGCCGTACTAAAAGCTCAAAGATTGTAGCCAAAATTTCTGGAGGATTACAAATCGATGTAAAAAACGAAAATGATTTAAAAGACGCGTTGGCCAATCATGGCCCATTGTCCATTGGACTAATTGTTGGTGAAAACTTTCGTCATTATAAGGGAGATATATTCAGAGGCAGCTGTGAAGGAAATGGGGGGCATGCTGTGTTGCTTGTAGGTTACGACTCAGTTAATGGTGAGGAATTTTGGATCATAAAGAATTCGTGGTCCGAACGCTGGGGAGAGAGAGGCTACATGCGTATGAAGATGGGAGCCAAGCTTTGTGGTATTGGAAACTATGTTGCTGCAGCTGTATAA

Protein sequence:

>DPOGS211806-PA
MKLLFFVLIFVIGHVAGSDSDLPPKIFYDLNGSEDLFQKYVIEYDKHYNEEEYWAHYEIFKDNLEKINELNKNSNSTVYDINQFTDLKFEEVANTYMGMSLKIDVTNVKTYEPKGFAPASLDYRRKGWVTPIKDQGHCNTCYIFSAVGAIEGWLARRTGRLISLSEQEALDCDIYQDGCKTGGFPVNAMNVISQQGGCMTEVDYPYEQKKGQCRTKSSKIVAKISGGLQIDVKNENDLKDALANHGPLSIGLIVGENFRHYKGDIFRGSCEGNGGHAVLLVGYDSVNGEEFWIIKNSWSERWGERGYMRMKMGAKLCGIGNYVAAAV-
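
The record gives a 984 bp transcript and a 327 aa protein. A minimal Python sketch of how the two figures can be cross-checked follows. It assumes the standard genetic code, that the transcript is the complete coding sequence with no untranslated regions, and that the trailing "-" in the protein sequence marks the stop codon. The function names `translate` and `expected_protein_length` are illustrative and are not part of the database.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: the monarch nuclear genes use the standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
assert len(AMINO) == 64
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a CDS of cds_bp bases, excluding the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


if __name__ == "__main__":
    # First four and last three codons of DPOGS211806-TA, copied from the record.
    print(translate("ATGAAGCTCTTG"))      # MKLL
    print(translate("GCTGTATAA"))         # AV*
    print(expected_protein_length(984))   # 327
```

Under these assumptions, 984 bp corresponds to 328 codons, that is 327 residues plus one stop codon, which agrees with the listed protein length.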