Monarch geneset OGS2.0

DPOGS200202
Transcript: DPOGS200202-TA (756 bp)
Protein: DPOGS200202-PA (251 aa)
Genomic position: DPSCF300093 + 133840-135014
RNAseq coverage: 115x (Rank: top 58%)
Annotation
Heliconius       HMEL003367        5e-51  43.51%
Bombyx           BGIBMGA005131-TA  2e-42  40.57%
Drosophila       CG12163-PA        1e-42  38.89%
EBI UniRef50     UniRef50_E3WZZ7   1e-46  43.93%  Putative uncharacterized protein n=1 Tax=Anopheles darlingi RepID=E3WZZ7_ANODA
NCBI RefSeq      XP_312034.4       4e-46  44.14%  AGAP002879-PA [Anopheles gambiae str. PEST]
NCBI nr blastp   gi|312378084      4e-46  43.93%  hypothetical protein AND_10451 [Anopheles darlingi]
NCBI nr blastx   gi|347968731      2e-45  44.50%  AGAP002879-PB [Anopheles gambiae str. PEST]
Group
Gene Ontology    GO:0008234  5.6e-73  cysteine-type peptidase activity
                 GO:0006508  1.3e-66  proteolysis
KEGG pathway     aga:AgaP_AGAP002879  1e-45
                 K01373 (CTSF) maps-> Lysosome
InterPro domain  [27-250]  IPR013128  5.6e-73  Peptidase C1A, papain
                 [30-249]  IPR000668  1.3e-66  Peptidase C1A, papain C-terminal
Orthology group  MCL34344  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200202-TA
ATGTCACTTCCCATTTTTACATATTTGTTTTGTATAAACCTTGTCTTTGGTCTTTCTCATTGTCTGAATAGTTTTGGTGATGGTCCGCTTCCTAGGTACTTTGACTGGAGAGATCATGGTGCGGTGTCTCCCGTGAAGAATCAAGGAGGTTGTGAGGCCTGCTGGGCCTTCAGTGCTGTGGCTTGTATAGAAAGTCATTTAAAAATACACTTGTCCTCCGAAGAAATACTATCGGAGCAATTTCTCATTGATTGTGCTCCCGGTAATATCGGCTGTAATAGCACTAGCGTCTTAAAGACTTTCGGGACTATAGTAAATGATATCGGAGGAGTATTGCGTGACTTAGATTATAAACCATACGAGGCCAAACAGAAAAAGTGCTCCTGGGATCCTTTAAAAAGGCCAATTCCCGTTGTTGGTTACCGAAGAGTCAAACCAGACGAACAAATTATGGCATTATATGTTGTGAATGTCGGACCTCTCTCGGCTGCAATAAACTCGGCGTCTATGGCTAAATACAATGGTGGTATAGACGAACCTACAGATAAATTGTGTTCACCACGACAAACAAACCACGCTGTTCTCATTGTTGGCTTTAGTTTTTACGAGGACCCACAAAGTAAAACCTACGTTCCGTATTGGATAATCAAGAATTCGTGGGGGACGTCCTGGGGTGATAATGGCTACTATTATCTTGTGCGTGGTCGCAATGCTTGCGGGATCGCCACTGATGTATCTTACCCTTACGTCATGTGA

Protein sequence:

>DPOGS200202-PA
MSLPIFTYLFCINLVFGLSHCLNSFGDGPLPRYFDWRDHGAVSPVKNQGGCEACWAFSAVACIESHLKIHLSSEEILSEQFLIDCAPGNIGCNSTSVLKTFGTIVNDIGGVLRDLDYKPYEAKQKKCSWDPLKRPIPVVGYRRVKPDEQIMALYVVNVGPLSAAINSASMAKYNGGIDEPTDKLCSPRQTNHAVLIVGFSFYEDPQSKTYVPYWIIKNSWGTSWGDNGYYYLVRGRNACGIATDVSYPYVM-
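
The lengths in this record are consistent: 756 bp is 252 codons, and dropping the terminal TGA stop codon leaves the 251 residues listed for DPOGS200202-PA. A minimal sketch of that check follows, assuming the transcript is a complete coding sequence. The function names `parse_fasta` and `cds_summary` are hypothetical helpers written for illustration, not part of any database tooling, and the toy record is not the monarch transcript.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    # Join wrapped sequence lines and normalise to upper case.
    return lines[0][1:], "".join(lines[1:]).upper()


def cds_summary(seq):
    """Summarise basic coding-sequence properties of a nucleotide string."""
    n = len(seq)
    has_stop = n >= 3 and seq[-3:] in STOP_CODONS
    return {
        "length_bp": n,
        "in_frame": n % 3 == 0,
        "starts_with_atg": seq.startswith("ATG"),
        "ends_with_stop": has_stop,
        # Codon count, minus one if the last codon is a stop.
        "expected_aa": n // 3 - (1 if has_stop else 0),
    }


# Toy record for illustration only (assumption: not real data).
header, seq = parse_fasta(">toy-TA\nATGTCACTTTGA\n")
print(header, cds_summary(seq))
```

Pasting the `>DPOGS200202-TA` record into `parse_fasta` in place of the toy string applies the same check to the transcript above.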