Monarch geneset OGS2.0

DPOGS209946
Transcript         DPOGS209946-TA   1077 bp
Protein            DPOGS209946-PA   358 aa
Genomic position   DPSCF300148 - 352218-354832
RNAseq coverage    6659x (Rank: top 2%)
Annotation
Heliconius         HMEL009690         2e-138   67.60%
Bombyx             BGIBMGA011344-TA   2e-151   77.54%
Drosophila         cathD-PA           2e-145   65.83%
EBI UniRef50       UniRef50_Q03168    2e-144   66.02%   Lysosomal aspartic protease n=44 Tax=Eumetazoa RepID=ASPP_AEDAE
NCBI RefSeq        NP_001037351.1     2e-167   77.65%   cathepsin D [Bombyx mori]
NCBI nr blastp     gi|112983576       4e-166   77.65%   cathepsin D precursor [Bombyx mori]
NCBI nr blastx     gi|112983576       5e-163   77.65%   cathepsin D precursor [Bombyx mori]
Group
Gene Ontology      GO:0006508   3.5e-204   proteolysis
                   GO:0004190   3.5e-204   aspartic-type endopeptidase activity
KEGG pathway       tca:655494   7e-156
                   K01379 (CTSD)   maps-> Lysosome
InterPro domain    [1-357]     IPR001461   3.5e-204   Peptidase A1
                   [15-357]    IPR021109   4.8e-121   Peptidase aspartic
                   [121-357]   IPR009007   4.6e-93    Peptidase aspartic, catalytic
Orthology group    MCL13285   Single-copy universal gene

Nucleotide sequence:

>DPOGS209946-TA
ATGAAGACCGTCCGGGAACACTTCCACGAAGTGGGCACGGAGCTGCACATGGCCCGGATCAAGTATGCCACCGGTGGGCCTGCTCCTGAACCTCTCTCCAACTATTTAGATGCTCAATATTACGGTCCCATTTCGATCGGGAACCCTCCCCAGACTTTCAAGGTGGTGTTCGACACCGGTTCCTCCAATCTCTGGGTTCCGTCCAAGAAATGCCACTACACTAACATCGCCTGTCTTCTCCACAACAAGTATGACAGCAGCAAGTCCAAGTCTTATCATAAGAACGGCACGGAGTTCGCCATCCACTACGGCTCCGGCTCCCTGTCAGGATTCCTCTCCGTCGATGACGTCACCCTGGGTGGAATGACGGTGAAGTCGCAGACGTTCGCGGAGGCGATGTCGGAGCCGGGGCTGGCATTCGTGGCCGCCAAGTTTGATGGCATTCTTGGCATGGCATTCGCCAGTATCGCCGTGGACGGGGTGACGCCGGTGTTTGACAACATGGTGAAGCAAGGCCTGGTGGCGCCCGTCTTCAGCTTCTACCTCAACAGGGACGCATCGGCGGCGCAGGGCGGTGAGCTGGTATTGGGAGGCTCGGACCCCGCTCACTACCGCGGTCCGCTCACCTACGTGCCGCTCTCCAAGGACACCTACTGGCAATTCCAAATGGACGGCGTTCTCGTCAACGGATCCAGCTTTTGCAAACGAGGTTGCCAGGCCATCGCGGACACGGGTACCTCCCTGATAGGCGGCCCGGTGGAGGAGGTGGCAGCCCTAAATGCCAAGATCGGCGCGACGCCGATGGCGTTCGGTCAATTCGCACTGGACTGCTCCCTGATCCCTCGCCTGCCGCCCGTCACCTTCACCATCGCCAACCAGAAGTTCACGCTGGAGGGCACCGACTATGTGCTGCGGGTGTCTCAGTTCGGTAAGACGGTGTGTCTGTCCGGCTTCATGGGGCTGGACATCCCGCCGCCGGCCGGTCCGCTGTGGATCCTGGGCGACGTGTTCATCGGCCGTTACTACACGGAGTTCGACGTCGCCAACCGACGCATCGGCTTCGCGCCCGCCCTCTAG

Protein sequence:

>DPOGS209946-PA
MKTVREHFHEVGTELHMARIKYATGGPAPEPLSNYLDAQYYGPISIGNPPQTFKVVFDTGSSNLWVPSKKCHYTNIACLLHNKYDSSKSKSYHKNGTEFAIHYGSGSLSGFLSVDDVTLGGMTVKSQTFAEAMSEPGLAFVAAKFDGILGMAFASIAVDGVTPVFDNMVKQGLVAPVFSFYLNRDASAAQGGELVLGGSDPAHYRGPLTYVPLSKDTYWQFQMDGVLVNGSSFCKRGCQAIADTGTSLIGGPVEEVAALNAKIGATPMAFGQFALDCSLIPRLPPVTFTIANQKFTLEGTDYVLRVSQFGKTVCLSGFMGLDIPPPAGPLWILGDVFIGRYYTEFDVANRRIGFAPAL-
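
The transcript and protein lengths listed above are consistent with each other: 358 residues plus one stop codon give 3 x 359 = 1077 bp. The following is a minimal sketch, not part of the original record, of how the nucleotide sequence could be checked against the protein sequence. It assumes the standard genetic code (NCBI translation table 1) and follows this record's convention of writing the stop codon as "-". The function names `translate` and `expected_cds_length` are hypothetical helpers introduced for illustration.

```python
# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1.

    Stop codons are written as `stop_symbol` ("-" matches this record).
    Raises ValueError if the length is not a multiple of 3 or if a
    character other than A, C, G, T is encountered.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        # Map each base to its position in BASES; str.index raises
        # ValueError on ambiguous bases such as N.
        b1, b2, b3 = (BASES.index(base) for base in cds[i:i + 3])
        residue = AMINO_ACIDS[16 * b1 + 4 * b2 + b3]
        protein.append(stop_symbol if residue == "*" else residue)
    return "".join(protein)


def expected_cds_length(n_residues):
    """Length in bp of a CDS encoding n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


# First five codons and last four codons of DPOGS209946-TA.
print(translate("ATGAAGACCGTCCGG"))   # MKTVR
print(translate("CCCGCCCTCTAG"))      # PAL-
print(expected_cds_length(358))       # 1077
```

Passing the full DPOGS209946-TA sequence to `translate` and comparing the result with the DPOGS209946-PA line would extend the same check to the whole record.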