Monarch geneset OGS2.0

DPOGS209137
Transcript: DPOGS209137-TA; 2541 bp
Protein: DPOGS209137-PA; 846 aa
Genomic position: DPSCF300061 - 746604-755664
RNAseq coverage: 457x (Rank: top 27%)
Annotation
Heliconius: HMEL014787; 1e-146; 52.87%
Bombyx: BGIBMGA001310-TA; 1e-128; 55.75%
Drosophila: CG13340-PA; 4e-102; 42.38%
EBI UniRef50: UniRef50_E0VYD5; 2e-114; 43.71%; Cytosol aminopeptidase, putative n=1 Tax=Pediculus humanus corporis RepID=E0VYD5_PEDHC
NCBI RefSeq: XP_002431129.1; 4e-115; 43.71%; Cytosol aminopeptidase, putative [Pediculus humanus corporis]
NCBI nr blastp: gi|242021393; 8e-114; 43.71%; Cytosol aminopeptidase, putative [Pediculus humanus corporis]
NCBI nr blastx: gi|242021393; 5e-112; 43.71%; Cytosol aminopeptidase, putative [Pediculus humanus corporis]
Group
Gene Ontology
GO:0005737; 4.5e-162; cytoplasm
GO:0008235; 4.5e-162; metalloexopeptidase activity
GO:0004177; 4.5e-162; aminopeptidase activity
GO:0019538; 4.5e-162; protein metabolic process
GO:0030145; 4.5e-162; manganese ion binding
GO:0005622; 5.5e-75; intracellular
GO:0006508; 5.5e-75; proteolysis
GO:0055114; 5.1e-06; oxidation-reduction process
GO:0016706; 5.1e-06; oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors
GO:0016491; 5.1e-06; oxidoreductase activity
KEGG pathway
dan:Dana_GF11165; 5e-104
K01255 (CARP, pepA) maps-> Glutathione metabolism
InterPro domain
[356-840] IPR011356; 4.5e-162; Peptidase M17
[519-825] IPR000819; 5.5e-75; Peptidase M17, leucyl aminopeptidase, C-terminal
[361-489] IPR008283; 4.1e-14; Peptidase M17, leucyl aminopeptidase, N-terminal
[94-212] IPR005123; 5.1e-06; Oxoglutarate/iron-dependent oxygenase
Orthology group: MCL10353; Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species
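
The lengths and coordinates in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the database. It assumes the transcript is coding sequence only (no UTRs), that the genomic coordinates are 1-based and inclusive, and that the InterPro ranges are protein coordinates. The function and variable names are hypothetical.

```python
# Values copied from the record above.
transcript_bp = 2541
protein_aa = 846
genomic_start, genomic_end = 746604, 755664

# InterPro hits as (start, end, accession); assumed to be protein coordinates.
domains = [
    (356, 840, "IPR011356"),
    (519, 825, "IPR000819"),
    (361, 489, "IPR008283"),
    (94, 212, "IPR005123"),
]


def coding_length_matches(cds_bp, aa):
    """True if cds_bp equals one codon per residue plus one stop codon."""
    return cds_bp == 3 * (aa + 1)


def span_length(start, end):
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


cds_ok = coding_length_matches(transcript_bp, protein_aa)
genomic_span = span_length(genomic_start, genomic_end)
domains_in_range = all(1 <= s <= e <= protein_aa for s, e, _ in domains)
domain_lengths = {acc: span_length(s, e) for s, e, acc in domains}

print(cds_ok)            # True: 3 * (846 + 1) == 2541
print(genomic_span)      # 9061
print(domains_in_range)  # True
print(domain_lengths["IPR011356"])  # 485
```

Under these assumptions the genomic span (9061 bp) is longer than the 2541 bp transcript, which is what a gene model with introns would look like. The record itself does not list exon boundaries.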

Nucleotide sequence:

>DPOGS209137-TA
ATGAACGATTACTCTTCTTTTATAGATAAACACAAAGTAATTAACGCTGAACCCACAGCATGCTATGTATCAGAGTTTATAACACCAGAAGAAGAAAAATATATATTAAACAACATTTACACTGCTCCCAAACCAAAATGGACGCAACTGTCCAATAGAAGATTACAAAATTGGGGTGGAATTCCACATAACAAAGGAATGATAGCCGAAGATATACCAGGTTGGCTTCAAACCTACTTGGATAAAATACATTCCTTAAATTTAATGAGGGGTAATAAGCCAAACCATGTCCTTGTTAATGAGTATCTGCCGAAGCAAGGAATCCTACCTCATTTAGATGGATTCCTATTTTATCCCACCATTACAACTATATCTGTTGGATCCCATGCAATATTAAAGTTTTTCGAAGCATCTGATAATGGTTCACTGAGTCACGTATTCTCTCTGTTGTTGGAACCGAGAAGTCTATTGGTGCTGCAAGATGAAATGTTCAAACATTATTTACATGGCATAGAAGAGGTCAATGAAGATGCTATAGATGACTCCATTGTTAATCTAAACATGTGTTCAGATCGATATACAAAAGGCACTACAGTGGCTCGTGGAACTAGAGTGTCCTTAACAATAAGACATAGATATTGTAGTGCTGTCTCCGAAGATTCACCCACGTGTGGGAAGAAATCTGATGAAGCTAGGCAGAGCAGTAATGACCAACCAGAGAACAAAAAGGGTTTGGTCCTTGGCGTATATGAAGAGGGGGAAAAGTTTGAATTGACACCAGTCGCTGAGGAAATAAACCAGAAGAGTGGCGGCAAGATATGCAAGCATCTAAACGAAATGTCATGTCACCTGAAACACGGCAAAGCATTCGTGGTGACGGATATTTTGGAGGAGTTTGGACCGGTGGCCATAGCGTCTCTCGGCAAGAAGAATCCAGGATACAATGAGCTGGAGATGTTGGATGAGACCAGGAGATATTGTAGTGCTGTCTCCGAAGATTCACCCACGTGTGGGAAGAAATCTGATGAAGCTAGGCAGAGCGGTCGCAGTAATGACCAGCCAGAGAACAAAAAGGGTTTGGTCCTTGGCGTATATGAAGAGGGGGAAAAGTTTGAATTGACACCAGTCGCTGAGGAAATAAACCAGAAGAGTGGCGGCAAGATATGCAAGCATCTAAACGAAATGTCATGTCACCTGAAACACGGCAAAGCATTCGTGGTGACGGATATTTTGGAGGAGTTTGGACCGGTGGCCATAGCGTCTCTCGGCAAGAAAAATCCAGGGTACAATGAGCTGGAGATGTTGGATGAGACCAGGGAAAATCTCCGCGTGGGTGTGGGTGTGGGGGTGCGTGAGTTGGTGAAGAGAGGTTGTGATCATGTGTACGTGGACGGAGGAACAGAGCCTGACGCCGCCGCCGAGGCCGCCCATCTAGCAGCTTGGAGGTTCGAGGAGTTCAAATCGTCTGGGGCGAAGTCCTTCCAGACAGATGTATTCCTCCAGGGGTCGGGTGAGGAGCTGTGGAAACGCGGCACGATTTTCGGTTCTGGACAAAACTGGGCCAGACACCTCACCGACATGCCTCCCAATAAGATGACGCCCGTTGACTTCGCACAGGCGGTGTTAGACATGTTATGTCCCCTGGGCGTTCACGTGACGGCCCACGACTCGGCGTGGATCGAAGCTCAGCGGATGGAGGCGCTCCTGTCGGTTTCCCGTGGTTCCTGTGAGCCGGCCGTGTTTCTGGAGTGCGAGTACCGAGCGGGCGGGGACCGGCCGCCCGTCCTGCTAGCGGCCAAGGGAATCACATTCGACAGTGGCGGTTTATGTCTGAAGAAGGCTGATGAAATGCGAGAGAACCCGGACAGCCGCGCGGGGGCCGCCGCCACAGTCGGCGCTCTCAAGATACTCGCGGAGATGAAGGTGCCCATTAACGTGGTGGCAGTGATACCGCTGTGCGAGAGTATGGTGAGCGGCAGCTGTATGAAGGTCGGGGACGT
CTTGAGAGCACTCAACGGACTCACCATGCAGGTGGAGTGCACAGCCCAAGCAGGCCGCCTCACTCTGGCAGACGCACTGGTCTACGGACAGGCCAAGCATAGACCCTCGCTAGTCGTAGACCTGGCGTCACTAACAAGAGGAGTGCAGCTAGCTACGGGCAGCGCGGCTTTCGGCGTGTTCAGTTCCAGCGGCGAGGCGTGGGCGGCGCTCGCACAGTCCGCGGCACGAGCTGGGGACAGAGGCTGGAGGCTGCCTCTCTGGAGCTATTACCGCGCTATGATCGATGATGACCCCTCTGTGGATCTGAGGAACAGGGGTCCAGGAACGGCTGCACCATGCGTGGGAGCCGCGTTTCTCAAGAACTTCGTGTGTGCACCGTGGCTTCACCTGGACGTGTCGGGCGTGTCCCGGGGCGGCACTCCCTACCTGCCCGCGCCCCGGGCCGCCGGTCGGCCTGCGAGGACACTCGCAGAATTCCTCACCGCCGCCGGCACAGCAAGTGCAAATGTCAAGGACTCCGACTCACCAGCTACATCTTAA

Protein sequence:

>DPOGS209137-PA
MNDYSSFIDKHKVINAEPTACYVSEFITPEEEKYILNNIYTAPKPKWTQLSNRRLQNWGGIPHNKGMIAEDIPGWLQTYLDKIHSLNLMRGNKPNHVLVNEYLPKQGILPHLDGFLFYPTITTISVGSHAILKFFEASDNGSLSHVFSLLLEPRSLLVLQDEMFKHYLHGIEEVNEDAIDDSIVNLNMCSDRYTKGTTVARGTRVSLTIRHRYCSAVSEDSPTCGKKSDEARQSSNDQPENKKGLVLGVYEEGEKFELTPVAEEINQKSGGKICKHLNEMSCHLKHGKAFVVTDILEEFGPVAIASLGKKNPGYNELEMLDETRRYCSAVSEDSPTCGKKSDEARQSGRSNDQPENKKGLVLGVYEEGEKFELTPVAEEINQKSGGKICKHLNEMSCHLKHGKAFVVTDILEEFGPVAIASLGKKNPGYNELEMLDETRENLRVGVGVGVRELVKRGCDHVYVDGGTEPDAAAEAAHLAAWRFEEFKSSGAKSFQTDVFLQGSGEELWKRGTIFGSGQNWARHLTDMPPNKMTPVDFAQAVLDMLCPLGVHVTAHDSAWIEAQRMEALLSVSRGSCEPAVFLECEYRAGGDRPPVLLAAKGITFDSGGLCLKKADEMRENPDSRAGAAATVGALKILAEMKVPINVVAVIPLCESMVSGSCMKVGDVLRALNGLTMQVECTAQAGRLTLADALVYGQAKHRPSLVVDLASLTRGVQLATGSAAFGVFSSSGEAWAALAQSAARAGDRGWRLPLWSYYRAMIDDDPSVDLRNRGPGTAAPCVGAAFLKNFVCAPWLHLDVSGVSRGGTPYLPAPRAAGRPARTLAEFLTAAGTASANVKDSDSPATS-
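
The protein sequence above should correspond to a translation of the nucleotide sequence, with the trailing `-` marking the stop codon. The following minimal sketch shows one way to check this. It assumes the standard genetic code and translation from the first base in frame. `translate` is a hypothetical helper, not a tool provided by the database, and only the first and last few codons of DPOGS209137-TA are used here to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1; stop codons become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six and last three codons of the transcript listed above.
first_codons = "ATGAACGATTACTCTTCT"
last_codons = "ACATCTTAA"

print(translate(first_codons))  # MNDYSS
print(translate(last_codons))   # TS-
```

To check the whole record, concatenate the nucleotide sequence lines (dropping the `>` header and line breaks), pass the result to `translate`, and compare it with the protein sequence.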