Monarch geneset OGS2.0

DPOGS210662
Transcript: DPOGS210662-TA; 3381 bp
Protein: DPOGS210662-PA; 1126 aa
Genomic position: DPSCF300401 + 179470-191229
RNAseq coverage: 158x (Rank: top 52%)
Annotation
Heliconius: HMEL010784; 0.0; 60.46%
Bombyx: BGIBMGA001803-TA; 0.0; 70.00%
Drosophila: Dip-B-PB; 5e-107; 45.11%
EBI UniRef50: UniRef50_E2C718; 1e-109; 42.62%; Putative aminopeptidase W07G4.4 n=2 Tax=Formicidae RepID=E2C718_HARSA
NCBI RefSeq: XP_969358.1; 2e-113; 44.25%; PREDICTED: similar to Sb:cb283 protein [Tribolium castaneum]
NCBI nr blastp: gi|91091270; 4e-112; 44.25%; PREDICTED: similar to Sb:cb283 protein [Tribolium castaneum]
NCBI nr blastx: gi|91091270; 2e-108; 44.33%; PREDICTED: similar to Sb:cb283 protein [Tribolium castaneum]
Group
Gene Ontology: GO:0005622; 4.5e-50; intracellular
GO:0004177; 4.5e-50; aminopeptidase activity
GO:0006508; 4.5e-50; proteolysis
GO:0005737; 4.5e-28; cytoplasm
GO:0008235; 4.5e-28; metalloexopeptidase activity
GO:0019538; 4.5e-28; protein metabolic process
GO:0030145; 4.5e-28; manganese ion binding
GO:0004222; 1.3e-06; metalloendopeptidase activity
KEGG pathway: tca:657830; 6e-113
K01255 (CARP, pepA); maps-> Glutathione metabolism
InterPro domain: [179-486]; IPR000819; 4.5e-50; Peptidase M17, leucyl aminopeptidase, C-terminal
[262-279]; IPR011356; 4.5e-28; Peptidase M17
[518-569]; IPR024079; 4.7e-08; Metallopeptidase, catalytic domain
[514-568]; IPR001506; 1.3e-06; Peptidase M12A, astacin
Orthology group: MCL16105; Insect specific
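
The lengths and coordinates in the record above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the genomic coordinates are assumed to be 1-based and inclusive, and the transcript is assumed to contain only coding sequence plus a single stop codon.

```python
import re

def parse_position(text):
    """Split 'SCAFFOLD STRAND START-END' into (scaffold, strand, start, end)."""
    m = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text.strip())
    if m is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = m.groups()
    return scaffold, strand, int(start), int(end)

def genomic_span(start, end):
    """Length of the locus, assuming 1-based inclusive coordinates."""
    return end - start + 1

def coding_length_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly the codons of the protein plus one stop codon.

    Assumes the transcript holds coding sequence only (no UTRs).
    """
    return transcript_bp == 3 * (protein_aa + 1)

scaffold, strand, start, end = parse_position("DPSCF300401 + 179470-191229")
span = genomic_span(start, end)               # locus length on the scaffold
ok = coding_length_consistent(3381, 1126)     # values taken from the record
print(scaffold, strand, span, ok)
```

Under these assumptions the locus is considerably longer than the 3381 bp transcript, and the difference would correspond to intronic sequence.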

Nucleotide sequence:

>DPOGS210662-TA
ATGTCGCCGTTTAAATTGTACGAGAATATTTTTATTGAGACGAATCTTTTATCCTCGGACTATGACGGCGTTATCCTCATACTGTATCCCAGGGACATGAATGTGGCGTTGCCCAGGCATGTGTCGAGCTTCATAGACAAAATCTTTATCCTGGATAAGAGTATTTACAAGACGCCCAGCGTTTGGAACTGTGATTACGTTTCTGGAGGGCGGTTGGTGCTGTCGCCGGTAGGTAATGTAACTCCATACCATGACGTTACCGTGGTGAGAGAAGCCGCGAAGAGGGGAATGCTGCGAGCAATGGAAGCCGGTATGACCAAACCGTTGCTGATCGTTGAAAACGTAGTCCATTACCCCGACGGGCAATTAGTCTGCATTCTGGGGGCTCTGGAATCCTTATATGTTCCGATACAGATAAGGGAGATGAAACCCCAGAAACAGGTATACAGAATCGGTCTGCATGCTGAGGAAAAAGCAACTGAGTCATTTGAAAAGATAGTTAGAAACGCTATCGCCTTGGAGCGAGCTAGGATCGTAGCTAGAGACATCGCTGGCGGGGATCCCGAGAGAATGGCTCCCGGGAGGATAGCTGATTATGTAGTCAAAGTGTTCGCCGAAGATCCTTGTGTATCCATCAAAATTATTGACAACGATGATATTATAGCGCAGAAATATCCACTGCTGGCAGCTGTATCGCGGGCAGCGAATAACGTGGAGAGACACAAGGCTAGAGTTGTTTTACTGGAGTACAATTCATCTAACCCGGTCAGGGTGACAGAAACCATAATGTTGGTTGGCAAAGGGGTGACGTACGACACTGGCGGCGCTGATATAAAGATATCTGGCAAGATGGCCGGCATGTCCAGGGATAAATGCGGGGCAGCGGCTGTAGCTGGGTTTTTGAAGGCCTGCTCCATACTGAAACCTCCACATCTGAAGGTCATTGGGGTTATGTGTTTGTGTCGCAATTCTATCGGCTCAGATTCCTATGTGTCTGATGAATTGCTAACATCCAGTAGCGGAAAACTGGTCAGGGTTACCAACACGGATGCGGAGGGTAGGCTAGCTATGGCAGATTCTCTTTACATGCTGGCCAATATGGCGGAAAAAGAGCTCAACCCACATCTCTACACCATAGCGACTTTGACCGGACACGCCAGAGCCTGCTACGGTAATTATACAGCAGCTATGGACAATCACAGCGCCAAGGGCACCAACCACTCGAGCAAATTGCAGTTCAGCGGGTCAAGACTCGGAGAAGGATTCGAGATATCTACCGTGAGGGCCGAGGATTTGGCTGTAAATGATGGGAAATGTAGCGGAGATGATCTCGTTCAATATGACACTGACGCGAAATGCCGCAACCACCAGCTAGCTGCAGGGTTTCTGATCAGGGTTGCCGGTTTGGAAGACAAGAATATAAAATACACGCATCTCGATATAGCTGGAGCGGCGGGATGTCCTCCGGAAAAGCCCACAGCGACGCCCGTCTTATCTTTGTGCTCAATTCCGTTATTTACCAAAGAGAAGATACTTCCTCTAGAGCTACGAGCTCTGCCGTATGACGTCGACAGTGTCATGCATTTCAATGAAAGGGATTTCAGCAAGAACGGTCACAGAACTTTATTATTCAAGAACGACAAGACTCCACAGAAAAGAATCGGTCTATCTAAAACCGACTTAAAAAAGATAGAATTAGTTTATGGGCCAGAATGTTTGAAACGAGAGCGACAAGCAAAAATCGATATTTGTAGAAACTTCCCAGCTGTTAGGAGAAAGCGAGAAATCGATTTTGCAACAGTCGGAAGCCTTAGAGTCAATCCGGAAATAACCCCGCCGCCGGATACAAACAATCAGGAGAATCTGACAGACGAATTGACAAATAATCTAAAAGAACTCGGCATAGAAGAAGAGGTGCAGCTACTGATAGAGCAAATACACAAAGTTACGGCTACAGCGCTGACAAACGCCAAACTAAAGCATTGTAACACCACAAAGAA
TGGCAGCGGAGATAACAAGAAGGCGGATTTAAAGGAAATAATATATAAAGTCAATGAATACGCCAGAGCTGTGGTCCAGAACGCGTTGACAAACATGACTGTGTTCTGCGATGACGCCAATTCTATGGAGAAATTCCAAATTGGGAGGTGCCAATGGGGTCCTAACAGTAGATGTCCCGTGTACTTCAGATCGACTATGCCTGGGCCTGTCAAATATTCCACACAGCACCGTCCATTGATCCGACAATCGACAAAGCATGAGGGTAGAGGGATAAAACATCAATACGTCCCATGGCTGAGGTCCCAAAATGGAACGGAAGAGAAAAACTTGACGAGGGCAAAGCGTGACGTGAATCAGACAAACTCTGGACCAGCCAACGAGACTGTCAAAGAGGTCTTGAGAATGGCGACAAGGATAATGACTGACAAGAAGGTTGACTTTGCACCTGTCAGACGCAGCTACCAAAGTGTTGGACCCAGAGAGAAGAAGAAAGATAGGAAAGAGAGAAAGGAGAGAAAGTTATTCCGTGTACCTAAAACAGTGCAGCTCTCGAAGGAGAACATAGAGTTCTACGCGGAGAGAATATGGCCGGATGGCGTCGTCAACTATGTCATAAAAGACGATGTGAACTACGATTCGAACAAAGTCCGCGAACGCCTGGCGGAGGTGAATAGGATATTACGCCGGCGGACTTGTGTGAGATTGAACGAAATGAGCGAGGAGGGAGCGAGACGGCTGACAGACTATCTCGTTCTAGACACCGGCAGGGATTACGTCACGGGGCGGGTCGGCGGGAAGCAGCCAAGGAAGAAGAAACTAAAACCACGACACCGACCAGAAGAGAAGAGAGTTACGACCTTCGATGAAGACGGAGATAAAACGAAGCAGAATATAGAAGAACGCGTTGAGAGGTATAAAGAAGAAAGACCCAGAGAGAAGAAGAAAGATAGGAAAGAGAGAAAGGAGAGAAAGTTATTCCGTGTACCTAAAACAGTGCAGCTCTCGAAGGAGAACATAGAGTTCTACGCGGAGAGAATATGGCCGGATGGCGTCGTCAACTATGTCATAAAAGACGATGTGAACTACGATTCGAACAAAGTCCGCGAACGCCTGGCGGAGGTGAATAGGATATTACGCCGGCGGACTTGTGTGAGATTGAACGAAATGAGCGAGGAGGGAGCGAGACGGCTGACAGACTATCTCGTTCTAGACACCGGCAGGGATTACGTCACGGGGCGGGTCGGCGGGAAGCAGGAGAGAGATACAAATACAACAAAAGATATGATCTTGTATAAGTTAGTAAGGGATAGCAAACATCGTGACGTCAATCCTTATCATAGGCTTGTGTTGCAAGCATTAAGAAATATCCCTGTTCGATAG

Protein sequence:

>DPOGS210662-PA
MSPFKLYENIFIETNLLSSDYDGVILILYPRDMNVALPRHVSSFIDKIFILDKSIYKTPSVWNCDYVSGGRLVLSPVGNVTPYHDVTVVREAAKRGMLRAMEAGMTKPLLIVENVVHYPDGQLVCILGALESLYVPIQIREMKPQKQVYRIGLHAEEKATESFEKIVRNAIALERARIVARDIAGGDPERMAPGRIADYVVKVFAEDPCVSIKIIDNDDIIAQKYPLLAAVSRAANNVERHKARVVLLEYNSSNPVRVTETIMLVGKGVTYDTGGADIKISGKMAGMSRDKCGAAAVAGFLKACSILKPPHLKVIGVMCLCRNSIGSDSYVSDELLTSSSGKLVRVTNTDAEGRLAMADSLYMLANMAEKELNPHLYTIATLTGHARACYGNYTAAMDNHSAKGTNHSSKLQFSGSRLGEGFEISTVRAEDLAVNDGKCSGDDLVQYDTDAKCRNHQLAAGFLIRVAGLEDKNIKYTHLDIAGAAGCPPEKPTATPVLSLCSIPLFTKEKILPLELRALPYDVDSVMHFNERDFSKNGHRTLLFKNDKTPQKRIGLSKTDLKKIELVYGPECLKRERQAKIDICRNFPAVRRKREIDFATVGSLRVNPEITPPPDTNNQENLTDELTNNLKELGIEEEVQLLIEQIHKVTATALTNAKLKHCNTTKNGSGDNKKADLKEIIYKVNEYARAVVQNALTNMTVFCDDANSMEKFQIGRCQWGPNSRCPVYFRSTMPGPVKYSTQHRPLIRQSTKHEGRGIKHQYVPWLRSQNGTEEKNLTRAKRDVNQTNSGPANETVKEVLRMATRIMTDKKVDFAPVRRSYQSVGPREKKKDRKERKERKLFRVPKTVQLSKENIEFYAERIWPDGVVNYVIKDDVNYDSNKVRERLAEVNRILRRRTCVRLNEMSEEGARRLTDYLVLDTGRDYVTGRVGGKQPRKKKLKPRHRPEEKRVTTFDEDGDKTKQNIEERVERYKEERPREKKKDRKERKERKLFRVPKTVQLSKENIEFYAERIWPDGVVNYVIKDDVNYDSNKVRERLAEVNRILRRRTCVRLNEMSEEGARRLTDYLVLDTGRDYVTGRVGGKQERDTNTTKDMILYKLVRDSKHRDVNPYHRLVLQALRNIPVR-
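
The relationship between the two sequences above can be checked by translating the nucleotide record and comparing it with the protein record. The following is a minimal sketch, assuming the standard genetic code and assuming the trailing "-" in the protein record marks the stop codon; `translate` and its `stop_symbol` parameter are hypothetical names chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    cds = "".join(cds.split()).upper()   # tolerate line breaks inside the sequence
    usable = len(cds) - len(cds) % 3     # ignore any trailing partial codon
    protein = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)

# First six codons and last four codons of DPOGS210662-TA, copied from the record.
print(translate("ATGTCGCCGTTTAAATTG"))
print(translate("CCTGTTCGATAG"))
```

Applied to the full DPOGS210662-TA sequence, the result can be compared string-for-string with the DPOGS210662-PA record to confirm that the two entries agree.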