Monarch geneset OGS2.0

DPOGS200549
Transcript: DPOGS200549-TA (3087 bp)
Protein: DPOGS200549-PA (1028 aa)
Genomic position: DPSCF300119, + strand, 28079-34916
RNAseq coverage: 52x (Rank: top 70%)

Annotation
Heliconius       HMEL016873        6e-120  46.39%
Bombyx           BGIBMGA010763-TA  4e-116  61.15%
Drosophila       SP1029-PA         2e-111  37.70%
EBI UniRef50     UniRef50_D6WB37   2e-117  40.52%  Aminopeptidase N-like protein n=2 Tax=Tribolium castaneum RepID=D6WB37_TRICA
NCBI RefSeq      XP_968871.2       5e-118  41.28%  PREDICTED: similar to protease m1 zinc metalloprotease [Tribolium castaneum]
NCBI nr blastp   gi|270002889      8e-117  40.52%  aminopeptidase N-like protein [Tribolium castaneum]
NCBI nr blastx   gi|189234124      2e-115  40.52%  PREDICTED: similar to protease m1 zinc metalloprotease [Tribolium castaneum]

Group:

Gene Ontology
GO:0006508  5.4e-238  proteolysis
GO:0008237  2.4e-80   metallopeptidase activity
GO:0008270  2.4e-80   zinc ion binding

KEGG pathway
tca:657312  1e-117
K11140 (ANPEP) maps to:
    Glutathione metabolism
    Renin-angiotensin system
    Hematopoietic cell lineage

InterPro domain
[18-1025]  IPR001930  5.4e-238  Peptidase M1, alanine aminopeptidase/leukotriene A4 hydrolase
[143-362]  IPR014782  2.4e-80   Peptidase M1, membrane alanine aminopeptidase, N-terminal

Orthology group: MCL10074 (Multiple-copy universal gene)

Nucleotide sequence:

>DPOGS200549-TA
ATGGAGTGCTTGAAGGTGCTATTCCTCCTGTCCTCCGTCCAGTTGAGCCGGCAGTACTTGCTGCCAGATCACGTCGCTCCCTCACACTACCAACTCAGACTGCTGTACGACATCGACCCCAGCACCAACTTCAACTTCTTCGGCGTCGCTGATATTCAGTTTTCGGCTAAAAAGAGCACTTCGAAGATAATTCTCCATGCGCAAGATTATATGATATCAGAGGACAAAGTGAGTGTCGTTGGACAGAAAGACGTTCCCAAAGTGACGGGCGTGAAACTGAATGATACGTACAACTTCTTAGAAATATCACTTGATAAGGATTTAGAGGAAAATAAAAACTATACTCTCACGATACCCTTCCATGGTAATTTGATTAGAAAGTTGGACGGAGTTTACATCAGCTCGTATAGAGATAAGAAGAATGAAAAATCAGAGAAGGAACGTTCTGCTTTTATTTGGGATGAATTCGACAAATCAGTTCCGATGTCTTCATATCTGGTGGCTTTCGTGGTGTCCAAATACTCTTACAAATCTGCCCCAAATACATCGAATACCAAGTTTAGAATTTGGGCCAGAAGCGACGATATAGAACAGACTTCCTACTCATGTAAAGTAGGGCCAGCAGTGCTGTCTCAATTTGAGAGATGGTTTAATGTGTCGTTCCCGCTTCCCAAACAAGATATGGTGGCTATACCAGACTTCGATGCCGAGGGTATGGAAAACTGGGGCCTCGTCACTTACGAGGATGCTGCACTTCTGTACCATGACAGAGAATCTTCTCTCTCAGACAAGGAAAGAATTGCCTCATTGATAGCTCATGAGCTGGCTCATCAATGGTTCGGTAACCTGGTCACCATGAAGAGCTGGTCGGATCTGTGGCTGAACGAGGGGTTTGCGACTTTCGTAGCCTCACTGGGTGTGAATGCCATTGAACCAACCTGGCATGCGGACATCAACAACGCTGTCGAGAACACACTGACAGTATTCAACTTAGACGTCTTGGAGTCATCTCATCCAGTTTCGGTTCCCTTAGAAGATCCGAGAGACATTACAGAGATCTTCGATGATATTTCTTACAGTAAAGGAGCTACTTTAATTCGAATGATGGAAATGTTTTTGGGAGAAGAGGACTTTAGACAGGCACTTCACAATTATCTTATAAAATATTCATACTCTAACGCTGCACAAGATGATCTCTGGTCGGAACTTAACGCAGTGGTCATGAACAAGGGCGTGTTGAATCGCAACATGACTGTAAAACGAGTCATGGATACTTGGACCAAACAAACTGGATTTCCTTTGTTAACTGTTAACAGGAACTACTCCGACAAATCTGTCAATATTTCACAGAAACGTTATGTATGGCGTCAAGAAATTCTTTCTCCTCAAGGTTGGTGGATTCCTCTTAGTATGAAATGTGAACGTGGAACAGGAGACCAAAAACTACTCTGGTTAAGTGATGAGGAAGGAGTTTTGGTTGAAAAACGTCTCGAGCATGGATGCGGCCAAAATGAATGGTTGTTGTTTAATTACAATATGATGGTACACATCTTCGATGATATTTCTTACAGTAAAGGAGCTACTTTAATTCGAATGATGGAAATGTTTTTGGGAGAAGAGGACTTTAGACAGGCACTTCACAATTATCTTATAAAATATTCATACTCTAACGCTGCACAAGATGATCTCTGGTCGGAACTTAACGCAGTGGTCATGAACAAGGGCGTGTTGAATCGCAACATGACTGTAAAACGAGTCATGGATACTTGGACCAAACAAACTGGATTTCCTTTGTTAACTGTTAACAGGAACTACTCCGACAAATCTGTCAATATTTCACAGAAACGTTATGTATGGCGTCAAGAAATTCTTTCTCCTCAAGGTTGGTGGATTCCTCTTAGTATGAAATGTGAACGTGGAACAGGAGACCAAAAACTACTCTGGTTAAGTGATGAGGAAGGAGTTTTGGTTGAAAAACGTCTCGAGCATGGATGCGGCCA
AAATGAATGGTTGTTGTTTAATTACAATATGATGGCACCCTTTAGAGTTAATTACGATGACAACAATTGGAAGCTTCTCATAAACACTCTGACGAGTGACCAGTACACCCTCATCCCGGTCGAAGGTCGAGTGCAGTTGCTGTCCGACGCTTTTGAGCTGGCGTGGAACAATCAGCTCGACTATGGAATGACTTTACAGTTAGCAAGTTACCTCCAAAAAGAACAAGAATATTTACCTCTGTATGCAGGTCTTTCAGGTCTGTCGAAGATATCCAATGTACTAAAACGGAGCGCAGAATATGGTGTGTTCCAAGAATATGTTAGAATATTAATCACCCGGATATATCAGAGTGGAGGACTCGCTCACAAAAATATAGTTAATGGTGCTGACTTAAACGGTGTCAAAATTCAGGGGCTCTCCAGCTCATGGGCTTGCAGTATGAACGTTCCTGGTTGTGAAGACAATGCCTTAGAGATGTTCCATCAGTGGATGAAAATTCAGAATCCAGATGAAAATAATCCCATCCCGGTAGACCTTCGCTCGATAGTGACCTGCGTCGGTATTCATCGTGGTAGTGAGTATCACTGGAGTTGGTCTCTCGAGCGGCGGAAGCATTCTAACGTGGCAGCAAGTCGGGAACACCTCCTGAACTCACTGGCCTGCAGCAGGGATGTTTGGATACTGGCCCAGTACTTAGAATGGACTTTAACAGAGAGTGACGAGCTTCATCGCCAGGAGTCGAGTCGAGTCATCAGCGAGGTGGTGAGCTCTGAGGTGGGCTACTACGTGGCGAGAGACTTCATTTATAACAGGATCAAGGACATATATACTGCTTTTTATGATCAAAGTGAAGGCATCGCTGACATTATGAAGAGCCTTCTCGGGCAGTTTACTTCTCAGAAGGAACTCGATGAGTTCTTGTCTTGGCAAAAGAAGAATGATGAGTTGCTCTCTGATTCAAAAATGGCAGTGGCACAAGGAATAGAAACTGCTCGCAACAACATTCGATGGGTGGAGACTATGAAACCTATTTTTATGGACAGACTGAAAGAAGTTATTCGATCCAGTTCCTCTACATCGGTCTAG

Protein sequence:

>DPOGS200549-PA
MECLKVLFLLSSVQLSRQYLLPDHVAPSHYQLRLLYDIDPSTNFNFFGVADIQFSAKKSTSKIILHAQDYMISEDKVSVVGQKDVPKVTGVKLNDTYNFLEISLDKDLEENKNYTLTIPFHGNLIRKLDGVYISSYRDKKNEKSEKERSAFIWDEFDKSVPMSSYLVAFVVSKYSYKSAPNTSNTKFRIWARSDDIEQTSYSCKVGPAVLSQFERWFNVSFPLPKQDMVAIPDFDAEGMENWGLVTYEDAALLYHDRESSLSDKERIASLIAHELAHQWFGNLVTMKSWSDLWLNEGFATFVASLGVNAIEPTWHADINNAVENTLTVFNLDVLESSHPVSVPLEDPRDITEIFDDISYSKGATLIRMMEMFLGEEDFRQALHNYLIKYSYSNAAQDDLWSELNAVVMNKGVLNRNMTVKRVMDTWTKQTGFPLLTVNRNYSDKSVNISQKRYVWRQEILSPQGWWIPLSMKCERGTGDQKLLWLSDEEGVLVEKRLEHGCGQNEWLLFNYNMMVHIFDDISYSKGATLIRMMEMFLGEEDFRQALHNYLIKYSYSNAAQDDLWSELNAVVMNKGVLNRNMTVKRVMDTWTKQTGFPLLTVNRNYSDKSVNISQKRYVWRQEILSPQGWWIPLSMKCERGTGDQKLLWLSDEEGVLVEKRLEHGCGQNEWLLFNYNMMAPFRVNYDDNNWKLLINTLTSDQYTLIPVEGRVQLLSDAFELAWNNQLDYGMTLQLASYLQKEQEYLPLYAGLSGLSKISNVLKRSAEYGVFQEYVRILITRIYQSGGLAHKNIVNGADLNGVKIQGLSSSWACSMNVPGCEDNALEMFHQWMKIQNPDENNPIPVDLRSIVTCVGIHRGSEYHWSWSLERRKHSNVAASREHLLNSLACSRDVWILAQYLEWTLTESDELHRQESSRVISEVVSSEVGYYVARDFIYNRIKDIYTAFYDQSEGIADIMKSLLGQFTSQKELDEFLSWQKKNDELLSDSKMAVAQGIETARNNIRWVETMKPIFMDRLKEVIRSSSSTSV-
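
The lengths and coordinates given in the record header can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the database record. It assumes that the transcript length covers the coding sequence alone (no UTRs), that the trailing "-" in the protein sequence marks the stop codon, and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical.

```python
def coding_length_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if the transcript is a whole number of codons that
    encodes exactly `protein_aa` residues plus one stop codon.

    Assumption: the transcript length is CDS only (no UTRs).
    """
    return transcript_bp % 3 == 0 and transcript_bp // 3 == protein_aa + 1


def genomic_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record header above.
transcript_bp = 3087
protein_aa = 1028
span = genomic_span(28079, 34916)

print(coding_length_consistent(transcript_bp, protein_aa))  # True: 3087 / 3 = 1029 = 1028 + 1
print(span)                   # 6838
print(span - transcript_bp)   # 3751 bp of the locus lie outside the transcript
```

Under these assumptions the header values agree with each other: 3087 bp corresponds to 1029 codons, that is, 1028 residues plus a stop. The genomic interval is longer than the transcript, which would be expected if the gene model contains introns.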