Monarch geneset OGS2.0

DPOGS200548
Transcript: DPOGS200548-TA (2850 bp)
Protein: DPOGS200548-PA (949 aa)
Genomic position: DPSCF300119 + 10566-23081
RNAseq coverage: 947x (Rank: top 13%)
Annotation
Heliconius: HMEL016873  6e-137  70.77%
Bombyx: BGIBMGA010763-TA  0.0  65.52%
Drosophila: SP1029-PA  0.0  38.75%
EBI UniRef50: UniRef50_D6WB37  0.0  40.79%  Aminopeptidase N-like protein n=2 Tax=Tribolium castaneum RepID=D6WB37_TRICA
NCBI RefSeq: XP_968871.2  0.0  40.79%  PREDICTED: similar to protease m1 zinc metalloprotease [Tribolium castaneum]
NCBI nr blastp: gi|270002889  0.0  40.79%  aminopeptidase N-like protein [Tribolium castaneum]
NCBI nr blastx: gi|189234124  0.0  40.79%  PREDICTED: similar to protease m1 zinc metalloprotease [Tribolium castaneum]
Group
Gene Ontology: GO:0006508  1.4e-296  proteolysis
GO:0008237  8.8e-82  metallopeptidase activity
GO:0008270  8.8e-82  zinc ion binding
KEGG pathway: tca:657312  0.0
 K11140 (ANPEP)maps-> Glutathione metabolism
    Renin-angiotensin system
    Hematopoietic cell lineage
InterPro domain: [18-949]  IPR001930  1.4e-296  Peptidase M1, alanine aminopeptidase/leukotriene A4 hydrolase
[214-432]  IPR014782  8.8e-82  Peptidase M1, membrane alanine aminopeptidase, N-terminal
Orthology group: MCL10074  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200548-TA
ATGGAGTGCTTGAAGGTGCTGTTCCTCCTGTCCTCCGTCCAGTTGAGCCGGCAGTACTTGCTGCCAGATCACATCGCTCCCTCACACTACCAACTCAGACTCCTGTACGACATCGACCCCAGCACCAACTTCAGCTTCTTCGGCGTCGCTGATATTCAGCTAACAGTAAAAAAGAGCACTTCGAAGATAATTCTCCATGCGCAAGATTATATGATATCAGATGACAAAGTGAGTGTCGTTGGACAAAAAGAGGTTCCCAAAGTGACGGGAGTAAAACTGAATGATACGTACAACTTCTTAGAAATATCACTTGATAAGGATTTAGAGGAAAATGGGAAGTACAAACTCACGATACCCTTCTACGGCAACCTGGTCAAAGGTTTGGACGGAGCCTACATAAGCTCCTACACGAACAGACAGACTCAGAAGACAGAGTATTTAATTTCCACTCAGTTTGAGGCGATATCAGCTCGCAAGGGTTTCCCGTGTTTCGACGAACCCATGTACAAAGCCACCTACTCTATCATCATCGGTCACAGCAAGGAGTACACGGCCGTCTCCAACATGCCACTAGCGGCGTCCGCCTCTGAAAATGCCCTAGAAGATTACTGGCCCTGGGACGTAGTCGGAAAGAGGTTTAGGAAGGAGAGATCTTCATTTGTCTGGGATCAGTTCGCCAAGTCTGTGCCTATGTCTACATATCTGGTCGCGTTCGTGGTGTCCAAGTTCTCGCACGTGGTCAGCCCTCCGGAACTATCGAAGACACAGTTCAGGATATGGGCCAGAGGAGACGCCATCGATCAGACATCCTACGCGGCTAAGATCGGTCCTCAAGTGTTGTCCTACTTTGAGAAGTGGTTCAACGTGTCGTTTCCTCTGCCGAAGCAGGACATGATGGCCATACCAGACTTCTCAGCGGGGGCTATGGAGAACTGGGGCCTCATCACGTACAGAGAGACGGCACTCCTGTACAGCGATAAGGAATCGTCGTTCTTGAACAAGGAGAGGATAGCTGAGGTGGTAGCTCATGAGCTGGCCCATCAGTGGTTCGGTAACCTGGTGACCATGAAGTGGTGGTCGGACCTGTGGCTGAACGAGGGGTTCGCGACCTTCGTGTCTAGTGTGGGCGTGTCGGCCGTGGAGCCGACCTGGCGAGCTGATCGGTCCTACGCCGTGGAGAACACGCTCTCCGTGTTGAGTTTAGACGCCTTGGAGTCATCTCATCCCGTGTCAGCGCCTCTCGATGATCCGAAGCGCATCTCGGAGATCTTCGACGCGATCTCTTACAGGAAGGGCTCCACTCTCATCCGCATGATGCTGATGTTCCTCGGAGAAGGTGTCTTCAGGCAGGCGCTGCACAACTACCTGATGAAGTATTCGTATTCAAACGCCGAGCAGGATGATCTCTGGGCGGAGCTGACGGCAGCCAGCCTGAGGAGTGGAAGCCTTACGAGGAACATCACCGTTAAAGAGGTGATGGACACCTGGACCACACAGACGGGATACCCGATCCTCACCGTCACCAGGGACTACTCCGACAAGTCGCTTACAATCTCACAGAAGCGTTACCTGTCTCTGGGCGTCGGTCGGACCTCCCAAGCGTGGTGGGTCCCTCTAAGCGTTCTCTGTGAGAAAGACAGAAAAAGCGAGAGCGAGAGCGTCCAGTGGTTAGGAGATACGGAGGGAGTGACGAACGAACATAGATACGAACACGGCTCTGGAGCGAGCGAGTGGGTTCTGTTCAACTACAACATGATCGCTCCATACAGAGTCAACTACGATCAGAGAAATTGGAAGCTTCTCATACAGACTCTGACGAGTGACCAGTACACCCTCATCCCGGTCGAAGGTCGAGTGCAGTTGCTGTCCGACGCTTTTGAGCTGGCGTGGAACAATCAGCTCGACTATGGAATGACTTTACAGTTGGCGAGCTACCTGAAGAGGGAGACGGAATACTTGCCTCTCTACACGGGGCTGTCGGCTTTAGCTAAGATTGAGAA
CGTACTGAAACGAAGTTCCGAGTACGGAGCCTTCCAGAAGTTTATCAGAAGACTCCTCAACAACGTCTACCAGAAAGGAGGTTTGGCTCTGAAGAGGATCGTCGACGGCGACGACTTGAACAGCGTCAAGCTTCAGACGACTGTGAGCTCTTGGGCCTGCAGCGTGAAGATCCCCGGCTGTGAGGAGAACGCTATAGACATGTTCAACGACTGGATGAGGACGGACAGACCCGACGAAAACAATCCGTATGTAGTCCCGCCCTCCGCCCTCCGCCATGGAATCCCTCTATACTCATGTGTTAATCTGATTCCCGTGGACCTCCGCCGCACTGTATATTGTTCGGCTATCCGTCGTGGCGGGGTGTCGTTGTGGCGCTGGTCCCTCGCCCGCCGCCGGGCCTCCAACGTGGCGACTTCCCGGGACGCCCTGCAGCACGCCCTGGCCTGCAGCAGAGACGTCTGGGTTCTGGCGCAGTACTTGGAGTGGACGGTGTCTGACGGCAGCGAGGTGCGTCGTCAGGATGCCGGCAACGTCATCGCAGCCGTCACCCGGTCTGCCACCGGATACTATGTGGCTAAGGACTTCATATACGGACGAATCCAGGAAATTAGCAAAGCGTTCAACGGCCAGGACAGGAGAATGGGCGGCATCATAAAGACCCTGTTGGGGCAGTTCACGACCAAGAAGGAACTCGATGAGTTCTTGGAGTGGAAGAAGCTGAACGAAAAATATTTGTCGGCTTCAAAGATAGCGGTCGCTCAGGGGATAGAGAACGCTAGAGTGAACATAGAGTGGATCCAGAGAAACAAACGTACCGTAGTGGATAAGATGAGGGAGTACTCCATGTGA

Protein sequence:

>DPOGS200548-PA
MECLKVLFLLSSVQLSRQYLLPDHIAPSHYQLRLLYDIDPSTNFSFFGVADIQLTVKKSTSKIILHAQDYMISDDKVSVVGQKEVPKVTGVKLNDTYNFLEISLDKDLEENGKYKLTIPFYGNLVKGLDGAYISSYTNRQTQKTEYLISTQFEAISARKGFPCFDEPMYKATYSIIIGHSKEYTAVSNMPLAASASENALEDYWPWDVVGKRFRKERSSFVWDQFAKSVPMSTYLVAFVVSKFSHVVSPPELSKTQFRIWARGDAIDQTSYAAKIGPQVLSYFEKWFNVSFPLPKQDMMAIPDFSAGAMENWGLITYRETALLYSDKESSFLNKERIAEVVAHELAHQWFGNLVTMKWWSDLWLNEGFATFVSSVGVSAVEPTWRADRSYAVENTLSVLSLDALESSHPVSAPLDDPKRISEIFDAISYRKGSTLIRMMLMFLGEGVFRQALHNYLMKYSYSNAEQDDLWAELTAASLRSGSLTRNITVKEVMDTWTTQTGYPILTVTRDYSDKSLTISQKRYLSLGVGRTSQAWWVPLSVLCEKDRKSESESVQWLGDTEGVTNEHRYEHGSGASEWVLFNYNMIAPYRVNYDQRNWKLLIQTLTSDQYTLIPVEGRVQLLSDAFELAWNNQLDYGMTLQLASYLKRETEYLPLYTGLSALAKIENVLKRSSEYGAFQKFIRRLLNNVYQKGGLALKRIVDGDDLNSVKLQTTVSSWACSVKIPGCEENAIDMFNDWMRTDRPDENNPYVVPPSALRHGIPLYSCVNLIPVDLRRTVYCSAIRRGGVSLWRWSLARRRASNVATSRDALQHALACSRDVWVLAQYLEWTVSDGSEVRRQDAGNVIAAVTRSATGYYVAKDFIYGRIQEISKAFNGQDRRMGGIIKTLLGQFTTKKELDEFLEWKKLNEKYLSASKIAVAQGIENARVNIEWIQRNKRTVVDKMREYSM-
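
The lengths and coordinates listed in this record can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the database: the function names are hypothetical, and it assumes the transcript entry is a coding sequence with no UTRs (the nucleotide sequence above starts with ATG and ends with TGA) and that the genomic coordinates are 1-based and inclusive.

```python
def cds_matches_protein(transcript_bp: int, protein_aa: int) -> bool:
    """Check that a CDS length equals 3 * (residues + 1 stop codon).

    Assumption: the transcript contains only coding sequence (no UTRs).
    """
    if transcript_bp % 3 != 0:
        return False
    return transcript_bp // 3 == protein_aa + 1


def genomic_span(start: int, end: int) -> int:
    """Length of the genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record for DPOGS200548.
TRANSCRIPT_BP = 2850
PROTEIN_AA = 949
GENOMIC_START, GENOMIC_END = 10566, 23081

if __name__ == "__main__":
    print(cds_matches_protein(TRANSCRIPT_BP, PROTEIN_AA))
    print(genomic_span(GENOMIC_START, GENOMIC_END))
```

Under these assumptions, 2850 bp corresponds to 950 codons, that is 949 residues plus the stop codon. The genomic interval is longer than the transcript, which would be consistent with introns in the gene model, although the record does not list exon boundaries.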