Monarch geneset OGS2.0

DPOGS209793
Transcript: DPOGS209793-TA, 2940 bp
Protein: DPOGS209793-PA, 979 aa
Genomic position: DPSCF300117 - 794990-803034
RNAseq coverage: 87x (Rank: top 63%)
Annotation
Heliconius: HMEL008407  2e-68  25.73%
Bombyx: BGIBMGA008018-TA  0.0  46.98%
Drosophila: CG14516-PB  7e-56  26.14%
EBI UniRef50: UniRef50_Q7QC91  2e-65  25.63%  AGAP002508-PA n=1 Tax=Anopheles gambiae RepID=Q7QC91_ANOGA
NCBI RefSeq: XP_312430.4  4e-66  25.63%  AGAP002508-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|383862006  5e-65  25.81%  PREDICTED: aminopeptidase N-like [Megachile rotundata]
NCBI nr blastx: gi|347967986  1e-65  25.38%  AGAP002508-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology:
  GO:0006508  2.8e-50  proteolysis
  GO:0008237  4.4e-31  metallopeptidase activity
  GO:0008270  4.4e-31  zinc ion binding
KEGG pathway: nvi:100124286  3e-61
  K11140 (ANPEP) maps to:
    Glutathione metabolism
    Renin-angiotensin system
    Hematopoietic cell lineage
InterPro domain:
  [103-960]  IPR001930  2.8e-50  Peptidase M1, alanine aminopeptidase/leukotriene A4 hydrolase
  [94-429]  IPR014782  4.4e-31  Peptidase M1, membrane alanine aminopeptidase, N-terminal
Orthology group: MCL30537 (Lepidoptera specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209793-TA
ATGGCCCGGACAAAGTTTGAGTGTTACCTTATTTTTTTTATGTCTTTAATATTATATATTGATACGAGAAGTCTTCCGATAAGTGAATACAGTGTGAAAGTAACTGAAAATCAATATTCCAATGAAGGTGTTTCACAGCTAGGAGAACAAAAAACAATTTATCAGAAGAAAGTAAATAATGTTGAAGAATCTAATGTGATAAATAAAGAACAATCAATTATTAAGGCAACTCTAGAATCGAATATTCGGAATTCTAATGTCAGACCGGGCCAAACAGTAAGATCGTATGCAGTTTCTATAACAAGAAATGGAAACAACTTCAATGGAATTGCAACAATAGATTTAGATTTGACTGAACAGACCAGGGACAATGCTATAAGATTTAATATACTAGACATGAATGTACAATCTGTGAAAGTCGGAGTCTTAACAGATGCAAATGCTATTAATACTAATTTCAATATAAACAACAATAACTTAGTAATTCAACCAAGTCAATCAGGAACTCGTTTTGTTGTAATTATTGAATACACCGGACAGCTGCGGAGAGATGGTAATGGACTGTACTTAGGACAATATGATAACAGTGACTATGTAGCAATGAATCTTCATCCAATGAATGCAAGAAGAGTATTCCCTTGTATGGACGAGCCTTCTCTTAGATCAAGGATAACGTTCACGTTCAATAACATGGAATATGAGACTATGGTCTCAAACAGTTTATTGCAAGCTGAAAGCCAGACAGCATTCAGAGCCTTAGAAACTCCCTCTCATGTCTGGGGTATGGTGGCTCATAATTTAGTTAGCATTAGTGTACCAGTGGGAAATGTGATAATATATGGACGGCAAGGAATAGGAAATCAGGATGCTCAAGCGGCTGCAGCGATAAACTATTACTTTAATGCATTTTACGAATGGACTCAAACATCTTATTTCCAAATCGTTAACGAACAAGACAATAGGTTGCATATTATGGCTGTTCCTGATATTAATAGGGAATGGTACGCGTTATCTACAATCTGCATTTGGGAACCCTACATTTTTATGACAACCAACCATGCGGTCAAACAAAGGAAAATTGCTCTTGTCACAATAGCGGAGGCAATGTCACGACATTGGTTCGGATATGTTGTTTATCCTGAAAATTGGCGTCACCAGTGGGTCGTTACAGGTTTAGGAAGCTACGTAGCATATGATATTGTAAGAGAGTTCCAAACATCGTCAGATCCAGAAGAAAATGCAAACATGCTGGACGTCAATACAATATTTACTACTGACATCATTCAGGAGAGTCTTCTTCAGGATGCTTATTCCATTTACGATCCACTACTTGTAGGAAATAATATCTTTCAAAACAATGCCGTACGTTTCCATTTGAACAGTCTGCTAAAATATAAAGCACCTGCAATAATACGAATGTTAAGTGGGGTTCTGGGAGGTGATACGGATTTAGTCAAAAATATAACCAGGGCCTTAATCCCTGCAAATCATTTGGAGGCAGTTTCTACCCAAAGTTTATTTAACGCTGTCAATAGCGTCTTCTCTGGGAATAATATACTTAATAATGCTCAGGAGTTTATTACAAATTGGATAGATAAAACTGGTTACCCCGTGATTCGGGTCGTACATAGACCAGGAGGAGTTCAAGTTACGCAGGAACATTTTAGTTTTACGACTTCTTCAAGACCATCAAACTTCCGAATTCCTTTGACATATACAACAAGAAATGAGCCCAATTTTAATAACCTATTTCCATCAGAAATTATGGATTTAGTGAGCGTGGTTACTACGAACCTTGGGGAAGATGATTGGATAATATTTAATATACAAGGACAGGGTTATTATCGAGTTAACTATGATGATGTTTTATGGCAGAGGATTATCAATGCATTACAGGATGAAGAGCAAAGAGAGAAAATACATCCGCTTAATCGTGCTTCTATTTTAGATGACGCTCTTAATTTAGCAAGAGCTGGAAAATTGGATTACAGTATAGCGTT
TGAAATAGTCCTAACTATGGAATTGGAAACAGATTACGGTGTTTGGAAGACTTTCGTTAGAAATATGGACTTCATAAGGAAACGATTAATGACATTTGCAAACGATAACGACAGAGATAGTCAAGCCTATGCGAAACTTTTGGAGAAACTTATTGTGTCAGTTGAAGAGAAATTAGGTTTTGAGCCACAAAATTCGGACACAGCAATGGTATCACTCACAAGAGCCTTGGTGATGGAGCATGCATGTGTATCCGGCTATGAACCTTGTATTGCTGCAGCCGTTGATATGTTTTATGATCCAAACAATGATGGCGAAGTGAATCCCGAAATACCGTTAGAAATAAGGCCAGTTGTATATTGTACAATGGCTAGAGAAGGAGATGAAGAAGTGAGAGCAGCCCTCAGAAGACGCTTAGAACAAGAAACATCCAGGTACGAGCGACTGGTTATATTGGAGTCTTTTGCCTGTTCAGAAGATGCTGCGTTTATTGATGGGTTACTGGCAGATACAATTGCGGCGAACAGCCCTTACGTTATTGAAGAGAGGTTCAAGATATTTGTCGCAGTCGCCTCCTCTAGTTTTAGAAATACAAATCGCGCACTAAACTTTTTGAGGCAACGAACAAATGAAATAAGAAATATGTACGGTGGCGGTGAAAAACTCGATCAGGTTATATTGATTATGGCAGAAAATGCAGTGAATCAACAGATAGGGGAAGATTTTACCACATGGGTGAATTCGCAACCGGTGATTAATAACCTTGAAGATTCCAGCATGGTCGCTGTTAAAGCAAGGAGTTTAATAGCAGAAAACGTAAGTTGGAAAAATGCACATCTGTCGAATGTATACGATTGGGTAGAGCAAAATAATGGAAACACCTTTTTGGTGTCTTTTGCTTTGCTAGGTCTATCTTTGATCGTTACCATTTTTAATAATTAA

Protein sequence:

>DPOGS209793-PA
MARTKFECYLIFFMSLILYIDTRSLPISEYSVKVTENQYSNEGVSQLGEQKTIYQKKVNNVEESNVINKEQSIIKATLESNIRNSNVRPGQTVRSYAVSITRNGNNFNGIATIDLDLTEQTRDNAIRFNILDMNVQSVKVGVLTDANAINTNFNINNNNLVIQPSQSGTRFVVIIEYTGQLRRDGNGLYLGQYDNSDYVAMNLHPMNARRVFPCMDEPSLRSRITFTFNNMEYETMVSNSLLQAESQTAFRALETPSHVWGMVAHNLVSISVPVGNVIIYGRQGIGNQDAQAAAAINYYFNAFYEWTQTSYFQIVNEQDNRLHIMAVPDINREWYALSTICIWEPYIFMTTNHAVKQRKIALVTIAEAMSRHWFGYVVYPENWRHQWVVTGLGSYVAYDIVREFQTSSDPEENANMLDVNTIFTTDIIQESLLQDAYSIYDPLLVGNNIFQNNAVRFHLNSLLKYKAPAIIRMLSGVLGGDTDLVKNITRALIPANHLEAVSTQSLFNAVNSVFSGNNILNNAQEFITNWIDKTGYPVIRVVHRPGGVQVTQEHFSFTTSSRPSNFRIPLTYTTRNEPNFNNLFPSEIMDLVSVVTTNLGEDDWIIFNIQGQGYYRVNYDDVLWQRIINALQDEEQREKIHPLNRASILDDALNLARAGKLDYSIAFEIVLTMELETDYGVWKTFVRNMDFIRKRLMTFANDNDRDSQAYAKLLEKLIVSVEEKLGFEPQNSDTAMVSLTRALVMEHACVSGYEPCIAAAVDMFYDPNNDGEVNPEIPLEIRPVVYCTMAREGDEEVRAALRRRLEQETSRYERLVILESFACSEDAAFIDGLLADTIAANSPYVIEERFKIFVAVASSSFRNTNRALNFLRQRTNEIRNMYGGGEKLDQVILIMAENAVNQQIGEDFTTWVNSQPVINNLEDSSMVAVKARSLIAENVSWKNAHLSNVYDWVEQNNGNTFLVSFALLGLSLIVTIFNN-
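
The listed lengths can be cross-checked against each other: if the transcript holds only coding sequence, 3 x (979 residues + 1 stop codon) = 2940 nt, which matches the 2940 bp reported above. A minimal Python sketch of that check follows. It assumes the sequences are available as FASTA text, that wrapped sequence lines belong to the preceding header, and that a trailing "-" (or "*") on the protein marks the stop codon. The function names and the toy records are hypothetical and are not part of the database.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def cds_matches_protein(cds, protein):
    """Return True if the CDS length is 3 nt per residue plus one stop codon.

    A trailing '-' or '*' on the protein is treated as a stop marker
    (an assumption about how this record is written).
    """
    residues = protein.rstrip("-*")
    return len(cds) == 3 * (len(residues) + 1)


# Toy records for illustration only (not the real DPOGS209793 sequences).
toy = """>toy-TA
ATGGCC
CGGTAA
>toy-PA
MAR-
"""

records = read_fasta(toy)
print(len(records["toy-TA"]))
print(cds_matches_protein(records["toy-TA"], records["toy-PA"]))
```

To apply the same check to this record, pass the full text of the two FASTA blocks above to `read_fasta` and compare the `DPOGS209793-TA` and `DPOGS209793-PA` entries.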