Monarch geneset OGS2.0

DPOGS204431
Transcript: DPOGS204431-TA; 1302 bp
Protein: DPOGS204431-PA; 433 aa
Genomic position: DPSCF300002 - 345659-351137
RNAseq coverage: 1211x (Rank: top 10%)
Annotation
Heliconius: HMEL006237; 1e-103; 88.32%
Bombyx: BGIBMGA007728-TA; 0.0; 88.76%
Drosophila: CG17337-PA; 7e-164; 62.12%
EBI UniRef50: UniRef50_Q96KP4; 4e-152; 60.71%; Cytosolic non-specific dipeptidase n=176 Tax=Opisthokonta RepID=CNDP2_HUMAN
NCBI RefSeq: XP_001653563.1; 3e-171; 65.36%; glutamate carboxypeptidase [Aedes aegypti]
NCBI nr blastp: gi|383858589; 2e-172; 67.67%; PREDICTED: cytosolic non-specific dipeptidase [Megachile rotundata]
NCBI nr blastx: gi|383858589; 4e-170; 67.67%; PREDICTED: cytosolic non-specific dipeptidase [Megachile rotundata]
Group
Gene Ontology: GO:0016787; 6.5e-30; hydrolase activity
Gene Ontology: GO:0008152; 6.5e-30; metabolic process
KEGG pathway:
InterPro domain: [51-426] IPR002933; 6.5e-30; Peptidase M20
InterPro domain: [166-325] IPR011650; 4e-13; Peptidase M20, dimerisation
Orthology group: MCL13266; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204431-TA
ATGGTATACTGGATGAGGGACAAGTTAAAAGAAGTCGGTGCATCGACTGAAATAAGAGATGTAGGTTATCAGAATTTCGATGGCAAGGAAGTAAAACTACCACCGGTTCTGGTTGGCGTTCTTGAAAATGATCCCAAAAAAAATACAATCTGCATTTATGGCCATTTAGATGTCCAACCTGCTCTTAAATCTGATGGCTGGGAATCTGAACCATTTGATTTAGTGGAGCGTGATGGAAAGTTGTTTGGTAGAGGTGCTACAGATGATAAAGGACCAGTACTCGGTTGGCTTCATGCTATCAATGCATACAAAGCTACTGGCGAGGAGCTGCCAGTGAATCTCAAATTCGTATTTGAATGTATGGAAGAATCTGGTTCAGAGGGTCTTGATGAGTTGCTAATGCAGAAATTGAAGCCGGAAGGTTTCTTTGATTCCGTGGACTTTGTCTGTATTTCTGACAACTATTGGCTGGGAACCACTAAACCTTGCATCACTTACGGTCTGAGAGGCATTAGCTATTATTTCTTGGAGGTTGAATGCGCTAAAATGGATCTCCACAGTGGTGTATATGGAGGAACTGTACATGAAGCCATGTCCGATCTCATATACCTTATGAACACTCTGGTTGATAAAGATGGTAAGATCTTAATCACCGACATATACAAGTCGGTAGCACCGCTCACAGATAATGAACAGAAACTGTACAATACAATCGACTTCAACCCAGAGGCCTACAGACAATCAATAAGCGCCCATAAACTGGCCCACAATGGTGTAAAGGAACAACTACTGATGCACCGATGGAGGTATCCAAGCCTGTCACTCCATGGAATTGAAGGCGCTGCCTTCCAGCCTGGTGCGAAGACTGTCATCCCCGGGAAGGTCATTGGCAAATTCTCAATTCGTATCGTCCCTAACCAGGAGCCGGAGGAAGTCGAGAAACTTGTGTTTGACTATGTTCACAAGAAGTGGGAAGAACGCGGGTCTCCCAACAAGATGCGTATAACTGCTCAGTCCGGACGCGCTTGGACCGAGAACCCTGAACATCCACACTACCAGGCCGCTGCTAGAGCCACACGACTCATATACAAGACTGAGCCGGACATGTCTCGTGAGGGTGGATCCATACCAGTGACGATCACGCTCCAAGAGGCCAGCGCCAAGAACGTGCTGCTGCTGCCCATGGGCGCGGGAGACGATATGGCGCACTCACAGAACGAGAAGATCAACGTCCGGAACTATATAGAGGGGATCAAACTCTTCGCTGCATACTTATATGAAGTCGGTAAACTACCTAAATAG

Protein sequence:

>DPOGS204431-PA
MVYWMRDKLKEVGASTEIRDVGYQNFDGKEVKLPPVLVGVLENDPKKNTICIYGHLDVQPALKSDGWESEPFDLVERDGKLFGRGATDDKGPVLGWLHAINAYKATGEELPVNLKFVFECMEESGSEGLDELLMQKLKPEGFFDSVDFVCISDNYWLGTTKPCITYGLRGISYYFLEVECAKMDLHSGVYGGTVHEAMSDLIYLMNTLVDKDGKILITDIYKSVAPLTDNEQKLYNTIDFNPEAYRQSISAHKLAHNGVKEQLLMHRWRYPSLSLHGIEGAAFQPGAKTVIPGKVIGKFSIRIVPNQEPEEVEKLVFDYVHKKWEERGSPNKMRITAQSGRAWTENPEHPHYQAAARATRLIYKTEPDMSREGGSIPVTITLQEASAKNVLLLPMGAGDDMAHSQNEKINVRNYIEGIKLFAAYLYEVGKLPK-
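
Consistency check (illustrative sketch, not part of the original record): the listed transcript length (1302 bp) and protein length (433 aa) agree if the transcript is coding sequence only and ends in a single stop codon. Both are assumptions here. The trailing "-" in the protein sequence presumably marks that stop codon. The function name below is hypothetical.

```python
def cds_length_matches(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if a CDS of transcript_bp bases encodes protein_aa
    residues plus exactly one stop codon (assumes no UTRs are included)."""
    return transcript_bp == 3 * (protein_aa + 1)


# Values taken from the Transcript and Protein fields of this record.
print(cds_length_matches(1302, 433))  # True: 3 * (433 + 1) = 1302
```

A mismatch would suggest untranslated regions in the transcript, a partial gene model, or a length field that was parsed incorrectly.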