Monarch geneset OGS2.0

DPOGS203390
Transcript        DPOGS203390-TA   1224 bp
Protein           DPOGS203390-PA   407 aa
Genomic position  DPSCF300003 + 656122-663021
RNAseq coverage   215x (Rank: top 45%)
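
The lengths listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the database: the function names are hypothetical, and it assumes that the transcript is the complete coding sequence including its stop codon and that the genomic coordinates are 1-based and inclusive.

```python
def cds_matches_protein(transcript_bp: int, protein_aa: int) -> bool:
    """Return True when the CDS length equals 3 * (residues + 1 stop codon)."""
    return transcript_bp % 3 == 0 and transcript_bp // 3 == protein_aa + 1


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


# Values taken from the record: 1224 bp transcript, 407 aa protein.
print(cds_matches_protein(1224, 407))   # True: 1224 / 3 = 408 = 407 + 1
# Genomic span on scaffold DPSCF300003.
print(span_length(656122, 663021))      # 6900
```

Under these assumptions the genomic span (6900 bp) is considerably longer than the 1224 bp transcript, which would be consistent with the gene containing introns.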
Annotation
Source           Hit                E-value  Identity  Description
Heliconius       HMEL005722         0.0      91.56%
Bombyx           BGIBMGA011789-TA   2e-120   52.79%
Drosophila       CG42750-PB         5e-132   55.15%
EBI UniRef50     UniRef50_Q9VJ94    1e-129   55.15%    CG42400 n=20 Tax=Endopterygota RepID=Q9VJ94_DROME
NCBI RefSeq      XP_393772.2        9e-147   64.63%    PREDICTED: similar to CG6154-PA, isoform A [Apis mellifera]
NCBI nr blastp   gi|328776105       1e-145   64.63%    PREDICTED: dipeptidase 1-like [Apis mellifera]
NCBI nr blastx   gi|307202636       2e-139   62.83%    Dipeptidase 2 [Harpegnathos saltator]
Group
Gene Ontology    GO:0006508   1.3e-210   proteolysis
                 GO:0016805   1.3e-210   dipeptidase activity
                 GO:0008239   1.3e-210   dipeptidyl-peptidase activity
                 GO:0008235   1.3e-210   metalloexopeptidase activity
KEGG pathway
InterPro domain  [20-386] IPR008257   1.3e-210   Peptidase M19, renal dipeptidase
Orthology group  MCL12028   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203390-TA
ATGGTGAATCGCAATGAGAGACGTGCTGTGGCCTGCATATTGCTGACGGTGATCGCCGTCGTTGCGGCCGCTTCATATGATCGCGAACGACTTGAGATTGCGAAGCAAATTCTCGAGGAAGTACCTCTCACCGACGGACACAACGATTTGCCCTGGAACATCCGGAAGTTTCTTCGCAATCAAATCAACGATTTCGAACTGGACACCGATTTAACCCAAGTGGAACCTTGGTCCAAATCAAAATATTCACATACCGATCTTCCTAGACTTAGACAGGGCATGGTCGGAGCTCAGTTCTGGTCAGCCTTCGTACCGTGTGCAGCTCAAAATAAGGACGCTGTTCAGTTGACCCTTGAACAGATCGATGTCATTCGTCGCCTGGTAGCCAAATATCCTCACCAGTTTCAACTCGCTACGTCTGTTAGTGATATCCTCGAAGCTCATAGTGCTAGACCTCGTAAAATCGCTTCTTTGATCGGCATTGAAGGTGGACACTCTATTGGCAACTCCTTAGGCATTCTTCGCAGCTACTATCAACTCGGAGTACGCTACATGACTCTAACCCATACATGCAACACTCCATGGGCTGATTCTGCCAACGAAGCACCAGTCGCTAACGGACTCACGGAATTTGGAGAGAAAGTTGTCCGTGAGATGAACCGTCTTGGCATGCTGATTGATTTATCTCACGTGGGAGAGAACACTACTAGAGCAGCCATACGTCTCTCGAAAGCACCGGTCATTTTCAGTCATTCTTCAGTCTACAGTTTATGTCCTCACAAACGAAATGTCCCCGATGACATCATACAATCCCTGAAAGTTAATGGTGGAATTATCATGGTTAACTTTTTTCCTGATTTTGTGAAATGTGCGCCAAACGCTACCATATCCGATGTTGCTGAACATTTCCATTACCTGAAGAGGATGATCGGAGCTGATTATGTTGGAGTTGGCGGTGACTTCGACGGCGTTAATAGAGTTCCCCGCGGCTTGGAAGACGTTTCCAAATATCCCGAATTGTTTGCTGAATTACTGCGAAGTGGTCAGTGGAGTGTTCAGGAACTGAAGAACCTTGCCGGCTTGAATATACTACGAGTTATGCGCCAAGTTGAAAAGATCCGTGACGACATGCGAACCAATGGCTCCGAGCCTGAGGAACACCCCGATTCTCCTAACGACAACGGCAGCTGCACCAGCAATGCTTTCTATTCAGACGACGTTTAA

Protein sequence:

>DPOGS203390-PA
MVNRNERRAVACILLTVIAVVAAASYDRERLEIAKQILEEVPLTDGHNDLPWNIRKFLRNQINDFELDTDLTQVEPWSKSKYSHTDLPRLRQGMVGAQFWSAFVPCAAQNKDAVQLTLEQIDVIRRLVAKYPHQFQLATSVSDILEAHSARPRKIASLIGIEGGHSIGNSLGILRSYYQLGVRYMTLTHTCNTPWADSANEAPVANGLTEFGEKVVREMNRLGMLIDLSHVGENTTRAAIRLSKAPVIFSHSSVYSLCPHKRNVPDDIIQSLKVNGGIIMVNFFPDFVKCAPNATISDVAEHFHYLKRMIGADYVGVGGDFDGVNRVPRGLEDVSKYPELFAELLRSGQWSVQELKNLAGLNILRVMRQVEKIRDDMRTNGSEPEEHPDSPNDNGSCTSNAFYSDDV-
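
The protein sequence above corresponds to a translation of the transcript. The sketch below illustrates that relationship on short excerpts copied from the start and end of DPOGS203390-TA. It assumes the standard genetic code and writes the stop codon as "-", matching the trailing "-" in this record. The `translate` helper is written for illustration and is not taken from any database tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon, keeping stop codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons of the transcript.
print(translate("ATGGTGAATCGCAATGAG"))  # MVNRNE
# Last three codons of the transcript, ending in the TAA stop codon.
print(translate("GACGTTTAA"))           # DV-
```

Both excerpts agree with the corresponding ends of DPOGS203390-PA, which begins MVNRNE and ends DV-.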