Monarch geneset OGS2.0

DPOGS207580
Transcript        DPOGS207580-TA   3177 bp
Protein           DPOGS207580-PA   1058 aa
Genomic position  DPSCF300072 + 723542-732661
RNAseq coverage   521x (Rank: top 24%)
Annotation
Heliconius      HMEL017146         0.0  83.61%
Bombyx          BGIBMGA004688-TA   0.0  85.74%
Drosophila      DppIII-PC          0.0  59.04%
EBI UniRef50    UniRef50_Q17GV2    0.0  62.46%  Dipeptidyl peptidase iii n=8 Tax=Coelomata RepID=Q17GV2_AEDAE
NCBI RefSeq     XP_001601820.1     0.0  62.42%  PREDICTED: similar to dipeptidyl peptidase iii [Nasonia vitripennis]
NCBI nr blastp  gi|156551692       0.0  62.42%  PREDICTED: dipeptidyl peptidase 3-like [Nasonia vitripennis]
NCBI nr blastx  gi|156551692       0.0  62.42%  PREDICTED: dipeptidyl peptidase 3-like [Nasonia vitripennis]
Group
Gene Ontology    GO:0006508  5.9e-185  proteolysis
                 GO:0005737  5.9e-185  cytoplasm
                 GO:0008239  5.9e-185  dipeptidyl-peptidase activity
KEGG pathway
InterPro domain  [1-618]  IPR005317  0  Peptidase M49, dipeptidyl-peptidase III
Orthology group  MCL13483  Single-copy universal gene

Nucleotide sequence:

>DPOGS207580-TA
ATGGAGGATAAGTCGATCTTTCTTTTGCCAAACAGTCAGAAATTTGTTGAACTAGATAGTTCACAGGCATTTACAAAATTAACTAAACAAGAAAAGTTGTATGCTCACTATTTGAGTCAGGCTGCTTGGAATGGTGGTTTAATTGTTCTTCTACAAACAAGTCCAGAATCACCAAGGATTTTTTCACTTTTGCACAGAATTTTTAAATCAGAAGGATTAGCTGATTTAAAAAAAGTTTCCCTTGGAGCTGGTGTATCCGAGGATGATTTTCAGGCCTTCTTAGTTTATGCGGGTGGATTATTTGCTAACAGCGGTAATTACAAAGGCTTTGGTGATACAAAATTCATTCCTAACTTGCCAAAAGAATGTTTTGAAGTTATCGTTAAATCATCAAAAGCATTTAAAAATGATGAAGCACATATAAGTAAACTGTGGGAGAACACTAAAAATGCTATGTACAGTACTGCACCCAGATTAGCCAGCCTTGGTTTAGCCGATAAGGGTATAACAACATATTTCTCAAGTAACTGTACAGAGGCGGACTCGACTCTTGTAAATGACTGGATGAAAACAAAACGCATTGAAGCATACATTTGTAGAACTTTCAAGACAACCGCTGACGATGGATTACCTTTGTATACGATACACCTGGCCAGTGTCGAGAAAAGCTCAAAGCCGCCCCTTACTATGGATAAAGAAAAATACAAAAATGCGTACTTCCAAGTGACTCGGGGAGATTATTCGCCATTATTGAGTTTGGTCAACGAAAATCTTGCAAAAGCTATGGAGTATGCGGCAAATGAGAATGAAAAGAATATGATTAAACATTACATTAACAGTTTTAAAGAGGGAGATTTAAGTGAACATAAAGAAGGCAGCAGGTTCTGGGTGAAGGACAAAGGACCGATTATAGAGACATATCAAGGCTTCATAGAGACATACCGCGATCCCAGCGGACAAAGAGGTGAATTCGAGGGTTTTGTGGCTATGGTCAATAAAGATATGTCAAAAAAGTTTGGGGAACTCGTCCATGGTGCTGAAAACTTCATAAAGCTGTTACCGTGGGGGGAGGGGCTTGAGAAGGATTCCTTCCTCCGACCGGACTTCACTAGTCTAGACGTACTGACGTTCTCAGGGAGCGGTATACCAGCCGGAATTAACATACCTAACTATGATGAGATCCGACAAAATGAAGGCTTTAAGAACGTGTCCCTGGGTAACGTGTTCCCCGCCGCTTATAAGGAGTCCGTTATACCATTCCTCTCTGATAGTGATAAAGTTCTTTTAGAAAAATACAGGGTTGCTGCATTTGAGGTTCAAGTAGGACTTCATGAACTGCTGGGTCATGGCAGCGGGAAGCTTCTCAGACAAAACGCAGACGGGACATTCAACTTCGACAAGGAGAAAGTTAAAAATCCTCTAACTGGCAAGGAGATCGAGTCGTGGTATTCAGAAGGCGAGAATTACGACAGCAAGTTCACCACTTTGGGATCCGCCTTCGAGGAATGCCGGGCGGAGGCTGTTGGATTGTATCTGTCGTTACGACCTGAGATACTCAAAATCTTCGGTTACGAGGGTCAGGAAGCAGAGGACGTGATGTACGTCAACTGGCTCAGTCTACTGTGGAACGGAGCCGCCAAGGCCACGGAAATGTACCAGCCGGCTACGAAAACGTGGCTACAGGCCCACGCGAGAGCTCGTTTTGTTTTAATGAGACTGTTGGAATTGGAAGGTAACGGAATACTAACAGTCACCGAGGTTGATCCCGGCAAGAACCTGTTGCTTACTTTAGACAGGAAACGTTTGGCTACTGACGGAAAACGAATTGTCGTTCTCGAGCTCGCCGAGCTGCGATACGAGTTGGCTGTTCGCGCCGAACTACTTCAACCTTTCTTTTTTGTTTTGGATGTAATTTTTTTAAATGCTGTAATAAGATATCAAGGCTTCATAGAGACATACCGCGATCCCAGCGGACAAAGAGGTGAATTCGAGGGTTTTGT
GGCTATGGTCAATAAAGATATGTCAAAAAAGTTTGGGGAACTCGTCCATGGTGCTGAAAACTTCATAAAGCTGTTACCGTGGGGGGAGGGGCTTGAGAAGGATTCCTTCCTCCGACCGGACTTCACTAGTCTAGACGTACTGACGTTCTCAGGGAGCGGTATACCAGCCGGAATTAACATACCTAACTATGATGAGATCCGACAAAATGAAGGCTTTAAGAACGTGTCCCTGGGTAACGTGTTCCCCGCCGCTTATAAGGAGTCCGTTATACCATTCCTCTCTGATAGTGATAAAGTTCTTTTAGAAAAATACAGGGTTGCTGCATTTGAGGTTCAAGTAGGACTTCATGAACTGCTGGGTCATGGCAGCGGGAAGCTTCTCAGACAAAACGCAGACGGGACATTCAACTTCGACAAGGAGAAAGTTAAAAATCCTCTAACTGGCAAGGAGATCGAGTCGTGGTATTCAGAAGGCGAGAATTACGACAGCAAGTTCACCACTTTGGGATCCGCCTTCGAGGAATGCCGGGCGGAGGCTGTTGGATTGTATCTGTCGTTACGACCTGAGATACTCAAAATCTTCGGTTACGAGGGTCAGGAAGCAGAGGACGTGATGTACGTCAACTGGCTCAGTCTACTGTGGAACGGAGCCGCCAAGGCCACGGAAATGTACCAGCCGGCTACGAAAACGTGGCTACAGGCCCACGCGAGAGCTCGTTTTGTTTTAATGAGACTGTTGGAATTGGAAGGTAACGGAATACTAACAGTCACCGAGGTTGATCCCGGCAAGAACCTGTTGCTTACTTTAGACAGGAAACGTTTGGCTACTGACGGAAAACGAATTGTCGGCGACTTCTTAGTAAAGCTGCAGACTATCAAATCTACTGGCGACGTGTCGTCGGGCGAACAGTTGTTCACTCGACTCAGCAGCTTAGAGGAACCCTGGCTGAGGTGGAGGGACATCGTCATGATGCACAAACAGCCACGGAATATATTCGTACAACCCAACACGGTTCTCAAAGATGATGACGTTGTTTTGAAACGCTACGAGGCAAGTGCTTCAGGGATGGTGACGTCATCTGTGGAGCGATACACGCTGGCTATAGACGACGCGCTCGAGTCCCTCGCGGCACAAGACCAACAGTACTTTGAAGAACTCAGCAAACTAGCCATCTGA

Protein sequence:

>DPOGS207580-PA
MEDKSIFLLPNSQKFVELDSSQAFTKLTKQEKLYAHYLSQAAWNGGLIVLLQTSPESPRIFSLLHRIFKSEGLADLKKVSLGAGVSEDDFQAFLVYAGGLFANSGNYKGFGDTKFIPNLPKECFEVIVKSSKAFKNDEAHISKLWENTKNAMYSTAPRLASLGLADKGITTYFSSNCTEADSTLVNDWMKTKRIEAYICRTFKTTADDGLPLYTIHLASVEKSSKPPLTMDKEKYKNAYFQVTRGDYSPLLSLVNENLAKAMEYAANENEKNMIKHYINSFKEGDLSEHKEGSRFWVKDKGPIIETYQGFIETYRDPSGQRGEFEGFVAMVNKDMSKKFGELVHGAENFIKLLPWGEGLEKDSFLRPDFTSLDVLTFSGSGIPAGINIPNYDEIRQNEGFKNVSLGNVFPAAYKESVIPFLSDSDKVLLEKYRVAAFEVQVGLHELLGHGSGKLLRQNADGTFNFDKEKVKNPLTGKEIESWYSEGENYDSKFTTLGSAFEECRAEAVGLYLSLRPEILKIFGYEGQEAEDVMYVNWLSLLWNGAAKATEMYQPATKTWLQAHARARFVLMRLLELEGNGILTVTEVDPGKNLLLTLDRKRLATDGKRIVVLELAELRYELAVRAELLQPFFFVLDVIFLNAVIRYQGFIETYRDPSGQRGEFEGFVAMVNKDMSKKFGELVHGAENFIKLLPWGEGLEKDSFLRPDFTSLDVLTFSGSGIPAGINIPNYDEIRQNEGFKNVSLGNVFPAAYKESVIPFLSDSDKVLLEKYRVAAFEVQVGLHELLGHGSGKLLRQNADGTFNFDKEKVKNPLTGKEIESWYSEGENYDSKFTTLGSAFEECRAEAVGLYLSLRPEILKIFGYEGQEAEDVMYVNWLSLLWNGAAKATEMYQPATKTWLQAHARARFVLMRLLELEGNGILTVTEVDPGKNLLLTLDRKRLATDGKRIVGDFLVKLQTIKSTGDVSSGEQLFTRLSSLEEPWLRWRDIVMMHKQPRNIFVQPNTVLKDDDVVLKRYEASASGMVTSSVERYTLAIDDALESLAAQDQQYFEELSKLAI-
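
The lengths listed for this record agree with each other if the transcript is read as a coding sequence of 1058 codons plus one stop codon: 3 x (1058 + 1) = 3177. The sketch below shows one way such a check might be scripted for FASTA records like the two above. It assumes the transcript is coding sequence only and that a trailing "-" on the protein marks the stop codon. The helper names (`read_fasta`, `lengths_consistent`) and the toy record are hypothetical and are not part of the database.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def lengths_consistent(transcript, protein):
    """Return True if the transcript holds one codon per residue plus a stop codon.

    Assumption: a trailing '-' on the protein marks the stop and is not a residue.
    """
    residues = len(protein.rstrip("-"))
    return len(transcript) == 3 * (residues + 1)


# Toy record (hypothetical, not the real gene): ATG GAG GAT TGA encodes M E D + stop.
toy = """>toy-TA
ATGGAG
GATTGA
>toy-PA
MED-
"""

records = read_fasta(toy)
print(lengths_consistent(records["toy-TA"], records["toy-PA"]))
```

Running the same two functions on the full text of the DPOGS207580-TA and DPOGS207580-PA records would test whether the sequences as printed match the listed 3177 bp and 1058 aa. The parser joins wrapped lines, so a sequence split across several lines is handled as a single record.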