Monarch geneset OGS2.0

DPOGS204049
Transcript: DPOGS204049-TA, 3579 bp
Protein: DPOGS204049-PA, 1192 aa
Genomic position: DPSCF300138 + 463683-481049
RNAseq coverage: 97x (Rank: top 61%)
Annotation
Heliconius:      HMEL011614        0.0     33.10%
Bombyx:          BGIBMGA004800-TA  2e-130  62.10%
Drosophila:      CG17633-PA        6e-63   32.38%
EBI UniRef50:    UniRef50_E9GAI5   2e-76   26.61%  Putative uncharacterized protein n=2 Tax=Branchiopoda RepID=E9GAI5_DAPPU
NCBI RefSeq:     XP_001606826.1    5e-80   38.38%  PREDICTED: similar to putative carboxypeptidase A-like [Nasonia vitripennis]
NCBI nr blastp:  gi|322794387      5e-94   30.72%  hypothetical protein SINV_02946 [Solenopsis invicta]
NCBI nr blastx:  gi|322794387      5e-95   30.72%  hypothetical protein SINV_02946 [Solenopsis invicta]
Group
Gene Ontology:
  GO:0006508  5.1e-101  proteolysis
  GO:0008270  5.1e-101  zinc ion binding
  GO:0004181  5.1e-101  metallocarboxypeptidase activity
  GO:0004180  4.8e-19   carboxypeptidase activity
KEGG pathway:
InterPro domain:
  [475-754]  IPR000834  5.1e-101  Peptidase M14, carboxypeptidase A
  [800-891]  IPR009020  1.5e-21   Proteinase inhibitor, propeptide
  [809-881]  IPR003146  4.8e-19   Proteinase inhibitor, carboxypeptidase propeptide
Orthology group: MCL10178, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204049-TA
ATGTTTAGGGACATCCAAATTTTAGTTCCACGGGAACATTTAAATGTTTTTAAAGAGAGGCTAGGCCATTTCCACCTGAACGCAACAATAACAACAAGCAATATAGAACAGTTATTTCAACAGCAAAAGTTTAAGAGGTATACACGAATGAAATTTGAATCATTTTCTTGGAATTCCTATTATGATCTAGAAAGTGTTTACCAATGGATGACAGATATTGCAACTAAGAATAGTGATAGGGTGAAATTAAAGGCCATCGGCAAGTCTGCAGAAGGTAGAGAGATACTAACGCTTGAGATAGTAACAGAAAATCCTAAAGGAAAAGTTATAGTCGAAGGTGCGATACACGGAAATGAATGGCTTACAACTCAATTTGTTACTTATCTTGCATTCTTTCTGGTATATCCTGAGAAATCTTTTAATTGGCGTTTAAAACAAGTGGCAAAAAAATATACCTGGATCCTCATTCCGGTTGTCAACCCGGATGGCTATGATTACAGCATGAAAGTGGATCGTTTGTGGCGCAAAAATAGAAGAAAAACCTCCAATGCAACGATAGGGGTTGATTTAAATAAAAATTTCAAATACAGGTTTTGCGAGTATGGTGGAAGTAAAGATGTATCAAGTGCATACTACTGTGGACCACAAGCTTTCTCAGAACCCGAGACCCTGGCGATGTCAAAATTTATACATTTTAATAGACATCAGTTAAACTTCTATTTTGCGTTTCACGCCTATGGACAAAAGATAATTATTCCATATTCTGATAGAGTGCAACACATCGAAAATTTTGCTGAAATGGAAAATTATGGGAAACAAGCAATATTGAAAATGTACAAATTGTATGGTATTAAATATGGGATTGGCACTATTTACGACGTATACGGTTACAAATCTTCAGGAGATAGTATATCTTGGGTAAAGAAGACATACGGAGTAAAGTACTCTCTCTCATTCTACCTACGTGACAACGGAACTTATGGATATGCTTTACCTCCGGATAATATACTGCCCGCATGCAAAGAAACCTTGACCGGCTTAATGGAACTTATGACGGTTCGACACAGGAGTGTCGAAAGAGAACTTTTTAGTTCCACCATCTTCTTGTACATAAATCCATGCACATCTAAAAGCTACAAAAACTACACATTGTTTAGAGGTGTGCCCGTGACAGACCATCACTTAGAATTCTTCAAGAATTTAAGTGATGTTTACAAGGCCACATTTTGGAGGTCACCAGGTTTAGTCCATAGGCCAGTAGAATTCATTATTGGTCCAAATAAAAAGAGAATTTTTTTGAGAGATGCCATGTTAAAAGGTATTTATTACACAACGGTGATAGAAGATGTGCAAAGGGCCTTTGATTCTCAAACTGTAAAGACATATGTGCGTCGCAATATGGAATCCTTTGACTGGACCAGCTATTTTAGATTGGATGATATCTATGACTGGCTTCAAGATCTAAGCGTCATGTATCCAAAAGTGATGCATCTACAAAACTTGGGGAAAAGCGTTGAAAAGAGAGATATTCTAATGGCTAAAATAACCCTTCCCGTTCGTAAAAAACGTTCGAGACCTAAGATAATAGTTGAAGGAGGTATCCACTCTCGAGAATGGGTATCCATAGCCTTTGTTACATATTTTCTCCACCAAGTTCTAACTACAGTTGACAAAAAGGAATCCAAGTTAAAATCCATTGCCGAGGAATACGAATGGTATTTTATACCAGTTCTCAATCCAGACGGCTACGAATATACTCACAAGAAGGATCGGATGTATAGAAAAAATATGAAGGGCGTAGATTTGAACAGAAACTTTGACATGCATTTTGGTTCTGTTGGGACAAGTTCAAGAAAACAAGACGAAACCTATGGTGGTCCCAAAGCTTTTTCTGAACCCGAAACTTTGGCGCTGGCCAATTTTGTGAAAGCTAACAGTAAAAATTTGAAGTTCTATTTAGCTTTTCATTCATATGGCCAATATATGATCATACCGTATGC
TTATTCAAAGAAACATGAAGGAAATTTTGATGAAGTGCATGAAAGGGGTATCAGAGCTGCAAAGAGAATTTCCAAAAAATATGGCACTCAATACACAGTAGGAACAGCATACGATACAGTGGGTTATGTTACGTCGGGTGTAAGCGGTTGTTGGGTTAAAAAAACTTTTTCAGTTCCCTACGTACTTACGTTTGAATTGAGAGACGACGGCCGATATGGGTTTGCGCTTCCCCCTAATCAAATCCTTCCTACTTGTTGGGAAACTATGGATGGCTTGTTATCACTGCTGGATTTTAAAACGGACAAAATTGGATTCAATTTACATTCGTTCAAATACGTTGTAACAATGTTGTTTAACTTGCTTTTTATTTGTGTTCTTAGTATTGTTAATGCAGAAAAAGTAAGATATGATGATTATTCGTTATATAAAGTGAATCCCGAAGATAATGACCAACTAGAGTTCTTGAAGCAATTGCAAAATTCTGAAGGATTAGATTTCTGGGTACCACCAGCAAGAGTTGGAGATGACATCAATGTAATTGCTGCTCCACAAAGCAAAGATGAATTCGAACATTCCTTAAAAAAAAGAAATATATACCACGATGTGCTTTTTAATAATTTACAAGAAGTCTTCGATAGCCAAGTTTTGAGTAGAAGAAAACGTTCAACAAGAGATCTGTCATGGACTAGATATCAAGATATAGATGATATTTACGAATGGTTTGAGCAACTAGCAAATAATTATTCATTTGTGTCCCTCATTTACGCAGGACAATCCTTTGAAGGCAGAAACATAACAGGAGTAAGAATAAATCGTGGTTCTCAGAGAGCCATTTTCGTTGAAGGTGGTCAAATAGGAGCTGATTGGTTGTCACCAGTTGTTACTACTTATTTAGTAAATCAATTAGTACGAGGTGTAGACAGTGAAGCGCGAGATGCTAGTTCTGATTTCGATTGGCATATTTTTCCAATTCTTAACCCTGATGGACATAAATACACACAGGATAAGGATAGGCTCTGGATTAAGAATCGTAGAATCAACAGAAATGGAACAATTGGAGTTGACCTAAGTAGAAACTGGAATTCTCTTTGGGGAGTAAGTGGTGGCAGTTTTAATGACTCACATAGTAACTACGTAGGTCTGGGACCTTTCTCAGAAAAAGAATCAAGAGCGATTTCATACTACATTGACTCCATTGCACCCCGACTTAAATTTGTTCTTTCGATGAGAAGCTTCGGACAACGGCTGCTTTTACCCTTTGCGCATTCCACTGAACCTTTGTATAATTATAATGATACGATCATTGTAGGCAGGAGAGCTATGGGATCTATGGCAGTTAAATATAATACCCAATACATTGTTGGTACTTCAAAAGAAGTTCATGATGGCTCAACTGGTTCTCTGGCAGATTGGGTAAAACATCGTTACTCTGTGCCTTTTGTTGCCACCTACCTTCTCAGAGACAATGGAACTTCAGGTTATGCCTTACCTGTCAGTCAAGTTCTGCCATCTTGCGAAGAGACCTATGATTCAATTATGGCTATTCTTCGTGAAGCAAAATTTATCCGTTTGATATAA

Protein sequence:

>DPOGS204049-PA
MFRDIQILVPREHLNVFKERLGHFHLNATITTSNIEQLFQQQKFKRYTRMKFESFSWNSYYDLESVYQWMTDIATKNSDRVKLKAIGKSAEGREILTLEIVTENPKGKVIVEGAIHGNEWLTTQFVTYLAFFLVYPEKSFNWRLKQVAKKYTWILIPVVNPDGYDYSMKVDRLWRKNRRKTSNATIGVDLNKNFKYRFCEYGGSKDVSSAYYCGPQAFSEPETLAMSKFIHFNRHQLNFYFAFHAYGQKIIIPYSDRVQHIENFAEMENYGKQAILKMYKLYGIKYGIGTIYDVYGYKSSGDSISWVKKTYGVKYSLSFYLRDNGTYGYALPPDNILPACKETLTGLMELMTVRHRSVERELFSSTIFLYINPCTSKSYKNYTLFRGVPVTDHHLEFFKNLSDVYKATFWRSPGLVHRPVEFIIGPNKKRIFLRDAMLKGIYYTTVIEDVQRAFDSQTVKTYVRRNMESFDWTSYFRLDDIYDWLQDLSVMYPKVMHLQNLGKSVEKRDILMAKITLPVRKKRSRPKIIVEGGIHSREWVSIAFVTYFLHQVLTTVDKKESKLKSIAEEYEWYFIPVLNPDGYEYTHKKDRMYRKNMKGVDLNRNFDMHFGSVGTSSRKQDETYGGPKAFSEPETLALANFVKANSKNLKFYLAFHSYGQYMIIPYAYSKKHEGNFDEVHERGIRAAKRISKKYGTQYTVGTAYDTVGYVTSGVSGCWVKKTFSVPYVLTFELRDDGRYGFALPPNQILPTCWETMDGLLSLLDFKTDKIGFNLHSFKYVVTMLFNLLFICVLSIVNAEKVRYDDYSLYKVNPEDNDQLEFLKQLQNSEGLDFWVPPARVGDDINVIAAPQSKDEFEHSLKKRNIYHDVLFNNLQEVFDSQVLSRRKRSTRDLSWTRYQDIDDIYEWFEQLANNYSFVSLIYAGQSFEGRNITGVRINRGSQRAIFVEGGQIGADWLSPVVTTYLVNQLVRGVDSEARDASSDFDWHIFPILNPDGHKYTQDKDRLWIKNRRINRNGTIGVDLSRNWNSLWGVSGGSFNDSHSNYVGLGPFSEKESRAISYYIDSIAPRLKFVLSMRSFGQRLLLPFAHSTEPLYNYNDTIIVGRRAMGSMAVKYNTQYIVGTSKEVHDGSTGSLADWVKHRYSVPFVATYLLRDNGTSGYALPVSQVLPSCEETYDSIMAILREAKFIRLI-
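
Consistency check (illustrative sketch, not part of the original record): the lengths and coordinates reported above can be cross-checked with a few lines of Python. The function names are hypothetical, and the sketch assumes that the transcript consists only of coding sequence plus one stop codon and that the genomic coordinates are 1-based and inclusive; the record does not state either convention.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3579
PROTEIN_AA = 1192
GENOMIC_START, GENOMIC_END = 463683, 481049
DOMAINS = {
    "IPR000834": (475, 754),
    "IPR009020": (800, 891),
    "IPR003146": (809, 881),
}


def cds_length_matches(transcript_bp, protein_aa):
    """True if the transcript is exactly the codons of the protein plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


def genomic_span(start, end):
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


def domains_in_range(domains, protein_aa):
    """True if every domain interval lies within the protein."""
    return all(1 <= s <= e <= protein_aa for s, e in domains.values())


if __name__ == "__main__":
    print("CDS length consistent:", cds_length_matches(TRANSCRIPT_BP, PROTEIN_AA))
    print("Genomic span (bp):", genomic_span(GENOMIC_START, GENOMIC_END))
    print("Domains within protein:", domains_in_range(DOMAINS, PROTEIN_AA))
```

Under these assumptions, the 3579 bp transcript corresponds to 1192 codons plus a stop codon, and the genomic interval is considerably longer than the transcript, which would be consistent with the gene containing introns.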