Monarch geneset OGS2.0

DPOGS200615
Transcript: DPOGS200615-TA  1095 bp
Protein: DPOGS200615-PA  364 aa
Genomic position: DPSCF300076 - 36024-38647
RNAseq coverage: 184x (Rank: top 49%)
Annotation
Heliconius: HMEL014740  6e-136  65.83%
Bombyx: BGIBMGA008976-TA  2e-170  75.34%
Drosophila: CG3108-PA  7e-73  44.23%
EBI UniRef50: UniRef50_B0WQK4  3e-80  43.44%  Zinc carboxypeptidase n=5 Tax=Endopterygota RepID=B0WQK4_CULQU
NCBI RefSeq: XP_310460.4  9e-82  43.93%  AGAP000621-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|219553192  1e-83  45.94%  molting carboxypeptidase A [Helicoverpa armigera]
NCBI nr blastx: gi|219553192  3e-82  45.94%  molting carboxypeptidase A [Helicoverpa armigera]
Group:
Gene Ontology:
  GO:0006508  2.6e-107  proteolysis
  GO:0008270  2.6e-107  zinc ion binding
  GO:0004181  2.6e-107  metallocarboxypeptidase activity
  GO:0004180  4.7e-06  carboxypeptidase activity
KEGG pathway:
InterPro domain:
  [59-338]  IPR000834  2.6e-107  Peptidase M14, carboxypeptidase A
  [2-46]  IPR003146  4.7e-06  Proteinase inhibitor, carboxypeptidase propeptide
  [1-44]  IPR009020  7.9e-06  Proteinase inhibitor, propeptide
Orthology group: MCL25558  Lepidoptera specific
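
The InterPro entries in this record give domain positions as bracketed ranges on the 364 aa protein. A minimal Python sketch of how such a range could be turned into a length and a subsequence is shown here. It assumes the ranges are 1-based and inclusive, which the record does not state, and the function names `domain_length` and `domain_slice` are hypothetical.

```python
def domain_length(start: int, end: int) -> int:
    """Length of a domain given 1-based inclusive coordinates (assumed convention)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def domain_slice(protein: str, start: int, end: int) -> str:
    """Return the residues covered by a 1-based inclusive range."""
    if end > len(protein):
        raise ValueError("range extends past the end of the protein")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return protein[start - 1:end]


# Ranges as listed in the record.
domains = {
    "IPR000834": (59, 338),
    "IPR003146": (2, 46),
    "IPR009020": (1, 44),
}
for acc, (start, end) in domains.items():
    print(acc, domain_length(start, end), "aa")
```

Under that assumption the Peptidase M14 domain covers 280 of the 364 residues, with the two propeptide entries overlapping at the N-terminus.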

Nucleotide sequence:

>DPOGS200615-TA
ATGGACATTATGGTCGAGAGTCCCCATGCCGCTCAGGTCGCTGGGCTGCTGAACGAAAGAGATATCCCATACAGCATAGCTATCAGTGACGTTAGAACTCTGATTGAAAGGGAACAGGGGAATACTTTAAAAAAGAACCTAAATTCCTCCAAAGGTGCAATGGATTGGAAGAACTATCACCGTCTTGATGTTATTTACTCGTTTATGGATGACTTGGCAGCACAGTACCCATATTTATGTACTGTTAACGTTATTGGCAAGTCGGCGGAGGGAAGAGACTTACGGATGTTAAAAATATCAAATGGCAATAACGAAAATATGGGAGTTTGGTTAGATGGATCCATACATCCCCGCGAGTGGGTGAGCACAGCTGTCGTGACGTACTTCGCTGACCGGCTCGTAAGAAGCTTTCACGAACAACCAGACAGCGTGACTAATAAAGACTGGTATATTCTGCCGGTTTTAAATCCCGATGGTTACGAGTACACACACACACACGACAGAATGTGGCGTAAAAACAGAAATCGTTACGGCGAGTGTGTTGGTGTGGATCTAAACAGAAACTTCAGTTATGGTTGGGGCGAAAAGGGCGAAGAAGGATCATCAGAGGACCCTGGCAATATATTTTATAGAGGTCCAAAACCGTTTTCTGAACCTGAAACTGCTGCTTTGAAGCGCGTCATATTGGATGAATCAGCAAAATTCGAGGTGTTTCTATCGTTCCACAGCTATGGTGAAGTGATAATATTCCCATGGGGTTATACTGCGGATCCATGTCCCGATTACGTAGAGCTTTTGGAAGGGGGAACAGCTATGGCGAAGGCAATCTTCGATACAAGCGGTCATACTTACAAAGTTGGCAGCACAAAGGACCTTATGTACTTCGCTGCCGGGACCAGCACTGACTGGAGCTACGCCGTCGCTAATATAAAATATTCGTACATGATAGAACTGAGAGGTAAGCAGCATAGATTTCTGCTGCCTAAAGAACACATCATAGAAACAGCGACTGAAGTCATGAACGGTGTGTTGAGACTCATGGATTTCGTTGACCGACGATGCAGAAGTACGCAGGCCTGTGTTTGTCCAAAATAA

Protein sequence:

>DPOGS200615-PA
MDIMVESPHAAQVAGLLNERDIPYSIAISDVRTLIEREQGNTLKKNLNSSKGAMDWKNYHRLDVIYSFMDDLAAQYPYLCTVNVIGKSAEGRDLRMLKISNGNNENMGVWLDGSIHPREWVSTAVVTYFADRLVRSFHEQPDSVTNKDWYILPVLNPDGYEYTHTHDRMWRKNRNRYGECVGVDLNRNFSYGWGEKGEEGSSEDPGNIFYRGPKPFSEPETAALKRVILDESAKFEVFLSFHSYGEVIIFPWGYTADPCPDYVELLEGGTAMAKAIFDTSGHTYKVGSTKDLMYFAAGTSTDWSYAVANIKYSYMIELRGKQHRFLLPKEHIIETATEVMNGVLRLMDFVDRRCRSTQACVCPK-
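
The transcript and protein lengths in this record are consistent with each other: 1095 bp is 365 codons, which is 364 residues plus one stop codon, and the protein sequence above marks that stop with a trailing "-". A minimal Python sketch of this cross-check follows. It assumes the standard genetic code (NCBI translation table 1) and uses a hypothetical helper name, `translate`. It is only an illustration, not the pipeline that produced the gene set.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence in frame 1, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # KeyError on ambiguous bases such as N
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First ten and last four codons of DPOGS200615-TA, copied from the record.
print(translate("ATGGACATTATGGTCGAGAGTCCCCATGCC"))  # MDIMVESPHA
print(translate("TGTCCAAAATAA"))                    # CPK-
```

Pasting the full DPOGS200615-TA sequence into `translate` and comparing the result with DPOGS200615-PA would extend the check to the whole record.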