Monarch geneset OGS2.0

DPOGS206910
Transcript: DPOGS206910-TA | 1230 bp
Protein: DPOGS206910-PA | 409 aa
Genomic position: DPSCF300001 | 1570635-1578554
RNAseq coverage: 502x (Rank: top 25%)
Annotation
Heliconius: HMEL009426 | 9e-83 | 74.32%
Bombyx: BGIBMGA012866-TA | 2e-36 | 55.20%
Drosophila: PGRP-LB-PD | 3e-49 | 55.28%
EBI UniRef50: UniRef50_A7BIV1 | 4e-76 | 63.24% | Peptidoglycan recognition protein-D n=4 Tax=Obtectomera RepID=A7BIV1_SAMCR
NCBI RefSeq: XP_969556.1 | 1e-55 | 60.62% | PREDICTED: similar to putative peptidoglycan recognition protein [Tribolium castaneum]
NCBI nr blastp: gi|154240658 | 1e-75 | 63.24% | peptidoglycan recognition protein-D [Samia cynthia ricini]
NCBI nr blastx: gi|315507103 | 1e-75 | 69.73% | peptidoglycan recognition protein D [Ostrinia nubilalis]
Group
Gene Ontology: GO:0008745 | 4.9e-86 | N-acetylmuramoyl-L-alanine amidase activity
GO:0009253 | 4.9e-86 | peptidoglycan catabolic process
GO:0008270 | 2e-65 | zinc ion binding
KEGG pathway:
InterPro domain: [226-387] IPR015510 | 4.9e-86 | Peptidoglycan recognition protein
[224-389] IPR002502 | 3.5e-70 | N-acetylmuramoyl-L-alanine amidase domain
[225-368] IPR006619 | 2e-65 | Peptidoglycan recognition protein family domain, metazoa/bacteria
Orthology group: MCL17883 | Insect specific

Nucleotide sequence:

>DPOGS206910-TA
ATGGCAGCTTGTGGTAGCGAATTATTATTGGTGCTAGTGACGATTAGCGTCTGTTTAGTGGACAGCAATCCAAACGTATACGGTTTAGCTCGAACCCAATCTTACGTGTACTACACGAAAAGCGACTGGGGTGGTCTGCCGTCGACCGACGTTCGGCCGTTGGAGACCCCGGTGCCCTACGTTGTGATCCACCACACGTACATACCAGGAGCTTGCGGCACCCCTGAACAATGTAAAGCAGATATGAGATCAATGCAAAACTATCACATCAGTATGGGCTGGGGAGATATCGGATACAATTTCTGCGTTGGCAGCGACGGCGGCGTTTACGAAGGCCGAGGTTGGGACAACATCGGGATACACGCGGGTCGTGCTAATAATAACAGTATAGGGATTTGTCTCATAGGGGATTGGAGGGTTGAGGATCCACCAGAAGCTATGTTGGAGAGCACTAAAGCTCTAATCAGAACCGGAGTATTAAACGGCAAAGTCAGCACCGCGTACAAGCTGGTGGGTCACAGACAGGTCATGGCGACGGAGTGTCCCGGGAACGCAATAATGACGATTTATATCGAACTCCTAGTTTGTGCGTATATAGTGGATATAGCGAGCAGCGGCGCTGTATTCCGTGGGCAAGATGCTGATGATGACAACGAGGTGTCAAGTTACGACTTTCCCTACGTGACCCGTTCCATGTGGCACGCTAGACCTCCAAAAGAAAAGATACCCTTGCAATCACCAGTACCATATGTAGTCATACATCACTCCTATTCGCCACCGGCCTGCTATGACGGTGTAACCTGCCGTCAAGCTATGAGGTCCATGCAGAACTTCCACATGGACTCTAGAGGCTGGTGGGACATAGGTTACAATTTCGCCGTCGGCAGCGATGGAGCAGCGTACGAAGGCAGAGGATGGACCGTGCTAGGAGCGCACGCTCTCCACTTCAATAACATTAGCATTGGAATATGTCTGATTGGGGATTGGAGATTCTTAGTGCCGCCTTCAAATCAACTGAAGTCAGCCAAAGCTCTAATAAATGCGGGAGTAGAACTGGGATACATCAAAAGCGACTATAAGCTTGTTGGTCATAGACAAGTCCGGGAGACTGAATGCCCTGGTGACGCCCTGTTCCACGAGATACAAACTTGGGACCATTGGTCATCCTTCCCCGCCTCTTATAAGGATTTAGATAAAATTGACTTGACTGGATCTCAGAATAAAAGTTAA

Protein sequence:

>DPOGS206910-PA
MAACGSELLLVLVTISVCLVDSNPNVYGLARTQSYVYYTKSDWGGLPSTDVRPLETPVPYVVIHHTYIPGACGTPEQCKADMRSMQNYHISMGWGDIGYNFCVGSDGGVYEGRGWDNIGIHAGRANNNSIGICLIGDWRVEDPPEAMLESTKALIRTGVLNGKVSTAYKLVGHRQVMATECPGNAIMTIYIELLVCAYIVDIASSGAVFRGQDADDDNEVSSYDFPYVTRSMWHARPPKEKIPLQSPVPYVVIHHSYSPPACYDGVTCRQAMRSMQNFHMDSRGWWDIGYNFAVGSDGAAYEGRGWTVLGAHALHFNNISIGICLIGDWRFLVPPSNQLKSAKALINAGVELGYIKSDYKLVGHRQVRETECPGDALFHEIQTWDHWSSFPASYKDLDKIDLTGSQNKS-
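
Consistency check:

The transcript length (1230 bp) and protein length (409 aa) listed above can be cross-checked with a little arithmetic: 1230 / 3 = 410 codons, and one of them is the stop codon (shown as the trailing "-" in the protein sequence). The sketch below is only an illustration of that check, not part of the geneset pipeline; the function names are hypothetical, and it assumes the transcript record is a complete coding sequence that begins with ATG and ends with a single stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def expected_protein_length(cds_bp: int, has_stop_codon: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases.

    Assumes the CDS is in frame; the terminal stop codon, if present,
    does not contribute an amino acid.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if has_stop_codon else codons


def looks_like_complete_cds(first_codon: str, last_codon: str) -> bool:
    """Check that a CDS starts with ATG and ends with a stop codon."""
    return first_codon.upper() == "ATG" and last_codon.upper() in STOP_CODONS


# Values copied from the record above.
transcript_bp = 1230   # Transcript: DPOGS206910-TA
protein_aa = 409       # Protein: DPOGS206910-PA

# First and last three bases of the nucleotide sequence above.
print(looks_like_complete_cds("ATG", "TAA"))
print(expected_protein_length(transcript_bp) == protein_aa)
```

Both checks should hold for this record if the assumptions above are met; a mismatch would suggest a partial CDS or a transcript that includes untranslated sequence.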