Monarch geneset OGS2.0

DPOGS200105
Transcript: DPOGS200105-TA | 1290 bp
Protein: DPOGS200105-PA | 387 aa
Genomic position: DPSCF300044 + 417076-537884
RNAseq coverage: 469x (Rank: top 26%)
Annotation
Heliconius: HMEL004318 | 2e-43 | 96.47%
Bombyx: BGIBMGA004557-TA | 4e-119 | 93.61%
Drosophila: Pur-alpha-PB | 5e-106 | 80.09%
EBI UniRef50: UniRef50_Q9V4D9 | 7e-104 | 80.09% | Purine-rich binding protein-alpha, isoform B n=38 Tax=Pancrustacea RepID=Q9V4D9_DROME
NCBI RefSeq: XP_001845827.1 | 1e-110 | 88.52% | transcriptional activator protein Pur-alpha [Culex quinquefasciatus]
NCBI nr blastp: gi|170035946 | 2e-109 | 88.52% | transcriptional activator protein Pur-alpha [Culex quinquefasciatus]
NCBI nr blastx: gi|170035946 | 9e-106 | 87.26% | transcriptional activator protein Pur-alpha [Culex quinquefasciatus]
Group
KEGG pathway 
InterPro domain: [1-207] IPR006628 | 1.7e-127 | PUR-alpha/beta/gamma, DNA/RNA-binding
Orthology group: MCL15635 | Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200105-TA
ATGCTCCAAATACAGTCCAAGAGGTTTTATCTAGATGTGAAGCAGAATAGGAGAGGACGTTTTATAAAAGTTGCAGAGATCGGCGCTGACGGGCGGCGCAGCCAAATATTCCTGGCCATGTCGACCGCGGCCGAATTCAGGGACCACTTGTCGGCGTTCAGCGATTTCTACTCGTCGCTGGGCCCGCCGAACCCGGACAACGTGCCCGACGACGGAAAACTCAAATCAGAGATGATGCTTAAGGACAACAGACGCTACTACTTAGATTTAAAAGAGAACTCCCGCGGTCGTTTCCTGCGCGTGTCTCAGACACAGACGCGGGGCGGTCCGCGCGCACAGGTTGCTTTGCCAGCGCAGGGCATGATAGAGTTTCGCGACGCGCTCACCGACCTGCTAGACGACTTCGGCTCCGATGACGGAGGGTTCAAGGGCGAGTTACCCGAGGGGCGCCACCTCCGTGTAGACAATAAAAACTTTTATTTCGACATCGGGCAGAACAACCGCGGGGTCTACATGAAAGTCAGCGAGGTGGTGAAGAGCAATTTCCGCACCGCCATCACCGTACCGGAGAAATGTTGGACACGCTTCAGGGACATCCTGGCGGACTACTGCGACAAGATGAACAGGGCGCACGACCCCGACCACCATCAGGTTTTCTTAGCTCTCGTACTATGCGTCGCTTACCTAGACGCCCAGCCGTATGGTTTGTACGGAGGCTTATACGGCATTGGTAGTCGACTCGGAATCGGCAGGGGCCTTGGTTTGGGGCTTGGTTATAGGAGATATGGCTATGGCATGCCACGTCTGGGCTACGGTTACAACAATCTGGGCTCAGGATACGGTGTTGCTGGTTTCGCTGTCATAGTTAAGTCTCAACTGCTCTTTGGTCTTGGAAGATTGGGAGCCGAAATAGGGTCAAGAATAGGGTCCGCTTTAGGGCATGGTATTGATGCTATTGCTGATATTGGAACCGAAGCATTATCATTAGGTGAGGAAGCAATAGACGGACCAAGATTATTTGATCTTGGTTCAAAAGCTGACTCAAATAATTATCATGAAACTGCTGACTCCGGTTCAAATGAATATAATGCTGGATCTGGCAGTGCTGAGAGTAGTGAACTTTCAAGACGTCATCAACGACACTANACTTATCCTAACGTCTGAGCTACATAAATGTAAAGTTACATAAATATCCATCCAGTAGTTTTTGCGTGAACGAGCAACAAACATCCATGGATATATGCATCATCACGAACTAACATTTTTATTACTGGGATTCAATTCGGATAA

Protein sequence:

>DPOGS200105-PA
MLQIQSKRFYLDVKQNRRGRFIKVAEIGADGRRSQIFLAMSTAAEFRDHLSAFSDFYSSLGPPNPDNVPDDGKLKSEMMLKDNRRYYLDLKENSRGRFLRVSQTQTRGGPRAQVALPAQGMIEFRDALTDLLDDFGSDDGGFKGELPEGRHLRVDNKNFYFDIGQNNRGVYMKVSEVVKSNFRTAITVPEKCWTRFRDILADYCDKMNRAHDPDHHQVFLALVLCVAYLDAQPYGLYGGLYGIGSRLGIGRGLGLGLGYRRYGYGMPRLGYGYNNLGSGYGVAGFAVIVKSQLLFGLGRLGAEIGSRIGSALGHGIDAIADIGTEALSLGEEAIDGPRLFDLGSKADSNNYHETADSGSNEYNAGSGSAESSELSRRHQRHXTYPNV-
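
The protein record above appears to be the translation of the transcript's coding region, read from the first ATG to the first in-frame stop codon. The Python sketch below illustrates that relationship under some assumptions: it uses the standard genetic code, writes a codon containing an ambiguous base (such as N) as X, and writes the stop codon as "-", which is how the record seems to mark it. The function name `translate` is hypothetical and not part of any MonarchBase tool.

```python
# Standard genetic code (NCBI translation table 1), built in TCAG base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(nucleotides: str) -> str:
    """Translate a coding sequence in frame 1 up to the first stop codon.

    Assumptions (illustrative, not taken from the source pipeline):
    - a codon not in the table (e.g. containing N) becomes 'X'
    - the stop codon is written as '-' and ends translation
    - trailing bases that do not fill a codon are ignored
    """
    seq = nucleotides.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            protein.append("-")
            break
        protein.append(residue)
    return "".join(protein)


# First 15 bases of DPOGS200105-TA, and the stretch around its stop codon.
print(translate("ATGCTCCAAATACAG"))
print(translate("CGACACTANACTTATCCTAACGTCTGAGCT"))
```

The two fragments are copied from the start of the transcript and from the region containing the N base and the TGA stop. Bases after the stop codon are not translated, which would account for the transcript being longer than three times the protein length.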