Monarch geneset OGS2.0

DPOGS203531
Transcript: DPOGS203531-TA (1695 bp)
Protein: DPOGS203531-PA (564 aa)
Genomic position: DPSCF300055 - 102312-115438
RNAseq coverage: 231x (Rank: top 44%)
Annotation
Heliconius: HMEL005761 | 0.0 | 81.45%
Bombyx: BGIBMGA009175-TA | 2e-118 | 82.31%
Drosophila: Trf4-1-PC | 2e-135 | 61.80%
EBI UniRef50: UniRef50_E0VP78 | 3e-155 | 62.84% | PAP-associated domain-containing protein, putative n=2 Tax=Neoptera RepID=E0VP78_PEDHC
NCBI RefSeq: XP_625041.1 | 3e-169 | 61.36% | PREDICTED: similar to CG11265-PA, isoform A [Apis mellifera]
NCBI nr blastp: gi|66557991 | 7e-168 | 61.36% | PREDICTED: PAP-associated domain-containing protein 5-like [Apis mellifera]
NCBI nr blastx: gi|66557991 | 5e-164 | 61.67% | PREDICTED: PAP-associated domain-containing protein 5-like [Apis mellifera]
Group
Gene Ontology: GO:0016779 | 1.7e-12 | nucleotidyltransferase activity
KEGG pathway: ame:552662 | 1e-168
    K03514 (POLS, TRF4) maps-> RNA degradation
InterPro domain: [302-362] IPR002058 | 1.7e-18 | PAP/25A-associated
                 [140-232] IPR002934 | 1.7e-12 | Nucleotidyl transferase domain
Orthology group: MCL11678 | Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203531-TA
ATGGATCCCTACGTCGGGTGGTATCAGCCCGAGCAAGAAGGACCCGCGAAGCGCTTGTGGCTTCGTATCTGGGAAACTCAAAGTGAAACCGACAAAATGAACCTCAAAAACTTAGAAAATGCTAATCCTAACGTGAATAAAACACCAGACTTTATACCCTTAAACGAGGTGAACGGTGAAAATAGGAATAATAATTACTTTAACCCTACACGAAGGAAAGTGAACGACAATCGTGCTTCTACGTTTAACCTGAACCAAAATCACGATGCTCTCATAGGTGAATACGGTGGCTGTCCTTGGAGAATACCCAACTATAATTACAAGCCTGGAGTTCTGGGTCTGCACGAGGAGATCGAGCACTTCTACATGTACATGTCCCCGTCTGAGACTGAACACCTGGTCAGAACCACTGTCGTGACACGTATCAGGAGCGCCATCCTGTCCCTGTGGCCCCAGGCACGCGTCGAGGTCTTCGGGAGCTTCCGGACTGGCCTCTATCTGCCGACCAGTGACATCGACCTCGTGGTTATAGGTCAATGGGAGAAGCTGCCTCTGTGGACCTTGGAGCGAGAGCTGGTAGCCCAGGACATCGCGGAGCAAGACAGCATTAAAGTATTGGAAAAGGCGACCGTTCCTATTGTCAAGATGACCGACAAGTACTCCGACGTTAAGGTGGACATTTCTTTCAACATGAGCAGTGGCGTCAAGAGTGCGGAGCTGATCAAACAGTTCAAGGAGCAATATCCAGAACTGTCGAGGCTGGTGATGGTGTTGAAGCAGTTCCTGCTGCAGCGCGACCTGAACGAGGTGTTCACCGGAGGCATCTCCTCCTACTCCCTCATCCTCATGTGCATCAGCTTCCTGCAGCTACACCCGCGGCCGGAGAGACTCCGCCAGAGACACAACCTCGGAGTGTTACTGATCGAGTTCTTTGAATTGTACGGAAGGAAATTCAATTATGTGAAGACAGCCATCAGGGTCAAGAACGGAGGCTCATATGTATCTAAGGACGAGATCTCAAAGGAGATGAACGACGGCCATAGACCCTCGCTCCTGTGCATCGAGGACCCGCTGACGCCCGGCAACGACATCGGCCGGTCCAGCTACGGAGCCATACAAGTTAAACAGGCGTTCGACTACGGCTACATAATTCTCCAGCAGGCCGTGGCGCCGCACAACGCGTTACTCGCCCGTCACAGCGTTCTAGGTCGTGTCGTGCGCGTCACGGATCACGTGTTACAATATAGACGCTGGGTGAGAGACACCTTCGAGCCGTTCTTCTTCCCGCACCGCGTGAGGCCGAGGCGGGTAGGGAACACCCGCTCGCCCACCCCCGACCCCACACCCACACCGACGCCCACGCCCTCCGACACTGATCCTGAGTGGTCGGACGGTTCGGGTCCGTCAGGCCCGGCCCGCACGTCGCCGCCGCCGCTGTCGGCTCTGCAGTGCTCGTCGCCCACCCCCCGCAGGGTGTCCGCCCACCAGTCCCTCATAATACACCACATAACAAGCAACTCGGATTTCAACAACATACCATCGGACCCGCTTGCAGGGGTGCTCCGCCCCCGGCCGCGGCCCCGCCGTCGCGCGTCGCCCCCCCGCGGCCGAGCCCAACGCAACGACCGCTCCGACAGACAGGACCGGAACGACAGACCCGACCGCCACCGCAAGAGGCGCGGCTACACCCGATGA

Protein sequence:

>DPOGS203531-PA
MDPYVGWYQPEQEGPAKRLWLRIWETQSETDKMNLKNLENANPNVNKTPDFIPLNEVNGENRNNNYFNPTRRKVNDNRASTFNLNQNHDALIGEYGGCPWRIPNYNYKPGVLGLHEEIEHFYMYMSPSETEHLVRTTVVTRIRSAILSLWPQARVEVFGSFRTGLYLPTSDIDLVVIGQWEKLPLWTLERELVAQDIAEQDSIKVLEKATVPIVKMTDKYSDVKVDISFNMSSGVKSAELIKQFKEQYPELSRLVMVLKQFLLQRDLNEVFTGGISSYSLILMCISFLQLHPRPERLRQRHNLGVLLIEFFELYGRKFNYVKTAIRVKNGGSYVSKDEISKEMNDGHRPSLLCIEDPLTPGNDIGRSSYGAIQVKQAFDYGYIILQQAVAPHNALLARHSVLGRVVRVTDHVLQYRRWVRDTFEPFFFPHRVRPRRVGNTRSPTPDPTPTPTPTPSDTDPEWSDGSGPSGPARTSPPPLSALQCSSPTPRRVSAHQSLIIHHITSNSDFNNIPSDPLAGVLRPRPRPRRRASPPRGRAQRNDRSDRQDRNDRPDRHRKRRGYTR-
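
The transcript and protein lengths listed in this record (1695 bp and 564 aa) can be cross-checked with simple arithmetic: a coding sequence that includes its stop codon should contain one more codon than the protein has residues. The following is a minimal sketch of that check; the function name `cds_matches_protein` and its parameters are hypothetical and not part of the database, and it assumes the transcript consists only of coding sequence with no untranslated regions.

```python
def cds_matches_protein(cds_bp: int, protein_aa: int, has_stop: bool = True) -> bool:
    """Return True if a CDS of cds_bp bases can encode protein_aa residues.

    Assumes the sequence is pure coding sequence (no UTRs). When has_stop
    is True, the final codon is a stop codon and adds no residue.
    """
    if cds_bp % 3 != 0:
        # A complete coding sequence must be a whole number of codons.
        return False
    codons = cds_bp // 3
    expected_codons = protein_aa + (1 if has_stop else 0)
    return codons == expected_codons


# Lengths taken from this record: 1695 bp transcript, 564 aa protein.
print(cds_matches_protein(1695, 564))
```

Here 1695 / 3 = 565 codons, that is 564 residues plus the terminal stop, which agrees with the TGA at the end of the nucleotide sequence and the trailing `-` in the protein sequence.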