Monarch geneset OGS2.0

DPOGS204562
Transcript: DPOGS204562-TA; 924 bp
Protein: DPOGS204562-PA; 307 aa
Genomic position: DPSCF301256 - 496-3807
RNAseq coverage: 114x (Rank: top 59%)
Annotation
Heliconius: HMEL014224; 5e-113; 54.89%
Bombyx: BGIBMGA009595-TA; 2e-110; 58.47%
Drosophila: CstF-50-PA; 3e-85; 45.54%
EBI UniRef50: UniRef50_Q9V9V0; 4e-83; 45.54%; CstF-50, isoform A n=40 Tax=Bilateria RepID=Q9V9V0_DROME
NCBI RefSeq: XP_393185.2; 8e-98; 49.39%; PREDICTED: similar to CstF-50 CG2261-PA, isoform A isoform 1 [Apis mellifera]
NCBI nr blastp: gi|307212759; 6e-97; 50.25%; Cleavage stimulation factor 50 kDa subunit [Harpegnathos saltator]
NCBI nr blastx: gi|307212759; 5e-94; 50.25%; Cleavage stimulation factor 50 kDa subunit [Harpegnathos saltator]
Group:
Gene Ontology: GO:0005515; 2.1e-32; protein binding
KEGG pathway: mgr:MGG_08829; 4e-16
    K06666 (TUP1) maps-> Cell cycle - yeast
InterPro domain: [86-303] IPR015943; 2.1e-32; WD40/YVTN repeat-like-containing domain
    [89-302] IPR011046; 7.2e-30; WD40 repeat-like-containing domain
    [170-211] IPR001680; 1e-06; WD40 repeat
    [141-167] IPR019781; 1.6e-06; WD40 repeat, subgroup
Orthology group: MCL13890; Single-copy universal gene

Nucleotide sequence:

>DPOGS204562-TA
CAGTTACATTACGATGGCTTCCAACCTATAGCCGCGACTCTATCAGCCGCTGTACACGCAGACCCACCGTGCCCTCCGAGCGACAGATTATTAAATCTAATGATGGTCGGTCTACAACATGAACCGGACCGGAAGGACAGGCTGGCGGCATCCAGCGGGGCGGAACATCTGCTGGGAACTACCGGCTTTGATCTCGAGTTTGAGATGGACGCGTCCTCGCTCGCCCCTGAGCCGGCCACGTACGAGACGGCGTATGTGACGTCACACAAGATGTCTTGTAGGGCCGGAGCGTTCAGCGCCTGTGGTCAGCTGGTGGCTACCGGCAGTGTGGATGCTAGCATTAAGATTCTGGACGTGGAGCGGATGTTGGCTAAATCAGCTCCCGAGGAAGTTGATCCCGGGAGAGAGCAACAGGGACATCCGGTGATACGAACATTGTTTTCACCGAACGGTAAGTATTACGCGTCGGGCAGCGCTGATGGTAGCGTCAAGCTCTGGGACACCGTCTCCAACAGATGTTTTAACACGTTCACCAACGCTCACGAGGGTGCAGAAGTGTGTTCAGTGGCATTCACCAGGAACAGCAAGTATCTCCTCACATCTGGTTTGGATTCGTCTATAAAGTTGTGGGAGTTAGCGAGCAGCCGTTGTCTGATACAATATACGGGGGCCGGTACTACAGGTAAGCAGGAACACCACGCCCAGGCGATATTCAATCACACTGAGGACTACGTGATGTTCCCGGACGAGGCGACCACCTCGCTCTGCACCTGGCACTCCAGGTCAGCCAGCAGGTGCCAGCTGATGTCTCTGGGGCATAATGGAGCTGTTAGGTACATAGTCCATTCTGGCACGGCTCCAGCGTTCCTCACCTGTAGCGATGATTACAGAGCCAGGTTTTGGTACAGACGGAACACGCATTAA

Protein sequence:

>DPOGS204562-PA
QLHYDGFQPIAATLSAAVHADPPCPPSDRLLNLMMVGLQHEPDRKDRLAASSGAEHLLGTTGFDLEFEMDASSLAPEPATYETAYVTSHKMSCRAGAFSACGQLVATGSVDASIKILDVERMLAKSAPEEVDPGREQQGHPVIRTLFSPNGKYYASGSADGSVKLWDTVSNRCFNTFTNAHEGAEVCSVAFTRNSKYLLTSGLDSSIKLWELASSRCLIQYTGAGTTGKQEHHAQAIFNHTEDYVMFPDEATTSLCTWHSRSASRCQLMSLGHNGAVRYIVHSGTAPAFLTCSDDYRARFWYRRNTH-
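
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code and reading frame 1, and it assumes the trailing "-" in the protein sequence marks the stop codon. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt, stop_symbol="-"):
    """Translate a nucleotide string in frame 1; unknown codons become 'X'."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    residues = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE.get(nt[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 27 and last 15 nucleotides of DPOGS204562-TA, copied from the record.
first = translate("CAGTTACATTACGATGGCTTCCAACCT")
last = translate("CGGAACACGCATTAA")

# Length check: 307 residues plus one stop codon, three bases each.
expected_transcript_length = (307 + 1) * 3
```

Here `first` corresponds to the opening residues of DPOGS204562-PA and `last` to its closing residues, and `expected_transcript_length` matches the 924 bp reported for the transcript. Passing the full DPOGS204562-TA sequence to `translate` would allow a comparison against the whole protein record.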