Monarch geneset OGS2.0

DPOGS206268
Transcript        DPOGS206268-TA  981 bp
Protein           DPOGS206268-PA  326 aa
Genomic position  DPSCF300290 - 248487-251693
RNAseq coverage   1x (Rank: top 93%)
Annotation
Heliconius      HMEL016905        5e-129  73.23%
Bombyx          BGIBMGA010800-TA  5e-178  87.35%
Drosophila      tex-PA            1e-128  67.94%
EBI UniRef50    UniRef50_Q96J01   5e-118  62.15%  THO complex subunit 3 n=43 Tax=Coelomata RepID=THOC3_HUMAN
NCBI RefSeq     XP_967851.1       5e-143  74.05%  PREDICTED: similar to THO complex subunit 3 [Tribolium castaneum]
NCBI nr blastp  gi|91092824       9e-142  74.05%  PREDICTED: similar to THO complex subunit 3 [Tribolium castaneum]
NCBI nr blastx  gi|91092824       6e-141  74.05%  PREDICTED: similar to THO complex subunit 3 [Tribolium castaneum]
Group
Gene Ontology    GO:0005515  7.1e-56  protein binding
KEGG pathway     tca:656215  1e-142
                 K12880 (THOC3) maps-> Spliceosome
InterPro domain  [20-323]   IPR011046  7.1e-56  WD40 repeat-like-containing domain
                 [141-322]  IPR015943  2.1e-30  WD40/YVTN repeat-like-containing domain
                 [73-107]   IPR019781  3.9e-08  WD40 repeat, subgroup
                 [67-107]   IPR001680  1.6e-07  WD40 repeat
                 [49-63]    IPR020472  9.6e-06  G-protein beta WD-40 repeat
Orthology group  MCL11201  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206268-TA
ATGAAAGAATCTAAACAGGCCCAAACCCCGATTGAGGATTTAGCTGAGTTGAAGAGTTATTTTCAATCTCACAATGTTGTGAGGGAATACATCGCCCACAGTTCTAAAGTGCATTCAGTTGGATGGTCCTGTGATGGAAGAAAACTAGCTTCCGGGTCCTTTGATAAAAGTGTCGTTATATTTAATTTGGAAAGAAATAGATTGGCTCAAGATTTCGTTTTCCGAGGTCACACGGGTTCTGTGGATCAGTTGTGCTGGCATGCATCACATCCCGATTTACTCAGCACTGCTAGTGGTGACAAATCAGTCAGGATATGGGACACTAGGACGCACAAATGTGCGGCGGCAATATCAACTAAGGGTGAAAACATAAACATAGCCTGGTCTCCGAGTGGGGCAACCATAGCCGTCGGAAATAAAGAGGATTTAGTATCATTCATTGATACTAGGAACTATAAAGTAGTTGAAGAACAATTCAACTTTGAAGTGAACGAAATATCTTGGAACAACACCTCGGACCTCTTCTTCCTCACAAACGGTTTAGGCTGCGTACACATTTTGACATATCCACATTTGGAGTTGCAAACAGTCCTGAAGGCCCACCCCGGTACATGTATATGTATTGAGCATGACCCCACCGGCCGGTACTTCGCTACGGGCTCCGCGGACGCGCTCGTCTCCTTATGGGATGTCAACGAACTGGCTTGTTTGAGAGTTTTCTCAAGGTTGGAGTGGCCGGTGAGGACATTATCCTTCAGCTTCGACGGTCGACTCTTGGCTTCAGCGAGCGAAGATCATATCATAGATATAGGGGACACCGAAACAGGAGAAAAAGTGGCGGAAATTCCAGTTCAGGCTGCAACATTCACAGTGGCCTGGCATCCGTCCAGATACTTGGTGGCTTTTGCATGTGAAGACAAAGAACCGCCCGAGAGAAAACGAGATGCTGGGAATCTTAAACTGTGGGGACTTAGCAGTTAG

Protein sequence:

>DPOGS206268-PA
MKESKQAQTPIEDLAELKSYFQSHNVVREYIAHSSKVHSVGWSCDGRKLASGSFDKSVVIFNLERNRLAQDFVFRGHTGSVDQLCWHASHPDLLSTASGDKSVRIWDTRTHKCAAAISTKGENINIAWSPSGATIAVGNKEDLVSFIDTRNYKVVEEQFNFEVNEISWNNTSDLFFLTNGLGCVHILTYPHLELQTVLKAHPGTCICIEHDPTGRYFATGSADALVSLWDVNELACLRVFSRLEWPVRTLSFSFDGRLLASASEDHIIDIGDTETGEKVAEIPVQAATFTVAWHPSRYLVAFACEDKEPPERKRDAGNLKLWGLSS-
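
The transcript and protein lengths in this record are consistent with each other: 326 residues plus one stop codon is 327 codons, or 981 bp. A minimal Python sketch for checking that a coding sequence translates to the listed protein is shown here. It assumes the standard nuclear genetic code and assumes that the trailing "-" in the protein sequence marks the stop codon; the function name `translate` and its `stop_symbol` parameter are illustrative and are not part of the database.

```python
from itertools import product

# Standard genetic code (assumed here), with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (assumed to match the "-" that
    ends the protein sequence in this record). Codons containing characters
    other than A, C, G, T are written as "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First seven codons of DPOGS206268-TA.
print(translate("ATGAAAGAATCTAAACAGGCC"))  # MKESKQA

# Length check: 326 residues plus one stop codon, three bases each.
print(3 * (326 + 1))  # 981
```

To check the whole record, pass the full DPOGS206268-TA sequence to `translate` and compare the result with the DPOGS206268-PA sequence, including the final "-".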