Monarch geneset OGS2.0

DPOGS205269
Transcript        DPOGS205269-TA    3498 bp
Protein           DPOGS205269-PA    1165 aa
Genomic position  DPSCF300021 - 1027709-1035781
RNAseq coverage   469x (Rank: top 26%)

Annotation
Heliconius        HMEL016214        0.0      80.68%
Bombyx            BGIBMGA011069-TA  0.0      72.00%
Drosophila        CG2469-PA         0.0      66.33%
EBI UniRef50      UniRef50_Q6PD62   0.0      67.00%  RNA polymerase-associated protein CTR9 homolog n=120 Tax=Metazoa RepID=CTR9_HUMAN
NCBI RefSeq       XP_969441.1       0.0      71.09%  PREDICTED: similar to tpr repeat nuclear phosphoprotein [Tribolium castaneum]
NCBI nr blastp    gi|332024785      0.0      73.34%  RNA polymerase-associated protein CTR9-like protein [Acromyrmex echinatior]
NCBI nr blastx    gi|307178712      0.0      68.20%  RNA polymerase-associated protein CTR9-like protein [Camponotus floridanus]
Group
Gene Ontology     GO:0005488        2.7e-51  binding
                  GO:0005515        9.9e-06  protein binding
KEGG pathway
InterPro domain   [149-462] IPR011990  2.7e-51  Tetratricopeptide-like helical
                  [199-228] IPR001440  9.9e-06  Tetratricopeptide TPR-1
Orthology group   MCL12212          Single-copy universal gene
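
The genomic position field can be split into a scaffold name and a coordinate pair. The sketch below is a minimal illustration, not part of the database: the function names parse_position and span_length are hypothetical, and it assumes the coordinates are 1-based and inclusive, which the record does not state. Under that assumption the locus spans 8073 bp, longer than the 3498 bp transcript, which would be consistent with introns in the gene model.

```python
import re


def parse_position(text):
    """Split a 'SCAFFOLD - START-END' string into (scaffold, start, end)."""
    match = re.fullmatch(r"\s*(\S+)\s*-\s*(\d+)-(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized position string: {text!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start, end):
    """Length of the genomic span.

    Assumption: coordinates are 1-based and inclusive (not stated in the record).
    """
    return end - start + 1


scaffold, start, end = parse_position("DPSCF300021 - 1027709-1035781")
print(scaffold, span_length(start, end))
```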
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205269-TA
ATGTCGCTTGAAATTCCATTGATGTCCACCGATGAGGTGATAGAGCTTGATCCCGAATCTTTACCGTGTGGTGAAGAAGTGCTTAGTATATTGCAACAAGAGAGATCCCAGCTAAATGTCTGGATTAATGTCGCCCTTGCCTACTACAAACAAAACAAGATCGATGATTTCCTTAAAATTCTCGAGGCATCCCGAGTGGATGCAAATATTGACTATAGGGACTTTGAAAGGGATCAAATGCGAGCACTTGATATGTTGGCTGCATACTATGTTCAAGAGGCAAATAAAGAAAAGTCTAAAGACAAGAAAAAAGAGCTTTTTACTGAAGCTACTTTGCTTTATACTATGGCAGATAAAATTATTATGTATGATCAGAATCATCTTCTTGGGCGAGCATATTTCTGTCTTCTTGAGGGTGATAAAATGGCACAAGCAGACACACAGTTCAATTTTGTCCTCAATCAATCACCAAATAATGTGCCATCACTGCTCGGTAAAGCCTGTATTGCTTTTAACCGAAAGGATTATAGAGGAGCTCTAGCATTTTACAAAAAAGCTCTAAGGACGAATCCTAACAGCCCAGCTGCTTTACGTTTGGGAATGGGCCATTGTTTCATGAAATTAAATAATCAGGAAAAAGCCAGAATGGCGTTTGAGAGGGCATTGCAACTTGATCCTCAATGTGTTGGAGCTTTAGTTGGGCTGTCGATCTTGAAATTGAATTTACAAGAGAGTGAATCCAATAAGATGGCAGTCATCATGTTGTCTAAGGCATACGCAATTGATCCCAAAAACCCAATGGTTTTAAATCATTTGGCAAATCATTTCTTCTTTAAAAAGGATTACAGCAAAGTCCAACATCTTGCTCTGCACGCTTATCACAACACTGAGAATGACGCAATGAGGGCGGAAAGTTGTCATCATTTGGCCAGAGCTTATCATGCGCAAGGTGATTGCGTTAAAGCATTCCAATACTACTACCAAGCTACGCAGTTTGCGCCACCGAATTTCGTACTACCGCATTATGGCCTCGGACAAATGTATATTTACAGAGGTGACACTGAAAATGCGGCTCAATGCTTTGAAAAAGTTCTTAAAGCTCAACCAGGCAACTACGAAACTATGAAGATTCTAGGATCTCTATATGCTAACTCTCCATCTCAATTACAACGGGATATAGCGAGACAGCATCTCAAGAAAGTAACCGAACAATTTCCTGACGATGTAGAGGCTTGGATTGAATTGGCACAAATTTTGGAACAAAATGATTTACAGGGTTCACTGAATGCATATACCACTGCCATGAAAATACTTAAAGAAAAAGTAAATGCTGAGATTCCAGCAGAAATATTAAACAATGTTGCCGCATTGCATTACCGACTTGGTAATTTAAATGAAGCTATGAAATATTTGGAGGAAGCCTTGGAAAGAGAGAAAGTGGAAGCGGAGACTCTCGATGCCCAGTACTACAATTCAATATTAGTGACAACCATGTACAACCTGGCGAGACTTAACGAAGCGCTCTGTGTATACAACAAGGCTGAGAAACTGTACAAAGATATATTGAAGGAGCATCCCAATTATATTGATTGCTACTTGAGATTGGGCTGCATGGCAAGAGATAAAGGGCAGATATACGAAGCGTCGGACTGGTTCAAAGAAGCTTTGAAAGTGAATATAGAGCACCCGGACACGTGGTCGCTGCTGGGGAACCTTCACCTGGCGCAGCAGGAGTGGGGTCCGGGGCAGAAGAAGTTTGAACGGATCTTACAAAACTCCACCACGTCCAACGACGCCTACTCGCTGATTGCGCTCGGCAACGTGTGGCTTCAGACTCTACACCAGCCAGGTCGTGAGAAGGATCGCGAGAAGAGACATCAAGAACGAGCTCTCGCTTTATATAAACAGGTGCTGAAGAACGATCCGAAGAATATATGGGCAGCCAACGGTATAGGATGCGTGCTCGCGCACAAAGGTTGCATCAACGAAGCTCGCGATAT
CTTCGCCCAAGTGCGGGAAGCGACGGCTGATTTTCCAGACGTCTGGATGAACATTGCTCATATATACGTGGATCAGAAACAATACATAAATGCCATACAAATGTACGAGAACTGCATAAGAAAGTTTCGGACCCATCACGACGTGGAGTGGTTGACGTGGCTCGGGCGCGCTCAGACACTAGCGGGTAGAGCGCGTGCCGCTCGCACGTCTCTACTGAGAGCACGTCGGGTAGCGCCCCACGACCCCGCCCTACTCTACAACACCGCGCTCGCTCTACGACGCCTGGCTGCCCACGTGCTGAAAGACGAACGATCCGAACTCAGGGTCGTACTGAGAGCGGTTCATGAACTACATGTCTCACATAGATACTTCCAACGTCTTGGGGCAGCGGCCGCGGCCGAGGCCAGGACATGCGCCGACCTACTCTCACAGGCGCAGTGGCATGTAGCGAGAGCGAGACGGCAACACCAGGAGGAACTCACACTCAGGGACAAGCAACGAGAACAACGAGAGGCCTTCAGGAAACAACAGGAGGAAGAACGCAAACGGAGGGAGGAGGAACAAGCGAAGAGCACAGTGGAAATGTTACAGAAGCGACAAGAATACAAGGAGAAGACAAAGAACGCTTTGTTATTCGCGGATATGCCGTCTGAGAGCAAACAGAAAGGACGAGGCCGCAGGAGAGACGAGTATATATCGGACTCTGGCAGCGAACAAGACAGGCCGAGGGAAGAAGGGGAACCGAAACAACGTAAACGCAAGCGCGATGCCGAAGGACGTAAAGGTAAAACTAAGAAGAAGAATCAACGAACCAGCGACACTGATAGTGATGCACCGAGGAATAAGAGCAGAAAGAAGGGCGAAAGGGGTATCGGTAAACGCGAGAAAGCCAAAATGGCTGAGGATAAGCTGGGAGCTAAACAGCGCGCTAAGATCGTATCAAAAGAAACTATATCGACTTCGGAATCAGACTCCGATGGTGATCACGTTCAAAAAGTCGATCGAAATCGAGAAGTCGCTCGCGGTCCGGTTCAGCTAAGAGCCGACCAAAATCAAAGAGTCGATCCAGGTCTAAAGACAGATCCAGATCGAAGAGCCGCTCGAAAAGCCGATCTAGATCAAAAAGTAGATCTCGCTCAAAAAGCCCTTCCAGGTCGAAGAGTCGCTCTCGGTCCAAAAGTGGATCTAGATCCAAAAGCCCATCGAGATCGCGTTCAAGATCCAAAAGCCGTTCTAGATCAAAGTCGAAGAGTGCGTCGCGTTCCAGGTCAAGATCCAAAAGTGGATCACGCAGTCGTTCTGGATCGAGATCTCACTCTGGATCTAGATCTAGATCCGGTTCAAGAAACTCTAGACCCGCGACACCCGAATCTAGAAAATCAGTCTCAGCTAGTGAAGATGAAGCTTAGTCGGATATTTAAGATGACGTATAAACACAAGATATCTCTAAGATATTCGGAAATCGCTTTCATTGTGTTTTTAATTTTCTTCGGTTAA

Protein sequence:

>DPOGS205269-PA
MSLEIPLMSTDEVIELDPESLPCGEEVLSILQQERSQLNVWINVALAYYKQNKIDDFLKILEASRVDANIDYRDFERDQMRALDMLAAYYVQEANKEKSKDKKKELFTEATLLYTMADKIIMYDQNHLLGRAYFCLLEGDKMAQADTQFNFVLNQSPNNVPSLLGKACIAFNRKDYRGALAFYKKALRTNPNSPAALRLGMGHCFMKLNNQEKARMAFERALQLDPQCVGALVGLSILKLNLQESESNKMAVIMLSKAYAIDPKNPMVLNHLANHFFFKKDYSKVQHLALHAYHNTENDAMRAESCHHLARAYHAQGDCVKAFQYYYQATQFAPPNFVLPHYGLGQMYIYRGDTENAAQCFEKVLKAQPGNYETMKILGSLYANSPSQLQRDIARQHLKKVTEQFPDDVEAWIELAQILEQNDLQGSLNAYTTAMKILKEKVNAEIPAEILNNVAALHYRLGNLNEAMKYLEEALEREKVEAETLDAQYYNSILVTTMYNLARLNEALCVYNKAEKLYKDILKEHPNYIDCYLRLGCMARDKGQIYEASDWFKEALKVNIEHPDTWSLLGNLHLAQQEWGPGQKKFERILQNSTTSNDAYSLIALGNVWLQTLHQPGREKDREKRHQERALALYKQVLKNDPKNIWAANGIGCVLAHKGCINEARDIFAQVREATADFPDVWMNIAHIYVDQKQYINAIQMYENCIRKFRTHHDVEWLTWLGRAQTLAGRARAARTSLLRARRVAPHDPALLYNTALALRRLAAHVLKDERSELRVVLRAVHELHVSHRYFQRLGAAAAAEARTCADLLSQAQWHVARARRQHQEELTLRDKQREQREAFRKQQEEERKRREEEQAKSTVEMLQKRQEYKEKTKNALLFADMPSESKQKGRGRRRDEYISDSGSEQDRPREEGEPKQRKRKRDAEGRKGKTKKKNQRTSDTDSDAPRNKSRKKGERGIGKREKAKMAEDKLGAKQRAKIVSKETISTSESDSDGDHVQKVDRNREVARGPVQLRADQNQRVDPGLKTDPDRRAARKADLDQKVDLAQKALPGRRVALGPKVDLDPKAHRDRVQDPKAVLDQSRRVRRVPGQDPKVDHAVVLDRDLTLDLDLDPVQETLDPRHPNLENQSQLVKMKLSRIFKMTYKHKISLRYSEIAFIVFLIFFG-
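
The stated lengths are mutually consistent if the transcript is treated as coding sequence only: 1165 residues plus one stop codon give (1165 + 1) x 3 = 3498 bp. The sketch below shows one way to run the same check on FASTA text. It is an illustration under assumptions: read_fasta and cds_matches_protein are hypothetical helper names, the transcript is assumed to contain no untranslated regions, and a trailing "-" or "*" in the protein is assumed to mark the stop codon.

```python
def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record id is the first whitespace-delimited token of the header.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def cds_matches_protein(transcript, protein):
    """True if the transcript length equals 3 * (residues + 1 stop codon).

    Assumption: the transcript is CDS only, and a trailing '-' or '*'
    in the protein marks the stop codon rather than a residue.
    """
    residues = protein.rstrip("-*")
    return len(transcript) == 3 * (len(residues) + 1)


# Toy records (not the monarch sequences) with a wrapped nucleotide line.
toy = ">toy-TA\nATGAAA\nTAA\n\n>toy-PA\nMK-\n"
seqs = read_fasta(toy)
print(cds_matches_protein(seqs["toy-TA"], seqs["toy-PA"]))
```

Joining wrapped lines in read_fasta matters for records like the one above, where a single nucleotide sequence is split across more than one line.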