Monarch geneset OGS2.0

DPOGS206430
Transcript  DPOGS206430-TA  4074 bp
Protein  DPOGS206430-PA  1357 aa
Genomic position  DPSCF300181 + 325173-343133
RNAseq coverage  211x (Rank: top 46%)
Annotation
Heliconius  HMEL006889  0.0  73.91%
Bombyx  BGIBMGA013842-TA  0.0  73.62%
Drosophila  Cpsf160-PA  0.0  45.84%
EBI UniRef50  UniRef50_D6WFP3  0.0  51.19%  Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WFP3_TRICA
NCBI RefSeq  XP_968117.1  0.0  51.83%  PREDICTED: similar to cleavage and polyadenylation specificity factor cpsf [Tribolium castaneum]
NCBI nr blastp  gi|307190910  0.0  51.97%  Cleavage and polyadenylation specificity factor subunit 1 [Camponotus floridanus]
NCBI nr blastx  gi|307190910  0.0  51.98%  Cleavage and polyadenylation specificity factor subunit 1 [Camponotus floridanus]
Group
Gene Ontology  GO:0005634  2.1e-41  nucleus
  GO:0003676  2.1e-41  nucleic acid binding
  GO:0005515  5.7e-05  protein binding
KEGG pathway  spu:759005  4e-53
  K11251 (H2A)  maps-> Systemic lupus erythematosus
InterPro domain  [1080-1241]  IPR004871  2.1e-41  Cleavage/polyadenylation specificity factor, A subunit, C-terminal
Orthology group  MCL12065  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species
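
The transcript and protein lengths listed above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the record: the function names are hypothetical, it assumes the 4074 bp transcript is a pure coding sequence that includes the stop codon, and it assumes the genomic coordinates are 1-based and inclusive.

```python
def protein_length_from_cds(cds_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases.

    Assumes the CDS starts in frame; if includes_stop is True, the final
    codon is a stop codon and does not contribute a residue.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


def inclusive_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Values taken from the record above.
print(protein_length_from_cds(4074))   # 1357, matching the listed protein length
print(inclusive_span(325173, 343133))  # 17961 bp spanned on scaffold DPSCF300181
```

Under these assumptions the 4074 bp transcript yields 1358 codons, that is 1357 residues plus a stop, which agrees with the protein length given in the record. The genomic span is longer than the transcript, as expected when introns are present.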

Nucleotide sequence:

>DPOGS206430-TA
ATGTTTTCTATTTGTCGTCAAACTCACCCTGCTACGGGTATTGAGCACGCGATTAGCTGCTGTTTCTTTAATAATGATGAAGTGTGCCTGATTACTGCTGGTGCTAATATAATAAAGGTTTTCAGGCTTTTGCCTGAGGGCCACGCGAAAGAGGTTAATGCCGCTGGTCAACCGATTCCACCTAAAATGAAACTAGAATGTCTAGCCTCATACACTCTCTGGGGCAATGTGATGTCAATAGCATCAGTGAAATGTCCAAGTGCTGGTCGTGACTTACTGCTGGTGTCATTCAAGGAGGCAAAGTTATCTGTAGTGCAATATGATCCGCAAGTTAATAATCTCATTACACTTAGTATGCATTACTTTGAAGAAGATGATATGAAGGGTGGATGGACGACTCATCCCCACATACCCTGGATACGAGTGGACCCAGAATTCAGATGTGCCGTAATGTTATTGTATGGAAGAAAGTTAGCGGTGTTGCCGTTCAGGAAAGATATAACCTCAGAAGAGGGTGACCCTTTGGAGGCTAAGCCATTAGATTGTAAGAAAAATCAACCAATACAAACCATATCCAGAGCACCAACGCTAGCATCTTATGTGATAATATTGAAAGAACTCGATGAAAAGATAGACAATATATTGGACATACAGTTTCTCTATGGTTACTATGAACCGACATTGTTGTTATTGTATGAACCGGTTAGGACATTTGCTGGACGTACGGCCGTTCGTAACGATACCTGTGCGATGGCTGGTGTTAGTCTCAATATGAGCGCCAGAGTACATCCTGTTATTTGGTCTATAGGAGGATTGCCGTTTGACTGTATACAGGCTGTCCCGATTCAGAAACCTTTAGGCGGTTGTTTAATAATGGCTGTGAATTCTTTGATATATTTGAATCAATCTGTGCCGCCATACGGGGTCTCTCTGAACAGTATTGCTACACATACCACTAATTTTCCTCTACGTATTCAAGAAGGCGTCTGTATAACGCTGGACGGCGCTAGAGTGGTAGCTCTAGGTGATACTCGATTGTCGCTCGCCCTCAAGGGTGGTCAACTGTACGTGTTGACCTTACTATCGGACTCCGTGAGGAGTGTACGGAGCTTTCACCTGGACCGCGCCGCAGCCTCTGTGTTGACGTCCTGTATGTGTGTGATCGAAGAAGATTTTCTATTTCTTGGATCGAGACTTGGAAATTCTTTGCTATTGAGAGTGACCGAGAGGGAAAATAGAATGTTGTTCTCAGTGGACAAGCCTTTAGAAGCTACGGTCGACCTGACACTGTCCGAAACCGATAAAGATAAAGAACCATTGCCGAAGGAACCCCAAAAAGAAATGTTAGATCCGCAAGCGAAGAAGCGTCGTTTGGACACCATAAGCGACTGTGTCGCCACCAATGTGGTGGAAATATCGGACAAGGACGAGCTGGAGGTGTACGGCTCTGACATACGGACCTCCACCCAGCTCACCAGCTATGTGTTCGAGGTATGCGATTCCCTGTTGAACATATGTCCTATCGGCGACGTGTCTATGGGGGAGCTCCAGTTGGTGTCCGAGGAGGGAGCGGGGAGGAGGTCGAGGCCGGCCCTCGAAATGGTCGCGTGTAGCGGACGAGGGAAGAACGGAGCTTTGGCTGTGTTACAACGGTCGCTCACACCGCAGCTACTCACTGCCTTCGATCTACCAGGCTGTATCGATATGTGGACGGTGATCGGAGAGGCGACGGAAGTCAATAGAGAAGCCCACAAAGATATGGAAGGCAGCCATGCTTACTTGATACTGACACAGGAAGACTCGAGTATGATTCTTCAAACCGGCCAAGAGATAAACGAAGTGGATAATTCTGGTTTTATGACGAGCGCCCCCACGGTGTTCGCGGGTAACTTAGGGAACAACAGGTTTATGGTCCAAGTTACCACAACAGCTATAAGACTTGTGAGAAATGGCGTGTTGGTTCAGTCTATCACGTTAGAGTGGACGGCCCGCAGCGCGTG
CACCGCCGACCCCTACCTGTGTGTGGTGTCCACTTGCGGCCGGGCGCTGGTGCTCGCGCTCAGGGAGCTGCGGGCCAGGGACGCCACGTCAGCTCGGCTCGCGCCAACGAGACAGGCGGTGCCTCACAGACCGGCCTTACTGAAAGCCGTTCCTTATCGAGATCTCAGTGGGCTATTCACCAGCACAGACGACAACATACAGGTCAAAGGTGAGTTCACGGGTAAAATGAAAGAGAAAAATATCAAGGCTGAAGGTTTCAAGGCGGACACAGTGTATGAATTGAACGATGAAGATGAGTTACTGTATGGAGGAGATCAGACGCCAGCGTCCATGGCTAGTGTGAAGATATGGCACATCCCTGATGGTGGCCTATCTATGCACCTCACCGACTGGCTGGTTGAGCTCCACGGGCACAAGAGGCGTGTGGCCTACATAGAGTGGCATCCCACGGCTGAGAACATACTGTTTAGTGCTGGATTCGATTATCTGATTTTAGCTGACAGCTTAGAGTCCGTTCCTATACCGAACCAGACGGATGAAGACGAATTCAATACAGGGCATAGTAGTAACGCGGAGAGACTTCAAGAAATCCTAGTCGTCGGCCTGGGACATAAGGGGTCGAGGGTTCTCATGTTGCTGAGGTGTGATGACGACCAGCTGATGATATATCAGGCCTACAAGTATCCCAGGGGTAACCTTCGGATGCGTTTCTCCCGTATGTCGTTGTCGTTTCCGTTCGGTTATCGTAGCGGCGTCGTGTCGTGTCCGGAGGAAAGCTACGAAGCGTCCACGCTGCGGGAGGGGGTCAGGCAGCTGAGGTATTTTGGTAACGTCGGCGGTTACAGCGGTGTGTTTGTGTGCGGTCGGACCCCTTATATGCTAGTGCTGGGATCTCGAGGCGAACTCCGCATACATCCTTTATCCGACAACGCAGCGACCTTCGCATCATTCAACAACACCAACTGTCCCCAGGGCTTCATGTATTTTAACGACAAGGGTTCGTTGCGTATCTGTGCGCTACCAACCCATCTGTCTTTCGAGGCGTCGTGGGCGGTTAGAAAGGTTCCGTTGCGTGAAACACCCCACTCTGTGACATTCCACCTGGAGTCGAGGACGTACTGCCTGGTGTCGTCGGTGTCCCAGTCGACGCACTCCTATTATAAGTTCAACGGAGAGGATAAAGAGAAATCTAACGAAAATAAAGGCGACAGGTTTCCATATCCCATGATGGAGCGCTTCTCCATAATGCTGTTCTCACCGGTGTCATGGGAAGTCATACCCAACACGAGGATAGAGTTGGATGAATGGGAACATGTGACGTGTCTGAAGAACGTGTCACTGTCATACGAGGGCACCAGGTCTGGTCTCCGCGGTTACATCGCGATAGGGACTAATTATAACTACGGAGAGGATATTACTTCTAGAGGGAGGATTTTGATTTACGATATAATAGATGTTGTACCCGAACCGGGCCAGCCGTTGACCAAAAATAGGTTTAAAGAAATATACGCGAAGGAACAGAAGGGTCCCGTGACAGCTCTCACACAAGTGTTAGGGTTCCTCATATCGGCTGTGGGTCAGAAGATATATCTCTGGCAGCTGAAGGACAACGACCTCGTCGGCGTAGCGTTCATTGACACCCAAATTTACGTCCACAGAATGTTGGCTGTTAAGAATCTGATATTGGTAGCTGATGTTTACAAATCAATATCCCTCCTGAGGATACCAACACCAACACAGGACGCTGTCGCTCGTGTCCAGGGACCTCAGGACGGCTCAGGTGACGTGGTAGCTCAGTCGATATACGACATGCAGTTCATGATAGACAACACGAGTCTGGGCTTCCTCATATACGACATGCAGTTCATGATAGACAACACGAGTCTGGGCTTCCTCGTGAGTGAGTCGGAGGGTAACTTTGCTATGTACATGCACCAGCCTCAAGCCAGAGAGAGTTACGGAGGTCAGCGTTTGATTCGTAAATGTGATTATCATCTGG
GACAAAGAGTACACGCCATGTTTCGTTTGGCGGCTAGAGGGGAGAGACAGACACACGTCACTATGTTCAGTTAG

Protein sequence:

>DPOGS206430-PA
MFSICRQTHPATGIEHAISCCFFNNDEVCLITAGANIIKVFRLLPEGHAKEVNAAGQPIPPKMKLECLASYTLWGNVMSIASVKCPSAGRDLLLVSFKEAKLSVVQYDPQVNNLITLSMHYFEEDDMKGGWTTHPHIPWIRVDPEFRCAVMLLYGRKLAVLPFRKDITSEEGDPLEAKPLDCKKNQPIQTISRAPTLASYVIILKELDEKIDNILDIQFLYGYYEPTLLLLYEPVRTFAGRTAVRNDTCAMAGVSLNMSARVHPVIWSIGGLPFDCIQAVPIQKPLGGCLIMAVNSLIYLNQSVPPYGVSLNSIATHTTNFPLRIQEGVCITLDGARVVALGDTRLSLALKGGQLYVLTLLSDSVRSVRSFHLDRAAASVLTSCMCVIEEDFLFLGSRLGNSLLLRVTERENRMLFSVDKPLEATVDLTLSETDKDKEPLPKEPQKEMLDPQAKKRRLDTISDCVATNVVEISDKDELEVYGSDIRTSTQLTSYVFEVCDSLLNICPIGDVSMGELQLVSEEGAGRRSRPALEMVACSGRGKNGALAVLQRSLTPQLLTAFDLPGCIDMWTVIGEATEVNREAHKDMEGSHAYLILTQEDSSMILQTGQEINEVDNSGFMTSAPTVFAGNLGNNRFMVQVTTTAIRLVRNGVLVQSITLEWTARSACTADPYLCVVSTCGRALVLALRELRARDATSARLAPTRQAVPHRPALLKAVPYRDLSGLFTSTDDNIQVKGEFTGKMKEKNIKAEGFKADTVYELNDEDELLYGGDQTPASMASVKIWHIPDGGLSMHLTDWLVELHGHKRRVAYIEWHPTAENILFSAGFDYLILADSLESVPIPNQTDEDEFNTGHSSNAERLQEILVVGLGHKGSRVLMLLRCDDDQLMIYQAYKYPRGNLRMRFSRMSLSFPFGYRSGVVSCPEESYEASTLREGVRQLRYFGNVGGYSGVFVCGRTPYMLVLGSRGELRIHPLSDNAATFASFNNTNCPQGFMYFNDKGSLRICALPTHLSFEASWAVRKVPLRETPHSVTFHLESRTYCLVSSVSQSTHSYYKFNGEDKEKSNENKGDRFPYPMMERFSIMLFSPVSWEVIPNTRIELDEWEHVTCLKNVSLSYEGTRSGLRGYIAIGTNYNYGEDITSRGRILIYDIIDVVPEPGQPLTKNRFKEIYAKEQKGPVTALTQVLGFLISAVGQKIYLWQLKDNDLVGVAFIDTQIYVHRMLAVKNLILVADVYKSISLLRIPTPTQDAVARVQGPQDGSGDVVAQSIYDMQFMIDNTSLGFLIYDMQFMIDNTSLGFLVSESEGNFAMYMHQPQARESYGGQRLIRKCDYHLGQRVHAMFRLAARGERQTHVTMFS-
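
The relationship between the nucleotide and protein sequences above can be illustrated with a minimal translation sketch. It assumes the standard genetic code (NCBI translation table 1) and renders a stop codon as "-", as the protein record does. The function name `translate` is hypothetical and is not the pipeline used to build the gene set. Only the first 30 and last 15 bases of DPOGS206430-TA are used here, to keep the example short.

```python
# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence, one codon at a time.

    Stop codons are written as stop_symbol; codons containing
    non-ACGT characters are written as 'X'.
    """
    cds = cds.upper().replace("\n", "")
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 bases and last 15 bases of DPOGS206430-TA, copied from the record.
print(translate("ATGTTTTCTATTTGTCGTCAAACTCACCCT"))  # MFSICRQTHP
print(translate("ACTATGTTCAGTTAG"))                 # TMFS-
```

Both fragments reproduce the corresponding ends of DPOGS206430-PA, including the trailing "-" that marks the TAG stop codon.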