Monarch geneset OGS2.0

DPOGS203330
Transcript: DPOGS203330-TA; 1920 bp
Protein: DPOGS203330-PA; 639 aa
Genomic position: DPSCF300003 - 375564-382133
RNAseq coverage: 481x (Rank: top 26%)
Annotation
Heliconius: HMEL007275; 0.0; 79.33%
Bombyx: BGIBMGA002091-TA; 0.0; 74.68%
Drosophila: CTCF-PA; 3e-134; 55.17%
EBI UniRef50: UniRef50_G6DIK8; 0.0; 100.00%; Putative CTCF-like protein n=3 Tax=Endopterygota RepID=G6DIK8_DANPL
NCBI RefSeq: XP_001606075.1; 0.0; 56.06%; PREDICTED: similar to CTCF-like protein [Nasonia vitripennis]
NCBI nr blastp: gi|345495056; 0.0; 55.83%; PREDICTED: transcriptional repressor CTCFL-like [Nasonia vitripennis]
NCBI nr blastx: gi|340716607; 0.0; 56.24%; PREDICTED: transcriptional repressor CTCFL-like [Bombus terrestris]
Group
Gene Ontology: GO:0003676; 1.3e-12; nucleic acid binding
KEGG pathway:
InterPro domain: [308-339] IPR013087; 1.3e-12; Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group: MCL11721; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203330-TA
ATGCCGCCTCCAGACAAGAAAACTTCGAAGGAGGAGACCATTTTACAAACCTATTTGAACTCTTTTGATCAAGACAATGAAACAACAACCACCATTGCTGTTACTGGTGAAGTTGAAGAAGAAGCTGATACAGGGGTGACATATTTTGTTGATGAAGAAGGCAGATATTACTATCAACCGGCCGGTGACAATCAGAACATAGTGTCACTGCAGCCCGAAGTCACACAGGACAATGACGGAGAGATTACTGAGGATGCACAGATGCTTGTTGATGGTGAAAGTTACCAGACAGTGACCCTCGTGCCCTCGGACACGGGGAATGGTGAAGTCAGCTATGTGTTAGTAATGCAAGAGGAAAATAAGCCTGTTGTCAACTTAAATATAAAGGTGGATCAGGAGGAGAAAGGTGCTGATGTCTACAACTTTGAGGATGAAGAAGAAGCGGCTGAGGAGGGTTCTGACGACGGAGATGATGCACCAAAGACTAAAACTACTAAGAGGAATAAGTATGTTCGTCCCTACTTCACTTGCAGCTTCTGTTCGTACACGAGTCATAGACGCTACTTGCTTCTGCGTCACATGAAATCCCATTCGGAGGAAAGGCCGCACAAGTGCAGCGTGTGTGAACGAGGCTTTAAAACAATAGCCTCACTCCAGAACCACGTGAATATGCACAACGGTGTTAAACCTCACGTGTGTAAATATTGCAAGAGTCCGTTCACTACATCTGGTGAACTCGTTAGACACGTGCGGTATCGCCACACACATGAAAAGCCTCATAAATGCTCTGAATGCGACTACGCCTCCGTGGAACTGTCCAAATTGAGGCGTCACGTCCGCTGTCACACCGGAGAGAGACCTTATCAGTGCCCTCACTGTACCTACGCTTCACCAGATACTTTCAAATTGAAGAGACACTTGCGTACACACACCGGAGAGAAGCCGTACAAGTGTGATCATTGCAACATGTGCTTCACGCAATCTAACTCTCTGAAAGCTCACAAACTCATACACAATGTGGCCGAGAAGCCTGTGTTTGCTTGCGAGCTCTGTCCGGCCAGATGCGGTCGGAAAACAGATCTACGCATACACGTCCAAAAACTACACACGTCGGATAAACCACTTAAATGCAAGCGCTGTGGTAAATCCTTCCCAGACAGATATTCCTGCAAGATTCATAACAAGACACACGAAGGGGAGAAATGTTTCAAATGTGAAATGTGCCCGTACGCCTCGACGACGCTACGTCATCTGAAGACACACATGCTGAAACACACGGACGAGAAACCCTTCGCTTGTGAGCAGTGCGACCACTCGTTTAGGCAGAAGCAACTGCTGCGTCGCCACCAGAATTTGTACCACAATCCCCATTACGAGCCGAAGCCACCCAAGGAGAAAACGCACACGTGTCACGAGTGCAAGCGGACCTTCGCCCACAAGGGTAACCTGATCCGTCATCTTGCCATCCACGACCCTGAGTCTGGACACCACGAACGAGCACTGGCTCTGAAAATCGGCAGGCAGAGGAAGATCAAGACCAACACGGGAGGACCATCTCAGGTTGTGGATTCTGACGACGATATGATGAAGCTGGGCCTCAATAAGGAGATCAAACGCGGTGAACTGGTCACAGTAGCTGACGGTGATGGTCAACAGTATGTGGTGTTAGAGGTGATTCAACTCGAGGACGGGACGGAACAACAAGTGGCTGTGGTGGCACCAGAGTTCATGGAAGAGGAACAAGAAGAGGAAGAGGAAGAAGAGGAACAGGAAATTGAAACTCCTAAACAGAAAATATTAAACAGAACCATTAAACTAGAGAAGGAAGTTGACACATGCTTTGGATTTGATGAAGAAGAGGAGGAGGAGGCAGAGGAAGACATAACCTACAGCGACAAAGTAGTGTTGCGTTTAGTGTAA

Protein sequence:

>DPOGS203330-PA
MPPPDKKTSKEETILQTYLNSFDQDNETTTTIAVTGEVEEEADTGVTYFVDEEGRYYYQPAGDNQNIVSLQPEVTQDNDGEITEDAQMLVDGESYQTVTLVPSDTGNGEVSYVLVMQEENKPVVNLNIKVDQEEKGADVYNFEDEEEAAEEGSDDGDDAPKTKTTKRNKYVRPYFTCSFCSYTSHRRYLLLRHMKSHSEERPHKCSVCERGFKTIASLQNHVNMHNGVKPHVCKYCKSPFTTSGELVRHVRYRHTHEKPHKCSECDYASVELSKLRRHVRCHTGERPYQCPHCTYASPDTFKLKRHLRTHTGEKPYKCDHCNMCFTQSNSLKAHKLIHNVAEKPVFACELCPARCGRKTDLRIHVQKLHTSDKPLKCKRCGKSFPDRYSCKIHNKTHEGEKCFKCEMCPYASTTLRHLKTHMLKHTDEKPFACEQCDHSFRQKQLLRRHQNLYHNPHYEPKPPKEKTHTCHECKRTFAHKGNLIRHLAIHDPESGHHERALALKIGRQRKIKTNTGGPSQVVDSDDDMMKLGLNKEIKRGELVTVADGDGQQYVVLEVIQLEDGTEQQVAVVAPEFMEEEQEEEEEEEEQEIETPKQKILNRTIKLEKEVDTCFGFDEEEEEEAEEDITYSDKVVLRLV-
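
The listed lengths agree with each other: a 1920 bp coding sequence holds 640 codons, that is 639 amino acids plus one stop codon, which this record writes as a trailing "-". The sketch below is one way to check that relationship and to translate a stretch of the transcript. It assumes the standard genetic code and a sequence that starts in frame; the names `translate` and `expected_protein_length` are hypothetical helpers for illustration and are not part of the database.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence (assumes the standard code).

    Stop codons are written as `stop_symbol`, matching the "-" that ends
    the protein sequence in this record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(cds_length_bp: int) -> int:
    """Number of amino acids for a CDS that ends in a single stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


# First 27 bases and last 9 bases of DPOGS203330-TA, copied from the record.
print(translate("ATGCCGCCTCCAGACAAGAAAACTTCG"))
print(translate("TTAGTGTAA"))
print(expected_protein_length(1920))
```

Translating the full DPOGS203330-TA sequence the same way should reproduce DPOGS203330-PA, including the final "-".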