Monarch geneset OGS2.0

DPOGS214612
Transcript: DPOGS214612-TA, 3090 bp
Protein: DPOGS214612-PA, 1029 aa
Genomic position: DPSCF300050 - 44944-86777
RNAseq coverage: 1267x (Rank: top 10%)
Annotation
Heliconius: HMEL022570, E-value 0.0, identity 70.81%
Bombyx: BGIBMGA001765-TA, E-value 8e-133, identity 73.85%
Drosophila: cnc-PC, E-value 5e-72, identity 45.73%
EBI UniRef50: UniRef50_D6W6P7, E-value 2e-92, identity 47.47%, Cap-n-collar n=2 Tax=Tribolium castaneum RepID=D6W6P7_TRICA
NCBI RefSeq: NP_001164113.1, E-value 1e-93, identity 47.47%, cap-n-collar [Tribolium castaneum]
NCBI nr blastp: gi|282165741, E-value 3e-92, identity 47.47%, cap-n-collar [Tribolium castaneum]
NCBI nr blastx: gi|340709531, E-value 1e-111, identity 33.43%, PREDICTED: segmentation protein cap'n'collar-like [Bombus terrestris]
Group
Gene Ontology:
  GO:0003677, 4.3e-33, DNA binding
  GO:0006355, 4.3e-33, regulation of transcription, DNA-dependent
  GO:0043565, 8.5e-06, sequence-specific DNA binding
  GO:0003700, 8.5e-06, sequence-specific DNA binding transcription factor activity
KEGG pathway
InterPro domain:
  [831-901] IPR008917, 4.3e-33, Eukaryotic transcription factor, Skn-1-like, DNA-binding
  [881-945] IPR004827, 8.5e-06, Basic-leucine zipper (bZIP) transcription factor
Orthology group: MCL14509, Insect specific

Nucleotide sequence:

>DPOGS214612-TA
ATGATCGCGCTGAAGAAGCTGTACGGAGACGAGCTTTTGCGGCTGGCGCTTGTGCTTAGCTTGCTCAAAGCGAACCCCGAAGAATACCACGAGCTTGAAACCCAGCAGCTAGCTGGGCTTCATATCACCAACGGAACGGATTGGTCCTTGGAACACGAAGCCAGGACGCTGATAAGACCACGACATGTACATCCCAAGTCCCTGGATCATATCCTGATGAACTACGAGAGACAGCTGTTCGAAGAGCTGAACTCCTTAGGGCGGTACAATTATTTAGAGACCAATGACAGACCTTGGTACAATCAGCCCGTCTATACATACCTGTTGAACGACGTTGCCACTGATACGATCGCACCAGCCCTCGGCCAAGAAGAGTCACAACAAGTAGAAGACAACGTGTCCGCTGACGAGAACGTGGCGCAGGAAGTCAAAGTCGAGGAGGAGAAAGAAGAGACTGCCGTCGTGACGCTATCAGCGGATTTTCTCCAGAACCACACTGAGAGCGACATCTTCGCTGAAATAGCTTCTAGATCTTTCGACGTGAACGAGTTTCTCAACCCTGAGATGCAAATCAAAAAGGAAGAAGATTTGATGATAGAGGTCAAGAAGGAGAAAGAAATCGAGGATGTGTTCAGTGATAACGCCATAGATGACTTTGTTCCTTACTTCACGGCCAGGTCGGAGAAGATTGAGCTGGGGGAAGCGAATTCGGTGTTCAACCAAAATATAGCTGATCTCCAGGACTACGACGAGTTCGATGTCAAGGAACTAAATAACCTGGAACATCTGGACGTTAAGCAACAGCAGGAGAACCTAGAACAGTATCTGCTTGAGAACACTCCTTTGGATTCGGAGGTTTACGAACTAGCGCCTATGTTTGAAGAGGAGATGATTGTTAAGAGGGAAAGAGCCAGCACTAGTTTCACTTCAGCCAGTTCCAGCGGTGTGAGCGAGATGGACTCTAGCTCCAAGGATCTGGATGTGAAGCTCGAGCCAGACGAAGGGCATCACAGTGGCGAGGAGCTCACACAGGAGGACATGAATCTGATCGAGGTGCTGTGGAAACAAGACGTGGACATGGGCTTCTCGCTGGAGGATCCTCTCCAAATGAGCAATTATATCAAGGACGGTCCCCAGGCGACCAGGGTCAATGTCTCCGAGGAGCTGAAGAATAAACAGGAGCAGATAGAGAAGGTGAAAGCCACCCTCCTGGATGAGAAGGAGGACGATCCCTGGGCCGGACTCTCGTATACAGTGGACACTGAGACTGGTGAGTACGTGATCCAGGGTGATTTGCCAGGTGAGCTGGTCAACGAGGAGCGATTCAACCTGCTGGAAGAGACGCTCAGGCTGGTGGAGCTCGGAGACGAGGGCGACGCTAAGGATGAACAGCCACAGGCGGTAGAGGGCAGCAGTAGCGGCAGCATGCTCCACCCGGCGATGCGGCACGTGCCGCACCATCCGCTCGCTCACTATCACAACCAGCAACAAGTGTCCGACACCCGTGAGCGATGTCGCCGGGTGATAAATGAAGTGTCGCGGGCCGCGGACCTCACACGAACAGCGACGGACAGACAGACACGCTGGGGCTGGGCCCCAAATGCACTTTTAACAAATGATATGTCGACAGTTGCGGCGAGCAGCGCCCACGGAGGGGCGGGTTACGCTCCAAACTACCACGCCCCCATACCACCTATACCGGAAAAACATCACGAGGCTTACGGTGCCCCAGCGCCGCTGGACGGAGCGTATAAGGTAGAAGCAGCCCATCACCCGCAGCAGCACGATGGACTGTACTATCAGACACCTACGGAACCACAGCAAGACGGCTTCCTCCAGTCCATCCTGAACGACGAGGATCTCCAGCTGATGGATATGGCGATGAATGAGGGCATGTACACGATGCGGATGTTGGACGGCGCGCCCACGGTGCACCAGACACACGCGCACATGCCCGTGGCGGCCGAACGTGATTCGGCATCAGACAGCGCTGTGTCGTC
CATGGGCTCGGAGCGCGTACCCTCGCTCTCTGACGGCGAGTGGTGCGACGGGAGTGACTCCGCCCAGGAGTTCCACAGTTCAAAATTCCGACCGTACGAGGCTGCTTACGGCAGAGAGAGATCCCACGCGCCACAGAAGAAACATCACATGTTCGGGAAGCGATCCTTCCAGGAACAGCCGTCCCAAGAAACCAGACCGGTTGTGAAATACGAATGCGAACAAACATACCATGAGATGCATATGCATGCAGATTACACCCCTCGCCAGCACATACCGCCCCAGCTAGGTGTGCAGCCGACGTTGGACATCAATTCACCACACTCAAGCCACGCATTGCAACATACAACGCTGCCGAGCCCGAACCCGCCGCGATTCGGGTTCAGTTCGGGAGATAGAGTGAGACACAACCACACATACAGCGCAGCCCTGCCGCCCACAGAGGAGAGACTACCCACGAGAGATAAGAGAGTCCGCCGTCTAACCGACGGCAGTACTTCCGACAGCGGCAGCGGACATCTCAGTAGAGACGAGAAGAGAGCGAAGGCTTTAGGTATACCGCTGGAGGTCCAGGACATCATCAACCTGCCCATGGACGAGTTCAACGAGCGGCTCTCCAAACACGACTTGAGCGAGGCGCAGCTGTCGCTGATCCGAGACATACGGAGGCGGGGCAAGAACAAGGTGGCAGCACAGAACTGCCGGAAACGGAAGCTGGACCAGATCACCTCCCTGGCAGACGAGGTCCGCACGGTCCGCGACAGGAAGGCCCGCACGCAGAGAGACAGACACAACTTGCTGGCCGACAGGCAGAAGCTCAAGGAGAGGTTCGCCGCGCTCTACAGACACGTGTTCCAGCACCTCCGCGACCCTGAAGGACGACCCTTGTCCTCCAGCCAATACTCCCTACAACAAGCGGCTGACGGCAGCGTAGTTCTCGTGCCCAGGATGGGAGGAGCGACTCATTCGTTCACATTCACCGCCAGACCACTCCATGAACCGGACGGAGGAGGACCTCGAGCGGAAGAACAACTACGAGCACTAGTGCGGCACCGACCGAGGACAGCTCGCGGATGGGTGGCCGCCTTTTAA

Protein sequence:

>DPOGS214612-PA
MIALKKLYGDELLRLALVLSLLKANPEEYHELETQQLAGLHITNGTDWSLEHEARTLIRPRHVHPKSLDHILMNYERQLFEELNSLGRYNYLETNDRPWYNQPVYTYLLNDVATDTIAPALGQEESQQVEDNVSADENVAQEVKVEEEKEETAVVTLSADFLQNHTESDIFAEIASRSFDVNEFLNPEMQIKKEEDLMIEVKKEKEIEDVFSDNAIDDFVPYFTARSEKIELGEANSVFNQNIADLQDYDEFDVKELNNLEHLDVKQQQENLEQYLLENTPLDSEVYELAPMFEEEMIVKRERASTSFTSASSSGVSEMDSSSKDLDVKLEPDEGHHSGEELTQEDMNLIEVLWKQDVDMGFSLEDPLQMSNYIKDGPQATRVNVSEELKNKQEQIEKVKATLLDEKEDDPWAGLSYTVDTETGEYVIQGDLPGELVNEERFNLLEETLRLVELGDEGDAKDEQPQAVEGSSSGSMLHPAMRHVPHHPLAHYHNQQQVSDTRERCRRVINEVSRAADLTRTATDRQTRWGWAPNALLTNDMSTVAASSAHGGAGYAPNYHAPIPPIPEKHHEAYGAPAPLDGAYKVEAAHHPQQHDGLYYQTPTEPQQDGFLQSILNDEDLQLMDMAMNEGMYTMRMLDGAPTVHQTHAHMPVAAERDSASDSAVSSMGSERVPSLSDGEWCDGSDSAQEFHSSKFRPYEAAYGRERSHAPQKKHHMFGKRSFQEQPSQETRPVVKYECEQTYHEMHMHADYTPRQHIPPQLGVQPTLDINSPHSSHALQHTTLPSPNPPRFGFSSGDRVRHNHTYSAALPPTEERLPTRDKRVRRLTDGSTSDSGSGHLSRDEKRAKALGIPLEVQDIINLPMDEFNERLSKHDLSEAQLSLIRDIRRRGKNKVAAQNCRKRKLDQITSLADEVRTVRDRKARTQRDRHNLLADRQKLKERFAALYRHVFQHLRDPEGRPLSSSQYSLQQAADGSVVLVPRMGGATHSFTFTARPLHEPDGGGPRAEEQLRALVRHRPRTARGWVAAF-
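
The transcript and protein records above can be cross-checked against each other: 3090 bp corresponds to 1030 codons, that is, 1029 residues plus one stop codon, which matches the trailing "-" in the protein record. The following is a minimal sketch of that check in Python, assuming the standard genetic code (NCBI translation table 1) applies to this gene. The function name `translate` and the `stop_symbol` parameter are illustrative and are not part of the geneset or its tooling.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: this nuclear gene uses the standard code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon.

    Whitespace and line breaks are ignored, so a FASTA body that wraps
    over several lines can be passed in directly. Stop codons are written
    as `stop_symbol` to mirror the trailing "-" in the protein record.
    Ambiguous bases such as N are not handled and raise KeyError.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(residues)


# First seven codons of DPOGS214612-TA.
print(translate("ATGATCGCGCTGAAGAAGCTG"))  # MIALKKL
# Last five codons of DPOGS214612-TA, ending in the TAA stop.
print(translate("GTGGCCGCCTTTTAA"))        # VAAF-
# Length bookkeeping: 1029 residues plus one stop codon.
print(3 * (1029 + 1))                      # 3090
```

Passing the full DPOGS214612-TA sequence to `translate` and comparing the result with the DPOGS214612-PA record would extend the same check to the whole entry.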