Monarch geneset OGS2.0

DPOGS203797
Transcript: DPOGS203797-TA (2031 bp)
Protein: DPOGS203797-PA (676 aa)
Genomic position: DPSCF300010 + 1629681-1635559
RNAseq coverage: 561x (Rank: top 23%)
Annotation
Heliconius: HMEL012502; 0.0; 72.67%
Bombyx: BGIBMGA003704-TA; 2e-166; 59.12%
Drosophila: Hsf-PD; 2e-82; 58.14%
EBI UniRef50: UniRef50_B0M1K6; 0.0; 70.04%; Heat shock transcription factor n=7 Tax=Eumetazoa RepID=B0M1K6_MAMBR
NCBI RefSeq: XP_002006278.1; 1e-88; 60.98%; GI20955 [Drosophila mojavensis]
NCBI nr blastp: gi|167735908; 0.0; 70.04%; heat shock transcription factor [Mamestra brassicae]
NCBI nr blastx: gi|167735908; 0.0; 69.76%; heat shock transcription factor [Mamestra brassicae]
Group
Gene Ontology:
  GO:0005634; 1e-58; nucleus
  GO:0006355; 1e-58; regulation of transcription, DNA-dependent
  GO:0043565; 1e-58; sequence-specific DNA binding
  GO:0003700; 1e-58; sequence-specific DNA binding transcription factor activity
KEGG pathway:
InterPro domain:
  [10-113] IPR000232; 1e-58; Heat shock factor (HSF)-type, DNA-binding
  [10-113] IPR011991; 9e-39; Winged helix-turn-helix transcription repressor DNA-binding
Orthology group: MCL15758; Insect specific

Nucleotide sequence:

>DPOGS203797-TA
ATGCGTTCAGTTGTGGAAATCGGGGCAAGTGTCCCCGCTTTTTTGGGAAAATTGTGGAAATTATTAAATGATACAGAAACGAATCATTTAATATCTTGGAGTCCCAGTGGAAAGACATTTGTTATAAAGAATCAAGCTGATTTTGCAAGGGAGCTGTTACCACTATATTATAAACACAACAATATGGCTAGTTTCATCAGGCAATTGAATATGTATGGGTTCCATAAAATAACCTCAGTAGAAAATGGTGGTTTGAGGTATGAAAAAGATGAAATTGAGTTTTCACATCCCTGTTTTATGAGAGGACATGCATATCTATTGGAACATATAAAAAGAAAAATTGCCAATCCCAAGTCTATAGTGGCAAGCAGTGAAAGTGGTGAAAAAATTCTTTTGAAGCCAGAAATAATGAACAAAGTGTTAGCTGATGTGAAGCAAATGAAAGGGAAACAGGAGAGTCTGGATGCTAAGTTCAGTGCGATGAAGCAAGAAAATGAGGCGCTATGGAGGGAAGTAGCAATACTACGTCAAAAGCATATTAAACAACAACAAATTGTTAACAATCTCATACAATTCCTGATGTCGTTGGTCCAACCATCAAGAGCTCCCAGTTCCACTGGTAACAATGTTGGAGTAAAGAGGCCATATCAATTGATGATTAATAACGCGGCACATAACTGCGGTGATAGCTCATACCCTGGTAGGCTTAAGAATATTAAAATTGACAAGGACATTGCATTGGAAGATTTAAGTGAAGAAAACTTGGAGGATGGGCCCACTATACATGAATTGGTACATGACGACACATTGCACAATGAAGTATCTCAGGATTCATTGGACGATAACTTTGTGTCGGTTGATTTAGCAAACAACCCACTTATTGCCAACTCACACAATAACTCTGATCCAACAACATCCTCTCGGTACCATGTAACAATGGAAGATGGTGAAGACCTAGAAACAGATCGTTCAAGATTATCATTGCCTATAGTCAATTCCAATGGTTTATGGAAACGTGAGACCCAGCCCATAGTTTCGTCCCCATCCCCTACAATGGCATCTGTGTCTCCAGTAGGTCAACAGTCAGCTGAGAATGTTAATGCCCACATAACAATATCACCGGGCACATCAAAGACGAAGGCTCGAAGCAACTGTAGAAGTATTTTAGCGAACAAAAATGTAATGTCAACAAGCAACTTTAACTCTATAAACCCATCAGCGGATTTTAAGCTACCAGCTGAGATTTTCGCTAGCGACGACTCTGTAAGTGATGTGGCGGCTACAGAGGAAGTTCTGCAGGATCTGGTATGTGATCAGGTCATCTCAGCCAAAGACAAGATGTTAGGTGGTGTAAATATAAAAATAGAGAAACCATTGGACTGTAAGAGTGGGAAAAAGTCGAAAAAGTCGAAGGATACCAACGACACCTGCTGCTTGAACCTAGCCGACATCAAGACTGAATTGCAGGACGACTTTGATTGGAACAATATGACACTTGCCACCGTTAATAACTCTAATATTAACAGGCAGCAGACAGTGTATAGGAGAGATAACTGCCAAAATCGGGAAGATATAACTTCCCTGTTTGGATCAAATTCCAACAAGAACGATATCGACGATCATTTGGATTCAATGCAAACGGATTTGGATTCGTTGAAGGAGTTGTTGAGAGGTGATACTTACGCGTTGGACACAAATACATTATTAGGGCTTTTCGGATCAGATGATCCTTTCTATGGACTCTCTTACAATCCGTCGAATGATCGCGCTAAGACCTGCAGTGGCGCTATGAAGCTGAAAGGTGAGGTAACGAATGTAAGCGGTGACGACACACGCGCACAGAGCCCATTCGAAGATGACGCTGAAGGGAATCAGTTAATATCGTATACAGAGAATATTCCAGACTTTGAGGATATAAATATGCCGGAATTGGAGGGCGAGAACTCTCAAGACTGCATCCCGAGTCCCAGCAGCTCGACCTTGAATACACCACAAGTACA
ACTGCAGTCACCATCATATACGAGACCTTGA

Protein sequence:

>DPOGS203797-PA
MRSVVEIGASVPAFLGKLWKLLNDTETNHLISWSPSGKTFVIKNQADFARELLPLYYKHNNMASFIRQLNMYGFHKITSVENGGLRYEKDEIEFSHPCFMRGHAYLLEHIKRKIANPKSIVASSESGEKILLKPEIMNKVLADVKQMKGKQESLDAKFSAMKQENEALWREVAILRQKHIKQQQIVNNLIQFLMSLVQPSRAPSSTGNNVGVKRPYQLMINNAAHNCGDSSYPGRLKNIKIDKDIALEDLSEENLEDGPTIHELVHDDTLHNEVSQDSLDDNFVSVDLANNPLIANSHNNSDPTTSSRYHVTMEDGEDLETDRSRLSLPIVNSNGLWKRETQPIVSSPSPTMASVSPVGQQSAENVNAHITISPGTSKTKARSNCRSILANKNVMSTSNFNSINPSADFKLPAEIFASDDSVSDVAATEEVLQDLVCDQVISAKDKMLGGVNIKIEKPLDCKSGKKSKKSKDTNDTCCLNLADIKTELQDDFDWNNMTLATVNNSNINRQQTVYRRDNCQNREDITSLFGSNSNKNDIDDHLDSMQTDLDSLKELLRGDTYALDTNTLLGLFGSDDPFYGLSYNPSNDRAKTCSGAMKLKGEVTNVSGDDTRAQSPFEDDAEGNQLISYTENIPDFEDINMPELEGENSQDCIPSPSSSTLNTPQVQLQSPSYTRP-
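
The transcript and protein lengths reported in this record can be cross-checked against each other. The following is a minimal Python sketch, not part of the original record: the helper names `parse_fasta` and `cds_matches_protein` are hypothetical, and the check assumes that the transcript is a complete coding sequence ending in one stop codon (which the trailing `-` in the protein sequence appears to mark).

```python
def parse_fasta(text):
    """Return {header: sequence}, joining sequence lines that were wrapped."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # append a wrapped sequence line
    return records


def cds_matches_protein(n_bp, n_aa):
    """True if n_bp nucleotides encode n_aa residues plus one stop codon.

    Assumption: the transcript is coding sequence only (no UTRs).
    """
    return n_bp == 3 * (n_aa + 1)


# Lengths as reported in the record: 2031 bp transcript, 676 aa protein.
print(cds_matches_protein(2031, 676))
```

When applied to the sequences themselves, `parse_fasta` joins the wrapped nucleotide lines into one string, and the protein string would need its trailing `-` removed (for example with `rstrip("-")`) before its length is compared.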