Monarch geneset OGS2.0

DPOGS213670
Transcript    DPOGS213670-TA    921 bp
Protein    DPOGS213670-PA    306 aa
Genomic position    DPSCF300219 - 405090-406346
RNAseq coverage    206x (Rank: top 46%)
Annotation
Heliconius    HMEL016424    1e-157    93.23%
Bombyx    BGIBMGA010619-TA    2e-150    89.94%
Drosophila    Tbp-PA    8e-116    64.87%
EBI UniRef50    UniRef50_P29037    7e-98    81.45%    TATA-box-binding protein n=84 Tax=Eukaryota RepID=TBP_MOUSE
NCBI RefSeq    NP_001037059.1    5e-149    89.94%    TATA-box-binding protein [Bombyx mori]
NCBI nr blastp    gi|1729912    9e-153    91.86%    DNA-binding protein [Spodoptera frugiperda]
NCBI nr blastx    gi|1729912    4e-159    92.18%    DNA-binding protein [Spodoptera frugiperda]
Group
Gene Ontology    GO:0003677    7.4e-172    DNA binding
    GO:0006355    7.4e-172    regulation of transcription, DNA-dependent
    GO:0006367    7.4e-172    transcription initiation from RNA polymerase II promoter
    GO:0005488    7.4e-39    binding
KEGG pathway    ame:550692    7e-137
    K03120 (TBP, tbp)    maps-> Huntington's disease
    Basal transcription factors
InterPro domain    [33-305]    IPR000814    7.4e-172    TATA-box binding protein
    [139-219]    IPR012295    7.4e-39    Beta2-adaptin/TATA-box binding, C-terminal
    [127-214]    IPR012294    4.8e-33    Transcription factor TFIID, C-terminal/DNA glycosylase, N-terminal
Orthology group    MCL12374    Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213670-TA
ATGGATCAAATGCTTCCAAGTCCATATAACATACCAGGTATTGGTACTCCCTTGCACCAACCTGAAGAAGATCAACAGATTTTACCAAATGCTATGCAACAGCAACAACAACAACAACGCACAACAACAACTTCCTTGGTATCAATGGGTTCATCGCCGCTCGTGGGTTTTGGCGCCTCTATAATGGGCACACCTCAGAGAACGATGCATACGTATGCTCCAACAGCCAGCTATGCAACACCTCAACAGATGATGCAACCTCAAACACCGCAAAACTTAATGTCTCCATTGATAACGGGTTCAAGTATAGCCGGTCAACAGATCCTAAACCAAATGAGTCCTGCACCCATGACACCTATGACTCCACATTCTGCAGATCCTGGAATATTACCTCAGTTGCAAAATATAGTTTCCACAGTAAATCTTAATTGCAAATTAGACCTTAAAAAGATAGCCCTACATGCCCGCAATGCTGAATATAACCCTAAACGTTTTGCTGCCGTCATTATGAGGATACGAGAACCAAGGACTACAGCATTGATATTTTCTTCTGGCAAAATGGTTTGCACCGGTGCCAAGAGTGAAGAAGACTCCCGTCTTGCTGCAAGAAAATATGCCAGAATTATACAAAAGCTAGGATTTACGGCAAAATTTTTGGATTTTAAAATTCAAAACATGGTTGGAAGTTGCGATGTTAAATTTCCAATTCGCCTGGAAGGCTTAGTCCTAACACATGGACAATTTAGCTCTTACGAACCTGAACTCTTCCCTGGACTCATCTACCGAATGGTGAAACCTAGAATAGTTTTACTGATATTTGTATCAGGAAAAGTGGTACTAACAGGTGCAAAAGTTCGCCAAGAAATATATGAAGCTTTTGATAATATTTACCCAATATTGAAAAGTTTTAAGAAACAATAA

Protein sequence:

>DPOGS213670-PA
MDQMLPSPYNIPGIGTPLHQPEEDQQILPNAMQQQQQQQRTTTTSLVSMGSSPLVGFGASIMGTPQRTMHTYAPTASYATPQQMMQPQTPQNLMSPLITGSSIAGQQILNQMSPAPMTPMTPHSADPGILPQLQNIVSTVNLNCKLDLKKIALHARNAEYNPKRFAAVIMRIREPRTTALIFSSGKMVCTGAKSEEDSRLAARKYARIIQKLGFTAKFLDFKIQNMVGSCDVKFPIRLEGLVLTHGQFSSYEPELFPGLIYRMVKPRIVLLIFVSGKVVLTGAKVRQEIYEAFDNIYPILKSFKKQ-
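
Consistency check (illustrative):

The transcript and protein entries in this record can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence that uses the standard genetic code and ends in a single stop codon (shown as "-" in the protein sequence above). The helper names `translate` and `expected_protein_length` are hypothetical and are not part of the geneset or of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; stop codons become '*'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def expected_protein_length(cds_length_bp: int) -> int:
    """Amino acid count for a CDS of the given length, excluding the stop codon."""
    codons, remainder = divmod(cds_length_bp, 3)
    if remainder:
        raise ValueError("CDS length is not a multiple of 3")
    return codons - 1


# First eight and last four codons of DPOGS213670-TA, copied from the record.
print(translate("ATGGATCAAATGCTTCCAAGTCCA"))
print(translate("AAGAAACAATAA"))
print(expected_protein_length(921))
```

Under these assumptions, 921 bp corresponds to 307 codons, that is 306 residues plus the stop, which agrees with the 306 aa reported for DPOGS213670-PA.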