Monarch geneset OGS2.0

DPOGS201327
Transcript        DPOGS201327-TA   906 bp
Protein           DPOGS201327-PA   301 aa
Genomic position  DPSCF300176 + 562647-565952
RNAseq coverage   758x (Rank: top 17%)
Annotation
Heliconius      HMEL012402        9e-125  85.38%
Bombyx          BGIBMGA003121-TA  3e-138  80.33%
Drosophila      Trf2-PF           2e-79   64.71%
EBI UniRef50    UniRef50_Q16II1   1e-90   71.24%  Tata-box binding protein n=4 Tax=Neoptera RepID=Q16II1_AEDAE
NCBI RefSeq     XP_001663841.1    3e-91   71.24%  tata-box binding protein [Aedes aegypti]
NCBI nr blastp  gi|157136872      5e-90   71.24%  tata-box binding protein [Aedes aegypti]
NCBI nr blastx  gi|157136872      4e-85   71.24%  tata-box binding protein [Aedes aegypti]
Group
Gene Ontology    GO:0003677  2.9e-124  DNA binding
                 GO:0006355  2.9e-124  regulation of transcription, DNA-dependent
                 GO:0006367  2.9e-124  transcription initiation from RNA polymerase II promoter
                 GO:0005488  9.2e-25   binding
KEGG pathway     ame:412919  2e-86
                 K03120 (TBP, tbp) maps-> Huntington's disease
                                          Basal transcription factors
InterPro domain  [66-267]  IPR000814  2.9e-124  TATA-box binding protein
                 [66-267]  IPR015445  2.9e-124  TATA-Box binding protein-like
                 [69-168]  IPR012294  6.2e-30   Transcription factor TFIID, C-terminal/DNA glycosylase, N-terminal
                 [71-157]  IPR012295  9.2e-25   Beta2-adaptin/TATA-box binding, C-terminal
Orthology group  MCL12628  Single-copy universal gene

Nucleotide sequence:

>DPOGS201327-TA
ATGGCTACACTCATCCAAGAGAATGGTATGAAGTTGAGCAAAGGAACCCATGGTGTAGTCGTTAACCATGGCATGACGACGCACGGTGTCCCAAATCACATGGTGCCAGACCATGAATACTGTGAATCGGGTCAGGCCGAGCAACCCGCACAACAGTGCCTCGACGCGGAGAGCGAGCCGCATCAGCCGCCCGTCGAGGAGGAGGAGGAAACCCCAGAAATTGACATAATGATAAACAATGTTGTGTGCAGTTTTAGTGTTAAGTGTCACCTGAACCTGAGACAGATCGCTTTAAACGGTGTTAATGTTGAGTTCAGACGGGAGAACGGAATGGTTACTATGAAACTCCGGCGTCCATACACCACGGCCTCTATATGGTCCTCTGGTCGGGTCACGTGCACGGGCGCCACTAGCGAGGATCAAGCTAAAGTGGCCGCCCGCCGCTACGCACGTGCCCTACAGAAGTTGGGCTTCCAAGTACGCTTTCGCAACTTTCGTGTTGTAAACGTACTCGGAACTTGTAGGATGCCCTTTGGCATTCGAATTATAGCATTCTCAAAGAAATACAAAGAAGCAGACTATGAGCCTGAACTTCATCCCGGGGTAACATACAAGTTGTACAATCCCAAGGCGACTCTCAAGATATTTTCCACCGGTGGTGTGACAATAACTGCTCGAAGCGTTAGCGATGTCCAGTCGGCCGTAGAGCGCATCTTCCCGCTGGTGTACGAGTTCCGTAAGCCTCACTCGCCCGCCGATGAGGAGAAGCTGCGTCAGAGGCGGGCGGCGCGATCGCGGGGCGCCGGCCCACAACCCGCAGAGGAGCGGCCCCTTGAACAAGCTGCACCACAGACTGACGACCCCATGCACCTGGTCACACTGTCCGACGACGACGCCTGGGAGTGA

Protein sequence:

>DPOGS201327-PA
MATLIQENGMKLSKGTHGVVVNHGMTTHGVPNHMVPDHEYCESGQAEQPAQQCLDAESEPHQPPVEEEEETPEIDIMINNVVCSFSVKCHLNLRQIALNGVNVEFRRENGMVTMKLRRPYTTASIWSSGRVTCTGATSEDQAKVAARRYARALQKLGFQVRFRNFRVVNVLGTCRMPFGIRIIAFSKKYKEADYEPELHPGVTYKLYNPKATLKIFSTGGVTITARSVSDVQSAVERIFPLVYEFRKPHSPADEEKLRQRRAARSRGAGPQPAEERPLEQAAPQTDDPMHLVTLSDDDAWE-