Monarch geneset OGS2.0

DPOGS213695
Transcript        DPOGS213695-TA  852 bp
Protein           DPOGS213695-PA  283 aa
Genomic position  DPSCF300219 + 375151-376537
RNAseq coverage   92x (Rank: top 62%)
Annotation
Heliconius      HMEL016431        1e-112  78.60%
Bombyx          BGIBMGA010619-TA  1e-64   61.71%
Drosophila      Trf-PA            5e-64   54.00%
EBI UniRef50    UniRef50_Q17JE5   4e-73   66.50%  TATA binding protein, putative n=1 Tax=Aedes aegypti RepID=Q17JE5_AEDAE
NCBI RefSeq     XP_001654561.1    8e-74   66.50%  TATA binding protein, putative [Aedes aegypti]
NCBI nr blastp  gi|157126257      2e-72   66.50%  TATA binding protein, putative [Aedes aegypti]
NCBI nr blastx  gi|157126257      3e-69   66.50%  TATA binding protein, putative [Aedes aegypti]
Group
Gene Ontology    GO:0003677  5e-92    DNA binding
                 GO:0006355  5e-92    regulation of transcription, DNA-dependent
                 GO:0006367  5e-92    transcription initiation from RNA polymerase II promoter
                 GO:0005488  1.1e-29  binding
KEGG pathway     aag:AaeL_AAEL002079  2e-73
                 K03120 (TBP, tbp) maps-> Huntington's disease
                                          Basal transcription factors
InterPro domain  [86-283]   IPR000814  5e-92    TATA-box binding protein
                 [104-201]  IPR012294  4.7e-32  Transcription factor TFIID, C-terminal/DNA glycosylase, N-terminal
                 [116-195]  IPR012295  1.1e-29  Beta2-adaptin/TATA-box binding, C-terminal
Orthology group  MCL18799  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213695-TA
ATGGATTCAGATCTTTCTGTTCCTGACTGCCCTATGGCGGAAATCGTGAGTGATGTTAATGTACAAAATATGCAAACTGATGGACACACTCCAGTTAATGAAAAACAAAACCCACAACGAGGGAATACGAAGCTTTTAACAAGTGGGACACCTAAGCCTCATTCCTTAGAAAATGTGCCATCAACTCCACAAATTTCAGGAGATATAACTTTGACACCAACACATCGAACATTCACTCCACAAACTCCATCTGTGAATCCACATAATTCTATGAGTGCCATTACCCCAATGGCAAGTGCAGTTAATCAAGCAAAAAATAGTATAAAATTTCAAAATTGTATTTCTACAGTAAGTTTAGATTGTGAACTGAATTTGTTAGACATATACTGTAGAACAAGGTTTTCAGAATACAACCCTGCTAGATTTAATGGAGTCGTTATGAAGATTTTGGAACCGCGAGCCACAGCCCTAGTATTTAGATCTGGTAAAATAGTCTGTACGGGAGCCAAAAATGGACATGACTCATATATCGCAGCTAGAAAATTTGCAAGAATTATTCAGAAACTTGGTTTTCCGGTGAAATTTGTTGATTTCAAAGTTCTTAATTTTCTAGCAACAGCGGATTTAAGATTTCCCATAAAACTGGAAGCGCTACAGCAAGCTCACGGTCAGTTCACTTCATATGAACCGGAACTTTTCTCTGGCCTCGTTTATAGAATGATACGACCAAGGGTTGTGTTGCTAATATTTGTTAATGGAAAAATGGTTATAACAGGCGCTAAAACTAATCAAGAAGTTTATGAAGCAGTTGACATAATACACCCCATTTTAAGAAGTTACAAGAAAAATTGA

Protein sequence:

>DPOGS213695-PA
MDSDLSVPDCPMAEIVSDVNVQNMQTDGHTPVNEKQNPQRGNTKLLTSGTPKPHSLENVPSTPQISGDITLTPTHRTFTPQTPSVNPHNSMSAITPMASAVNQAKNSIKFQNCISTVSLDCELNLLDIYCRTRFSEYNPARFNGVVMKILEPRATALVFRSGKIVCTGAKNGHDSYIAARKFARIIQKLGFPVKFVDFKVLNFLATADLRFPIKLEALQQAHGQFTSYEPELFSGLVYRMIRPRVVLLIFVNGKMVITGAKTNQEVYEAVDIIHPILRSYKKN-