Monarch geneset OGS2.0

DPOGS200534
Transcript        DPOGS200534-TA  5460 bp
Protein           DPOGS200534-PA  1819 aa
Genomic position  DPSCF300119 - 313633-329211
RNAseq coverage   363x (Rank: top 33%)
Annotation
Heliconius      HMEL009373        0.0  90.99%
Bombyx          BGIBMGA007052-TA  0.0  83.15%
Drosophila      Taf1-PE           0.0  63.09%
EBI UniRef50    UniRef50_E2AC06   0.0  57.47%  Transcription initiation factor TFIID subunit 1 n=17 Tax=Coelomata RepID=E2AC06_CAMFO
NCBI RefSeq     XP_001811347.1    0.0  60.47%  PREDICTED: similar to transcription initiation factor TFIID subunit 1 [Tribolium castaneum]
NCBI nr blastp  gi|270003291      0.0  60.51%  hypothetical protein TcasGA2_TC002507 [Tribolium castaneum]
NCBI nr blastx  gi|380018182      0.0  59.76%  PREDICTED: transcription initiation factor TFIID subunit 1-like [Apis florea]
Group
Gene Ontology    GO:0005515   2e-35  protein binding
KEGG pathway     tca:658493   0.0
                 K03125 (TFIID1, KAT4) maps-> Basal transcription factors
InterPro domain  [551-1012]   IPR022591  2.6e-149  Transcription initiation factor TFIID subunit 1, domain of unknown function
                 [1363-1501]  IPR001487  2e-35     Bromodomain
                 [4-65]       IPR009067  5.9e-20   TAFII-230 TBP-binding
Orthology group  MCL10174     Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species
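
The lengths and coordinates in this record can be cross-checked with simple arithmetic. The sketch below assumes that the transcript is coding sequence only, including the stop codon, that genomic coordinates are 1-based and inclusive, and that InterPro ranges are 1-based residue positions on the protein. The helper names are illustrative and do not come from any database tool.

```python
# Values copied from the record; helper names are hypothetical.
TRANSCRIPT_BP = 5460
PROTEIN_AA = 1819
GENOMIC_START, GENOMIC_END = 313633, 329211
INTERPRO_DOMAINS = {
    "IPR022591": (551, 1012),
    "IPR001487": (1363, 1501),
    "IPR009067": (4, 65),
}


def cds_length(aa_count, with_stop=True):
    """Nucleotides needed to encode aa_count residues, optionally plus one stop codon."""
    return 3 * (aa_count + (1 if with_stop else 0))


def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


def domain_lengths(domains, protein_length):
    """Return each domain's length, rejecting ranges that fall outside the protein."""
    lengths = {}
    for name, (start, end) in domains.items():
        if not 1 <= start <= end <= protein_length:
            raise ValueError(f"{name} range {start}-{end} is outside the protein")
        lengths[name] = span_length(start, end)
    return lengths


if __name__ == "__main__":
    print(cds_length(PROTEIN_AA) == TRANSCRIPT_BP)      # True
    print(span_length(GENOMIC_START, GENOMIC_END))      # 15579
    print(domain_lengths(INTERPRO_DOMAINS, PROTEIN_AA))
```

Under these assumptions the 1819 residues plus a stop codon account for exactly 5460 bp, and the gene spans 15579 bp on the scaffold, so roughly two thirds of the locus would be intronic.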

Nucleotide sequence:

>DPOGS200534-TA
ATGTGTGATAGTGATGACCCAAATGAAAACGGCCGGTCTGGAATGGACCTGACTGGCTTTTTGTTTGGTAATATAGACGAGAGTGGACAGCTCGAAGATAATGGGATACTAGATGATGAATCCAAACGCATGCTGTCGTCTCTTAACAGATTGGGACTTGGTTCTATAATATCAGAGGTGTTGGATGAAAATGATATTATAAAGGAAGAGGAAGAAAAAGATTATTCAGAAAAAAGTCCATCAGCTGTTGATTTCTTTGATATTGATGATGCTGTTGATGAAGAAAAGGACATAGAAACAAAAGTAGTCACTAATGATGATGTATCCGTCAAAAATGAATGTGAAAATCAGTCCGAGGGTATTAAAAGTGATTCTTCGCAGAATGAAAAAGATACTGTTGAAAACTGGAATGAGCCAAATGTTGAGAAAAATGAGAGTAACAATGATGAAGGTTATGAGGGTGATATGGAAGGTGATGGTGAACTTATGCCTCCCCCGTCAACTGTGCCGAGACAGAAAACTGAAAAGCCTAAGAAATTGGAGACACCTCTAGCTGCAATGTTACCATCAAAGTATGCTGGTGTTGATGTAACAGAGTTATTCCCTGACTTTAGACCAGACAAAGTGCTATTGTTTTCGCGGTTGTTTGGTCCGGGGAAACTTTCCAGTCTGCCTCAGATATGGCGAGGCGTGAAGAAACGTAGGCGAAGAAAACGTAGTGGCAGCACCAGTAGTGACACACCACCACAGATAGAATTACAGGAAATTCAGTATGCTAGTGACGATGAAGAGAAATTTCTCAAGCCTATGGAAGAAACGCCTCAAGTATCCACACAAGAGACGAACTCACAGTCCTCTGGCGATAAATCTCAAAAACCAACACCCGCTCACTGGCGTTTTGGTCCAGCTCAGGCTTGGTATGATATGCTCAATGTACCAGAGACGGGGGAAGGTTTCGATTATGGATTTAAATTAAAGGCGCAATCTCCAGAACGAGAAAAAACAGGTGAAGAAGTTATTGATGAGGAGTGTGAGAAAGCTGACAGTCCTGATAGTGGGTTCCCTGATGACGCTTTCCTTATGGTCTCCCAATTACAATGGGAGGACGATGTCGTCTGGGATGGCAGTGAGATAAAACATAAGGTGCTGGCTCGATTAAACAGTAAGAGTAACGCTGCTGGCTGGGTCCCTACTTCTGGAAGTCGAACCGCTCAACAGTTTTCTCATGGACGACCGCCTCCCGCGCCGCTAACAGCTAAGCCATCCTCAAGCTCGTCATCTACACCGATTAATCCTGGAACGAATGGCGAGGGTGAGGACAATACATGGTACAGTATTTTCCCCGTGGAAAATGAAGAATTGGTCTATGGCACGTGGGAAGATGAAGTTATATGGGACGCTGAGAACATGCCCAAGATACCCAAACCAAAGATACTTACTCTGGACCCCAATGACGAGAATATTATATTGGGTATACCCGATGACGTGGATCCATCTAAGATTACGAGAGAACGAGGACCGGCTCCAAAAGTGAAAATACCTCATCCACATGTTAAGAAGTCTAAGATATTGCTCGGTAAGGCGGGCGTCATAAACGTGTTGCAAGAAGACGCTCCCCCTCCACCGCCAAAGTCACCAGACAGAGACCCCTTTAATATATCCAATGACGTATATTACCAGCCGAAATCTCAGAACCCTCGTCTTAAAGTTGGCGGAGGTCAGTTAATACAGCACAGTACTCCGGTGGTTGAGCTCAGGGCACCGTTCATCCAGACCCACATGGGCCCCGCGAGACTCCGCGCCTTCCATCGACCAGCCATGAGAAAATTGTCCTATGGACCTCTAGCCGCCCCAGGACCACATCCCGTGCAACCGCTACTGAAGCACATTAAGAAAAAGGCCAAACAACGCGAAGCCGAACGTCTAGCATCAGGTGGTGGGGATGTCTTCTTCATGAGAACTCCTGAAGACCTAAGCGGCAGAGATGGTGATTTAGTACT
GGTTGAATTCTGTGAAGAACATCCCCCACTCATAAGTCAAGTGGGCATGTGTACTAAAATCAAGAACTATTACAAGCGTACAGCTACTAAAGATAACGGTCCTAAGCCAATGAAATACGGTGAAATAGCCTACGCTCATACATCACCGTTCTTGGGTATATTGCCGCCGGGCGCGACGCAACCGGTCGTTGAGAACAATATGTACCGAGCACCGATATACGAACACACACTTTCCTCCACAGATTTCCTCATAATAAGAACGAGACAAGCATACTATATCCGAGAAGTTGATGCATTATTTGTTGCTGGTCAAGAATGTCCACTGTATGAGGTGCCTGGACCAAATTCCAAGAGAGCCAACAATTTTGTTAGAGACTTTTTGCAGGTTTTCATATACAGATTATTCTGGAAATCTCGGGATAATCCTCGCCGTATAAAGATGGATGATATTAAAAGAGCTTTTCCGTCTCATTCAGAAAGCTCAATACGTAAACGTCTCAAACTATGCGCTGACTTCAAAAGAACCGGAACAGATTCTAACTGGTGGGTGATAAAGCCAGATTTCCGTCTACCCTCCGAAGAGGAGATTCGAGCTATGGTTTCCCCGGAGCAGTGTTGTGCGTACTTTAGTATGGCGGCGGCGGAACAACGTCTCAAGGACGCCGGCTATGGGGAGAAGTTCATATTCACGCCCCAGGAGGACGATGATGAGGAACTACAGCTCAAAATGGATGACGAAGTGAAAGTAGCTCCTTGGAACACAACTCGCGCCTACATCCAGGCTATGCGAGGAAAATGTCTCCTACAGCTGACGGGCGTTGCAGATCCCACTGGTTGCGGTGAAGGGTTTTCATACGTCCGGGTTCCCAACAAACCAACCCAGCAACCGAACGAGGAACAGCAACCTAAGAGAACTGTGACCGGTACTGACGCTGATCTGAGAAGGCTTAGTTTGAATAACGCTAAGGCCTTGTTGAGGAAATTTGGTGTGCCTGAGGAAGAAATAAAAAAACTGTCTCGCTGGGAAGTCATCGACGTGGTGAGAACTTTGTCAACGGAGAAAGCCAAGGCCGGTGAAGAAGGAATGACTAAATTCTCTAGAGGAAATAGATTTTCAATAGCCGAACACCAGGAGAGATATAAAGAGGAATGTCAACGTATATTCGAATTACAAAACCGTGTACTGACAAGCACAGAAGTGCTTAGTACAGACGAGGCGGAAAGCTCCGTCAGTGAAGAATCAGACCTCGAAGAAATGGGAAAGAATTTAGAAAATATGCTGGCCAATAAGAAGACTACCGAACAATTGAGCATGGAAAGAGAGGAGGCTGAGAGAGCTGAACTGCGAAAAATGATATTAGGACAATCTGAAAAGAAACCTCAGATAAATCAACAGGACCAACAGCAATCTTCAAACCAAGGTCGCGTCCTTCGCATCGTAAGGACGTTCCGAAACGCCTCCGGTCAAAGATACACGCGTGTCGAGTTAGTAAGAAAGGCAGCAGTCATAGAGGCCTACACCAAAATAAGGTCCACGAAAGACGACGCTTTCATACGACAATTCGCCACAATGGATGAATCACAGAAGGAAGAAATGAAGAGAGAAAAGAGAAGGATTCAGGAACAATTGAGACGTATCAAGAGGAACCAAGAAAGGGAGAGACTGGCTGGAAATGTATCAGTACCTGGTTCATCTATCAGTGATAGTATGAACATGTCGACCATGTCTGACCTAGGGTCCAAATCACCTGGCCTGATCCCATTGGGTCAGATCAAACAGGAACCGGATCTTCACACACCATCTAGACGACGGGCCAAATTGAAGCCCGATTTGAAATTAAAGTGCGGAGCGTGCGGTCAAGTTGGTCACATGCGTACAAACAAGGCGTGTCCGTTATACACAGGGGGCGGGGCGGTGACCCCCGAACATGATGAACCAGCCCCCGAACCTGACGACCTTGACCTTGGATATGTGGATGGAACCAAGCTCACACTACCTT
CTAAATTTGTTAAGCAATCTTCGGAGGAGCTTCGTCGTCGTAGCGGCAGTCGTCGCGAGGTCCGAGCTGCTGGGCGAACCAAGAGACGCGGCACTGCCAGTGATTCTTGTGATTATCTCGTCAAAAGACCAGCGGAAAGACGCAGAACTGATCCACTAGTGACTCTCTCGTCCCTATTAGAAGACGTATTGAACACTATGAGACACCTCCCCGATGTACAACCCTTCCTGTTCCCAGTCAATCCGAAGCTGGTAGCGGATTACTATCGTATCGTATCGCGGCCGATGGATCTACAGACTATAAGGGACAATTTGAGGCAAAAACACTATCAAAGCCGCGAAGAATTCTTGGCTGATGTCAACCAGATCGTAGAAAATTCTACTCTTTACAATGGTCCTACGAGCAGCTTGACGGTAGCAGCGCAGCGTATGATGCAACGCTGTTTCGAAAAACTGGCAGAGAAAGAGGAACAGTTTATGAAACTGGAGAAACAAATCAACCCGCTGCTAGATGATAACGACCAGGTGGCTTTATCGTTTATTTTTGAGAATCTCTTAACAACGAAACTTAAAGTTATGCCGGAAGCTTGGCCATTCCTGAAACCAGTTAACAAGAAACAGGTCAAAGATTACTATAATGTTATTAAAAAACCAATCGACATGGAGACGATAGGAAAGAAAATACAAGCTCATAAGTACCACAGTCGGGAGGAATTTCTTCGGGATATACAGTTGTTGGTGGATAACTGTCGTGCTTACAACGGACTTAACTCACAGTTCACGAGACAGGCCGAGGCCGTACTCAAAGTAACCCAAGAAGCCCTAGAACAGTTCGATGAGCACGTGAGTCAGCTGGAGGCGAATATAGCACGCGTTCAGCAAAAGATGTTAGAGGATGCTGAGCAGTCTGAGTTAGAAGATGACCCGCCGCCGCCCAGCGACGAGAAAAGAGGCCGCGGAAGACCGAAGAAACACAAACCAAGCACATCAACATCGATGGCAGATGATGCCACACAGAGGAAACGTAGTCGTGTCAAGAAGGACCAAAATAGTCTGGTTGATGATTTACAATATTCTGACAGTGGAAATTCAGGTCTAGAGGAAGTCGAACAAAAAGATGCAGCTGAAGCTATGGTACAATTATCTGTTCGACCTGATGACGACATTCCTGATTCGAGCTTCGATACTTCTGAGTTCCTCATCAAGCGCGAGGTTCCAGACGAGCCGCACGAGCTGGTCGACCTGGACAGTCACAGCACTGACTATACTTACCCCGCCGTTGTTAAGGAGGAACCAATCGAGCCGGACATGAACATGGATCCGCCAATGATGGACGCGCTCCCTGAACACACGCAGTTTAAAGAAGAGCCGCTTGAGAACTGGCAGCCGGACCCTGTAATTCAGGATGACCTAAGAGTAACGGACAGCGAAGAAGAAGCAGAAGATGGACTTTGGTTTTAG

Protein sequence:

>DPOGS200534-PA
MCDSDDPNENGRSGMDLTGFLFGNIDESGQLEDNGILDDESKRMLSSLNRLGLGSIISEVLDENDIIKEEEEKDYSEKSPSAVDFFDIDDAVDEEKDIETKVVTNDDVSVKNECENQSEGIKSDSSQNEKDTVENWNEPNVEKNESNNDEGYEGDMEGDGELMPPPSTVPRQKTEKPKKLETPLAAMLPSKYAGVDVTELFPDFRPDKVLLFSRLFGPGKLSSLPQIWRGVKKRRRRKRSGSTSSDTPPQIELQEIQYASDDEEKFLKPMEETPQVSTQETNSQSSGDKSQKPTPAHWRFGPAQAWYDMLNVPETGEGFDYGFKLKAQSPEREKTGEEVIDEECEKADSPDSGFPDDAFLMVSQLQWEDDVVWDGSEIKHKVLARLNSKSNAAGWVPTSGSRTAQQFSHGRPPPAPLTAKPSSSSSSTPINPGTNGEGEDNTWYSIFPVENEELVYGTWEDEVIWDAENMPKIPKPKILTLDPNDENIILGIPDDVDPSKITRERGPAPKVKIPHPHVKKSKILLGKAGVINVLQEDAPPPPPKSPDRDPFNISNDVYYQPKSQNPRLKVGGGQLIQHSTPVVELRAPFIQTHMGPARLRAFHRPAMRKLSYGPLAAPGPHPVQPLLKHIKKKAKQREAERLASGGGDVFFMRTPEDLSGRDGDLVLVEFCEEHPPLISQVGMCTKIKNYYKRTATKDNGPKPMKYGEIAYAHTSPFLGILPPGATQPVVENNMYRAPIYEHTLSSTDFLIIRTRQAYYIREVDALFVAGQECPLYEVPGPNSKRANNFVRDFLQVFIYRLFWKSRDNPRRIKMDDIKRAFPSHSESSIRKRLKLCADFKRTGTDSNWWVIKPDFRLPSEEEIRAMVSPEQCCAYFSMAAAEQRLKDAGYGEKFIFTPQEDDDEELQLKMDDEVKVAPWNTTRAYIQAMRGKCLLQLTGVADPTGCGEGFSYVRVPNKPTQQPNEEQQPKRTVTGTDADLRRLSLNNAKALLRKFGVPEEEIKKLSRWEVIDVVRTLSTEKAKAGEEGMTKFSRGNRFSIAEHQERYKEECQRIFELQNRVLTSTEVLSTDEAESSVSEESDLEEMGKNLENMLANKKTTEQLSMEREEAERAELRKMILGQSEKKPQINQQDQQQSSNQGRVLRIVRTFRNASGQRYTRVELVRKAAVIEAYTKIRSTKDDAFIRQFATMDESQKEEMKREKRRIQEQLRRIKRNQERERLAGNVSVPGSSISDSMNMSTMSDLGSKSPGLIPLGQIKQEPDLHTPSRRRAKLKPDLKLKCGACGQVGHMRTNKACPLYTGGGAVTPEHDEPAPEPDDLDLGYVDGTKLTLPSKFVKQSSEELRRRSGSRREVRAAGRTKRRGTASDSCDYLVKRPAERRRTDPLVTLSSLLEDVLNTMRHLPDVQPFLFPVNPKLVADYYRIVSRPMDLQTIRDNLRQKHYQSREEFLADVNQIVENSTLYNGPTSSLTVAAQRMMQRCFEKLAEKEEQFMKLEKQINPLLDDNDQVALSFIFENLLTTKLKVMPEAWPFLKPVNKKQVKDYYNVIKKPIDMETIGKKIQAHKYHSREEFLRDIQLLVDNCRAYNGLNSQFTRQAEAVLKVTQEALEQFDEHVSQLEANIARVQQKMLEDAEQSELEDDPPPPSDEKRGRGRPKKHKPSTSTSMADDATQRKRSRVKKDQNSLVDDLQYSDSGNSGLEEVEQKDAAEAMVQLSVRPDDDIPDSSFDTSEFLIKREVPDEPHELVDLDSHSTDYTYPAVVKEEPIEPDMNMDPPMMDALPEHTQFKEEPLENWQPDPVIQDDLRVTDSEEEAEDGLWF-
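
The protein record can be reproduced from the nucleotide record by translating it codon by codon. The sketch below is a minimal illustration, assuming the standard genetic code (NCBI translation table 1) and a sequence that starts in frame; `read_fasta_sequence` and `translate` are hypothetical helper names, and the toy input is only the first 30 bases of DPOGS200534-TA, split over two lines the way the full sequence is.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G and the first base varying slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta_sequence(text):
    """Join the sequence lines of a single-record FASTA string, skipping the header."""
    return "".join(
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.startswith(">")
    )


def translate(cds, stop_symbol="*"):
    """Translate an in-frame coding sequence; stop codons become stop_symbol.

    Raises ValueError if the length is not a multiple of 3 and KeyError
    for codons containing characters other than A, C, G, T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(
        CODON_TABLE[cds[i:i + 3]].replace("*", stop_symbol)
        for i in range(0, len(cds), 3)
    )


if __name__ == "__main__":
    toy_fasta = ">DPOGS200534-TA\nATGTGTGATAGT\nGATGACCCAAATGAAAAC\n"
    print(translate(read_fasta_sequence(toy_fasta)))  # MCDSDDPNEN
```

The protein record marks the stop codon with `-` rather than the more common `*`, so passing `stop_symbol="-"` should make the output directly comparable with DPOGS200534-PA.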