Monarch geneset OGS2.0

DPOGS201062
Transcript: DPOGS201062-TA  1650 bp
Protein: DPOGS201062-PA  549 aa
Genomic position: DPSCF300497 + 37799-42198
RNAseq coverage: 132x (Rank: top 56%)
Annotation
Heliconius: HMEL008766  3e-148  53.61%
Bombyx: BGIBMGA005232-TA  6e-136  58.77%
Drosophila: (none listed)
EBI UniRef50: UniRef50_UPI0001CBA94D  5e-08  27.17%  UPI0001CBA94D related cluster n=2 Tax=unknown RepID=UPI0001CBA94D
NCBI RefSeq: XP_001809288.1  1e-17  24.23%  PREDICTED: similar to predicted protein [Tribolium castaneum]
NCBI nr blastp: gi|189238203  3e-16  24.23%  PREDICTED: similar to predicted protein [Tribolium castaneum]
NCBI nr blastx: gi|189238203  6e-17  24.02%  PREDICTED: similar to predicted protein [Tribolium castaneum]
Group
Gene Ontology: GO:0005634  1.9e-06  nucleus
               GO:0006352  1.9e-06  transcription initiation, DNA-dependent
KEGG pathway: spu:577999  3e-09
              K03131 (TFIID5, TAF6)  maps-> Basal transcription factors
InterPro domain: [57-92]  IPR004823  1.9e-06  TATA box binding protein associated factor (TAF)
Orthology group: MCL22091  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201062-TA
ATGTCTAACATATCCACAAGTTCGAAAAATCATTCCAAGAAAACCAATAGTCGAGAACGTGATAAAAATAAATCCATCGCTTCCGAATCTGGCGGTTCCAGTCAGTGTGATACGCCGCCACTTATTGAAAAGGATGCGACATCTCAAGGGAGGTTTGCAGGAGTCGGAGCAGATTCAGTGCTGTGCATGGCCGAGCAGATTGGAGCCGAAATTAATACTGAGGCAGCTACGAACCTCGCTGAGGACGTCAGCTATAAACTGCGGCAGGTTATCAGTACAATCGCCCTTCACAGCGAGTTGATGAAAAAGTCGTGTGTGGACAGTTGGGATGTTAACACTGTCTTTACATTATCAGACACAAGTCCTGTGGTCGGCTCTTGTTTGCTACAGTATGTAGCTGTGGGTGAAGAAAAATTGTGCTGTGAAGTAGAAAATCTTATAAATATATCAGAATACTCAATGGTGACGCAGAGTTATGTGTTCTCAACACTGCCGACAGTCTCTGTTGAGTGGATAATAGATGAAAAGTGTTTAAACAATAGTAGCAGTATAAGTGCAAACTTACAAAACTATTATACAAAAATTGCTAGAGCGATATTAAATCTGAAAAGGAAACCTAAAGAGATTGCCGTAGAAGATTTAGCTACAAATACCCGTATTGGTCCCATATTTCCGAACCTCTTCAATCTGGCTGTGTTAGTTCTTAACGATGATAATTTAAATGCCCTGAATGTACCAGCGAAGAAACCATTGCAATCCAATGTGTTGGATATGGTGGATGCTTTGTGTTCCAACCCATGCAGCTTGGACACCAACATACAACAACAGTTTCAAAGATTGTTCCCTGTGATGGTGTCAAATATTTTGGGCAACGGTTCGCTAGCAGAGAAGATGGTTGCCATACTCACTAAAATCACAAGAACTTGGCCATCTTTTATAGCCATAGGTAAAGGCATTCTCTTTGACTACTTATCTCAGGGTACTAAAGAGAGACTTACAGCTCCTATGATCCGATGCGTGCTGGCGTTGGGCCGACACGCGCTGGTCGAGTGTCTGGGGGATCATCTGGAACATTTGGACGACTGCGTGCAGGCGCGTGTGGGCGGCCTCAGGGAGACTTATGACCACTCCCAACTTACTATGCAAACGAGGAGGGACCGGAAGGATGCATCCACCTGTCTATTAAGGTCGGAGCCCACGACGTATTTGGAAGATTACGTAATGACTGACTATACTTTGTACGAGATGCTGGGCGACTCGATAATGCCGAGGAGGACATTATTTCCTGTCAAAGAAACAGCGACAAGCGAAAACACGACGAACGAAATCGCCACCGACCTAGAATTCTCTTTCGTTCCAATAAGACCTTTGATAAGAATACCAAAAGTGAGAAACCTGCCGAAAATAAAAACAACCAAAACACAGAGGGAAGCTTGTTTCGAAGTAACTAAATTTAGTGCTAATAATAAAAAAATACACATCAAAATAAAGAACTGTCGCTCGTTAGACGTCAAGGGAAAAGTTGACAGAACTCAAACGTGTTGTGTTAGAAATACGAGTGGCGTTATCGCTAGTGGACTGTTAAATAGATTTAGGTATAAATTAAACAAACCCACCCCTTTTCCGATATTTGGAGCTTTAACACTGTAA

Protein sequence:

>DPOGS201062-PA
MSNISTSSKNHSKKTNSRERDKNKSIASESGGSSQCDTPPLIEKDATSQGRFAGVGADSVLCMAEQIGAEINTEAATNLAEDVSYKLRQVISTIALHSELMKKSCVDSWDVNTVFTLSDTSPVVGSCLLQYVAVGEEKLCCEVENLINISEYSMVTQSYVFSTLPTVSVEWIIDEKCLNNSSSISANLQNYYTKIARAILNLKRKPKEIAVEDLATNTRIGPIFPNLFNLAVLVLNDDNLNALNVPAKKPLQSNVLDMVDALCSNPCSLDTNIQQQFQRLFPVMVSNILGNGSLAEKMVAILTKITRTWPSFIAIGKGILFDYLSQGTKERLTAPMIRCVLALGRHALVECLGDHLEHLDDCVQARVGGLRETYDHSQLTMQTRRDRKDASTCLLRSEPTTYLEDYVMTDYTLYEMLGDSIMPRRTLFPVKETATSENTTNEIATDLEFSFVPIRPLIRIPKVRNLPKIKTTKTQREACFEVTKFSANNKKIHIKIKNCRSLDVKGKVDRTQTCCVRNTSGVIASGLLNRFRYKLNKPTPFPIFGALTL-
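
The listed transcript length (1650 bp) and protein length (549 aa) agree if the transcript is read as a coding sequence that ends in a single stop codon, which matches the ATG start and TAA end of the nucleotide sequence above. A minimal sketch of that check follows; the function name `expected_protein_length` is hypothetical and not part of the database, and the assumption that the whole transcript is coding sequence is ours.

```python
def expected_protein_length(cds_length_bp: int, has_stop_codon: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of the given length.

    Assumes the whole sequence is coding (no UTRs) and, by default,
    that the final codon is a stop codon that is not translated.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if has_stop_codon else codons


# Lengths as listed in this record.
transcript_bp = 1650
protein_aa = 549

# True when the two listed lengths are consistent with each other.
lengths_agree = expected_protein_length(transcript_bp) == protein_aa
print(lengths_agree)
```

The same function can be applied to other records to flag entries whose transcript and protein lengths disagree, for example because of untranslated regions or a missing stop codon.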