Monarch geneset OGS2.0

DPOGS210520
Transcript: DPOGS210520-TA  1431 bp
Protein: DPOGS210520-PA  476 aa
Genomic position: DPSCF300186 + 232933-236942
RNAseq coverage: 217x (Rank: top 45%)
Annotation
Heliconius: HMEL016337  0.0  71.86%
Bombyx: BGIBMGA012623-TA  0.0  73.09%
Drosophila: mrn-PA  4e-177  63.40%
EBI UniRef50: UniRef50_Q9VUR1  5e-175  63.40%  Marionette n=25 Tax=Eumetazoa RepID=Q9VUR1_DROME
NCBI RefSeq: XP_971121.1  0.0  68.34%  PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum]
NCBI nr blastp: gi|91094191  0.0  68.34%  PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum]
NCBI nr blastx: gi|91094191  0.0  68.63%  PREDICTED: similar to TFIIH basal transcription factor complex p52 subunit [Tribolium castaneum]
Group
Gene Ontology: GO:0005634  2.2e-234  nucleus
    GO:0006281  2.2e-234  DNA repair
    GO:0006355  2.2e-234  regulation of transcription, DNA-dependent
KEGG pathway: tca:659751  0.0
    K03144 (TFIIH4) maps-> Basal transcription factors
    Nucleotide excision repair
InterPro domain: [10-474] IPR004598  2.2e-234  Transcription factor Tfb2
Orthology group: MCL10575  Single-copy universal gene

Nucleotide sequence:

>DPOGS210520-TA
ATGTCAGAGTCATCTAAGTCAAAGTCTTCTTCAAGTACAAATCCTTCCCCGACACTACAATGCAAGGATCTTCACGAATATTTGAAAAGTCGCTCGCCTCAGTTCCTCGAAACCTTATACAATTATCCCACAATATGCTTAGCGGTATATCGAGAACTGCCTGAGTTAGCCCGACATTTTGTGATCCGATTGCTGTTTGTAGAACAACCTGTCCCTCAAGCAGTGGTTGCTTCTTGGGTCACACAGACTCATGCCAAAGAGCAACACAAAGCTTGCGAGGCTCTGTCGGAGCTGTCTGTGTGGCAGGAGGCTCCTATCCCTGGAGGTCTGCCGGGATGGATGCTTTCACAGTCTTTTAAGAAGAATCTGAAAGTTGCTCTTCTTGGAGGCGGTCGTCCATGGAGCATGTCATCCTCTCTGGAACCGGATGGCAAGGCTCGTGATGTGTCATTCCTGGACGCGTACGCTCTTGAACGCTGGGAATGTGTTCTGCATTACATGGTGGGATCAACACAGACCGAGGGGATCAGTGCTGATGCTGTCAGGATACTGCTACAGGCCGGACTCATGAACAGAGATGCAGAAGATGGCACAGCTGTTATAACCAGAGCTGGATTCCAGTTTCTGCTCCTCAGCACAGCTAAACAGGTATGGCTGTTTCTCCAACACTACCTGCACACGGCGGAAAAACGCAGTCTCAGCGCCGCGGAGTGCCTCGCCTTCCTGTACCAGCTCAGCTTCAGCACACTCGGCAAGGATTACAGCACGGAGGGTATGAGTAACAACATGCTGGTGTTCCTCCAGCACCTGAGGGAGTTTGGGCTCGTCTACCAGAGGAAGCGTAAAGCGGGTCGATTCTACCCGACCCGTCTGGCGCTGAACATCACGTGTGTGAAGGACGGCGTGGCTCCCCTGCAGACGGCCGCCAGCTCGGGCTACATCATCGCGGAGACCAACTACCGGGTGTACGCCTACACCACCAGCGCCCTGCAGGTGGCGCTGCTGGGACTCTTCACTGAACTAGTCTACAGGTTCCCTAACGTGGTGGTAGGCGTCCTGACCCGCGAGTCCGTGCGCGCTGCCCTCCGCGGCGGTATCTCCGCCCAGCAGATCATCACCTACCTGGAGCAGCACTCTCACCCCCAGATGCTGAAATCCGACCAGGGCGGCATCAGGTCGTCGTCGTCGCTGCCGCCGACCGTCCTCGACCAGATCCGTCTGTGGGAGTCGGAGAGGAACCGCTTCACGTACACGGAGGGCGTCGTCTACAACCAGTTCCTGTCCCAGGCCGAGTTCAACGTGCTGAGGGACTACGGCCGCTCGTCCGGCGCGCTAGTGTGGGCGGCGGACAGGACGCGCACCATGGTGGTGGCCAGGGCGGCGCACGACGACGTCAAGAGATACTGGAAGAGATACTCCAAGGCCACTTAG

Protein sequence:

>DPOGS210520-PA
MSESSKSKSSSSTNPSPTLQCKDLHEYLKSRSPQFLETLYNYPTICLAVYRELPELARHFVIRLLFVEQPVPQAVVASWVTQTHAKEQHKACEALSELSVWQEAPIPGGLPGWMLSQSFKKNLKVALLGGGRPWSMSSSLEPDGKARDVSFLDAYALERWECVLHYMVGSTQTEGISADAVRILLQAGLMNRDAEDGTAVITRAGFQFLLLSTAKQVWLFLQHYLHTAEKRSLSAAECLAFLYQLSFSTLGKDYSTEGMSNNMLVFLQHLREFGLVYQRKRKAGRFYPTRLALNITCVKDGVAPLQTAASSGYIIAETNYRVYAYTTSALQVALLGLFTELVYRFPNVVVGVLTRESVRAALRGGISAQQIITYLEQHSHPQMLKSDQGGIRSSSSLPPTVLDQIRLWESERNRFTYTEGVVYNQFLSQAEFNVLRDYGRSSGALVWAADRTRTMVVARAAHDDVKRYWKRYSKAT-
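
The lengths and coordinates reported in this record can be cross-checked against each other. The following is a minimal sketch, not part of the original record. It assumes that the listed transcript is coding sequence only (it begins with ATG and ends with the stop codon TAG), that the trailing "-" in the protein sequence marks the stop and is not counted in the 476 aa, and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical and chosen only for illustration.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 1431
PROTEIN_AA = 476
GENOMIC_START, GENOMIC_END = 232933, 236942


def cds_length_from_protein(n_aa: int) -> int:
    """Coding length in bp for n_aa residues plus one stop codon."""
    return 3 * (n_aa + 1)


def genomic_span(start: int, end: int) -> int:
    """Span in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end - start + 1


expected_bp = cds_length_from_protein(PROTEIN_AA)
span_bp = genomic_span(GENOMIC_START, GENOMIC_END)

# The transcript length matches 476 codons plus a stop codon.
print(expected_bp == TRANSCRIPT_BP)  # True

# The genomic span is longer than the transcript.
print(span_bp)                   # 4010
print(span_bp - TRANSCRIPT_BP)   # 2579
```

Under these assumptions, the 2579 bp difference between the genomic span and the transcript length would be consistent with intronic sequence, although the record itself does not list the exon structure.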