Monarch geneset OGS2.0

DPOGS206449
Transcript: DPOGS206449-TA, 1920 bp
Protein: DPOGS206449-PA, 639 aa
Genomic position: DPSCF300070 - 289646-298277
RNAseq coverage: 108x (Rank: top 60%)
Annotation
Heliconius: HMEL011940; 8e-117; 75.78%
Bombyx: BGIBMGA005377-TA; 2e-135; 72.38%
Drosophila: Ssl1-PA; 2e-121; 55.12%
EBI UniRef50: UniRef50_Q9VNP8; 3e-119; 55.12%; GH08526p n=23 Tax=Metazoa RepID=Q9VNP8_DROME
NCBI RefSeq: XP_001845822.1; 5e-131; 62.64%; TFIIH basal transcription factor complex p44 subunit [Culex quinquefasciatus]
NCBI nr blastp: gi|170035936; 1e-129; 62.64%; TFIIH basal transcription factor complex p44 subunit [Culex quinquefasciatus]
NCBI nr blastx: gi|157115005; 4e-130; 62.33%; btf [Aedes aegypti]
Group
Gene Ontology:
GO:0006281; 3.9e-85; DNA repair
GO:0006355; 3.9e-85; regulation of transcription, DNA-dependent
GO:0008270; 3.9e-85; zinc ion binding
GO:0005634; 7.9e-27; nucleus
GO:0005515; 5.5e-12; protein binding
KEGG pathway: cqu:CpipJ_CPIJ004242; 1e-130
K03142 (TFIIH2) maps to: Basal transcription factors; Nucleotide excision repair
InterPro domain:
[67-257]; IPR007198; 3.9e-85; Ssl1-like
[525-624]; IPR004595; 7.9e-27; TFIIH C1-like, C-terminal
[60-238]; IPR002035; 5.5e-12; von Willebrand factor, type A
Orthology group: MCL10695; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206449-TA
ATGGCAGATGATGAGCAGGATCCTAAAGAATATAGATGGGAGACGGGTTATGAGAAAACTTGGGAAGCTATCAAGGAAGATGAAGATGGTTTGGTGGAAGGCCTTGTCGCCGAATTCGCACAAAAAGCAGCTCGAAAAGCTATTGCTCCCAAACGTGGGCCGGTTCGTCTTGGTATGATGAGACATTTGCTGGTGGCAATAGACTGTTCCGAAGCCATGACATCTCAGGACTTAAAGCCTACGAGATTTCTTTGCACCTTAAAGCTTCTTGAAAAATTTGTGGAAGAATTTTTTGATCAAAATCCTTTAAGTCAACTTGGTATAGTTACAATGAAAAATAAAAGGGCTGAAAAAATTACAGAATTATCAGGAAATGTGAGAAAGCATATCAAAGCTGTTCAAGGTCTGTCAAATCTTGCATTAACGGGAGAGCCCTCTCTGCAGAACACTCTGGAGTTAGCAGGAAGAACATTGAGGCCGTTACCGGGACATGCATCAAGGGAGCTCCTGGTTTTATTTGGTTCACTTACAACATGTGACCCAGGAGATATAACTACCACCATACAGACTTTAAAAACGGATGGCATTAGATGTTCCATAATTGGTCTTGCAGCGGAGGTCAGGATATGTAAGAAATTGTGCCAAGATACTGGTGGTGAATACGGGGTGGTATTAGATGATGTGCATTATCGCTCGTTGCTGTTAGAACAGACCTCCCCTCCAGCCAGAGCGCGCGCTCTTGACGCAGGTCTTGTTAAAATGGGCTTTCCTCATACACCTCAACCCTCATCTGACCAGCCCGAGTCAGATCCACCGATAACATTATGCATGTGTCACCTTGAGGAGGGCGAGGGTGTGGTTGGCGAGGGTCATCTTTGTCCTCAATGTCGCAGCAAGTACTGTTCACTGCCAGCACAGTGTCGAACATGTGGACTGACACTCGCGTCAGCACCACATCTAGCCAGGTCATACCACCATCTATTCCCTGTGGAGCCTTTCGAGGAGTTACCAAATGAGGGTCAAGCTCAATTCTGTTTTGGCTGCCTCAGATCATTCACGGAAAATGATAAACAGGTTATATATTGTTACATCAGAACATTGGTAACTATTCTGTCAAATCTTGCATTAACGGGAGAGCCCTCTCTGCAGAACACTCTGGAGTTAGCAGGAAGAACATTGAGGCCGTTACCGGGACACGCATCAAGGGAGCTCCTGGTTTTATTTGGTTCACTTACAACATGTGACCCAGGAGACATAACTACCACCATACAGACTTTAAAAACGGATGGCATTAGATGTTCTATAATTGGTCTTGCAGCGGAGGTCAGGATATGTAAGAAATTGTGCCAAGATACTGGTGGTGAATACGGGGTGGTATTAGATGATGTGCATTATCGCTCGTTGCTGTTAGAACAGACCTCCCCTCCAGCCAGAGCGCGCGCTCTTGACGCAGGTCTTGTTAAAATGGGCTTTCCTCATACACCTCAACCCTCATCTGACCAGCCCGAGTCAGATCCACCGATAACATTATGCATGTGTCACCTTGAGGAGGGCGAGGGTGTGGTTGGCGAGGGTCATCTTTGTCCTCAATGTCGCAGCAAGTACTGTTCACTGCCAGCACAGTGTCGAACATGTGGACTGACACTCGCGTCAGCACCACATCTAGCCAGGTCATACCACCATCTATTCCCTGTGGAGCCTTTCGAGGAGTTACCAAATGAGGGTCAAGCTCAATTCTGTTTTGGCTGCCTGAGATCATTCACGGAAAATGATAAACAGATATATCGTTGTCGTCGTTGTACCGAATTCTATTGCTGGGAGTGTGAAGGTGTGGTGTCAAGTACACTTCACGTTTGTGGTTCCTGTGCCTCCAGACCGCAGCTCTACCAGAGACTACCAACGGCTGAAGGGAGAGACTGA

Protein sequence:

>DPOGS206449-PA
MADDEQDPKEYRWETGYEKTWEAIKEDEDGLVEGLVAEFAQKAARKAIAPKRGPVRLGMMRHLLVAIDCSEAMTSQDLKPTRFLCTLKLLEKFVEEFFDQNPLSQLGIVTMKNKRAEKITELSGNVRKHIKAVQGLSNLALTGEPSLQNTLELAGRTLRPLPGHASRELLVLFGSLTTCDPGDITTTIQTLKTDGIRCSIIGLAAEVRICKKLCQDTGGEYGVVLDDVHYRSLLLEQTSPPARARALDAGLVKMGFPHTPQPSSDQPESDPPITLCMCHLEEGEGVVGEGHLCPQCRSKYCSLPAQCRTCGLTLASAPHLARSYHHLFPVEPFEELPNEGQAQFCFGCLRSFTENDKQVIYCYIRTLVTILSNLALTGEPSLQNTLELAGRTLRPLPGHASRELLVLFGSLTTCDPGDITTTIQTLKTDGIRCSIIGLAAEVRICKKLCQDTGGEYGVVLDDVHYRSLLLEQTSPPARARALDAGLVKMGFPHTPQPSSDQPESDPPITLCMCHLEEGEGVVGEGHLCPQCRSKYCSLPAQCRTCGLTLASAPHLARSYHHLFPVEPFEELPNEGQAQFCFGCLRSFTENDKQIYRCRRCTEFYCWECEGVVSSTLHVCGSCASRPQLYQRLPTAEGRD-
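
The reported lengths are consistent with each other: 1920 bp corresponds to 640 codons, that is, 639 residues plus a terminal stop codon, which the protein record writes as "-". The following is a minimal sketch, not part of the original record, of how that relationship could be checked by translating the transcript with the standard genetic code. The function name `translate` and the choice of "X" for unrecognized codons are assumptions made for illustration.

```python
# Illustrative sketch: translate a coding sequence with the standard genetic code.
# The helper name and the "X" placeholder are assumptions, not part of the record.

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard genetic code).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}


def translate(nucleotides, stop_symbol="-"):
    """Translate complete codons in frame 1; stop codons become stop_symbol."""
    seq = nucleotides.upper().replace("U", "T")
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" marks an unrecognized codon
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    # First 30 nt and last 12 nt of DPOGS206449-TA, copied from the record above.
    print(translate("ATGGCAGATGATGAGCAGGATCCTAAAGAA"))
    print(translate("GGGAGAGACTGA"))
```

Applied to the full DPOGS206449-TA sequence, the output can be compared directly against the DPOGS206449-PA record, including the trailing "-".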