Monarch geneset OGS2.0

DPOGS206093
Transcript          DPOGS206093-TA    972 bp
Protein             DPOGS206093-PA    323 aa
Genomic position    DPSCF300028 + 127366-128337
RNAseq coverage     1600x (Rank: top 8%)
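The lengths in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the genomic coordinates are 1-based and inclusive (the record does not state the convention), and the function name `check_lengths` is hypothetical. Under that assumption the genomic span equals the transcript length, which is consistent with a gene model without introns, and the 972 bp transcript holds 324 codons, i.e. 323 residues plus one stop codon.

```python
def check_lengths(start, end, transcript_bp, protein_aa):
    """Return basic consistency figures for a gene model.

    Assumes 1-based, inclusive genomic coordinates (an assumption,
    not something the record states).
    """
    genomic_span = end - start + 1      # inclusive span on the scaffold
    codons = transcript_bp // 3         # whole codons in the transcript
    return {
        "genomic_span": genomic_span,
        "span_matches_transcript": genomic_span == transcript_bp,
        "in_frame": transcript_bp % 3 == 0,
        # one codon is expected to be the stop, which is not a residue
        "codons_match_protein": codons == protein_aa + 1,
    }


# Values copied from the record above.
result = check_lengths(start=127366, end=128337,
                       transcript_bp=972, protein_aa=323)
print(result)
```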
Annotation
Heliconius        (no hit reported)
Bombyx            BGIBMGA006819-TA    2e-148    81.60%
Drosophila        Rad23-PA            1e-74     43.03%
EBI UniRef50      UniRef50_D2CZY1     4e-146    81.60%    Nuclear excision repair protein Rad23 n=4 Tax=Coelomata RepID=D2CZY1_BOMMO
NCBI RefSeq       NP_001164652.1      7e-147    81.60%    nuclear excision repair protein rad23 [Bombyx mori]
NCBI nr blastp    gi|283945482        1e-145    81.60%    nuclear excision repair protein rad23 [Bombyx mori]
NCBI nr blastx    gi|283945482        3e-140    81.60%    nuclear excision repair protein rad23 [Bombyx mori]
Group
Gene Ontology     GO:0005634    2.7e-30    nucleus
                  GO:0006289    2.7e-30    nucleotide-excision repair
                  GO:0043161    9.6e-28    proteasomal ubiquitin-dependent protein catabolic process
                  GO:0003684    9.6e-28    damaged DNA binding
                  GO:0005515    4e-20      protein binding
KEGG pathway      phu:Phum_PHUM577750    1e-93
                  K10839 (RAD23, HR23) maps to:
                      Nucleotide excision repair
                      Protein processing in endoplasmic reticulum
InterPro domain   [219-241]    IPR004806    2.7e-30    UV excision repair protein Rad23
                  [189-253]    IPR015360    9.6e-28    XPC-binding domain
                  [6-77]       IPR000626    4e-20      Ubiquitin
                  [104-176]    IPR009060    5.4e-15    UBA-like
                  [136-170]    IPR000449    2.6e-11    Ubiquitin-associated/translation elongation factor EF1B, N-terminal
                  [135-172]    IPR015940    6e-10      Ubiquitin-associated/translation elongation factor EF1B, N-terminal, eukaryote
                  [191-234]    IPR006636    6.4e-08    Heat shock chaperonin-binding
Orthology group   MCL12480    Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206093-TA
ATGTTACTGACTTTAAAAACTCTTCAACAACAAACTTTTCAGATAGAAATAGACCCTCAAGAGACTGTCAAAGCGCTGAAGCTTAAAATTGAAGTGGAAAAAGGTAAAGATTATGCGGCAGACAATCAAAGACTTATTTATGCGGGTAAGATTCTTCTCGATGATAATAAACTGCATACGTATAATATAGATGAAAAGAAATTCATCGTTATCATGGTGACAAAACCGAAAACTTCCGATAATCAACAAGCGTCGTCGACCTCGGCGCCAGAAGCCGGGGAAAGTGCGTCAACTGAAAGCGGCGACGGCAAAAGTAAAGTGGTCGAAGAAAAACCAAAACCTCAGCCTGCAGCGGAACCTGAGCGTGCATCTGAACCACCAGTAACATCAAATGAGCCAGATTTTGAATCAACAGTGCAAAGCATTATGGATATGGGCTATAATAGACAACAAGTAGAACAAGCTCTCCGTGCTTCATTTAATAATCGTGAAAGAGCTGTAGAATATCTCATAACAGGAATCCCTGAAGAGCTACTTCAAGAACAGGAAGCTGAAGAAAGTGCTGATGAAGACCCCTTGGGCTTTCTTCGGGATCAACCACAGTTTCAACAAATGCGTGCAGTGATTCAACAGAATCCTAATTTACTGAATACTGTGTTGCAGCAAATTGGTCAAACAAACCCAGCTTTACTTCAAGCTATTAGTCAGCATCAACAGGCTTTTGTGAGAATGTTGAATGAGCCTGTGAACCCATCTGCAGCTGGAGCCGTAGCCGAGGAGGCAGTGCCTGACAATCCAGTGCCCCAACAGCCTCAAAATGTTATTCAAGTATCTCCTCAAGACAAAGAGGCTATTGAAAGATTAAAAGCCTTAGGTTTTCCAGAACATATGGTTATTCAAGCGTATTTTGCGTGTGAGAAAAACGAAAATCTTGCTGCAAATTTCTTGTTGTCGCAAAATTTTGATGATTAA

Protein sequence:

>DPOGS206093-PA
MLLTLKTLQQQTFQIEIDPQETVKALKLKIEVEKGKDYAADNQRLIYAGKILLDDNKLHTYNIDEKKFIVIMVTKPKTSDNQQASSTSAPEAGESASTESGDGKSKVVEEKPKPQPAAEPERASEPPVTSNEPDFESTVQSIMDMGYNRQQVEQALRASFNNRERAVEYLITGIPEELLQEQEAEESADEDPLGFLRDQPQFQQMRAVIQQNPNLLNTVLQQIGQTNPALLQAISQHQQAFVRMLNEPVNPSAAGAVAEEAVPDNPVPQQPQNVIQVSPQDKEAIERLKALGFPEHMVIQAYFACEKNENLAANFLLSQNFDD-
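
The relationship between the two sequences above can be illustrated with a minimal translation sketch. It assumes the standard genetic code (NCBI translation table 1) and that the trailing `-` in the protein record marks the stop codon; both are assumptions, and `translate` is a hypothetical helper, not part of any tool used to build this record. Only the first and last few codons of DPOGS206093-TA are used as examples.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence, codon by codon.

    Stop codons are written as `stop_symbol` so the output matches
    the notation used in the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First ten and last five codons of the transcript above.
print(translate("ATGTTACTGACTTTAAAAACTCTTCAACAA"))
print(translate("AATTTTGATGATTAA"))
```

Passing the full nucleotide sequence to the same function gives a way to compare the result against the DPOGS206093-PA record.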