Monarch geneset OGS2.0

DPOGS205620
Transcript: DPOGS205620-TA  1542 bp
Protein: DPOGS205620-PA  513 aa
Genomic position: DPSCF300023 - 946865-950213
RNAseq coverage: 1655x (Rank: top 8%)
Annotation
Heliconius: HMEL007340  8e-56  78.40%
Bombyx: BGIBMGA001132-TA  5e-98  78.57%
Drosophila: Dsp1-PF  1e-77  70.33%
EBI UniRef50: UniRef50_Q24537  2e-75  70.33%  High mobility group protein DSP1 n=23 Tax=Bilateria RepID=HMG2_DROME
NCBI RefSeq: XP_973934.2  1e-83  45.51%  PREDICTED: similar to High mobility group protein DSP1 (Protein dorsal switch 1) [Tribolium castaneum]
NCBI nr blastp: gi|189236107  3e-82  45.51%  PREDICTED: similar to High mobility group protein DSP1 (Protein dorsal switch 1) [Tribolium castaneum]
NCBI nr blastx: gi|189236107  2e-117  47.79%  PREDICTED: similar to High mobility group protein DSP1 (Protein dorsal switch 1) [Tribolium castaneum]
Group
Gene Ontology:
  GO:0005515  5.4e-25  protein binding
  GO:0003677  1.5e-22  DNA binding
  GO:0005634  1.6e-18  nucleus
KEGG pathway: api:100167132  2e-76
  K10802 (HMGB1) maps-> Base excision repair
InterPro domain:
  [398-490]  IPR009071  5.4e-25  High mobility group, superfamily
  [412-490]  IPR000910  1.5e-22  High mobility group, HMG1/HMG2
  [358-380]  IPR000135  1.6e-18  High mobility group, HMG1/HMG2, subgroup
Orthology group: MCL12346  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205620-TA
ATGGGGGAACGGGGCACAACGGGGGGCGCTTGGGGTGCGCGCGACGATGCCTCGTGGTGGACAAGTGGCTCCGGGGAGCTTCAGCATCAACAACAATTGAATGAGGAAATAGCAAGGAGTACTGCCACTGTTACCCATCAGTTATATACATATAAGATGGCTACTGGCTTTTCAAATAATACTGGAAATAATACGTCGCCTGCGTTCGACTATCGTTTAATGACTGGATCTAATCCTAGGGATGAATCTCCTCAGCAACCTTGGTGGTATGCTTCTGGCGGTGTAGAATCACAACAACAAACTTCTTCTCCAACGCAACAGAATCAATCAAGTCCAGACTCGGATCAAAGTAACCACCAGACAAACACAATTCAACAAAACCATCAGGCACTTCAGCAGGCACAAGAACAGGCTCAGCTACAACAAAGCCAGTTGCAACAATTACAGCAACAACAACAAAGCCTGCAACAGGCATTACAGCAGCAAAATCAGAGTCTGCAACAGTCTTTACAACAGCAAAGTCAACAACAAACTCTGCAACAGATGCTACAACAGCAACAGCAGCAACATCAACAGCAACATCAACAGCAGCAACATCAGCAACAACAACAACAACAACATCAGCAGCAACAACAACAGCAGCAGCAGCAACAACAACATCAGCAGCAGCATCAACAACAGCAACACCAACAGCAACATCATCAACAGCAACAACAACAGCAAAATCAACAGCTTGAAATACAAGTTAGCCAGGCCCAAGCTAAAGCACTTGCTCAGGCAGCATTACAACAACAAGTTGCTCAAACATTGCAACAACATCAGCACAACTTAGAAGTTGCTCAACAGCAACAAATCCAAGCTGCTCTTCAAAGACAATCAGCTACATTGCAGGAATTGCAGCAACAGGCTCAACAGCAAGCATTGCTAGCTCAAGCTACAAATAGATCCAGAATGCCCAGAGCTCGTGCATACAATAAGCCCAGAGGCAGAATGACAGCTTACGCTTTCTTCGTACAGACCTGTAGAGAAGAACATAAGAAGAAACATCCAGAAGAAAATGTTATATTTGCTGCCTTCTCCAAGAAGTGTGCCGAGAGGTGGAATACAATGTCTGAGAAGGAGAAGCAACGTTTCCATGAAATGGCTGAACAAGACAAGAAGCGTTATGAGTTGGAAATGGCGAGTTATGTGCCTCCCAAGGACGTTAAGATCAGAGGCAGGAAGAGGCATGCACTGAAGGATCCCAACGCACCCAAGAGATCTCTATCAGCGTTTTTCTGGTTCTGCAACGACGAGCGTTCGAAGGTGAAGGCCAACAATCCAGAGTTCACGATGGGTGATATAGCCAAGGAGCTAGGAAGACGATGGGCGGCCGCGGAGCCCGACACCAAATCCAAATACGAGTCGCTCAGTGAACAGGACAAGGCTAGGTATGATAGGGAAATGACGGCATATAAGAAGGGACCATTGTCATTAGTCCAGCCACAACAGCAAGAGGTCGAGGAAGAAGGAGATTTTGATGGAGAAGAAGAATATAAGTAG

Protein sequence:

>DPOGS205620-PA
MGERGTTGGAWGARDDASWWTSGSGELQHQQQLNEEIARSTATVTHQLYTYKMATGFSNNTGNNTSPAFDYRLMTGSNPRDESPQQPWWYASGGVESQQQTSSPTQQNQSSPDSDQSNHQTNTIQQNHQALQQAQEQAQLQQSQLQQLQQQQQSLQQALQQQNQSLQQSLQQQSQQQTLQQMLQQQQQQHQQQHQQQQHQQQQQQQHQQQQQQQQQQQQHQQQHQQQQHQQQHHQQQQQQQNQQLEIQVSQAQAKALAQAALQQQVAQTLQQHQHNLEVAQQQQIQAALQRQSATLQELQQQAQQQALLAQATNRSRMPRARAYNKPRGRMTAYAFFVQTCREEHKKKHPEENVIFAAFSKKCAERWNTMSEKEKQRFHEMAEQDKKRYELEMASYVPPKDVKIRGRKRHALKDPNAPKRSLSAFFWFCNDERSKVKANNPEFTMGDIAKELGRRWAAAEPDTKSKYESLSEQDKARYDREMTAYKKGPLSLVQPQQQEVEEEGDFDGEEEYK-
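
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, not part of the original record, assuming the transcript is a complete coding sequence translated with the standard genetic code and that the trailing `-` in the protein sequence marks the stop codon. The helper names `translate` and `lengths_consistent` are hypothetical and introduced here for illustration only.

```python
# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol`, following the convention
    assumed for this record; codons with ambiguous bases become 'X'.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript holds exactly protein_aa codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


# First 33 nt and last 12 nt of DPOGS205620-TA, copied from the record above.
print(translate("ATGGGGGAACGGGGCACAACGGGGGGCGCTTGG"))  # MGERGTTGGAW
print(translate("GAATATAAGTAG"))                       # EYK-
print(lengths_consistent(1542, 513))                   # True
```

Running `translate` over the full DPOGS205620-TA sequence and comparing the result with DPOGS205620-PA would extend this check to the whole record.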