Monarch geneset OGS2.0

DPOGS209104
Transcript: DPOGS209104-TA, 1242 bp
Protein: DPOGS209104-PA, 413 aa
Genomic position: DPSCF300268 - 187477-190788
RNAseq coverage: 1x (Rank: top 94%)
Annotation
Heliconius: HMEL014464, 1e-131, 60.92%
Bombyx: BGIBMGA002252-TA, 2e-64, 52.30%
Drosophila: (none listed)
EBI UniRef50: UniRef50_D6X048, 3e-09, 22.00%, Putative uncharacterized protein n=3 Tax=Tribolium castaneum RepID=D6X048_TRICA
NCBI RefSeq: XP_968384.2, 4e-10, 22.00%, PREDICTED: similar to Huntington disease gene homolog [Tribolium castaneum]
NCBI nr blastp: gi|189240623, 8e-09, 22.00%, PREDICTED: similar to Huntington disease gene homolog [Tribolium castaneum]
NCBI nr blastx: gi|332029301, 7e-09, 21.35%, Huntingtin [Acromyrmex echinatior]
Group
KEGG pathway: tca:656784, 1e-09
    K04533 (HD) maps-> Huntington's disease
Orthology group: MCL12692 Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209104-TA
ATGTATGTTTGTCTCCAGGAGGTGTCCAGTGTTCACGAGAAGCTTCTCAAGTCGTTATCATCTGGGGCGGCTGCGGTCCGTCAGGCGGCGCTGAGGGGCTGGCTGTTACAGCTGGCGGGGAGAGGGCCGGGGGACGCCGCCGGGGCGCTCCGGGACAGTGTGAGGTGTGGAGGGGATAACAGGCTATCACCACACGAGCAGAGTCTCAACTGGTCGGTACTGTTCACACTAGTGGAGCTGGGTCACAGTGATCTAATGCATACAGCTGTTGACTTTGTATTGAATAAACCTAGACATTATTGTACGGATCTTGTTGTTAAGGGTATAACTTCCGTGCTGCGGCAACAAGTCCTGTCGAAGGACCTCAAGAAGTCTATCATAGAAAAACTGCTGGACAATATGAAGATGTACTCGGAGCACCACGCGGTTCAGATACTGATGGTGCATTTGTTCTCTGCGGACAGTAAGCTGATAAGTCCAAGATTCGAAACCGACGTGTCCAACATGGACCCGGACGTGCTGATGAACTCAATGGAACGCATAACGCTGCTGTACAAAGTCCTGAAGCAGTGTAAATATAGAGAGAACCAACAGATCTGTACCGCGACACTGAAATATTTCCTGCGAGAGACTCTACCGCCGGCCGCGACTCTGAGTAGAGTCGTGATAGAGTATTTGGAGTGCTGCAAGGAAACGGAAAGGCTAAATATGACTGCACTCAAAGAATTCAACAATAACATAGAATGTGCTATCATGAACGCTGATATTGTGTTCGAGGTATTCAACACCTCTATATCTCAAGATCAATTGCCAGTTCTAAGCGGTTGGATATTTGAAGCTCTCTGTCATTTACTCTCAGGGAAGATATCACATAAGCTGGTCCCGTACTGTTTGCTCACGTTATTGGTGTCCGCATCCGCCAACGCCAACATAAGGACGTTACATCCGCTAACATATTACATATTCAGACAGGGACTGCATAATAATTCCATGTACATGAGGAATAACACGGATGAAAATGATAAAAATGATGTTTTAACCCCGAGTAGGACTGATTCGGGCCAAAACTGGGGTATATTTGGTGATTTTACAAGAAATACGCCAATGTCGTTCACCGACAGACGTCTTCTATGCATAGTGGCTCTGCATTCAAATTTCAGCTCCAACCAACTGGAGAGGTTGAAGCAACTGTGCGAAGGGAACGAGTTTTTGGGAGATCTGATGAGATGTTTAACGGAATAA

Protein sequence:

>DPOGS209104-PA
MYVCLQEVSSVHEKLLKSLSSGAAAVRQAALRGWLLQLAGRGPGDAAGALRDSVRCGGDNRLSPHEQSLNWSVLFTLVELGHSDLMHTAVDFVLNKPRHYCTDLVVKGITSVLRQQVLSKDLKKSIIEKLLDNMKMYSEHHAVQILMVHLFSADSKLISPRFETDVSNMDPDVLMNSMERITLLYKVLKQCKYRENQQICTATLKYFLRETLPPAATLSRVVIEYLECCKETERLNMTALKEFNNNIECAIMNADIVFEVFNTSISQDQLPVLSGWIFEALCHLLSGKISHKLVPYCLLTLLVSASANANIRTLHPLTYYIFRQGLHNNSMYMRNNTDENDKNDVLTPSRTDSGQNWGIFGDFTRNTPMSFTDRRLLCIVALHSNFSSNQLERLKQLCEGNEFLGDLMRCLTE-
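
The reported lengths (1242 bp transcript, 413 aa protein) are what one would expect if the transcript is a complete coding sequence: three nucleotides per residue plus one stop codon. The following is a minimal Python sketch of that consistency check, assuming FASTA-formatted records like the two above. The function names `parse_fasta` and `cds_matches_protein` are hypothetical helpers written for illustration, not part of any MonarchBase tooling.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may wrap over several lines
    return records


def cds_matches_protein(cds_len, protein_len, has_stop=True):
    """True if cds_len is 3 nt per residue, plus 3 nt when a stop codon is included."""
    codons = protein_len + (1 if has_stop else 0)
    return cds_len == 3 * codons


# Lengths as reported in this record (assumes the stop codon is part of the transcript).
print(cds_matches_protein(1242, 413))
```

To check the sequences themselves rather than the reported lengths, pass the record text to `parse_fasta` and compare `len()` of the nucleotide entry against the protein entry, after removing the trailing `-` terminator from the protein string.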