Monarch geneset OGS2.0

DPOGS210250
Transcript: DPOGS210250-TA, 3267 bp
Protein: DPOGS210250-PA, 1088 aa
Genomic position: DPSCF300196 + 687615-695212
RNAseq coverage: 73368x (Rank: top 0%)
Annotation
Heliconius: HMEL014657, 0.0, 84.51%
Bombyx: BGIBMGA002381-TA, 0.0, 97.71%
Drosophila: Hsc70-4-PF, 0.0, 90.34%
EBI UniRef50: UniRef50_Q9U639, 0.0, 98.16%, Heat shock 70 kDa protein cognate 4 n=125 Tax=cellular organisms RepID=HSP7D_MANSE
NCBI RefSeq: NP_001036892.1, 0.0, 97.71%, heat shock cognate protein [Bombyx mori]
NCBI nr blastp: gi|270015934, 0.0, 77.79%, hypothetical protein TcasGA2_TC002089 [Tribolium castaneum]
NCBI nr blastx: gi|270015934, 0.0, 77.61%, hypothetical protein TcasGA2_TC002089 [Tribolium castaneum]
Group
Gene Ontology: GO:0005524, 2.2e-87, ATP binding
KEGG pathway: api:100159065, 0.0
 K03283 (HSPA1_8) maps-> Endocytosis
    MAPK signaling pathway
    Spliceosome
    Antigen processing and presentation
    Protein processing in endoplasmic reticulum
InterPro domain: [435-1086] IPR001023, 0, Heat shock protein Hsp70
[442-1048] IPR013126, 2.3e-278, Heat shock protein 70
[1-421] IPR019395, 7.3e-154, Transmembrane protein 161A/B
Orthology group: MCL10014, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210250-TA
ATGGTATCAGTGATACAGAAATTGGGAAATTATTCATTCGCAAGATGGTTACTGTGTTCCCAAGGATTATATCGATATCTTTATCCTGATAATAATGCACTGAAAACATTAGCTGGTGTTCCTAAGGATAAACCAAAAGGAAAGAAAACAAAAAGTGATTCGAATGGTAAGCCAGAAACGTTTCATGTGCCCCGAAGCCTAGAAATACAACTGGAAACTGCTCCTGTGACTCCACTAGACGTGGTTCATTTGAGATTTTATACAGAGTATATATGGATAGTAGATTTTTCTCTGTATACAGCCTTTGTTTATATTATGTCTGAGGTATACACTGCCTATTTCCCATTAAAAGATGAATTCAATCTCAGCATGGTATGGTGTCTACTGGTAGTGCTGTTTTCATTGTATCCTTTTTTAAATATGAACACAAAGATGCAGGATGGTTGTAGTAAAATCTTACTTTCACTGACCAAACAATATTTCACTAGTGATGAATCCATTGGTGAGCGTTCTACTTGCATAGTGTCATTCTGTGTGTTCCTACTTATAGCAATGATCATGTTAATAGTAGATGAGAGTATCTTAGAGGTTGGCGTGGATCCCGCTTATGATAGCTTTAATGAGAACGCTTCAAAGTTCCTGGAAAATCAAGGGTTGACTTCTGTAGGTCCAGCGTCTAAGTTGATACTGAAGTTCTCATTTGCTGTGTGGGCGGCTATCATTGGGACATTATTTACATTTCCCGGTCTTAGAGTGGCTAGGATGCATTGGGATTCATTAAGGTACTACTCTGAAAACAAAGTGAAGTCTCTTGTGCTAAACATAAACTTTGCAATGCCATTTGTTTTAGCACTATTATGGGTGCGGCCAGTCGCCAGATATTACTTGGCTGTCAGAGTGTTTAGTGGAATGAGTGGACCTATCATGAGCCCTCAGATGTTTGACACACTGCGCATAGTATTGGTCATTCTGACTGTGGTGTTAAGAGTGCTATTGATGCCACGTCAATTGCAGGCCTACCTTGACATGGCACAAAGGAGGTTGGATATACAAAGGAAAGAGGCCGGCAGAATTACTAATGTGGAACTGCAGACTAAGATAGCGTCAGTATTTTTCTACCTCTGTGTTGTAGCACTGCAGTATATATGTCCTATCATCATGTGTCTGTACCTGGCTCTGATGTATAAGACCCTCGGAGGGTATAGTTGGTCTTCTTTGATATATGAGACCGCGGAGAGTGAACCTGTTGTTGTTAATGTAGAAGGAATGGAGCAGTTTCAAATGGCTTGGGAAAACTTGAAAATGATGGCTGCCAAAGCACCCGCAGTAGGTATTGATCTGGGTACCACATACTCGTGCGTGGGAGTATTCCAGCACGGTAAGGTGGAAATCATCGCCAACGACCAGGGCAACAGGACTACGCCCTCTTATGTAGCGTTCACAGACACCGAGCGTCTCATCGGCGATGCCGCTAAGAACCAAGTGGCGATGAACCCTAACAACACTATTTTCGATGCCAAACGACTCATTGGCCGCAAGTTCGAGGACCTTACAGTGCAAGCTGACATGAAACACTGGCCATTCGAAGTGATCAGTGATGGAGGCAAACCAATGATCAAGGTACAATACAAAGGAGAAGACAAGACTTTCTTCCCTGAGGAAGTGAGCTCGATGGTGCTCACAAAGATGAAGGAAACAGCCGAGGCTTACCTCGGCAAAACGGTGCAAAATGCAGTTATAACGGTTCCAGCGTACTTCAACGACTCACAGCGACAGGCCACGAAAGATGCGGGTACCATCTCTGGCCTGAACGTTCTCCGTATCATCAACGAACCGACCGCTGCTGCGATTGCCTACGGTCTTGACAAGAAGGGAGGTGGAGAACGAAACGTCCTTATCTTCGATCTCGGCGGAGGCACCTTCGACGTGTCCATCCTCACCATCGAGGACGGTATCTTCGAAGTGAAGTCCACCGCCGGCGACACGCACTTGGGAGGTGA
AGACTTCGACAACCGTATGGTCAACCACTTCGTACAGGAGTTCAAGAGGAAGTACAAGAAAGACCTCACCACCAACAAGAGGGCGCTCCGCAGACTGAGGACGGCCTGCGAGAGAGCGAAGAGGACTCTGTCCTCCTCGACCCAGGCCAGCATCGAAATCGATTCCCTGTTTGAGGGCATCGACTTCTACACTTCCATCACCAGGGCTCGTTTCGAAGAACTGAACGCTGATCTGTTCAGGTCTACCATGGAGCCCGTAGAGAAGTCTCTCCGCGACGCCAAAATGGACAAGTCCCAAATCCACGACATCGTGTTGGTGGGCGGGTCTACTCGCATTCCCAAGGTGCAGAAGCTCCTGCAAGACTTTTTCAACGGCAAGGAGCTGAACAAGTCCATCAACCCCGACGAGGCCGTAGCCTACGGAGCGGCGGTCCAGGCCGCCATCCTGCACGGTGATAAGTCGGAGGAGGTCCAGGATCTGCTGCTGCTGGACGTGACGCCGCTGTCGCTCGGTATCGAGACGGCCGGCGGAGTGATGACCACGCTCATCAAGAGGAACACCACCATCCCCACCAAGCAGACGCAGACCTTCACCACCTACTCCGACAACCAGCCCGGCGTGCTCATCCAAGTGTTCGAGGGCGAGCGTGCCATGACCAAGGACAACAACCTCCTCGGCAAGTTCGAGCTGACCGGCATCCCACCCGCGCCCCGCGGCGTGCCGCAGATCGAGGTCACCTTCGACATTGACGCCAACGGCATCCTGAACGTGTCCGCCGTGGAGAAGTCCACTAACAAGGAGAACAAGATCACCATCACCAACGACAAGGGCCGCCTGTCCAAGGAGGAGATCGAGCGGATGGTGAACGACGCCGAGAAGTACAGGAACGAGGACGAGAAGCAGAAGGAGACCATCCAGGCCAAGAACTCGCTGGAGTCGTACTGCTTCAACATGAAGTCCACCATGGAGGACGAGAAGCTCAAGGAGAAGATCTCTGACGCCGACAAGCAGACCATCCTCGACAAGTGCAACGACACCATCAAGTGGCTGGACTCCAACCAGCTGGCCGACAAGGAGGAGTACGAGCACAAGCAGAAGGAGCTGGAGGGCATCTGCAACCCCATCATCACCAAGATGTACCAGGGAGCCGGCGGTGTGCCCGGCGGTATGCCCGGCGGCATGCCCGGCTTCCCCGGAGGAGCGCCCGGAGCCGGAGGCGCAGCCCCCGGCGGCGGCGCCGGACCCACCATCGAAGAGGTCGACTAA

Protein sequence:

>DPOGS210250-PA
MVSVIQKLGNYSFARWLLCSQGLYRYLYPDNNALKTLAGVPKDKPKGKKTKSDSNGKPETFHVPRSLEIQLETAPVTPLDVVHLRFYTEYIWIVDFSLYTAFVYIMSEVYTAYFPLKDEFNLSMVWCLLVVLFSLYPFLNMNTKMQDGCSKILLSLTKQYFTSDESIGERSTCIVSFCVFLLIAMIMLIVDESILEVGVDPAYDSFNENASKFLENQGLTSVGPASKLILKFSFAVWAAIIGTLFTFPGLRVARMHWDSLRYYSENKVKSLVLNINFAMPFVLALLWVRPVARYYLAVRVFSGMSGPIMSPQMFDTLRIVLVILTVVLRVLLMPRQLQAYLDMAQRRLDIQRKEAGRITNVELQTKIASVFFYLCVVALQYICPIIMCLYLALMYKTLGGYSWSSLIYETAESEPVVVNVEGMEQFQMAWENLKMMAAKAPAVGIDLGTTYSCVGVFQHGKVEIIANDQGNRTTPSYVAFTDTERLIGDAAKNQVAMNPNNTIFDAKRLIGRKFEDLTVQADMKHWPFEVISDGGKPMIKVQYKGEDKTFFPEEVSSMVLTKMKETAEAYLGKTVQNAVITVPAYFNDSQRQATKDAGTISGLNVLRIINEPTAAAIAYGLDKKGGGERNVLIFDLGGGTFDVSILTIEDGIFEVKSTAGDTHLGGEDFDNRMVNHFVQEFKRKYKKDLTTNKRALRRLRTACERAKRTLSSSTQASIEIDSLFEGIDFYTSITRARFEELNADLFRSTMEPVEKSLRDAKMDKSQIHDIVLVGGSTRIPKVQKLLQDFFNGKELNKSINPDEAVAYGAAVQAAILHGDKSEEVQDLLLLDVTPLSLGIETAGGVMTTLIKRNTTIPTKQTQTFTTYSDNQPGVLIQVFEGERAMTKDNNLLGKFELTGIPPAPRGVPQIEVTFDIDANGILNVSAVEKSTNKENKITITNDKGRLSKEEIERMVNDAEKYRNEDEKQKETIQAKNSLESYCFNMKSTMEDEKLKEKISDADKQTILDKCNDTIKWLDSNQLADKEEYEHKQKELEGICNPIITKMYQGAGGVPGGMPGGMPGFPGGAPGAGGAAPGGGAGPTIEEVD-
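
Length check:

The transcript and protein lengths in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the original record. It assumes that the listed transcript is coding sequence only (it starts with ATG and ends with the stop codon TAA, with no UTRs) and that the genomic coordinates are 1-based and inclusive. The function names `coding_length_bp` and `genomic_span_bp` are hypothetical helpers introduced here.

```python
# Figures copied from the record above.
TRANSCRIPT_BP = 3267              # DPOGS210250-TA
PROTEIN_AA = 1088                 # DPOGS210250-PA
START, END = 687615, 695212       # position on scaffold DPSCF300196


def coding_length_bp(protein_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode `protein_aa` residues.

    Assumption: the transcript is CDS-only, so its length is three
    nucleotides per residue, plus one stop codon if requested.
    """
    return 3 * (protein_aa + (1 if include_stop else 0))


def genomic_span_bp(start: int, end: int) -> int:
    """Length of the genomic interval.

    Assumption: coordinates are 1-based and inclusive.
    """
    return end - start + 1


# 3 * (1088 + 1) = 3267, matching the listed transcript length.
print(coding_length_bp(PROTEIN_AA) == TRANSCRIPT_BP)  # True

# Genomic span covered by the gene model (exons plus introns).
print(genomic_span_bp(START, END))  # 7598
```

Under these assumptions the 7598 bp genomic span is longer than the 3267 bp transcript, which is what a gene model containing introns would look like.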