Monarch geneset OGS2.0

DPOGS204614
Transcript: DPOGS204614-TA  1641 bp
Protein: DPOGS204614-PA  546 aa
Genomic position: DPSCF300432  51913-58380
RNAseq coverage: 3366x (Rank: top 4%)
Annotation
Heliconius: HMEL010152  2e-135  57.29%
Bombyx: BGIBMGA012182-TA  0.0  82.10%
Drosophila: Tcp-1eta-PA  0.0  77.61%
EBI UniRef50: UniRef50_Q99832  0.0  74.29%  T-complex protein 1 subunit eta n=73 Tax=Tetrapoda RepID=TCPH_HUMAN
NCBI RefSeq: XP_001656981.1  0.0  78.60%  chaperonin [Aedes aegypti]
NCBI nr blastp: gi|157137024  0.0  78.60%  chaperonin [Aedes aegypti]
NCBI nr blastx: gi|195107603  0.0  77.21%  GI23651 [Drosophila mojavensis]
Group
Gene Ontology: GO:0006457  4e-263  protein folding
  GO:0005524  4e-263  ATP binding
  GO:0051082  4e-263  unfolded protein binding
  GO:0044267  8.9e-149  cellular protein metabolic process
KEGG pathway 
InterPro domain: [1-547]  IPR002423  0  Chaperonin Cpn60/TCP-1
  [1-547]  IPR012720  0  T-complex protein 1, eta subunit
  [38-54]  IPR017998  6.6e-31  Chaperone, tailless complex polypeptide 1
Orthology group: MCL14941 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204614-TA
ATGTATGTTATAGCTATGCAACCGCAGATATTAGTTCTGAAGGAGGGGACGGATCAGTCCCAGGGCAAGCCCCAACTGGTCTCCAACATCAATGCCTGTCAGTTAGTTGTAGATGCGGTCCGCACGACGTTGGGGCCGAGGGGCATGGACAAGCTCATTGTAGACCACGGTGGAAAAGCTGTTATATCTAACGATGGGGCCACCATTATGAAGCTGTTAGATGTCGTCCATCCAGCTGCGAAGACCCTCGTCGACATAGCCAAGTCCCAGGATGCTGAGGTCGGAGACGGCACCACATCCGTTGTGATCCTGGCCGGGGAGCTGTTGAAGAGATTGAAGCCATTTGTTGAAGAAGGAGTCCACCCTCGGATAATAGTCCGGGCTGTCCGCAGTGCTGCAAAGTTGGCTGTCGACAAAGTCAAGGAGTTGGCTGTCAAGATTGAGAGCAAATCACCAGAAGAACAGAGGGAACTCCTCCGCAAGTGTGCGGTCACAGCTATGTCCTCGAAGCTGATCCACCAGCAGAAGGATCACTTCTCCAAGATGGTTGTGGATGCTGTACTGTCCCTGGATGCTCCACTCTTGCCGTTGGACATGATTGGCATCAAGAAGGTGTCCGGGGGAGCGCTGGAAGACTCATTCCTGGTAGCCGGTGTGGCTTTCAAGAAGACCTTCTCATACGCCGGCTTCGAGATGCAGCCCAAGAGCTACACCAACTGCAAGATCGCTCTCCTGAACATTGAACTGGAGCTGAAGGCTGAGAGGGACAACGCCGAAGTCCGAGTAAATAACGTCGAGGAGTACCAGAAAGTGGTGGACGCCGAGTGGAGGATTCTGTACGACAAGCTCGGAGCCCTGCATGCCAGCGGGGCACAGGTTGTGCTCAGTAAACTGCCCATAGGTGACGTCGCCACCCAATATTTTGCTGACAGGGACATGTTTTGCGCTGGTCGCGTAACAGACGAGGACCTCCGTCGCACCCAGCGCGCCTGCGGGGGCGCAGTCCTCAGTTCCGTCAGGGAAATCCGCCCCGAGTCCCTGGGCTCCTGCGAGGCCTTCGTGGAACGCCAGGTGGGAGGCGAGCGGTACAACGTGTTCACTGGTTGCCCGGCTGCGAAGACCTGCACCATCATACTGAGAGGAGGAGCCGAGCAGTTCCTAGAGGAGACGGAGCGCTCCCTGCACGACGCCATCATGATAGTCAGGAGAACCATTAAGAATGATGCTGTGGTGGCCGGCGGGGGCGCCATAGACATGGAACTGTCGAAGCATCTCCGTGATCACTCCAAGAGCATCGCTGGCAAGGAGCAGCTGCTGCTGTCGGCCGTGGCCAGAGCCTTCGAGGCCATACCCAGGCAGCTCTGCGACAACGCCGGCTTCGACGCCACCAACCTGCTGAACAAGCTGCGGCAGAAGCATCATCAGGGTGAACCGTGGTATGGTGTGGACATCCAGAAGGAAGACATAGCTGACAACTTCGAAGCCTGCGTCTGGGAACCAGCTGTAGTGAAGATCAACGCTATAACAGCTGCCTGTGAAGCGGCGGCCCAGATCCTGTCCATCGATGAGACCATCAAGAATGCGAAGAGTGGTGAACCCCAAATGCCGGGGCGGGGTATGGGACGGCCGAGGATGGGTTGA

Protein sequence:

>DPOGS204614-PA
MYVIAMQPQILVLKEGTDQSQGKPQLVSNINACQLVVDAVRTTLGPRGMDKLIVDHGGKAVISNDGATIMKLLDVVHPAAKTLVDIAKSQDAEVGDGTTSVVILAGELLKRLKPFVEEGVHPRIIVRAVRSAAKLAVDKVKELAVKIESKSPEEQRELLRKCAVTAMSSKLIHQQKDHFSKMVVDAVLSLDAPLLPLDMIGIKKVSGGALEDSFLVAGVAFKKTFSYAGFEMQPKSYTNCKIALLNIELELKAERDNAEVRVNNVEEYQKVVDAEWRILYDKLGALHASGAQVVLSKLPIGDVATQYFADRDMFCAGRVTDEDLRRTQRACGGAVLSSVREIRPESLGSCEAFVERQVGGERYNVFTGCPAAKTCTIILRGGAEQFLEETERSLHDAIMIVRRTIKNDAVVAGGGAIDMELSKHLRDHSKSIAGKEQLLLSAVARAFEAIPRQLCDNAGFDATNLLNKLRQKHHQGEPWYGVDIQKEDIADNFEACVWEPAVVKINAITAACEAAAQILSIDETIKNAKSGEPQMPGRGMGRPRMG-
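
The transcript and protein lengths listed above are consistent with each other: 1641 bp is 547 codons, that is 546 amino acids plus one stop codon, which the protein record shows as a trailing "-". The sketch below illustrates that check on short fragments copied from the start and end of the sequences above. It assumes the standard genetic code and an open reading frame starting at the first base; the function names `translate` and `codon_count` are illustrative and are not part of any database tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
# Stop codons are written as "-" to match the convention used in the
# protein record above (assumption: standard code applies).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY--CC-W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codon_count(n_bases: int) -> int:
    """Number of complete codons in a sequence of n_bases nucleotides."""
    return n_bases // 3


def translate(nucleotides: str) -> str:
    """Translate a coding sequence in frame 0, keeping stop codons as '-'."""
    seq = nucleotides.upper()
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - len(seq) % 3, 3)
    )


if __name__ == "__main__":
    # 1641 bp transcript: total codons, and residues excluding the stop codon.
    print(codon_count(1641), codon_count(1641) - 1)
    # First six codons and last four codons of DPOGS204614-TA.
    print(translate("ATGTATGTTATAGCTATG"))
    print(translate("AGGATGGGTTGA"))
```

To check the whole record, the full DPOGS204614-TA sequence can be passed to `translate` and the result compared with the DPOGS204614-PA sequence.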