Monarch geneset OGS2.0

DPOGS214796
Transcript         DPOGS214796-TA  3480 bp
Protein            DPOGS214796-PA  1159 aa
Genomic position   DPSCF300059 - 437592-446788
RNAseq coverage    94x (Rank: top 62%)
Annotation
Heliconius         HMEL017295        0.0     85.51%
Bombyx             BGIBMGA012107-TA  0.0     75.14%
Drosophila         Hpr1-PA           5e-160  44.08%
EBI UniRef50       UniRef50_E2ANQ3   0.0     51.91%  THO complex subunit 1 n=12 Tax=Neoptera RepID=E2ANQ3_CAMFO
NCBI RefSeq        XP_393145.2       0.0     51.42%  PREDICTED: similar to Hpr1 CG2031-PA isoform 1 [Apis mellifera]
NCBI nr blastp     gi|322794008      0.0     52.69%  hypothetical protein SINV_10936 [Solenopsis invicta]
NCBI nr blastx     gi|322794008      0.0     52.41%  hypothetical protein SINV_10936 [Solenopsis invicta]
Group
Gene Ontology      GO:0007165  5.4e-12  signal transduction
                   GO:0005515  5.4e-12  protein binding
KEGG pathway       ame:409647  0.0
                   K12878 (THOC1) maps-> Spliceosome
InterPro domain    [69-459]     IPR021861  1.8e-116  THO complex, subunit THOC1
                   [1092-1158]  IPR011029  2.1e-12   DEATH-like
                   [1090-1158]  IPR000488  5.4e-12   Death
Orthology group    MCL14100  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214796-TA
ATGTCCGAAAAAGTTGGATTTGATGAGTTGCGGCTAAAGTATAAGGATGTATTATCAAAAGCATTTACTACAAATAATATCGATTTGCTCGATTCCTTTTCAAAAAATGCGAATGAGAGCGACAGAAAATCAGCCATGGACCAAGCCTTTAGAGATAAACTACTGGATTTGCTATTAGAGGAGCCAAATACCTTGGAAAGCTATGTAAATTTTTGTATAGACTCATGTCGAAGGCAAATGGTGACTGCAACTATGCCTGTAGTTCTTTTGGGTGATATTTTCGATGCACTAACACTGAACAAGTGTGAAAAAATGTTTATGTATGTTGAAAATGGAGTAAATATATGGAGGGAAGAATTATTCTTTGTGGCATGTAAGAACCATTTACTAAGAATGTGTAATGACTTACTGAGAAGATTATCAAGATCTCAAAATACGGTATTCTGTGGCAGAATATTGTTATTCTTAGCCAAATTCTTTCCATTTTCTGAGCGCTCTGGGCTTAATATTGTGTCCGAATTCAACTTGGAAAACATTACGGAGTTTGGTGGTGATAATACAAGTACCTTAAAGGATGTATTGGATGAAGAAATGGTTGTTGAAGATGATAAAAATAAATTGGTCATAGATTACAATCTTTACTGTAGATTTTGGAGCTTACAGGACTTCTTTAGGAACCCTAACACATGTTATAATAAATTACAATGGAAAACCTTTGTTGCGCATTCAGGCAGTGTCCTATCAGCTTTCTCCTCATACAAGCTAGAGGCTGTGGAGTTGCAGAAATCGAAATTAAACATACTTAAATCGGTAAACTCGGATGTTGAAATGCAAGAGAACAAAGAGCAACATTACTTTGCAAAGTTTCTAACAAACCAGAAACTTTTGGAGTTACAGTTGTCTGACTCAAATTTCAGGAGATGTGTCCTGATACAGTATCTGATACTCTTTCAATATTTAATGTCGACGGTAAAGTTTAAAATGGAATCTCAGGAATTAAAATCAGATCAGATAGACTGGGTGAAAGACACCACAGCGCTCGTTTATAAGCTCCTCGGTGAAACTCCGCCCGACGGCAAACAGTTTGCTGAATGCGTGAAGAGAATATTGAAGAGGGAAGAACATTGGAACAGTTGGAAGAACGATGGATGTCCAGAATTTCAAAAGCCAAAACCTCCGGTGCAAGCGGAGAACGAGGAGGTGAAGCGTTCGAGGAAACGACGTCGGCCTGTTGGAGATATTATTAAAGAATATTCTGGAACCGACAAGTTCTTCATGGGGAATAATGACCTGACAAAGTTGTGGAACCTCTGTCCGGACAATCTAGCTGCTTGTAGAACAAAGGAAAGAGATTTCATGCCATCACTAGTTAGTATTCTCTACCTTAACCCTTTTAAGCGTTTCAAAATGTCCGAAAAAGTTGGATTTGATGAGTTGCGGCTTAAGTATAAGGATGTATTATCAAAAGCATTTACTACAAATAATATCGATTTGCTCGATTCCTTTTCAAAAAATGCGAATGAGAGCGACAGAAAATCAGCCATGGACCAAGCCTTTAGAGATAAACTACTGGATTTGCTATTAGAGGAGCCAAATACCTTGGAAAGCTATGTAAATTTTTGTATAGACTCATGTCGAAGGCAAATGGTGACTGCAACTATGCCTGTAGTTCTTTTGGGTGATATTTTCGATGCACTAACACTGAACAAGTGTGAAAAAATGTTTATGTATGTTGAAAATGGAGTAAATATATGGAGGGAAGAATTATTCTTTGTGGCATGTAAGAACCATTTACTAAGAATGTGTAATGACTTACTGAGAAGATTATCAAGATCTCAAAATACGGTATTCTGTGGCAGAATATTGTTATTCTTAGCCAAATTCTTTCCATTTTCTGAGCGCTCTGGGCTTAATATTGTGTCCGAATTCAACTTGGAAAATATTACGGAGTTTGGTGGTGATAATACAAGTACCTTAAAGGATGTATTGGATGAAGAAAT
GGTTGTTGAAGATGATAAAAATAAATTGGTCATAGATTACAATCTTTACTGTAGATTTTGGAGCTTACAGGACTTCTTTAGGAACCCTAACACATGTTATAATAAATTACAATGGAAAACCTTTGTTGCGCATTCAGGCAGCGTCCTATCAGCTTTCTCCTCATACAAGCTAGAGGCTGTGGAGTTGCAGAAATCGAAATTAAATATACTTAAATCGGTAAACTCGGATGTTGAAATGCAAGAGAACAAAGAGCAACATTACTTTGCAAAGTTTCTAACAAACCAGAAACTTTTGGAGTTACAGTTGTCTGACTCAAATTTCAGGAGATGTGTCCTGATACAGTATCTGATACTCTTTCAATATTTAATGTCGACGGTAAAGTTTAAAATGGAATCTCAGGAATTAAAATCAGATCAGATAGACTGGGTGAAAGACACCACAGCCCTTGTTTATAAGCTCCTCGGTGAAACTCCGCCCGACGGCAAACAGTTTGCTGAATGCGTGAAGAGAATATTGAAGAGGGAAGAACATTGGAACAGTTGGAAGAACGATGGATGTCCAGAATTTCAAAAGCCAAAACCTCCGGTGCAAGCGGAGAACGAGGAGGTGAAGCGTTCGAGGAAACGACGTCGGCCTGTTGGAGATATTATTAAAGAATATTCTGGAACCGACAAGTTCTTCATGGGGAATAATGACCTGACAAAGTTGTGGAACCTCTGTCCGGACAATCTAGCTGCTTGTAGAACAAAGGAAAGAGATTTCATGCCATCACTAGAATCATATATGTTATCTGGTGATGGTGGTGAGGGTGCTGGTGGGTGGGGGTGGCGAGCTCTGCGTCTGCTAGCAAGAAGGTCACCACACTTCTTCGTCCACACAAACAATCCCATCGGACGGCTGCCGGATTATCTTGATGATATGGTTAAAAAAATCACTCGTGAAGTGGCCGCCAATAATGTGTCCAATAACACAAACGGTGACTCCAACGTTAACAACGATAAAGCTAAGCTGGAGCAGACTGAAGAAGAACTGACAGAAGAGCAAATAGAAACTGATATTATAAAGGAAGAAGACACGACCGACCTTGAACAGGAGCAGACTGAAGAAGAGCTGACAGAAGAGCAAATAGAAACTGATATTATAAAGGAAGAAGACACGACCGACCTTGAACAGGCACCAGACTCGACCCACGACGACGATAAGCCGACCAGATCTAAGATAACAATGATATCATCCACACAGCTCCAGGCTGTTTCATCCAAGTTGCCCGAGTGGAAGAAACTCGCGGCGAAGCTAGGATACAAACCGGACGAAATACAGTTCTTCGAAACGGAATACACGACGGAGGAGGCGAGGGCCAAGAACATGCTGCAGCTTTGGTTCGACGACGACGAGGACGCCTCCGTCGAGAACCTGCTCTACACCATGGAGGGACTGAAAATGACCGAAGCCTGCGAGGCCTTGAAGAACTCTAAGTGA

Protein sequence:

>DPOGS214796-PA
MSEKVGFDELRLKYKDVLSKAFTTNNIDLLDSFSKNANESDRKSAMDQAFRDKLLDLLLEEPNTLESYVNFCIDSCRRQMVTATMPVVLLGDIFDALTLNKCEKMFMYVENGVNIWREELFFVACKNHLLRMCNDLLRRLSRSQNTVFCGRILLFLAKFFPFSERSGLNIVSEFNLENITEFGGDNTSTLKDVLDEEMVVEDDKNKLVIDYNLYCRFWSLQDFFRNPNTCYNKLQWKTFVAHSGSVLSAFSSYKLEAVELQKSKLNILKSVNSDVEMQENKEQHYFAKFLTNQKLLELQLSDSNFRRCVLIQYLILFQYLMSTVKFKMESQELKSDQIDWVKDTTALVYKLLGETPPDGKQFAECVKRILKREEHWNSWKNDGCPEFQKPKPPVQAENEEVKRSRKRRRPVGDIIKEYSGTDKFFMGNNDLTKLWNLCPDNLAACRTKERDFMPSLVSILYLNPFKRFKMSEKVGFDELRLKYKDVLSKAFTTNNIDLLDSFSKNANESDRKSAMDQAFRDKLLDLLLEEPNTLESYVNFCIDSCRRQMVTATMPVVLLGDIFDALTLNKCEKMFMYVENGVNIWREELFFVACKNHLLRMCNDLLRRLSRSQNTVFCGRILLFLAKFFPFSERSGLNIVSEFNLENITEFGGDNTSTLKDVLDEEMVVEDDKNKLVIDYNLYCRFWSLQDFFRNPNTCYNKLQWKTFVAHSGSVLSAFSSYKLEAVELQKSKLNILKSVNSDVEMQENKEQHYFAKFLTNQKLLELQLSDSNFRRCVLIQYLILFQYLMSTVKFKMESQELKSDQIDWVKDTTALVYKLLGETPPDGKQFAECVKRILKREEHWNSWKNDGCPEFQKPKPPVQAENEEVKRSRKRRRPVGDIIKEYSGTDKFFMGNNDLTKLWNLCPDNLAACRTKERDFMPSLESYMLSGDGGEGAGGWGWRALRLLARRSPHFFVHTNNPIGRLPDYLDDMVKKITREVAANNVSNNTNGDSNVNNDKAKLEQTEEELTEEQIETDIIKEEDTTDLEQEQTEEELTEEQIETDIIKEEDTTDLEQAPDSTHDDDKPTRSKITMISSTQLQAVSSKLPEWKKLAAKLGYKPDEIQFFETEYTTEEARAKNMLQLWFDDDEDASVENLLYTMEGLKMTEACEALKNSK-
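
The lengths and domain coordinates in this record can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the database. It assumes that the transcript consists entirely of coding sequence ending in a single stop codon, and that the InterPro coordinates are 1-based and inclusive. The function names are hypothetical.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3480
PROTEIN_AA = 1159

# InterPro domains as (start, end, accession).
# Assumption: coordinates are 1-based and inclusive.
DOMAINS = [
    (69, 459, "IPR021861"),     # THO complex, subunit THOC1
    (1092, 1158, "IPR011029"),  # DEATH-like
    (1090, 1158, "IPR000488"),  # Death
]


def expected_cds_length(protein_aa: int) -> int:
    """Coding length in bp for a protein, counting one stop codon.

    Assumption: the transcript holds only coding sequence (no UTRs).
    """
    return 3 * (protein_aa + 1)


def domain_length(start: int, end: int) -> int:
    """Number of residues covered by a 1-based, inclusive interval."""
    return end - start + 1


def domains_within(protein_aa: int, domains) -> bool:
    """True if every domain interval lies inside the protein."""
    return all(1 <= start <= end <= protein_aa for start, end, _ in domains)


if __name__ == "__main__":
    print("CDS length matches:", expected_cds_length(PROTEIN_AA) == TRANSCRIPT_BP)
    print("Domains inside protein:", domains_within(PROTEIN_AA, DOMAINS))
    for start, end, accession in DOMAINS:
        print(accession, domain_length(start, end), "aa")
```

Under these assumptions, 3 x (1159 + 1) = 3480, which agrees with the listed transcript length, and all three domain intervals fall within the 1159-residue protein.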