Monarch geneset OGS2.0

DPOGS210135
Transcript: DPOGS210135-TA; 3183 bp
Protein: DPOGS210135-PA; 1060 aa
Genomic position: DPSCF300261 - 69199-73787
RNAseq coverage: 1326x (Rank: top 10%)
Annotation
Heliconius: HMEL011607; 0.0; 84.90%
Bombyx: BGIBMGA003759-TA; 0.0; 78.41%
Drosophila: Spt5-PA; 0.0; 66.26%
EBI UniRef50: UniRef50_Q9V460; 0.0; 66.26%; Transcription elongation factor SPT5 n=16 Tax=Metazoa RepID=SPT5H_DROME
NCBI RefSeq: XP_001604079.1; 0.0; 68.06%; PREDICTED: similar to GA20489-PA [Nasonia vitripennis]
NCBI nr blastp: gi|340717676; 0.0; 68.13%; PREDICTED: transcription elongation factor SPT5-like isoform 2 [Bombus terrestris]
NCBI nr blastx: gi|340717674; 0.0; 66.78%; PREDICTED: transcription elongation factor SPT5-like isoform 1 [Bombus terrestris]
Group
Gene Ontology:
  GO:0032784; 0; regulation of transcription elongation, DNA-dependent
  GO:0006357; 0; regulation of transcription from RNA polymerase II promoter
  GO:0032968; 4.4e-13; positive regulation of transcription elongation from RNA polymerase II promoter
KEGG pathway 
InterPro domain:
  [1-1060]; IPR017071; 0; Transcription elongation factor Spt5
  [234-320]; IPR005100; 7.5e-25; Transcription elongation factor Spt5, NGN domain
  [522-583]; IPR008991; 8.8e-20; Translation protein SH3-like
  [133-228]; IPR022581; 6e-15; Spt5 transcription elongation factor, N-terminal
  [232-323]; IPR006645; 4.4e-13; Transcription antitermination protein, NusG, N-terminal
  [531-558]; IPR005824; 8.8e-07; KOW
Orthology group: MCL11512; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210135-TA
ATGTCGGACTCGGAGGGCAGTAATTACTCCGGGAGTGGCTCGGACGCAGGTAGTGTTGTGTCTAATCGGTCCAGACGCAGCGCTGCATCAAATCGCTCTGCTAAGTCGGGATCTCGCTCCCGATCTCATTCTGGCAGCCGTAGTCCCTCGAGATCACCTTCAAGGTCACGATCGAGGTCCAGGTCACGTTCTCGCTCACGATCCAGAAGCCGTTCCGCTGGTTCCGATGGCAGCCGAAACAGGGATGATGAGGCTAAGGAGGCTTCTGGTGATGAAGAAGTTGAGGATGAGCAAGAGCCCGAAGGGGAGGACCTGGTGGACTCGGAAGAGTATGATGAGGACGAGGAAGAGGAACGACGTAGGAAGAAGAGGAAGAAGGACAGTCGCTACGGAGGATTCATTATAGATGAGGCTGAGGTAGATGATGAAGTCGATGAAGACGATGAGTGGGAGGAAGGCGCCCAGGAAATGGGTATCGTCGGTAATGAGGTGGATGAGATCGGACCCACAGCCAGAGAAATAGAGGGCCGACGCAGAGGAACCAATCTGTGGGACTCACAGAAAGAAGAAGAAATAGAGGAATACTTGAGAAATAAATATGCTGATGAATCAGCGGCGCTCAGACACTTTGGTGAGGGCGGTGAAGAAATGTCTGATGAGATCACTCAACAGACCTTGCTGCCCGGCATCAAGGATCCTAACCTGTGGATGGTGAAATGCAGGATCGGTGAAGAGAAGGCGACTGTGTTATTGCTTATGAGAAAATTTATTACCTACCAGAATTCAGAGGAACCTTTCCAAATAAAGTCGGTGGTGGCTCCGGAAGGAGTCAAGGGCTTCATCTACATTGAGGCATACAAACAGACACATGTGAAAGCCATCATAGACAAAGTGGGTAATTTGAGAATGGGCACATGGAAACAGGAGATGGTACCCATCAAGGAAATGACAGATGTTTTGAGGGTTGTTAAGGAACAGTCAGGTTTAAAACCGAAACAGTGGGTGCGACTCAAGCGAGGCCTCTATAAAGACGATATAGCTCAAGTAGATTACGTAGATTTAGCACAAAACCAAGTTCACCTGAAACTTCTTCCTAGAATAGACTACACAAGACTCAGAGGAGCTCTAAGGACCGTGCAGAGCGAGAGCGAAGCGGCCAAAAGGAAAAAAAAGCGGCGACCTGCGGCCAAACCTTTCGACCCCGAAGCTATTCGCGCCATCGGCGGCGAAGTGACTTCGGACGGTGACTTCCTCATATTTGAGGGAAACAGATACTCCAGAAAGGGTTTCCTGTACAAGAACTTCACCATGTCCGCGATATTGGCGGAGGGCGTCAAACCCACGCTCACGGAACTAGAAAGATTCGAAGAGCAACCGGAAGGTATAGACATCGAGCTGGCGGCGCCCGCCAAGGACGACCCCACTAGTCTGCACTCGTTCTCGATGGGAGATAACGTGGAGGTGTGTTCCGGTGATCTGGCCAACCTGCAGGCCAGGATCATAGCCATAGATGGCTCCATGATCACCGTCATGCCGAGACACGACGCTCTGAAGGATCCGCTCGTATTCAAACCCAACGAACTACGGAAGTACTTCAAACAGGGAGACCACGTGAAAGTCTTAGCGGGAAGATACGAGGGCGACACCGGTCTCATCGTCCGAGTGGAACCTCACAGGGCGGTCCTCGTGTCGGATGTGACGATGCACGAGCTGGAGGTGTTGCCCAGAGACCTGCAGCTGTGTTCGGACATGGCGACCGGCGTGGACTCGCTGGGACAGTTCCAGTGGGGGGACATGGTGCTGCTGGACTCGCAGACGGTCGGCGTCATCGTCCGACTCGAGAGGGAGAACTTCCACGTGCTCGGCATGCAGGGGAAGGTGATCGAGTGCAAACCTCAGGCGCTGCAGAAGAGAAGGGAGAACAGGTTCACCATGGCGCTCGACTCCGAGCACAACTCCATACAAAAGAAAGACATCGTCAAGGTCATCGACGGACCGCA
CGCGGGCCGCGAGGGAGAGATAAAGCATCTGTACAGAAACTTCGCCTTCCTGCAGTCGAGGATGTACCCCGACAACGGAGGAATCTTTGTGTGCAAGACGAGACACCTGCAGCTGGCGGGAGGCGCCAAGAACGCCGCCGCCAGCAACGGACTCGCTCTCGCGTTCATGTCGCCGAGGATACAGTCACCCATGCACCCGTCGGGCAGGGGAGGGGGCCGGGGCCGCGGCCGGGGAGGGAGGGGGGCTGTCGCCAGGGACAGGGAGCTCATAGGACAGACCATCAATAGAGACGCCACGGGCAGCACCGCGCGCGTGGAGCTGCACACCATGTGTCAGACCATCTCCGTGGACCGCGGACACATCGCGGCGGCCGGCGGCCCCAACGGCATCGCCCGCGGGGGAGCCTCCAGTTATGGCCGCACCCCCATGCGGGCGGGCGCGCACACGCCGACTTACCGCGAGGCGGGGCTGAAGACGCCGCTCCAGGGCAACGCAACGCCGATCTACGAGGCGGGAGCTCGCACGCCTCACTACGGGTCCAGCACGCCGGCGCACGAGGGCGGCAGGACACCGGCCCACCCCGCCTGGGACGCCGCCGCCCACACGCCGCGTCCCGACCACGATCTGCTGCTGGCGTCCGCCTCTCCTCCGCCCGCCGCCTCCTCCTCGCACTACGACGCCGCCTACCAGCAGGGGCCCTTCACGCCGCAGACGCCGGGCACCATGTACGGCTCCGATCACACCTACAGCCCGTACCGACCCAGCCCGAGCCCCGGCACTTACGCCGGCTACCTGGCCACACCCAGCCCGGCGCCCTACTCGCCCCGCTCGCCCTACACGGCCGAGGACGCCGACGACTGGCACGCGCCCGACCTGGAGGTACGCGTGCGGGGCGGAGCGGAGCCGGGCCTGCGGGGGCAGGCGGGAGCGCTGCGGAGCGTGTCGGGCGCCACGTGCGCCGTGTACCTGCCGCTGGAGGACCGCGTGCTCAACCTGCCCGCGCACCTGCTGGAGCCCGTGGTGCCTCACAGCGGGGACCGGGTCAAGGTGATCGCGGGCGAGGACCGGGAGGCGGTCGGCCAGCTCATCTCCATCGAGAACCAGGAGGGGGTCGTGAAGTTCGGCTCCGACGACATCAAGATCATGCAGCTGAGACATCTCTGCAAGATGGCCTCCAACTGA

Protein sequence:

>DPOGS210135-PA
MSDSEGSNYSGSGSDAGSVVSNRSRRSAASNRSAKSGSRSRSHSGSRSPSRSPSRSRSRSRSRSRSRSRSRSAGSDGSRNRDDEAKEASGDEEVEDEQEPEGEDLVDSEEYDEDEEEERRRKKRKKDSRYGGFIIDEAEVDDEVDEDDEWEEGAQEMGIVGNEVDEIGPTAREIEGRRRGTNLWDSQKEEEIEEYLRNKYADESAALRHFGEGGEEMSDEITQQTLLPGIKDPNLWMVKCRIGEEKATVLLLMRKFITYQNSEEPFQIKSVVAPEGVKGFIYIEAYKQTHVKAIIDKVGNLRMGTWKQEMVPIKEMTDVLRVVKEQSGLKPKQWVRLKRGLYKDDIAQVDYVDLAQNQVHLKLLPRIDYTRLRGALRTVQSESEAAKRKKKRRPAAKPFDPEAIRAIGGEVTSDGDFLIFEGNRYSRKGFLYKNFTMSAILAEGVKPTLTELERFEEQPEGIDIELAAPAKDDPTSLHSFSMGDNVEVCSGDLANLQARIIAIDGSMITVMPRHDALKDPLVFKPNELRKYFKQGDHVKVLAGRYEGDTGLIVRVEPHRAVLVSDVTMHELEVLPRDLQLCSDMATGVDSLGQFQWGDMVLLDSQTVGVIVRLERENFHVLGMQGKVIECKPQALQKRRENRFTMALDSEHNSIQKKDIVKVIDGPHAGREGEIKHLYRNFAFLQSRMYPDNGGIFVCKTRHLQLAGGAKNAAASNGLALAFMSPRIQSPMHPSGRGGGRGRGRGGRGAVARDRELIGQTINRDATGSTARVELHTMCQTISVDRGHIAAAGGPNGIARGGASSYGRTPMRAGAHTPTYREAGLKTPLQGNATPIYEAGARTPHYGSSTPAHEGGRTPAHPAWDAAAHTPRPDHDLLLASASPPPAASSSHYDAAYQQGPFTPQTPGTMYGSDHTYSPYRPSPSPGTYAGYLATPSPAPYSPRSPYTAEDADDWHAPDLEVRVRGGAEPGLRGQAGALRSVSGATCAVYLPLEDRVLNLPAHLLEPVVPHSGDRVKVIAGEDREAVGQLISIENQEGVVKFGSDDIKIMQLRHLCKMASN-
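
The lengths stated in this record agree with each other: a 1060 aa protein plus one stop codon needs 3 × (1060 + 1) = 3183 nt, which is the stated transcript length. The same check can be run on the sequences themselves. What follows is a minimal sketch, not part of the database. The function names `read_fasta` and `lengths_consistent` are hypothetical, and it assumes that a trailing `-` or `*` in the protein marks the stop codon and that a sequence may be wrapped across several lines.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line  # a sequence may span several lines
    return records


def lengths_consistent(cds, protein):
    """Return True if len(cds) == 3 * (residues + 1 stop codon).

    Assumption: a trailing '-' or '*' in the protein marks the stop
    codon and is not counted as a residue.
    """
    residues = protein.rstrip("-*")
    return len(cds) == 3 * (len(residues) + 1)


# Toy records built from the first three codons of the sequences above
# plus the final TGA stop codon; the CDS is wrapped on purpose.
toy = """>toy-TA
ATGTCG
GACTGA

>toy-PA
MSD-
"""
records = read_fasta(toy)
print(lengths_consistent(records["toy-TA"], records["toy-PA"]))
```

To apply it to this record, paste the two FASTA blocks into one string and compare `records["DPOGS210135-TA"]` with `records["DPOGS210135-PA"]`.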