Monarch geneset OGS2.0

DPOGS208958
Transcript: DPOGS208958-TA   975 bp
Protein: DPOGS208958-PA   324 aa
Genomic position: DPSCF300009 + 773061-779662
RNAseq coverage: 72x (Rank: top 66%)
Annotation:
  Heliconius   HMEL015775   1e-101   56.95%
  Bombyx   BGIBMGA002431-TA   8e-22   58.44%
  Drosophila   CG14721-PA   5e-47   38.49%
  EBI UniRef50   UniRef50_D6W952   8e-53   45.34%   Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6W952_TRICA
  NCBI RefSeq   XP_975470.2   2e-53   45.34%   PREDICTED: similar to thiamin pyrophosphokinase 1 [Tribolium castaneum]
  NCBI nr blastp   gi|332373478   2e-54   44.31%   unknown [Dendroctonus ponderosae]
  NCBI nr blastx   gi|332373478   3e-54   42.91%   unknown [Dendroctonus ponderosae]
Group:
Gene Ontology:
  GO:0004788   1.8e-46   thiamine diphosphokinase activity
  GO:0006772   1.8e-46   thiamine metabolic process
  GO:0005524   1.1e-35   ATP binding
  GO:0009229   1.1e-35   thiamine diphosphate biosynthetic process
KEGG pathway: mmu:29807   1e-45
  K00949 (E2.7.6.2, THI80) maps-> Thiamine metabolism
InterPro domain:
  [86-318]   IPR006282   1.8e-46   Thiamin pyrophosphokinase
  [83-210]   IPR007371   1.1e-35   Thiamin pyrophosphokinase, catalytic domain
  [236-322]   IPR007373   8.5e-26   Thiamin pyrophosphokinase, vitamin B1-binding domain
Orthology group: MCL15289   Single-copy universal gene

Nucleotide sequence:

>DPOGS208958-TA
ATGTTTTGCAAGTCCATGTTTGATAAACACTCGTATTCACGATATTTATTACTGTATAAATATTTATCGGCGTATGAAAAAAGTTACAGGTTGGACTCAAACAATATATTGTCAAATTTAAATAGGATAATGACTAGGAATTTAGCATTGACTAAGGTGTCAGAAAATCATAGCGATTTATCAAATAATATCATAAAATGCTGGAAATGGAATGTAAATAAAATACTTAATGTACAGGAAAATAAGAAATATGCAATATTAATACTGAATTGTAGAATAACACAGAAGAAAGACATCATTAAACGATTTTGGAATGAAGCATCATTGAGAATAACTGTTGATGGTGGAACCTCACATTGGGATAAGTTTTTGAATCATTTATCACACGATGAACAAAAATCAATGAAATGCCCCGATCTTGTGACTGGAGACTTTGATTCTATAAGTGAAGAGATGTTGCAGAAATATAAAGACAAACATTGTAAGATAATAAGCACACCTGATCAGGATTTCACAGATTTTACAAAGGCTATCATAGAATTGAATAATTACTGTGAAGAGAATAAAGTACAGATGGACTATGCCGTTGTGATGGCTCAGAATTCAGGTCGCCTTGATCAAATACTGGGAAACATTCAAACACTGCATCTTATTAAGGAAAACAGGTTACTGCATCCGCAGACTAGAGTGTACATGTTGTCAGATGACTCTATATCCTGGCTTCTACATCCCGGAGACCACATCATAGAAATTCCGCTTGCAAGTAGGAATGGCAATGCATGGTGTTCGCTAATACCAGTAGGAGAGCCATGTATAAGCGTCACAACCAGTGGACTTAAATGGAACTTAGATAATCAAAAATTGAATTTTGGTGGTCTTATAAGCACATCGAACACATTCGACGGATCCGACCAGGTTAAAGTTAAATGTAGTCACACGTTGTTGTGGTCTATGGAAATACCAACTCTGATGTAG

Protein sequence:

>DPOGS208958-PA
MFCKSMFDKHSYSRYLLLYKYLSAYEKSYRLDSNNILSNLNRIMTRNLALTKVSENHSDLSNNIIKCWKWNVNKILNVQENKKYAILILNCRITQKKDIIKRFWNEASLRITVDGGTSHWDKFLNHLSHDEQKSMKCPDLVTGDFDSISEEMLQKYKDKHCKIISTPDQDFTDFTKAIIELNNYCEENKVQMDYAVVMAQNSGRLDQILGNIQTLHLIKENRLLHPQTRVYMLSDDSISWLLHPGDHIIEIPLASRNGNAWCSLIPVGEPCISVTTSGLKWNLDNQKLNFGGLISTSNTFDGSDQVKVKCSHTLLWSMEIPTLM-
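
The transcript and protein records can be cross-checked against each other: 975 bp is 325 codons, that is 324 amino acids plus one stop codon, which the protein record writes as a trailing "-". The following is a minimal sketch of that check, assuming the standard genetic code; the helper name translate and its stop_symbol parameter are hypothetical and not part of the database. Only the first 15 and last 9 nucleotides of DPOGS208958-TA are used here to keep the example short.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as stop_symbol, matching the trailing "-"
    used in the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(stop_symbol if r == "*" else r for r in residues)


# First 15 nt and last 9 nt of the DPOGS208958-TA transcript.
first_codons = translate("ATGTTTTGCAAGTCC")
last_codons = translate("CTGATGTAG")

# 975 bp gives 325 codons: 324 residues plus the stop codon.
expected_protein_length = 975 // 3 - 1
```

Applied to the full nucleotide sequence, the same function should reproduce the DPOGS208958-PA record if the transcript is a complete coding sequence in frame.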