Monarch geneset OGS2.0

DPOGS209159
Transcript: DPOGS209159-TA, 1491 bp
Protein: DPOGS209159-PA, 496 aa
Genomic position: DPSCF300061 - 177396-184500
RNAseq coverage: 2900x (Rank: top 4%)
Annotation
Heliconius: HMEL009754 (E-value 0.0, 81.38%)
Bombyx: BGIBMGA011477-TA (E-value 0.0, 80.97%)
Drosophila: Hex-A-PA (E-value 0.0, 66.12%)
EBI UniRef50: UniRef50_D3TRU8 (E-value 3e-178, 62.81%) Hexokinase n=36 Tax=Arthropoda RepID=D3TRU8_GLOMM
NCBI RefSeq: XP_001660031.1 (E-value 0.0, 68.28%) hexokinase [Aedes aegypti]
NCBI nr blastp: gi|312373968 (E-value 0.0, 68.78%) hypothetical protein AND_16684 [Anopheles darlingi]
NCBI nr blastx: gi|312373968 (E-value 0.0, 68.78%) hypothetical protein AND_16684 [Anopheles darlingi]
Group
Gene Ontology:
    GO:0005524 (E-value 9.7e-239) ATP binding
    GO:0016773 (E-value 9.7e-239) phosphotransferase activity, alcohol group as acceptor
    GO:0005975 (E-value 9.7e-239) carbohydrate metabolic process
KEGG pathway: aag:AaeL_AAEL009387 (E-value 0.0)
 K00844 (HK) maps->
    Starch and sucrose metabolism
    Galactose metabolism
    Glycolysis / Gluconeogenesis
    Amino sugar and nucleotide sugar metabolism
    Fructose and mannose metabolism
    Type II diabetes mellitus
    Streptomycin biosynthesis
    Insulin signaling pathway
    Butirosin and neomycin biosynthesis
InterPro domain:
    [3-494] IPR001312 (E-value 9.7e-239) Hexokinase
    [253-491] IPR022673 (E-value 8e-76) Hexokinase, C-terminal
    [10-206] IPR022672 (E-value 2e-68) Hexokinase, N-terminal
Orthology group: MCL10337, Multiple-copy universal gene

Nucleotide sequence:

>DPOGS209159-TA
ATGGGAATGGCAATCAATCATTTACCACAGATTCGCGAGGAATGCGAAGTCTTCCATCTATCAGATAAGCAGTTGAAGGAGATTATGAGCAGACTTCACAACGATCTGCTTAAGGGTCTAGGCAAAGACAGTCATGCGAATGCCATTGTGAAGTGCTGGATAACTTATATACAAGACTTACCGAATGGAAAAGAGCGCGGTAAATTCTTGGCCTTGGACTTAGGCGGGACAAATTTCAGAGTTTTAATAATAAATTTAGGCGACAATCACTTCGACATGCAGTCAAAAATTTACGCTATACCCAATCACATAATGACCGGCACCGGGATCGCTCTGTTCGATCATATAGCGGAATGTCTAGCTAATTTCATGAAGGAACACAACGTGTATGAAGAACGCCTGGCTTTGGGCTTCACGTTCAGTTTCCCGCTGAAACAGCTCGGCCTCACCAAGGGTATACTACAACGTTGGACCAAAGGATTCTCATGCTCCGGGGTGGTGGGAGAAGACGTAGTACAAGGCTTAAAAGACGCGATCGCTAGAAGAGGAGACCTCTCATTATCCGTCATGGGCATCCTAAACGATGCCACGGGGACGCTCATGTCGTGTGCCCATAGAGATAAATATTGCAAGATAGGAATTATCATTGATAATATGAGTTCCTGTCGGCGGATGTCCGAGGACGTACAGATCGATATATGCGCCATATTGAATGACACCACGGGGACTTTGATGTCGTGCGCATGGAAAAACCACAATTGCAAGATAGGACTCATAGTCGGTACAGGCAGCAACGCGTGCTATGTTGAGAAAACTGAGAACTGCGAACTGTTCGACGGGGAGCCCGGGAAGCCGGAGTTGTTGATCAACACGGAATGGGGCGCCTTCGGTGACGACGGCGCCCTGGACTTCGTGAGGACGGAGTTCGACAGGGACGTGGACGAGAACTCCATCAACCCGGGGAAGCAGATTCAAGAGAAGATGATATCGGGCATGTACATGGGCGAGCTGGTGAGGCTGGTGTTGGTGAAGTTCACGAGGATGGGGCTACTGTTCGGAGGCCGGGGCTCCGACCTGCTGTTCCAGAGAGGCAAATTCTACACCAAGTACGTGTCCGAGATAGAGTCCGACAAGCCCGGAGACTTCACCAGCTGTATGGAGGTCTTGGAGGAACTAGGTCTAGAGCACGCCAGCGAGTCGGACATGGCGGGCGTGCGCCACGTGTGCGAGTGCGTGTCGCGGCGAGCGGCGCATCTGGTGTCCAGCGGCCTAGCGACGCTCCTCAACAAGATGTCGGAGCCGCGCGTCACGGTCGGCATCGACGGCTCCGTGTATCGCTTCCACCCGCACTTCCACACGCTCATGTGTGACAAAATCGCCACGCTCGTACGACCGGGCCTGCAGTTCGACCTGATGTTGTCGGAGGACGGCAGCGGCCGCGGGGCGGCGCTGGTGGCGGCTGTCGCGTGTCGCGAGAGAGAACTCGCGTAG

Protein sequence:

>DPOGS209159-PA
MGMAINHLPQIREECEVFHLSDKQLKEIMSRLHNDLLKGLGKDSHANAIVKCWITYIQDLPNGKERGKFLALDLGGTNFRVLIINLGDNHFDMQSKIYAIPNHIMTGTGIALFDHIAECLANFMKEHNVYEERLALGFTFSFPLKQLGLTKGILQRWTKGFSCSGVVGEDVVQGLKDAIARRGDLSLSVMGILNDATGTLMSCAHRDKYCKIGIIIDNMSSCRRMSEDVQIDICAILNDTTGTLMSCAWKNHNCKIGLIVGTGSNACYVEKTENCELFDGEPGKPELLINTEWGAFGDDGALDFVRTEFDRDVDENSINPGKQIQEKMISGMYMGELVRLVLVKFTRMGLLFGGRGSDLLFQRGKFYTKYVSEIESDKPGDFTSCMEVLEELGLEHASESDMAGVRHVCECVSRRAAHLVSSGLATLLNKMSEPRVTVGIDGSVYRFHPHFHTLMCDKIATLVRPGLQFDLMLSEDGSGRGAALVAAVACRERELA-