Monarch geneset OGS2.0

DPOGS212210
Transcript:        DPOGS212210-TA  1050 bp
Protein:           DPOGS212210-PA  349 aa
Genomic position:  DPSCF300323 + 127929-128978
RNAseq coverage:   45x (Rank: top 71%)
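The three lengths in this record can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the genomic coordinates are 1-based and inclusive and that the transcript is coding sequence only (no UTRs), neither of which the record states. The helper name `span_length` is hypothetical.

```python
# Values copied from the record.
GENOMIC_START, GENOMIC_END = 127929, 128978
TRANSCRIPT_BP = 1050
PROTEIN_AA = 349


def span_length(start: int, end: int) -> int:
    """Length of an interval, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


genomic_bp = span_length(GENOMIC_START, GENOMIC_END)

# If the genomic span equals the transcript length, the gene model
# leaves no room for introns between its first and last base.
spans_match = genomic_bp == TRANSCRIPT_BP

# ASSUMPTION: the transcript is a complete CDS, so it holds one codon
# per residue plus one stop codon.
cds_matches_protein = TRANSCRIPT_BP == 3 * (PROTEIN_AA + 1)

print(genomic_bp, spans_match, cds_matches_protein)  # 1050 True True
```

Under those assumptions the numbers agree, which is consistent with a single-exon gene model.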
Annotation
Heliconius:      HMEL013284        3e-55   34.02%
Bombyx:          BGIBMGA001156-TA  1e-109  56.17%
Drosophila:      CG10749-PA        1e-51   35.14%
EBI UniRef50:    UniRef50_Q9VU29   2e-49   35.14%  Malate dehydrogenase n=10 Tax=Drosophila RepID=Q9VU29_DROME
NCBI RefSeq:     XP_002066060.1    1e-51   35.40%  GK22135 [Drosophila willistoni]
NCBI nr blastp:  gi|195436206      3e-50   35.40%  GK22135 [Drosophila willistoni]
NCBI nr blastx:  gi|195327277      6e-48   35.46%  GM25384 [Drosophila sechellia]
Group
Gene Ontology:
  GO:0055114  1.4e-64  oxidation-reduction process
  GO:0006108  1.4e-64  malate metabolic process
  GO:0030060  1.4e-64  L-malate dehydrogenase activity
  GO:0016616  5.1e-36  oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
  GO:0003824  5.1e-36  catalytic activity
  GO:0005975  5.1e-36  carbohydrate metabolic process
  GO:0005488  5.5e-21  binding
  GO:0016491  5.2e-17  oxidoreductase activity
  GO:0044262  1.1e-11  cellular carbohydrate metabolic process
KEGG pathway:  dwi:Dwil_GK22135  4e-51
  K00026 (MDH2) maps to:
    Citrate cycle (TCA cycle)
    Pyruvate metabolism
    Carbon fixation in photosynthetic organisms
    Glyoxylate and dicarboxylate metabolism
InterPro domain:
  [3-348]    IPR010097  1.4e-64  Malate dehydrogenase, type 1
  [174-338]  IPR015955  5.1e-36  Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal
  [174-337]  IPR022383  2.6e-23  Lactate/malate dehydrogenase, C-terminal
  [28-173]   IPR016040  5.5e-21  NAD(P)-binding domain
  [29-158]   IPR001236  5.2e-17  Lactate/malate dehydrogenase, N-terminal
  [27-341]   IPR001557  1.1e-11  L-lactate/malate dehydrogenase
Orthology group:  MCL44299  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212210-TA
ATGAATGTATTTAGTAGAAGTTTACAAAGAACACAATTTAAATCTCTTTGTTCTGTAAAAAATTATGTGCAGCGAAGAAATGTACAAATATCGATAGTCGGTGCCGCTAGTGACATTGGAAAGAACGTGGCATTGCTGTTGAAACGGAATCCTTATATTACGAGGCTGCACTTATACGATGACAATAATATAGTAAAAGGTATTGGCCTGGAATTGGAACAGATTCCCGGAGGACCCAAAGTAGCTTCATTTTCGGGGGATCCCTTTTTGTCGGCAGCTATAAGATACTCCAATCTGGTACTGCTTGTTTCGAGAACTCCTCGTAAACTCGGATTTTCCCGTGAACAGATGCTAGCCACGAATGCTATACCCGTATATAAATTGTGCAAAGTTTTGTCATACCAAAACCCTGATGCCTTTTTTGCCATTTCAACTAATCCTATAAATTCAATAATACCGTTTGCAAACTTATTATTAAAAAACTGCAACGCTCATAATCCGTTTAAACTCTTCGGTATAACCCACATCGACACGACGAGAGCCAGGGCCTTCATTTCAAACACTTTGAACGTTAATCCGAGACATCTCTATGTACCGGTGATCGGAGGGCACTCTGACGAAACGATAGTTCCCCTATTTTCAAATTTGTGTCCTAGCCATTATTGCGTCGGTCACTGTGAAGCTGACACATTGACACGTCTGATTAAGAAATCTGGCACAGAAGTGTTAAACCGAAAGCACGGCAGTGATTCATCGACGCTGGCTATGGCCTGGTCGATCAACGAATTCGTCCAAAATCTCATTGAAGCTCTGTACGGGAACTGTGTAATCGTGAACAGTTATACAGCGAATCCTCATTTTGGAACAAAATTTTTTTCAGGTCCGACTAAGGTTGGACCCGAAGGCGTCATAGAAACATGCAATAAGACCTTTCACATGAGTGATTATGAAAGCAAACTCCTGGAGCGCGCTGTTCCCATCATAAACAGGGACGTGGCTGAGGGTGAAGCGCATGTCAGCGTCCTCGAAAGTGCCAGGAGTTGTTACTAG

Protein sequence:

>DPOGS212210-PA
MNVFSRSLQRTQFKSLCSVKNYVQRRNVQISIVGAASDIGKNVALLLKRNPYITRLHLYDDNNIVKGIGLELEQIPGGPKVASFSGDPFLSAAIRYSNLVLLVSRTPRKLGFSREQMLATNAIPVYKLCKVLSYQNPDAFFAISTNPINSIIPFANLLLKNCNAHNPFKLFGITHIDTTRARAFISNTLNVNPRHLYVPVIGGHSDETIVPLFSNLCPSHYCVGHCEADTLTRLIKKSGTEVLNRKHGSDSSTLAMAWSINEFVQNLIEALYGNCVIVNSYTANPHFGTKFFSGPTKVGPEGVIETCNKTFHMSDYESKLLERAVPIINRDVAEGEAHVSVLESARSCY-
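The protein record appears to be a direct translation of the transcript, with the trailing `-` standing for the stop codon. A minimal sketch for checking that is given here, assuming the standard genetic code (NCBI translation table 1) and a transcript that starts in frame at its first base. The names `CODON_TABLE` and `translate` are hypothetical helpers, not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons
# enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame DNA coding sequence into one-letter residues.

    Stop codons are written as `stop_symbol`; the default "-" matches
    the last character of the protein record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = [CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3)]
    return "".join(residues).replace("*", stop_symbol)


# First six and last four codons of DPOGS212210-TA, copied from the record.
print(translate("ATGAATGTATTTAGTAGA"))  # MNVFSR
print(translate("AGTTGTTACTAG"))        # SCY-
```

Passing the full DPOGS212210-TA sequence to `translate` and comparing the result with DPOGS212210-PA would extend the same check to the whole record.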