Monarch geneset OGS2.0

DPOGS200490
Transcript: DPOGS200490-TA (996 bp)
Protein: DPOGS200490-PA (331 aa)
Genomic position: DPSCF300158, 50937-52300
RNAseq coverage: 2956x (Rank: top 4%)
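
The figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the genomic coordinates are 1-based and inclusive, and that the 996 bp transcript is entirely coding sequence ending in a single stop codon. Neither assumption is stated in the record, and the function and variable names are hypothetical.

```python
# Values copied from the record above.
GENOMIC_START = 50937
GENOMIC_END = 52300
TRANSCRIPT_BP = 996
PROTEIN_AA = 331


def inclusive_span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


def expected_protein_length(cds_bp: int) -> int:
    """Residues encoded by a CDS that ends in one stop codon (assumption)."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


genomic_span = inclusive_span(GENOMIC_START, GENOMIC_END)
non_transcript_bp = genomic_span - TRANSCRIPT_BP

print(genomic_span)       # 1364
print(non_transcript_bp)  # 368
print(expected_protein_length(TRANSCRIPT_BP) == PROTEIN_AA)  # True
```

Under these assumptions the locus spans 1364 bp, of which 368 bp are not part of the transcript. That would be consistent with one or more introns, although the record itself does not list the exon structure.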
Annotation
Heliconius      HMEL008268        7e-175  91.24%
Bombyx          BGIBMGA010423-TA  5e-172  90.63%
Drosophila      Mdh1-PA           2e-136  74.16%
EBI UniRef50    UniRef50_P40925   1e-124  66.26%  Malate dehydrogenase, cytoplasmic n=84 Tax=root RepID=MDHC_HUMAN
NCBI RefSeq     NP_001040257.1    4e-171  90.94%  cytosolic malate dehydrogenase [Bombyx mori]
NCBI nr blastp  gi|114052561      8e-170  90.94%  cytosolic malate dehydrogenase [Bombyx mori]
NCBI nr blastx  gi|114052561      2e-168  90.94%  cytosolic malate dehydrogenase [Bombyx mori]
Group
Gene Ontology
GO:0055114  9.6e-222  oxidation-reduction process
GO:0006108  9.6e-222  malate metabolic process
GO:0016615  9.6e-222  malate dehydrogenase activity
GO:0030060  1.6e-162  L-malate dehydrogenase activity
GO:0016616  3.2e-81   oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
GO:0003824  3.2e-81   catalytic activity
GO:0005975  3.2e-81   carbohydrate metabolic process
GO:0005488  6.3e-71   binding
GO:0044262  1.4e-68   cellular carbohydrate metabolic process
GO:0016491  1.5e-36   oxidoreductase activity
KEGG pathway
aag:AaeL_AAEL007707  8e-138
K00025 (MDH1) maps to:
    Citrate cycle (TCA cycle)
    Pyruvate metabolism
    Proximal tubule bicarbonate reclamation
    Carbon fixation in photosynthetic organisms
    Glyoxylate and dicarboxylate metabolism
InterPro domain
[1-326]    IPR010945  9.6e-222  Malate dehydrogenase, type 2
[6-326]    IPR011274  1.6e-162  Malate dehydrogenase, NAD-dependent, cytosolic
[155-329]  IPR015955  3.2e-81   Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal
[1-154]    IPR016040  6.3e-71   NAD(P)-binding domain
[4-326]    IPR001557  1.4e-68   L-lactate/malate dehydrogenase
[156-327]  IPR022383  4.8e-44   Lactate/malate dehydrogenase, C-terminal
[5-153]    IPR001236  1.5e-36   Lactate/malate dehydrogenase, N-terminal
Orthology group: MCL15150, Single-copy universal gene

Nucleotide sequence:

>DPOGS200490-TA
ATGGCTGCACCAATCAAAGTTGTTGTAACAGGGGCTGCTGGTCAAATCGCGTACTCTCTTCTATACCAAGTTGCTTCTGGAGCGGTTTTTGGTCCAGAACAACCTGTATTCCTGCACCTTCTTGATATTGCTCCCATGATGGGAGTGCTGGAAGGTGTTGTTATGGAACTTGCTGACTGCGCTCTACCTTTATTAGCTGGTGTTTTACCCACTGCCAATCCTGAAGAAGCATTTAAGGATGTAGCTGCTGCTTTTCTAGTTGGTGCTATGCCAAGAAGGGAAGGAATGGAAAGAAAAGACCTTCTTTCTGCTAATGTACGCATTTTCAAAGAACAAGGTCAAGCTTTGGATAAAGTTGCTCGCAAGGATGTCAAAGTTCTTGTTGTTGGCAACCCTGCTAATACAAATGCTTTGATCTGTTCTAAATATGCCCCATCTATTCCAAAAGAGAATTTCACTGCTATGACTCGTCTTGATCAGAATCGTGCCCAGTCTCAATTAGCAGCTAAGCTTGGAGTACCAGTACAGGATGTTAAGAATGTTATTATCTGGGGTAACCATTCATCCACTCAATTCCCTGATGCCTCCAATGCGAAAGTCAGAATTGGTGGTGTTGAAAAATCTGTGCCGGAAGCTATAAACAATGATGCTTTCTTAAAAACTGATTTTGTATCCACTGTACAAAAGCGCGGTGCAGCTGTTATAGCAGCTAGAAAGATGTCATCAGCTTTGTCAGCGGCCAAGGCAGCTTCTGATCACATGAGAGATTGGTTCTTGGGAACTGGCGATCGTTGGGTCAGTATGGGAGTTGTGTCTGATGGTTCTTACGGTGTTCCCAAAGACGTTGTTTATTCCTTCCCCGTCACTGTCTCTAATGGAAAATGGAAAATTGTTGAGGGTCTTACAATCTCTGATTTTGCACGGGAAAAATTAGATATCACTGGAAAGGAACTAGTTGAAGAGAAACAGGATGCTTTGGATGTGTGCAAAGATTAA

Protein sequence:

>DPOGS200490-PA
MAAPIKVVVTGAAGQIAYSLLYQVASGAVFGPEQPVFLHLLDIAPMMGVLEGVVMELADCALPLLAGVLPTANPEEAFKDVAAAFLVGAMPRREGMERKDLLSANVRIFKEQGQALDKVARKDVKVLVVGNPANTNALICSKYAPSIPKENFTAMTRLDQNRAQSQLAAKLGVPVQDVKNVIIWGNHSSTQFPDASNAKVRIGGVEKSVPEAINNDAFLKTDFVSTVQKRGAAVIAARKMSSALSAAKAASDHMRDWFLGTGDRWVSMGVVSDGSYGVPKDVVYSFPVTVSNGKWKIVEGLTISDFAREKLDITGKELVEEKQDALDVCKD-
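
The relationship between the two sequences above can be checked by translating the nucleotide sequence and comparing it with the protein sequence. The following is a minimal sketch, assuming the standard nuclear genetic code and following the record's convention of writing the stop codon as `-`. The `translate` helper and the `head` and `tail` names are hypothetical; the fragments are the first 18 and last 12 nucleotides of DPOGS200490-TA.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon, marking stops with stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


head = "ATGGCTGCACCAATCAAA"  # first 18 nt of DPOGS200490-TA
tail = "TGCAAAGATTAA"        # last 12 nt of DPOGS200490-TA

print(translate(head))  # MAAPIK
print(translate(tail))  # CKD-
```

Both fragments agree with the start (`MAAPIK`) and end (`CKD-`) of DPOGS200490-PA. Passing the full nucleotide sequence to the same function would give a complete comparison.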