Monarch geneset OGS2.0

DPOGS212201
Transcript: DPOGS212201-TA  618 bp
Protein: DPOGS212201-PA  205 aa
Genomic position: DPSCF300323 - 126065-126682
RNAseq coverage: 86x (Rank: top 63%)
Annotation
Heliconius: HMEL013284  6e-35  35.15%
Bombyx: BGIBMGA000994-TA  3e-69  57.92%
Drosophila: Mdh2-PA  4e-36  36.87%
EBI UniRef50: UniRef50_B4L2P2  2e-30  33.67%  Malate dehydrogenase n=1 Tax=Drosophila mojavensis RepID=B4L2P2_DROMO
NCBI RefSeq: XP_003137632.1  2e-36  38.00%  malate dehydrogenase [Loa loa]
NCBI nr blastp: gi|312069331  3e-35  38.00%  malate dehydrogenase [Loa loa]
NCBI nr blastx: gi|66513092  4e-34  37.00%  PREDICTED: malate dehydrogenase, mitochondrial-like isoform 1 [Apis mellifera]
Group
Gene Ontology:
GO:0055114  1.1e-40  oxidation-reduction process
GO:0030060  1.1e-40  L-malate dehydrogenase activity
GO:0006108  1.1e-40  malate metabolic process
GO:0016616  7.7e-33  oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
GO:0005975  7.7e-33  carbohydrate metabolic process
GO:0003824  7.7e-33  catalytic activity
GO:0005488  3.5e-05  binding
KEGG pathway: ame:408950  6e-36
K00026 (MDH2) maps to:
    Citrate cycle (TCA cycle)
    Pyruvate metabolism
    Carbon fixation in photosynthetic organisms
    Glyoxylate and dicarboxylate metabolism
InterPro domain:
[1-200]  IPR010097  1.1e-40  Malate dehydrogenase, type 1
[40-200]  IPR015955  7.7e-33  Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal
[40-199]  IPR022383  2.9e-25  Lactate/malate dehydrogenase, C-terminal
Orthology group: MCL44291  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212201-TA
ATGTCGCCGATGCCGTTCGTAGGTATCGCTACGGAACCCATTAACACTTTAGTTCCCATGGCTGCGGAAATAATGAAGAACCATGGGGAATATGATCCCAAAAAAATGTTCGGCATCACAATATTAGATAAGCTGAAAACAGAAGCATTGTACGCCGCGGAAGCCGAAAAGGATCCTCAAAACTGCAACGTCCCAGTGATAGGCGGCCACTCAGAAAAAACCCTGATACCGCTACTGTCACAGGCAGAACCCAAATGTAACTTGGACGAGAAAAGAATACAAGAATTCACATCTAGGGTGAGGTCATCTGATAGCGCAATTTTGAAATCAAAATGCGGATGGTCGCCATCTTTGTCCGTAGCGTACGGCGCTGTGGCATTCACTAAATGTATTATGGATGCTTTGGACGGTCGAACGACTCAAATACAAGCGTACGTTGAAAACAATGACTTCGGCACGTCGTATTTCTCTGGACTGGTCACCGTTGATCAAAATGGAGTTAAGGAGATGCAAAGCTACTCAAACCTATCGTCATACGAATGTCAGTTGTTAGAAAGAAGTATCGAGCAGCTGAGAAAGGAAGTCTTGATGGGGAAGAAGGCACTGGAGCTGGAGTAG

Protein sequence:

>DPOGS212201-PA
MSPMPFVGIATEPINTLVPMAAEIMKNHGEYDPKKMFGITILDKLKTEALYAAEAEKDPQNCNVPVIGGHSEKTLIPLLSQAEPKCNLDEKRIQEFTSRVRSSDSAILKSKCGWSPSLSVAYGAVAFTKCIMDALDGRTTQIQAYVENNDFGTSYFSGLVTVDQNGVKEMQSYSNLSSYECQLLERSIEQLRKEVLMGKKALELE-
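
Consistency check:

The transcript length, protein length, and genomic coordinates in this record can be cross-checked against the two sequences. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) and assumes that the trailing "-" in the protein sequence marks the stop codon. The function name `translate` and the constants are illustrative only, and only the first and last codons of DPOGS212201-TA are used to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stop codons become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# Values copied from the record above.
start, end = 126065, 126682
transcript_bp = 618
protein_aa = 205

# Coordinates are treated as inclusive (an assumption that matches the stated 618 bp).
span = end - start + 1
# One codon is the stop codon, so it is not counted as a residue.
residues = transcript_bp // 3 - 1

# First ten and last seven codons of DPOGS212201-TA.
first_codons = "ATGTCGCCGATGCCGTTCGTAGGTATCGCT"
last_codons = "AAGGCACTGGAGCTGGAGTAG"

print(span == transcript_bp, residues == protein_aa)
print(translate(first_codons))
print(translate(last_codons))
```

Pasting the full nucleotide sequence into `translate` should reproduce the protein sequence above if both assumptions hold.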