Monarch geneset OGS2.0

DPOGS206635
Transcript: DPOGS206635-TA (1569 bp)
Protein: DPOGS206635-PA (522 aa)
Genomic position: DPSCF300048 - 553493-560436
RNAseq coverage: 1427x (Rank: top 9%)
Annotation
Heliconius: HMEL011153; 0.0; 90.61%
Bombyx: BGIBMGA011482-TA; 3e-52; 29.16%
Drosophila: CG17896-PB; 0.0; 68.08%
EBI UniRef50: UniRef50_Q02252; 0.0; 63.64%; Methylmalonate-semialdehyde dehydrogenase [acylating], mitochondrial n=21 Tax=Eutheria RepID=MMSA_HUMAN
NCBI RefSeq: XP_312441.4; 0.0; 71.98%; AGAP002499-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|158290905; 0.0; 71.98%; AGAP002499-PA [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|158290905; 0.0; 72.25%; AGAP002499-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology:
  GO:0004491; 1.2e-222; methylmalonate-semialdehyde dehydrogenase (acylating) activity
  GO:0055114; 1.2e-222; oxidation-reduction process
  GO:0008152; 2.7e-142; metabolic process
  GO:0016491; 2.7e-142; oxidoreductase activity
  GO:0016620; 1.6e-61; oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor
KEGG pathway: aga:AgaP_AGAP002499; 0.0
  K00140 (E1.2.1.27, mmsA, iolA) maps to:
    Inositol phosphate metabolism
    Propanoate metabolism
    Valine, leucine and isoleucine degradation
InterPro domain:
  [1-520] IPR010061; 0; Methylmalonate-semialdehyde dehydrogenase
  [10-509] IPR016161; 2.7e-142; Aldehyde/histidinol dehydrogenase
  [37-500] IPR015590; 3.5e-133; Aldehyde dehydrogenase domain
  [27-283] IPR016162; 1.6e-79; Aldehyde dehydrogenase, N-terminal
  [284-472] IPR016163; 1.6e-61; Aldehyde dehydrogenase, C-terminal
Orthology group: MCL16060; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206635-TA
ATGGCGCTGAATCTCTTGAAACTTATTAAATCAGAATCTCATATATTGTTACGAAGACACTATAGCAGTTCAGCACCGTCAACTAAGTTATACATAGATGGACAATATGTGGAATCAAAAACTAGCAACTGGATTGAACTCACCAATCCCGCAACTAATGAAGTTATCGGCAGGGTACCAGAGGCGACTCAAGAGGAATTGAATTTGGCACTGGAAGCTGCTAAGAAGGCATACAAATCATGGAGTCAGAGCACTGTATTAACTCGTCAACAACTCATGTTGAAGTTTGCTCGTCTTCTAAGAGAAAATCAAAGTAAATTAGCAGCTAAAATAACAGAGGAGCAAGGAAAAACTATAGCTGATGCTGAGGGAGATGTACTTAGAGGAATTCAATCTGTGGAGCACTGTTGCAGTATAACAAGTTTGCAGCTCGGTGATTGTATACAGAACATAGCTAAAGATATGGACACACATAGCTATAAAGTACCACTTGGAGTCACCGCTGGAGTAGCAGCATTCAACTTCCCAGTAATGATACCTTTATGGATGTTCCCACCCGCATTAGTGACTGGTAACACTTGTATCATCAAACCATCGGAGCAGGACCCCGGTGCCACCCTTATGATGATGGAACTGCTGCAGGAGGCCGGAGCTCCCGCTGGAGTGGTTAATGTTGTTCACGGAACTCATGACCCTGTGAACTTCATATGTGATCACCCTGACATCAAAGCTGTGTCATTCGTAGGGGGTGATGCAGCTGGGAAACATATCTACAGCAGGGCTTCGGCTGCCGGCAAGCGTGTTCAGAGCAATATGGGTGCCAAAAACCATGGGGTCATAATGCCGGATGCTAACAAAGAGCACACATTGAACCAATTGGCTGGAGCTGCGTTCGGAGCGGCCGGACAAAGGTGTATGGCGCTCAGCACGGCCGTGTTTGTGGGTGAGGCCAAAGAATGGATACCAGATTTGGTGAAACGAGCTGAAGCTCTCAAAGTTAATGCCGGTCATGTACCTGGCACTGATGTTGGTCCGGTCATCTCTGTTAGAGCAAAAGAGAGGATTCATAGGCTTGTTGAATCTGGAGCAAAAGAGGGCGCTAAAATCGTGCTTGACGGTAGAGGGGTCAAGGTTCAAGGCTTCGAGAAAGGAAACTTCGTCGGTCCGACCATTCTCACTCACGTACAACCAAACATGGAATGCTACAGAGAAGAAATCTTCGGTCCTGTATTAATTTGTCTCTTTGTTGACACCTTGGACGAAGCTATTGAAATGATCAATTCAAATCCCTATGGTAACGGAACAGCCATCTTCACAACCAACGGGGCGACCGCAAGGAAATTTTCTTCACAAATCGATGTTGGCCAAGTCGGAATAAACGTTCCCATACCAGTGCCATTGTCTATGTTCTCATTCAGCGGTAGCAGAGGTAGCTTTTTGGGTACAAATCATTTCTGTGGCAAACAAGGTATCGACTTTTACACCGAATTAAAAACCGTTGTATCATTCTGGAGACAGAGTGACGTATCTCACGCCAAGGCCGCCGTCTCTATGCCAACTCAGCAATAA

Protein sequence:

>DPOGS206635-PA
MALNLLKLIKSESHILLRRHYSSSAPSTKLYIDGQYVESKTSNWIELTNPATNEVIGRVPEATQEELNLALEAAKKAYKSWSQSTVLTRQQLMLKFARLLRENQSKLAAKITEEQGKTIADAEGDVLRGIQSVEHCCSITSLQLGDCIQNIAKDMDTHSYKVPLGVTAGVAAFNFPVMIPLWMFPPALVTGNTCIIKPSEQDPGATLMMMELLQEAGAPAGVVNVVHGTHDPVNFICDHPDIKAVSFVGGDAAGKHIYSRASAAGKRVQSNMGAKNHGVIMPDANKEHTLNQLAGAAFGAAGQRCMALSTAVFVGEAKEWIPDLVKRAEALKVNAGHVPGTDVGPVISVRAKERIHRLVESGAKEGAKIVLDGRGVKVQGFEKGNFVGPTILTHVQPNMECYREEIFGPVLICLFVDTLDEAIEMINSNPYGNGTAIFTTNGATARKFSSQIDVGQVGINVPIPVPLSMFSFSGSRGSFLGTNHFCGKQGIDFYTELKTVVSFWRQSDVSHAKAAVSMPTQQ-
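
The transcript and protein lengths listed above are consistent: 1569 bp is 523 codons, that is 522 amino acids plus one stop codon. The following is a minimal sketch of how that check, and a translation of the coding sequence, could be done. It assumes the standard genetic code and assumes that the trailing "-" in the protein record marks the stop codon; the function names `translate` and `expected_protein_length` are illustrative and not part of any database tool.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str, stop: str = "-") -> str:
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop` ("-" here, assumed to match the
    convention of the protein record above).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # KeyError on non-ACGT characters
        protein.append(stop if aa == "*" else aa)
    return "".join(protein)


def expected_protein_length(cds_bp: int) -> int:
    """Amino acids encoded by a CDS of `cds_bp` bases ending in one stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


if __name__ == "__main__":
    # First eight codons of DPOGS206635-TA followed by its TAA stop codon.
    print(translate("ATGGCGCTGAATCTCTTGAAACTT" + "TAA"))
    print(expected_protein_length(1569))
```

Passing the full DPOGS206635-TA sequence to `translate` should reproduce the DPOGS206635-PA record if the assumptions above hold.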