Monarch geneset OGS2.0

DPOGS200407
Transcript        DPOGS200407-TA    1461 bp
Protein           DPOGS200407-PA    486 aa
Genomic position  DPSCF300236 - 528637-533214
RNAseq coverage   1291x (Rank: top 10%)
Annotation
Heliconius        HMEL011594        2e-115  75.00%
Bombyx            BGIBMGA008983-TA  1e-92   67.93%
Drosophila        CG5261-PB         1e-132  52.08%
EBI UniRef50      UniRef50_D6WXB4   7e-133  53.89%  Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WXB4_TRICA
NCBI RefSeq       XP_317493.4       9e-144  56.20%  AGAP007975-PA [Anopheles gambiae str. PEST]
NCBI nr blastp    gi|158297231      2e-142  56.20%  AGAP007975-PA [Anopheles gambiae str. PEST]
NCBI nr blastx    gi|158297231      4e-151  57.36%  AGAP007975-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology     GO:0004742  7.6e-150  dihydrolipoyllysine-residue acetyltransferase activity
                  GO:0045254  7.6e-150  pyruvate dehydrogenase complex
                  GO:0006090  7.6e-150  pyruvate metabolic process
                  GO:0008415  3.5e-78   acyltransferase activity
                  GO:0008152  3.5e-78   metabolic process
KEGG pathway      aga:AgaP_AGAP007975  3e-143
                  K00627 (DLAT, aceF, pdhC) maps-> Citrate cycle (TCA cycle)
                                                   Glycolysis / Gluconeogenesis
                                                   Pyruvate metabolism
InterPro domain   [70-486]   IPR006257  7.6e-150  Dihydrolipoamide acetyltransferase, long form
                  [255-486]  IPR023213  7.4e-80   Chloramphenicol acetyltransferase-like domain
                  [254-485]  IPR001078  3.5e-78   2-oxoacid dehydrogenase acyltransferase, catalytic domain
                  [64-165]   IPR011053  5.7e-27   Single hybrid motif
                  [70-143]   IPR000089  3.5e-18   Biotin/lipoyl attachment
Orthology group   MCL12512  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS200407-TA
ATGCTACGAACGATTTTAGTGCGAAATCAAATATTAAACGATACAGTTAAAAAAGCAATCCGGTGTAATATAACGAGATGTTTAAGCACAGAATTGGCAAGACGAAAGCAGCCAAATAAGCCATTAGGTGTAACACAACATAAGATAAAGCCAGAACACAAATGGGGAATTCAAATCAGGAGCTATTCAAGTCTTCCATCACATTCAAAGGTAAACCTACCAGCATTATCACCGACTATGGAAAATGGGTCTATAGTGAGCTGGGAAAAGAAGGAGGGTGATAAACTAAGTGAAGGTGATCTTTTATGTGAAATCGAGACTGATAAAGCCACAATGGGCTTTGAAACTCCAGAGGAGGGTTACTTAGCGAAAATCCTCCTACCCGCCGGTACAAAAGGTGTCCCCGTAGGGAAGTTGCTATGTATTATAGTTGAAAATCAAGCGGATGTTGCAGCATTTAAAGATTACAAGGATGATTCGGGTGATGCTAAACCGGCTGCGGCACCAGCACCAGCACCAGCAGCGCCAGCGGCTCCATCTCCAGCTCCAGCTGCAGCCCCAGCTGTAGCACCAGCTGTGGCCCCAGCTGCTGCTGAACATGGCAGGTTGTATGCTAGTCCAATGGCCAGGCGATTAGCTGAGCTTAAGAATATGAGGTTGGGTGGTCAAGGTTCAGGGCTGTATGGGTCATTAAAGAGTGGTGATCTCGCAGCTGCGGGCCAGCCTGCAGCGGCAGCTGCGCCACCAGCTCCAGGTGCTGCTTACACAGACATACCATTGACCAGCATGAGAGAAGCCATCGCCAAGAGGTTGTCACTATCCAAACAAACAATACCACATTACCAATTGACTGTCATAGCTAATGTTGAGAAGTTGCTGGAAATGAGGAAGAGGATTAATGAAAAGTTACAGGCTGATAAGTCTGATGTTAAGATTTCAGTCAACGATTTTATCCTTAAGGCTGTAGCTTCAGCCTGTAAAAGAGTACCAACAGTCAATTCACATTGGATGGAAACGTTTATTAGACAGTTCAACAATGTGGATGTATCAACAGCTGTAGCGACACCCTCTGGTCTGATCACACCCATTATATTCAACGCTGATTCCATCGGGATCATCGAGATATCCAAGGAGATGAAGAAGTTAGCGGCCAAAGCGAGGGAAGGAAAACTTCAGCCTCAGGAATTTGTCGGAGGCACAGTCACAGTCTCCAATCTAGGAATGTTTGGTATCGCAAACTTCACATCTATAATCAACCCGCCTCAATCTCTGATACTGTCTGTCGGTGGACTACAGGATATGATGATTCCAGATAAAAATGAACCACAAGGGTTCCGCTTCGCAAAGGTCATGACATTCACCGCGTCAGCTGATCACCGCGTCATAGACGGGGCGGTCGGCGCTCAATGGATGAAGGAATTGAGAGAAAACATCGAAGATCCAGCCAACATCATACTGTAA

Protein sequence:

>DPOGS200407-PA
MLRTILVRNQILNDTVKKAIRCNITRCLSTELARRKQPNKPLGVTQHKIKPEHKWGIQIRSYSSLPSHSKVNLPALSPTMENGSIVSWEKKEGDKLSEGDLLCEIETDKATMGFETPEEGYLAKILLPAGTKGVPVGKLLCIIVENQADVAAFKDYKDDSGDAKPAAAPAPAPAAPAAPSPAPAAAPAVAPAVAPAAAEHGRLYASPMARRLAELKNMRLGGQGSGLYGSLKSGDLAAAGQPAAAAAPPAPGAAYTDIPLTSMREAIAKRLSLSKQTIPHYQLTVIANVEKLLEMRKRINEKLQADKSDVKISVNDFILKAVASACKRVPTVNSHWMETFIRQFNNVDVSTAVATPSGLITPIIFNADSIGIIEISKEMKKLAAKAREGKLQPQEFVGGTVTVSNLGMFGIANFTSIINPPQSLILSVGGLQDMMIPDKNEPQGFRFAKVMTFTASADHRVIDGAVGAQWMKELRENIEDPANIIL-