Monarch geneset OGS2.0

DPOGS209521
Transcript: DPOGS209521-TA | 1272 bp
Protein: DPOGS209521-PA | 423 aa
Genomic position: DPSCF300127 + 455617-458988
RNAseq coverage: 1619x (Rank: top 8%)
Annotation
Heliconius: HMEL016266 | 0.0 | 88.89%
Bombyx: BGIBMGA007438-TA | 0.0 | 79.20%
Drosophila: Faa-PA | 3e-139 | 56.87%
EBI UniRef50: UniRef50_B7Q0D4 | 2e-159 | 63.48% | Fumarylacetoacetase, putative n=21 Tax=cellular organisms RepID=B7Q0D4_IXOSC
NCBI RefSeq: NP_001040300.1 | 0.0 | 78.96% | fumarylacetoacetase [Bombyx mori]
NCBI nr blastp: gi|114051481 | 0.0 | 78.96% | fumarylacetoacetase [Bombyx mori]
NCBI nr blastx: gi|114051481 | 0.0 | 78.96% | fumarylacetoacetase [Bombyx mori]
Group
Gene Ontology:
GO:0009072 | 2.4e-189 | aromatic amino acid family metabolic process
GO:0004334 | 2.4e-189 | fumarylacetoacetase activity
GO:0008152 | 4.4e-103 | metabolic process
GO:0003824 | 4.4e-103 | catalytic activity
KEGG pathway: nvi:100117860 | 4e-171
K01555 (E3.7.1.2, FAH) maps to: Styrene degradation; Tyrosine metabolism
InterPro domain:
[3-419] | IPR005959 | 2.4e-189 | Fumarylacetoacetase
[122-412] | IPR011234 | 4.4e-103 | Fumarylacetoacetase, C-terminal-related
[125-409] | IPR002529 | 1.1e-46 | Fumarylacetoacetase, C-terminal-like
[2-119] | IPR015377 | 3.6e-38 | Fumarylacetoacetase, N-terminal
Orthology group: MCL13147 | Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209521-TA
ATGAAATCCTTTATAGAGTATTCTGCGAACACAGATTTCCCGATTGAAAATTTACCCTACGGTGTATTTACTTCGCCCAAAAATGCCCAAAAACATATTGGAGTCGCGATTGGTGATTGGATTCTGGATTTGCATGTGATCTCTCATCTATTCTCAGGACCTCTACTGAAGGGAAAACAACATGTTTTTAGGGATGACAAATTGAACTCGTTTATGGGATTAACAAAGGCGCACTGGATTGAAGCAAGGGCCACTCTTCAGAGTCTTCTGGATGCGAGCAATACCAAATTGAAGAATGACTCCGAATTGCGACAAAAGGCGTTCGTTAAGCAATCCGAGGCTACCATGCATTTGCCAGCTGAGATCGGAGACTACACGGATTTTTACTCCTCAATCCACCACGCCACAAACGTAGGCATCATGTTCAGAAGCAAAGAGAACGCTCTCATGCCAAACTGGAAACACTTGCCAGTGGGTTACCATGGACGCAGCAGCTCCATCGTCATTTCTGGCACTCCGATTACAAGACCCTACGGTCAAACATTGCCTATTGAAGGAGCCGATCCTCACTTCGGTCCATCTCGTCTCATGGACTTCGAGCTGGAGATGGCCTGCTTCGTGGGTGGTCCACCGACGGCGCTTGGCGAACGAGTCTCCGCCCGTCATGCCGAAGAACGCCTGTTTGGATTCGTACTGATGAATGATTGGAGTGCTCGTGATATCCAAAAATGGGAGTACGTGCCGCTCGGTCCATTTACTGCCAAGAACCTGGGAACATCCATTTCCCCCTGGATCGTTACCATAGAGGCACTGCGTCCGTACATCGTGGACAACTATCCCCAGGATCCCGAGCCGTTCGCGTATTTGAAACACGACGATAAGTTCAACTTCGATATCAAACTCGAGGTGGATGTTAAGACTGACAAATCTCCTGTGGCAAACACCATCTGTCGCTCCAACTACCGCTTCATGTACTGGACCGTGAAACAGCAGCTAGCTCAGCAGACTGTCTCGGGCTGCAACCTGCGTCCTGGGGACTTGCTTGGATCGGGAACCATTAGCGGGGATACATCTGATTCATATGGCAGCATGTTAGAGTTGAGCTGGAAAGGAACCAAGCCAATCCGCTTACAGAATGGAGAAGAAAGGAAATTTTTGCAAGATGGCGACACAGTAATATTAAGAGGATATTGCATAAACGAGATCGGTGTCCGCATCGGTTTCGGAAAATGCGAAGGAAAGCTGCTTCCGGCTCTACCTTTCAAAGAATGA

Protein sequence:

>DPOGS209521-PA
MKSFIEYSANTDFPIENLPYGVFTSPKNAQKHIGVAIGDWILDLHVISHLFSGPLLKGKQHVFRDDKLNSFMGLTKAHWIEARATLQSLLDASNTKLKNDSELRQKAFVKQSEATMHLPAEIGDYTDFYSSIHHATNVGIMFRSKENALMPNWKHLPVGYHGRSSSIVISGTPITRPYGQTLPIEGADPHFGPSRLMDFELEMACFVGGPPTALGERVSARHAEERLFGFVLMNDWSARDIQKWEYVPLGPFTAKNLGTSISPWIVTIEALRPYIVDNYPQDPEPFAYLKHDDKFNFDIKLEVDVKTDKSPVANTICRSNYRFMYWTVKQQLAQQTVSGCNLRPGDLLGSGTISGDTSDSYGSMLELSWKGTKPIRLQNGEERKFLQDGDTVILRGYCINEIGVRIGFGKCEGKLLPALPFKE-
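
The transcript and protein lengths in this record are consistent with each other: 423 residues plus one stop codon give 3 x (423 + 1) = 1272 bp. The sketch below shows one way to check that relation and to translate a coding sequence with the standard genetic code. The function names `translate` and `expected_cds_length` are hypothetical helpers introduced for illustration, not part of any database tooling, and the use of "-" as the stop symbol is an assumption taken from the trailing "-" in the protein sequence above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol`; "-" is assumed here because
    the protein sequence in this record ends with "-".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(stop_symbol if aa == "*" else aa for aa in residues)


def expected_cds_length(protein_length: int) -> int:
    """Length in bp of a CDS encoding `protein_length` residues plus a stop."""
    return 3 * (protein_length + 1)


# First six and last three codons of DPOGS209521-TA, copied from the record.
print(translate("ATGAAATCCTTTATAGAG"))
print(translate("AAAGAATGA"))
print(expected_cds_length(423))
```

Passing the full DPOGS209521-TA sequence to `translate` and comparing the result with DPOGS209521-PA would extend the same check to the whole record.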