Monarch geneset OGS2.0

DPOGS203007
Transcript  DPOGS203007-TA  1359 bp
Protein  DPOGS203007-PA  452 aa
Genomic position  DPSCF300068 + 95083-98138
RNAseq coverage  722x (Rank: top 18%)
Annotation
Heliconius  HMEL011020  0.0  90.73%
Bombyx  BGIBMGA003866-TA  0.0  88.29%
Drosophila  Hn-PA  0.0  72.01%
EBI UniRef50  UniRef50_P17276  0.0  72.01%  Protein henna n=36 Tax=Eukaryota RepID=PH4H_DROME
NCBI RefSeq  XP_001600555.1  0.0  73.73%  PREDICTED: similar to phenylalanine hydroxylase [Nasonia vitripennis]
NCBI nr blastp  gi|84095074  0.0  89.89%  phenylalanine hydroxylase [Papilio xuthus]
NCBI nr blastx  gi|84095074  0.0  89.89%  phenylalanine hydroxylase [Papilio xuthus]
Group
Gene Ontology
  GO:0016714  1.2e-288  oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced pteridine as one donor, and incorporation of one atom of oxygen
  GO:0055114  1.2e-288  oxidation-reduction process
  GO:0009072  2.5e-279  aromatic amino acid family metabolic process
  GO:0005506  2.5e-279  iron ion binding
  GO:0004497  2.5e-279  monooxygenase activity
  GO:0004505  3.6e-226  phenylalanine 4-monooxygenase activity
  GO:0006559  3.6e-226  L-phenylalanine catabolic process
  GO:0016597  1.9e-07  amino acid binding
  GO:0008152  1.9e-07  metabolic process
KEGG pathway  nvi:100115984  0.0
  K00500 (phhA, PAH) maps-> Phenylalanine metabolism
  K00500 (phhA, PAH) maps-> Phenylalanine, tyrosine and tryptophan biosynthesis
InterPro domain
  [1-453]  IPR019773  1.2e-288  Tyrosine 3-monooxygenase-like
  [24-445]  IPR001273  2.5e-279  Aromatic amino acid hydroxylase
  [23-452]  IPR005961  3.6e-226  Phenylalanine-4-hydroxylase, tetrameric form
  [120-449]  IPR019774  5.6e-181  Aromatic amino acid hydroxylase, C-terminal
  [41-100]  IPR002912  1.9e-07  Amino acid-binding ACT
Orthology group  MCL14962  Single-copy universal gene

Nucleotide sequence:

>DPOGS203007-TA
ATGGAGCCGAGTGTAAACATACTATCAACCTCGCCGATGGACAAGCCAAAGTTAATGCAGGGTGGCAACTACATAGCCGAGGGACGCGATTCTAAAAAGTCAACATGGCTCTTATTTTCTCCGGAGACTCCGGATCAAGCTGGTTCTTTGGAGAAATTTCTGAGTATCTTTTCATCTCACGGGGTCAACTTGAGCCACATCGAATCTCGCTCCTCTGCCAGGAGACCAGGCTATGAATTCATGGTCGAGTGTGAACACGAATCCGGGGACTTTGGAGCGGCTTTGGATGAGCTGAAGAAGAGCACTGGATATCTCAACATTATTTCTAGAAACTACAAGGATAATAGATCTGCGGTGCCTTGGTTCCCTCGCCGTATTCGTGATCTGGATAGATTCGCTAATCAGATATTGTCTTATGGAGCCGAGCTCGACTCAGATCATCCAGGTTTCACAGACCCGGAGTACCGCGCGAGAAGAAAGTATTTTGCTGATATCGCTTACAACTACAAGCACGGCCAGCCGCTGCCTCACGTGAATTATACTAAAGAAGAAATTAACACATGGGGAGTAGTGTTCAGGAAGCTCACGGAACTCTACCCGACGCACGCCTGCAAGGAACACAATCATGTTTTTCCGCTTTTGATTGAAAACTGTGGTTATAGGGAGGACAATATTCCACAACTCGAAGACGTATCTAACTTTCTCAAAGATTGCACTGGATTCACTCTCCGTCCAGTGGCAGGTCTGCTTTCTTCACGAGATTTCCTCGCTGGCTTGGCGTTCCGTGTATTTCATAGTACTCAGTACATTAGGCACCATTCTCGTCCCCTTTACACTCCTGAACCTGATGTCTGCCACGAGCTCCTCGGACACGCGCCATTGTTCGCTGATCCCGCGTTCGCACAGTTCTCTCAGGAAATCGGCCTGGCTTCATTGGGAGCTCCTGACGATTTTATCGAAAGACTTGCAACGTGCTTTTGGTTTACTGTTGAATTTGGTCTGTGTCGGCAAGAAGGACAGCTGAAGGCATACGGCGCCGGTTTGCTGTCATCATTCGGTGAACTTCAATATTGTCTCTCAGATAAGCCACAGCTCCAAGAATTTGAACCAGAAATCACGGGAGAACAGAAGTATCCTATCACTGAATACCAACCAATATATTTCGTTGCTAACAGTTTTGAAAGTGCTAAGGAAAAGATGATCAAATTCGCCCAAACAATACCCCGTGACTTCGGAGTGAGATACAATCCCTACACCCAAAGTATTGACCTCCTAGATTCTCCACGGCAGATGAAAGATCTGCTGAAAGGCATCCGCCAAGAAATGGAACTGCTGGTTGGCACCATGGACAAGTTGTAG

Protein sequence:

>DPOGS203007-PA
MEPSVNILSTSPMDKPKLMQGGNYIAEGRDSKKSTWLLFSPETPDQAGSLEKFLSIFSSHGVNLSHIESRSSARRPGYEFMVECEHESGDFGAALDELKKSTGYLNIISRNYKDNRSAVPWFPRRIRDLDRFANQILSYGAELDSDHPGFTDPEYRARRKYFADIAYNYKHGQPLPHVNYTKEEINTWGVVFRKLTELYPTHACKEHNHVFPLLIENCGYREDNIPQLEDVSNFLKDCTGFTLRPVAGLLSSRDFLAGLAFRVFHSTQYIRHHSRPLYTPEPDVCHELLGHAPLFADPAFAQFSQEIGLASLGAPDDFIERLATCFWFTVEFGLCRQEGQLKAYGAGLLSSFGELQYCLSDKPQLQEFEPEITGEQKYPITEYQPIYFVANSFESAKEKMIKFAQTIPRDFGVRYNPYTQSIDLLDSPRQMKDLLKGIRQEMELLVGTMDKL-
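
The two sequences above can be checked against each other: the 1359 bp transcript is 453 codons, which corresponds to the 452 aa protein plus a terminal stop, shown as "-" in the protein record. The following is a minimal sketch of that check, assuming the standard genetic code applies; the names `CODON_TABLE` and `translate` are illustrative and not part of any MonarchBase tooling. Only the first 30 and last 12 nucleotides of DPOGS203007-TA are used here to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
# Assumption: the transcript is translated with the standard nuclear code.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` ("-" matches the protein
    record above). Codons with ambiguous bases raise KeyError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 nt and last 12 nt of DPOGS203007-TA, copied from the record.
head = "ATGGAGCCGAGTGTAAACATACTATCAACC"
tail = "GACAAGTTGTAG"

print(translate(head))  # MEPSVNILST
print(translate(tail))  # DKL-
```

Passing the full nucleotide sequence to `translate` in the same way would allow a comparison against the complete DPOGS203007-PA entry.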