Monarch geneset OGS2.0

DPOGS206336
Transcript: DPOGS206336-TA (3165 bp)
Protein: DPOGS206336-PA (1054 aa)
Genomic position: DPSCF300082 + 126804-149431
RNAseq coverage: 348x (Rank: top 34%)
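
The transcript and protein lengths listed above can be cross-checked with simple arithmetic. The sketch below assumes the transcript is a complete coding sequence with no UTRs (it begins with ATG and ends with TAA) and that the genomic coordinates are 1-based and inclusive. Neither convention is stated in the record itself, and the variable names are illustrative.

```python
# Values copied from the record above.
transcript_bp = 3165
protein_aa = 1054
locus_start, locus_end = 126804, 149431

# A complete coding sequence has three bases per residue plus one stop codon.
# Assumption: the transcript contains no UTR sequence.
expected_cds_bp = (protein_aa + 1) * 3

# Assumption: coordinates are 1-based and inclusive.
locus_span_bp = locus_end - locus_start + 1

print(expected_cds_bp == transcript_bp)  # True
print(locus_span_bp)                     # 22628
```

Under these assumptions the 3165 bp transcript matches 1054 codons plus a stop codon, and the locus spans about 22.6 kb on the scaffold, so most of the genomic interval would be intronic.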
Annotation
Heliconius       HMEL002715        0.0      84.96%
Bombyx           BGIBMGA005262-TA  0.0      85.07%
Drosophila       CG11796-PA        2e-159   68.87%
EBI UniRef50     UniRef50_D3TQN4   8e-158   68.07%  4-hydroxyphenylpyruvate dioxygenase n=1 Tax=Glossina morsitans morsitans RepID=D3TQN4_GLOMM
NCBI RefSeq      XP_973835.1       9e-160   68.34%  PREDICTED: similar to 4-hydroxyphenylpyruvate dioxygenase [Tribolium castaneum]
NCBI nr blastp   gi|91090908       2e-158   68.34%  PREDICTED: similar to 4-hydroxyphenylpyruvate dioxygenase [Tribolium castaneum]
NCBI nr blastx   gi|91090908       6e-155   68.34%  PREDICTED: similar to 4-hydroxyphenylpyruvate dioxygenase [Tribolium castaneum]
Group
Gene Ontology    GO:0009072  8.4e-229  aromatic amino acid family metabolic process
                 GO:0016701  8.4e-229  oxidoreductase activity, acting on single donors with incorporation of molecular oxygen
                 GO:0003868  8.4e-229  4-hydroxyphenylpyruvate dioxygenase activity
                 GO:0055114  8.4e-229  oxidation-reduction process
KEGG pathway     tca:662658  3e-159
                 K00457 (HPD, hppD) maps-> Tyrosine metabolism
                                           Phenylalanine metabolism
                                           Ubiquinone and other terpenoid-quinone biosynthesis
InterPro domain  [669-1054]  IPR005956  8.4e-229  4-hydroxyphenylpyruvate dioxygenase
                 [854-1008]  IPR004360  5.8e-19   Glyoxalase/fosfomycin resistance/dioxygenase
Orthology group  MCL11949    Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206336-TA
ATGAGATCCTGCCGCTCACCACGGCCGCCTGTTCACCAAGCTTACGAGCCCCACAGCCTTCCAGAGTACGACAAACCAAATCCTCCGTCTCGCCAGAGAATAAAAATGACGAAAGTATCAGAGACAATTAAGCAAGCGAAGGGAAAAGTGCTCAACTTCGACCATCTGACGTTTTGGGTGGCCAACGCAAAAACGGCTTCCAGTTACTTCGTAACACGCTTCGGTTTCAAACCCTTGGCTGTTCGTGAACCTTCAGAAGAAAGACAAGTGCTATCCCACGCTGTACAACTCAACAAAATAACTATAATATTCGAGTCACCGACTGTTAATGATCACAACATATCCAAAGATTTAGCAGCCCATGGTGATTTCGTTAAGGACGTATCATTTGAAGTAAGCGACCTGGAATCTATATTCGGAAGCGCTAAAACAAAAGGAGCTCATGTGATTAAAGAGATTACTGAAGAAAGCGACGAAAATGGTCTCATAAGATATGCTGTACTGAGAACGTATGGCGACAATACACATACTCTGGTTGATAGGTCCAAATATAACGGACTGTTGTTTCCTGGGTACAAGAAATCTGAAGAGGATTTAGCCAATAAGTTACTGCCAGACACAAATTTACGTTTTGTGGATCACGTCGAAGGAAATATGGCGGACGAAACTCTAGAAGATTCCGTCTCTTGGTATGAAAAGAACCTCAACATGCTCAGATTTTGGTGTGTTGACTACAGCCATGATTTGACGCCGTATTCATGTATCAACTCAGCTGCTGTTATTAACGAAAACGAAACCGTTCTTTTATCTATGAACGAGTCAGCCCCGGGTAAGCGTCCTACTAGCAAGGCTCGCGACTTCGTAGCATCACACGGCACGTCCGGCATTGAACACGTCGCCTTTTATACTGACGATATTGTACACACTGTGGATAAAATGACGAAAGTATCAGAGACAATTAAGCAAGCGAAGGGAAAAGTACTCAACTTCGACCATCTGACGTTTTGGGTGGCCAACGCAAAAACGGCTTCCAGTTACTTCGTAACACGCTTCGGTTTCAAACCCTTGGCGGTTCGTGAACCTTCAGAAGAAAGACAAGTGCTATCCCACGCTGTACAACTCAACAAAATAACTATAATCTTCGAGTCACCGACTGTTAATGATCACGACATATCCAAAGATTTAACAGCCCATGGTGATTTCGTTAAGGACGTATCATTTGAAGTAAGCGACCTGGAATCTATATTCGGAAGCGCTAAAACAAAAGGAGCTCATGTGATTAAAGAGATTACTGAAGAAAGCGACGAAAATGGTCTCATAAGATATGCTGTACTGAGAACGTATGGCGACAATACACATACTCTGGTTGATAGGTCCAAATATAACGGACTGTTGTTTCCTGGGTACAAGAAATCTGAAGAGGATTTAGCCAATAAGTTACTGCCAGACACAAATTTACGTTTTGTGGATCACGTCGAAGGAAATATGGCGGACGAAACTCTAGAAGATTCCGTCTCTTGGTATGAAAAGAACCTCAACATGCTCAGATTTTGGTGTGTTGACTACAGCCATGATTTGACGCCGTATTCATGTATCAACTCAGCTGCTGTTATTAACGAAAACGAAACCGTTCTTTTATCTATGAACGAGTCAGCCCCGGGTAAGCGTCCTACTAGCAAGGCTCGCGACTTCGTAGCATCACACGGCACGTCCGGCATTGAACACGTCGCCTTTTATACTGACGATATTGTACACACTATGAAGAGTTTAAAAGCACGTGGCGCCGATATTGTAACCTGGCCACCGACGTATTACGAACTTATAAAGGAGAAACTCAAAGAGAGCTCCGTAAACGTTACCGAAAGTATTGAAGAACTGAAGGAAAATAACATATTGATAGACTTCGACGAAAAAGGTTACATGCTGCAAGCTTTCACTAAACATCTACAAGTTCGTCCGACACTATTTATAGAGGTCATACAAAGGAGGAATCATAAGGG
TTTCGGAGCTATGAACTATCAATGGACGTCCTACACAGACAAGGGAAAAAAACCCGAAGACGGTCGGTTCCTAGCTTTCGATCATGTAACCTTCTGGGTTTCAAACGCTAAACAGGCCGCTAGTTATTACGTCACACGGTTCGGGTTCGAACCGCTCGCTTACAAAGGTTTAGAAACAGGATCCAGGCAGTTTTCCTCTCACGCTGTCAGATTAAATAAAATCATTTTCGTGTTTGAGGGTCAGTATAACCCAGAAGAGACAGATTTCATCAACGAAGTAGGTTATCACGGCGACTTTGTGAAGGATGTCGCCTTTGAAGTTGAAAACTTGGATTACATTCTAAACTACGCTAAAAAACAAGGTGCAGTTGTTATCAAGGACGTTTGGGAAGAAAAAGACGAGCATGGAGTGGTCAAGTCAGCTACACTTAAAACGTACGGCGACAATACGCATACTTTAGTGGATAGATCACAATATAAGGGACCCTTCCTGCCTGGATATCAGATGTTACAGAAGGATCCCATTCATAAATTCCTACCGAAGGTGGAGATTAACTTCATAGATCACGTGGTGGGAAATCAACCAGACAATGGTCTCGAGGAAGCGGCGTCGTGGTATGAACGCTGTCTGCAGTTCCATCGGTTCTGGTCGGTGGACGATAAGCAAATATGCACGGAGTATTCGTCACTGCGATCAATAGTGATGGCGAACTATGAGGAGACGGTTAAGATGCCGCTCAACGAACCCGCAGACGGCAAACGGAAGAGTCAGATTCAGGAATACGTGGAGTACCACGGGGGTGCGGGAGTTCAACACATCGCTTTGAACACAGAAGATATCATAACAGCCGTTGAAAATCTTCGAGCACGAGGAGTAGAATTCTTGACAATTCCATCAAAGTACTACAAGCTGATCAGAGAAAAACTATCACACAGCAAGGTGAAGGTGGCTGAGAGTATAGACATATTGGAGCGCCTCAATATCCTCATTGATTACGATGATGACGGGTATTTACTGCAGATATTCACAAAGAACACCCAGGATCGCCCCACACTCTTCTTGGAAGTTATACAGAGAAGAAATTTCAATGGTTTCGGCGCCGGTAACTTTAAAACTTTATTCGAGTCTATAGAAATCGAGCAAGAAAAGAGAGGAAACTTATAA

Protein sequence:

>DPOGS206336-PA
MRSCRSPRPPVHQAYEPHSLPEYDKPNPPSRQRIKMTKVSETIKQAKGKVLNFDHLTFWVANAKTASSYFVTRFGFKPLAVREPSEERQVLSHAVQLNKITIIFESPTVNDHNISKDLAAHGDFVKDVSFEVSDLESIFGSAKTKGAHVIKEITEESDENGLIRYAVLRTYGDNTHTLVDRSKYNGLLFPGYKKSEEDLANKLLPDTNLRFVDHVEGNMADETLEDSVSWYEKNLNMLRFWCVDYSHDLTPYSCINSAAVINENETVLLSMNESAPGKRPTSKARDFVASHGTSGIEHVAFYTDDIVHTVDKMTKVSETIKQAKGKVLNFDHLTFWVANAKTASSYFVTRFGFKPLAVREPSEERQVLSHAVQLNKITIIFESPTVNDHDISKDLTAHGDFVKDVSFEVSDLESIFGSAKTKGAHVIKEITEESDENGLIRYAVLRTYGDNTHTLVDRSKYNGLLFPGYKKSEEDLANKLLPDTNLRFVDHVEGNMADETLEDSVSWYEKNLNMLRFWCVDYSHDLTPYSCINSAAVINENETVLLSMNESAPGKRPTSKARDFVASHGTSGIEHVAFYTDDIVHTMKSLKARGADIVTWPPTYYELIKEKLKESSVNVTESIEELKENNILIDFDEKGYMLQAFTKHLQVRPTLFIEVIQRRNHKGFGAMNYQWTSYTDKGKKPEDGRFLAFDHVTFWVSNAKQAASYYVTRFGFEPLAYKGLETGSRQFSSHAVRLNKIIFVFEGQYNPEETDFINEVGYHGDFVKDVAFEVENLDYILNYAKKQGAVVIKDVWEEKDEHGVVKSATLKTYGDNTHTLVDRSQYKGPFLPGYQMLQKDPIHKFLPKVEINFIDHVVGNQPDNGLEEAASWYERCLQFHRFWSVDDKQICTEYSSLRSIVMANYEETVKMPLNEPADGKRKSQIQEYVEYHGGAGVQHIALNTEDIITAVENLRARGVEFLTIPSKYYKLIREKLSHSKVKVAESIDILERLNILIDYDDDGYLLQIFTKNTQDRPTLFLEVIQRRNFNGFGAGNFKTLFESIEIEQEKRGNL-