Monarch geneset OGS2.0

DPOGS214069
Transcript        DPOGS214069-TA  708 bp
Protein           DPOGS214069-PA  235 aa
Genomic position  DPSCF300171 + 261215-263215
RNAseq coverage   538x (Rank: top 23%)
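
The genomic position field packs the scaffold, strand, and coordinate range into one string. Below is a minimal sketch of splitting it apart. `GenomicPosition` and `parse_position` are hypothetical names introduced here for illustration, and the span calculation assumes 1-based inclusive coordinates, which the record itself does not state.

```python
import re
from typing import NamedTuple


class GenomicPosition(NamedTuple):
    scaffold: str
    strand: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # ASSUMPTION: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


def parse_position(text: str) -> GenomicPosition:
    """Parse a string of the form '<scaffold> <strand> <start>-<end>'."""
    match = re.fullmatch(r"\s*(\S+)\s+([+-])\s+(\d+)-(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized genomic position: {text!r}")
    scaffold, strand, start, end = match.groups()
    return GenomicPosition(scaffold, strand, int(start), int(end))


pos = parse_position("DPSCF300171 + 261215-263215")
print(pos.scaffold, pos.strand, pos.span)  # DPSCF300171 + 2001
```

Under that coordinate assumption the locus spans 2001 bp on the plus strand of scaffold DPSCF300171.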
Annotation
Heliconius        HMEL012862        4e-70   81.25%
Bombyx            BGIBMGA010566-TA  4e-108  74.04%
Drosophila        PH4alphaEFB-PA    3e-83   64.68%
EBI UniRef50      UniRef50_Q53EK0   6e-106  74.04%  Prolyl 4-hydroxylase alpha subunit n=8 Tax=Endopterygota RepID=Q53EK0_BOMMO
NCBI RefSeq       NP_001037195.1    1e-106  74.04%  prolyl 4-hydroxylase alpha subunit [Bombyx mori]
NCBI nr blastp    gi|112984520      2e-105  74.04%  prolyl 4-hydroxylase alpha subunit precursor [Bombyx mori]
NCBI nr blastx    gi|112984520      1e-102  74.04%  prolyl 4-hydroxylase alpha subunit precursor [Bombyx mori]
Group
Gene Ontology     GO:0016705  4.5e-72  oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen
                  GO:0005506  4.5e-72  iron ion binding
                  GO:0055114  4.5e-72  oxidation-reduction process
                  GO:0031418  4.5e-72  L-ascorbic acid binding
                  GO:0016706  4.2e-13  oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors
                  GO:0016491  4.2e-13  oxidoreductase activity
KEGG pathway      tca:657862  2e-84
                  K00472 (E1.14.11.2) maps-> Arginine and proline metabolism
InterPro domain   [22-203]  IPR006620  4.5e-72  Prolyl 4-hydroxylase, alpha subunit
                  [99-203]  IPR005123  4.2e-13  Oxoglutarate/iron-dependent oxygenase
Orthology group   MCL10977  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214069-TA
ATGACAGAAAACCACCCGTTCCTGAGACTAGCGCCGGTCAGAATGGAGTACTTGTACAGGAATCCAGACATTATTGTATTTAACGACGTCCTCAGCGACTATGAAATTGATTATATAAAAAGAATCGCTCAACCGAGGTTCCGTCGCGCGACCGTCCACGACCCAGCGACCGGCGAGTTGGTCCCCGCTCACTATAGGATCAGCAAGTCAGCGTGGCTCAAGGACGAGGAGTCAGCGGTGGTCGCGCGTGTGTCACGGCGTGTCGCCGATATAACCGGCCTTAGTATGACCACCGCTGAGGAGTTACAGGTCGTTAACTACGGCATCGGAGGCCATTATGATCCACACTTTGACTTCGCAAGGAAAGAAGAAAACGCATTTGAGAAGTTCAATGGCAACCGCATAGCTACAGTCCTGTTTTACATGTCAGACGTGGCTCAAGGCGGGGCGACAGTGTTCACTGAGCTCGGCCTAAGCGTGTTCCCGCGGCGAGGGTCGGCTGTGTTCTGGCTGAATCTGCATCCGTCCGGTGAAGGAGACCTCGCCACCCGACACGCCGCCTGCCCCGTACTGAGGGGCTCCAAGTGGGTTTGTAACAAGTGGATACATCAAGGCGGCCAAGAATTAATAAGACCTTGCAATCTGGAATACCAAAAAGAAAGCATCATACGGCCGATACCGCAACCAATCGAAAAGTTCTTCAGATAA

Protein sequence:

>DPOGS214069-PA
MTENHPFLRLAPVRMEYLYRNPDIIVFNDVLSDYEIDYIKRIAQPRFRRATVHDPATGELVPAHYRISKSAWLKDEESAVVARVSRRVADITGLSMTTAEELQVVNYGIGGHYDPHFDFARKEENAFEKFNGNRIATVLFYMSDVAQGGATVFTELGLSVFPRRGSAVFWLNLHPSGEGDLATRHAACPVLRGSKWVCNKWIHQGGQELIRPCNLEYQKESIIRPIPQPIEKFFR-
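
The two sequences above are related by translation: 708 bp is 236 codons, that is 235 amino acids plus one stop codon, which the protein record writes as a trailing `-`. The sketch below shows one way to check that correspondence, assuming the standard genetic code applies and that the transcript is an in-frame coding sequence. `translate` and `CODON_TABLE` are hypothetical names introduced here, not part of any tool named in the record.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # KeyError on ambiguous bases such as N
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First ten and last six codons of DPOGS214069-TA.
print(translate("ATGACAGAAAACCACCCGTTCCTGAGACTA"))  # MTENHPFLRL
print(translate("GAAAAGTTCTTCAGATAA"))              # EKFFR-
```

Passing the full DPOGS214069-TA sequence to `translate` and comparing the result with DPOGS214069-PA extends the same check to the whole record.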