Monarch geneset OGS2.0

DPOGS214068
Transcript: DPOGS214068-TA, 1164 bp
Protein: DPOGS214068-PA, 387 aa
Genomic position: DPSCF300171 + 251739-260666
RNAseq coverage: 137x (Rank: top 55%)
Annotation
Heliconius: HMEL012862; 6e-76; 44.21%
Bombyx: BGIBMGA010566-TA; 2e-73; 45.19%
Drosophila: PH4alphaEFB-PA; 3e-35; 41.79%
EBI UniRef50: UniRef50_Q53EK0; 3e-71; 45.19%; Prolyl 4-hydroxylase alpha subunit n=8 Tax=Endopterygota RepID=Q53EK0_BOMMO
NCBI RefSeq: NP_001037195.1; 6e-72; 45.19%; prolyl 4-hydroxylase alpha subunit [Bombyx mori]
NCBI nr blastp: gi|112984520; 1e-70; 45.19%; prolyl 4-hydroxylase alpha subunit precursor [Bombyx mori]
NCBI nr blastx: gi|112984520; 1e-73; 44.50%; prolyl 4-hydroxylase alpha subunit precursor [Bombyx mori]
Group
Gene Ontology:
  GO:0016702; 1e-38; oxidoreductase activity, acting on single donors with incorporation of molecular oxygen, incorporation of two atoms of oxygen
  GO:0005783; 1e-38; endoplasmic reticulum
  GO:0004656; 1e-38; procollagen-proline 4-dioxygenase activity
  GO:0055114; 1e-38; oxidation-reduction process
  GO:0005488; 1.1e-30; binding
KEGG pathway:
  ame:408862; 7e-40
  K00472 (E1.14.11.2) maps-> Arginine and proline metabolism
InterPro domain:
  [26-161] IPR013547; 1e-38; Prolyl 4-hydroxylase alpha-subunit, N-terminal
  [240-343] IPR011990; 1.1e-30; Tetratricopeptide-like helical
Orthology group 

Nucleotide sequence:

>DPOGS214068-TA
ATGTATAGGGTTAATATTTTTTGTTTATTATCAATATTATTGAGCCTGGTGACAACATCACATGGAGAGTTGTTCACTGCGATATCGGACGTAGAACCTTTATTGGAAACTCATAAGAAGATCATCGACGATTTAGAAGATTACATAAAAAAAGAAGAGGACAGGTTGCAGGCGTTGAAAAGACACTTGGTTATATACAGACGGGAACATGAGCAAGCTATGGAGGATATACCCAACTACCTCGGGAATCCTATAAACGCGTTCACTTTGATAAAGAGATTGACCATAGACTTGGACGACATTGAGAAGAGTATTGAAATAGGCACTGAATATATTAAAAACATAACAATTATAAATAATCACGCGAATGTCAAATATCCAACTTTGGAGGATTTAACTGGAGCTGCGCAGGCGCTGACAAGATTGCAACAGACATACAAACTAGATGTCAAGGACCTGTCAGAGGGGAGGCTCAATGGAGTTGTTTACAGGTCTGCTGGTTTATTACAAGGCTTGCAACGAACGAATATTCCTTTACTATTCTTATTCGAATATATTAAAAACATAACAATTATAAATAATCACGCGAATGTCAAATATCCAACTTTGGAGGATTTAACTGGAGCTGCGCAGGCGCTGACAAGATTGCAACAGACATACAAACTAGATGTCAAGGACCTGTCAGAGGGGAGGCTCAATGGAGTTGTTTACAGCACACCAATGAGCGCTGGAGACTGCTACGAGCTCGGCAAAGCTCTGTACAATGAAAAGGATTATGACAATGCGTTGGATTGGATGATGGAGGCGTTGTACAAGTTTGTGGACGAAGACCAACCCTACCCGTTCAGCGAGGCCGATATATTGGAATACATAAGCTTCTCCCATTATCTTTTAGGCGATCTTAAAAGCGCGATAGAATGGACTAAGAATCTTTTGCTGGTTGAACCAAAACACAGCAGAGCTGCCGGCAATCTGCCTCATTACCAGAAAGCGTTACAGGACAAAGAGTTGCAGGCGAAGAAAAGGCGCAAAGGGGACACTGGCGAACCGGATATCGAATACGAGGCGCAGGTTAAAGAGTCAAGGTCAACGTCCTACAACGCGGAGAGACAATCATATGAAGCGTTATGCCGCGGGGAGAAGGAATTACCACGCCTTTATTGA

Protein sequence:

>DPOGS214068-PA
MYRVNIFCLLSILLSLVTTSHGELFTAISDVEPLLETHKKIIDDLEDYIKKEEDRLQALKRHLVIYRREHEQAMEDIPNYLGNPINAFTLIKRLTIDLDDIEKSIEIGTEYIKNITIINNHANVKYPTLEDLTGAAQALTRLQQTYKLDVKDLSEGRLNGVVYRSAGLLQGLQRTNIPLLFLFEYIKNITIINNHANVKYPTLEDLTGAAQALTRLQQTYKLDVKDLSEGRLNGVVYSTPMSAGDCYELGKALYNEKDYDNALDWMMEALYKFVDEDQPYPFSEADILEYISFSHYLLGDLKSAIEWTKNLLLVEPKHSRAAGNLPHYQKALQDKELQAKKRRKGDTGEPDIEYEAQVKESRSTSYNAERQSYEALCRGEKELPRLY-
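
The transcript and protein lengths listed in this record (1164 bp and 387 aa) agree with each other if the transcript is taken to be a coding sequence that includes its stop codon: 1164 / 3 = 388 codons, which is 387 residues plus one stop. The short sketch below shows that arithmetic. The function name `expected_protein_length` and the assumption that the transcript contains only coding sequence (no UTRs) are illustrative and not part of the record.

```python
def expected_protein_length(cds_bp: int, has_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases.

    Assumes the sequence is coding sequence only (no UTRs) and that the
    final codon is a stop codon when has_stop is True.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if has_stop else codons


# Lengths taken from the record: transcript 1164 bp, protein 387 aa.
print(expected_protein_length(1164))  # 387
```

The trailing "-" at the end of the protein sequence marks the stop position and is not counted as a residue.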