Monarch geneset OGS2.0

DPOGS204629
Transcript: DPOGS204629-TA; 1863 bp
Protein: DPOGS204629-PA; 620 aa
Genomic position: DPSCF300277 - 164594-169441
RNAseq coverage: 336x (Rank: top 34%)
Annotation
Heliconius: HMEL010578; 6e-143; 68.41%
Bombyx: BGIBMGA014459-TA; 2e-55; 49.00%
Drosophila: (none)
EBI UniRef50: UniRef50_F5HKE4; 1e-66; 31.16%; AGAP002933-PB n=1 Tax=Anopheles gambiae RepID=F5HKE4_ANOGA
NCBI RefSeq: XP_001237409.2; 6e-66; 34.52%; AGAP002934-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|347968893; 5e-66; 31.16%; AGAP002933-PB [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|347968893; 2e-64; 31.32%; AGAP002933-PB [Anopheles gambiae str. PEST]
Group
Gene Ontology:
GO:0016706; 3.8e-31; oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors
GO:0005506; 3.8e-31; iron ion binding
GO:0055114; 3.8e-31; oxidation-reduction process
GO:0031418; 3.8e-31; L-ascorbic acid binding
GO:0016705; 2.7e-10; oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen
KEGG pathway 
InterPro domain:
[382-605]; IPR019601; 3.8e-31; Oxoglutarate/iron-dependent oxygenase, C-terminal degradation domain
[164-358]; IPR006620; 2.7e-10; Prolyl 4-hydroxylase, alpha subunit
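
The bracketed ranges in the InterPro rows can be read as residue coordinates on the 620 aa protein. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the record does not state this); the names `DOMAINS`, `domain_length`, and `overlaps` are hypothetical helpers introduced only for illustration.

```python
# Protein length and domain coordinates as listed in this record.
PROTEIN_LENGTH = 620
DOMAINS = {
    "IPR019601": (382, 605),  # Oxoglutarate/iron-dependent oxygenase, C-terminal degradation domain
    "IPR006620": (164, 358),  # Prolyl 4-hydroxylase, alpha subunit
}


def domain_length(start: int, end: int) -> int:
    """Number of residues in a domain, assuming 1-based inclusive coordinates."""
    return end - start + 1


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two inclusive (start, end) ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    for name, (start, end) in DOMAINS.items():
        # Every domain should fit inside the protein.
        assert 1 <= start <= end <= PROTEIN_LENGTH
        print(name, domain_length(start, end), "aa")
    print("domains overlap:", overlaps(DOMAINS["IPR019601"], DOMAINS["IPR006620"]))
```

Under that assumption the two domains span 224 and 195 residues and do not overlap.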
Orthology group: MCL16037; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204629-TA
ATGAGTTCTCCGACTAAAGAAACTGAAGATCCATCATCTAGCAATGCTGAGAGTGAGGAATACACAGGCGGAAATAGCGATGCGAATACAGAGCAGAGACCGCCCGCCAAGCGGCCTATGTCTACAGCAGTTATTGAAATTTCAGACACTGAAAGTGATGATTCCGATGTCTGCGCTGTTAATTCTTACCAGGCCTCAGCTGATGAGGTGAAAAGAATACGAAGAGATTATTCATCTTCGTCATCATCTTCCTCATCATCAAACTACAGCTCTGATTCTGACTCACCATGGGAAGATGACTCTGTAGTAATAGATGATAAAGCAATGGGAAGGCCTGTGATTGCTAAGATGTTAGTTAGAGCTAATAGAATGGATGACCCTAAATTTAATCCTGAACTGAAGTCTCAAGAGATCATAAGTAAAATTAAATCTCACTGGGATGAGAAGACAGACCACAGTAGTGACCAAGTGACCTTAACATGTAAACCGTTCAGACTCTGTCGGATTCATGGCTTGTTAGAGAACTCGGAGATAATAAATAATATAGTGGACGACATGAACACATTGGACTGGTCGAGGAAGAAGATGGATCTGTACGAGTTTCACCAGACCTCTGACTTAGCAAACTTAACTTGGCAGCGTAGTATAAGAGGTATTTACGAATTATTGAAGACTGAAGTAATGACTTGGGTGTCGCAAGTAACGGGCATAGAGTTGACATCAGTGTCGGCGTCATGTTCGCTGTATGGCCCCGGAGACCATCTCTTGGTTCACGATGATCGACTCGGGGACAGGAGGGTGGCCTTCATCCTGTACCTAGCACCCTGGACGCCACGATCACCACCACACATGCAGAACGGAGCTGAAAGTCAAGATAAGTGTTGGAGCGGTCCGGGCTGGAGGCCGCATATGGGTGGAGCGTTGGAGTTGGTCGAGGATGGACAGGTTGTGTTCCGTGCCTTCCCCGCTAATAATACATTAGCATTCTTCGCAGTCGGCCCGACGTCCTTTCATCAGGTGGGCGAAGTCCTATCTATGGAGCTTCCTCGGCTGTCTATTAACGGTTGGTTTCACGGTCCGGCGCCGGAGTCCGAGGAGCCGCACGCGGAGCTCCCAGTGCCACTCACACCGCACAACCAAGTGGTGGTGTTGAAGTCGTGGGTAGAGGCTGGGTACTTGTGTCCCCGAGCTCGAGCCCAGGTCCAGGCGCAGATGGAGCGTGCCAGCGAGGTCTGCCTGCATGACCTGCTGCTGCCATCGCGATGCCAGCAACTGCTGGAAGCGCTGGAGAAGAATGACATAGAATGGGAGCAGTGCGGTCCAGCACATCAGCGACGGTATCAGCGAGTGACGGAGAAATGGCTCTCAGCCAGCGAACTCTCTGAGGCAACAGAGGAAGAAGCCATCCAGGGCGAAGAGCCCGACGACTGCGGGGTACAGGGGGAGACGCATGTCGTACGAGCACTGCTAAGGCTCCTCAGTAGTACAGCATTCATGAGGCTGGTGGCGGACTGTACAGATCTACCGCTGACTTTGTACAGGAAACTAGAAATGCAACGCTGGCGGGCTGGAGATTTCACTCTTCTCCCGCCCCGGGAACATTATCAGCAGCCTCGTCTAGAGGCAGTCCTGTATCTGGGTGTGCCGAAACATCCTATCTGTGGAGGTCAAACGTTATATGTGGCCCCAGAAGAGGGGTCGCTTGCGGAGGCCGAGGCATTGGTGACTCTGCCCCCCAGACACAACGCGTTAGGGCTGGTGTACTGCGACGCTGGCGCAGCCTCCTTCACCAAATATCTCAGCAAGATGACCATGTCGGAGAACGAGTGCTTCTATATAGTGACCTGTACTTATACCGAGTGA

Protein sequence:

>DPOGS204629-PA
MSSPTKETEDPSSSNAESEEYTGGNSDANTEQRPPAKRPMSTAVIEISDTESDDSDVCAVNSYQASADEVKRIRRDYSSSSSSSSSSNYSSDSDSPWEDDSVVIDDKAMGRPVIAKMLVRANRMDDPKFNPELKSQEIISKIKSHWDEKTDHSSDQVTLTCKPFRLCRIHGLLENSEIINNIVDDMNTLDWSRKKMDLYEFHQTSDLANLTWQRSIRGIYELLKTEVMTWVSQVTGIELTSVSASCSLYGPGDHLLVHDDRLGDRRVAFILYLAPWTPRSPPHMQNGAESQDKCWSGPGWRPHMGGALELVEDGQVVFRAFPANNTLAFFAVGPTSFHQVGEVLSMELPRLSINGWFHGPAPESEEPHAELPVPLTPHNQVVVLKSWVEAGYLCPRARAQVQAQMERASEVCLHDLLLPSRCQQLLEALEKNDIEWEQCGPAHQRRYQRVTEKWLSASELSEATEEEAIQGEEPDDCGVQGETHVVRALLRLLSSTAFMRLVADCTDLPLTLYRKLEMQRWRAGDFTLLPPREHYQQPRLEAVLYLGVPKHPICGGQTLYVAPEEGSLAEAEALVTLPPRHNALGLVYCDAGAASFTKYLSKMTMSENECFYIVTCTYTE-
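
The transcript and protein lengths above can be cross-checked against each other. This is a minimal sketch, assuming that the transcript sequence is coding sequence only (no UTRs), that the standard genetic code applies, and that the trailing `-` in the protein sequence marks the stop codon; none of this is stated in the record. The function names `translate` and `lengths_consistent` are hypothetical and not part of any database tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(stop_symbol if aa == "*" else aa for aa in residues)


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First 33 and last 18 nucleotides of DPOGS204629-TA, copied from the record.
    print(translate("ATGAGTTCTCCGACTAAAGAAACTGAAGATCCA"))  # MSSPTKETEDP
    print(translate("TGTACTTATACCGAGTGA"))                 # CTYTE-
    print(lengths_consistent(1863, 620))                   # True
```

The two translated fragments match the start and end of DPOGS204629-PA, and 1863 bp equals 3 x (620 + 1), which is consistent with a CDS-only transcript ending in a single stop codon.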