Monarch geneset OGS2.0

DPOGS201498
Transcript         DPOGS201498-TA   744 bp
Protein            DPOGS201498-PA   247 aa
Genomic position   DPSCF300006 + 710095-715928
RNAseq coverage    349x (Rank: top 33%)
Annotation
Heliconius       HMEL015490         1e-59    48.99%
Bombyx           BGIBMGA002587-TA   3e-114   77.37%
Drosophila       Phm-PB             3e-74    57.14%
EBI UniRef50     UniRef50_O01404    4e-72    57.14%   Peptidylglycine alpha-hydroxylating monooxygenase n=22 Tax=Pancrustacea RepID=PHM_DROME
NCBI RefSeq      XP_001841850.1     6e-84    60.64%   conserved hypothetical protein [Culex quinquefasciatus]
NCBI nr blastp   gi|170027931       1e-82    60.64%   conserved hypothetical protein [Culex quinquefasciatus]
NCBI nr blastx   gi|170027931       1e-81    60.64%   conserved hypothetical protein [Culex quinquefasciatus]
Group
Gene Ontology   GO:0003824   1e-46     catalytic activity
                GO:0055114   1e-46     oxidation-reduction process
                GO:0009987   1e-46     cellular process
                GO:0016715   1e-46     oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced ascorbate as one donor, and incorporation of one atom of oxygen
                GO:0005507   7.8e-23   copper ion binding
                GO:0004497   7.8e-23   monooxygenase activity
                GO:0016020   2.6e-21   membrane
                GO:0004504   2.6e-21   peptidylglycine monooxygenase activity
                GO:0006518   2.6e-21   peptide metabolic process
KEGG pathway 
InterPro domain   [81-240]   IPR008977   1e-46     PHM/PNGase F-fold domain
                  [86-231]   IPR014784   1.6e-44   Copper type II, ascorbate-dependent monooxygenase-like, C-terminal
                  [13-83]    IPR000323   7.8e-23   Copper type II, ascorbate-dependent monooxygenase, N-terminal
                  [22-41]    IPR000720   2.6e-21   Peptidyl-glycine alpha-amidating monooxygenase
Orthology group   MCL16085   Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201498-TA
ATGCAGAACAATGATATCGACTCCCGCTACAACACGGCCAGCCCTTGTCAGTCTGGTTCACAGATAGTATACGCGTGGGCCCGCGATGCTCCGAGTCTCAAACTGCCGGAGGACGTCGGTTTCCTAGTGGGAGAGAACTCGCCGATCAAGTATTTAGTTTTGCAAGTGCATTATATGCATAAATTTCCAGAAGGGCAAACCGACAATTCAGGAGTGCTGTTACAGTATACGACGGAGAGGATGCCTCGTCAAGCGGGGGTGTTCCTGTTGGGCACTAGCGGGGTGATTGCACCAAACCGCGTTGAGCATATGGAGACGGCTTGCACGCTCCACGAGGACAAGGTCATACATCCCTTCGCCTTTAGACCTCACACTCATAGCCTCGGCAGGGAGGTTTCGGGGTACGTGGTGCGTCGGGCGTCGTCTGGTGATGAATGGCGTCTGCTGGGTCGCCGCGACCCCCAGGAGCCTCAGATGTTCTACCCCGTGGAGGATATGGACCCCATCAAGAAGAACGACGTACTAGCAGCGCGCTGCGTCATGAACAATACCCACGAATACCCCGTCAAGATTGGGGCTACCAACAACGATGAGATGTGCAACTTCTATCTGATGTACTGGGTGCAGAACGACACACCCCTGGCACAGAAGTACTGCTTCTCCGCTGGTCCACCATACTATTACTGGAATAGGGCCGTCGAAAACTTTGATCGTATACCAGATAGAGATATTAATATTATCTAG

Protein sequence:

>DPOGS201498-PA
MQNNDIDSRYNTASPCQSGSQIVYAWARDAPSLKLPEDVGFLVGENSPIKYLVLQVHYMHKFPEGQTDNSGVLLQYTTERMPRQAGVFLLGTSGVIAPNRVEHMETACTLHEDKVIHPFAFRPHTHSLGREVSGYVVRRASSGDEWRLLGRRDPQEPQMFYPVEDMDPIKKNDVLAARCVMNNTHEYPVKIGATNNDEMCNFYLMYWVQNDTPLAQKYCFSAGPPYYYWNRAVENFDRIPDRDINII-
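
Length check:

The transcript length (744 bp) and protein length (247 aa) listed in this record agree if the transcript is read as a complete coding sequence whose last codon is a stop; the nucleotide sequence ends in TAG and the protein sequence ends in "-". The arithmetic can be sketched in Python as below. The function name expected_protein_length is hypothetical and is not part of the geneset or of any tool used to build it.

```python
def expected_protein_length(cds_bp: int, has_stop: bool = True) -> int:
    """Return the protein length implied by a coding sequence of cds_bp bases.

    Assumes the sequence is a complete CDS in frame 0 (an assumption made
    for this sketch). If has_stop is True, the final codon is treated as a
    stop codon and is not counted as an amino acid.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if has_stop else codons


# Values taken from the record: DPOGS201498-TA is 744 bp.
print(expected_protein_length(744))  # 744 / 3 = 248 codons, minus the stop -> 247
```

The result matches the 247 aa reported for DPOGS201498-PA.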