Monarch geneset OGS2.0

DPOGS205057
Transcript        DPOGS205057-TA  2811 bp
Protein           DPOGS205057-PA  936 aa
Genomic position  DPSCF300074 - 377279-387993
RNAseq coverage   24x (Rank: top 78%)
Annotation
Heliconius      HMEL007928        0.0  79.11%
Bombyx          BGIBMGA006804-TA  0.0  87.10%
Drosophila      Tdc2-PA           0.0  70.49%
EBI UniRef50    UniRef50_A1Z6N4   0.0  70.49%  MIP05841p n=17 Tax=Neoptera RepID=A1Z6N4_DROME
NCBI RefSeq     XP_001870804.1    0.0  70.33%  aromatic amino acid decarboxylase [Culex quinquefasciatus]
NCBI nr blastp  gi|170048848      0.0  70.33%  aromatic amino acid decarboxylase [Culex quinquefasciatus]
NCBI nr blastx  gi|195383506      0.0  70.66%  GJ20190 [Drosophila virilis]
Group
Gene Ontology   GO:0019752  1.6e-278  carboxylic acid metabolic process
                GO:0016831  1.6e-278  carboxy-lyase activity
                GO:0030170  1.6e-278  pyridoxal phosphate binding
                GO:0003824  7.9e-101  catalytic activity
                GO:0006520  1.5e-76   cellular amino acid metabolic process
KEGG pathway    cel:K01C8.3  0.0
                K01593 (E4.1.1.28, DDC) maps-> Betalain biosynthesis
                    Isoquinoline alkaloid biosynthesis
                    Tryptophan metabolism
                    Tyrosine metabolism
                    Histidine metabolism
                    Phenylalanine metabolism
                    Indole alkaloid biosynthesis
InterPro domain  [1-526]    IPR002129  1.6e-278  Pyridoxal phosphate-dependent decarboxylase
                 [1-484]    IPR015424  3e-139    Pyridoxal phosphate-dependent transferase, major domain
                 [84-366]   IPR015421  7.9e-101  Pyridoxal phosphate-dependent transferase, major region, subdomain 1
                 [6-25]     IPR010977  1.5e-76   Aromatic-L-amino-acid decarboxylase
                 [367-482]  IPR015422  3.2e-38   Pyridoxal phosphate-dependent transferase, major region, subdomain 2
Orthology group  MCL10444  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205057-TA
ATGGACGTCGAAGAGTTCCGTGTTCGGGGTAAAGAAATGGTTGACTACATCTGTACCTATATGACGACCCTGTCGAAGCGGAGGGTGACTCCATCGGTTGAGCCCGGTTACCTCCGCACGGAACTGCCGACGGAGGCTCCTTTCCTTCCTGAAAATTGGAATGATGTGATGGAAGATGTGGAAAATAAGATTATGCCGGGCGTCACACATTGGCAGCATCCTCGGTTTCATGCATATTTCCCATCGGGCAATGGCTACCCCTCAATACTTGGTGACATGCTCTCCGCAGGCATCGGCTGTATCGGATTTTCATGGGCTGCGAGTCCAGCTTGCACGGAATTAGAAATTATAATGTTGGATTGGATGGGAAAAGCTATAGGGTTGCCTCCCGCTTTTCTGCAACTTGAGGAAGGAAGCAAGGGCGGTGGCGTTATTCAGGGATCAGCCAGCGAGTGTGTACTTGTGTGCATGTTAGCCGCAAGAGCTGCAGGGATCAAGCGATTGAAGCATCAATTCCCGACCGTCGATGAGGGGCTGTTACTTTCAAAGTTAATTGCTTATTGTTCCAAAGAAGCACACTCTTGTGTTGAGAAAGCTGCTATGATAAGTTTCGTTAAACTGCGTATTCTACAGCCGGACGAACACGGTTCACTTAGAGGGGATACATTAAAAGAAGCAATGGAAGAAGATGAAGAAGCGGGACTAGTCCCATTTTTCGTTTCAGCAACGCTAGGGACAACAGGGACGTGTGCATTTGATAATTTGTCCGAAATTGGACCGGTAGTTCGGAAATTTCCTAGCGTTTGGCTGCATGTAGACGCTGCGTATGCTGGCAGCTCATTTATCTGCCCTGAACATAAATATCATCTGGCAGGAATTGAATATGCTGACTCATTTAATACTAATTCAAATAAAATGATGCTCACCAACTTTGATTGTTCTTTAATGTGGGTCACAAACAGATATCTATTGACATCTGCTTTAGTCGTCGATCCGTTGTATTTACAACATTGTTATGACGGTACCGCAATCGATTACCGCCACTGGGGAATACCGCTCAGCCGTCGCTTCAGATCACTGAAGTTGTGGTTCATGTTGAGGAGTTATGGAATCAGTGGCCTGCAGAAATATATACGAAGACATTGCGAACTCGCTAAGTATTTCGAACAACTTGTTAAAAAGGACAAGAGATTCGAAGTATGCAACCAAGTTAAGTTGGGATTAGTATGCTTTCGATTGGTAGGGAGTCGCGACGAAAATGAGGAACAAGTTGATGAGTTGAATAAGAAACTGCTTACTAACATCAATGCTTCTGGAAAGCTCCACATGGTGCCCACTTCTTTTCGTGATCGATACGTGATTCGTTTCTGTGTTGTGCACCAACACGCTAGCCGTGAAGATATTGAATATGCTTGGGATACCATAACTGACTTCGCAGAAGAATTATACGAAGGTCCCGATAAGGAAAGGGATTTGAATGAGGAAAGGGCACGTAAGCATCTGCAAGCTCTCGCTCATAAGCGTTCGTTCTTCGTGCGCATGGTGAGCGACCCGAAGATCTACAATCCTGCCATTAACAAGACCCCGCCGCCAATCCCAACTAGCCCAACTACCCCACCAGCACCCGCCGCCCCCGATACACCATCGGAAACCGATCCGATGACACCGAAACAATCGTCGTGGATAAGTTGGCCACTTGCTTTCTTCTTCCAAAGCGCAATGGAAGAAGATGAAGAAGCGGGACTAGTCCCATTTTTCGTTTCAGCAACGCTAGGGACAACAGGGACGTGTGCATTTGATAATTTGTCCGAAATTGGACCGGTAGTTCGGAAATTTCCTAGCGTTTGGCTGCATGTAGACGCTGCGTATGCTGGCAGCTCATTTATCTGCCCTGAACATAAATATCACCTGGCAGGAATTGAATATGCTGACTCATTTAATACTAATTCAAATAAAATGATGCTCACCAACTTTGATTGTTCTTTAATGTGGGTCACAAA
CAGATATCTATTGACATCTGCTTTAGTCGTCGATCCGTTGTATTTACAACATTGTTATGACGGTACCGCAATCGATTACCGCCACTGGGGAATACCGCTCAGCCGTCGCTTCAGATCACTGAAGTTGTGGTTCATGTTGAGGAGTTATGGAATCAGTGGCCTGCAGAAATATATACGAAGACATTGCGAACTCGCTAAGTATTTCGAACAACTTGTTAAAAAGGACAAGAGATTCGAAGTATGCAACCAAGTTAAGTTGGGATTAGTATGCTTTCGATTGGTAGGGAGTCGCGACGAAAATGAGGAACAAGTTGATGAGTTGAATAAGAAACTGCTTACTAACATCAATGCTTCTGGAAAGCTCCACATGGTGCCCACTTCTTTTCGTGATCGATACGTGATTCGTTTCTGTGTTGTGCACCAACACGCTAGCCGTGAAGATATTGAATATGCTTGGGATACCATAACTGACTTCGCAGAAGAATTATACGAAGGTCCCGATAAGGAAAGGGATTTGAATGAGGAAAGGGCACGTAAGCATCTGCAAGCTCTCGCTCATAAGCGTTCGTTCTTCGTGCGCATGGTGAGCGACCCGAAGATCTACAATCCTGCCATTAACAAGACCCCGCCGCCAATCCCAACTAGCCCAACTACCCCACCAGCACCCGCCGCCCCCGATACACCATCGGAAACCGATCCGATGACACCGTTCCGGCATTTAGATACGATGGTACGTCTAAAGAGCCCACAGATACGCAGAGGTTCATCGCCAGGCGTGTCCCCCGAGCGTCGGCCCTCCCCTGCAAACTGA

Protein sequence:

>DPOGS205057-PA
MDVEEFRVRGKEMVDYICTYMTTLSKRRVTPSVEPGYLRTELPTEAPFLPENWNDVMEDVENKIMPGVTHWQHPRFHAYFPSGNGYPSILGDMLSAGIGCIGFSWAASPACTELEIIMLDWMGKAIGLPPAFLQLEEGSKGGGVIQGSASECVLVCMLAARAAGIKRLKHQFPTVDEGLLLSKLIAYCSKEAHSCVEKAAMISFVKLRILQPDEHGSLRGDTLKEAMEEDEEAGLVPFFVSATLGTTGTCAFDNLSEIGPVVRKFPSVWLHVDAAYAGSSFICPEHKYHLAGIEYADSFNTNSNKMMLTNFDCSLMWVTNRYLLTSALVVDPLYLQHCYDGTAIDYRHWGIPLSRRFRSLKLWFMLRSYGISGLQKYIRRHCELAKYFEQLVKKDKRFEVCNQVKLGLVCFRLVGSRDENEEQVDELNKKLLTNINASGKLHMVPTSFRDRYVIRFCVVHQHASREDIEYAWDTITDFAEELYEGPDKERDLNEERARKHLQALAHKRSFFVRMVSDPKIYNPAINKTPPPIPTSPTTPPAPAAPDTPSETDPMTPKQSSWISWPLAFFFQSAMEEDEEAGLVPFFVSATLGTTGTCAFDNLSEIGPVVRKFPSVWLHVDAAYAGSSFICPEHKYHLAGIEYADSFNTNSNKMMLTNFDCSLMWVTNRYLLTSALVVDPLYLQHCYDGTAIDYRHWGIPLSRRFRSLKLWFMLRSYGISGLQKYIRRHCELAKYFEQLVKKDKRFEVCNQVKLGLVCFRLVGSRDENEEQVDELNKKLLTNINASGKLHMVPTSFRDRYVIRFCVVHQHASREDIEYAWDTITDFAEELYEGPDKERDLNEERARKHLQALAHKRSFFVRMVSDPKIYNPAINKTPPPIPTSPTTPPAPAAPDTPSETDPMTPFRHLDTMVRLKSPQIRRGSSPGVSPERRPSPAN-