Monarch geneset OGS2.0

DPOGS208843
Transcript: DPOGS208843-TA, 1002 bp
Protein: DPOGS208843-PA, 333 aa
Genomic position: DPSCF300036, + strand, 877164-880012
RNAseq coverage: 271x (Rank: top 40%)
Annotation
Heliconius: HMEL015423, 2e-155, 78.68%
Bombyx: BGIBMGA007949-TA, 9e-125, 79.49%
Drosophila: Pbgs-PA, 4e-109, 55.83%
EBI UniRef50: UniRef50_E0VHL1, 2e-110, 56.17%, Delta-aminolevulinic acid dehydratase n=3 Tax=Coelomata RepID=E0VHL1_PEDHC
NCBI RefSeq: XP_001847346.1, 2e-119, 65.11%, delta-aminolevulinic acid dehydratase [Culex quinquefasciatus]
NCBI nr blastp: gi|170039024, 3e-118, 65.11%, delta-aminolevulinic acid dehydratase [Culex quinquefasciatus]
NCBI nr blastx: gi|158287863, 3e-113, 63.24%, AGAP010935-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology: GO:0046872, 4.4e-182, metal ion binding
GO:0033014, 4.4e-182, tetrapyrrole biosynthetic process
GO:0004655, 4.4e-182, porphobilinogen synthase activity
GO:0008152, 2.7e-112, metabolic process
GO:0003824, 2.7e-112, catalytic activity
KEGG pathway: cqu:CpipJ_CPIJ006003, 5e-119
K01698 (E4.2.1.24, hemB), maps to: Porphyrin and chlorophyll metabolism
InterPro domain: [3-330] IPR001731, 4.4e-182, Tetrapyrrole biosynthesis, porphobilinogen synthase
[15-331] IPR013785, 2.7e-112, Aldolase-type TIM barrel
Orthology group: MCL13776, Single-copy universal gene

Nucleotide sequence:

>DPOGS208843-TA
ATGAACCCTTTATTTTTAAGTCCGGAACATGTCCTACAAGGAGGATATTTTAATGCAACTCTACGAAAGTTGCAAGAGCCCAATACTACAATTGAACCTCATAACTTGATGTACCCAATTTTTCTCTTAGAAAATGAGGATGCTATGCAATCTGTATCAAGTATGCCAAATGTTTATCGTTATGGAATAAACAAACTGATTCCAGCACTTGTGGAACTGGTGGAACAGGGTCTGAAATCCATCCTTATATTTGGTATTGTTGAAACACTACCCAAGGATGCAAGAGGCTCAAGTGCAGATTGTTCCGAGAATCCCGTAGTGAAGGCCTTGCCCAGGATTCTCGAAGCCTGCCCGAATCTGACAATAGCTTGCGATGTGTGCCTCTGTCCTTACACCTCACATGGTCATTGTGGCTTATTAACTGAGAATGGGGTTATCGACCATGCTGCCTCCGTGAAGAGAATTGCCGAGGTGGCTTTAGCATATGCAAAAGCTGGTGCTCATATTGTGGCACCTTCTGATATGATGGACAACAGAATTAAGGCCATCAAAGATGCTCTCGTAGAGAATAAACTTCAGAATCAGGTGTCGGTGTTGTCCTACTCGTGCAAATTCGCGTCTTCCATGTACGGTCCGTTCCGTGACACTATGAAGAGTTCCCCAATGGCTGGTGACCGGAAGTGCTACCAGCTGCCCCCCGGCAGTGCTGGACTGGCGGCGCGGGCTGCGGCACGTGACGTCAGCGAGGGCGCCGACTTCCTGATGGTGAAGCCGGGTCTACCTTACCTGGACATAGTGCGTCAGACCAAGGACAAGTATCCGCATCATCCACTCTTCATTTATCAGGTATCCGGCGAGTACGCTATGATCTCGCGTAACGGAGACTCCTCGGAAGTGGAGAGCACTCTCATGGAAACACTCACGTGCATGCGGCGAGCTGTATACGACTGTATCATCACGTACTTCGCGCCGCTCGTTTTAAACATACTGTCTAGAAAATAA

Protein sequence:

>DPOGS208843-PA
MNPLFLSPEHVLQGGYFNATLRKLQEPNTTIEPHNLMYPIFLLENEDAMQSVSSMPNVYRYGINKLIPALVELVEQGLKSILIFGIVETLPKDARGSSADCSENPVVKALPRILEACPNLTIACDVCLCPYTSHGHCGLLTENGVIDHAASVKRIAEVALAYAKAGAHIVAPSDMMDNRIKAIKDALVENKLQNQVSVLSYSCKFASSMYGPFRDTMKSSPMAGDRKCYQLPPGSAGLAARAAARDVSEGADFLMVKPGLPYLDIVRQTKDKYPHHPLFIYQVSGEYAMISRNGDSSEVESTLMETLTCMRRAVYDCIITYFAPLVLNILSRK-
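
Length consistency check:

The lengths listed in this record can be cross-checked with simple arithmetic. The sketch below is not part of the original record. It assumes that the transcript is a complete coding sequence including the stop codon (the trailing "-" in the protein sequence) and that the genomic coordinates are 1-based and inclusive. The function names `cds_length_bp` and `span_bp` are hypothetical helpers introduced only for illustration.

```python
def cds_length_bp(protein_aa: int, has_stop: bool = True) -> int:
    """Coding-sequence length in bp implied by a protein length.

    Assumption: three bases per residue, plus one stop codon if has_stop.
    """
    return 3 * (protein_aa + (1 if has_stop else 0))


def span_bp(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record above.
transcript_bp = 1002
protein_aa = 333
start, end = 877164, 880012

# 333 residues plus one stop codon give 334 codons, i.e. 1002 bp.
print(cds_length_bp(protein_aa) == transcript_bp)  # True

# The genomic span is longer than the transcript.
print(span_bp(start, end))  # 2849
```

Under these assumptions the 1002 bp transcript matches the 333 aa protein exactly, and the 2849 bp genomic span exceeds the transcript length, which would be consistent with the gene containing introns.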