Monarch geneset OGS2.0

DPOGS201074
Transcript: DPOGS201074-TA (1170 bp)
Protein: DPOGS201074-PA (389 aa)
Genomic position: DPSCF300185 - 198884-200053
RNAseq coverage: 16011x (Rank: top 1%)
Annotation
Heliconius: HMEL022314  0.0  89.97%
Bombyx: BGIBMGA001595-TA  0.0  87.92%
Drosophila: pyd3-PA  5e-134  58.07%
EBI UniRef50: UniRef50_Q9UBR1  6e-130  57.81%  Beta-ureidopropionase n=60 Tax=Eukaryota RepID=BUP1_HUMAN
NCBI RefSeq: NP_001165388.1  0.0  87.66%  aliphatic nitrilase [Bombyx mori]
NCBI nr blastp: gi|284813565  0.0  87.66%  aliphatic nitrilase [Bombyx mori]
NCBI nr blastx: gi|284813565  0.0  87.66%  aliphatic nitrilase [Bombyx mori]
Group
Gene Ontology:
  GO:0006807  9.6e-60  nitrogen compound metabolic process
  GO:0016810  9.6e-60  hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
KEGG pathway: aga:AgaP_AGAP010229  1e-139
  K01431 (E3.5.1.6) maps to:
    Pantothenate and CoA biosynthesis
    Drug metabolism - other enzymes
    Pyrimidine metabolism
    beta-Alanine metabolism
InterPro domain: [63-364]  IPR003010  9.6e-60  Nitrilase/cyanide hydratase and apolipoprotein N-acyltransferase
Orthology group: MCL11063  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201074-TA
ATGGACGGTGAAACACAGAGCCTCGAATCCATCATCAACACGAACCTCGGGGGAAAAGACCTTGAGGAGTTCAACAGAATCTATTACGGAAGAAAAAATCATCATGAGGTAGACCTAAAGGAGTCCTCGATCGCCGCCGCCAAGGACAACGATTTCGAAATCGCTGCTTACGCCTTCCCCGCGAAAAAGGAATCCACACGACCACCGAGGATCGTGAAGGTGGCCGTCATTCAGCACTCCATCGCCGTGCCGACCGACCGGCCCGTCAACGAACAGAAAAATGCCATACTCGCCAAGGTGAAGAAGATCATAGACGTGGCCGGTCAGGAGGGTGTCAACATCTTGTGCTTTCAAGAACTATGGAACATGCCCTTCGCTTTCTGCACCAGGGAGAAGCAGCCGTGGTGCGAGTTTGCGGAGTCCGCGGAGGACGGGCCCACCACACGCTTCCTGCGCGAGCTCTGCATCAAGTACGCGATGGTCATCGTATCATCCATACTGGAACGTGATGAGAAGCACGCCGACATCATATGGAACACGGCGGTCGTGATCAGCGACACCGGCAGTGTGATCGGAAAACACAGGAAGAATCACATCCCCAGGGTAGGCGACTTCAACGAGTCCAACTATTACATGGAGGGTAACACCGGCCACCCGGTGTTTGCGACGCGGTACGGTAAAATAGGCATCAACATCTGCTTTGGACGTCACCACGTCCTGAACTGGATGATGTTCGGGCAGAACGGAGCGGAAATAGTGTTCAACCCATCAGCCACGATCGCCGCTGAGGCCGGCAGCGAGTACATGTGGAACATCGAGGCGAGGAACGCCGCTATAACCAACTGCTACTTCACGGCTGCGATTAACAGGGTCGGATACGAGGAGTTCCCGAATGAGTTCACCTCCGCTGATGGTAAACCGGCCCACAAGGATTTGGGTTTGTTCTACGGGTCCAGCTACTTCTGTGGTCCGGACGGCGTCAGGTGCCCTGGACTGTCGCGTAATAGAGACGGGCTCCTGATAGCGGTCGTGGACCTCAATATGAACAGGCAGATCCGAGATCGGCGCTGTTACTACATGACACAACGCCTGGACATGTACGTGGAGAGCCTCAAGAGAGTCCTGGACCTGGACTTCAAGCCACAGGTCGTCAATGAAACAGAAAAATGA

Protein sequence:

>DPOGS201074-PA
MDGETQSLESIINTNLGGKDLEEFNRIYYGRKNHHEVDLKESSIAAAKDNDFEIAAYAFPAKKESTRPPRIVKVAVIQHSIAVPTDRPVNEQKNAILAKVKKIIDVAGQEGVNILCFQELWNMPFAFCTREKQPWCEFAESAEDGPTTRFLRELCIKYAMVIVSSILERDEKHADIIWNTAVVISDTGSVIGKHRKNHIPRVGDFNESNYYMEGNTGHPVFATRYGKIGINICFGRHHVLNWMMFGQNGAEIVFNPSATIAAEAGSEYMWNIEARNAAITNCYFTAAINRVGYEEFPNEFTSADGKPAHKDLGLFYGSSYFCGPDGVRCPGLSRNRDGLLIAVVDLNMNRQIRDRRCYYMTQRLDMYVESLKRVLDLDFKPQVVNETEK-
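
Consistency check (sketch):

The lengths and sequences in this record can be cross-checked with a few lines of Python. The sketch below assumes the standard genetic code, assumes that the trailing "-" in the protein record marks the stop codon, and assumes that the genomic coordinates are 1-based and inclusive. The helper name `translate` is illustrative and is not part of any database tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# record was translated with the standard code; "*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` to match the "-" that ends
    the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 24 bases of DPOGS201074-TA, copied from the record above.
cds_start = "ATGGACGGTGAAACACAGAGCCTCGAATCC"
cds_end = "GTCGTCAATGAAACAGAAAAATGA"

print(translate(cds_start))  # MDGETQSLES
print(translate(cds_end))    # VVNETEK-

# Length bookkeeping: 1170 bp is 390 codons, i.e. 389 residues plus the stop.
transcript_bp = 1170
print(transcript_bp // 3 - 1)  # 389

# Genomic span, assuming 1-based inclusive coordinates.
print(200053 - 198884 + 1)  # 1170
```

Both fragments agree with the ends of the DPOGS201074-PA record. Under the stated coordinate assumption, the genomic span equals the transcript length, which would be consistent with a gene model without introns.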