Monarch geneset OGS2.0

DPOGS209503
Transcript: DPOGS209503-TA  1644 bp
Protein: DPOGS209503-PA  547 aa
Genomic position: DPSCF300127 - 27598-37997
RNAseq coverage: 736x (Rank: top 18%)
Annotation
Heliconius: HMEL016015  4e-118  60.22%
Bombyx: BGIBMGA007345-TA  3e-169  67.20%
Drosophila: CG5112-PA  3e-107  42.63%
EBI UniRef50: UniRef50_E2BQI5  8e-120  42.01%  Fatty-acid amide hydrolase 2 n=7 Tax=Formicidae RepID=E2BQI5_HARSA
NCBI RefSeq: XP_967443.1  3e-124  44.78%  PREDICTED: similar to CG5112 CG5112-PA [Tribolium castaneum]
NCBI nr blastp: gi|350405899  8e-124  43.52%  PREDICTED: fatty-acid amide hydrolase 2-like [Bombus impatiens]
NCBI nr blastx: gi|350405899  4e-120  43.52%  PREDICTED: fatty-acid amide hydrolase 2-like [Bombus impatiens]
Group
Gene Ontology: GO:0016884  8.7e-172  carbon-nitrogen ligase activity, with glutamine as amido-N-donor
KEGG pathway: dpo:Dpse_GA20678  2e-68
  K01426 (E3.5.1.4, amiE) maps to:
    Styrene degradation
    Benzoate degradation via CoA ligation
    Arginine and proline metabolism
    Tryptophan metabolism
    Phenylalanine metabolism
    Cyanoamino acid metabolism
InterPro domain: [72-546] IPR000120  8.7e-172  Amidase
  [73-539] IPR023631  4.5e-96  Amidase signature domain
Orthology group: MCL15852  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209503-TA
ATGGTTACAAAAAATGTTAAAAAACCTAAAACTGAAAAACCAGATAGACGCAAAGATAAAAGCAACAAAATATGTAAAGGAATGATTGTGAATATATTGATAAGCATATACTTTACTCTCCGCTATTATTTGGATATGTTAATCGACTACGCGTTTTCACTTTATTGGGATGAATATCGTCAGCAATTGCCAAATTTAGAGAAAAAACATGCAATGCTTATGGAAAGTGCAGTGAAATTAGCTGAGAAAATACGAAAAAAAGAACTAAAATCTGAAGATCTTGTTACTGCCTGCATTGAGAGAATTAAACAGGTCAATCCAATTCTAAATGCGGTAACAGACCAGAGATTTGAAGAAGCTCTGAAGGAAGCTCGAGAAATTGATAAGAAGATTGAAGATGGACTTCCCGATGAGGAATTTAAGAACAAACCGTTCTTGGGGGTACCTTTCACTGCCAAGGAGTCCCACGCTGTTAACGGGATGCTCCACACTCTCGGTGTTCGAGCTCGTCGTGATGTCCGCGCCGAGTACGATGCGGAGTGTGTGAGGTTGCTTAGAGAGGCCGGGGCTTTGCCCCTCGCTGTTACCAACGTACCAGAAATCAACAAATGGCAGGAGACTCGCAATATGGTCTTCGGTCAGACGAACAATCCTTACGACACGGGCCGCACTGTCGGTGGCTCCAGTGGCGGCGAGGCAGCCCTACACGCGGCGCTGGCCTCGCCTATATCACTGTGTTCGGATATTGGCGGCTCGACTCGTATGCCCGCCTTCTACTGCGGTCTCTACGGGTACAATCCCACGGCCGGACACACCAGCCTTAAAGGATCAGCTCTCCGGAGCGGTGAGGATCCAACGATCGCGTCCATCGGCTTCGTCAGCAAACATCCCGAGGACCTGGCACCTCTCACTAAGATCGTCGCCGGTGAGAAAGCCGGATTGCTAGATTTGGATAGGAAAGTCGACATTAAGGATATCAAATTCTACTACGTCGAGGACGTGAAAGATTTAAGGATCAGTCCCGTGTGCAGCGAGCTTAAGAAGGCCATGCATAAAGTAACATCGAAGCTGTCGAAGGCGAGCGAAGCACCGAAGCGGTATAGTCACGCGGGGTTCAACCACTGCTTCGCGTTATGGAAACACGCCATGACACGAGAAACCGAAGACTTCGCTAAACTGCTCACTGACAACCATGGAAGGGCTTACGGAGTTATAGAGCTGGGAAAAAAGTTAATCGGTCAATCTGACTTCACATTGGCCGCTATCCTCAAGCTGTTGGACGAGCAAGTGTTCCCGGCTGTGCCTCCAGCTTGGGCCGACCAGCTCACAGACAGCTTGAGGGATGATCTCATTACGTTGCTCGGTGATACAGGTGTTCTTATATTCCCTTCAGCGCCGAGCCCCTGTCGCCCTCACTACACCCTGTATACTGGTCCATTTAACTTTGCTCTATGGGGTATATTCAACGCTCTTAAATTCCCAGCTGTACAGGTGCCGGTGGGTCTGTCCGCCGGTCTGCCGCTCGGCGTCCAGCTGGTGGCGGCGCCTGGACGGGACGCGTTACTTCTAAATGTTGCAGCATACCTGGAGGAACACCTGGGAGGATTCACACCACCTTGTGCTGTACCACTCAATAATGCTTAG

Protein sequence:

>DPOGS209503-PA
MVTKNVKKPKTEKPDRRKDKSNKICKGMIVNILISIYFTLRYYLDMLIDYAFSLYWDEYRQQLPNLEKKHAMLMESAVKLAEKIRKKELKSEDLVTACIERIKQVNPILNAVTDQRFEEALKEAREIDKKIEDGLPDEEFKNKPFLGVPFTAKESHAVNGMLHTLGVRARRDVRAEYDAECVRLLREAGALPLAVTNVPEINKWQETRNMVFGQTNNPYDTGRTVGGSSGGEAALHAALASPISLCSDIGGSTRMPAFYCGLYGYNPTAGHTSLKGSALRSGEDPTIASIGFVSKHPEDLAPLTKIVAGEKAGLLDLDRKVDIKDIKFYYVEDVKDLRISPVCSELKKAMHKVTSKLSKASEAPKRYSHAGFNHCFALWKHAMTRETEDFAKLLTDNHGRAYGVIELGKKLIGQSDFTLAAILKLLDEQVFPAVPPAWADQLTDSLRDDLITLLGDTGVLIFPSAPSPCRPHYTLYTGPFNFALWGIFNALKFPAVQVPVGLSAGLPLGVQLVAAPGRDALLLNVAAYLEEHLGGFTPPCAVPLNNA-
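
The lengths in this record are mutually consistent: 547 residues plus one stop codon give 548 codons, or 1644 bp. The sketch below checks that arithmetic and translates the first and last six codons of DPOGS209503-TA with the standard genetic code. It assumes the standard code (NCBI translation table 1) applies and that the trailing "-" in the protein sequence marks the stop codon. The helper name `translate` and its `stop_symbol` parameter are illustrative and not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: the record's protein was derived with this table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol`, matching the "-" that ends
    the protein sequence in the record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# Lengths reported in the record: 547 residues plus one stop codon.
assert 3 * (547 + 1) == 1644

# First six and last six codons of DPOGS209503-TA, copied from the record.
print(translate("ATGGTTACAAAAAATGTT"))  # MVTKNV
print(translate("CCACTCAATAATGCTTAG"))  # PLNNA-
```

Passing the full nucleotide sequence to the same function and comparing the result with the DPOGS209503-PA sequence would extend this spot check to the whole record.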