Monarch geneset OGS2.0

DPOGS210609
Transcript: DPOGS210609-TA, 963 bp
Protein: DPOGS210609-PA, 320 aa
Genomic position: DPSCF300168 + 27359-28718
RNAseq coverage: 459x (Rank: top 27%)
Annotation
Heliconius: HMEL005898, 6e-76, 82.12%
Bombyx: BGIBMGA014414-TA, 5e-139, 75.79%
Drosophila: CG3793-PB, 8e-97, 55.66%
EBI UniRef50: UniRef50_Q9V3K7, 9e-95, 55.66%, CG3793 n=30 Tax=Coelomata RepID=Q9V3K7_DROME
NCBI RefSeq: XP_314892.4, 9e-98, 54.69%, AGAP008768-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|289740867, 3e-97, 55.99%, leucine carboxyl methyltransferase 1 protein [Glossina morsitans morsitans]
NCBI nr blastx: gi|289740867, 8e-95, 56.17%, leucine carboxyl methyltransferase 1 protein [Glossina morsitans morsitans]
Group
Gene Ontology: GO:0008168, 1.9e-143, methyltransferase activity
KEGG pathway: hsa:51451, 1e-55
    K00599 (E2.1.1.-) maps-> Naphthalene and anthracene degradation
    Tyrosine metabolism
    Histidine metabolism
    Selenoamino acid metabolism
InterPro domain: [1-321] IPR016651, 1.9e-143, Leucine carboxyl methyltransferase, LCTM1 1
    [11-310] IPR021121, 2.1e-124, Leucine carboxyl methyltransferase, eukaryotic
    [13-204] IPR007213, 1.7e-34, Leucine carboxyl methyltransferase
Orthology group: MCL14248, Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210609-TA
ATGAATAATATAATATCTTGGAGTGCCGAAGACGAGGCTATCATTGCTACAAACACGGATGCTACTGAATGCAAGCGGTGTGCAGTTGAATTGGGTTATTGGAAAGATGAATACATTTCATATTTTGCTAAACATGTCGATCGAAAAGCTCCAGAAATAAATCGTGGATATTATGCGAGAGTTAAAGCAATGGAAATGTTTATTCATCAATTTTTAGAGAGATGTGGTACCAAGTGTCAGATCATCAACCTGGGATGTGGTTTTGATACTCTGTATTGGCGCCTCAAGGACACTACACAAGCCGTCAGCAACTTTATAGAGTTGGACTTTCCATCGGTAACAAGCAAGAAATGTCACATCATCAAACGTAACAAGCAGCTATTGGAGAAGATTTGCAAAGAAGGCATAAATGGGGAGGTCGTGATCCGGTCTGGTGATCTCCACTCTGACGGTTACCATCTGCTGGGCTGCGACCTGCGTTGTTTGGAGGAGGTCCGTCGCAAGTTGCAGGCGGCCGGCGCCACTGCCGAGGCACCCGCGCTGTTGCTCGCGGAATGCGTGTTGGTGTACCTGAGGCCCGAGGCGGCGCTGGCGCTGCTCCGCCACCTGGCCGCCGCCTTCCCTCGCTGCGTGCTCCTGCTTTACGAGCAGTGTAACCTGTCCGACCGCTTCGGCGAGGTCATGCTGCGCAACCTGAGCGCGCGCGGATGTCCGCTGGCTGGCGCCGAGCACTGCCGCGAGCCGGCGGCTCAAGCCGAACGCCTCGTGTCACTAGGCTTCGACGCGGCGCGCTCCTGGGACATGGAGACCGTGTGGCGCTCCTTCCCCGAGGACGAGCGCTCGCGAGTGGACGCGCTGGAGATGCTGGACGAGCGCGAGCTGCTCCTGCAGCTGAACACGCATTACGCGCTGACGGTGGCCACTCGCGGGGAACTGTTCGCCGACCTCGACCTCGCCGGGTAG

Protein sequence:

>DPOGS210609-PA
MNNIISWSAEDEAIIATNTDATECKRCAVELGYWKDEYISYFAKHVDRKAPEINRGYYARVKAMEMFIHQFLERCGTKCQIINLGCGFDTLYWRLKDTTQAVSNFIELDFPSVTSKKCHIIKRNKQLLEKICKEGINGEVVIRSGDLHSDGYHLLGCDLRCLEEVRRKLQAAGATAEAPALLLAECVLVYLRPEAALALLRHLAAAFPRCVLLLYEQCNLSDRFGEVMLRNLSARGCPLAGAEHCREPAAQAERLVSLGFDAARSWDMETVWRSFPEDERSRVDALEMLDERELLLQLNTHYALTVATRGELFADLDLAG-
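
Consistency check (illustrative sketch):

The transcript and protein lengths listed above agree with each other: 963 bp is 321 codons, that is 320 amino acids plus one stop codon, which the protein record writes as a trailing "-". The Python sketch below shows one way to check this by translating a coding sequence with the standard genetic code. The function name translate and the choice of "-" as the stop symbol are assumptions made for illustration and are not part of the database; the sketch also assumes the transcript is the coding sequence alone, starting in frame at ATG.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stop codons become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First seven codons and last three codons of DPOGS210609-TA, copied from the record.
first_codons = "ATGAATAATATAATATCTTGG"
last_codons = "GCCGGGTAG"

print(translate(first_codons))  # start of the protein
print(translate(last_codons))   # end of the protein, including the stop symbol
print(963 // 3 - 1)             # expected residue count, excluding the stop codon
```

Passing the full DPOGS210609-TA sequence to translate should reproduce the DPOGS210609-PA record, provided the assumptions above hold.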