Monarch geneset OGS2.0

DPOGS206896
Transcript: DPOGS206896-TA, 999 bp
Protein: DPOGS206896-PA, 332 aa
Genomic position: DPSCF300001 - 1808364-1812095
RNAseq coverage: 1061x (Rank: top 12%)
Annotation
Heliconius: HMEL006860  4e-133  75.40%
Bombyx: BGIBMGA012853-TA  1e-127  74.83%
Drosophila: pr-set7-PA  2e-65  57.14%
EBI UniRef50: UniRef50_Q7QAW5  2e-65  59.57%  AGAP012481-PA (Fragment) n=1 Tax=Anopheles gambiae str. PEST RepID=Q7QAW5_ANOGA
NCBI RefSeq: XP_313242.4  4e-66  59.57%  Anopheles gambiae str. PEST AGAP012481-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|158291707  8e-65  59.57%  Anopheles gambiae str. PEST AGAP012481-PA [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|91085315  7e-62  57.87%  PREDICTED: similar to histone-lysine n-methyltransferase [Tribolium castaneum]
Group
Gene Ontology: GO:0018024  4.5e-65  histone-lysine N-methyltransferase activity
  GO:0005515  2.2e-25  protein binding
KEGG pathway: aga:AgaP_AGAP012481  1e-65
  K11428 (SETD8) maps-> Lysine degradation
InterPro domain: [5-333]  IPR016858  4.5e-65  Histone H4-K20 methyltransferase
  [197-323]  IPR001214  2.2e-25  SET domain
Orthology group: MCL12430  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206896-TA
ATGTTAGAAAGCAATGACTGCGAGTTCCCATTCGATGGTGTTCTTCGTTCTTATAACAGTGATTATCAAGTTCAAGAAGAAATGGTTCGAGTCAATTCGGTTACATCAACTCAAGAGGGTATAAAAACACCACACCGCATTGAGTTGTGTGAAGAAAAACAAGTTAAGCCAACGAGAAATTATAGAAAACGCAGAATCATTTCATCAGCTTCTATAAAGACTGAAAATGAGTTGGACGAACCTCCGCTGAAGAAAACTGAAGTGAATTCAAAACCACTGAAAGCTTCTAACGGCTGCATCATGACGAATGATACGACATCCAGATCCAGGGCGTCTCGAGCAGCCGCCCGCTCTACTGACAGCCACAAACTGACAGAATACTTTAAGGAGGAACTGAAGAACACTAAAGTTGAACCTCAACCATCACCTGTAGAGAAAATAACCAAGGATGAGGTCAAAGTCAAAGGGAATACGAATCACAAGCTGACAGAATACTTCCCAGTTAGAAGGAGCGTAAGGAAGACCTCGAAATGTGTGATGGCGGAAAAAATGAGAGACTTCGAGCGAGCTGTAAGAGAACAAAGAGAGGATGGTTTACAGGTGGCGTACTTCAAGGACAAGGGTCGTGGCGTGGTGGCAACTAAGGCGTTCCAGAGGGGTGAGTTCGTGGTAGAATACGCGGGAGAGCTCGTGGGGGTCGCGGAAGCCAGGGAGAGGGAAAGACTGTACGCACAGGACCCCACCAAGGGATGCTATATGTACTACTTCAGATTACAAGACCAACAGTACTGTATCGACGCTACATCCGAGTCAGGTCGCCTGGGTCGTTTGGTAAACCACTCTAGAACTGGTAATCTGGTGACCAAAGCCTTGTGGGTGGACGGTCCACGTCTCGTACTGGTAGCAGGGACTGACGTTCAGCCCGGGGACGAACTGACGTATGACTACGGAGACCGCTCCAGGGAATCTCTGAGACATCACCCTTGGCTGGCGCTATGA

Protein sequence:

>DPOGS206896-PA
MLESNDCEFPFDGVLRSYNSDYQVQEEMVRVNSVTSTQEGIKTPHRIELCEEKQVKPTRNYRKRRIISSASIKTENELDEPPLKKTEVNSKPLKASNGCIMTNDTTSRSRASRAAARSTDSHKLTEYFKEELKNTKVEPQPSPVEKITKDEVKVKGNTNHKLTEYFPVRRSVRKTSKCVMAEKMRDFERAVREQREDGLQVAYFKDKGRGVVATKAFQRGEFVVEYAGELVGVAEARERERLYAQDPTKGCYMYYFRLQDQQYCIDATSESGRLGRLVNHSRTGNLVTKALWVDGPRLVLVAGTDVQPGDELTYDYGDRSRESLRHHPWLAL-