Monarch geneset OGS2.0

DPOGS200458
Transcript        DPOGS200458-TA    3435 bp
Protein           DPOGS200458-PA    1144 aa
Genomic position  DPSCF300260       80432-94536
RNAseq coverage   52x (Rank: top 70%)
Annotation
Heliconius        HMEL014564        0.0     61.94%
Bombyx            BGIBMGA011407-TA  7e-156  69.62%
Drosophila        egg-PA            1e-87   54.05%
EBI UniRef50      UniRef50_E9J254   1e-139  46.11%  Putative uncharacterized protein (Fragment) n=1 Tax=Solenopsis invicta RepID=E9J254_SOLIN
NCBI RefSeq       XP_392624.3       1e-139  44.86%  PREDICTED: similar to CG30426-PA [Apis mellifera]
NCBI nr blastp    gi|340722851      1e-140  46.37%  PREDICTED: histone-lysine N-methyltransferase SETDB1-like [Bombus terrestris]
NCBI nr blastx    gi|380029447      2e-141  44.81%  PREDICTED: histone-lysine N-methyltransferase SETDB1-like [Apis florea]
Group
Gene Ontology     GO:0005515  3.7e-25  protein binding
                  GO:0005634  4.2e-18  nucleus
                  GO:0008270  4.2e-18  zinc ion binding
                  GO:0034968  4.2e-18  histone lysine methylation
                  GO:0018024  4.2e-18  histone-lysine N-methyltransferase activity
                  GO:0003677  2.4e-10  DNA binding
KEGG pathway      ame:409098  4e-139
                  K11421 (SETDB) maps-> Lysine degradation
InterPro domain   [797-1119]  IPR001214  3.7e-25  SET domain
                  [160-271]   IPR007728  4.2e-18  Pre-SET domain
                  [158-263]   IPR003606  4.1e-15  Pre-SET zinc-binding sub-group
                  [646-717]   IPR016177  2.4e-10  DNA-binding, integrase-type
                  [642-684]   IPR001739  1.3e-06  Methyl-CpG DNA binding
Orthology group   MCL11836    Single-copy universal gene

Nucleotide sequence:

>DPOGS200458-TA
ATGCTACACATCTCTACCACCAACCACCAGGGATTTCCTCGTCAGCGCGCCGTTGCCAAGAAGACTACCACGAAGACTCGCCAATCATCCCGTACAGCCGTACAGAGCCTCGACCACTTTACTAGTAAACTAGTGTACTACAGTCCAAAGAAACATGTGAAGCCATACAAGATGGTGCCCCATACTTGCTCGACTGCGTGCAAGAGGACGGATGTTTTGGAACTCAAAGATTTAAAATCTTACAATCCATTAGCCAAGCCACTGTTGAGTGGCTGGGAGAGACAGATAGCCAATTTCAAGGGCAACAAGGTTGTATTGTACTTGTCTCCGTGCGGTCGCCGCGTCCGCTCTCCGCGGGAGCTACATCGCTATCTGCGAACCGTTGGTAGTCTGGACGGTCAGCTGGAGAAGCTCTTCACACCATCCACGCACTGTCTGGCCGAGTTTGTGCTCAACAAATACTGCGTCAGCAAGAAGGACTTATCAAATGGCAAAGAGAACGTCCCAGTGGCTTGCGTCAATTACTACGACGGATCACTGCCAGAGTTCTGTTTCTACAACACTGAGCGGACTCCGACCGCTGGGGTTCCACTCAACCTGGACCCGGAGTTCCTGTGTGGCTGTGACTGCGAGGACGACTGCGAGGACAAGAGCAAGTGCGCCTGCTGGCAGCTGACTCTGGAGGGCGCTAGGACGATAGGTCTGGAGGGGGAGAACGTCGGTTACGTTTATAGAAGACTAATGGAACCGCTCCCGACTGGTATTTACGAGTGCAACTCTAGGTGCAAGTGTAAAGACACTTGTCTTAACCGCGTCGCTCAATATCCACTTCAGCTAAATTTGCAGGTGTTCAAGACCCAGAACCGCGGTTGGGGCATTCGCACCCTGAATGACATACCCAAGGGGAGCTTCCTCTGTACTTACGCAGGGAAACTACTAACAGAGGCCACAGCTACCCTCGACGGTCTGAACGAGGGTGACGAGTACCTGGCGGAGTTGGACTACATCGAGGTCGTGGAACAGATGAAGGAGGGTTACGAAGAGGACATACCAGAGAACATCAAGAAGATGGATGAGGCTCAAATAGCGGAACAACTCTCGATGGCGGGCGAAGAAACACAGTCATCGTCTTCAGGGGAAAGCAGCCCCAAAAGCGCTGAAAATGACGACCTTAGCCTCGAAGACATTGGTCCGGGGGTCACAGAGTCCAGCAAAGAACTAAGGGGGAAAGACTCAAAGACAGACGAAGAAATAGAGAGTGCGGTGCTGAAAGTTACCGAGAGATTAGTGCCCACAGAAGAAGATGAAACAGTTTTCACAGAGGAACAGAAATCTGTTGTCATAGAAGTGGAAAGTTCAGTGCCCACGGAGGACGAACTCTCTGAAATGCAGGAGGAAATCGATGAAGATTATGATTCTTCGAGTGATGACGGAGAAGATCGAGAACCTTCGAATTTCTCAGCCAGTGCTGGGATGGGAGCAAAGAAGTTTAAATCAAAGTATAGGTCTGTCCGTAGTCTGTTTGGTGAAGATGAAGCCTGCTACATCTTGGACGCCAAGGTTCAAGGGAATATAGGCAGATATCTCAATGTAAGGGGATGTACAACACCGCGGTCCTTGTATGACCGGCAGAGTTGTGATCTGTATATAGAACTCAACGTCCTCGTTCCTGCAAAGACTCATTCCCCTGTCTCCGCCTCCCGCCAGCACTCGTGCGTGCCGAACGTGTTCGTCCAGAACGTGTTCGTGGACACGCACGACCCTCGCTTCCCGTGGGTGGCTTTCTTCGCTCTCACAGCCGTGCGGGCCGGGGGCGAGCTCACCTGGAACTACAACTACGACGTAGGTTCCGTGCCCGGGAAGGTCCTCTACTGTTACTGCGGGGCTCCGACGTGTCGCGGCAGACAGATAGCCAATTTCAAGGGCAACAAGGTTGTATTGTACTTGTCTCCGTGCGGTCGCCGCGTCCGCTCTCCGCGGGAGCTACATCGCTATCTGCG
AACCGTTGGTTCAGACCTGCCAGTCGACCTCTTCGACTTCACACCATCCACGCACTGTCTGGCCGAGTTTGTGCTCAACAAATACTGCGTCAGCAAGAAGGACTTGTCAAATGGCAAAGAGAACGTCCCAGTGGCTTGCGTCAATTACTACGACGGATCACTGCCAGAGTTCTGTTTCTACAACACTGAGCGGACTCCGACCGCTGGGGTTCCACTCAACCTGGACCCGGAGTTCCTGTGTGGCTGTGACTGCGAGGACGACTGCGAGGACAAGAGCAAGTGCGCCTGCTGGCAGCTGACTCTGGAGGGCGCTAGGACGATAGGTCTGGAGGGGGAGAACATTCTCAACTCATTCCCCATGTATCAGGTGTTCAAGACCCAGAACCGCGGTTGGGGCATTCGCACCCTGAATGACATACCCAAGGGGAGCTTCCTCTGTACTTACGCAGGGAAACTACTAACAGAGGCCACAGCTACCCTCGACGGTCTGAACGAGGGTGACGAGTACCTGGCGGAGTTGGACTACATCGAGGTCGTGGAACAGATGAAGGAGGGTTACGAAGAGGACATACCAGAGGACATCAAGAAGATGGATGAGGCTCAAATAGCGGAACAACTCTCGATGGCGGGCGAAGAAACACAGTCATCGTCTTCAGGGGAAAGCAGCCCCAAAAGCGCTGAAAATGACGACCTTAGCCTCGAAGACATTGGTCCGGGGGTCACAGAGTCCAGCAAAGAACTAAGGGGGAAAGACTCAAAGACAGACGAAGAAATAGAGAGTGCGGTGCTGAATGTTACCGAGAAATTTGTGCCCACAGAAGAAGATGAAACAGTTTTCACAGAGGAACAAAAATCTGTTGTCATAGAAGTGGAAAGTTCAGTGCCCACGGAGGACGAACTCTCTGAAATGCAGGAGGAAATCGATGAAGATTATGATTCTTCGAGTGATGACGGAGAAGATCGAGAACCTTCGAATTTCTCAGCCAGTGCTGGGATGGGAGCAAAGAAGTTTAAATCAAAGTATAGGTCTGTCCGTAGTCTGTTTGGTGAAGATGAAGCCTGCTACATCTTGGACGCCAAGGTTCAAGGGAATATAGGCAGATATCTCAATGTAAGGGGATGTACAACACCGCGGTCCTTGTATGACCGGCAGAGTTGTGATCTGTATATAGAACTCAACGTCCTCGTTCCTGCAAAGACTCATTCCCCTGTCTCCGTCTCCCGCCAGCACTCGTGCGTGCCGAACGTGTTCGTCCAGAACGTGTTCGTGGACACGCACGACCCTCGCTTCCCGTGGGTGGCTTTCTTCGCTCTCACAGCCGTGCGGGCCGGGGGCGAGCTCACCTGGAACTACAACTACGACGTAGGTTCCGTGCCCGGGAAGGTCCTCTACTGTTACTGCGGGGCTCCGACGTGTCGCGGCAGGTTACTGTGA

Protein sequence:

>DPOGS200458-PA
MLHISTTNHQGFPRQRAVAKKTTTKTRQSSRTAVQSLDHFTSKLVYYSPKKHVKPYKMVPHTCSTACKRTDVLELKDLKSYNPLAKPLLSGWERQIANFKGNKVVLYLSPCGRRVRSPRELHRYLRTVGSLDGQLEKLFTPSTHCLAEFVLNKYCVSKKDLSNGKENVPVACVNYYDGSLPEFCFYNTERTPTAGVPLNLDPEFLCGCDCEDDCEDKSKCACWQLTLEGARTIGLEGENVGYVYRRLMEPLPTGIYECNSRCKCKDTCLNRVAQYPLQLNLQVFKTQNRGWGIRTLNDIPKGSFLCTYAGKLLTEATATLDGLNEGDEYLAELDYIEVVEQMKEGYEEDIPENIKKMDEAQIAEQLSMAGEETQSSSSGESSPKSAENDDLSLEDIGPGVTESSKELRGKDSKTDEEIESAVLKVTERLVPTEEDETVFTEEQKSVVIEVESSVPTEDELSEMQEEIDEDYDSSSDDGEDREPSNFSASAGMGAKKFKSKYRSVRSLFGEDEACYILDAKVQGNIGRYLNVRGCTTPRSLYDRQSCDLYIELNVLVPAKTHSPVSASRQHSCVPNVFVQNVFVDTHDPRFPWVAFFALTAVRAGGELTWNYNYDVGSVPGKVLYCYCGAPTCRGRQIANFKGNKVVLYLSPCGRRVRSPRELHRYLRTVGSDLPVDLFDFTPSTHCLAEFVLNKYCVSKKDLSNGKENVPVACVNYYDGSLPEFCFYNTERTPTAGVPLNLDPEFLCGCDCEDDCEDKSKCACWQLTLEGARTIGLEGENILNSFPMYQVFKTQNRGWGIRTLNDIPKGSFLCTYAGKLLTEATATLDGLNEGDEYLAELDYIEVVEQMKEGYEEDIPEDIKKMDEAQIAEQLSMAGEETQSSSSGESSPKSAENDDLSLEDIGPGVTESSKELRGKDSKTDEEIESAVLNVTEKFVPTEEDETVFTEEQKSVVIEVESSVPTEDELSEMQEEIDEDYDSSSDDGEDREPSNFSASAGMGAKKFKSKYRSVRSLFGEDEACYILDAKVQGNIGRYLNVRGCTTPRSLYDRQSCDLYIELNVLVPAKTHSPVSVSRQHSCVPNVFVQNVFVDTHDPRFPWVAFFALTAVRAGGELTWNYNYDVGSVPGKVLYCYCGAPTCRGRLL-
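
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code applies to this nuclear gene; the helper `translate` and the variable names are illustrative and not part of the database. It translates the first 30 and last 18 nucleotides of DPOGS200458-TA and compares them with the two ends of DPOGS200458-PA. The record writes the stop codon as `-`, while the sketch writes it as `*`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt: str) -> str:
    """Translate a nucleotide string in frame 0; unknown codons become 'X'."""
    nt = nt.upper()
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


# First 30 nt and last 18 nt of DPOGS200458-TA, copied from the record above.
head_nt = "ATGCTACACATCTCTACCACCAACCACCAG"
tail_nt = "CGCGGCAGGTTACTGTGA"

head_aa = translate(head_nt)
tail_aa = translate(tail_nt)

print(head_aa)  # compare with the start of DPOGS200458-PA
print(tail_aa)  # compare with the end of DPOGS200458-PA ('*' marks the stop)

# The stated lengths are consistent: 1144 codons plus one stop codon.
print((1144 + 1) * 3)
```

To check the whole record, the same function could be applied to the full transcript after joining its lines and stripping whitespace.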