Monarch geneset OGS2.0

DPOGS207038
Transcript: DPOGS207038-TA  4488 bp
Protein: DPOGS207038-PA  1495 aa
Genomic position: DPSCF300001 + 1734486-1745421
RNAseq coverage: 265x (Rank: top 40%)
Annotation
Heliconius: HMEL009408  0.0  73.34%
Bombyx: BGIBMGA012978-TA  0.0  67.92%
Drosophila: CG40351-PC  4e-140  46.49%
EBI UniRef50: UniRef50_D2A453  3e-141  52.63%  Putative uncharacterized protein GLEAN_15826 n=1 Tax=Tribolium castaneum RepID=D2A453_TRICA
NCBI RefSeq: XP_002044683.1  1e-142  44.88%  GM18767 [Drosophila sechellia]
NCBI nr blastp: gi|383849246  4e-142  55.64%  PREDICTED: uncharacterized protein LOC100875701 [Megachile rotundata]
NCBI nr blastx: gi|189238620  0.0  42.93%  PREDICTED: similar to CG40351 CG40351-PC [Tribolium castaneum]
Group
Gene Ontology:
  GO:0018024  1.8e-151  histone-lysine N-methyltransferase activity
  GO:0005515  1.2e-33  protein binding
  GO:0003676  1.6e-08  nucleic acid binding
  GO:0000166  2.2e-08  nucleotide binding
KEGG pathway: dse:Dsec_GM18767  3e-142
  K11422 (SETD1, SET1)  maps-> Lysine degradation
InterPro domain:
  [69-1495]  IPR015722  1.8e-151  Histone-lysine N-methyltransferase
  [1356-1479]  IPR001214  1.2e-33  SET domain
  [95-163]  IPR000504  1.6e-08  RNA recognition motif domain
  [94-166]  IPR012677  2.2e-08  Nucleotide-binding, alpha-beta plait
  [1479-1495]  IPR003616  6.2e-06  Post-SET domain
Orthology group: MCL11587  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS207038-TA
ATGAATGGAGGAATGGAGCACAAAACACCAGGCCACAATGCAGTTCTTCACAAGGGACCTAAAAATTATAAACTTCTAATAGACCCATTCCTAGTAAAGGGAGCGACGAAAGTGTACAGATATGATGGTACTGTTCCTGGTACGTCTTACCCGTCGATACAATGCAGAGACCCGAGACCTCAACCGTCCAGAATATGGAATAAATTAGAACCAGCAGATTTACCGATACCTAGGTTTAGAATTGATAAGAATTATGTTGGTGTTCCGCCACAGTTAGAAATAACTATTGTAAATTTGAATGATAACATCGATAAAGCTTTCTTGTCCGACATGATGAATAAAGTTGGACCTTATGAAGAATTGACAATATTTTATCATCCGATGACTAATAGGCATTTAGGTTTTGCTAGAATTGTATTTCAGGATGTCAAATATTCCAAAATATGTATCGAAAAATATAATGGAAAATCTGTCATGGGGCAGGTACTTGAAGTTTTCCATGATTCATTTGGTAAGAAGTGTCAGGAGATGTTTGAAGATAAAACGGTGGAGAAAAAGCCCCAGCCGGCTCCGATCAAGCCTCCCGAGGATGCCCGAGTGGCCAAGCTAGATCCAGCTCTCAGCAAGAGATTAGAGGATAGCAAACTAGTTGATAAGGACCCATACCTTCGCAAGGAGCTGGAACACAATGACAGCAACACAAGATGGTCAGATGATGAAAGGGACAGAGAGTACAAACATCGCCTCCGCAGTAGAAGTGAGAGAGACAGAGATATTGACAGAGATAGAGATGGCAACAGAGAGAGAGAAAGACACAGAGACCGCTATGCCCGTACCAGTGAACAGAGTGAGTCCTATTCCAGCGCCCACGTTGAGATACCGTACGCTCCAACGCCAGTACCATACGACCCATACTACCAGACCCCTGGCTATGGATACGGCTACGGCACGAGCGCCGGCGCTGTTTGGTGGGGGGACTGGAGACAGCCGCACACCTCTCACCATCACTCACATATATTCCTCAAGTCGGAGCAGAGCAGCAGCAGCACGTGGACGGCCGCGGGCGAGCCGACCCCCTCCCCGCGGCACACACCCCTCGCGGCCCCTCAAACCCCGTACACGCCCGCGCCCCCCACTCCGCTCCCGGAAAAGGAGGTGAAGTGTAAGCCCGAGGAGCCTCTTCCTCCGAGTACGTCCGTCGTCAGCGACCCTGAGCCCAAGCCGCCTCCGCCCAGCGACGAACCCAAGAACGTCGACCTTGACACAAGGATAGCTTTGTTGCTGAAGGGAGCCAGCGGCGGAGGGGGTCTGGCGCCTCCCTTCCTGTCTCTGGGAATGACCTCTGAAGAGGAAGACGAAGACAGGAAGCCTAGGAACATACCTGACCTGGACACACACAACCCGCCGTCAGACGATGAAGGTTCTGTAAGTGAGGATAGAGAGAGTATAATATCATTGAACCAGAATAGAGAAGTTAACCCGGAACCGTTGTCTAATACTCCTTCACCATACCTATCAAGAGAGTTCTATCTTGAATGTCTTAAAGCGACGGTCGAGAGGAAAGCAAAAGAAGAGGAACGCAAAAAGTTCCCGCCAATAGACAAAATAGGTTCCGATATATCGTCGTCTGAAGACGAACTGCTCACTGGGGAGGAACCACGACGTTCACCTGTTAACCCGCCCGATAGAGATCAAGATAATTTGGACGATGATCAGATGTCTTTATCGTCTCTGTCGTCAACGGAGGCCAAAATCGAGGAGCAAGTCCCCGCTGAGGCGTATTACTACCCGCCCGCACACCCGCATTACTATCAGGCGATGTGGCCGACAACTGCCTATCCTCCGGGTGCTGTGGGCGCTATGGGGTCGGTTGCCGCTATGGGGCCCGTGGGAGCGGTGGGCCCAGTGGGCGCGGTGGGCGCTGCATACCCAGCCGCCGGCGACATGTCATTGTATGCTGGTGGCTTCGCTCCTCCCGTGATACACAGCTACCCTCCACC
ACGCACCGTCACTCACGAGGAGCTCGACAACCCTTACTACCCAACAATCAATAGTGTGATAGAGCGCGTCACGACTGAGCTTAAACAAATACTTAAGAAGGATTTCAATAAGAAAATGATAGAGAGCACTGCCTTTAAGAACTTCGAGGTTTGGTGGGACGAGCAGAGTCGGAAGACGAGACAGACTGTGAAACAAACTAAAGAAGATGTCGGACAACCATTACAAGATGTATCAAATAAGAAGGAGGAATCGGTGGATTCAATAAAATCTATAATGGAGTCTAGAGATCTGGGTCTAGATCTAGGCGGGTACAGTGTTGGTATTGGTCTTGGTCTCAGGGCGACCATACCAAAGATGCCCAGTTTCAGGAAGAAGAGAAAAATACCTTCGCCTGTTGTTATGGACGAGGACTCCAGTAAGAGACTGAGTGATCAGGAGGAAATCGTCCAGAACTCTGACGAAGAGAAGGAAGTACCGACCAGTCCTCGGAATAGGACAACAGGTTCATACCTCTCGACTGGCAGAAGAAGACAGTCGAGCAGCTCATCGAGGTCTTCGTCGTCGTCGTCTTCGCGATCCTCTTGGTCGGGTTCGGAGCGCTCTGTGAGGAAGGTCGCCCCAAGAATATACTCCGACACAGACGACTCGGACCTCGAAGACGCTGAAGTGCAGCAAATCAAGTTGGTGTCCAACAAGGAGAGACTCAGGCGAGTGTACTCATCGTCATCGGACAGCGAGGAAGAACAGAGAAGAAGAGAAAAAACTCCGATACCGGAAGTGGAAGCATCAGACGACCGCCTCGGTTCACCTATTCTGTCGCCGGAAGAGGAACCCAGAGATACAATACTCGATCGTGTATACTCGGACTCTGAGGAAGAGAGGGAATACCAGGAGCGTCGCCGTCGTAACACTGAGTATATGGAACAGATCGAACGAGAGTTCCTCGAGGAACAGCGTCGAGGACAACAGACCTCTGACACGGATGCACAGCAACAAAACGATAGTATAACGGAACCGAAACAAGAAAAGAGTAGAAGTAGTCCAAGCAAGAACTATCTGAAATCACCTGAGAAGAATAAAATGGCAGCTGAGGGTGATGTTGAAGAAGGTGAAATAAGTTCAGAGGAAGAACCTTTAGAAGTTAGAAGAAAGAAGGAGAAGAAACAGAAAAAGAAGACTGACAAGAGACGAAGAGTCACATCCGTCAGCGACCACAGCTTTACCGAGTCTGCGGTTAGCGTGAATGGCGTTAAGGAGGCTAGCGGTGCTGTGTCGGAGACGTCTTCGCCTCAGTCGCAGGCGTCCCAGGCTTCTCAGGCGTCCCAAGTGGCGTTGGACCACTCGTACTGTCGCCCGCCGCCCACTGAACGACCTACCACTACACATCTACAACACGATCACGGTTACACTTGGATGGCTGAACCGGAACCGGAAGCAGAATCGCCACCAGTCGCCATGGAAGAGAAGAGACGGGAGAAGACGGAGAGACCGTACAAAAGAAAACATCAGAATAAAAAGTTATCTGAAATTCAGAATAAATTATACGACGGTCGCGATGATTATAACAATAAGTACTCGTCCGTGACATTCAAGCAGCGCGATATAATGGCGGAGGTCCAAGTGATGTACGAGTTCCTCACCCGCGGTATAGACAGAGAGGATATAGAGTACCTCAGGCGGGCGTACGAGGCTCTGTTGGCGGAGGATGCTCAGGGGTACTGGCTCAATGACACCCACTGGGTCGAACATCCGCCCACTGACCTCACGTACTCACCGCCCAAGAAGAAATCCAAGCGATACAATAACATCTACGAGGACTTGCAAGGCCACTCGAGCGGTTCAGCACGTACGGAGGGCTACTACAAGATGGACGCCAAGTTGAAGGCGAAGTACAAGTATCACCACGGAAGAACCGCTGCGTTACCCCCGCCTGATGATAAGAAAGCCAGCAAGATGCAGCTGCTGTCGAGAGAGGCGCGCTCCAACCAGAGGAGACTGC
TCACGGCATTCGGAACCGACACTGACTCAGATCTCCTCAAGTTCAATCAGCTCAAGTTCAGGAAGAAACAGCTCAAGTTCGCCAAATCTGGTATACACGACTGGGGTCTCTTTGCTCAGGAGGCGATAGCGGCGGACGAGATGGTTATCGAATACGTCGGTCAAATGGTCCGTCCCATAGTAGCGGATGTCCGCGAGGCTCACTACGAGGCCACTGGCATCGGTTCTTCATATCTGTTCCGTATAGACTTGGACACTATTATTGATGCAACCAAGTGCGGTAACCTGGCGCGTTTCATCAACCACAGCTGCAATCCAAACTGTTACGCAAAGATAATAACTATAGAATCACAGAAGAAAATCGTCATATACTCGAAACAGCCCATAGGAGTCGACGAGGAGATAACCTACGACTACAAGTTCCCTCTCGAAGACGAGAAGATACCTTGCCTGTGTGGAGCGCCGCAATGCCGTGGCTACCTTAACTAG

Protein sequence:

>DPOGS207038-PA
MNGGMEHKTPGHNAVLHKGPKNYKLLIDPFLVKGATKVYRYDGTVPGTSYPSIQCRDPRPQPSRIWNKLEPADLPIPRFRIDKNYVGVPPQLEITIVNLNDNIDKAFLSDMMNKVGPYEELTIFYHPMTNRHLGFARIVFQDVKYSKICIEKYNGKSVMGQVLEVFHDSFGKKCQEMFEDKTVEKKPQPAPIKPPEDARVAKLDPALSKRLEDSKLVDKDPYLRKELEHNDSNTRWSDDERDREYKHRLRSRSERDRDIDRDRDGNRERERHRDRYARTSEQSESYSSAHVEIPYAPTPVPYDPYYQTPGYGYGYGTSAGAVWWGDWRQPHTSHHHSHIFLKSEQSSSSTWTAAGEPTPSPRHTPLAAPQTPYTPAPPTPLPEKEVKCKPEEPLPPSTSVVSDPEPKPPPPSDEPKNVDLDTRIALLLKGASGGGGLAPPFLSLGMTSEEEDEDRKPRNIPDLDTHNPPSDDEGSVSEDRESIISLNQNREVNPEPLSNTPSPYLSREFYLECLKATVERKAKEEERKKFPPIDKIGSDISSSEDELLTGEEPRRSPVNPPDRDQDNLDDDQMSLSSLSSTEAKIEEQVPAEAYYYPPAHPHYYQAMWPTTAYPPGAVGAMGSVAAMGPVGAVGPVGAVGAAYPAAGDMSLYAGGFAPPVIHSYPPPRTVTHEELDNPYYPTINSVIERVTTELKQILKKDFNKKMIESTAFKNFEVWWDEQSRKTRQTVKQTKEDVGQPLQDVSNKKEESVDSIKSIMESRDLGLDLGGYSVGIGLGLRATIPKMPSFRKKRKIPSPVVMDEDSSKRLSDQEEIVQNSDEEKEVPTSPRNRTTGSYLSTGRRRQSSSSSRSSSSSSSRSSWSGSERSVRKVAPRIYSDTDDSDLEDAEVQQIKLVSNKERLRRVYSSSSDSEEEQRRREKTPIPEVEASDDRLGSPILSPEEEPRDTILDRVYSDSEEEREYQERRRRNTEYMEQIEREFLEEQRRGQQTSDTDAQQQNDSITEPKQEKSRSSPSKNYLKSPEKNKMAAEGDVEEGEISSEEEPLEVRRKKEKKQKKKTDKRRRVTSVSDHSFTESAVSVNGVKEASGAVSETSSPQSQASQASQASQVALDHSYCRPPPTERPTTTHLQHDHGYTWMAEPEPEAESPPVAMEEKRREKTERPYKRKHQNKKLSEIQNKLYDGRDDYNNKYSSVTFKQRDIMAEVQVMYEFLTRGIDREDIEYLRRAYEALLAEDAQGYWLNDTHWVEHPPTDLTYSPPKKKSKRYNNIYEDLQGHSSGSARTEGYYKMDAKLKAKYKYHHGRTAALPPPDDKKASKMQLLSREARSNQRRLLTAFGTDTDSDLLKFNQLKFRKKQLKFAKSGIHDWGLFAQEAIAADEMVIEYVGQMVRPIVADVREAHYEATGIGSSYLFRIDLDTIIDATKCGNLARFINHSCNPNCYAKIITIESQKKIVIYSKQPIGVDEEITYDYKFPLEDEKIPCLCGAPQCRGYLN-
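
Consistency check (a minimal sketch, not part of the original record): the stated lengths agree if the 4488 bp transcript is read as a coding sequence of 1495 codons plus one stop codon, since 3 x (1495 + 1) = 4488. The Python below assumes the standard genetic code and assumes that the trailing "-" in the protein sequence marks the stop codon. The function name translate and its stop_symbol parameter are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the record uses the standard nuclear code).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence; stop codons are written as stop_symbol."""
    cds = "".join(cds.split()).upper()  # drop line breaks inside the FASTA body
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Length check using the values stated in the record.
transcript_bp = 4488
protein_aa = 1495
assert transcript_bp == 3 * (protein_aa + 1)

# First and last codons of DPOGS207038-TA, copied from the sequence above.
print(translate("ATGAATGGAGGAATGGAGCAC"))  # MNGGMEH
print(translate("CGTGGCTACCTTAACTAG"))     # RGYLN-
```

Passing the full DPOGS207038-TA sequence (with its line breaks) to translate should reproduce the DPOGS207038-PA sequence, including the final "-", if the assumptions above hold.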