Monarch geneset OGS2.0

DPOGS200578
Transcript: DPOGS200578-TA, 3513 bp
Protein: DPOGS200578-PA, 1170 aa
Genomic position: DPSCF300303 + 150321-157985
RNAseq coverage: 20x (Rank: top 79%)

Annotation (source, hit, E-value, identity, description)
Heliconius      HMEL016949        0.0     67.72%
Bombyx          BGIBMGA002246-TA  0.0     59.27%
Drosophila      Mes-4-PA          2e-148  46.90%
EBI UniRef50    UniRef50_D6WZP0   0.0     46.84%  Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6WZP0_TRICA
NCBI RefSeq     XP_973711.1       0.0     46.84%  PREDICTED: similar to NSD1 [Tribolium castaneum]
NCBI nr blastp  gi|91090902       0.0     46.84%  PREDICTED: similar to NSD1 [Tribolium castaneum]
NCBI nr blastx  gi|270014006      0.0     42.43%  hypothetical protein TcasGA2_TC012700 [Tribolium castaneum]

Group

Gene Ontology (term, E-value, name)
GO:0005515  4.8e-40  protein binding
GO:0005634  1.3e-13  nucleus
GO:0018024  1.3e-13  histone-lysine N-methyltransferase activity
GO:0008270  1.1e-07  zinc ion binding

KEGG pathway
tca:662527  0.0
K11424 (NSD1_2) maps-> Lysine degradation

InterPro domain (protein position, accession, E-value, name)
[921-1044]  IPR001214  4.8e-40  SET domain
[732-794]   IPR000313  5.5e-15  PWWP
[870-920]   IPR006560  1.3e-13  AWS
[670-745]   IPR011011  4.5e-11  Zinc finger, FYVE/PHD-type
[682-731]   IPR013083  3.4e-08  Zinc finger, RING/FYVE/PHD-type
[687-727]   IPR001965  1.1e-07  Zinc finger, PHD-type

Orthology group: MCL10357, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species
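
The lengths reported in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that the record does not state: that the 3513 bp transcript is coding sequence only (three bases per residue plus one stop codon), and that the genomic coordinates are 1-based and inclusive. The variable names are hypothetical.

```python
# Values copied from the record above.
transcript_bp = 3513
protein_aa = 1170
genomic_start, genomic_end = 150321, 157985

# Assumption: the transcript is CDS only, so its length should be
# three bases per residue plus one stop codon.
expected_cds_bp = 3 * (protein_aa + 1)
cds_matches = expected_cds_bp == transcript_bp

# Assumption: coordinates are 1-based and inclusive.
genomic_span_bp = genomic_end - genomic_start + 1

# Bases in the genomic span that are not part of the transcript
# (presumably intronic, if the assumptions above hold).
non_transcript_bp = genomic_span_bp - transcript_bp

print(cds_matches, genomic_span_bp, non_transcript_bp)
```

Under these assumptions the transcript and protein lengths agree, and the genomic span is longer than the transcript, which would be consistent with a multi-exon gene model.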

Nucleotide sequence:

>DPOGS200578-TA
ATGGAATTATGTGAAAAGGATAATATAAATCAAGAATTGGATAGAGTTGTAGAAAATATTTCACCAGACATAGAAACTATAAGACGTCGAAAGCGTTGTTTGAATGTGCCTTTAAACAAATCTATCGATATTGTTACGAAAACAAGTGAAGATAAAACGAATATCGACAAAGAAGAGAATTGTGACAATAATTTAAATAAAACTGCCGATGAATTACCTCAAAATGTTAGCAATAACGAAAAAGTAGATAGTATGGAAAATGATCGAGAAATATTGGCAATAAATGATATGCGGAACCCATGCGGCGATAATGATAATGTAGATTTGGAAACTAATGAAGAATCTAGTAGCACTACTCAGGTTCATACATCCGAAGAAAACGAAAATCAAAATACAACAAAGACTGAATTAACTATAAATACTAATACAAACAGCTCGGATGAGGAAGCTGAGGGTAAAAATCCCATCATAACAGATGAGCTATCAGATCAAAATGATGGGAAAGCAGATGAGATTAAAGAAATGGAGATGGAAAAAGATAATGATAATGTATCAGTTGTATCAGAGGGCAGCGATATATCAAGAAAGAAACGCGCCAGAGACAAGCCGTCTGATAAGAAATCTTTGTTATCTGATGTAGAATTCCTGAAATATCTGGAATTGAGACAGGATGCGGTCATAGACGAGCATCCCGAGCTCTCGCAGGAAGACATCACTAGCTATTTATACAAAACCTGGATATACGAGGAAAATTTGAAACCAGATATAAAGAAATGCGATGACATAGACCAAGCTAATTTAGTGAAGGGTTTGAACTTAGACCCAGCGCCGGTCAAAAAAGTCAGGAAGAGGGTTAAAGTTGACAAAGAGATTGCGTGCGAGGACACCGCTACCAAAGAGAAATCTAAAAGAAAAATAATACGCCCATACTATAAAGAGGAATTTTCAGACGGGGATGATAGTGTGGAATATTTTGATATATTTAAATCTAAAAAGGACCAAAAAGGAGCAATTGTGGACAGTAAAGAAATATATCAGAGCGACGGAACTGTCCTCGAACGTATTATAAACGTAGACGAGTACGTTCAGGACGAATACGATGACGTCGAAGAATACTTCAGACAGCTAACAGCGCCTAAACCTAACGTCTTTAAGGGTTACGCGAGGGAAAAGGTGTGCGAAATATGCGAGAAGGTCGGCGGCTTAGTCAAGTGCAAGGGTTGCCATTCGATGTTTCATGTGGAATGTGACAAGAAGGAAATCGAGGTTATAGAATGCCAGACGCCAACAAGAGGCAGGAGGAGGAAGAAGAAAACTAGAGGAAGGAAGACCAGGGACGATCACAACCAAGACTCCGGCAGCGACGAGAAGTCGCAAGACACCAACGGCTCGGACGAATTACATATGTCGCTGGAAGAAGAATCTCATATAATAGCAAATGCGGACGATTTTGAGGCTCAAATGTCCGTAAGAATGCAAGAAATACTCAAGGATCAGGACATTCAGTACGATTTCTATTCACGCGAGGAGCTGGATTGGAACGACACTCACGCGGGCGAATGTAAGGTCGTGGACATAAAGCCGAGGATGGATTCGATAGAAATAACGGATTATTCGGAATTCAAATGCAAGAACTGCCAGAAATACGATCCGCCGGTATGTTTCGTGTGTAAATATCCTATATCGCCCAAAGAGAAACAGGGTCACAGGCAGAAATGTCAAGTGGCTCATTGCAATAAGTATTACCACTTGGAATGCTTGGACCATTGGCCCCAAACACAATTCAACGGGGGAGAAATTTCTAGAACGAATAAGTTCAGCGAAGCCCTAACTTGCCCGAGGCACGTGTGCCACACTTGTGTCTGTGACGATCCCAGGGGTTGTAAGACGAGATTCAGCGGTGATAAATTAGCGAGATGCGTTCGCTGTCCGGCCACTTACCACACATTCACGAAATGTCTACCGGCTGGGTCACAGATACTGACCGCCTCCCATATAAT
ATGTCCACGACATTATGAACACAGGCCTGGCAAAGTCCCCTGCCACGTGAACACCGGCTGGTGTTTCATATGCGCCCTGGGCGGATCTCTGATATGTTGTGAATACTGCCCGACGTCCTTTCACGCTGAGTGCCTTAATATTAAACCTCCTGAGGGTGGTTATATGTGCGAGGACTGTGAGACTGGTAGACTACCGCTGTACGGAGAAATGGTGTGGGTGAAGCTAGGACACTACAGGTGGTGGCCAGGTATAATTCTTCATCCGTCTGAGATTCCAGACAACATCCTAACCGTGAAACATACCCTCGGTGAATTTGTGGTCAGATTTTTTGGACAATACGACTACTACTGGGTCAATAGAGGCAGAGTGTTCCCGTTCCAAGAAGGTGATTCGGGTAAAGTTTCTAGTCAGAAATCCAAGATAGATGCAGCATTCACTATGGCGATGGAGCACGCACAAAGAGCTTGTTCGATTTTGAAAATGGCTGCGCCGAATGAAGAAGAGTCTTCTGACATAGCATCTTCATTGTTACCACCTCATTATGTTAAATTGAAGGTGAATAAACCTTGCGGGTCACTCTGCGGCAAGAAAATAGATTTAGAGGAAAGTTCATTGACCCAGTGCGAATGTGACCCTAATGATGTCGATCCTTGCGGTCCCTATACTCAATGTCTCAATAGAATGCTTCTAACTGAGTGCGGTCCGACGTGTCGCGCCGGAGATCGCTGTAACAACAGAGCGTTCGAGAAACGTCTTTACCCCAGGCTGGGACCCTATCGCACCCCGCATAGAGGCTGGGGGCTACGGACCATGCAGGATTTAAGAGCTGGCCAGTTCGTTATAGAGTATGTGGGGGAGCTGATAGACGAGGAGGAGTTCAGACGTCGCATGAACAGGAAACACGAGGTCCGGGATGAGAACTTCTATTTTTTAACGTTGGACAAAGAGCGCATGATAGACGCCGGGCCGAAAGGGAATCTGGCGAGGTTTATGAATCATTCCTGTGAGCCTAATTGCGAAACACAAAAGTGGACGGTGTTGGGCGACGTGCGTGTGGGATTGTTCGCGTTACGTGACATACCGGCAAACAGCGAGCTCACATTCAACTATAACCTGGAGACGTCGGGTATTGAGAAGAAAAGATGTATGTGTGGAGCCAAGAGGTGTTCAGGATATATAGGGGCTAAGCCTAAACAGGAGGACCAACCAAAGAAAATCAAGCCGCAGGTGAAAAGGATTTACAGGAAGCGCAAAGCGGAAGAATCGCCGTCTACGAGCCAGTACAAGAAACGAGGCAGACCCATAAAACCGCGAGAGCTGACCGAAATAGAAAAAGATCTTTTAATCATCAAAAATGCGACCAACGGCCTGTCTAGCGATTCAGAGTGCTCCAGGATAAGCATGGACAGCTGCAAAGATATAAAGGCGCTCAAAAGGAAAAGAATCAACCTGTCCACCGAGGAGTTGTCCCCGAAGAGGTCTAAGACGGATGAAATGAATTTGGTTTATTGA

Protein sequence:

>DPOGS200578-PA
MELCEKDNINQELDRVVENISPDIETIRRRKRCLNVPLNKSIDIVTKTSEDKTNIDKEENCDNNLNKTADELPQNVSNNEKVDSMENDREILAINDMRNPCGDNDNVDLETNEESSSTTQVHTSEENENQNTTKTELTINTNTNSSDEEAEGKNPIITDELSDQNDGKADEIKEMEMEKDNDNVSVVSEGSDISRKKRARDKPSDKKSLLSDVEFLKYLELRQDAVIDEHPELSQEDITSYLYKTWIYEENLKPDIKKCDDIDQANLVKGLNLDPAPVKKVRKRVKVDKEIACEDTATKEKSKRKIIRPYYKEEFSDGDDSVEYFDIFKSKKDQKGAIVDSKEIYQSDGTVLERIINVDEYVQDEYDDVEEYFRQLTAPKPNVFKGYAREKVCEICEKVGGLVKCKGCHSMFHVECDKKEIEVIECQTPTRGRRRKKKTRGRKTRDDHNQDSGSDEKSQDTNGSDELHMSLEEESHIIANADDFEAQMSVRMQEILKDQDIQYDFYSREELDWNDTHAGECKVVDIKPRMDSIEITDYSEFKCKNCQKYDPPVCFVCKYPISPKEKQGHRQKCQVAHCNKYYHLECLDHWPQTQFNGGEISRTNKFSEALTCPRHVCHTCVCDDPRGCKTRFSGDKLARCVRCPATYHTFTKCLPAGSQILTASHIICPRHYEHRPGKVPCHVNTGWCFICALGGSLICCEYCPTSFHAECLNIKPPEGGYMCEDCETGRLPLYGEMVWVKLGHYRWWPGIILHPSEIPDNILTVKHTLGEFVVRFFGQYDYYWVNRGRVFPFQEGDSGKVSSQKSKIDAAFTMAMEHAQRACSILKMAAPNEEESSDIASSLLPPHYVKLKVNKPCGSLCGKKIDLEESSLTQCECDPNDVDPCGPYTQCLNRMLLTECGPTCRAGDRCNNRAFEKRLYPRLGPYRTPHRGWGLRTMQDLRAGQFVIEYVGELIDEEEFRRRMNRKHEVRDENFYFLTLDKERMIDAGPKGNLARFMNHSCEPNCETQKWTVLGDVRVGLFALRDIPANSELTFNYNLETSGIEKKRCMCGAKRCSGYIGAKPKQEDQPKKIKPQVKRIYRKRKAEESPSTSQYKKRGRPIKPRELTEIEKDLLIIKNATNGLSSDSECSRISMDSCKDIKALKRKRINLSTEELSPKRSKTDEMNLVY-