Monarch geneset OGS2.0

DPOGS201979
Transcript        DPOGS201979-TA  2310 bp
Protein           DPOGS201979-PA  769 aa
Genomic position  DPSCF300060 - 228191-232712
RNAseq coverage   293x (Rank: top 38%)
Annotation
Heliconius      HMEL002632        0.0    89.93%
Bombyx          BGIBMGA010551-TA  0.0    79.01%
Drosophila      l(3)mbt-PB        8e-32  33.00%
EBI UniRef50    UniRef50_F4WHY3   0.0    46.82%  Polycomb protein Sfmbt n=18 Tax=Coelomata RepID=F4WHY3_ACREC
NCBI RefSeq     XP_967817.1       0.0    46.78%  PREDICTED: similar to Scm-related gene containing four mbt domains CG16975-PB [Tribolium castaneum]
NCBI nr blastp  gi|350404772      0.0    47.19%  PREDICTED: polycomb protein Sfmbt-like isoform 1 [Bombus impatiens]
NCBI nr blastx  gi|383855458      0.0    47.32%  PREDICTED: polycomb protein Sfmbt-like [Megachile rotundata]
Group
Gene Ontology   GO:0005634  5.3e-38  nucleus
                GO:0006355  5.3e-38  regulation of transcription, DNA-dependent
                GO:0005515  3.5e-11  protein binding
KEGG pathway 
InterPro domain  [458-556]  IPR004092  5.3e-38  Mbt repeat
                 [688-756]  IPR013761  5.9e-14  Sterile alpha motif-type
                 [688-754]  IPR010993  3.5e-11  Sterile alpha motif homology
                 [694-754]  IPR021129  7.8e-08  Sterile alpha motif, type 1
Orthology group  MCL11273  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201979-TA
ATGGCTTTTTACGAAATTGGTAGCAATGTATCTTCACATAATATAACGGATTTGTCAAACGACAATAATATTTTAGAAGATTTGTCTATAATTCCTTTAGAGAAAGATAGTTTTGCAATATGTGAATTGTGCGGGCGAGTGGGTCGCCGCGGTCAGTTCTACGCTCGTAATAATAAGTTCTGTTCACTAAGATGTCATGCCAGTAATAATTTAAGAAGGAGGTATTGTAACTGGAGATTCAAGTTGATGGAAGGGGCCGAACATATGGCAGGACTAATCCCGCTGGAACCACTGCCACAGTTGCAGCACTGGCAAGCAAATTTTAATGATCTCCAAGCAGGACCTATTAAGCAATCTGCTGCATTAATAGCTAACAGTTATGAATGGAAGGATGAACTATTTGGGTGTGATTTCCTGGCTGCCCCTGTCAGTCTGTTCAAACATGCACCATTACATGAAATGTGGGATAACACTTTCGAAGGAATGAAAGTAGAAGTAAAGAACACAGACTGTGAAAATTACTCAGAAAAGCTACATGATTACTTCTGGGTAGCGACAGTTCTCAAGGTTAAAGGCTATATGGGTCTTCTAAGGTATGAAGGTTTTGGTAGTGATGATTCTAAGGACTTTTGGGTTAATCTTTGTTGTTCTGAAGTACATCCAGTTGGCTGGTGTGCAACACGTGGTAAACCACTAATACCTCCTCGCAGTATAGAGGACAAATATACTGACTGGAAAAAGTTTCTAGTGAAACAGCTCACAGGAGCTCGAACTTTGCCTGCAAATTTTTATACTAAATTAAATGACAGCCTTGTATCAAGGTTTTCTATAGGTTCAATCATGGAAGTAGTAGATAAGAACAGGATTTCCCAAGTTAAAGTGGCTAGTGTATGTGAAATAGTCGGAAAACGTTTACACATAAAATACTATGATAGTTCGCCGGAGGATAATGGCTTCTGGTGTCATGAAGACTCCCCATTGATTCATCCTGTGGGCTGGGCGTTCCGAGTAGGACATTTGTTAGATGCTCCACAAAGTTACTGTTCAAGAGTAGCGGCAGGACGTCTTTTGCCAAATGATACAACATCGGATATGTTCTACAAGTACCCTACAAATGAACCGCCCTTGTTTTCTGAGGGTATGAAACTGGAAGCGATAGATCCATTGAATCTGTCAGCGGTTTGTGCGGCGACTGTGATGCAAATTTTGAATGAAGGCTATATGATGATTAGGATAGATTGCTATCCCGCGGACGCATCGGGCGCGGACTGGTTTTGTTATCACCAACGATCGCCTTGCATATTCCCAGTGGGATTCGCTTTAGCAAATAATATCACCTTGGTGCCTCCGGCTGGCATGAGTAGGGAGCAATTTAGATGGGATCAATATTTAAGTGAATCCGGATGTGTGGCAGCTTCGAGATCTCTCTTCTCAGCCAGAGGTCACGTTGTGTCACACGGGTTCGTGGCAGGCATGAGATTGGAGTGCGCCGACCTAATGGACCCGCGTCTCGTCTGCGTTGCCACGGTCGCGAGGGTTGTTGCTGATTTACTCAAGGTTCATTTCGACGGTTGGGGTGGCGAGTACGACCAGTGGCTATGGGCTCACAGTACTGACGTGTATCCTGTGGGATGGTGTAGGGCTGTGGGTCACCGTTTGGAGGGACCTTTACAACCACCACGAGCACGCCGCCCTCCAGCCCGTCAGCCGAGACCCACTCGCAGGAAACGAAGACCTGCGGCAAAGGTGTCTCATCCACCAAACACTACAGAGGAAAGTTCACAGAATTCGGATATGGGTGAAACAAAAAGTCTTTCACCGCCGACTAGTATGTCTGAATCAGCTGATACAGAACGTTCTGAAGATGTTTCTGTTAAGATGGATGTTGATACAGAAGCCTCGGATAGTAGATTGGCAGATACAAGCCAGGATATTATGACGGGAACAACCACAGACAATTCTCAGGATGTGAATGAATCATCACCCAACGATTCCCTCAA
AAGTGTAGATGACAAAGTCATACCCAGGCTAGTCAATACCAGTGTGCCCTTAGATGTTCTTAATTCTGTTGACCCAGAAAATTGGACCAGCGCTGATGTAGCAAAATTCCTTACAGTAAACGACTGCCAAACCTACTGTGTTAACTTTAACCAAATAACTGGACCTATGATGCTGCAACTGTCCAAAGATGAAATCATTGAACTCCTAGAAATGAAAGTAGGACCGTCACTAAAAATTTTCGACCTAATACAACAGCTGAAGTGCAAAATAAAACAACCGCAGTGTAGATTACTTGGAAGTTTTAAGTAG

Protein sequence:

>DPOGS201979-PA
MAFYEIGSNVSSHNITDLSNDNNILEDLSIIPLEKDSFAICELCGRVGRRGQFYARNNKFCSLRCHASNNLRRRYCNWRFKLMEGAEHMAGLIPLEPLPQLQHWQANFNDLQAGPIKQSAALIANSYEWKDELFGCDFLAAPVSLFKHAPLHEMWDNTFEGMKVEVKNTDCENYSEKLHDYFWVATVLKVKGYMGLLRYEGFGSDDSKDFWVNLCCSEVHPVGWCATRGKPLIPPRSIEDKYTDWKKFLVKQLTGARTLPANFYTKLNDSLVSRFSIGSIMEVVDKNRISQVKVASVCEIVGKRLHIKYYDSSPEDNGFWCHEDSPLIHPVGWAFRVGHLLDAPQSYCSRVAAGRLLPNDTTSDMFYKYPTNEPPLFSEGMKLEAIDPLNLSAVCAATVMQILNEGYMMIRIDCYPADASGADWFCYHQRSPCIFPVGFALANNITLVPPAGMSREQFRWDQYLSESGCVAASRSLFSARGHVVSHGFVAGMRLECADLMDPRLVCVATVARVVADLLKVHFDGWGGEYDQWLWAHSTDVYPVGWCRAVGHRLEGPLQPPRARRPPARQPRPTRRKRRPAAKVSHPPNTTEESSQNSDMGETKSLSPPTSMSESADTERSEDVSVKMDVDTEASDSRLADTSQDIMTGTTTDNSQDVNESSPNDSLKSVDDKVIPRLVNTSVPLDVLNSVDPENWTSADVAKFLTVNDCQTYCVNFNQITGPMMLQLSKDEIIELLEMKVGPSLKIFDLIQQLKCKIKQPQCRLLGSFK-
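
Consistency check (illustrative sketch, not part of the original record): the listed transcript length (2310 bp) and protein length (769 aa) agree if the transcript is a complete coding sequence that ends in a stop codon, since 2310 / 3 = 770 codons = 769 residues + 1 stop. The Python sketch below assumes the standard genetic code and uses a hypothetical helper named `translate`. It writes the stop codon as "-", matching the trailing "-" in the protein record above. Only the first and last few codons of DPOGS201979-TA are translated here.

```python
# Minimal sketch, assuming the standard genetic code (NCBI translation table 1).
# `translate` is a hypothetical helper written for this illustration.
from itertools import product

BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
# Map each codon (first, second, third base in TCAG order) to its amino acid.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, stop: str = "-") -> str:
    """Translate a coding sequence in frame 0; stop codons are written as `stop`."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(stop if aa == "*" else aa)
    return "".join(protein)


# Lengths as listed in the record above.
transcript_bp = 2310
protein_aa = 769
codons = transcript_bp // 3
length_consistent = transcript_bp % 3 == 0 and codons - 1 == protein_aa

# First 30 and last 15 nucleotides of DPOGS201979-TA, copied from the record.
head = translate("ATGGCTTTTTACGAAATTGGTAGCAATGTA")
tail = translate("GGAAGTTTTAAGTAG")
print(length_consistent, head, tail)
```

Passing the full nucleotide sequence (with line breaks removed) to `translate` would allow a residue-by-residue comparison against DPOGS201979-PA.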