Monarch geneset OGS2.0

DPOGS211699
Transcript: DPOGS211699-TA, 3360 bp
Protein: DPOGS211699-PA, 1119 aa
Genomic position: DPSCF300374 + 64216-92062
RNAseq coverage: 345x (Rank: top 34%)
Annotation
  Heliconius      HMEL002331        0.0     70.72%
  Bombyx          BGIBMGA011397-TA  0.0     60.10%
  Drosophila      CG3328-PA         3e-136  63.76%
  EBI UniRef50    UniRef50_D2A4J2   1e-145  49.17%  Putative uncharacterized protein GLEAN_15354 n=2 Tax=Tribolium castaneum RepID=D2A4J2_TRICA
  NCBI RefSeq     XP_968063.1       1e-145  49.75%  PREDICTED: similar to CG3328 CG3328-PA [Tribolium castaneum]
  NCBI nr blastp  gi|270008766      4e-145  49.17%  hypothetical protein TcasGA2_TC015354 [Tribolium castaneum]
  NCBI nr blastx  gi|350403487      1e-144  51.26%  PREDICTED: myelin gene regulatory factor-like [Bombus impatiens]
Group:
Gene Ontology:
  GO:0003677  2.7e-26  DNA binding
  GO:0006355  3.7e-15  regulation of transcription, DNA-dependent
  GO:0003700  3.7e-15  sequence-specific DNA binding transcription factor activity
KEGG pathway:
InterPro domain:
  [301-445]  IPR024061  2.7e-26  NDT80 DNA-binding domain
  [289-447]  IPR008967  3.7e-15  p53-like transcription factor, DNA-binding
Orthology group: MCL14029 (Multiple-copy universal gene)

Nucleotide sequence:

>DPOGS211699-TA
ATGGATTTGATCAGCGACCTGTCCCCCCTGACTGTCCTGTCACAGGTGATTGGCCACCTATATTACACGCTCGATCGCATCGCCAACGGTACAGCACAAGGTCCGGGTCTTATAGACGAATTGGAGGCTTTTATGGCCCCCGTCGGCGAGGGTTGCTTCGGCCAGCCTCCAGACGCACGCCGTATAGGGGGCCACCAGCTGCCTGAAAGTCCACCGGACTCTGGCTCCGAGAACCCCTACAGTCCTGAGACCCAGGTATCCCACACCATAGCGGTGTCACAGACCGTGTTAGGCACGGACTATATGCTGGTGCCGGAACACATGACCTCCCATGAAATTTTACAGCAAAACGGCGATTACATATACGAGGAATTGAAGAGTGATAGCATAGATCATGAGGTTTTAAGGAATAACCTCAATGATGTAGTGGTCCTACCAGCTGATCAGAACTTAGATTTGGGTATACGCTCTGTCAGGCACGATTTGGGTTTAACTGAACCCATAGCCTATAATAGATACGGCCAGATGAGGGTCGAGCTCCCGGAATTGGAGCAGGGTATATTAAACCCGCAGCTGGTGTCTTTGGGTCATGAAAACCTGACCCCGGTGTATACGAACCTTCAGGAACCCAGCGCCAAGAAGAGGAAGCACTCTCAAGACGTGAACTCCCAAGTGAAATGTGAACCAGCGGCCCTGTCTCCCGAGAGCGTAGCCCGCCCTCCCCCATCCGTGGACGGCTCCGAGGCCGGCGATGACCCGCCACTCCAGTGCATCAGGTTCTCAGCCTTCCAGCAGAACGTTTGGTGTCCGCTATACGACTGCAACTTGAAGCCTATTCTGAACACTTCGTACGTGGTCGGCGCTGACAAAGGCTTCAATTTCTCTCAGATAGACGAGGCTTTTGTCTGCCAGAAGAAGAACCACTTCCAGGTGACCTGCCAGATACAGGTCCAGGGGGAGGCTCAGTACGTGAAGACTCCAGACGGATTCAAGAAGATTAACAACTTCTGTCTGCATTTCTATGGCGTCAAAGCGGAAGACCCGAGCCAGGAGGTGAGGATCGAGCAGAGCCAATCAGACAGGACCAAGAAACCCTTCCATCCAGTGCCTGTGGACATCCGGCGCGAGGGCGCTAAGGTCACCGTGGGTCGTCTGCACTTCGCTGAAACCACAAACAATAACATGAGGAAGAAGGGGCGCCCCAACCCCGACCAGAGGCACTTCCAGCTGGTGGTGGCGCTGAGGGCTCACGTCGCGCACAGCGACTACATAGTGGCAGCGCAGGCTAGTGACAGGATCATTGTCAGGGCATCGAACCCGGGCCAGTTCGAGTCCGACTGCACTGAGAGCTGGTGGCAGAGAGGAGTCGCCGAGAACAGCGTCCATTACAGTGGGAGAGTCGGCATCAACACCGACCGGCCGGACGAGGCCTGTGTTATCAATGGAAACCTTAAAGTGATGGGACACATAGTACATCCGTCCGACGCCAGAGCTAAACACGACATTGAAGAGTTGGACACGGCGCAGCAACTGAGGAACGTGCAGAGCATACGAGTTGTTAAATTTCACTACGACCCGTCGTTCGCCCACCACACGGGCCTGGCGGGTCATGCAACGGTCCCCGACACGGGAGTGCTGGCACAGGAGGTGAGGGAGGTCATCCCGGACGCTGTCAAGGAGGCCGGTGACGTCACCCTCGCTAACGGGGACAGGATACGGAAGTTCCTCGTCGTGAACAAGGATCGCATCTTCATGGAAAATCTTGGAGCTGTGAAGGAGCTGTGCAAGGTGACAGGAAACCTGGAGACGAGGATCGACCAGCTCGAGAGACTCACCAGGAGGCGGACCACCAGACATGACAGCGCCATCAGTAACGATTCGCGAGTGTCGATCACATCTTCGAGGTCCTTCTACAGTGATGGGAACATCTCTATCGATCAGATAAGAGACATCGCCCGCAGCATCCGCAGACACGAGTGTTGCCACAAACTCAGTCACAA
ATCCCCAAAATTCACCCGAAAACAGTGCAAAAACTGTCATACGAATTACACGAAATATGGGAAATATTACAACTACAATAAGACATGTATCAAGAATAAAGAAATGAAGGACAGTGAACATCCGTATCCGGATTATGTTAATAGTGCGGAACAGAAGTCACCGGAAGATACTTATACGGTAACAAAAAAGAAGGACTCGTGTCTGTGGTTGCGGGACGACGAGAGCTATCAACATGGCAACACTAAACTGCCGTTCTGCTGCAGGAGGAAGTACAGATACGGCGGCAGCGGGGAGCTAATATCAAATAAATTCCTGCAGATCGTTATCACTATACTGATATTCGTCATGGCTATCTGCTTAGTTGCGATGTCAGCGCTGTACTTCCGGGAGCACCGCGAGCTGGTGTCGCTGAGGGAGAGGCGGGTGACGCCGCGGACAGCGAAACATAACACTATACAGAATCTTAAAATATCACAGCACATGGTGGTCAAGAAAACGGCAAAAGAGAAAGCCCCATACAAAGTGTCCCCGCAGTACGTGACGTCATCACCGGAGCCTCCGACGACCCACACGACAAGGAACTACGTGAAGACATTGAGTACTCCTGTAGATGCTCCGTCCTTGTTTCGCTCAGCGGCCACTATCGGAGCTGGCTGCGGGTCCTCGAGCACTGACAATGAGCTGGATTCTGGTTGTCAGTCGTCATGCTCGGACCCGTCCCAGGTGTTCAACAGCCAGCCCCTGGAGAGCATCAGGACCGACGAGGAGGAGAAACAGAACGACACACAGACGGAGAGAAAGATCCTCACGCCGGTGATGACGGAGAACAACTACCTGGAGAAGAACGAGAGCAGAGTGAGGCGCGCCGCGGACAGCGACGAGGCGCCCAGGAGCCAGGAGGAGATGGCGCTGGAGGGGGACGGCGAGGGGGGAGGGGAATGCGACACGCTCACCCTGGGTGTCGTGAGCAAGAGCTACACCAACGCGAGCGTGTTCAGCGAGCGCGTGTGTGTCCGCGTGCTACGGAATTATACGTACACGCTGCCGGTGTCCGCCTGCCTGCACCAGAAATACCTGGACGTGGTCTTCAGATCATCCAAGCTGAGGGAGGTCCGTCTGTGCGACCTTCAGTGCAAATCCGAGTCCATGAAGACCTGCCAGGTGGAGCGCGAGTCCTCAAAACCTATACCGGCTGGTGACATCTGGACAGCCAGGGTCGGACTCCGCTGCAGACTCGACCGCTGGATGAAGATAAGGGCTGGATTCGTCCCTCTCAAGGACCTGTGCTACCTGAACCCCGAGGACAAGATCCCATTCGTTGAGTTCAACATACACATATACAGGGACTGCAGGAACTGA

Protein sequence:

>DPOGS211699-PA
MDLISDLSPLTVLSQVIGHLYYTLDRIANGTAQGPGLIDELEAFMAPVGEGCFGQPPDARRIGGHQLPESPPDSGSENPYSPETQVSHTIAVSQTVLGTDYMLVPEHMTSHEILQQNGDYIYEELKSDSIDHEVLRNNLNDVVVLPADQNLDLGIRSVRHDLGLTEPIAYNRYGQMRVELPELEQGILNPQLVSLGHENLTPVYTNLQEPSAKKRKHSQDVNSQVKCEPAALSPESVARPPPSVDGSEAGDDPPLQCIRFSAFQQNVWCPLYDCNLKPILNTSYVVGADKGFNFSQIDEAFVCQKKNHFQVTCQIQVQGEAQYVKTPDGFKKINNFCLHFYGVKAEDPSQEVRIEQSQSDRTKKPFHPVPVDIRREGAKVTVGRLHFAETTNNNMRKKGRPNPDQRHFQLVVALRAHVAHSDYIVAAQASDRIIVRASNPGQFESDCTESWWQRGVAENSVHYSGRVGINTDRPDEACVINGNLKVMGHIVHPSDARAKHDIEELDTAQQLRNVQSIRVVKFHYDPSFAHHTGLAGHATVPDTGVLAQEVREVIPDAVKEAGDVTLANGDRIRKFLVVNKDRIFMENLGAVKELCKVTGNLETRIDQLERLTRRRTTRHDSAISNDSRVSITSSRSFYSDGNISIDQIRDIARSIRRHECCHKLSHKSPKFTRKQCKNCHTNYTKYGKYYNYNKTCIKNKEMKDSEHPYPDYVNSAEQKSPEDTYTVTKKKDSCLWLRDDESYQHGNTKLPFCCRRKYRYGGSGELISNKFLQIVITILIFVMAICLVAMSALYFREHRELVSLRERRVTPRTAKHNTIQNLKISQHMVVKKTAKEKAPYKVSPQYVTSSPEPPTTHTTRNYVKTLSTPVDAPSLFRSAATIGAGCGSSSTDNELDSGCQSSCSDPSQVFNSQPLESIRTDEEEKQNDTQTERKILTPVMTENNYLEKNESRVRRAADSDEAPRSQEEMALEGDGEGGGECDTLTLGVVSKSYTNASVFSERVCVRVLRNYTYTLPVSACLHQKYLDVVFRSSKLREVRLCDLQCKSESMKTCQVERESSKPIPAGDIWTARVGLRCRLDRWMKIRAGFVPLKDLCYLNPEDKIPFVEFNIHIYRDCRN-