Monarch geneset OGS2.0

DPOGS202670
Transcript        DPOGS202670-TA  1434 bp
Protein           DPOGS202670-PA  477 aa
Genomic position  DPSCF300039 + 445163-463164
RNAseq coverage   185x (Rank: top 49%)
Annotation
Heliconius      HMEL005115        5e-173  80.09%
Bombyx          BGIBMGA001296-TA  0.0     83.90%
Drosophila      Ets98B-PA         1e-76   51.11%
EBI UniRef50    UniRef50_E2ATR9   1e-92   53.89%  DNA-binding protein D-ETS-4 n=6 Tax=Formicidae RepID=E2ATR9_CAMFO
NCBI RefSeq     XP_968441.1       1e-91   42.78%  PREDICTED: similar to Ets at 98B CG5583-PA [Tribolium castaneum]
NCBI nr blastp  gi|307171165      4e-92   53.89%  DNA-binding protein D-ETS-4 [Camponotus floridanus]
NCBI nr blastx  gi|307171165      6e-93   49.14%  DNA-binding protein D-ETS-4 [Camponotus floridanus]
Group
Gene Ontology   GO:0006355  3.2e-49  regulation of transcription, DNA-dependent
                GO:0043565  3.2e-49  sequence-specific DNA binding
                GO:0003700  3.2e-49  sequence-specific DNA binding transcription factor activity
                GO:0005515  8.8e-24  protein binding
                GO:0005634  1.6e-20  nucleus
KEGG pathway 
InterPro domain  [388-476]  IPR000418  3.2e-49  Ets
                 [376-472]  IPR011991  2.8e-38  Winged helix-turn-helix transcription repressor DNA-binding
                 [249-340]  IPR010993  8.8e-24  Sterile alpha motif homology
                 [251-336]  IPR013761  2e-21    Sterile alpha motif-type
                 [255-337]  IPR003118  1.6e-20  Sterile alpha motif/pointed
Orthology group  MCL11730  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS202670-TA
ATGTTGTGGGAGAGTAATCTTAACGAGGATCCGGAGACGATGCCACAGACGGTGCCTCGGTCCGGGTGCAGCCCACCATCGGCATCCCAAGGAGAGAGGGCAGTGCCAACATCGCCAGCCGACATTCATCACCTCCTGCGTCTTTTAGGCGCCGAATCACCACCAGAACCGATGTTGCACTCTCCACCAACACACACTAAGCCACCACCTCCATATCCAGAGGACAATAACTATCTTGAACGCATATACGATTTTGAAGGATTCCCCACACCTTCCCCTTCGTCGGACGAGGGCTCAATACCAACAGTAGCACTACAACCGTCTTCCCCTTATAACTCGTTTCAATATTCTCCAGTTTTCATCAAAGAGGAACCAAACAGGCTAACAGTGCCAGGGTTCGCGTCACCTTATCCACTGTCACCATCGGGCTCCTGCGTTTCTTACAGCAGTAACAATCAATATTCCTCACCAGTTCCACAACAAGAAGAATATATTAACATCGAAGATCTCCTTAAAGAAAATCAAATATTACAAGACAGTATCCAACAAAATTATATTACACCTAAAATTGAAGTCGAAGAACCCAGGGATCATATTCTTTTAAGATCAGCTCTTGAAGATACAACATTCCAAAAAAGATTAAACTTAAGGCCCTTTGAATTAGGAAGTGTCAAAATGGAAGAAAGCAGCGGTGGTCCCGGCGAGGAGGCCCTAGTTGCCCCCGACATTGATCGCGTTCTTTCTATGGCCATAGAACAGTCAAAGCGAGATGTCGATAACACGTGCACAGTACTGGGTATATCACCAGACCCAATGCAATGGAGCTCTAGTGACGTTAAAGCTTGGGTGATGTTCACACTGAGACACTTCAACCTGCCGATGGTACCATCCGAGTATTTCGCAATGGACGGAACAGCTCTTGTTGCGCTCACTGAAGAGGAATTTAATCAAAGGGCTCCACAAGCGGGTAGCACGCTGTACGCGCAACTGGAGATCTGGAAAGCTGCACGACATGAGGGTTGGAGGAGCCAGTGGACTGAGCAACGGCCGCCTACACCAGCACCGCCCGCACCTGCCACTGAGGACATGAGCGATGATGATGCAGAATCCATTGTAGCAAATGTCAGCCAAGGTGGTGGCAAAGTGAAGACTGGTAGCACCCACATCCATCTTTGGCAGTTCCTCAAAGAACTTTTGGCTTCACCACATATACACGGATCAGCGATACGGTGGTTAGACAGAAGTAACGGTGTATTCAAAATTGAAGATTCAGTTCGTGTCGCCAGACTGTGGGGCAAAAGAAAAAACAGGCCCGCAATGAACTACGACAAACTATCGCGTAGCATTAGACAATACTACAAGAAAGGCATCATGAAGAAAACGGAAAGAAGTCAGAGGCTCGTCTATCAGTTCTGTCATCCCTACTGTTTATAA

Protein sequence:

>DPOGS202670-PA
MLWESNLNEDPETMPQTVPRSGCSPPSASQGERAVPTSPADIHHLLRLLGAESPPEPMLHSPPTHTKPPPPYPEDNNYLERIYDFEGFPTPSPSSDEGSIPTVALQPSSPYNSFQYSPVFIKEEPNRLTVPGFASPYPLSPSGSCVSYSSNNQYSSPVPQQEEYINIEDLLKENQILQDSIQQNYITPKIEVEEPRDHILLRSALEDTTFQKRLNLRPFELGSVKMEESSGGPGEEALVAPDIDRVLSMAIEQSKRDVDNTCTVLGISPDPMQWSSSDVKAWVMFTLRHFNLPMVPSEYFAMDGTALVALTEEEFNQRAPQAGSTLYAQLEIWKAARHEGWRSQWTEQRPPTPAPPAPATEDMSDDDAESIVANVSQGGGKVKTGSTHIHLWQFLKELLASPHIHGSAIRWLDRSNGVFKIEDSVRVARLWGKRKNRPAMNYDKLSRSIRQYYKKGIMKKTERSQRLVYQFCHPYCL-
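
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence translated with the standard genetic code and that the trailing "-" in the protein record marks the stop codon. The function name `translate` and the two short test fragments (the first 30 and last 18 nucleotides of DPOGS202670-TA) are chosen here for illustration and are not part of the database record.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` (assumed here to match the
    "-" at the end of the protein record).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 18 nucleotides of DPOGS202670-TA, copied from the record.
first_fragment = "ATGTTGTGGGAGAGTAATCTTAACGAGGAT"
last_fragment = "CATCCCTACTGTTTATAA"

print(translate(first_fragment))  # MLWESNLNED
print(translate(last_fragment))   # HPYCL-

# 1434 bp is 478 codons; the last one is the stop, leaving 477 residues.
print(1434 // 3 - 1)              # 477
```

Under these assumptions the fragments reproduce the start (MLWESNLNED) and end (HPYCL-) of DPOGS202670-PA, and the listed lengths (1434 bp, 477 aa) agree once the stop codon is counted. Passing the full transcript to `translate` would allow the same comparison over the whole protein.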