Monarch geneset OGS2.0

DPOGS206778
Transcript        DPOGS206778-TA  801 bp
Protein           DPOGS206778-PA  266 aa
Genomic position  DPSCF300001 - 5863641-5881531
RNAseq coverage   42x (Rank: top 72%)
Annotation
Heliconius      HMEL006176        8e-129  87.45%
Bombyx          BGIBMGA010695-TA  5e-73   96.43%
Drosophila      Dr-PA             7e-41   71.19%
EBI UniRef50    UniRef50_Q58Y76   7e-64   57.55%  Muscle segment homeodomain protein n=2 Tax=Tribolium castaneum RepID=Q58Y76_TRICA
NCBI RefSeq     NP_001034495.1    1e-64   57.55%  muscle segment homeodomain protein [Tribolium castaneum]
NCBI nr blastp  gi|270014052      3e-64   57.55%  hypothetical protein TcasGA2_TC012748 [Tribolium castaneum]
NCBI nr blastx  gi|270014052      4e-64   58.90%  hypothetical protein TcasGA2_TC012748 [Tribolium castaneum]
Group
Gene Ontology
  GO:0006355  2e-24    regulation of transcription, DNA-dependent
  GO:0043565  2e-24    sequence-specific DNA binding
  GO:0003700  2e-24    sequence-specific DNA binding transcription factor activity
  GO:0003677  1.3e-22  DNA binding
  GO:0005515  9.3e-22  protein binding
KEGG pathway 
InterPro domain
  [143-205]  IPR001356  2e-24    Homeobox
  [128-203]  IPR012287  1.3e-22  Homeodomain-related
  [117-204]  IPR009057  9.3e-22  Homeodomain-like
  [165-176]  IPR020479  1.8e-06  Homeobox, eukaryotic
Orthology group   MCL18365  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206778-TA
ATGAAGACTTCGCTTGAATGCGAGCGGTCCGAGGCCGCTGAGAGCGGTAAGGGTCGTATATCGTTTAGCGTAGATGCCTTACTTGGCAGCAAGAGCGACACGACCAGAAACACGCCTGACGCAGTCAGCAATGACGCCGAGAGCGCTGTTGAATCTGACGATAGTGACGTTGATATAGAAGACGTCGAGTCTAATGTCGGTGATGACAGGGATGACAGAGAAGCGAACGATGATGATGAAGCCAGGAGCGGGGTGGTCGTACCACAGCCCCTCCTGCCAAGGATCTACCAGGGGCCCTCGCACGCCTGGCCGTTTGGAGCCTTCCCATGGATGGCGCCCAACCCTATGTTCAGGGCTGGCTCTCCTAACGCAGGAGCTCCGAGCGGTCCGCCCGTAGTCCGCTGTCAGTTAAGGAAACACAAGCCCAACAGGAAGCCGAGAACACCGTTCACAACACAGCAGCTCCTGGCCTTGGAGAAGAAGTTCAGGGACAAACAGTACCTGAGCATCGCGGAGAGAGCTGAATTCTCATCGTCATTGAGACTGACGGAAACTCAGGTGAAAATATGGTTCCAAAACCGGCGAGCGAAAGCCAAGCGTCTACAGGAAGCGGAGATAGAAAAACTTCGTCTCTCAGCTCGTCCTCTCCTACCGCCTTCGTTCGCGCTGTTCGGAGGTGGAACACCACCACTGTTCGCCGCCATGGCCGCGGCGAGACCGCAGCTCAGCTTCCTGGGCGGCCCGCCCACGCACCAACACGCCATCAACATGAACATACTCAACTCCCTCCAACCGCATTGA
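
The record reports an 801 bp transcript and a 266 aa protein. A minimal Python sketch of how those figures can be cross-checked is given here. It assumes the transcript entry is the coding sequence only (no UTRs) and that the standard genetic code applies; the helper names `translate` and `lengths_consistent` are illustrative and not part of the database. Only the first ten and last two codons are copied from the sequence above to keep the example short.

```python
# Standard genetic code (NCBI translation table 1), codon positions ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


# First ten and last two codons of DPOGS206778-TA, copied from the record.
cds_start = "ATGAAGACTTCGCTTGAATGCGAGCGGTCC"
cds_end = "CATTGA"

print(translate(cds_start))          # leading residues of the protein
print(translate(cds_end))            # final residue followed by the stop
print(lengths_consistent(801, 266))  # reported transcript vs. protein length
```

Under these assumptions, 266 codons plus one stop codon account for all 801 bases, which matches the trailing `-` that marks the stop in the protein entry.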

Protein sequence:

>DPOGS206778-PA
MKTSLECERSEAAESGKGRISFSVDALLGSKSDTTRNTPDAVSNDAESAVESDDSDVDIEDVESNVGDDRDDREANDDDEARSGVVVPQPLLPRIYQGPSHAWPFGAFPWMAPNPMFRAGSPNAGAPSGPPVVRCQLRKHKPNRKPRTPFTTQQLLALEKKFRDKQYLSIAERAEFSSSLRLTETQVKIWFQNRRAKAKRLQEAEIEKLRLSARPLLPPSFALFGGGTPPLFAAMAAARPQLSFLGGPPTHQHAINMNILNSLQPH-
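
The InterPro entries in the annotation give residue ranges on this protein. The sketch below shows one way to pull out the corresponding subsequences. It assumes the ranges are 1-based and inclusive, which the record does not state; `domain_sequence` is a hypothetical helper, and the trailing `-` stop marker is left out of the sequence.

```python
# DPOGS206778-PA, copied from the record without the trailing '-' stop marker.
PROTEIN = (
    "MKTSLECERSEAAESGKGRISFSVDALLGSKSDTTRNTPDAVSNDAESAVESDDSDVDIE"
    "DVESNVGDDRDDREANDDDEARSGVVVPQPLLPRIYQGPSHAWPFGAFPWMAPNPMFRAG"
    "SPNAGAPSGPPVVRCQLRKHKPNRKPRTPFTTQQLLALEKKFRDKQYLSIAERAEFSSSL"
    "RLTETQVKIWFQNRRAKAKRLQEAEIEKLRLSARPLLPPSFALFGGGTPPLFAAMAAARP"
    "QLSFLGGPPTHQHAINMNILNSLQPH"
)

# InterPro ranges as listed in the annotation (assumed 1-based, inclusive).
DOMAINS = {
    "IPR001356 Homeobox": (143, 205),
    "IPR012287 Homeodomain-related": (128, 203),
    "IPR009057 Homeodomain-like": (117, 204),
    "IPR020479 Homeobox, eukaryotic": (165, 176),
}


def domain_sequence(protein: str, start: int, end: int) -> str:
    """Return residues start..end, converting 1-based inclusive to a Python slice."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("domain range falls outside the protein")
    return protein[start - 1:end]


for name, (start, end) in DOMAINS.items():
    seq = domain_sequence(PROTEIN, start, end)
    print(f"{name} [{start}-{end}] {len(seq)} aa: {seq}")
```

If the database instead uses 0-based or half-open coordinates, the slice in `domain_sequence` would need to be adjusted accordingly.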