Monarch geneset OGS2.0

DPOGS206776
Transcript        DPOGS206776-TA  1083 bp
Protein           DPOGS206776-PA  360 aa
Genomic position  DPSCF300001 - 5983923-5985080
RNAseq coverage   28x (Rank: top 76%)
Annotation
Heliconius      HMEL006180        6e-103  57.78%
Bombyx          BGIBMGA010704-TA  1e-64   46.97%
Drosophila      tin-PA            7e-19   72.88%
EBI UniRef50    UniRef50_D6WZY7   5e-19   66.67%  Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WZY7_TRICA
NCBI RefSeq     XP_975046.1       9e-20   66.67%  PREDICTED: similar to AGAP003670-PA [Tribolium castaneum]
NCBI nr blastp  gi|322793629      1e-18   35.95%  hypothetical protein SINV_08571 [Solenopsis invicta]
NCBI nr blastx  gi|307214832      1e-22   32.95%  Homeobox protein Nkx-2.6 [Harpegnathos saltator]
Group
Gene Ontology  GO:0006355  1.8e-24  regulation of transcription, DNA-dependent
               GO:0043565  1.8e-24  sequence-specific DNA binding
               GO:0003700  1.8e-24  sequence-specific DNA binding transcription factor activity
               GO:0003677  3.7e-21  DNA binding
               GO:0005515  1.3e-20  protein binding
KEGG pathway 
InterPro domain  [207-269]  IPR001356  1.8e-24  Homeobox
                 [189-266]  IPR012287  3.7e-21  Homeodomain-related
                 [206-280]  IPR009057  1.3e-20  Homeodomain-like
                 [229-240]  IPR020479  3e-06    Homeobox, eukaryotic
Orthology group  MCL25761  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206776-TA
ATGGACTGCGAACAAATAACGTCATACGATATCAAGAACTATGAGTGCAAGAACTACGAGTACGAGAAACGCTGCGAGGAATTCTACGACAAGAATTACAGAGTCAAGTTCGACGACCAGCACCAGAGCAGTCTGAGCACGCCGTTCCTCGTCAAGGACATACTGAACATTAACCAGGCGCCGTACTACGAGAGGAATGATGCTTGGAAGGTTGAGCGTAGGAATGAATGTGAACCGCTGCATCAGAGCCAGTACTGCCAGGAGTACTTCAGCCAAATGTACCCCAACATACCCATAAACACCGAGCCGTACTGGTCACAGGAGGTGCACGATACTAAAATAGAGGACTACTACAACTACAACTACAACTACAATCATAATTTATATCATCAGAATCATGACTATTCTGAATTAACTCCGCAAGTTGAAGTCCAGGGGAAGTTTCAGAATATGGAAACAGAATCTCAACCGCCGGGAACCGGAGTGAAGGTCGTGGAGAAGACGATACAACAGCTCGGGACAGAGACAACGGCTTACACGCAAACACTACCGAAATATCCAGCCATGGCTAGAAAACAAACGAAGCAATGTAAACCGGATCGTAAGGAGAGGAATGTGAAAAGAAAACCAAGGATCTTGTTCTCCCAAACACAAGTCCACGCGTTGGAAGTGAGGTTCATAGCGCAGAAATACCTCACAGCACCGGAGAGAGAACAACTCGCGAAGACATTAAACCTATCCCCGACTCAGGTGAAAATATGGTTCCAGAATCGAAGGTACAAGAGCAAACGAATTAAATCACCGGAAGTGTCGACATCAACGGACGCTAAACCGATGAAAAACATCGGCCGGAAGTTGTACAGAACTGAAAACAGAGATCTGAGATACGAAACATACAAGCAGGAGAGCGAGAGTTTGGAAAGTGAATTAACATCGACCATGTACTTCGACGACAGCATCACGTACGGCAGCGAAAAATATTACGCACAAGAAGACGTCTCCGGCACAATGTATTCGAAATTCAAAAGCGAAGGCTACAAAGAAAACGAAATAAAAAAGTTCAATCCAAACTATATATGTTGA

Protein sequence:

>DPOGS206776-PA
MDCEQITSYDIKNYECKNYEYEKRCEEFYDKNYRVKFDDQHQSSLSTPFLVKDILNINQAPYYERNDAWKVERRNECEPLHQSQYCQEYFSQMYPNIPINTEPYWSQEVHDTKIEDYYNYNYNYNHNLYHQNHDYSELTPQVEVQGKFQNMETESQPPGTGVKVVEKTIQQLGTETTAYTQTLPKYPAMARKQTKQCKPDRKERNVKRKPRILFSQTQVHALEVRFIAQKYLTAPEREQLAKTLNLSPTQVKIWFQNRRYKSKRIKSPEVSTSTDAKPMKNIGRKLYRTENRDLRYETYKQESESLESELTSTMYFDDSITYGSEKYYAQEDVSGTMYSKFKSEGYKENEIKKFNPNYIC-