Monarch geneset OGS2.0

DPOGS215543
Transcript        DPOGS215543-TA  1098 bp
Protein           DPOGS215543-PA  365 aa
Genomic position  DPSCF300129 - 208029-230852
RNAseq coverage   15x (Rank: top 82%)
Annotation
Heliconius        HMEL011625        3e-88  88.65%
Bombyx            BGIBMGA002297-TA  8e-72  85.41%
Drosophila        C15-PA            2e-70  51.52%
EBI UniRef50      UniRef50_Q4V729   1e-68  52.29%  IP08859p n=18 Tax=Diptera RepID=Q4V729_DROME
NCBI RefSeq       XP_002429553.1    4e-78  53.80%  T-cell leukemia homeobox protein, putative [Pediculus humanus corporis]
NCBI nr blastp    gi|270014244      2e-77  57.69%  hypothetical protein TcasGA2_TC011749 [Tribolium castaneum]
NCBI nr blastx    gi|270014244      3e-80  55.56%  hypothetical protein TcasGA2_TC011749 [Tribolium castaneum]
Group
Gene Ontology     GO:0006355  9.9e-25  regulation of transcription, DNA-dependent
                  GO:0043565  9.9e-25  sequence-specific DNA binding
                  GO:0003700  9.9e-25  sequence-specific DNA binding transcription factor activity
                  GO:0003677  7.9e-23  DNA binding
                  GO:0005515  1e-20    protein binding
KEGG pathway 
InterPro domain   [202-264]  IPR001356  9.9e-25  Homeobox
                  [195-260]  IPR012287  7.9e-23  Homeodomain-related
                  [193-260]  IPR009057  1e-20    Homeodomain-like
Orthology group   MCL14837  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215543-TA
ATGTCGTCTGTGCATTCTGATGATGATCAGGAGGAGATAAATGTTGACTCTGACAGTAGACTATCAGGTCAATACGAGAGAAGCATCGCTTCAGATAGAGATTCGATACATAGTTACAACGAGAGATACAGAGAATCACCCCCCGACCATCTGACTGATACCCAGCAGAGGGTAAATTTACCTTTCAGCATATCACGTTTGTTGGGTAAACAGTTCGAGGATATTGATAAGAGGAATTCAGGAGAGAGTGATGATAGAAGCGACCAAGATACGGAGTCTGAAGGAGCGAAGGACGATTCTAAGGAAGCCGGTCTAGCGTTGAACTTCAGCCATAACCCGGGATTATATCCTAACTCGAGTTTGTTGTTGCGGCCCGGTTTAAATCTGGGGGCTGGTTATGGATTCCCGGGTAATCCGGGAGTTGTCAGGGTGCCAGCACATAGAGCTCTGGGGGCTTTGGGAGCGTGGGGCGTAGCCCTTGATCCGATGAGACAAGCGGCTGCTGCAGCCTTCGCCCATCAAGTTGTGAAAGACAGATTAAACGCATCATTTCCTATAACGAGACGAATTGGTCACCCGTATCAGAACAGAACACCTCCTAAGAGAAAGAAACCTCGAACATCATTCACTCGGATGCAGATAGCTGAACTGGAGAAGAGATTCCACAAACAGAAATACCTCGCATCAGCTGAACGCGCCTCGCTAGCCAAAACCTTGAAAATGACGGACGCCCAAGTGAAAACATGGTTCCAGAACCGACGAACGAAATGGAGGCGTCAGACAGCTGAGGAGAGGGAAGCTGAGCGACAAGCCGCTAATCGACTGATGTTATCCCTTCAAGCAGAAGCTCTAAGCAAGGGCTATATGCCAGAACCGCCGCCCGGTGCGCCGCTGTCGGCGCTGCATTACCAAAACCCAAATACGCAACAGCAGAGTAACACCGCTTTAAGCGCTCTACAAAATCTTCAGCCCTGGGCAGGTAACCAGGCTACCTTCATGAACGGCCCCGCTCCCCCCGCGCCCCAACCGCCCCCCGTCGCCTCACTCTACCGCACCGTATTTATTTCTACCAAACTTCTGTACTTAATAATTCTTTAA

Protein sequence:

>DPOGS215543-PA
MSSVHSDDDQEEINVDSDSRLSGQYERSIASDRDSIHSYNERYRESPPDHLTDTQQRVNLPFSISRLLGKQFEDIDKRNSGESDDRSDQDTESEGAKDDSKEAGLALNFSHNPGLYPNSSLLLRPGLNLGAGYGFPGNPGVVRVPAHRALGALGAWGVALDPMRQAAAAAFAHQVVKDRLNASFPITRRIGHPYQNRTPPKRKKPRTSFTRMQIAELEKRFHKQKYLASAERASLAKTLKMTDAQVKTWFQNRRTKWRRQTAEEREAERQAANRLMLSLQAEALSKGYMPEPPPGAPLSALHYQNPNTQQQSNTALSALQNLQPWAGNQATFMNGPAPPAPQPPPVASLYRTVFISTKLLYLIIL-
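
The listed lengths are consistent with each other: 1098 bp is 366 codons, which is 365 amino acids plus one stop codon. The sketch below shows one way to check this and to translate a coding sequence with the standard genetic code. It assumes that the transcript is a complete coding sequence ending in a stop codon, and that the trailing "-" in the protein sequence marks that stop; neither is stated in the record. The names `translate` and `expected_protein_length` are hypothetical helpers, not part of any database tool.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with the third base varying fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1.

    Stop codons are written as `stop_symbol` ("-" is assumed here to match
    the record's protein sequence). Codons with ambiguous bases become "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(transcript_bp):
    """Amino-acid count for a complete CDS that ends in one stop codon."""
    return transcript_bp // 3 - 1


if __name__ == "__main__":
    # First 18 and last 12 nucleotides of DPOGS215543-TA, copied from the record.
    print(translate("ATGTCGTCTGTGCATTCT"))
    print(translate("ATAATTCTTTAA"))
    print(expected_protein_length(1098))
```

Passing the full DPOGS215543-TA sequence to `translate` and comparing the result with DPOGS215543-PA would extend the same check to the whole record.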