Monarch geneset OGS2.0

DPOGS207894
Transcript: DPOGS207894-TA (972 bp)
Protein: DPOGS207894-PA (323 aa)
Genomic position: DPSCF300101 + 276623-288578
RNAseq coverage: 13x (Rank: top 83%)
Annotation
Heliconius: HMEL010247  3e-32  82.28%
Bombyx: BGIBMGA008482-TA  1e-72  66.67%
Drosophila: H2.0-PA  3e-30  86.76%
EBI UniRef50: UniRef50_D2A1C4  1e-35  43.36%  Putative uncharacterized protein GLEAN_08368 n=1 Tax=Tribolium castaneum RepID=D2A1C4_TRICA
NCBI RefSeq: XP_001121170.1  2e-31  66.36%  PREDICTED: similar to H2.0-like homeo box 1 [Apis mellifera]
NCBI nr blastp: gi|270006199  4e-35  43.36%  hypothetical protein TcasGA2_TC008368 [Tribolium castaneum]
NCBI nr blastx: gi|270006199  2e-38  44.69%  hypothetical protein TcasGA2_TC008368 [Tribolium castaneum]
Group
Gene Ontology:
  GO:0003677  3.7e-23  DNA binding
  GO:0006355  3.7e-23  regulation of transcription, DNA-dependent
  GO:0043565  1.7e-21  sequence-specific DNA binding
  GO:0003700  1.7e-21  sequence-specific DNA binding transcription factor activity
  GO:0005515  1.7e-20  protein binding
  GO:0005634  2e-06  nucleus
KEGG pathway 
InterPro domain:
  [228-294]  IPR012287  3.7e-23  Homeodomain-related
  [233-297]  IPR001356  1.7e-21  Homeobox
  [209-296]  IPR009057  1.7e-20  Homeodomain-like
  [264-273]  IPR000047  2e-06  Helix-turn-helix motif, lambda-like repressor
  [257-268]  IPR020479  2.8e-06  Homeobox, eukaryotic
Orthology group: MCL22183  Insect specific
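
The bracketed InterPro ranges in this record (for example [233-297] for the Homeobox entry) can be used to cut a domain out of the protein sequence. The following is a minimal Python sketch, assuming the ranges are 1-based, inclusive residue positions on DPOGS207894-PA; the record itself does not state the coordinate convention. The helper names `parse_range` and `slice_domain` are hypothetical and are not part of any database tool.

```python
import re


def parse_range(text):
    """Parse a bracketed range such as "[233-297]" into (start, end) integers."""
    match = re.fullmatch(r"\[(\d+)-(\d+)\]", text.strip())
    if match is None:
        raise ValueError(f"not a bracketed range: {text!r}")
    return int(match.group(1)), int(match.group(2))


def slice_domain(protein, start, end):
    """Return residues start..end of a protein string.

    Assumption: coordinates are 1-based and inclusive.
    """
    if not 1 <= start <= end <= len(protein):
        raise ValueError("range falls outside the protein sequence")
    # Convert 1-based inclusive coordinates to a Python slice.
    return protein[start - 1:end]
```

Under that assumption, `slice_domain(protein, *parse_range("[233-297]"))` would return a 65-residue segment when `protein` holds the full 323 aa sequence.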

Nucleotide sequence:

>DPOGS207894-TA
ATGACTTCGGTTAGTGGTGATGTTAACACGTCCAAGTTGAAATTCAGTGTTGATAGTATTCTGGGTAACAATGAGAGTCGTCAGGATGGTTCTGTGTCGGAGGTCAGAGGACAGGAGCCGCCCTGTGCGGGATGTGTCGCTGCACTTTACAGATGCTGTAGAGATGAGCCTCCCCTACTGCAGCTACAATTGCCTTTACCATATGCACATCTGCATCCAGCGTTGAGGCCCACCTCAGGAGGTTTGATGAATTATGGAGTCATGACTTCGGTTAGTGGTGATGTTAACACGTCCAAGTTGAAATTCAGTGTTGATAGTATTCTGGGTAACAATGAGAGTCGTCAGGATGGTTCTGTGTCGGAGGTCAGAGGACAGGAGCCGCCCTGTGCGGGATGTGTCGCTGCACTTTACAGATGCTGTAGAGATGAGCCTCCCCTACTGCAGCTACAATTGCCTTTACCATATGCACATCTGCATCCAGCGTTGAGGCCCACCTCAGATGCTCTACATTTACGTCTATTGAGAGCTTCAGCACAAACATATGTATGGATAAATGCCGACCAAACTCAACTACAGCCAATTGATTACACCTTCTTAGGCTATGAAAGGAAGGATGCTGTCTATCGCTTGCCTCTGCCGACATCGTCTCCAACACAATCATCTGCGCCACGTTGTGACGTTTCGTCGAGCGGTAAAAGAAAACGTTCGTGGTCGCGAGCTGTTTTTAGTAACCTTCAGAGGAAAGGCTTGGAGAGACGGTTTCAAATACAAAAATACATTACCAAGCCTGATAGGAGGCAGCTAGCCGCCACCCTCGGTCTGACTGATGCTCAGGTGAAAGTGTGGTTTCAAAATCGTCGTATGAAATGGCGTCACACGAAAGAAGGTAGAGGGTTGGCGGGAACGAGAGACACAGACACCGCAGTCCATGAAGATAATGAAGATGTCGATGTTGATACACTTAGCGATTAA

Protein sequence:

>DPOGS207894-PA
MTSVSGDVNTSKLKFSVDSILGNNESRQDGSVSEVRGQEPPCAGCVAALYRCCRDEPPLLQLQLPLPYAHLHPALRPTSGGLMNYGVMTSVSGDVNTSKLKFSVDSILGNNESRQDGSVSEVRGQEPPCAGCVAALYRCCRDEPPLLQLQLPLPYAHLHPALRPTSDALHLRLLRASAQTYVWINADQTQLQPIDYTFLGYERKDAVYRLPLPTSSPTQSSAPRCDVSSSGKRKRSWSRAVFSNLQRKGLERRFQIQKYITKPDRRQLAATLGLTDAQVKVWFQNRRMKWRHTKEGRGLAGTRDTDTAVHEDNEDVDVDTLSD-
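
The stated lengths are consistent with each other if the transcript is coding sequence only and includes the stop codon: 972 bp = 3 x (323 aa + 1 stop). The Python sketch below illustrates that check and a plain translation with the standard genetic code. It assumes the standard codon table applies and that stops are written as "-", as in the protein record above; `translate` and `lengths_agree` are hypothetical helper names.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino == "*" else amino)
    return "".join(residues)


def lengths_agree(transcript_bp, protein_aa):
    """True if the transcript length equals the protein codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)
```

For example, `translate("ATGACTTCGGTTAGTTAA")` (the first five codons of DPOGS207894-TA followed by a stop codon) gives `"MTSVS-"`, and `lengths_agree(972, 323)` is true.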