Monarch geneset OGS2.0

DPOGS204069
Transcript: DPOGS204069-TA (1026 bp)
Protein: DPOGS204069-PA (341 aa)
Genomic position: DPSCF300200 + 44107-53902
RNAseq coverage: 33x (Rank: top 75%)
Annotation
Source           Hit                E-value   Identity   Description
Heliconius       HMEL013133         7e-164    95.03%
Bombyx           BGIBMGA010810-TA   4e-83     92.55%
Drosophila       Hmx-PC             3e-37     90.00%
EBI UniRef50     UniRef50_D6X1Q3    3e-61     42.92%     Putative uncharacterized protein n=2 Tax=Bilateria RepID=D6X1Q3_TRICA
NCBI RefSeq      XP_967446.1        5e-62     42.92%     PREDICTED: similar to H6 family homeobox 3 [Tribolium castaneum]
NCBI nr blastp   gi|91090023        1e-60     42.92%     PREDICTED: similar to H6 family homeobox 3 [Tribolium castaneum]
NCBI nr blastx   gi|91090023        2e-65     44.22%     PREDICTED: similar to H6 family homeobox 3 [Tribolium castaneum]
Group
Gene Ontology
  GO:0003677   6.3e-26   DNA binding
  GO:0006355   6.3e-26   regulation of transcription, DNA-dependent
  GO:0043565   2.4e-23   sequence-specific DNA binding
  GO:0003700   2.4e-23   sequence-specific DNA binding transcription factor activity
  GO:0005515   3e-23     protein binding
KEGG pathway
InterPro domain
  [185-265]   IPR012287   6.3e-26   Homeodomain-related
  [207-269]   IPR001356   2.4e-23   Homeobox
  [198-276]   IPR009057   3e-23     Homeodomain-like
  [229-240]   IPR020479   3.1e-06   Homeobox, eukaryotic
Orthology group: MCL19591 (Insect specific)
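
The bracketed InterPro ranges above can be turned into domain lengths, or used to cut a domain out of the protein sequence. The following is a minimal sketch, assuming the ranges are 1-based, inclusive residue positions on DPOGS204069-PA (the record does not state this explicitly). The helper names `domain_span` and `slice_domain` are hypothetical and are not part of any database tool.

```python
import re


def domain_span(coords):
    """Parse a '[start-end]' range into (start, end, length).

    Assumption: positions are 1-based and inclusive.
    """
    match = re.fullmatch(r"\[(\d+)-(\d+)\]", coords.strip())
    if match is None:
        raise ValueError(f"unrecognised range: {coords!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {coords!r}")
    return start, end, end - start + 1


def slice_domain(protein, coords):
    """Return the residues covered by a 1-based inclusive range."""
    start, end, _ = domain_span(coords)
    # Convert the 1-based inclusive range to a 0-based half-open Python slice.
    return protein[start - 1:end]


# InterPro hits as listed in this record.
interpro_hits = [
    ("[185-265]", "IPR012287", "Homeodomain-related"),
    ("[207-269]", "IPR001356", "Homeobox"),
    ("[198-276]", "IPR009057", "Homeodomain-like"),
    ("[229-240]", "IPR020479", "Homeobox, eukaryotic"),
]

for coords, accession, name in interpro_hits:
    start, end, length = domain_span(coords)
    print(f"{accession}\t{name}\t{start}-{end}\t{length} aa")
```

Passing the full DPOGS204069-PA sequence to `slice_domain` with one of these ranges would return the corresponding stretch of residues, under the same coordinate assumption.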
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204069-TA
ATGGATTCGCCCGTATCATCGCAAGATAACGACATCGAGGTCAACGTTGTATCGGGAGCCAGTAGTCCTGACATACCGTCGTCGCCATCTAGAAAAGAACAGAAGAGTCCATTCTATGATCTCAAGAAGGATAAGGAAGATAAGGCAACCGGAAGTGCTCCATACACCAGCTTTTCCATCAGTTCCATATTGAACAGGGCGGAACCGAGACGAGAGAATGTTCTGGAGGCGGCGCTCGCAGCAAACCACTTCGGTAACGCGAATGACGCTATGCTGTCCAGACTGGGTCTGATCTCTCAGTGGAGCGCGCTGGCTGGGCGTTACGGTGGCCTGGTCCCAGCTCCCTGGTACTGGCAGAGACCACACGAGAGAACCCCTCCCAGAGAGGATTCCACCACTAATGAAGAGGGCTCCGCTGGCGGTTCGCCCCGGCCTCGCAGTCCCCCGCGGGCACCCTCGCCCCCCTCTCCATCGCGGTCCTCCCCCCGCTCCCCGCGTCTCCATCCAGCGCTATACCCCCAGCAGCCGGACGTTCAAGACTCAGACCCTGACGTCGACGAGAGCGATGATTTCGACAAAGACGAGAAAAGAGATCCGAGCTCGAGCCTCGGCAAACGTAAGAAGAAGACGCGGACGGTTTTCTCGCGCTCGCAGGTTTTCCAGCTGGAGTCGACCTTCGATATGAAGAGGTACCTCAGCAGCTCGGAGCGCGCGGGGCTCGCGGCTAGCCTCCACCTCACGGAGACCCAGGTGAAGATATGGTTCCAGAACAGAAGGAACAAGTGGAAGAGACAGCTGGCGGCGGAATTAGAAGCGGCTAACATGGCGCACGCCGCGCAGAGACTGGTCCGCGTACCGATCCTGTACCACGAGTCTAGACCGACTTCCATACCGCATACTCCACTGCCGATACATCCTCAGTTCAATGAAGCCGGGGTGTATTTCCAGCCTCAAGCGCCGCAGCAGCTGCAGCCGCTGCCCTCTTCGCCAGTCACTTCCCGGGCGCCGCTGTCCAGTCTAGTGTAG

Protein sequence:

>DPOGS204069-PA
MDSPVSSQDNDIEVNVVSGASSPDIPSSPSRKEQKSPFYDLKKDKEDKATGSAPYTSFSISSILNRAEPRRENVLEAALAANHFGNANDAMLSRLGLISQWSALAGRYGGLVPAPWYWQRPHERTPPREDSTTNEEGSAGGSPRPRSPPRAPSPPSPSRSSPRSPRLHPALYPQQPDVQDSDPDVDESDDFDKDEKRDPSSSLGKRKKKTRTVFSRSQVFQLESTFDMKRYLSSSERAGLAASLHLTETQVKIWFQNRRNKWKRQLAAELEAANMAHAAQRLVRVPILYHESRPTSIPHTPLPIHPQFNEAGVYFQPQAPQQLQPLPSSPVTSRAPLSSLV-
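
The two sequences can be checked against each other: 341 codons plus one stop codon give the 1026 bp listed for the transcript. The sketch below assumes the standard genetic code and that the trailing `-` in the protein sequence marks the stop codon. The `translate` helper is hypothetical, and only the first 30 and last 12 nucleotides of DPOGS204069-TA are used to keep the example short.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 12 nucleotides of DPOGS204069-TA, copied from this record.
cds_start = "ATGGATTCGCCCGTATCATCGCAAGATAAC"
cds_end = "AGTCTAGTGTAG"

print(translate(cds_start))  # compare with the start of DPOGS204069-PA
print(translate(cds_end))    # compare with the end of DPOGS204069-PA

# Length bookkeeping: 341 residues plus one stop codon.
print((341 + 1) * 3)
```

Running the same function on the full nucleotide sequence and comparing the result with the full protein sequence would extend this spot check to the whole record.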