Monarch geneset OGS2.0

DPOGS210999
Transcript: DPOGS210999-TA, 891 bp
Protein: DPOGS210999-PA, 296 aa
Genomic position: DPSCF300004 + 358853-360068
RNAseq coverage: 19x (Rank: top 80%)
Annotation
Heliconius      HMEL025051        7e-43  48.81%
Bombyx          BGIBMGA006397-TA  9e-11  48.53%
Drosophila      CG18599-PA        1e-11  47.22%
EBI UniRef50    UniRef50_B4K7I4   7e-10  51.72%  GI24764 n=2 Tax=Drosophila RepID=B4K7I4_DROMO
NCBI RefSeq     XP_002016932.1    8e-11  47.22%  GL21794 [Drosophila persimilis]
NCBI nr blastp  gi|195152011      1e-09  47.22%  GL21794 [Drosophila persimilis]
NCBI nr blastx  gi|242021601      1e-10  32.87%  Homeobox protein Hox-B1, putative [Pediculus humanus corporis]
Group:
Gene Ontology:
GO:0005515  7.6e-20  protein binding
GO:0006355  4.8e-19  regulation of transcription, DNA-dependent
GO:0043565  4.8e-19  sequence-specific DNA binding
GO:0003700  4.8e-19  sequence-specific DNA binding transcription factor activity
GO:0003677  1.9e-18  DNA binding
KEGG pathway:
InterPro domain:
[134-208]  IPR009057  7.6e-20  Homeodomain-like
[135-197]  IPR001356  4.8e-19  Homeobox
[132-195]  IPR012287  1.9e-18  Homeodomain-related
Orthology group: MCL34857, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210999-TA
ATGGCAGTTTTTAACAGTTTTACAGACTCATGTGAGAATATTGCAACGAGTAGTATTTTAGAATATTACAATACCGTTGGGGAATATGTTAAATCTACAATACCACAGTTTGGCAGCAACAATCATACTTCTCAGAAACCATTGGTTGTGATTAAGGATAATAATGTTAATTTGTGGCCGAAGCAGATGATCGCTAATAGTTCGACATCTCAATATGCTTGCACGTTTAAGGATAAGAAGATGAAATATGTTCATAACCACATTCCAAATAAGAATTTTCAGAATACAAATAATATGCTACTTAATATATCGTATGTCCAACCAACTATCTGCAAAGTTGAGACGAGGAAACAACTCTACTATAATTCAAACAAACATAACTCATTGAAAGTTAATATGAAAAAGAAGAGAAAGCGAACTATATTCACAACAGAACAAATTATTGCTTTGGAAGCAATATTTCAGAAGAAGCCTTATATAAATCGAGACGAAAGATTAATGTTAATGAAAAAATTGCAAGTCAGTGAAAAATCCATAAAGGTTTGGTTCCAAAATCGTCGCCGTTTGACTGATAAGAAAGACAAAGATTACGAATCTGATTCCCCGTCATCAGAAGATTCCTCTGAGATGACGGGTGATAGACTTACATATATCGAATCACAAATCAATAAGAATACGGATGAGCATGGATACGTCACCTTAAATGACAGAGTGATGAGTGATTTAGTTCATGTTATAAATGATTATTTGTCTAAAGACGTATCATGGAGCCAACCCTTATGTGATGATAATAAAACAACAGTTCTGAATGACGGAGTCCATGGCGACATAGTTATGTACGAACCGATATCACCAGTGAGCTTAACAGATATTTTGGATGACTTCATTTAA

Protein sequence:

>DPOGS210999-PA
MAVFNSFTDSCENIATSSILEYYNTVGEYVKSTIPQFGSNNHTSQKPLVVIKDNNVNLWPKQMIANSSTSQYACTFKDKKMKYVHNHIPNKNFQNTNNMLLNISYVQPTICKVETRKQLYYNSNKHNSLKVNMKKKRKRTIFTTEQIIALEAIFQKKPYINRDERLMLMKKLQVSEKSIKVWFQNRRRLTDKKDKDYESDSPSSEDSSEMTGDRLTYIESQINKNTDEHGYVTLNDRVMSDLVHVINDYLSKDVSWSQPLCDDNKTTVLNDGVHGDIVMYEPISPVSLTDILDDFI-
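
Consistency check (illustrative sketch, not part of the original record): the 891 bp transcript corresponds to 296 codons plus one stop codon, since 3 x (296 + 1) = 891. The Python below shows one way to translate a coding sequence such as DPOGS210999-TA and compare the result with DPOGS210999-PA. It assumes the standard genetic code applies and writes the stop codon as "-", as the protein sequence above does. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First five codons and last three codons of DPOGS210999-TA, copied from the record.
start_of_cds = "ATGGCAGTTTTTAAC"
end_of_cds = "TTCATTTAA"

print(translate(start_of_cds))
print(translate(end_of_cds))
print(3 * (296 + 1) == 891)
```

Passing the full nucleotide sequence to `translate` and comparing the result with the protein sequence would extend the same check to the whole record.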