Monarch geneset OGS2.0

DPOGS202349
Transcript        DPOGS202349-TA  945 bp
Protein           DPOGS202349-PA  314 aa
Genomic position  DPSCF300104 - 599847-617337
RNAseq coverage   602x (Rank: top 21%)
Annotation
Heliconius      HMEL015573        1e-121  73.33%
Bombyx          BGIBMGA014479-TA  4e-59   87.40%
Drosophila      lab-PA            2e-30   92.06%
EBI UniRef50    UniRef50_D6W945   1e-49   42.23%  Labial n=28 Tax=Bilateria RepID=D6W945_TRICA
NCBI RefSeq     NP_001107762.1    1e-49   42.08%  labial [Tribolium castaneum]
NCBI nr blastp  gi|270002810      4e-49   42.23%  labial [Tribolium castaneum]
NCBI nr blastx  gi|270002810      5e-54   42.82%  labial [Tribolium castaneum]
Group
Gene Ontology  GO:0006355  7e-26    regulation of transcription, DNA-dependent
               GO:0043565  7e-26    sequence-specific DNA binding
               GO:0003700  7e-26    sequence-specific DNA binding transcription factor activity
               GO:0003677  8.1e-24  DNA binding
               GO:0005515  3e-22    protein binding
KEGG pathway 
InterPro domain  [224-286]  IPR001356  7e-26    Homeobox
                 [225-282]  IPR012287  8.1e-24  Homeodomain-related
                 [215-293]  IPR009057  3e-22    Homeodomain-like
                 [246-257]  IPR020479  4.9e-08  Homeobox, eukaryotic
Orthology group  MCL26188  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202349-TA
ATGATGATCATGGACGTCGGCATGTACAACAACCAACAGAACTACAGCGGAGAGTGCTACCAGCAGGGCGCCGGCTACCAGGCCGCGTACGACGGCTACCACGATGGCTACTACGACCCCGCCGGATACTACGAGCCGCTGATACACGGACACGCGCACGACGCCGCGCCCGTCATCAGCACCGACACCGGCCTCTGCTACACCAACCTGGACTATGGAGACCAGCAGGCCTCCGCCTACAACCTGCCGCCGCCCGCGCCGCCTCACCACGACACCTTCAAGCACCGAGAAGAAGATCACCTTGACCACAAGCTGGACAATCACTACGTCGACACCAAATACAACATGCACTTCGTGGACGAGGCGCAGTATGGCCACGGCTCGCCTCTCGTCTGCGGGGACTTCGACTCCTACCCCAAGGACTTCGGGGAGCTGAGAGACGGCTTCCCCCGGGAGGAGGTGCAGGCGCAGCACGTCCAGGCACACGCGACGCACGCCGCCGTACCCACGTACAAGTGGATGCAGGTCAAGAGGAACATCCCGAAGCCGGCCGCCCCGAAGCTGATGATTCCACCAGCGGACTTCGTGAACCAGGGCGTGCTTGGGAGCCCTCAGGACCAGCGCACTCCCAATGGCGGCCAGATGCTGAACAACCCGCTCCTCAACCTCAACAACACAGGTCGAACTAACTTTACAAACAAACAGCTCACGGAACTGGAGAAAGAGTTCCACTTCAACAAATACCTCACGAGGGCGAGGAGAATAGAAATCGCCTCCGCCCTACAGCTGAACGAGACGCAGGTCAAGATATGGTTCCAGAACAGGCGGATGAAACAGAAGAAGAGGATCAAAGAAGGGCTGATCGTTGCTCCAGAAGTGAGCGCCTCCACCTCCCACACCTCCCACTCCATCGGCTCCAACGAAAACAGTCGGGAATCCAGCTAA

Protein sequence:

>DPOGS202349-PA
MMIMDVGMYNNQQNYSGECYQQGAGYQAAYDGYHDGYYDPAGYYEPLIHGHAHDAAPVISTDTGLCYTNLDYGDQQASAYNLPPPAPPHHDTFKHREEDHLDHKLDNHYVDTKYNMHFVDEAQYGHGSPLVCGDFDSYPKDFGELRDGFPREEVQAQHVQAHATHAAVPTYKWMQVKRNIPKPAAPKLMIPPADFVNQGVLGSPQDQRTPNGGQMLNNPLLNLNNTGRTNFTNKQLTELEKEFHFNKYLTRARRIEIASALQLNETQVKIWFQNRRMKQKKRIKEGLIVAPEVSASTSHTSHSIGSNENSRESS-
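
The transcript and protein records above can be cross-checked against each other: 945 bp is 315 codons, that is, 314 amino acids plus one stop codon. The following is a minimal sketch of that check, assuming the standard genetic code and assuming that the trailing "-" in the protein record marks the stop codon. The `translate` helper and its `stop_symbol` parameter are illustrative names, not part of the database. Only the first and last few codons of DPOGS202349-TA are translated here; the full sequence can be passed in the same way.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons
# enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1 (hypothetical helper).

    Stop codons are written as `stop_symbol`; "-" is assumed here to
    match the notation used in the protein record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE[cds[i:i + 3]].replace("*", stop_symbol)
        for i in range(0, len(cds), 3)
    )


# First seven and last five codons of DPOGS202349-TA.
start = translate("ATGATGATCATGGACGTCGGC")
end = translate("CGGGAATCCAGCTAA")
print(start)  # MMIMDVG
print(end)    # RESS-

# Length consistency: 945 bp -> 315 codons -> 314 residues + 1 stop.
codons = 945 // 3
print(codons - 1)  # 314
```

Both fragments agree with the start (MMIMDVG) and end (RESS-) of DPOGS202349-PA.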