Monarch geneset OGS2.0

DPOGS209211
Transcript        DPOGS209211-TA  876 bp
Protein           DPOGS209211-PA  291 aa
Genomic position  DPSCF300061 + 1145991-1146866
RNAseq coverage   20x (Rank: top 79%)
Annotation
Heliconius      HMEL014770        3e-113  73.09%
Bombyx          BGIBMGA010980-TA  9e-112  74.16%
Drosophila      ind-PA            5e-25   82.61%
EBI UniRef50    UniRef50_Q58Y77   8e-28   61.54%  Intermediate neuroblasts defective protein n=3 Tax=Protostomia RepID=Q58Y77_TRICA
NCBI RefSeq     NP_001034494.1    1e-28   61.54%  intermediate neuroblasts defective [Tribolium castaneum]
NCBI nr blastp  gi|270015633      5e-28   62.39%  hypothetical protein TcasGA2_TC006888 [Tribolium castaneum]
NCBI nr blastx  gi|170047820      6e-33   40.27%  intermediate neuroblasts defective protein [Culex quinquefasciatus]
Group
Gene Ontology  GO:0006355  5.7e-25  regulation of transcription, DNA-dependent
               GO:0043565  5.7e-25  sequence-specific DNA binding
               GO:0003700  5.7e-25  sequence-specific DNA binding transcription factor activity
               GO:0003677  1.1e-23  DNA binding
               GO:0005515  3e-22    protein binding
KEGG pathway 
InterPro domain  [181-243]  IPR001356  5.7e-25  Homeobox
                 [150-240]  IPR012287  1.1e-23  Homeodomain-related
                 [155-242]  IPR009057  3e-22    Homeodomain-like
                 [203-214]  IPR020479  5.8e-07  Homeobox, eukaryotic
Orthology group  MCL19593  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209211-TA
ATGTCGAGATCATTCCTAGTGGACGCCTTGATCAGTGACACCAAAGACAACAACACAGAAATGAAGAGCGACCATCTCACCTACAACCTGGGCAACTTGGACACGAGACCGAAGTTCCTCCCGTACCCTTACCCAGGCAGTATCAACCTGCTGTCTCTCGGCCTCCAGCAGCAGCGAGCGCCAGACCTGTTCCGACCGTTCCTGGAACAATTGAATTTCCGCTACCCGATGTTACATCAGCTGCCCCGACAGACGGACTTCTTTGGTCCCGCTCACGAGACTCGCCCCTTCGAAGGTTTCAAAACCGAAGATCAGGAGACGGTTGGTTTAGTGAATAGAGCTAAGAAATCTGTGTCACCGTACTTGCACCATCCTTACAAATCGACCGCGACTTCACCATCCAAGAGCCAGGGTCAGAGGTCACCGTCTTTATCTAGCGATAGTCGGAACGGCTCCCCGAGCCCGCCCCTCGGACATCCCGAAGAACTCCTACCCGGATACTCAAAAGAACTAAAACGGCTACCCTTAAAAGAAGATTCGAGCAAACGCATTAGAACAGCTTTCACGGGGACACAACTCCTTGAGCTGGAGAGAGAGTTCTCCATGAACATGTATCTATCGAGACTGAGGAGGATAGAGATCGCCTCCAGGCTGAAGCTGTCAGAGAAACAAGTGAAGATATGGTTCCAGAACCGACGCGTCAAGCTCAAGAAAGAAGAGACCCCGCTCGCTAACGAGGGGAGAGGAAAGAGATGCTGCTGCAGCAAGGGAACCTGCTCCAAGAGCTCCACCTCCTGCGACGACGAGCAGGGACAGATAGACGTGGTCACCGACTACGACACGTGTGAAGCACAGAACCTGTCCAGGTACTCCTGA

Protein sequence:

>DPOGS209211-PA
MSRSFLVDALISDTKDNNTEMKSDHLTYNLGNLDTRPKFLPYPYPGSINLLSLGLQQQRAPDLFRPFLEQLNFRYPMLHQLPRQTDFFGPAHETRPFEGFKTEDQETVGLVNRAKKSVSPYLHHPYKSTATSPSKSQGQRSPSLSSDSRNGSPSPPLGHPEELLPGYSKELKRLPLKEDSSKRIRTAFTGTQLLELEREFSMNMYLSRLRRIEIASRLKLSEKQVKIWFQNRRVKLKKEETPLANEGRGKRCCCSKGTCSKSSTSCDDEQGQIDVVTDYDTCEAQNLSRYS-
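
Consistency check:

The lengths reported in this record can be cross-checked with simple coordinate arithmetic. The sketch below is illustrative only and is not part of the OGS2.0 pipeline. It assumes that the genomic coordinates are 1-based and inclusive, and that the transcript is a single uninterrupted coding sequence ending in a stop codon (the nucleotide sequence above ends in TGA). The function names `span_length` and `expected_protein_length` are hypothetical helpers introduced here.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate interval (assumed convention)."""
    return end - start + 1


def expected_protein_length(cds_bp: int) -> int:
    """Amino acid count for a CDS whose final codon is a stop codon (assumption)."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # Total codons minus the terminal stop codon.
    return cds_bp // 3 - 1


# Values taken from the record: DPSCF300061 + 1145991-1146866
transcript_bp = span_length(1145991, 1146866)
protein_aa = expected_protein_length(transcript_bp)
print(transcript_bp, protein_aa)
```

Under these assumptions the genomic span equals the reported transcript length, which is what one would expect for a gene model with no introns, and the codon count minus the stop codon equals the reported protein length.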