Monarch geneset OGS2.0

DPOGS216193
Transcript: DPOGS216193-TA (1227 bp)
Protein: DPOGS216193-PA (408 aa)
Genomic position: DPSCF300080 - 53317-55790
RNAseq coverage: 0x (Rank: top 99%)
Annotation
  Heliconius      HMEL015847        0.0     78.16%
  Bombyx          BGIBMGA004601-TA  8e-142  63.83%
  Drosophila      al-PA             8e-26   64.10%
  EBI UniRef50    UniRef50_O42115   9e-25   70.27%  Aristaless-related homeobox protein n=6 Tax=Clupeocephala RepID=ARX_DANRE
  NCBI RefSeq     XP_969925.1       3e-25   89.66%  PREDICTED: similar to transcription factor protein [Tribolium castaneum]
  NCBI nr blastp  gi|348512008      2e-24   70.27%  PREDICTED: aristaless-related homeobox protein-like [Oreochromis niloticus]
  NCBI nr blastx  gi|91087879       5e-24   62.24%  PREDICTED: similar to transcription factor protein [Tribolium castaneum]
Group
Gene Ontology
  GO:0003677  3e-28    DNA binding
  GO:0006355  3e-28    regulation of transcription, DNA-dependent
  GO:0043565  4.1e-26  sequence-specific DNA binding
  GO:0003700  4.1e-26  sequence-specific DNA binding transcription factor activity
  GO:0005515  4.5e-25  protein binding
KEGG pathway
InterPro domain
  [63-147]  IPR012287  3e-28    Homeodomain-related
  [87-149]  IPR001356  4.1e-26  Homeobox
  [78-156]  IPR009057  4.5e-25  Homeodomain-like
Orthology group: MCL21058 (Lepidoptera specific)

Nucleotide sequence:

>DPOGS216193-TA
ATGGATGAATCTTTAGACGGACGGGAAGCCGGTGCTGGTGCATCTGGATTCGATTCAGCTGCCATTTTAAACGACGACGTGTCACCAACCACCAGTTTCTGTTACAACATACCAAATCTTTTTAATGGACTTGGACCGGGATCAGTTCTGTGTGAAAGACATGAGACAGATGTCAACGAAAGCTATCCAACAAATACGGAACCTTCAGATGAACCAGCTGATCAACAGTATGAAGATGATGAAGAGTATTCACCTAATTCTAGTAAGAATCGAACAACTTTCTCGAATGTTCAACTGGAACAATTGGAAGCTGCCTTTCATAAGACACACTACCCTGACGTCTTCTTCAGAGAGGAGTTGGCGATGAGAATTGACCTGACCGAAGCAAGAGTTCAGGTCTGGTTTCAGAACCGTCGGGCTAAATGGAGGAAACAACAAAAGGCTGGATGTGAGCCTTACCCTCGCTCCACGAGATCCCCTGACCCTCGGTCTCCTTCAGCTCTCGCCCTTGAAATGAGGGATTTTATTACCATACCAGTATCTTCATCTCAAATAGTCAGCCTCGCTGACAACTCCAATCAAAACAGTTACGCTTCCAAGTCAAAACAATCATCTCCCGCGATACATATATCAGCGTTGCCAACTAGGCAGAATTCTCAGGTGGATCTACCCTCGTTTTCGATTCCACTGCAATATAACTTAGGATCGTTTAAGAAAATAGGAGAAGATAACCAGTTAGACTTAGACACTTCGACGTCAGATGTACAAAACCACTGGGATACTCGATTGATGCCGAATTTCAATATAATGAATATAAAACCGATGTTAGAAAATCGAGAAAATCTCGAAATGTCCGAAGTGAAATATGAAGTGAAAAACGAAAATGTGAGATCATACAATGATATGTCAAATTCGATTCCGGCTACGATGGAACATGAAGACATATTGGGCAAAGAAGCCGTGAGTGAAGGTCAAACTATATCACCGGATTTTGAGATTGGGAGATACCAGAGTTACCACAGCGAAAATGAAAAGAGATATAGAGAAACTGGCTTCGATGAATCTATAATAGAGGATGGTAGAAATGACTTCGGGTGCGACGGTGACTTTCTCTGTAAAAGTAACTATGATAAAATTGATGAAAAAAGAAACGACTTTGAAGAATGCCATCTATTGGACGCGGACAGCAGCGCTGTCGGGATGAGTAATTTTGAAAGCAATCTTTAA

Protein sequence:

>DPOGS216193-PA
MDESLDGREAGAGASGFDSAAILNDDVSPTTSFCYNIPNLFNGLGPGSVLCERHETDVNESYPTNTEPSDEPADQQYEDDEEYSPNSSKNRTTFSNVQLEQLEAAFHKTHYPDVFFREELAMRIDLTEARVQVWFQNRRAKWRKQQKAGCEPYPRSTRSPDPRSPSALALEMRDFITIPVSSSQIVSLADNSNQNSYASKSKQSSPAIHISALPTRQNSQVDLPSFSIPLQYNLGSFKKIGEDNQLDLDTSTSDVQNHWDTRLMPNFNIMNIKPMLENRENLEMSEVKYEVKNENVRSYNDMSNSIPATMEHEDILGKEAVSEGQTISPDFEIGRYQSYHSENEKRYRETGFDESIIEDGRNDFGCDGDFLCKSNYDKIDEKRNDFEECHLLDADSSAVGMSNFESNL-
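
The transcript and protein records above are related by translation: 1227 bp is 409 codons, that is 408 amino acids plus one stop codon. The following is a minimal sketch of how that relationship could be checked. It assumes the standard genetic code (NCBI translation table 1) and that the trailing "-" in the protein record marks the stop codon. The names `translate`, `CODON_TABLE` and `stop_symbol` are illustrative and are not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order:
# the first base varies slowest, the third base varies fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` (assumed here to be "-",
    matching the last character of the protein record above).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First ten codons and last four codons of DPOGS216193-TA, copied from above.
print(translate("ATGGATGAATCTTTAGACGGACGGGAAGCC"))  # MDESLDGREA
print(translate("AGCAATCTTTAA"))                    # SNL-
```

Passing the full DPOGS216193-TA sequence to `translate` should, under the same assumptions, give a 409-character string to compare against the DPOGS216193-PA record.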