Monarch geneset OGS2.0

DPOGS216194
Transcript: DPOGS216194-TA; 1260 bp
Protein: DPOGS216194-PA; 419 aa
Genomic position: DPSCF300080 - 49384-51860
RNAseq coverage: 1x (Rank: top 95%)
Annotation
Heliconius: HMEL015847; 0.0; 78.50%
Bombyx: BGIBMGA004601-TA; 5e-149; 66.50%
Drosophila: al-PA; 3e-30; 76.39%
EBI UniRef50: UniRef50_O42115; 2e-29; 65.62%; Aristaless-related homeobox protein n=6 Tax=Clupeocephala RepID=ARX_DANRE
NCBI RefSeq: XP_969925.1; 1e-29; 90.77%; PREDICTED: similar to transcription factor protein [Tribolium castaneum]
NCBI nr blastp: gi|348512008; 4e-29; 65.62%; PREDICTED: aristaless-related homeobox protein-like [Oreochromis niloticus]
NCBI nr blastx: gi|301782575; 2e-29; 56.91%; PREDICTED: LOW QUALITY PROTEIN: homeobox protein ARX-like [Ailuropoda melanoleuca]
Group
Gene Ontology:
  GO:0003677; 1.4e-28; DNA binding
  GO:0006355; 1.4e-28; regulation of transcription, DNA-dependent
  GO:0043565; 9.7e-27; sequence-specific DNA binding
  GO:0003700; 9.7e-27; sequence-specific DNA binding transcription factor activity
  GO:0005515; 1.2e-26; protein binding
KEGG pathway 
InterPro domain:
  [75-158] IPR012287; 1.4e-28; Homeodomain-related
  [98-160] IPR001356; 9.7e-27; Homeobox
  [89-167] IPR009057; 1.2e-26; Homeodomain-like
Orthology group: MCL21058; Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS216194-TA
ATGGATGAATCTTTAGACGGACGGGAAGCCGGTGCTGGTGCATCTGGATTCGATTCAGCTGCCATTTTAAACGACGACGTGTCACCAACCACCAGTTTCTGTTACAACATACCAAATCTTTTTAATGGACTTGGACCGGGATCAGTTCTGTGTGAAAGACATGAGACAGATGTCAACGAAAGCTATCCAACAAATACGGAACCTTCAGATGAACCAGCTGATCAACAGTATGAAGATGATGAAGAGTATTCACCTAATTCTAGTACCAAAGGTGATATACCAAAACGTAAACAAAGAAGATACAGAACAACTTTCTCGAATGTTCAACTGGAACAATTGGAAGCTGCCTTTCATAAGACACACTACCCTGACGTCTTCTTCAGAGAGGAGTTGGCGATGAGAATTGACCTGACCGAAGCAAGAGTTCAGGTCTGGTTTCAGAACCGTCGGGCTAAATGGAGGAAACAACAAAAGGCTGGATGTGAGCCTTACCCTCGCTCCACGAGATCCCCTGACCCTCGGTCTCCTTCTGCTCTCGCCCTTGAAATGAGGGATTTTATTACCATACCAGTATCTTCATCTCAAATAGTCAGCCTCGCTGACAACTCCAATCAAAACAGTTACGCTTCCAAGTCAAAACAATCATCTCCCGCGATACATATATCAGCGTTGCCAACTAGGCAGAATTCTCAGGTGGATCTACCCTCGTTTTCGATTCCACTGCAATATAACTTAGGATCGTTTAAGAAAATAGGAGAAGATAACCAGTTAGACTTAGACACTTCGACGTCAGATGTACAAAACCACTGGGATACTCGATTGATGCCGAATTTCAATATAATGAATATAAAACCGATGTTAGAAAATCGAGAAAATCTCGAAATGTCCGAAGTGAAATATGAAGTGAAAAACGAAAATGTGAGATCATACAATGATATGTCAAATTCGATTCCGGCTACGATGGAACATGAAGACATATTGGGCAAAGAAGCCGTGAGTGAAGGTCAAACTATATCACCGGATTTTGAGATTGGGAGATACCAGAGTTACCACAGCGAAAATGAAAAGAGATATAGAGAAACTGGCTTCGATGAATCTATAATAGAGGATGGTAGAAATGACTTCGGGTGCGACGGTGACTTTCTCTGTAAAAGTAACTATGATAAAATTGATGAAAAAAGAAACGACTTTGAAGAATGCCATCTATTGGACGCGGACAGCAGCGCTGTCGGGATGAGTAATTTTGAAAGCAATCTTTAA

Protein sequence:

>DPOGS216194-PA
MDESLDGREAGAGASGFDSAAILNDDVSPTTSFCYNIPNLFNGLGPGSVLCERHETDVNESYPTNTEPSDEPADQQYEDDEEYSPNSSTKGDIPKRKQRRYRTTFSNVQLEQLEAAFHKTHYPDVFFREELAMRIDLTEARVQVWFQNRRAKWRKQQKAGCEPYPRSTRSPDPRSPSALALEMRDFITIPVSSSQIVSLADNSNQNSYASKSKQSSPAIHISALPTRQNSQVDLPSFSIPLQYNLGSFKKIGEDNQLDLDTSTSDVQNHWDTRLMPNFNIMNIKPMLENRENLEMSEVKYEVKNENVRSYNDMSNSIPATMEHEDILGKEAVSEGQTISPDFEIGRYQSYHSENEKRYRETGFDESIIEDGRNDFGCDGDFLCKSNYDKIDEKRNDFEECHLLDADSSAVGMSNFESNL-
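
The transcript and protein lengths listed for this record can be cross-checked against each other. The sketch below is illustrative only and is not part of the database. It assumes the 1260 bp transcript is a pure coding sequence that includes the stop codon, and that the trailing "-" in the protein sequence marks that stop. The helper names `cds_length_bp` and `translate_prefix` are hypothetical, and the codon table is deliberately partial, covering only the codons used here.

```python
# Partial standard codon table: only the codons needed for this illustration.
# "-" is used for the stop codon, mirroring the record's protein sequence.
CODONS = {
    "ATG": "M", "GAT": "D", "GAA": "E",
    "TCT": "S", "TTA": "L", "GAC": "D",
    "TAA": "-",
}


def cds_length_bp(protein_aa, has_stop=True):
    """Expected coding-sequence length in bp for a protein of protein_aa residues."""
    return 3 * (protein_aa + (1 if has_stop else 0))


def translate_prefix(nt):
    """Translate complete codons of nt using the partial table above.

    Raises KeyError for any codon that is not in the partial table.
    """
    usable = len(nt) - len(nt) % 3
    return "".join(CODONS[nt[i:i + 3]] for i in range(0, usable, 3))


# 419 residues plus one stop codon should account for the full transcript.
print(cds_length_bp(419))

# The first six codons of DPOGS216194-TA, copied from the record above.
print(translate_prefix("ATGGATGAATCTTTAGAC"))
```

Under these assumptions the first call gives 1260, matching the listed transcript length, and the second gives MDESLD, matching the start of DPOGS216194-PA.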