Monarch geneset OGS2.0

DPOGS204692
Transcript        DPOGS204692-TA  1665 bp
Protein           DPOGS204692-PA  554 aa
Genomic position  DPSCF300170 + 307910-345704
RNAseq coverage   113x (Rank: top 59%)
Annotation
Heliconius        HMEL016251        2e-36  87.23%
Bombyx            BGIBMGA007471-TA  3e-32  91.43%
Drosophila        al-PA             2e-26  74.29%
EBI UniRef50      UniRef50_E0VGP1   1e-61  83.21%  Homeobox protein arx, putative n=1 Tax=Pediculus humanus corporis RepID=E0VGP1_PEDHC
NCBI RefSeq       XP_002425285.1    2e-62  83.21%  homeobox protein arx, putative [Pediculus humanus corporis]
NCBI nr blastp    gi|242009008      4e-61  83.21%  homeobox protein arx, putative [Pediculus humanus corporis]
NCBI nr blastx    gi|195117252      8e-71  42.17%  GI23866 [Drosophila mojavensis]
Group
Gene Ontology     GO:0003677  2.1e-28  DNA binding
                  GO:0006355  2.1e-28  regulation of transcription, DNA-dependent
                  GO:0043565  1.1e-26  sequence-specific DNA binding
                  GO:0003700  1.1e-26  sequence-specific DNA binding transcription factor activity
                  GO:0005515  1.5e-24  protein binding
KEGG pathway 
InterPro domain   [146-215]  IPR012287  2.1e-28  Homeodomain-related
                  [156-218]  IPR001356  1.1e-26  Homeobox
                  [130-216]  IPR009057  1.5e-24  Homeodomain-like
Orthology group   MCL16824  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204692-TA
ATGCAAATCGTGTCGGAGAGACCCGTCCGCGACGAGACTCCGTCTCGTTCATGCATCTATCTCGAGATCGTCGCCTCTCTAGCGCCGGATGCTGGCATCTCGGCGTGTGGGAATGAGATGCATCCAGCCTTTCGGACGTCACACGCCAGGCTTCTGTCCGCAGTCGGTCTCGCGCCGTGGTCGCGGGTGCGTGTGTCGGGCTATGTCAGGGTGAGCGCGCTTCTTGTCACGAGTAAGGCGGGAAAACTTCCCGCCCACAACGCAGCAACAATATTTTCAGACGCTAATAAGCTGGCGAGTGGGCCGGGCGGTGGGCGCGGGTTGTTCTGCTATCATTGCCCGCCGAGCCTGCCCCCGCACCAGCACCGTCTTCCAACCCTGGAGTACCCCTTCACAGCATCACATCCCTACACCAGCTATTCCTACCACCCCGCCATCCACGATGACACTTTCGTTAGACGCAAACAGAGACGAAACAGAACCACCTTTACATTACAGCAGCTGGAAGAGCTGGAGACGGCGTTTGCACAGACGCATTACCCGGATGTGTTCACTAGAGAGGATCTAGCACTCAAGATAAACCTCACCGAAGCTAGAGTTCAGGTTTGGTTTCAAAACAGACGGGCTAAGTGGCGGAAAGCGGAAAGACTAAAAGAGGAACAGCGCAAACGAGAGGGAGCTGAAGTTTTGGCTAAGAGGGATCCAGCGGATGATAAGGGTTCTTCGGAGTGCGGAATGTCTCGAGGGTCTGGGGATGCATCTCCAATGTCAACTGGCGTGTCCCCTCGCGCGTCTCCCCCGGTAACGCCAGGGTCACCTCGTCGTTCCCCCCACCGTTCACCCAATAGGTCACCAAGATTGGAAAGATCTGAGACCTGTGCCTCTCCCGCTCCTTCGGTTGGCAGCGCAGGTTCCCGCGAGCCAGACCCTCGCCCGCCGCACAACATCTTCTCTCCTTTCGATCATTATGAAGACGGCTCGCAGAAACGTCGCACTAGTTCCCTTTATCCAGAGGAATCTAGCATCAAATGTTTGGACTCAAAGCCTATTTTGGCGTACAGTAAACCAGTCTGCGGCCGGTTGTACAGCAATTACTTCATTAAGGGTACTGCGAGCGTTAGTCTAATGAACGGTAATTATGTTGCGCTGACGCTCGTAAATGAGCGGTGGAGCGAAGGGAGTAAACAAGCAAATCTCCAAACGAGTAGCGAAGAACGTCAAAGGCACGAGCGATTCCCGCCAAATTGCATTTACGCGTCCCAGAGGCAGGGCCTTATCGAATTCTCTCTCAAAGAGCCGCCATTTTTAAAAGTTCCTACTCTCAGTGGAGCGTTCCGTTCATCAGCCCCAGGCGGTGCTGACCCGCCTCCGCTGTTCCTGCCTCCCCATCTCTCTCATCTCTCGCAGCATCTCAACCATCTATCGCAGCCTTTCTTCCCGTTAAAAGGTTGGGGAGCACCTTGCCCGTGTTGTCCCAAAGAAGAAGCTCGCTCAACCAGCGTGGCTGAGTTGAGACGTAAAGCTCACGAACATTCCGCTGCGTTACTGCAATCGCTAGCAAATTTCCAGTCGCGAGCGTTCCCGCTTCCGCTCCCGCTGCCGCCTCTGCCGCTCCCGCTGTTACACGAGCCACCGCCGTCGGAACCTCCCAAACATCTCGAATAA

Protein sequence:

>DPOGS204692-PA
MQIVSERPVRDETPSRSCIYLEIVASLAPDAGISACGNEMHPAFRTSHARLLSAVGLAPWSRVRVSGYVRVSALLVTSKAGKLPAHNAATIFSDANKLASGPGGGRGLFCYHCPPSLPPHQHRLPTLEYPFTASHPYTSYSYHPAIHDDTFVRRKQRRNRTTFTLQQLEELETAFAQTHYPDVFTREDLALKINLTEARVQVWFQNRRAKWRKAERLKEEQRKREGAEVLAKRDPADDKGSSECGMSRGSGDASPMSTGVSPRASPPVTPGSPRRSPHRSPNRSPRLERSETCASPAPSVGSAGSREPDPRPPHNIFSPFDHYEDGSQKRRTSSLYPEESSIKCLDSKPILAYSKPVCGRLYSNYFIKGTASVSLMNGNYVALTLVNERWSEGSKQANLQTSSEERQRHERFPPNCIYASQRQGLIEFSLKEPPFLKVPTLSGAFRSSAPGGADPPPLFLPPHLSHLSQHLNHLSQPFFPLKGWGAPCPCCPKEEARSTSVAELRRKAHEHSAALLQSLANFQSRAFPLPLPLPPLPLPLLHEPPPSEPPKHLE-