Monarch geneset OGS2.0

DPOGS203985
Transcript: DPOGS203985-TA (1080 bp)
Protein: DPOGS203985-PA (359 aa)
Genomic position: DPSCF300005 + 1206170-1216495
RNAseq coverage: 40x (Rank: top 72%)
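The lengths and coordinates listed for this record can be cross-checked with simple arithmetic. The sketch below assumes that the genomic coordinates are 1-based and inclusive and that the transcript length counts the stop codon; neither is stated in the record itself. The function names are illustrative only.

```python
def genomic_span(position: str) -> int:
    """Length of a 'start-end' interval, assuming 1-based inclusive coordinates."""
    start, end = (int(x) for x in position.split("-"))
    return end - start + 1


def cds_length(protein_aa: int) -> int:
    """Coding length in bp for a protein of the given size, plus one stop codon."""
    return 3 * (protein_aa + 1)


print(genomic_span("1206170-1216495"))  # 10326
print(cds_length(359))                  # 1080, matching the listed transcript length
```

Under these assumptions the locus spans 10326 bp on the scaffold while the transcript is 1080 bp, which would be consistent with the gene containing introns.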
Annotation
Heliconius      HMEL008404        0.0     87.47%
Bombyx          BGIBMGA002127-TA  1e-60   76.00%
Drosophila      ap-PA             1e-79   45.14%
EBI UniRef50    UniRef50_E2B1F1   1e-104  57.30%  LIM/homeobox protein Lhx9 n=16 Tax=Neoptera RepID=E2B1F1_CAMFO
NCBI RefSeq     NP_001139388.1    9e-115  58.76%  apterous [Tribolium castaneum]
NCBI nr blastp  gi|328925124      2e-164  76.58%  apterous B alpha [Bombyx mori]
NCBI nr blastx  gi|328925124      1e-164  77.22%  apterous B alpha [Bombyx mori]
Group
Gene Ontology
  GO:0003677  5e-24    DNA binding
  GO:0006355  5e-24    regulation of transcription, DNA-dependent
  GO:0043565  3.9e-23  sequence-specific DNA binding
  GO:0003700  3.9e-23  sequence-specific DNA binding transcription factor activity
  GO:0005515  9.7e-22  protein binding
  GO:0008270  2.1e-17  zinc ion binding
KEGG pathway
InterPro domain
  [235-301]  IPR012287  5e-24    Homeodomain-related
  [236-298]  IPR001356  3.9e-23  Homeobox
  [220-293]  IPR009057  9.7e-22  Homeodomain-like
  [79-133]   IPR001781  2.1e-17  Zinc finger, LIM-type
Orthology group: MCL12901 (Multiple-copy universal gene)
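The bracketed ranges in the InterPro rows read as residue positions on the 359 aa protein. A minimal sketch for extracting a domain from the protein sequence, assuming those positions are 1-based and inclusive (an assumption, not stated in the record); `domain_slice` is a hypothetical helper name.

```python
def domain_slice(protein: str, start: int, end: int) -> str:
    """Return residues start..end of a protein, assuming 1-based inclusive positions."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(f"range [{start}-{end}] is outside a {len(protein)} aa sequence")
    # Python slices are 0-based and end-exclusive, so shift only the start.
    return protein[start - 1:end]


# Placeholder sequence of the right length; substitute the real protein string.
placeholder = "X" * 359
print(len(domain_slice(placeholder, 79, 133)))   # 55 residues for the LIM-type range
print(len(domain_slice(placeholder, 236, 298)))  # 63 residues for the Homeobox range
```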

Nucleotide sequence:

>DPOGS203985-TA
ATGCTTAAGGAAAGAGAAAGCTCCGAGGGGTCGCCTGCTACCCCGGACGAGTGCGCGGGTTGTGGGGGCAGAATACAGGACAGATATTACCTACTTGCTGTGGACCGTCAATGGCACGGTGCTTGTCTGCGCTGCTGCGAGTGCCGACTGCCGCTAGACAGCGAACTAACATGCTTCTCCAGAGACGGAAACATTTATTGCAAAGACGATTATTACAGATTATTCTGCGTGAAGCGATGCGCGAGATGCGGTAACGGGATTACCGCTAACGAGCTGGTAATGAGAGCCCGAGACATGGTGTTTCACCTGACTTGCTTCACATGTGTCGCCTGCGGGACCCTGTTGTCTAAGGGAGATGTGTTCGGGATGAGGAACAGCCTGGTGTACTGCAGACCGCACTACGATAGCGTCTGCATGGATGACTTCTGTGAGGAAGACGTCAATAGTGTTTACAGGTGTCAAGAATTGAACAGCGAAGGTGACTCTCCGAATCAGTACTTTCCCGTAGGCGTTAACCAGAAGGGTCGGCCGAGAAAGAGGAAGATAGCCCACGGTCCTCATGAAGACATGCAAGTACAGACCATGAGAATGGCCAGCACGGCGTTAGACATCCTTCACCGGGCTGACCTATCATCGTCAATGGAGTCCTTGGCTTACGATTCTTCGGTTGCATCACCGGGAAGTGTTTCAAGTCATACACAGCGAACTAAGCGCATGCGCACCAGCTTTAAACATCACCAACTTCGCACGATGAAATCGTATTTCGCCATTAACCAGAACCCAGATGCAAAGGATCTTAAGCAATTGGCTCAGAAGACTGGCTTATCTAAGAGAGTTTTACAGGTTTGGTTTCAAAATGCTCGAGCGAAATGGCGTAGAAATATGATGAGACAGGAATCGAATCAGCTTGGACTGATGACTCCTAATGGAAGCACTGGACACTCTGTAAATGGTGGTCTTGTCACAGGAGTTCCTCCACCAAATGTAGATCCTGGAATGATAATGTCTGAACCATTGCAACCCATACAAGACATTAGGGTTCACACTCCTCATCCAATGAGCTTCGGAGAAATGTACTGA

Protein sequence:

>DPOGS203985-PA
MLKERESSEGSPATPDECAGCGGRIQDRYYLLAVDRQWHGACLRCCECRLPLDSELTCFSRDGNIYCKDDYYRLFCVKRCARCGNGITANELVMRARDMVFHLTCFTCVACGTLLSKGDVFGMRNSLVYCRPHYDSVCMDDFCEEDVNSVYRCQELNSEGDSPNQYFPVGVNQKGRPRKRKIAHGPHEDMQVQTMRMASTALDILHRADLSSSMESLAYDSSVASPGSVSSHTQRTKRMRTSFKHHQLRTMKSYFAINQNPDAKDLKQLAQKTGLSKRVLQVWFQNARAKWRRNMMRQESNQLGLMTPNGSTGHSVNGGLVTGVPPPNVDPGMIMSEPLQPIQDIRVHTPHPMSFGEMY-
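The protein record can be checked against the nucleotide record by translating the coding sequence. The sketch below assumes the standard genetic code and reading frame 1, and writes the stop codon as "-" to match the trailing character of the protein sequence above; `translate` is an illustrative helper, not part of any database tooling.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "-" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY--CC-W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1; the length must be a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 30 nt and last 15 nt of DPOGS203985-TA as given above.
print(translate("ATGCTTAAGGAAAGAGAAAGCTCCGAGGGG"))  # MLKERESSEG
print(translate("GGAGAAATGTACTGA"))                 # GEMY-
```

Passing the full DPOGS203985-TA sequence to `translate` and comparing the result with DPOGS203985-PA extends the same check to the whole record.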