Monarch geneset OGS2.0

DPOGS203016
Transcript        DPOGS203016-TA  861 bp
Protein           DPOGS203016-PA  286 aa
Genomic position  DPSCF300068 + 262341-271786
RNAseq coverage   720x (Rank: top 18%)
Annotation
Heliconius       HMEL011029        4e-134  97.13%
Bombyx           BGIBMGA003874-TA  6e-117  77.70%
Drosophila       Pdp1-PD           2e-87   71.13%
EBI UniRef50     UniRef50_Q8SZT1   2e-85   71.13%  GH27708p n=35 Tax=Eumetazoa RepID=Q8SZT1_DROME
NCBI RefSeq      XP_001602868.1    1e-92   74.80%  PREDICTED: similar to CG17888-PC [Nasonia vitripennis]
NCBI nr blastp   gi|380028288      8e-95   75.20%  PREDICTED: hepatic leukemia factor-like [Apis florea]
NCBI nr blastx   gi|380028288      7e-100  69.59%  PREDICTED: hepatic leukemia factor-like [Apis florea]
Group
Gene Ontology    GO:0006355  4e-17  regulation of transcription, DNA-dependent
                 GO:0043565  4e-17  sequence-specific DNA binding
                 GO:0003700  4e-17  sequence-specific DNA binding transcription factor activity
                 GO:0046983  4e-17  protein dimerization activity
KEGG pathway     ame:408449  2e-88
                 K09057 (HLF) maps-> Circadian rhythm - fly
InterPro domain  [219-272]  IPR011700  4e-17    Basic leucine zipper
                 [218-282]  IPR004827  2.8e-09  Basic-leucine zipper (bZIP) transcription factor
Orthology group  MCL12326  Single-copy universal gene

Nucleotide sequence:

>DPOGS203016-TA
ATGTCATTAGCAAGGGGTCCGAGGCTCGAAGGCCTCTGTGGGGGGGCGGCGGTGGGGCGGCAGCCGCCCCCGCAGCTGCTGCCGCCGGCGCCGCCGCCGCCGCTGCCCTCGCCTCCGCAACTGCAGCACGGCCACGTCCAGGACGATGATAGGTGGAGTCAGTACCACATTTGGAGGCAGCACGTTTTTGTCAATGGCAAGTCGGTGAACAATAAAGATCCCCTGGAAGACAAGAAAGATGACAGCGAGTTGTGGGAGGCGCAGGCCGCCTTCCTGGGACCCAACCTCTGGGACAAGACGCTACCCTACGACCCAGATCTCAAGTACGTGGACTTAGACGAGTTCCTGTCCGAGAATGGCATGCCCGGAGAGGGTTTGGGCGGGGCGCATCTTGGCAGCTCGGCGTTCGGCGCTGGCGCGGGCCTGGGGCTCGGCCTGCAGGCGCCCGTCACCAAGCGCGAGCGCTCGCCCTCGCCCTCCGACTGCATGAGCCCTGACACCATCAACCCGCCACTATCGCCAGCCGACTCCACATTCTCTATGGCGTCGTCCGGTCGTGACTTTGATCCCCGCACAAGGGCATTCTCAGACGAGGAGCTGAAGCCCCAGCCCATGATCAAAAAGTCCCGGAAACAGTTTGTACCCGATGACCTGAAAGACGACAAGTACTGGGCCCGTCGGCGGAAGAACAACATGGCCGCCAAGAGATCTCGTGACGCTCGCCGCATGAAGGAAAACCAAATCGCTCTGCGGGCTGGCTACCTCGAAAAGGAGAACATGGGTCTACGGCAGGAAGTGGAATTGCTGAAGAAGGAGAACCACATCCTGCGCGAGAAGCTTTCAAAGTACGCGGACGTATAA

Protein sequence:

>DPOGS203016-PA
MSLARGPRLEGLCGGAAVGRQPPPQLLPPAPPPPLPSPPQLQHGHVQDDDRWSQYHIWRQHVFVNGKSVNNKDPLEDKKDDSELWEAQAAFLGPNLWDKTLPYDPDLKYVDLDEFLSENGMPGEGLGGAHLGSSAFGAGAGLGLGLQAPVTKRERSPSPSDCMSPDTINPPLSPADSTFSMASSGRDFDPRTRAFSDEELKPQPMIKKSRKQFVPDDLKDDKYWARRRKNNMAAKRSRDARRMKENQIALRAGYLEKENMGLRQEVELLKKENHILREKLSKYADV-