Monarch geneset OGS2.0

DPOGS200164
Transcript: DPOGS200164-TA (1086 bp)
Protein: DPOGS200164-PA (361 aa)
Genomic position: DPSCF300128 + 271837-273615
RNAseq coverage: 3336x (Rank: top 4%)
Annotation
Heliconius       HMEL005715              2e-63  50.35%
Bombyx           BGIBMGA002915-TA        1e-55  65.95%
Drosophila       Xbp1-PC                 3e-22  63.64%
EBI UniRef50     UniRef50_UPI00022CA93E  6e-30  32.06%  UPI00022CA93E related cluster n=1 Tax=unknown RepID=UPI00022CA93E
NCBI RefSeq      XP_392383.2             2e-24  30.63%  PREDICTED: similar to X box binding protein-1 CG9415-PB, isoform B [Apis mellifera]
NCBI nr blastp   gi|350421221            2e-29  32.06%  PREDICTED: hypothetical protein LOC100747421 [Bombus impatiens]
NCBI nr blastx   gi|270014168            3e-31  34.82%  hypothetical protein TcasGA2_TC012878 [Tribolium castaneum]
Group
Gene Ontology:
  GO:0006355  4.9e-15  regulation of transcription, DNA-dependent
  GO:0043565  4.9e-15  sequence-specific DNA binding
  GO:0003700  4.9e-15  sequence-specific DNA binding transcription factor activity
  GO:0046983  1.1e-13  protein dimerization activity
KEGG pathway: ame:408853  5e-24
  K09027 (XBP1) maps to: Protein processing in endoplasmic reticulum
InterPro domain:
  [42-106]  IPR004827  4.9e-15  Basic-leucine zipper (bZIP) transcription factor
  [43-94]   IPR011700  1.1e-13  Basic leucine zipper
Orthology group: MCL18056 (Insect specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200164-TA
ATGAGCGCTCCGATAATCATAACTGTGCCTAATAATTATCTGGCGGTGGACGATGTGGAGTCGAAGGTGGTTCTCGATGTGTCTCCTAGTCCACCGTCCAGGAAAAGGAGGCTGGACCATCTAACATGGGAAGAAAAGATGCAAAGGAAGAAACTCAAGAACAGAGTTGCAGCTCAGACATCGCGCGACCGGAAGAAGGCGAAGATGGATGAAATGGAGGGTCGTATCAAGCACTTCATGGACTTAAACGAGCGGCTTCTTGGTGAGGTGGAGAACCTAAAGGCGATGAACGAGCGGCTTCTGAGTGAGAACTCAGCCCTGCGCGAGGCGGCGAGGAGCGTCGCGGTGGCCCCGAGACCAGCAGAGTCTCATCCTCAGCAGAAGGTGGGGCCCCTGTCGGCACTCAACGCGGCTCGTCTAGTGATGCTGATGTATGTGCTCTCTCAGAACTCCTGCAACACTTGGACTCCCCCGAGTATTTGGACACCCTCCACCAACTTGCAGATCAATTACTCCAAGAAATTGATGGAAAAACTGCAGGAGAAGCTGCCGATCAGCAATAGACATTGTCCTGAAAGAAATGAAGTGGTGGGGTCCACAGCAGAACAATTGGAATCCTGTCAAAGTAGAGACATAGACACTCTCAAGCATGATATAGACAGGATACTGTTACAGCACTCATATGCACATCCCTATCCTAAGACAGACACTATTAAAGAAGAAAACGGGGACAAGGGTGACTTATTCTATGCAAGCTACGAAGCAAATGATTGTGTGACAATTGAAGTTCCTTGCGAGGAACAAACAGAAGAATCGGCTCCAATAAAATTGGACACTGATTTTAATAAATTTACGGACGACTGTTTGGATGTCACATTGGAATCTGATATGAAGTTATTGTCACCTCTGCCTATGTCAATAAAATCTGTGGATGAAAATGTATTGGCAGTGTCCCCGTCACACAGTAACTTGAGCTCTGACATGGGCTACGAGTCACTCTCCTCCCCGCTCAGTGAACCTGAGTCTATGGATCTGTCAGATTTTTGGTGTGAATCATTCCCGGAACTGTTCCCGGACCTGGTGTAA

Protein sequence:

>DPOGS200164-PA
MSAPIIITVPNNYLAVDDVESKVVLDVSPSPPSRKRRLDHLTWEEKMQRKKLKNRVAAQTSRDRKKAKMDEMEGRIKHFMDLNERLLGEVENLKAMNERLLSENSALREAARSVAVAPRPAESHPQQKVGPLSALNAARLVMLMYVLSQNSCNTWTPPSIWTPSTNLQINYSKKLMEKLQEKLPISNRHCPERNEVVGSTAEQLESCQSRDIDTLKHDIDRILLQHSYAHPYPKTDTIKEENGDKGDLFYASYEANDCVTIEVPCEEQTEESAPIKLDTDFNKFTDDCLDVTLESDMKLLSPLPMSIKSVDENVLAVSPSHSNLSSDMGYESLSSPLSEPESMDLSDFWCESFPELFPDLV-
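
Consistency check (illustrative, not part of the original record): the transcript length (1086 bp) and the protein length (361 aa) agree if the transcript is a complete coding sequence ending in a stop codon, since 3 x (361 + 1) = 1086. The sketch below assumes the standard genetic code (NCBI translation table 1) and writes the stop codon as "-", matching the trailing "-" in the protein record above. The function name `translate` and its parameters are hypothetical helpers, not taken from any tool used to build the gene set.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon (assumes frame 0, no introns)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # KeyError on ambiguous bases such as N
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First four codons and last four codons of DPOGS200164-TA, copied from the record.
print(translate("ATGAGCGCTCCG"))   # MSAP
print(translate("GACCTGGTGTAA"))   # DLV-
```

Passing the full DPOGS200164-TA sequence to `translate` and comparing the result with the DPOGS200164-PA entry would extend the same check to the whole record.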