Monarch geneset OGS2.0

DPOGS207206
Transcript: DPOGS207206-TA (1071 bp)
Protein: DPOGS207206-PA (356 aa)
Genomic position: DPSCF300001 + 5823266-5825250
RNAseq coverage: 126x (Rank: top 57%)
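
The lengths in this summary can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes the transcript length counts the coding sequence including the stop codon, and that the genomic coordinates are 1-based and inclusive. Neither convention is stated in the record.

```python
# Values copied from the record summary.
transcript_bp = 1071
protein_aa = 356
genomic_start, genomic_end = 5823266, 5825250

# Assumption: the transcript is the CDS, so it needs 3 nt per residue
# plus 3 nt for the stop codon.
expected_cds_bp = 3 * (protein_aa + 1)

# Assumption: coordinates are 1-based and inclusive.
genomic_span_bp = genomic_end - genomic_start + 1

print(expected_cds_bp == transcript_bp)  # True
print(genomic_span_bp)                   # 1985
print(genomic_span_bp - transcript_bp)   # 914
```

Under these assumptions the genomic span is 914 bp longer than the transcript, which would be consistent with intronic sequence, although the record itself does not list the exon structure.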
Annotation
Heliconius      HMEL009331        1e-48  98.85%
Bombyx          BGIBMGA014580-TA  3e-48  100.00%
Drosophila      His4r-PA          4e-48  100.00%
EBI UniRef50    UniRef50_G3HPV7   3e-47  98.85%   Histone H4 n=26 Tax=Eukaryota RepID=G3HPV7_CRIGR
NCBI RefSeq     XP_002075990.1    2e-47  100.00%  GK12392 [Drosophila willistoni]
NCBI nr blastp  gi|194772468      7e-47  100.00%  GF20391 [Drosophila ananassae]
NCBI nr blastx  gi|224495114      3e-69  67.13%   insect intestinal mucin 2 [Mamestra configurata]
Group
Gene Ontology    GO:0005634  8.8e-86  nucleus
                 GO:0003677  8.8e-86  DNA binding
                 GO:0006334  8.8e-86  nucleosome assembly
                 GO:0000786  8.8e-86  nucleosome
                 GO:0008061  2.6e-10  chitin binding
                 GO:0006030  2.6e-10  chitin metabolic process
                 GO:0005576  2.6e-10  extracellular region
KEGG pathway     dan:Dana_GF20391  1e-47
                 K11254 (H4) maps-> Systemic lupus erythematosus
InterPro domain  [1-87]     IPR001951  8.8e-86  Histone H4
                 [3-88]     IPR009072  2.2e-41  Histone-fold
                 [28-88]    IPR007125  3.1e-12  Histone core
                 [299-352]  IPR002557  2.6e-10  Chitin binding domain
Orthology group 
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207206-TA
ATGACCGGTCGCGGTAAAGGAGGAAAGGGTCTGGGAAAAGGTGGAGCCAAGCGACACAGGAAAGTACTCCGTGATAACATCCAGGGTATCACCAAACCGGCCATACGTCGTCTGGCACGCAGAGGCGGCGTCAAACGTATCTCCGGTCTGATATACGAAGAGACCAGAGGCGTTCTCAAAGTATTCCTAGAGAACGTCATCCGCGACGCCGTCACGTACACCGAGCACGCGAAGAGGAAGACCGTCACAGCCATGGACGTCCAGAAGAGGAACAACAACAACAACAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCAACACCGACACCGACAACATCATCATGACGATATCATCTCATCGTCACGAAGTGATCGATGATACTACCACAACCACACTAACAACAACAACAACTACTACTACTACTACGACGCCGGCGCCAACAACCACTACTACAACTACCACAACGCCGGCACCTACAACCACGACCACAACAACAACGACCCCTGCGCCCACAACCACAACCACCACAACTACACCTGCACCCACAACCACGACAACCACAACTACACCTGCGCCCACAACCACGACAACTACAACGACACCAGCACCCACAACTACTATAACCACAACAACTCGCAAGTGCAGACCCAGAACTACGACAACACCTGTACCGACATCAACAACAACAGTGGCACCGACTGAACCAGATTTTTTGGAAAACGGTTGCCCAGTAAATCCACACATACATTGGCTGCTACCTCATGAGTCGGATTGTAATTTGTTCTACTACTGTGTTTGGGGAAGATTAGTGCTACGGCAATGTCCTGCAACTCTACACTTCAATAGAGTTATACAGGTAAGAGATCTGAAATAA

Protein sequence:

>DPOGS207206-PA
MTGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVQKRNNNNNNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTNTDTDNIIMTISSHRHEVIDDTTTTTLTTTTTTTTTTTPAPTTTTTTTTTPAPTTTTTTTTTPAPTTTTTTTTPAPTTTTTTTTPAPTTTTTTTTPAPTTTITTTTRKCRPRTTTTPVPTSTTTVAPTEPDFLENGCPVNPHIHWLLPHESDCNLFYYCVWGRLVLRQCPATLHFNRVIQVRDLK-
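
The correspondence between the two sequences above can be spot-checked by translating a few codons. This is a minimal sketch, not part of the record: it assumes the standard genetic code (NCBI translation table 1), and `translate` is a hypothetical helper name. It prints the stop codon as `*`, whereas the record marks it with `-`.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence in frame 0; unknown codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


# First six codons and last five codons of DPOGS207206-TA, copied from above.
print(translate("ATGACCGGTCGCGGTAAA"))  # MTGRGK
print(translate("AGAGATCTGAAATAA"))     # RDLK*
```

Both fragments agree with the start (`MTGRGK`) and the end (`RDLK-`) of DPOGS207206-PA.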