Monarch geneset OGS2.0

DPOGS205530
Transcript: DPOGS205530-TA | 921 bp
Protein: DPOGS205530-PA | 306 aa
Genomic position: DPSCF300056 + 263412-269473
RNAseq coverage: 1216x (Rank: top 10%)
Annotation
Heliconius: HMEL011294 | 3e-130 | 68.60%
Bombyx: BGIBMGA000082-TA | 1e-89 | 79.10%
Drosophila: Arp14D-PB | 5e-85 | 87.35%
EBI UniRef50: UniRef50_G3QZJ7 | 5e-131 | 67.05% | Uncharacterized protein n=13 Tax=Eukaryota RepID=G3QZJ7_GORGO
NCBI RefSeq: XP_309181.4 | 3e-85 | 76.62% | AGAP000985-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|354504611 | 4e-112 | 60.12% | PREDICTED: actin-related protein 2-like [Cricetulus griseus]
NCBI nr blastx: gi|354504611 | 1e-109 | 60.12% | PREDICTED: actin-related protein 2-like [Cricetulus griseus]
Group
KEGG pathway: ptr:744996 | 6e-43
K05692 (ACTB_G1) maps to:
    Pathogenic Escherichia coli infection
    Regulation of actin cytoskeleton
    Viral myocarditis
    Bacterial invasion of epithelial cells
    Tight junction
    Adherens junction
    Arrhythmogenic right ventricular cardiomyopathy (ARVC)
    Phototransduction - fly
    Vibrio cholerae infection
    Dilated cardiomyopathy
    Shigellosis
    Leukocyte transendothelial migration
    Hypertrophic cardiomyopathy (HCM)
    Phagosome
    Focal adhesion
InterPro domain: [1-303] IPR004000 | 2.3e-177 | Actin-like
Orthology group: MCL11972 | Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205530-TA
ATGGATGATCAGGGAAGAAAAGTTATCGTTTGTGACAATGGCACAGGTTTCGTAAAATGCGGATACGCCGGCAGTAACTTCCCTGCGTTCATTTTTCCGTCAATGGTCGGCCGGCCGATTATTAGAGCGGAAAATAAAATCGGAGACATTGATGTTAAGGATTTGATGGTCGGAGATGAAGCGTCTCAACTCCGTTCCATGCTCGAAGTCAGTTATCCCATGGAAAATGGCGTGGTTAGAAACTGGGAGGATATGTGTCACGTCTGGGATTACACATTTGGTCCCAGCAAGATGAATATTGACCCCAAGGACACCAAAATACTGCTCACCGAGCCACCGATGAACCCCACTAAAAATAGAGAGAAAATGATTGAGGTTATGTTTGAGAAGTACGGTTTCGACAGCGCCTACATAGCCATACAAGCGGTGCTCACGCTGTACGCCCAGGGACTGATATCGGGAGTGGTGGTGGACTCAGACGGGAGGGTCATCAAAGTGGGGGGAGAACGGTTTGAAGCACCCGAGGCCCTGTTCCAGCCGCACCTCATCAATGTCGAGGGTCAAGGGATCGCTGAACTTGTCTTCAACACTATACAGGCGGCGGACATAGACATGAGGAACGAGTTGTACAAACACATCGTGCTGTCCGGGGGCTCCACCATGTACCCGGGGCTGCCCTCGCGGCTCGAGAGGGAGATCAAGCAACTGTACCTGGAGCGAGTGCTGAGGAACGACTCGGAGAAGCTGGCGAAATTCAAGATCCGTATTGAGGATCCGCCGCGCCGTAAGGACATGGTGTTCATCGGCGGCGCCGTGCTGGCCGAGGTGTGCAAGAACAGGGACAACTTCTGGCTCACCAGGAGCGACTACATGGAACAGGGCATTTCCTGTCTCCGCAAGCTCGGACCTCGGGCCACTTAA

Protein sequence:

>DPOGS205530-PA
MDDQGRKVIVCDNGTGFVKCGYAGSNFPAFIFPSMVGRPIIRAENKIGDIDVKDLMVGDEASQLRSMLEVSYPMENGVVRNWEDMCHVWDYTFGPSKMNIDPKDTKILLTEPPMNPTKNREKMIEVMFEKYGFDSAYIAIQAVLTLYAQGLISGVVVDSDGRVIKVGGERFEAPEALFQPHLINVEGQGIAELVFNTIQAADIDMRNELYKHIVLSGGSTMYPGLPSRLEREIKQLYLERVLRNDSEKLAKFKIRIEDPPRRKDMVFIGGAVLAEVCKNRDNFWLTRSDYMEQGISCLRKLGPRAT-
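
The two sequence records are related by translation: 921 bp is 307 codons, which gives 306 amino acids plus a stop, shown as the trailing "-" in the protein record. A minimal Python sketch of that check follows. It assumes the standard genetic code (NCBI translation table 1) applies to this nuclear gene, and the names CODON_TABLE and translate are hypothetical helpers, not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons enumerated in TCAG order.
# Assumption: the record uses the standard nuclear code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol, matching the trailing "-"
    used in the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(CODON_TABLE[c].replace("*", stop_symbol) for c in codons)


# First eight codons of DPOGS205530-TA followed by its stop codon (TAA).
print(translate("ATGGATGATCAGGGAAGAAAAGTTTAA"))  # MDDQGRKV-
```

Passing the full DPOGS205530-TA sequence to translate and comparing the result with the DPOGS205530-PA record would confirm that the two entries agree.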