Monarch geneset OGS2.0

DPOGS211850
Transcript: DPOGS211850-TA, 1131 bp
Protein: DPOGS211850-PA, 376 aa
Genomic position: DPSCF300011 - 1256142-1273339
RNAseq coverage: 36x (Rank: top 74%)
Annotation
Heliconius: HMEL021368, 1e-80, 82.46%
Bombyx: BGIBMGA001224-TA, 2e-156, 84.09%
Drosophila: Scgdelta-PD, 7e-74, 51.65%
EBI UniRef50: UniRef50_D2A3J9, 7e-75, 51.76%, Putative uncharacterized protein GLEAN_07505 n=3 Tax=Tribolium castaneum RepID=D2A3J9_TRICA
NCBI RefSeq: XP_001869861.1, 7e-85, 55.31%, conserved hypothetical protein [Culex quinquefasciatus]
NCBI nr blastp: gi|170071277, 1e-83, 55.31%, conserved hypothetical protein [Culex quinquefasciatus]
NCBI nr blastx: gi|170071277, 9e-80, 55.93%, conserved hypothetical protein [Culex quinquefasciatus]
Group:
Gene Ontology: GO:0016021, 1.2e-75, integral to membrane
    GO:0007010, 1.2e-75, cytoskeleton organization
    GO:0016012, 1.2e-75, sarcoglycan complex
KEGG pathway: dre:324961, 7e-56
    K12563 (SGCD) maps-> Dilated cardiomyopathy
        Viral myocarditis
        Arrhythmogenic right ventricular cardiomyopathy (ARVC)
        Hypertrophic cardiomyopathy (HCM)
InterPro domain: [103-363] IPR006875, 1.2e-75, Sarcoglycan complex subunit protein
Orthology group: MCL11034, Multiple-copy universal gene

Nucleotide sequence:

>DPOGS211850-TA
ATGGAAGATAAAACCTTGAGCCAGAAACGACACGCGCCGGCGACCAGGAACCACATCGTCTATACCGACCACAAAGGCAGAAATGCTAATGGCACGGCAGCGCACTGTCCGCCACCGTCCGCCACCGTCCGCCACTTCTCGCCACCGACCACCATGATCGGCCACTACGATCCACTGAGAGCCACTGCCGGCCACTGTCCACCATCGATCCCGTACGCGGACAAAATCACCCCGGAGCCGATCCTGAACAAAAACGGAGGGCGCGATACAAAGGCGGACTCCATCAGGAACAGTTATAATAGCCAATTCAAAGTTGGCATTTACGGCTGGAGAAAGAAATGTCTCTACATTCTAGTCATGACGCTGATGCTTATGATGATTGTCAATCTCGCCTTGACGCTGTGGATTCTCAAAGTATTGGATTTCAATTCGGAAGGGATGGGTCAGCTCCGTATAGTGCCGGGTGGGCTGCAGCTGCTGGGCCAGGCTCTCGTGCTGGACTCCCTGTTCGCGTCCAGCATCAAGTCCCGCCGCGGCCAGCCCATCGCCATCGAGTCCTCCAGAAACTTTACGATCTCAACCAGAGACTCGCACGGCATGACACAGACCAGACTATTCTTAGGTCATGATCGTCTAGAAGTGAACGTGGGTAAGCTGGAGGTGCGGGATAGTAGGGGAAGCTTGGTGTTGGGGGCGGAGCGGGGCGCCGTCACCGTGGGCGCTGACAACCTGGTGGTGGCGAGTCCGGCGGGCGCCTCCTTCACCACGGCCGTGCAGACTCCGCTCGTCAAATCGCCACCCTCCAAGCCCTTGACACTGGAGTCACCAACTCGTTCTCTGGAGATGCACGCGGCGCAGAGCATCTCCATGGAGTCTCGTGCCGGAGACATCAGCGCCAGCTGCCTCACCACCTTCAGACTGAGATCCATCGCTGGTGCGATAAGACTGGACGCTCCGAGCATATACATGCCCAAGTTGAAGTCGGCACTACCCCTGCCCCCGTCGGCGCACACCCACGACCCGCATCATCAGAATATCTACCAGCTGTGTGCGTGCGCCAACGGCAAGCTGTTCCTGGCGCCACCTCACGGAGTCTGCGCGGCCAGAGATGAAAGCTTGATCTGCCGATGA

Protein sequence:

>DPOGS211850-PA
MEDKTLSQKRHAPATRNHIVYTDHKGRNANGTAAHCPPPSATVRHFSPPTTMIGHYDPLRATAGHCPPSIPYADKITPEPILNKNGGRDTKADSIRNSYNSQFKVGIYGWRKKCLYILVMTLMLMMIVNLALTLWILKVLDFNSEGMGQLRIVPGGLQLLGQALVLDSLFASSIKSRRGQPIAIESSRNFTISTRDSHGMTQTRLFLGHDRLEVNVGKLEVRDSRGSLVLGAERGAVTVGADNLVVASPAGASFTTAVQTPLVKSPPSKPLTLESPTRSLEMHAAQSISMESRAGDISASCLTTFRLRSIAGAIRLDAPSIYMPKLKSALPLPPSAHTHDPHHQNIYQLCACANGKLFLAPPHGVCAARDESLICR-
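
The transcript and protein lengths in this record (1131 bp and 376 aa) fit a coding sequence of 376 codons plus one stop codon, assuming the trailing "-" in the protein record marks the stop. A minimal Python sketch of that check follows; the function names `parse_fasta` and `lengths_consistent` are hypothetical helpers introduced here, and the toy sequences are placeholders, not the real records.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[header] += line
    return records


def lengths_consistent(nucleotides, protein):
    """True if the CDS length equals 3 * (residues + 1 stop codon)."""
    # Assumption: a trailing '-' or '*' is a stop symbol, not a residue.
    residues = protein.rstrip("-*")
    return len(nucleotides) == 3 * (len(residues) + 1)


# Toy records (not the real sequences), only to show the call pattern.
toy = parse_fasta(">toy-TA\nATGGAATGA\n>toy-PA\nME-\n")
print(lengths_consistent(toy["toy-TA"], toy["toy-PA"]))

# The lengths listed in this record satisfy the same relation.
print(1131 == 3 * (376 + 1))
```

To run the check on the real data, pass the text of the two FASTA records above to `parse_fasta` and compare the `DPOGS211850-TA` and `DPOGS211850-PA` entries.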