Monarch geneset OGS2.0

DPOGS210287
Transcript: DPOGS210287-TA (1497 bp)
Protein: DPOGS210287-PA (498 aa)
Genomic position: DPSCF300216 + 320127-325446
RNAseq coverage: 247x (Rank: top 42%)
Annotation
Heliconius: HMEL016977  6e-144  59.15%
Bombyx: BGIBMGA000030-TA  4e-171  64.23%
Drosophila: Scgalpha-PA  6e-72  34.66%
EBI UniRef50: UniRef50_D6X1T3  2e-85  41.16%  Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6X1T3_TRICA
NCBI RefSeq: XP_001811825.1  2e-85  41.98%  PREDICTED: similar to 50-kda dystrophin-associated glycoprotein [Tribolium castaneum]
NCBI nr blastp: gi|270013517  6e-85  41.16%  hypothetical protein TcasGA2_TC012123 [Tribolium castaneum]
NCBI nr blastx: gi|270013517  2e-84  40.44%  hypothetical protein TcasGA2_TC012123 [Tribolium castaneum]
Group
Gene Ontology: GO:0016012  3.6e-76  sarcoglycan complex
  GO:0016020  8.6e-06  membrane
  GO:0005509  8.6e-06  calcium ion binding
KEGG pathway: oaa:100087291  5e-18
  K12565 (SGCA) maps to:
    Dilated cardiomyopathy
    Viral myocarditis
    Arrhythmogenic right ventricular cardiomyopathy (ARVC)
    Hypertrophic cardiomyopathy (HCM)
InterPro domain: [23-379]  IPR008908  3.6e-76  Sarcoglycan alpha/epsilon
  [24-121]  IPR015919  8.6e-06  Cadherin-like
Orthology group: MCL12932  Single-copy universal gene

Nucleotide sequence:

>DPOGS210287-TA
ATGAATAAAATGATGCGACAACCGTTGCTTTGGTTGATATTTATGAGAGCGGTTGTATGTCATGTTTATAATGCTGTCGAAACTGAAATGTTCTCTATACCCATCAGTCCTAACTTATTCAACTGGACTTATCAAGAATTTGATCAGCAGTACCGTTTCCACGCGTCCTTGATCGGTAAACCTGAATTGCCGATATGGCTTCGTTACATCTACAGCGGGCGACATCACTCGGGATTCATTTTTGGCACGCCGCCCCGAAATACTGAATCTCCTATTACGTTAGAGGTGATAGGGTTGAACCGTCAGGACTATGAAACCCGCCGGGTGCTGTTAACCCTGAAGGTTCTTCCCAAGGAGAAGATGGCTCGCCACGAGGTCGAGTTCAAGATAGACAATCTTAATGTTGAAGATCTTCTCGATGAGCATAGAATGAGCCGTCTGAAGGACATACTACGTACTAAACTATGGTTTGAGAGCAGCGAGGATCTGTATCCGACGTTCCTTGCATCAGCTATAGACTTGGGAGCCAGGCTACCGTTGAAGCCCAGCGATGGAGAAGGTCTGGTGATACGTCTGGGTAGTTCTCACCCGTTCTCGTCGGAGATGAAACGTCTCAGAGAGGAGGTACGCCCTCTCAGCAGACTACCCAGCTGTCCGAGGGAATACAAGAGAACAACCGTGGAGAGACTGTTCAGAGACGCCGGCTTCACACTGGACTGGTGTAACTTTGAGCTGTACAATACAATATACGGTCCACGGTCCACGGATCACTTGGAATACTTAACTGAGATTCCTTCACCCATAAATCGCGTCCGATCTGAAAGTCGCGAAGTGTGGACGGCGCCTAACAAGCAATCCTTGCCGACGAGGAGTTACGCGAAACAATTGACAGCAGCGATAGTGGGACCGTTGATTTTGCTGCTGCTATCGGTAGCAGCACTAACCGGTGTGCTGTGCTTCCATTATGCTGCTATAAGAGATCCCGAGTCAGACGTATTCTTAGACGGCATTTACCATATATGCGAAGATTACAGAAACCGAAGAGCGCACAAGTCATCTAACGTTGAAATATGCAAATATGGTACTAGCAACACAGAGCAAACTCAATTAGCTGACAACACCAGCACTAAAAGTTTAGGAATCAGTCCAAGCAGCAGTCTAGCGCGGCCCTACAGTCCTAAATCGACGACAAACTTAGCCGGCAGCTACAACCGACCTCAACCACCGCCGTACGGGACCCTCCATCATAGGAAACTGGACAAAACACCCGACAAAAGGTCGCGTTCACTGGAAGAATCATTAAAATTATTAAACGAAGCCAACATAGCTACGGAGTACGAGAGGAATCCGATCATAGACTACGCCGACAGCACGGACGATTACATATCAATAAAACCTGACACTGATTACATTATTAATAAAATGCAAAACGATTTGGACGACATAGTGGTTCCTGAACTCGCTAAATACGGCATATCCGGCATAGGACCGATTTGA

Protein sequence:

>DPOGS210287-PA
MNKMMRQPLLWLIFMRAVVCHVYNAVETEMFSIPISPNLFNWTYQEFDQQYRFHASLIGKPELPIWLRYIYSGRHHSGFIFGTPPRNTESPITLEVIGLNRQDYETRRVLLTLKVLPKEKMARHEVEFKIDNLNVEDLLDEHRMSRLKDILRTKLWFESSEDLYPTFLASAIDLGARLPLKPSDGEGLVIRLGSSHPFSSEMKRLREEVRPLSRLPSCPREYKRTTVERLFRDAGFTLDWCNFELYNTIYGPRSTDHLEYLTEIPSPINRVRSESREVWTAPNKQSLPTRSYAKQLTAAIVGPLILLLLSVAALTGVLCFHYAAIRDPESDVFLDGIYHICEDYRNRRAHKSSNVEICKYGTSNTEQTQLADNTSTKSLGISPSSSLARPYSPKSTTNLAGSYNRPQPPPYGTLHHRKLDKTPDKRSRSLEESLKLLNEANIATEYERNPIIDYADSTDDYISIKPDTDYIINKMQNDLDDIVVPELAKYGISGIGPI-
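
The reported lengths (1497 bp transcript, 498 aa protein) fit a coding sequence with no untranslated regions: 498 codons plus one stop codon is 499 x 3 = 1497 nucleotides. A minimal sketch of that sanity check follows. It assumes the standard genetic code's stop codons and an ATG start; the function name looks_like_complete_cds is hypothetical and not part of any MonarchBase tooling.

```python
# Assumption: standard genetic code stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def looks_like_complete_cds(nucleotides: str, protein_length: int) -> bool:
    """Return True if the sequence is exactly protein_length codons plus one
    stop codon, starts with ATG, and ends with a stop codon."""
    seq = nucleotides.strip().upper()
    return (
        len(seq) == 3 * (protein_length + 1)
        and seq.startswith("ATG")
        and seq[-3:] in STOP_CODONS
    )

# Length-only check using the figures reported in the record.
print(1497 == 3 * (498 + 1))
```

Passing the full DPOGS210287-TA sequence with a protein length of 498 would apply the same check to the record itself. The trailing "-" in the protein FASTA is not counted among the 498 residues.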