Monarch geneset OGS2.0

DPOGS210684
Transcript: DPOGS210684-TA (1089 bp)
Protein: DPOGS210684-PA (362 aa)
Genomic position: DPSCF300013 - 896469-904231
RNAseq coverage: 191x (Rank: top 48%)
Annotation
Heliconius: HMEL005425, 2e-114, 84.10%
Bombyx: BGIBMGA006299-TA, 2e-81, 65.50%
Drosophila: CASK-PF, 3e-166, 67.22%
EBI UniRef50: UniRef50_Q24210-4, 2e-171, 69.12%, Isoform G of Peripheral plasma membrane protein CASK n=48 Tax=Bilateria RepID=Q24210-4
NCBI RefSeq: XP_002058710.1, 1e-176, 72.21%, GJ14160 [Drosophila virilis]
NCBI nr blastp: gi|270003442, 2e-179, 74.44%, hypothetical protein TcasGA2_TC002673 [Tribolium castaneum]
NCBI nr blastx: gi|195143609, 6e-169, 72.46%, GL23741 [Drosophila persimilis]
Group
Gene Ontology: GO:0005515, 4.4e-44, protein binding
KEGG pathway: dvi:Dvir_GJ14160, 4e-176
    K06103 (CASK) maps-> Tight junction
InterPro domain:
    [204-347] IPR008144, 4.4e-44, Guanylate kinase
    [182-350] IPR008145, 2.7e-39, Guanylate kinase/L-type calcium channel
    [54-199] IPR001452, 5.2e-38, Src homology-3 domain
    [1-82] IPR001478, 3.3e-17, PDZ/DHR/GLGF
    [91-154] IPR011511, 1.2e-08, Variant SH3
Orthology group: MCL10346, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210684-TA
ATGGGCATTACACTAAAACTGGCTGATGACGGACGTTGCATTGTTGCTCGGATCATGCATGGTGGAATGATACACCGACAAGCGACTCTACACGTGGGAGATGAGATCAAGGAAATTAATGGCACACCAGTAGCCAATCAATCTGTCGCTCAACTTCAAAGGATGCTGCGTGAAGCGCGTGGTTCAGTGACGTTCAAGATTGTGCCATCATACCGATCGGCTCCTCCGCCGTGCGAGCTATTCCGGATTAAGCCTTCGCCCGTGCTAATATTCGTTAGAGCGCAGTTTGACTACGATCCCTTAGAGGACGAACTAATACCCTGCGCTCAAGCTGGTATAGCGTTCAGCACAGGGGATATACTGCAGATAATCAGCAAGGATGACTCCCATTGGTGGCAGGCGCGGAAGGACGCATCAGGTGGATCAGCCGGTCTCATACCGAGTCCTGAACTACAAGAGTGGCGCGCCGCTTGCGCTGCCGCCGAAAGAAGTAACACAGATCAAGTGAATTGTTCTATATTCGGAAGAAAAAAGAAACAGGCCAAAGACAAATATTTGGCGAAACACAACGCCGTGTTCGATCAACTAGATGTTGTAACATACGAAGAAGTTGTTAAACTTCCCTACACCACGAGACCCCCTCGAACCGATGAGGAAAACGGCAGACATTACTACTTCGTTACCCACGACGAGATGATGGCTGACATAGCCGCTAACGAGTACCTCGAATACGGAACCCACGAGGACGCGATGTACGGAACAAAACTAGAGACGATACGCCGCATACATTCTGAGCGTCGCATAGCCATATTGGATGTGGAGCCACAAGCTCTTAAAATACTACGAACAGCGGAGTTCGCGCCATACGTGGTTTTCGTGGCCGCACCCTCTCTTAACAATGTCGCTGATTACGACGGTTCCTTAGAGGTGCTCGCGCGCGAGTCTGAGACGCTCCGCCGTACATACGGCCATTACTTCGACATGTCCATAGTCAACAATGACATTGACGACACACTCGGCCAGCTGGAGGCGGCACTAGCTAGGATGCGGTCCACACCACAGTGGGTACCAGTCTCCTGGGTTTACTGA

Protein sequence:

>DPOGS210684-PA
MGITLKLADDGRCIVARIMHGGMIHRQATLHVGDEIKEINGTPVANQSVAQLQRMLREARGSVTFKIVPSYRSAPPPCELFRIKPSPVLIFVRAQFDYDPLEDELIPCAQAGIAFSTGDILQIISKDDSHWWQARKDASGGSAGLIPSPELQEWRAACAAAERSNTDQVNCSIFGRKKKQAKDKYLAKHNAVFDQLDVVTYEEVVKLPYTTRPPRTDEENGRHYYFVTHDEMMADIAANEYLEYGTHEDAMYGTKLETIRRIHSERRIAILDVEPQALKILRTAEFAPYVVFVAAPSLNNVADYDGSLEVLARESETLRRTYGHYFDMSIVNNDIDDTLGQLEAALARMRSTPQWVPVSWVY-
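
The transcript and protein lengths listed in this record are consistent with each other: 1089 bp is 363 codons, that is, 362 amino acids plus one stop codon. The sketch below shows one way to check this kind of consistency for a coding sequence. It assumes the standard genetic code (the record does not state which translation table was used), and the function names translate and expected_protein_length are hypothetical helpers, not part of any database tooling. The "-" stop symbol follows the convention visible at the end of the protein sequence above.

```python
# Minimal sketch: translate a coding sequence with the standard genetic code
# (assumption: NCBI translation table 1) and relate CDS length to protein length.

BASES = "TCAG"
# Amino acids for all 64 codons, ordered with T, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a DNA coding sequence codon by codon.

    Stop codons are written as stop_symbol; codons containing
    characters other than A, C, G, T are written as 'X'.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        residues.append(stop_symbol if residue == "*" else residue)
    return "".join(residues)


def expected_protein_length(cds_length):
    """Protein length implied by a CDS that ends in a single stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_length // 3 - 1


if __name__ == "__main__":
    # First 30 nt and last 9 nt of DPOGS210684-TA, copied from the record.
    print(translate("ATGGGCATTACACTAAAACTGGCTGATGAC"))
    print(translate("GTTTACTGA"))
    print(expected_protein_length(1089))
```

Passing the full DPOGS210684-TA sequence to translate and comparing the result with DPOGS210684-PA would extend the same check to the whole record.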