Monarch geneset OGS2.0

DPOGS208873
Transcript: DPOGS208873-TA, 2616 bp
Protein: DPOGS208873-PA, 871 aa
Genomic position: DPSCF300009 - 1478539-1498818
RNAseq coverage: 142x (Rank: top 54%)
Annotation
Heliconius: HMEL016072, 0.0, 96.01%
Bombyx: BGIBMGA008105-TA, 0.0, 93.48%
Drosophila: SKIP-PE, 2e-144, 40.77%
EBI UniRef50: UniRef50_UPI00021A75ED, 0.0, 57.00%, UPI00021A75ED related cluster n=3 Tax=unknown RepID=UPI00021A75ED
NCBI RefSeq: XP_972950.2, 0.0, 60.07%, PREDICTED: similar to CG31163 CG31163-PD [Tribolium castaneum]
NCBI nr blastp: gi|189234720, 0.0, 60.07%, PREDICTED: similar to CG31163 CG31163-PD [Tribolium castaneum]
NCBI nr blastx: gi|189234720, 0.0, 60.20%, PREDICTED: similar to CG31163 CG31163-PD [Tribolium castaneum]
Group
Gene Ontology: GO:0005515, 3.6e-16, protein binding
KEGG pathway: bmy:Bm1_39615, 1e-08
 K04438 (CRK, CRKII) maps->
    Regulation of actin cytoskeleton
    MAPK signaling pathway
    Bacterial invasion of epithelial cells
    Fc gamma R-mediated phagocytosis
    Renal cell carcinoma
    Pathways in cancer
    Shigellosis
    Chemokine signaling pathway
    Neurotrophin signaling pathway
    Insulin signaling pathway
    Focal adhesion
    ErbB signaling pathway
    Chronic myeloid leukemia
InterPro domain:
[652-719] IPR013761, 8.3e-21, Sterile alpha motif-type
[644-721] IPR010993, 3.6e-16, Sterile alpha motif homology
[567-643] IPR001452, 2.3e-11, Src homology-3 domain
[654-714] IPR021129, 5.9e-11, Sterile alpha motif, type 1
[650-717] IPR001660, 5.9e-09, Sterile alpha motif domain
[575-629] IPR011511, 2.2e-07, Variant SH3
Orthology group: MCL15880, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208873-TA
ATGAAAGATGGAGACCGAGGCTATTTAGAAGGCCTGGCTTCGCGGTATGCCGATTTATTCTCAACACATTACGGAGATGTACTAGATCATCTGGAAGAACTTAGAAGACATGAATGGGATGAAATGTCTCCTCGGATGAGAGTCCTCGGCGGCACCGGGACACCTCAAACACCACCTTCAGCTAGTCCTGGTTCATTGGCCTTGAATAATTTGACCACATCCCACTCTCAACCTATTTACGTACCAGGAAAATATTCGCCGTCAAGCTGTCTCACAGACAAGGAAGAAGACGATATTTACGGATTCGGCTATGGCGTGTTTGGCAAGCAAATGTTGCAAAGACAGCAACAACAAAAACAACTTCTCCTTGCTCAACCCACGCAACCTTTGATGCACAATCAACAGCATAATTACCAGAGTTGTCTAAGTCCACGATCAGCTTACTTTTATGAGTTTCCGCCGAGTGATAACTGTCTGGGTACTGCTAAGAAGAAGGTTACGACATTTTCGAGATTGCTTCGCGGGTTGAAGTCACATAGAAAAGAGAAACATGGCTCCTGTTCGCCGAAACATTCCCACAGTCCACGACAAGGGTTGCCTCCTCAGAGGGTTGATACACCGGACAGCGTGCTTCAAAGTGGCTTGGGCTTAGGCCCTGGCCCTGATACAGCTTTGAGGTCTATGGTAGATCCAAGAGACTATGACCGCCTCCGCTTCTTTCAAATGAGTGCCGCCCAACCAAACACCTTCGAAGAAACGATACATAGGTTAAAAGTTCAGGAAGCAATGAAGAAAAAAGATAAATTGGCGAGAGAACAAGAAGAGATTTTAAGAGACATAAGACATGGTCTGATGAATATGGGGAGAGATGGTGTCCGCGGGCCGTTTGGAGACGACACATACATGTATGACGATGAAGCAAGAGGGCTTTCCGGAAGGGGGCACTGGTACGATGAGCCACCCTATGAAAGCGACCCTGAAGACTTTTTAATGGGTGGTGGCGGTCCTGCAGCCTCATTTCAAAATGGACGAGTTTGTTTTACATTAAATTTACGAAATGAGTCAAGAGGTGAAGGTGTGATTTCCCTACGAACTGCCGGCGATATAAGTTTAGCAAGAACCCCTAGAAGAGGATTAATAATTCCGCAAAGCGGCCCCTACCCAACCACTGTGATACCCTTACGAACAGCGAGAGATAGGGAAAGCGGGGATTATGCGGCATCCGACATCCAATCTATCGGCTCGAGGTTGTCCGGCATATCGCTAGAATCAAGTCGATCCGAACGGGACTGTAGAAGGGGTTATAGGCAAATGTCAGGCTATCGGATAGGAATCGACCCTTTATCGCCAGCGTCCTCAGACTACGAAGATCAAGAAAGTGAAACAGATTCACAGCATATAGCAACTGTACATAAGTCAGCAGAAGAATGTGACGGTGTATCAAATCTTGCTGGTAAAGTCAGAGGTCTTAGACAAGATGTTCAAAGAAAAATTTCAAGGTTACGTCAAGAAGGAGGACCTGTCATCTCTTCCGATAGGCGAACTAGTGGCGATCAAGCTTTTCCATGTTCTAATAGTTCTTTTGAAAGTCTTCCCAGCGGCTCAGGCAGTAGCACCCAGGCATTGGTTCGTGCCGGTAGTAATCATTCGTCTCTCTCGGCAGAAGAAAACATAGAATTAAGTCCAGCTGGGCGGAGTCTACTCGTACCGCAAATGCTGTGCCGCGCTCGAGCACTCGTCGATTATGTCCCCAACATTTATGAAAAGGACGCCTTAAGATATAAGAAAGGAGATATCATTGAAGTTATTAACATGAATGCGTCCGGCATTTGGCGAGGTGTTCTAAACAATAAAGTTGGAAACTTTAAGTTTGGAAATGTTGAAGTTCTATCCGAACGGGATACAATGAGGTCAAGAACTTCTAAATGGTGCAAAAGTCGAGAAAGACTTTGGGAGACAAGACCGCGTACAGTTGAAGAATTATTGAGGCGAATAGATTT
ACCTGAATATATGGTTGCGTTTTCAAGAAATGGTTATGAAGACATAGAACTTTTTAAGGAGATTGAACCTTCCGATTTGGATTATTTAGGTATAATGACCCCTGATCATCGAACACGCATCCTTGCCGCCGTACAACTGCTACACCAGCTAGAAAGTGGCGAAGGTGAAGGTGAGGGTGACGGAGGTGGTTCCAGTTCGGAAGGAGGTGACTCTCCGTTTGGTCGTCGTCAATTTCCAAGAGATTCTGGCTGTTACGAAGCTGGAGTTGGTGTAGGCGGTGTTAGAGTACGCACATCTCCTCTCGTTCATAGAACGGATGAACCCTCCCACAGGCCTCCTGAACCAGCTCCTCAAGCAAAACGTTCTATTCGTCGTCGACAACCGGATGACGCTGAATGTGACAGAATCGAAAAGTACCCCGGCACAATCGGAGAGAAAACTAACGTACGTAGCGGGGGTCTGCCGGGAGGCGCTAGAGATGGCACATGTGAGTCCGATCATAAATTGAATGTTGTCAAGTTTGTAGCCGGAGGAGAGCCTTGCGCTTCGGAAAAGAGTAGTGACTCGGGTGTGAGCAGCTCTTCTTTGAGCTCGGCGCATCCTCATCGTCCCTGA

Protein sequence:

>DPOGS208873-PA
MKDGDRGYLEGLASRYADLFSTHYGDVLDHLEELRRHEWDEMSPRMRVLGGTGTPQTPPSASPGSLALNNLTTSHSQPIYVPGKYSPSSCLTDKEEDDIYGFGYGVFGKQMLQRQQQQKQLLLAQPTQPLMHNQQHNYQSCLSPRSAYFYEFPPSDNCLGTAKKKVTTFSRLLRGLKSHRKEKHGSCSPKHSHSPRQGLPPQRVDTPDSVLQSGLGLGPGPDTALRSMVDPRDYDRLRFFQMSAAQPNTFEETIHRLKVQEAMKKKDKLAREQEEILRDIRHGLMNMGRDGVRGPFGDDTYMYDDEARGLSGRGHWYDEPPYESDPEDFLMGGGGPAASFQNGRVCFTLNLRNESRGEGVISLRTAGDISLARTPRRGLIIPQSGPYPTTVIPLRTARDRESGDYAASDIQSIGSRLSGISLESSRSERDCRRGYRQMSGYRIGIDPLSPASSDYEDQESETDSQHIATVHKSAEECDGVSNLAGKVRGLRQDVQRKISRLRQEGGPVISSDRRTSGDQAFPCSNSSFESLPSGSGSSTQALVRAGSNHSSLSAEENIELSPAGRSLLVPQMLCRARALVDYVPNIYEKDALRYKKGDIIEVINMNASGIWRGVLNNKVGNFKFGNVEVLSERDTMRSRTSKWCKSRERLWETRPRTVEELLRRIDLPEYMVAFSRNGYEDIELFKEIEPSDLDYLGIMTPDHRTRILAAVQLLHQLESGEGEGEGDGGGSSSEGGDSPFGRRQFPRDSGCYEAGVGVGGVRVRTSPLVHRTDEPSHRPPEPAPQAKRSIRRRQPDDAECDRIEKYPGTIGEKTNVRSGGLPGGARDGTCESDHKLNVVKFVAGGEPCASEKSSDSGVSSSSLSSAHPHRP-