Monarch geneset OGS2.0

DPOGS210142
Transcript: DPOGS210142-TA, 1260 bp
Protein: DPOGS210142-PA, 419 aa
Genomic position: DPSCF300261 + 75104-78825
RNAseq coverage: 872x (Rank: top 15%)
Annotation
Heliconius      HMEL011606        0.0     84.96%
Bombyx          BGIBMGA003789-TA  0.0     82.69%
Drosophila      dock-PC           1e-161  73.11%
EBI UniRef50    UniRef50_Q9VPU1   2e-159  73.11%  Dreadlocks, isoform B n=22 Tax=Coelomata RepID=Q9VPU1_DROME
NCBI RefSeq     XP_969702.2       2e-173  72.16%  PREDICTED: similar to GA17645-PA [Tribolium castaneum]
NCBI nr blastp  gi|189237088      3e-172  72.16%  PREDICTED: similar to GA17645-PA [Tribolium castaneum]
NCBI nr blastx  gi|383847515      3e-171  76.52%  PREDICTED: cytoplasmic protein NCK1-like isoform 1 [Megachile rotundata]
Group
Gene Ontology    GO:0005515  2.1e-29  protein binding
KEGG pathway     tca:658201  5e-173
                 K07365 (NCK) maps->
                     Pathogenic Escherichia coli infection
                     T cell receptor signaling pathway
                     ErbB signaling pathway
InterPro domain  [23-414]   IPR017304  1.2e-229  Cytoplasmic, NCK
                 [316-398]  IPR000980  2.1e-29   SH2 motif
                 [143-198]  IPR001452  1e-21     Src homology-3 domain
                 [80-99]    IPR000108  4.1e-06   Neutrophil cytosol factor 2 p67phox
Orthology group  MCL13435  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210142-TA
ATGCTGGAGACGCGGTCTGTGCGTGCCGCTGAATCGGTGTGGGCGAATAAGCCGTTGCCGGCGCCGCACGCCATGGCCAGCTCCAGGCACGGAAAGAATACACAGGACGATGTCTGCTACGTGGTCGCCAAGTACGACTATGCGGCCCAAGGAGCACAGGAGCTGGACCTGCGGAAGAACGAGCGCTACCTCTTACTGGACGACTCCAAGCACTGGTGGCGCGTACAGAACGCGCGCAGTCAGTCGGGATACGTGCCCAGCAACTATGTCAAGAAGGAGAAGCCTTCGCTGTTCGATAGCATCAAGAAGAAGGTGAAGAAGGGTTCCGGCTCTAAGACCCTGCCGTCGAACAGTTCTCCAGTGCGTGGCGGGGGCGGCGGCGGGGAGTCCCCGGGCGCAAGGCGCGTGGAGCCCACGGAGGCGCTGGGCACGGCCGTCGTCAAGTACAACTATCAGGCGCAGCAACCCGACGAGCTCGCGCTCACCAAGGGGACACGCATACTCATACTGGAGAAGAGCAACGACGGCTGGTGGAGGGGGCAGTACCAGGGACACACCGGATGGTTTCCTTCAAACTACACGAGCGAGGAAGGAGACGAGGACACCGTCCACACTTACGCGATGGCTGAGAATGTACTCGATATTGTTGTGGCGCTGTACTCGTTCACGTCCAACAACGAGCAGGAGCTGTCGTTCGAGAAAGGTGACCGTCTGGAGATCATCGAGAGACCGCCCTCTGACCCCGAGTGGTACCGGGCTCGGGACAACCGCGGACAGATAGGGCTCGTGCCCAGAAACTACCTCCAGGAACTCGCAGACTACCTCACGCAGCCTTACAGCGAGGCGTCCGAGGGCGGGCCGTCCAGCGCGGTGGCTCGAGTGGGTGCGGGCGTGGCGGGCCCGGCGGGCGCAGGTGGCGGGGCTGTGGGCGGCGGGGCGGTCGGCCGCGCCTGGTACTTCGGCGCCATCACGCGCACTCACTGTGACGCGCTGCTCAACCAGCACGGGCACGACGGAGACTTCCTCATCAGAGACTCGGAGACCAACGTCGGCGACTACTCCGTGTCGCTGAAGGCGCCGGGTCGCAACAAACACTTCCGCGTGCAGGTGGAGGGCAACCTGTACTGTATCGGCCAGAGGAAGTTCACGACGCTGGACCAGCTCGTGGCGCACTACCAGCGAGCTCCCATCTACACCAACAAGCAGGGGGAGAAGCTCTACCTCGTGCGTCCTCTACCGCGCGCCAACCAGAACTGCTGA

Protein sequence:

>DPOGS210142-PA
MLETRSVRAAESVWANKPLPAPHAMASSRHGKNTQDDVCYVVAKYDYAAQGAQELDLRKNERYLLLDDSKHWWRVQNARSQSGYVPSNYVKKEKPSLFDSIKKKVKKGSGSKTLPSNSSPVRGGGGGGESPGARRVEPTEALGTAVVKYNYQAQQPDELALTKGTRILILEKSNDGWWRGQYQGHTGWFPSNYTSEEGDEDTVHTYAMAENVLDIVVALYSFTSNNEQELSFEKGDRLEIIERPPSDPEWYRARDNRGQIGLVPRNYLQELADYLTQPYSEASEGGPSSAVARVGAGVAGPAGAGGGAVGGGAVGRAWYFGAITRTHCDALLNQHGHDGDFLIRDSETNVGDYSVSLKAPGRNKHFRVQVEGNLYCIGQRKFTTLDQLVAHYQRAPIYTNKQGEKLYLVRPLPRANQNC-
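
The transcript and protein lengths in this record agree with each other: 1260 bp is 420 codons, and dropping the terminal stop codon (shown as "-" at the end of the protein sequence) leaves 419 residues. The sketch below shows one way to run that check on any coding sequence. The function name `expected_protein_length` and the toy input are illustrative assumptions, not part of the record, and the check looks only at the start codon, the final stop codon, and the reading frame, not at internal stop codons.

```python
# Standard stop codons of the universal genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_protein_length(cds: str) -> int:
    """Return the number of amino acids encoded by a complete CDS.

    Hypothetical helper: assumes the sequence starts with ATG, ends with
    a stop codon, and has a length that is a multiple of three.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    if not cds.startswith("ATG"):
        raise ValueError("CDS does not start with ATG")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("CDS does not end with a stop codon")
    # Every codon except the final stop codon encodes one residue.
    return len(cds) // 3 - 1


# Toy input (the first three codons of the transcript plus its stop codon),
# used here instead of the full 1260 bp sequence.
toy_cds = "ATGCTGGAGTGA"
print(expected_protein_length(toy_cds))  # 3
```

Passing the full DPOGS210142-TA sequence to the same function should return 419 if the listed lengths are correct.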