Monarch geneset OGS2.0

DPOGS209853
Transcript        DPOGS209853-TA  1233 bp
Protein           DPOGS209853-PA  410 aa
Genomic position  DPSCF300842 - 3072-7168
RNAseq coverage   292x (Rank: top 38%)
Annotation
Heliconius      HMEL022136        5e-107  62.82%
Bombyx          BGIBMGA012110-TA  2e-123  62.93%
Drosophila      CG10083-PA        7e-55   37.53%
EBI UniRef50    UniRef50_B4MNC3   4e-54   36.66%  GK17105 n=1 Tax=Drosophila willistoni RepID=B4MNC3_DROWI
NCBI RefSeq     XP_002009229.1    2e-56   35.84%  GI11368 [Drosophila mojavensis]
NCBI nr blastp  gi|195129571      4e-55   35.84%  GI11368 [Drosophila mojavensis]
NCBI nr blastx  gi|198463235      4e-61   36.89%  GA10060 [Drosophila pseudoobscura pseudoobscura]
Group
Gene Ontology    GO:0005515  5.2e-20  protein binding
KEGG pathway     cfa:610283  2e-18
                 K06106 (CTTN, EMS1) maps to:
                   Shigellosis
                   Pathogenic Escherichia coli infection
                   Bacterial invasion of epithelial cells
                   Tight junction
InterPro domain  [349-409]  IPR001452  5.2e-20  Src homology-3 domain
Orthology group  MCL23346  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209853-TA
ATGCATGCGAGGACTGAAGACGATCTGGACGAGAAGATCATATTGGATAAACTGGCTAAAGCTGGCTCCGCATTCAACTTCAAGTCTAGGACCGAGGAGTTACCGCCCAGTGGACCTGTCGGGACCACGTATAAGAAAGTTAATCCTGTTCAAGAAATCAATTCCAAGGAAAGGGACCAGTTTTGGATGAAAGAGGAATTAGAAGAGAAGAGGAGAGTCGAAGCGGAGAGGAAGCGCAGGGAAGAGGAAAAAAAGAAAGGCGAGGAAGAATTGAGAAGAAGAGATGAGTTAGAGGCAGCAAAGAGAGCGCAAACAGAACAACAAAAGCCCCCTCAATCGGACGCGGAGGGCCTCCGACGACAAAGGTCTCAAGAAGCTCGGGCCCTTATCGGTGCCTCGACCGCCGCCGCTAGGGCTGTGTTCCAAGAACATTTAGCGCAGGGCCAGCGAGTCACTAAGTCAAACTCCATACCAGAGAAGCCAATCAGAAGTTCAGTGATAGCGCAAAGGATAAACACTTTCTCTCAGAACACGTCACAGACGACCCCGAGACCGAAAACTCCGCCGGCGAAACAAGAGACGGACACTCTCGGCGACGAACACCAGCCACCAGCAAAAACCAGCCCGCTCAAAAACATCGATAAAATAAAAATGTGCCTCGAGAGCCCCTCGCAGATACAGCACGAGCCGATCATAGACCCGACCGACCTCAGCCCGAGCGCGGAGAATCAACCCGGCTCGTTCAGCGCCGTCAGCTACACCGAATACAACAAGCAGAGCGAGCCCATCATCAAGCAGAACATTCTGGACAACGAAATGTTCGACGCTTACTACAGCTCCAACGAAAACGACGACGAGGACAGCGACAACAAGTTCAGCACCATAAAGAGGTCGCCCTACACCAAGAACGGCGACAGGGAGCTGAGCAGGCAGGACACGGTCATAGAGAACGTCCGCTACAAGGAGAACGGTGCCGCGGACGTTGAAGATGTTTCGCCCGAGGGTATGGAGGAGAGCACCATATATGAAGACCTGGACGATGACCCCGGACTGTCAGCCAGAGCGCTGTATGACTACCAGGCTGCTGATGAGTCCGAGATCACATTCGACCCGGGTGATGTGATCACACACATTGAGCAGGTGGATGCCGGCTGGTGGCAGGGACTCGGCCCCCGGGGACATTTTGGACTCTTCCCCGCTAACTATGTAGAACTACTGCCTCCACCACGCTAG

Protein sequence:

>DPOGS209853-PA
MHARTEDDLDEKIILDKLAKAGSAFNFKSRTEELPPSGPVGTTYKKVNPVQEINSKERDQFWMKEELEEKRRVEAERKRREEEKKKGEEELRRRDELEAAKRAQTEQQKPPQSDAEGLRRQRSQEARALIGASTAAARAVFQEHLAQGQRVTKSNSIPEKPIRSSVIAQRINTFSQNTSQTTPRPKTPPAKQETDTLGDEHQPPAKTSPLKNIDKIKMCLESPSQIQHEPIIDPTDLSPSAENQPGSFSAVSYTEYNKQSEPIIKQNILDNEMFDAYYSSNENDDEDSDNKFSTIKRSPYTKNGDRELSRQDTVIENVRYKENGAADVEDVSPEGMEESTIYEDLDDDPGLSARALYDYQAADESEITFDPGDVITHIEQVDAGWWQGLGPRGHFGLFPANYVELLPPPR-
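
The transcript and protein entries in this record can be cross-checked against each other. The following is a minimal Python sketch, not part of the original record, assuming the standard genetic code (NCBI translation table 1) and assuming that the trailing "-" in the protein sequence marks the stop codon. The helper names `translate` and `lengths_consistent` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence; stop codons are written as '-' (assumed convention)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("CDS length is not a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein.replace("*", "-")


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First 24 nt and last 12 nt of DPOGS209853-TA, copied from the record above.
    print(translate("ATGCATGCGAGGACTGAAGACGAT"))
    print(translate("CCACCACGCTAG"))
    # Lengths as listed in the record: 1233 bp transcript, 410 aa protein.
    print(lengths_consistent(1233, 410))
```

Passing the full DPOGS209853-TA sequence to `translate` and comparing the result with DPOGS209853-PA would extend the same check to the whole entry.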