Monarch geneset OGS2.0

DPOGS214804
Transcript: DPOGS214804-TA; 1188 bp
Protein: DPOGS214804-PA; 395 aa
Genomic position: DPSCF300059; + strand; 1706-5857
RNAseq coverage: 1x (Rank: top 95%)
Annotation
Heliconius: HMEL022136; 8e-95; 66.23%
Bombyx: BGIBMGA012110-TA; 4e-113; 52.99%
Drosophila: CG10083-PA; 3e-41; 39.39%
EBI UniRef50: UniRef50_B4MNC3; 2e-42; 30.17%; GK17105 n=1 Tax=Drosophila willistoni RepID=B4MNC3_DROWI
NCBI RefSeq: XP_001958530.1; 2e-45; 31.39%; GF10969 [Drosophila ananassae]
NCBI nr blastp: gi|194752441; 4e-44; 31.39%; GF10969 [Drosophila ananassae]
NCBI nr blastx: gi|198463235; 1e-46; 30.60%; GA10060 [Drosophila pseudoobscura pseudoobscura]
Group
Gene Ontology: GO:0005515; 5.2e-20; protein binding
KEGG pathway: cfa:610283; 3e-18
    K06106 (CTTN, EMS1) maps to:
        Shigellosis
        Pathogenic Escherichia coli infection
        Bacterial invasion of epithelial cells
        Tight junction
InterPro domain: [334-394] IPR001452; 5.2e-20; Src homology-3 domain
Orthology group: MCL23346; Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214804-TA
ATGTATCAGGGCGAAGGCGCTCCGACTGTACGAAAGGGAACGTGCGCTAATCATATCAAGCATGTGTCCCAGCTGTTCAACGGTTTCCATGCAACCATGCATGCGAGGACTGAAGACGATCTGGACGAGAAGATCATATTGGATAAACTGGCTAAAGCTGGCTCCGCATTCAACTTCAAGTCTAGGACCGAGGAGTTACCGCCCAGTGGACCTGTCGGGACCACGTATAAGAAAGTTAATCCTGTTGATAAGATAAAAACTGATACTTATATATACATGATGCCCCCTCAATCGGACGCGGAGGGCCTCCGACGACAAAGGTCTCAAGAAGCTCGGGCCCTTATCGGTGCCTCGACCGCCGCCGCTAGGGCTGTGTTCCAAGAACATTTAGCGCAGGGCCAGCGAGTCACTAAGTCAAACTCCATACCAGAGAAGCCAATCAGAAGTTCAGTGATAGCGCAAAGGATAAACACTTTCTCTCAGAACACGTCACAGACGACCCCGAGACCGAAAACTCCGCCGGCGAAACAAGAGACGGACACTCTCGGCGACGAACACCAGCCACCAGCAAAAACCAGCCCGCTCAAAAACATCGATAAAATAAAAATGTGCCTCGAGAGCCCCTCGCAGATACAGCACGAGCCGATCATAGACCCGACCGACCTCAGCCCGAGCGCGGAGAATCAACCCGGCTCGTTCAGCGCCGTCAGCTACACCGAATACAACAAGCAGAGCGAGCCCATCATCAAGCAGAACATTCTGGACAACGAAATGTTCGACGCTTACTACAGCTCCAACGAAAACGACGACGAGGACAGCGACAACAAGTTCAGCACCATAAAGAGGTCGCCCTACACCAAGAACGGCGACAGGGAGCTGAGCAGGCAGGACACGGTCATAGAGAACGTCCGCTACAAGGAGAACGGTGCCGCGGACGTTGAAGATGTTTCGCCCGAGGGTATGGAGGAGAGCACCATATATGAAGACCTGGACGATGACCCCGGACTGTCAGCCAGAGCGCTGTATGACTACCAGGCTGCTGATGAGTCCGAGATCACATTCGACCCGGGTGATGTGATCACACACATTGAGCAGGTGGATGCCGGCTGGTGGCAGGGTCTCGGCCCCCGGGGACATTTTGGCCTCTTCCCCGCTAACTATGTAGAACTATTGCCTCCACCACGCTAG

Protein sequence:

>DPOGS214804-PA
MYQGEGAPTVRKGTCANHIKHVSQLFNGFHATMHARTEDDLDEKIILDKLAKAGSAFNFKSRTEELPPSGPVGTTYKKVNPVDKIKTDTYIYMMPPQSDAEGLRRQRSQEARALIGASTAAARAVFQEHLAQGQRVTKSNSIPEKPIRSSVIAQRINTFSQNTSQTTPRPKTPPAKQETDTLGDEHQPPAKTSPLKNIDKIKMCLESPSQIQHEPIIDPTDLSPSAENQPGSFSAVSYTEYNKQSEPIIKQNILDNEMFDAYYSSNENDDEDSDNKFSTIKRSPYTKNGDRELSRQDTVIENVRYKENGAADVEDVSPEGMEESTIYEDLDDDPGLSARALYDYQAADESEITFDPGDVITHIEQVDAGWWQGLGPRGHFGLFPANYVELLPPPR-
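
Consistency check (illustrative sketch):

The transcript and protein records above should agree: 1188 bp is 396 codons, that is 395 amino acids plus one stop codon, which this record writes as a trailing "-". The Python sketch below shows one way to check this by translating a coding sequence. It assumes the standard genetic code (NCBI translation table 1), which the record does not state explicitly. The helper name translate is hypothetical and is not part of any MonarchBase tooling. The two test fragments are the first 30 and last 15 nucleotides of DPOGS214804-TA as listed above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each position. Assumed here; the record does not
# say which table was used.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (this record uses "-").
    Codons with characters outside A/C/G/T are written as "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 nt and last 15 nt of DPOGS214804-TA, copied from the record.
first_codons = "ATGTATCAGGGCGAAGGCGCTCCGACTGTA"
last_codons = "CCTCCACCACGCTAG"

print(translate(first_codons))  # MYQGEGAPTV
print(translate(last_codons))   # PPPR-

# Length check: 1188 bp gives 396 codons, one of which is the stop codon.
print(1188 // 3 - 1)            # 395
```

Passing the full nucleotide sequence to translate and comparing the result with the DPOGS214804-PA string would extend the same check to the whole record.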