Monarch geneset OGS2.0

DPOGS200071
Transcript: DPOGS200071-TA (3270 bp)
Protein: DPOGS200071-PA (1089 aa)
Genomic position: DPSCF300044 - 473855-485755
RNAseq coverage: 2106x (Rank: top 6%)
Annotation
Heliconius       HMEL015547        8e-82   53.61%
Bombyx           BGIBMGA004592-TA  0.0     85.05%
Drosophila       cals-PB           2e-135  51.08%
EBI UniRef50     UniRef50_E2A5K2   7e-169  59.70%  Calsyntenin-1 n=11 Tax=Pancrustacea RepID=E2A5K2_CAMFO
NCBI RefSeq      XP_970864.1       2e-180  61.67%  PREDICTED: similar to AGAP007103-PA [Tribolium castaneum]
NCBI nr blastp   gi|91084931       4e-179  61.67%  PREDICTED: similar to AGAP007103-PA [Tribolium castaneum]
NCBI nr blastx   gi|91084931       1e-177  61.67%  PREDICTED: similar to AGAP007103-PA [Tribolium castaneum]
Group
Gene Ontology    GO:0016020  2.7e-20  membrane
                 GO:0005509  2.7e-20  calcium ion binding
                 GO:0007156  3.8e-14  homophilic cell adhesion
KEGG pathway     dre:114424  2e-10
                 K05689 (CDHE, CDH1) maps to:
                     Pathogenic Escherichia coli infection
                     Thyroid cancer
                     Bacterial invasion of epithelial cells
                     Adherens junction
                     Melanoma
                     Pathways in cancer
                     Endometrial cancer
                     Cell adhesion molecules (CAMs)
                     Bladder cancer
InterPro domain  [232-453]  IPR008985  1.3e-27  Concanavalin A-like lectin/glucanase
                 [50-154]   IPR015919  2.7e-20  Cadherin-like
                 [54-155]   IPR002126  3.8e-14  Cadherin
                 [245-385]  IPR013320  5.9e-08  Concanavalin A-like lectin/glucanase, subgroup
Orthology group  MCL10948 (Multiple-copy universal gene)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200071-TA
ATGGTGATTGACGAGAACGAGGCACGTCTCCGTGTACGTTACCCACTTAACTGTGAGAAACGCCGTAATTACAAGTTCGACATCGCCGCTGTTGGTTGTGATGGGTCTTACTCCAACACTGTACCGGTCCATATAACGGTGACGGATGTGAACGAATTCGCTCCGGTCTTCAATCAGGCTGCCTACGTCCGGTCAGTGGATGAGGGGAAGGTGTACGACGAGATCGTACGTGTGGAAGCGACTGATCGCGACTGTACCCCGCGTTACGGTGACGTCTGCAAGTACGAGATTGTAAACGATGGAGACAGGAACCAGCCGTTCGCTATCGACAATGAAGGTGTTATTCGTAATACGGAACCTCTGGAATACGATAAATCCCACAACCATATCCTGTCCGTGGTTGCATACGACTGTGGGATGATGCCATCCGCACCAGTTCTGGTCACTATCAAGGTCAACAAACCCTGCCGAGCTGGGTGGAAAGGTCTTGCTGAGCGTGTGGATTACGCTCCTGGAACAGGTCCTCTAGCACTTTTCCCAGCTGCTCGTTTAGAAGCATGTTCTAGCGACGAGCGCTGTCCAGGTGTTACAAGAATCCAAGCCGCTGTAACTCTCCAGGCTTCCAGAGCTGGTGTTGCCTGTGACAGGGACACATATACCTTACACGCCCAAAGAACAGTATGCGGTCTGGATCCCAAAACAGTGGATTTGCTACCCAGCCCTGGCGTAGGAAACGAGTGGGCGAAGTCTCTCAAACCCGACTCAGGTCGTGACGGCGAGCAGTTGTTCGAGTTCGACGGCGAGACATCAGCCGTTGTACCAGAATCAATACTACCACATTCCCTCGGCAGCACTTTCTCCGTCAGCACCTGGCTAAGACACGCCCCGCCCCCAGACCACGATAAGCACCGCAAGGAACACGTGCTGTGTCTCGCCGACGACCACAAAATGAATCGTCACCACTACGCGCTGTTCGTCCGCAACTGTCGTCTGATACTTCTTCTGAGACGTGACTTCGGTGAAGGTGATCTGAACATCTTCAGACCAGCCGAGTGGAGGTGGAAGCTGCCAGAGGTGTGCGACAACGAGTGGCATCACTACGCTATCAACGTGCGCTTCCCCAACGTAGAGCTGTACGTGGACGGTGAGCCGTACCGCGGCGAGAGGGGCCCGGAGGTCATCGACGACTGGCCGCTGCACCCGGCTCACGGCGTCAACACCACCATGGTAGTGGGCGCCTGCTGGCAGGGTACGGAGAGTGATATGAAGCACCATCTCCGCGGTTGGCTGGCGGGGCTCGGAGCATTGCCGGGGGCTGTTCAGCCGGCAACGGCGCTGAGATGCGCGGCCCGCTGTAGAGAGGGACTCAGTCTAGCGCCTGATCTATGTCGTGACGGCGAGCAGTTGTTCGAGTTCGACGGCGAGACATCAGCCGTTGTACCAGAATCAATACTACCACATTCCCTCGGCAGCACTTTCTCCGTCAGCACCTGGCTAAGACACGCCCCGCCCCCAGACCACGATAAGCACCGCAAGGAACACGTGCTGTGTCTCGCCGACGACCACAAAATGAATCGTCACCACTACGCGCTGTTCGTCCGCAACTGTCGTCTGATACTTCTTCTGAGACGTGACTTCGGTGAAGGTGATCTGAACATCTTCAGACCAGCCGAGTGGAGGTGGAAGCTGCCAGAGGTGTGCGACAACGAGTGGCATCACTACGCTATCAACGTGCGCTTCCCCAACGTAGAGCTGTACGTGGACGGTGAGCCGTACCGCGGCGAGAGGGGCCCGGAGGTCATCGACGACTGGCCGCTGCACCCGGCTCACGGCGTCAACACCACCATGGTAGTGGGCGCCTGCTGGCAGGGTACGGAGAGTGATATGAAGCACCATCTCCGCGGTTGGCTGGCGGGGCTCGGAGCATTGCCGGGGGCTGTTCAGCCGGCAACGGCGCTGAGATGCGCGGCCCGCTGTAGAGAGGGACTCAGTCTAGCGCC
TGATCTATGTACGTCAACACACACACAGAGACATGCATTACACACACACACACATCTGAAGTCGGTGTCCGTGGAGGGTGACAGCGCCTCGGAGGTGGAGACCTTAGTGAGGCGGGTGGCGTATGGAGACGCCAGGGTGTTCCCGACGCCTGGAAGACGAAACGTACACCTGGCCACCACTATCACTTGTGATAACGGCCGAGTCATCAAGGCCCGCCCGGCCGAGTCCTACGTGATGGTGCTCGCGCCTCAGACGCCCACCATCCTGCTGAACGGCAGTGCGGATGCTGCTCGCGACTACGCACACTTCAGGGCAGGCCTGCCGGTGTTCCCTGATATAAGGGTGAGGGTGCTGGCCAGGAGCGGGGACGATATCAAAGAAGCGGAAACACAGAAGCTAGATTCGTGCGTGGTGTCGGTGTACCCCGCCCTGAACCCAGACCACGAGGCGTTGGCGCTGAAGAGCACGCCGGCCGACGACATCAGAGCGACCCTCACCAGGGACGGAGTCAGTCTTACAGGAGCTGATACGGTAGAAAACTACCAACAGGTGTTAAGAGAGATAGAGTACAGCAACAAGAAGCCCGCCTACTACCTCAACAGGGTGTTCAAACTGACGTGCTCCGAGCTCAACGGACGGTTCACGAGCAACGAATACGTACAGACGCTGACGGTGGAGCACCCGAGGGCGGCGTCGGACACCCGGGCGCTCCGCCCCGCCGGACTCGCAGACAAAATGGACGTCGTTAGGGAACATACCAGCAACAACATAGAGCCCGCCGTGGCTATGAATGTGCCGCGGGCGTTTGCATCACACTCGCAGCACTCGCAGCACGCGGCCGAGATCCCGGCGGCGCGGGTCCTGGACCTGCACGAGCGGCACAACTCCAACAATGTGGCAGTGGTGATAGGAGCGGTGATGGCGGGCGCGGTGGTAGCTCTCGTGGTGGTGGTCGCGGCCCGCCTGCGAGCCGCCAGGCCTTCGCCCCTGGCTAGGCCTTCGCCCCGACCCCGCCCTCTACGAGCTAATGACACAGAAATGGCTTGGGACGACTCCGCCCTCACCATCACCGTCAATCCGATGGAGGAGGCTACTGAGTGCGTCGTATCTCCTAGCCGAGTGTGTGAAGACAGTTCCTCGGCGGAATCCTGCTCGGATGAAGACTCCGATCACCACGACTCCTCCGACGAAGAAGGAGAGGTGATGGCCGGCAAGCAGCACAAGTACAGGAACATCAGCCAGCTCGAGTGGGACAACAGCACTATGTAA

Protein sequence:

>DPOGS200071-PA
MVIDENEARLRVRYPLNCEKRRNYKFDIAAVGCDGSYSNTVPVHITVTDVNEFAPVFNQAAYVRSVDEGKVYDEIVRVEATDRDCTPRYGDVCKYEIVNDGDRNQPFAIDNEGVIRNTEPLEYDKSHNHILSVVAYDCGMMPSAPVLVTIKVNKPCRAGWKGLAERVDYAPGTGPLALFPAARLEACSSDERCPGVTRIQAAVTLQASRAGVACDRDTYTLHAQRTVCGLDPKTVDLLPSPGVGNEWAKSLKPDSGRDGEQLFEFDGETSAVVPESILPHSLGSTFSVSTWLRHAPPPDHDKHRKEHVLCLADDHKMNRHHYALFVRNCRLILLLRRDFGEGDLNIFRPAEWRWKLPEVCDNEWHHYAINVRFPNVELYVDGEPYRGERGPEVIDDWPLHPAHGVNTTMVVGACWQGTESDMKHHLRGWLAGLGALPGAVQPATALRCAARCREGLSLAPDLCRDGEQLFEFDGETSAVVPESILPHSLGSTFSVSTWLRHAPPPDHDKHRKEHVLCLADDHKMNRHHYALFVRNCRLILLLRRDFGEGDLNIFRPAEWRWKLPEVCDNEWHHYAINVRFPNVELYVDGEPYRGERGPEVIDDWPLHPAHGVNTTMVVGACWQGTESDMKHHLRGWLAGLGALPGAVQPATALRCAARCREGLSLAPDLCTSTHTQRHALHTHTHLKSVSVEGDSASEVETLVRRVAYGDARVFPTPGRRNVHLATTITCDNGRVIKARPAESYVMVLAPQTPTILLNGSADAARDYAHFRAGLPVFPDIRVRVLARSGDDIKEAETQKLDSCVVSVYPALNPDHEALALKSTPADDIRATLTRDGVSLTGADTVENYQQVLREIEYSNKKPAYYLNRVFKLTCSELNGRFTSNEYVQTLTVEHPRAASDTRALRPAGLADKMDVVREHTSNNIEPAVAMNVPRAFASHSQHSQHAAEIPAARVLDLHERHNSNNVAVVIGAVMAGAVVALVVVVAARLRAARPSPLARPSPRPRPLRANDTEMAWDDSALTITVNPMEEATECVVSPSRVCEDSSSAESCSDEDSDHHDSSDEEGEVMAGKQHKYRNISQLEWDNSTM-
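
The stated lengths (3270 bp transcript, 1089 aa protein) fit a transcript that is coding sequence only: one codon per residue plus one stop codon, with the trailing "-" in the protein record marking the stop. The minimal Python sketch below checks that relationship. The function names are hypothetical and not part of any MonarchBase tool, and it assumes the transcript record carries no untranslated regions.

```python
def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if the transcript length equals one codon per residue
    plus one stop codon (assumes no UTRs in the transcript record)."""
    return transcript_bp == 3 * (protein_aa + 1)


def fasta_length(record: str, stop_symbol: str = "-") -> int:
    """Count residues in a single FASTA record, ignoring the header line,
    surrounding whitespace, and any trailing stop symbol."""
    lines = record.strip().splitlines()
    body = "".join(line.strip() for line in lines if not line.startswith(">"))
    return len(body.rstrip(stop_symbol))


# Lengths reported for DPOGS200071-TA and DPOGS200071-PA above.
print(lengths_consistent(3270, 1089))  # True

# Toy record for illustration only (first ten residues, split over two lines).
print(fasta_length(">toy\nMVIDE\nNEARL-"))  # 10
```

Passing the full protein record to `fasta_length` gives a residue count to compare against the stated 1089 aa.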