Monarch geneset OGS2.0

DPOGS203737
Transcript        DPOGS203737-TA   2982 bp
Protein           DPOGS203737-PA   993 aa
Genomic position  DPSCF300010 - 473844-489751
RNAseq coverage   865x (Rank: top 15%)
Annotation
Heliconius        HMEL002968        0.0  82.60%
Bombyx            BGIBMGA013369-TA  0.0  75.15%
Drosophila        Vinc-PA           0.0  56.39%
EBI UniRef50      UniRef50_O46037   0.0  56.39%  Vinculin n=18 Tax=Pancrustacea RepID=VINC_DROME
NCBI RefSeq       XP_001355231.2    0.0  57.01%  GA17230 [Drosophila pseudoobscura pseudoobscura]
NCBI nr blastp    gi|198470122      0.0  57.01%  GA17230 [Drosophila pseudoobscura pseudoobscura]
NCBI nr blastx    gi|157111402      0.0  56.79%  vinculin [Aedes aegypti]
Group
Gene Ontology     GO:0007155  4.9e-302  cell adhesion
                  GO:0015629  4.9e-302  actin cytoskeleton
                  GO:0005198  4.9e-302  structural molecule activity
KEGG pathway      dpo:Dpse_GA17230  0.0
                  K05700 (VCL) maps to:
                    Shigellosis
                    Amoebiasis
                    Regulation of actin cytoskeleton
                    Leukocyte transendothelial migration
                    Bacterial invasion of epithelial cells
                    Adherens junction
                    Focal adhesion
InterPro domain   [28-992]   IPR006077  4.9e-302  Vinculin/alpha-catenin
                  [776-786]  IPR017997  3.1e-51   Vinculin
Orthology group   MCL13378  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203737-TA
ATGCACGTTTATCTTTTACTACATCTTGAACACGCGCCGAGATGGGAAGCTGAGAGGAAAACTGCAGCTTGGATGGAAGCTATAGACAGGGTATCTCGTCTCGTCATACTCCATGAGGAAGCTGAGGATGGGAACGCCATGCCGGACCTGGCCAGGCCGGTACAGGCCGTCTCACTGGCCGTCAACAACCTCGTTAAGGTGGGTCACGAGACCATAGAGTCAGCTGATGACAATTTACTCCGGGCCGACATGCCGGGGGCTCTGCACAGGGTCGAGGGGGCGGCGACACTACTGCAACAGGCATCTGACATGCTCAGAGGAGACCCATACTCAGGACCCGCCAGGAAAAAGCTCATCGAAGGCTCACGGGGTATCCTCCAAGGCACGTCAGCTTTGCTTCTTTGCTTCGACGAGTCCGAAGTTAGGAAAATTGTCAAGGAGTGTAAAAAGGTGTTAGACTATCTAGGTGTTGCAGAAGTGATCGACACCATGGAGGACCTCGCTCAGTTCCTGCGAGATATCTCACCAGCTCTATCCAAAGCAGCGAGGGAGGTGGCGGCGCGTGCAGCTGAACTGACCCACCCTCCGCACGCGGAGACCCTCGCCCGTTGTCTGGAGAGCGTCAAGCGACTTGCCCCAGTACTCATCTGCGCCATGAAGACATACATACACATACTGTCAGAAGGAGGCAAAGGCATTGAAGATGCGGCGGAGAACAGGAACTACCTGGCCCAGAGGATGGCGGACGAGATACATGAAATCATCAGGGTGTTACAATTGACGTCGTACGTGGAGGACGGCGGCGAGAAAGACAACATCGCCGTGTTGAAGGCCTTACAGAGTCTGGTGCACAGTAAAGTGCTCGCCGCTAATGAGTTTCTTGATGACCCTGAGGCTCAGCGGACCAGTGCCGGCGAGAGAGCTCTACGAGCAGCGCTCACGGCCGCTGCGCGGGCCGCCGAACACACCGACACACATCTCGCTGACAGACTGCGACGAGCTGCCAGGAACGGCGGTATCAACGCGGACCTGCTGTGCGACGAGCGGCAGTACGGACGAGGGAGGGAGCAGAAGGCGCTGACCCTGGCCGCGGAGCTGAAGGCGCAGCTGAGGGACGTACAAGGAACCGTGAACGAGGGAGTGCGAGCCGCGGAGAAGATACAAGGAGGGAAGACCATAGCCGCCAGATTGGAAACCGCCCACAAGTGGCTGGTGCACCCCGCGTGTGATCCCACCACCAGGGTCGAGGGACAGAAGGCTATCAACAGTATCGTGTCGCAGGGACAGAGGATAGCTGACAACCTCCATGGGCGGGAGAAAGCCGAGGTGATGCAGCTGTGCTCCGAGGTACAAAGGCTGGCCGACCAGCTCGCTGACCTCTGTATGACTGGAGACGGCGACCAGGAGGAGGCCAGGACGCTCACACGGTCGCTGACCGGCAAGCTGCACGAGCTGAAGCGTGCCATGGAGCGAGCGGTGGTCAACAGAGTGGTGGAGGACTTCATAGATGTGGCCGCTCCTCTCAGACACTTCACTGACGCTGTGAACGCGCCTGAAGGTACACCGAACCGCGAGGGTAACTTCCACGACAAGGCCACGTCCCTGGCCAGCTTCAGCTCCCGAGCCGCGGCCGCCGCCGCCATGGTCGCCGCAGACATCACACACGACAAGAGACTCGTCGACCAGCTGCTGCAACACGCGCAGGAGGTGGAGAAGTTATCGCCTCAGCTGATCTGCGCCGGCAAGATCCGTCTGCACTACCCAGAAAGCAAAGTGGCCGAGGAACACTTCAACAACCTGAAGTCTCAATACTCTGACGCCGTTCTTCGCTGTCGCGACCTCTGCGACCAGGCGGTCGACCCGCTGGAGTTCGTCCGCACCGCTGGTGAACTCATTCAGAAGCACACGTATCTGTGCGAGGACGCCATAAGGAACAACGACTCGCAGAAGATGGTGGACAACACATCGGCCATCGCCAGGTTGGCGAACCGCGTGTTGCT
GGTGGCTGGTCGCGAGCGGGATAACACAGAAGACGGAGCCTTCAGCGCCGCCCTGGGGACAGCCCAGAGCAGGCTGCAGGCGGCGCTGGCGCCCGCCGTCCGAGCCGCCAAGAGTGTCGCCCTCGGGCAACCCGGCGCCCCCCCACACTGGAGAACCGCTAACGGAGAGACCTTTCTTGGAATATTAGTTATGATTACACTTAATCTTTTATGTCTGTATCTGAACACCCGACCTCGGACTGGAAATCTGGTGGATAAATCCTGTTTATGCATCCATAGAGTATCATCCAAGCCATCAGCGGTGTGGAGGAGGCCCTCTCCCGTCACTACGCCCCTCCCCCTCCGTCCCCCCCTCCGCCGCCGTCGCTCCCCCCTCTCTTCCATGTCGGCGCCCCCTCGCCCGCCGCCGCCGGACACAGACGACGAGGGTGAAGACATCTTCAGAAGACAGCCTCACCCGAGCCAACCTATCTTGGTGGCGGCCCACAACCTGCACAAGGCGGTCCGCGAGTGGTCCTCCAAAGACAACGAGATCATCGCCGCCGCCAAGCGGATGGCCATACTCATGGCGCGCCTGTCCGACCTCGTGCGCTCCGACTCCAAGGGAAGTAAGCGTGAGTTGATAGCTACTGCGAAGGCTATAGCTGAGGCTTCAGAGGAAGTAACTCGCCTCGCTAAGAAACTGGCTCTAGAGTGTACTGATAAGAGAATCAGAACTAATCTCCTCCAGGTGTGTGAACGTATCCCCACCATCGGCACTCAACTCAAGATACTGTCAACTGTCAAGGCCACTATGCTCGGAGCCCAGGGTAGCGAAGAAGATCAAGAGGCAACCGAGATGTTGGTCGGCAACGCTCAAAACTTGATGCAGAGTGTTAAGGAAACTGTGAAGGCCGCGGAGGGAGCCTCCATCAAGATACGGACGGAGCAAGGAGCTTATAGACTGCGTTGGGTGCGACGCTCGCCCTGGTACCAGATATAG

Protein sequence:

>DPOGS203737-PA
MHVYLLLHLEHAPRWEAERKTAAWMEAIDRVSRLVILHEEAEDGNAMPDLARPVQAVSLAVNNLVKVGHETIESADDNLLRADMPGALHRVEGAATLLQQASDMLRGDPYSGPARKKLIEGSRGILQGTSALLLCFDESEVRKIVKECKKVLDYLGVAEVIDTMEDLAQFLRDISPALSKAAREVAARAAELTHPPHAETLARCLESVKRLAPVLICAMKTYIHILSEGGKGIEDAAENRNYLAQRMADEIHEIIRVLQLTSYVEDGGEKDNIAVLKALQSLVHSKVLAANEFLDDPEAQRTSAGERALRAALTAAARAAEHTDTHLADRLRRAARNGGINADLLCDERQYGRGREQKALTLAAELKAQLRDVQGTVNEGVRAAEKIQGGKTIAARLETAHKWLVHPACDPTTRVEGQKAINSIVSQGQRIADNLHGREKAEVMQLCSEVQRLADQLADLCMTGDGDQEEARTLTRSLTGKLHELKRAMERAVVNRVVEDFIDVAAPLRHFTDAVNAPEGTPNREGNFHDKATSLASFSSRAAAAAAMVAADITHDKRLVDQLLQHAQEVEKLSPQLICAGKIRLHYPESKVAEEHFNNLKSQYSDAVLRCRDLCDQAVDPLEFVRTAGELIQKHTYLCEDAIRNNDSQKMVDNTSAIARLANRVLLVAGRERDNTEDGAFSAALGTAQSRLQAALAPAVRAAKSVALGQPGAPPHWRTANGETFLGILVMITLNLLCLYLNTRPRTGNLVDKSCLCIHRVSSKPSAVWRRPSPVTTPLPLRPPLRRRRSPLSSMSAPPRPPPPDTDDEGEDIFRRQPHPSQPILVAAHNLHKAVREWSSKDNEIIAAAKRMAILMARLSDLVRSDSKGSKRELIATAKAIAEASEEVTRLAKKLALECTDKRIRTNLLQVCERIPTIGTQLKILSTVKATMLGAQGSEEDQEATEMLVGNAQNLMQSVKETVKAAEGASIKIRTEQGAYRLRWVRRSPWYQI-