Monarch geneset OGS2.0

DPOGS203040
Transcript: DPOGS203040-TA (3099 bp)
Protein: DPOGS203040-PA (1032 aa)
Genomic position: DPSCF300206 - 50301-66332
RNAseq coverage: 276x (Rank: top 39%)
Annotation
Database        Hit               E-value  Identity  Description
Heliconius      HMEL016146        0.0      64.06%
Bombyx          BGIBMGA006793-TA  3e-115   72.75%
Drosophila      MICAL-like-PA     6e-58    49.27%
EBI UniRef50    UniRef50_D6X075   2e-109   33.19%    Putative uncharacterized protein n=3 Tax=Tribolium castaneum RepID=D6X075_TRICA
NCBI RefSeq     XP_001809361.1    1e-109   33.05%    PREDICTED: similar to MICAL-like CG11259-PA [Tribolium castaneum]
NCBI nr blastp  gi|270013585      7e-109   33.19%    hypothetical protein TcasGA2_TC012205 [Tribolium castaneum]
NCBI nr blastx  gi|332027292      5e-126   32.05%    MICAL-like protein 2 [Acromyrmex echinatior]
Group
Gene Ontology
  GO:0005515  2.6e-32  protein binding
  GO:0008270  6.7e-10  zinc ion binding
KEGG pathway
  ngr:NAEGRDRAFT_56285  4e-24
  K05699 (ACTN) maps to:
    Amoebiasis
    Regulation of actin cytoskeleton
    Tight junction
    Adherens junction
    Arrhythmogenic right ventricular cardiomyopathy (ARVC)
    Leukocyte transendothelial migration
    Systemic lupus erythematosus
    Focal adhesion
InterPro domain
  [821-967]  IPR022735  9.5e-35  Domain of unknown function DUF3585
  [164-264]  IPR001715  2.6e-32  Calponin homology domain
  [300-359]  IPR001781  6.7e-10  Zinc finger, LIM-type
Orthology group: MCL25278 (Lepidoptera specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203040-TA
ATGCTACTAGTTTTTATTCTATCAAGTGTCGTGGCTGTCTCGTTTGCACGGCCGAATTACGGAGAGTCTAGAAATATAATAACTGTGCCACCTAATTGTCCTTCTGGCCAGGAATGGATAAATGGAATGTGCAGAGACGTATGGCGTAGTGGTGTGACTCCTCCATCAATGTTATGGGATAATCAAGACTTAGGCAACGTCATCACCGTACCTCCCAATTGCCCACCTGGTCAAGAATATATTAACGGACAATGTCGAGATGTTTGGCGTAACGGCGGCAGTCTGGCGAATTCGAATCTCTTCAATGCTTTATTAAAAAAGTGGAGTGAACATGTTCGAAAAGCCCAACAATCATACGGTAGTGGATCAACAAGGAACATTATAAACGTTCCTAATCAATGTCCAATTGGTTTCCTACCGGACGCCTTGGGAAATTCACAAGTATTGTTTCAAATAAAGTTTTCTACAATGTCTGAAGTAAAGGGTACTAAAGCTTTAGAACTGTGGTGTAAACGGTTAATAAAAGATTGTCCTGAGGTACAAATAGATAATATGACAACATCCTGGAGAAATGGAATGGCGTTTTGTGCTTTGGTTCACCATTTTAGACCTGATTTAATAGATATTTCATCGCTTAAACCTGAAAATATTTACGAAAATAATCAATTAGCGTATAGTATTGCTGAAAAACACTTGGGCATTCCTGCTCTCTTGGATCCTGAGGATATGGTAGAGAATGATGTACCTGATAGACTGTCTATACTGACATATTTGTCCCAGTTTTATCAACGTCTTGGCCATACCGTTAACACAAAATCATCTGAAAGTGATCAAAAAGTATCACCACCTGTATCAAGTACACAGGTAGAGAGTCCAATCAAATTTGGAAAAGTGGCTTTAGATAAATGTGCTGTATGTGGATTGCCTGTGTATCTGGCACAGAGGCTTATAGTAGCTCAAAAACTCTATCACAGGAGATGTTTTCGATGTTCCAAATGCTCGGGACATCTTAATCCTAAAAACTATAATATTATTGACTCTACAAACTTTTCTTGTGACACATGTAAAAATGAAAAGATAATCCCTAAATATCTCAATAACAATGACCAAATAGGAATGTTGGCATTTAGTAATGATGTTGATGTGAGGCCAAATTTTCATAATGAAGAGGTGGTCAGCCCAAAAGCTAAACCAAGGCCACAATCCATACTAGACAAAATAAACATGTTTGAGAAAAATATAGAAAAAGAAACAGTAAACCTAGAAGATAAAGTAGGAAATATTAGTTTAAAAGACAACTATAATAATAAGTTAAAAGAAGGAATCAAAACTGATGATGAAGTTAAAAATAATAAATTTGAAATTCATAATAATGTTAATAGTTCGGAATTTAATTTAAAAGATAAAATTAAGTACCAAATAAATGCTGACACTGAAGTTATTTCTAAACCTCCATCACCCATATATAAGAGTAGTAAAGAAAAATTTGATTTTTTACAAACACAGTTATCTGACGATGCCTTAGACACTGGCATAGTCATAAATAGTCTAAAAAATGATTCCGTACAAATTAAGAATGAAGACGCTCTGAAAACAACATTCCCTCCAGAAGAATGTAGTGAAGTAACAGAAAATATAGAGAGCAACATAACAGATGATAGCAGATTGGCTCAGAGCAATATACTGTTGAATCCAAACGAAACGAAAATTAATAATATTGAACCGTTAAATACAGAAGAAGAGATCACTATAAGCGTTCCGCCGCGAAGAAAGAAACAGATTTTAGCCGATAGAAGATTAAAAGCTCTAAATAATAAAACTGAAGAAAACACCAAGGAGTACCCAGATCACTTGAATCCGTTTTCCGACGAAGAAAATGAGCATGAACAACAGGATTTAGCAGAGAAAGTCAGCACCAATCCATTCGGGAGTAGCGATGAAGACGAGATGCCAAACAATCCTAGTGTACCCGACATACAGACATCCACCCCTGTAAAACG
TCACATCAAAGCATATAACCCTTTCTGGTCGGACGGCGAAGAACCATCGTCTGATGAGGAGTGTGATACAAATTCGAACATGCACAAATCGTCAATAACTGGTAGCACACCAAATCTCCGATCTCGGAGGAAGCCCAAGGCCCCTCCGCCCCCCATCCCTGAAAGCCTCAATCTTCCTTCAACATCAATGGACGATGTTGCCAGCATATCCTCATTCAGCTCCCACAACTCCACCATACATTCTGATCAGAAATCCTTCGGTGTCTCCACTCCGAAGTTAAAGAAGAAGCGGCCGGCTCCAACTCCCCCCGACAGCCTGTCCACTCTGTCAGGTTGTGGAGGTTCTATGGAGAGAGGTGTCGATACATCCGGCAGCCGCAACACTTCGGATGTCGTCCGTCGCGTAAAAGGACCAGCACCTGGTCTGCCACTTCCAGAACGTCGTGAGGTCAAATTGCAGATGTCCGGTGAAGAGCTGCAAGTGCAGCTGGATATGCTGGAGACGCAACAGCTGGGCCTGGAGAGACAGGGCGTACTCATAGAACAGATGATCAGGGATAAGTGTGAGGGTGATGGCGGTCCAACAGCCCCGCAAGAAGAAATCGAAGACCTGGTCATCCAGCTGTGCGAACTGGTCAATGAGAAAAATGACCTGTTCAGAAAACAAACAGAACTAATGTACATACGAAGACAACAAAGATTGGAACAAGAACAGGCGGACATCGAGCACGAGATACGTGTGATACAATCCCGACCCGCAGTCAATCGTATTGACTCTGATAAGGTCCGCGAGGAGCAGCTGGTCTCGCGACTGGTGGAGATCGTTCGCCTGCGGGACGAGCTTGTGCAGCAGCTTGATGCGGAAAGACGACGCGAGAGACAAGAGGACCTGGCCATAGCAGCATCGATCGCTACTAAAAGAGCACAAAGAAACAGCGAATCAAACAGTTCCTCGATGAAATCGGGGGACGCCGTCACGCCTGTCAAGAAGAACAAAGTTAAGGATAAAGTTAAAAAACAATTGAAGAAAGCCAAACACACTTTGATATCAAAGAAAAAAGACGAAGACAAACCTGACAAAGAGAAAAAAACTAAATAA

Protein sequence:

>DPOGS203040-PA
MLLVFILSSVVAVSFARPNYGESRNIITVPPNCPSGQEWINGMCRDVWRSGVTPPSMLWDNQDLGNVITVPPNCPPGQEYINGQCRDVWRNGGSLANSNLFNALLKKWSEHVRKAQQSYGSGSTRNIINVPNQCPIGFLPDALGNSQVLFQIKFSTMSEVKGTKALELWCKRLIKDCPEVQIDNMTTSWRNGMAFCALVHHFRPDLIDISSLKPENIYENNQLAYSIAEKHLGIPALLDPEDMVENDVPDRLSILTYLSQFYQRLGHTVNTKSSESDQKVSPPVSSTQVESPIKFGKVALDKCAVCGLPVYLAQRLIVAQKLYHRRCFRCSKCSGHLNPKNYNIIDSTNFSCDTCKNEKIIPKYLNNNDQIGMLAFSNDVDVRPNFHNEEVVSPKAKPRPQSILDKINMFEKNIEKETVNLEDKVGNISLKDNYNNKLKEGIKTDDEVKNNKFEIHNNVNSSEFNLKDKIKYQINADTEVISKPPSPIYKSSKEKFDFLQTQLSDDALDTGIVINSLKNDSVQIKNEDALKTTFPPEECSEVTENIESNITDDSRLAQSNILLNPNETKINNIEPLNTEEEITISVPPRRKKQILADRRLKALNNKTEENTKEYPDHLNPFSDEENEHEQQDLAEKVSTNPFGSSDEDEMPNNPSVPDIQTSTPVKRHIKAYNPFWSDGEEPSSDEECDTNSNMHKSSITGSTPNLRSRRKPKAPPPPIPESLNLPSTSMDDVASISSFSSHNSTIHSDQKSFGVSTPKLKKKRPAPTPPDSLSTLSGCGGSMERGVDTSGSRNTSDVVRRVKGPAPGLPLPERREVKLQMSGEELQVQLDMLETQQLGLERQGVLIEQMIRDKCEGDGGPTAPQEEIEDLVIQLCELVNEKNDLFRKQTELMYIRRQQRLEQEQADIEHEIRVIQSRPAVNRIDSDKVREEQLVSRLVEIVRLRDELVQQLDAERRRERQEDLAIAASIATKRAQRNSESNSSSMKSGDAVTPVKKNKVKDKVKKQLKKAKHTLISKKKDEDKPDKEKKTK-
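
Length check: the lengths listed in this record agree with each other if the transcript is read as a complete coding sequence, since 3099 bp is 1033 codons, which is 1032 amino acids plus one stop codon (the trailing "-" in the protein sequence). The sketch below shows one way to run that check on FASTA text such as the two records above. The function names `read_fasta` and `lengths_consistent` are hypothetical helpers written for this illustration, not part of any database tooling, and the check assumes the transcript holds no untranslated regions.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def lengths_consistent(transcript, protein, stop_symbol="-"):
    """Return True if the transcript is one codon per residue plus one stop codon.

    Assumes the transcript is coding sequence only (no UTRs) and that a
    trailing stop_symbol in the protein marks the stop codon.
    """
    residues = protein.rstrip(stop_symbol)
    return len(transcript) == 3 * (len(residues) + 1)
```

Applied to this record, the call would be `lengths_consistent(recs["DPOGS203040-TA"], recs["DPOGS203040-PA"])`, where `recs` is the result of running `read_fasta` on the sequence text. A result of False would point to a truncated sequence or to untranslated regions in the transcript.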