Mapping Multi-basin Protein Energy Landscapes

We have recently proposed novel evolutionary algorithms to map the energy landscapes of proteins with multiple stable and semi-stable structural states. One of these algorithms, SIfTER, combines concepts from evolutionary computation, dimensionality reduction, and protein modeling research. The algorithm leverages experimentally available structures of wildtype and variant sequences of a protein to define a reduced search space from which to efficiently draw samples corresponding to novel structures not directly observed in the wet laboratory. This approach is based on the principle of conformational selection proposed by Nussinov and colleagues. SIfTER helps answer how sequence mutations affect the function of a protein when an abundance of experimental structures can be exploited to reconstruct an energy landscape that would be computationally impractical to obtain via Molecular Dynamics.
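The general idea of searching a reduced space derived from known structures can be illustrated with a minimal sketch. This is not the SIfTER implementation: it assumes a plain PCA (via SVD) as the dimensionality reduction, a hypothetical `toy_energy` with one basin per reference structure in place of a real protein energy function, and a simple elitist evolutionary loop. All function names and parameters here are illustrative assumptions.

```python
import numpy as np


def build_reduced_space(structures, n_components=2):
    """PCA over flattened coordinates of known structures (rows = structures)."""
    mean = structures.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(structures - mean, full_matrices=False)
    return mean, vt[:n_components]


def lift(z, mean, basis):
    """Map reduced-space points back to full coordinate space."""
    return mean + z @ basis


def toy_energy(coords, basins):
    """Placeholder energy (an assumption): squared distance to the nearest basin."""
    d2 = ((coords[:, None, :] - basins[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1)


def evolve(structures, basins, generations=50, step=0.5, seed=0):
    """Elitist evolutionary search in the PCA-reduced space."""
    rng = np.random.default_rng(seed)
    mean, basis = build_reduced_space(structures)
    # Seed the population with the projections of the known structures.
    pop = (structures - mean) @ basis.T
    energy = toy_energy(lift(pop, mean, basis), basins)
    history = [energy.min()]
    for _ in range(generations):
        # Variation: Gaussian perturbation of each parent in reduced space.
        children = pop + rng.normal(scale=step, size=pop.shape)
        child_energy = toy_energy(lift(children, mean, basis), basins)
        # Selection: keep the lowest-energy individuals among parents and children.
        merged = np.vstack([pop, children])
        merged_energy = np.concatenate([energy, child_energy])
        keep = np.argsort(merged_energy)[: len(pop)]
        pop, energy = merged[keep], merged_energy[keep]
        history.append(energy.min())
    return pop, energy, history


# Synthetic stand-in for experimental structures: 8 structures, 10 atoms x 3 coordinates.
demo_structures = np.random.default_rng(1).normal(size=(8, 30))
demo_pop, demo_energy, demo_history = evolve(
    demo_structures, basins=demo_structures[:2], generations=20
)
```

Global truncation selection, as used in this sketch, tends to collapse the population into a single basin. Mapping a multi-basin landscape generally calls for a selection scheme that preserves diversity, so the loop above should be read only as a skeleton of the sample, evaluate, and select cycle.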

SIfTER allows mapping and juxtaposing the landscapes of variant sequences and then relating observed differences to functional changes in a protein. We have applied SIfTER to map in detail the energy landscapes of the wildtype and two oncogenic sequences of the catalytic domain of H-Ras. Analysis of the SIfTER-computed energy landscapes for the wildtype and variant H-Ras catalytic domain suggests that the oncogenic mutations G12V and Q61L cause constitutive activation through two different mechanisms. G12V directly affects binding specificity while leaving the energy landscape largely unchanged, whereas Q61L has pronounced effects on the landscape. An implementation of SIfTER is available at \url{}.

A video of SIfTER in action, as it explores the conformation space for the wildtype sequence of H-Ras, is shown here.

We have started to explore additional evolutionary techniques and algorithms for mapping complex multimodal landscapes. Some of this work has been published at GECCO 2015 and at workshops. In particular, we have presented a CMA-ES adaptation that shows the promise of concepts from evolutionary computation in addressing difficult problems in protein modeling.

Currently on this Project:

  • Ryan Moffatt
  • Emanuel Sapin
  • Kenneth De Jong
  • Amarda Shehu

This material is based upon work supported by the National Science Foundation under Grant No. 1421001 and IIS CAREER Award No. 1144106. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.