Multidimensional Exploration

Fragment-based assembly is a widely used technique in ab initio protein structure prediction methods that seek to compute native-like conformations from sequence. Essentially, a protein conformation is pieced together from configurations of fragments extracted from databases of deposited native protein structures. Fragment length is an important consideration. The shorter the fragment, the more complex the conformational space in which the native-like conformations reside and the more rugged the energy surface associated with that space. The longer the fragment, the simpler the conformational space and the smoother the energy surface, but also the higher the risk of missing important regions of the space that may lead to native-like conformations.

This project investigates the impact of varying fragment length during the search for low-energy protein conformations within a probabilistic search framework. Several strategies for varying the fragment length during the search are explored. Essentially, longer fragments are used in early stages of the search to simplify the search space and smooth the energy surface. Shorter fragments are then used in later stages to expose the more complex and realistic conformational space. We investigate both an approach where fragment length is changed after analysis of the explored conformational space and an adaptive approach where the search itself decides when and how to change the fragment length it employs.
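The staged idea above (long fragments first, short fragments later) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the project: the fragment lengths 9, 6, and 3, the equal-sized stages, the toy energy function, the randomly generated "fragments" standing in for a fragment database, and the use of a Metropolis Monte Carlo acceptance rule as the probabilistic search.

```python
import math
import random


def scheduled_length(step, total_steps, lengths=(9, 6, 3)):
    """Pick the fragment length for this step: longest first, shortest last.

    The lengths and the equal-sized stages are illustrative assumptions.
    """
    stage = min(step * len(lengths) // total_steps, len(lengths) - 1)
    return lengths[stage]


def toy_energy(angles):
    """Stand-in rugged energy over backbone angles; NOT a real force field."""
    return sum(math.cos(3 * a) + 0.1 * a * a for a in angles)


def fragment_search(n_residues=30, total_steps=3000, temperature=1.0, seed=0):
    """Toy Metropolis search with a staged fragment length.

    Returns (initial_energy, best_energy).
    """
    rng = random.Random(seed)
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_residues)]
    energy = toy_energy(angles)
    initial_energy = best_energy = energy

    for step in range(total_steps):
        length = scheduled_length(step, total_steps)
        start = rng.randrange(n_residues - length + 1)

        # Hypothetical "fragment": random angles for `length` consecutive
        # residues. A real method would draw these from a fragment database.
        trial = angles[:]
        trial[start:start + length] = [
            rng.uniform(-math.pi, math.pi) for _ in range(length)
        ]
        trial_energy = toy_energy(trial)

        # Metropolis criterion: always accept downhill moves, accept uphill
        # moves with probability exp(-dE / T).
        delta = trial_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = trial, trial_energy
            best_energy = min(best_energy, energy)

    return initial_energy, best_energy


if __name__ == "__main__":
    start_e, best_e = fragment_search()
    print(f"initial energy {start_e:.2f}, best energy {best_e:.2f}")
```

An adaptive variant could replace `scheduled_length` with a rule that shortens the fragment when, for example, the acceptance rate stalls; that trigger is likewise a hypothetical choice and not a description of the project's method.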

Our ongoing work in this direction shows that varying the fragment length during the search enhances the sampling of the conformational space and results in higher-quality conformations (in terms of lRMSD) than strategies that employ a single fragment length. A preliminary version of this work appears in "Variable-Length Fragment Assembly within a Probabilistic Protein Structure Prediction Framework", an MS thesis by Kevin Molloy. A journal research article is in preparation.



On this Project:

  • Kevin Molloy

  • Amarda Shehu

This material is based upon work supported by the National Science Foundation under Grant No. 1016995. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.