Proteins: Structure, Function, Bioinformatics
The fibroblast growth factor receptor (FGFR) substrate 2 (FRS2) family proteins function as scaffolding adapters for receptor tyrosine kinases (RTKs). The FRS2α proteins interact with RTKs through the phosphotyrosine-binding (PTB) domain and transfer signals from the activated receptors to downstream effector proteins. Here, we report the nuclear magnetic resonance structure of the FRS2α PTB domain bound to phosphorylated TrkB. The structure reveals that the FRS2α PTB domain contains two distinct but adjacent pockets for its mutually exclusive interaction with either the nonphosphorylated juxtamembrane region of the FGFR or tyrosine-phosphorylated TrkA and TrkB peptides. The new structural insights suggest rational design of selective small molecules through targeting of the two conjunct pockets in the FRS2α PTB domain. Proteins 2014; 82:1534–1541. © 2014 Wiley Periodicals, Inc.
Agrobacterium tumefaciens is a Gram-negative soil-borne bacterium that causes Crown Gall disease in many economically important crops. The absence of a suitable chemical treatment means there is a need to discover new anti-Crown Gall agents and also characterize bona fide drug targets. One such target is dihydrodipicolinate synthase (DHDPS), a homo-tetrameric enzyme that catalyzes the committed step in the metabolic pathway yielding meso-diaminopimelate and lysine. Interestingly, there are 10 putative DHDPS genes annotated in the A. tumefaciens genome, including three whose structures have recently been determined (PDB IDs: 3B4U, 2HMC, and 2R8W). However, we show using quantitative enzyme kinetic assays that nine of the 10 dapA gene products, including 3B4U, 2HMC, and 2R8W, lack DHDPS function in vitro. A sequence alignment showed that the product of the dapA7 gene contains all of the conserved residues known to be important for DHDPS catalysis and allostery. This gene was cloned and the recombinant product expressed and purified. Our studies show that the purified enzyme (i) possesses DHDPS enzyme activity, (ii) is allosterically inhibited by lysine, and (iii) adopts the canonical homo-tetrameric structure in both solution and the crystal state. This study describes for the first time the structure, function and allostery of the bona fide DHDPS from A. tumefaciens, which offers insight into the rational design of pesticide agents for combating Crown Gall disease. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
Molecular simulations of β-lactoglobulin complexed with fatty acids reveal the structural basis of ligand affinity to internal and possible external binding sites
The interaction of saturated fatty acids of different length (C8:0 to C18:0) with β-lactoglobulin (βLG) was investigated by molecular dynamics simulation and docking approaches. The results show that the presence of such ligands in the hydrophobic central cavity of βLG, known as the protein calyx, enhances atomic fluctuations compared to the unliganded form, especially for loops at the entrance of the binding site. Concerted motions are evidenced for protein regions that could favor the binding of ligands. The mechanism of anchoring of fatty acids of different length is similar for the carboxylate head-group, through electrostatic interactions with the side chains of Lys60/Lys69. The key protein residues to secure the hydrocarbon chain are Phe105/Met107, which adapt their conformation upon ligand binding. In particular, Phe105 provides an additional hydrophobic clamp only for the tail of the two fatty acids with the longest chains, palmitic and stearic acid, which are known to bind βLG with a high affinity. The search for additional external binding sites for fatty acids, distinct from the calyx, was also carried out for palmitic acid. Two external sites with a lower affinity were identified as secondary sites, one consisting of a hydrophobic cavity allowing two distinct binding modes for the fatty acid, and the other corresponding to a surface crevice close to the protein α-helix. The overall results provide a comprehensive picture of the dynamical behaviour of βLG in complex with fatty acids, and elucidate the structural basis of the binding of these physiological ligands. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
Increasing stability of antibody via antibody engineering: Stability engineering on an anti-hVEGF antibody
Antibody stability is very important for expression, activity, specificity and storage. The present knowledge of antibody structure has made it possible to use computer-aided molecular design to optimize and increase antibody stability. Many computational methods have been built based on knowledge or structure; however, a good integrated engineering system that combines these methods has yet to be developed. In the current study, we designed an integrated computer-aided engineering protocol, which included several successful methods. Mutants were designed considering factors that affected stability, and multiwall filter screening was used to improve the design accuracy. Using this protocol, the thermo-stability of an anti-hVEGF antibody was significantly improved. Nearly 40% of the single-point mutants proved to be more stable than the parent antibody, and most of the mutations could be stacked effectively. The T50 also improved by about 7°C through combined mutation of 7 sites in the light chain and 3 sites in the heavy chain. The data indicate that the protocol is an effective method for optimization of antibody structure, especially for improving thermo-stability. This protocol could also be used to enhance the stability of other antibodies. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
Predicting the side-chain dihedral angle distributions of non-polar, aromatic, and polar amino acids using hard sphere models
The side-chain dihedral angle distributions of all amino acids have been measured from myriad high-resolution protein crystal structures. However, we do not yet know the dominant interactions that determine these distributions. Here, we explore to what extent the defining features of the side-chain dihedral angle distributions of different amino acids can be captured by a simple physical model. We find that a hard-sphere model for a dipeptide mimetic that includes only steric interactions plus stereochemical constraints is able to recapitulate the key features of the observed backbone-dependent side-chain dihedral angle distributions of Ser, Cys, Thr, Val, Ile, Leu, Phe, Tyr, and Trp. We find that for certain amino acids, performing the calculations with the amino acid of interest in the central position of a short α-helical segment improves the match between the predicted and observed distributions. We also identify the atomic interactions that give rise to the differences between the predicted distributions for the hard-sphere model of the dipeptide and that of the α-helical segment. Finally, we point out a case where the hard-sphere plus stereochemical constraint model is insufficient to recapitulate the observed side-chain dihedral angle distribution, namely the distribution P(χ3) for Met. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
The N-terminal of Annexin A1 as a secondary membrane binding site: A molecular dynamics study
Annexin A1 has been shown to cause membrane aggregation and fusion, yet the mechanism of these activities is not clearly understood. In this work, molecular dynamics simulations were performed on monomeric annexin A1 positioned between two negatively charged monolayers using AMBER's all-atom force field to gain insight into the mechanism of fusion. Each phospholipid monolayer was made up of 180 DOPC molecules and 45 DOPG molecules to achieve a 4:1 ratio. The space between the two monolayers was explicitly solvated using TIP3P waters in a rectilinear box. The constructed setup contained up to 0.14 million atoms. Application of periodic boundary conditions to the simulation setup gave the desired effect of two continuous membrane bilayers. Non-bonded interactions were calculated between the N-terminal residues and the bottom layer of phospholipids, which displayed a strong attraction of K26 and K29 to the lipid headgroups. The side-chains of these two residues were observed to orient themselves in close proximity (approximately 3.5 Å) to the polar headgroups of the phospholipids. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
Euclidean sections of protein conformation space and their implications in dimensionality reduction
Dimensionality reduction is widely used in searching for the intrinsic reaction coordinates for protein conformational changes. We find that dimensionality-reduction methods using the pairwise root-mean-square deviation as the local distance metric face a challenge. We use Isomap as an example to illustrate the problem. We believe that there is an implied assumption for the dimensionality-reduction approaches that aim to preserve the geometric relations between the objects: both the original space and the reduced space have the same kind of geometry, such as Euclidean geometry vs. Euclidean geometry or spherical geometry vs. spherical geometry. When the protein free energy landscape is mapped onto a 2D plane or 3D space, the reduced space is Euclidean, thus the original space should also be Euclidean. For a protein with N atoms, its conformation space is a subset of the 3N-dimensional Euclidean space R^(3N). We formally define the protein conformation space as the quotient space of R^(3N) by the equivalence relation of rigid motions. Whether the quotient space is Euclidean or not depends on how it is parameterized. When the pairwise root-mean-square deviation is employed as the local distance metric, implicit representations are used for the protein conformation space, leading to no direct correspondence to a Euclidean set. We have demonstrated that an explicit Euclidean-based representation of protein conformation space and the local distance metric associated with it improve the quality of dimensionality reduction in the tetra-peptide and β-hairpin systems. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
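To make the idea of an explicit Euclidean representation concrete, here is a minimal sketch in Python. The abstract does not say which representation the authors use, so the vector of all pairwise interatomic distances is an assumption chosen only for illustration, and the helper names (distance_vector, euclidean, rigid_move) are hypothetical. The sketch shows that two copies of a conformation related by a rigid motion map to the same point, so the ordinary Euclidean distance between such vectors can serve as a local metric.

```python
import math
from itertools import combinations

def distance_vector(coords):
    """Map a conformation (list of (x, y, z) atoms) to the vector of all
    pairwise interatomic distances. This is an illustrative choice of
    explicit representation, not necessarily the one used in the paper."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]

def euclidean(u, v):
    """Ordinary Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def rigid_move(coords, angle, shift):
    """Rotate about the z axis by `angle` (radians), then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Hypothetical four-atom conformation and a rigidly moved copy of it.
conf = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.0)]
moved = rigid_move(conf, 0.7, (3.0, -2.0, 5.0))

# A genuinely different conformation: the last atom is displaced.
other = conf[:3] + [(0.0, 1.5, 2.0)]

same = euclidean(distance_vector(conf), distance_vector(moved))
diff = euclidean(distance_vector(conf), distance_vector(other))
```

One caveat of this particular choice: pairwise distances are also unchanged by reflection, so mirror-image conformations are not distinguished.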
Crystal structure of the secreted protein HP1454 from the human pathogen Helicobacter pylori
HP1454 is a protein of 303 amino acids found in the extracellular milieu of Helicobacter pylori. The protein structure, crystallized in the orthorhombic C2221 space group with one molecule per asymmetric unit, has been determined using the single-wavelength anomalous dispersion method. HP1454 exhibits an elongated bent shape, composed of three distinct domains. Each domain possesses a fold already present in other structures: Domain I contains a three-strand antiparallel β-barrel flanked by a long α-helix, Domain II is an anti-parallel three-helix bundle, and Domain III a β-sheet flanked by two α-helices. The overall assembly of the protein does not bear any similarity with known structures. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
Thiamine diphosphate (ThDP)-dependent enzymes form a diverse protein family which was classified into nine superfamilies. The cofactor ThDP is bound at the interface between two catalytic domains, the PYR and the PP domain. The nine superfamilies were assigned to five different structural architectures. Two superfamilies, the sulfopyruvate decarboxylases and α-ketoacid dehydrogenases 2, consist of separate PYR and PP domains. The oxidoreductase superfamily is of the intra-monomer/PYR-PP type with an N-terminal PYR and a subsequent PP domain. The active enzymes form homodimers with the ThDP cofactor bound at the interface between a PYR and a PP domain of the same monomer. Decarboxylases are of the inter-monomer/PYR-PP type with the cofactor bound between domains from different monomers. 1-Deoxy-d-xylulose-5-phosphate synthases are of the intra-monomer/PP-PYR type. The transketolases, α-ketoglutarate dehydrogenases, and α-ketoacid dehydrogenases 1 are of the inter-monomer/PP-PYR type. For the phosphonopyruvate decarboxylases, definitive assessment of the structural architecture is not possible due to lack of structure information. By applying a structure-based domain alignment method, sequences of more than 62,000 PYR and PP domains were identified and aligned. Although the sequence similarity of the catalytic domains is low between different superfamilies, seven positions were identified to be highly conserved, including the cofactor binding GDGX24,27N motif, the cofactor-activating glutamic acid, and two structurally equivalent glycines in both the PYR and the PP domain. An evolutionary pathway of ThDP-dependent enzymes is proposed which explains the sequence and structure diversity of this family by three basic evolutionary events: domain recruitment, domain linkage, and structural rearrangement of catalytic domains. Proteins 2014. © 2014 Wiley Periodicals, Inc.
RBRDetector: Improved prediction of binding residues on RNA-binding protein structures using complementary feature- and template-based strategies
Computational prediction of RNA-binding residues is helpful in uncovering the mechanisms underlying protein-RNA interactions. Traditional algorithms applied either a feature-based or a template-based prediction strategy in isolation to recognize these crucial residues, which could restrict their predictive power. To improve RNA-binding residue prediction, herein we propose the first integrative algorithm, termed RBRDetector (RNA-Binding Residue Detector), which combines these two strategies. We developed a feature-based approach that is an ensemble learning predictor comprising multiple structure-based classifiers, in which well-defined evolutionary and structural features in conjunction with sequential or structural microenvironment were used as the inputs of support vector machines. Meanwhile, we constructed a template-based predictor to recognize the putative RNA-binding regions by structurally aligning the query protein to the RNA-binding proteins with known structures. The final RBRDetector algorithm is an ingenious fusion of our feature- and template-based approaches based on a piecewise function. By validating our predictors with diverse types of structural data, including bound and unbound structures, native and simulated structures, and protein structures binding to different RNA functional groups, we consistently demonstrated that RBRDetector not only had clear advantages over its component methods, but also significantly outperformed the current state-of-the-art algorithms. Nevertheless, the major limitation of our algorithm is that it performed relatively well on DNA-binding proteins and thus incorrectly predicted the DNA-binding regions as RNA-binding interfaces. Finally, we implemented the RBRDetector algorithm as a user-friendly web server, which is freely accessible at http://ibi.hzau.edu.cn/rbrdetector. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
The kinetics of protein interactions are essential determinants in many cellular processes such as signal transduction and transcriptional regulation. Many proteins involved in these functions contain intrinsically disordered regions. This makes conformational flexibility a non-negligible factor when studying the binding kinetics of these proteins. Compared with the binding of rigid proteins, which is limited by diffusion, the binding mechanisms of proteins with internal flexibility are much more complicated. Using a small protein that contains two domains and a connecting loop as a test system, we developed a multiscale simulation framework to study the role of flexible linkers in regulating the kinetics of protein binding. The association and dissociation processes were implemented by a coarse-grained Monte Carlo algorithm, while the conformational changes of the flexible linker were captured from all-atom molecular dynamics simulations. Our simulations illustrated that the presence of the extended domain linker can enhance the rate of protein association. On the other hand, the full-length flexible molecule is more difficult to dissociate than its two rigid domains but much easier to dissociate than the molecule with a rigid linker. Overall, our studies demonstrated that both the kinetics and the thermodynamics of protein binding are closely modulated by the dynamic features of linker regions. Proteins 2014. © 2014 Wiley Periodicals, Inc.
Temperature-accelerated molecular dynamics gives insights into globular conformations sampled in the free state of the AC catalytic domain
The catalytic domain of the adenyl cyclase (AC) toxin from Bordetella pertussis is activated by interaction with calmodulin (CaM), resulting in cAMP overproduction in the infected cell. In the X-ray crystallographic structure of the complex between AC and the C terminal lobe of CaM, the toxin displays a markedly elongated shape. As for the structure of the isolated protein, experimental results support the hypothesis that more globular conformations are sampled, but information at atomic resolution is still lacking. Here, we use temperature-accelerated molecular dynamics (TAMD) simulations to generate putative all-atom models of globular conformations sampled by CaM-free AC. As collective variables, we use centers of mass coordinates of groups of residues selected from the analysis of standard molecular dynamics (MD) simulations. Results show that TAMD allows extended conformational sampling and generates AC conformations that are more globular than in the complexed state. These structures are then refined via energy minimization and further unrestrained MD simulations to optimize inter-domain packing interactions, thus resulting in the identification of a set of hydrogen bonds present in the globular conformations. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
G protein-coupled receptors (GPCRs) are a vital class of proteins that transduce biological signals across the cell membrane. However, their allosteric activation mechanism is not fully understood; crystal structures of active and inactive receptors have been reported, but the functional pathway between these two states remains elusive. Here, we use structure-based (Gō-like) models to simulate activation of two GPCRs, rhodopsin and the β2 adrenergic receptor (β2AR). We used data-derived reaction coordinates that capture the activation mechanism for both proteins, showing that activation proceeds through quantitatively different paths in the two systems. Both reaction coordinates are determined from the dominant concerted motions in the simulations so the technique is broadly applicable. There were two surprising results. First, the main structural changes in the simulations were distributed throughout the transmembrane bundle, and not localized to the obvious areas of interest, such as the intracellular portion of Helix 6. Second, the activation (and deactivation) paths were distinctly nonmonotonic, populating states that were not simply interpolations between the inactive and active structures. These transitions also suggest a functional explanation for β2AR's basal activity: it can proceed through a more broadly defined path during the observed transitions. Proteins 2014. © 2014 Wiley Periodicals, Inc.
A detailed representation of electrostatic energy in prediction of sequence and pH dependence of protein stability
A molecular mechanics model, previously validated in applications to structure prediction, is shown to reproduce experiment in predictions of protein ionization state, and in predictions of sequence and pH dependence of protein stability. Over a large dataset, 1876 values of ΔΔG of folding, the RMSD is 1.34 kcal/mol. Under an alternative measure of accuracy, which requires that either the sign of the calculated ΔΔG agrees with experiment or the absolute value of the deviation is less than 1.0 kcal/mol, 1660 of 1876 data points (88.5%) pass the condition. Relative to models used previously in computer-aided protein design, the concept that we propose is most responsible for the performance of our model, and for its extensibility to non-neutral values of pH, is the treatment of electrostatic energy. The electronic structure of the protein is modeled using distributed atomic multipoles. The structured liquid state of the solvent is modeled using a dielectric continuum. A modification to the energetics of the reaction field, induced by the protein in the dielectric continuum, attempts to account for preformed multipoles of solvent water molecules and ions. An adjustable weight (with optimal value 0.141) applied to the total vacuum energy accounts implicitly for electronic polarization. A threshold distance, beyond which pairwise atomic interactions are neglected, is not used. In searches through subspaces of sequences and conformations, efficiency remains acceptable for useful applications. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
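The two accuracy measures quoted in this abstract (RMSD over ΔΔG values, and the sign-or-tolerance pass condition) can be sketched as follows. This is only an illustration of the arithmetic: the function names and the four sample values are hypothetical, and treating a value of exactly zero as having no sign is an assumption not addressed in the abstract.

```python
import math

def rmsd(calc, expt):
    """Root-mean-square deviation between calculated and experimental values."""
    n = len(calc)
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / n)

def passes(calc, expt, tol=1.0):
    """Pass if the signs agree OR the absolute deviation is below tol (kcal/mol).
    Assumption: a value of exactly zero is treated as having no sign."""
    same_sign = (calc > 0 and expt > 0) or (calc < 0 and expt < 0)
    return same_sign or abs(calc - expt) < tol

def pass_fraction(calc, expt, tol=1.0):
    """Fraction of data points that satisfy the pass condition."""
    hits = sum(passes(c, e, tol) for c, e in zip(calc, expt))
    return hits / len(calc)

# Hypothetical ddG values in kcal/mol, for illustration only.
calc_ddg = [1.0, -2.0, 3.0, -0.5]
expt_ddg = [2.5, 1.0, 3.2, 0.2]

sample_rmsd = rmsd(calc_ddg, expt_ddg)
sample_fraction = pass_fraction(calc_ddg, expt_ddg)

# The percentage reported in the abstract follows from its own counts.
reported_percent = round(100 * 1660 / 1876, 1)
```

In the sample data, the second point fails (opposite sign and a deviation of 3.0 kcal/mol), while the fourth passes despite the sign mismatch because its deviation is 0.7 kcal/mol.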
Inference of the molecular function of proteins is the fundamental task in the quest for understanding cellular processes. The task is getting increasingly difficult with thousands of new proteins discovered each day. The difficulty arises primarily from the lack of high-throughput experimental techniques for assessing protein molecular function, a lacuna that computational approaches are trying hard to fill. The latter, too, face a major bottleneck in the absence of clear evidence based on evolutionary information. Here we propose a de novo approach to annotate protein molecular function through a structural dynamics match for a pair of segments from two dissimilar proteins, which may share even <10% sequence identity. To screen these matches, corresponding 1 µs coarse-grained (CG) molecular dynamics trajectories were used to compute normalized root-mean-square-fluctuation graphs and select mobile segments, which were thereafter matched for all pairs using unweighted three-dimensional autocorrelation vectors. Our in-house custom-built forcefield (FF), extensively validated against dynamics information obtained from experimental nuclear magnetic resonance data, was specifically used to generate the CG dynamics trajectories. The test for correspondence between the dynamics signature of protein segments and function revealed an 87% true positive rate and a 93.5% true negative rate on a dataset of 60 experimentally validated proteins, including moonlighting proteins and those with novel functional motifs. A random test against 315 unique fold/function proteins for a negative test gave >99% true recall. A blind prediction on a novel protein appears consistent with additional evidence retrieved therein. This is the first proof-of-principle of generalized use of structural dynamics for inferring protein molecular function leveraging our custom-made CG FF, useful to all. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
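The screening step described above starts from normalized root-mean-square-fluctuation (RMSF) profiles. A minimal sketch of that step is given below, under stated assumptions: frames are taken to be already superposed, normalization is done by dividing by the largest value, and the mobility cutoff of 0.5 is arbitrary. None of these choices, nor the function names, come from the abstract.

```python
import math

def rmsf(trajectory):
    """Per-residue RMSF. `trajectory` is a list of frames; each frame is a
    list of (x, y, z) bead positions, one per residue. Frames are assumed
    to be superposed onto a common reference beforehand."""
    n_frames = len(trajectory)
    n_res = len(trajectory[0])
    values = []
    for i in range(n_res):
        # Time-averaged position of residue i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement from that average.
        msf = sum(sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
                  for frame in trajectory) / n_frames
        values.append(math.sqrt(msf))
    return values

def normalize(values):
    """Scale so the most mobile residue has value 1 (an assumed convention)."""
    top = max(values)
    return [v / top for v in values] if top > 0 else [0.0 for _ in values]

def mobile_residues(norm_rmsf, cutoff=0.5):
    """Indices of residues whose normalized RMSF reaches the (assumed) cutoff."""
    return [i for i, v in enumerate(norm_rmsf) if v >= cutoff]

# Hypothetical two-frame, three-residue trajectory for illustration.
traj = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
]
profile = normalize(rmsf(traj))
mobile = mobile_residues(profile)
```

Contiguous runs of the selected indices would then form the mobile segments that are compared between proteins; the autocorrelation-vector matching itself is not sketched here.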
Exploring the early stages of the pH-induced conformational change of influenza hemagglutinin
Hemagglutinin (HA) mediates the membrane fusion process of influenza virus through its pH-induced conformational change. However, it remains challenging to study its structural reorganization pathways in atomic detail. Here, we first applied a continuous constant-pH molecular dynamics approach to predict the pKa values of titratable residues in H2 subtype HA. The calculated net charges in the HA1 globular heads increase from 0e (pH 7.5) to +14e (pH 4.5), indicating that charge repulsion drives the detrimerization of the HA globular domains. In the HA2 stem regions, critical pH sensors, such as Glu1032, His181, and Glu891, are identified to facilitate the essential structural reorganizations in the fusion pathways, including fusion peptide release and interhelical loop transition. To probe the contribution of the identified pH sensors and unveil the early steps of the pH-induced conformational change, we carried out conventional molecular dynamics simulations in explicit water with a determined protonation state for each titratable residue under different environmental pH conditions. In particular, energy barriers involving previously uncharacterized hydrogen bonds and hydrophobic interactions are identified in the fusion peptide release pathway. Nevertheless, comprehensive comparisons across HA family members indicate that different HA subtypes might employ diverse pH sensor groups along with different fusion pathways. Finally, we explored the fusion inhibition mechanism of the antibody CR6261 and the small-molecule inhibitor TBHQ, and discovered a novel druggable pocket in the H2 and H5 subtypes. Our results provide the underlying mechanism for the pH-driven conformational changes and also novel insight for anti-flu drug development. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
How the folding rates of two- and multistate proteins depend on the amino acid properties
Proteins fold by either a two-state or a multistate kinetic mechanism. We observe that amino acids play different roles in the different mechanisms. Many residues that readily form regular secondary structures (α helices, β sheets and turns) can promote the two-state folding reactions of small proteins. Most hydrophilic residues can speed up the multistate folding reactions of large proteins. Folding rates of large proteins are equally responsive to the flexibility of partial amino acids. Other properties of amino acids (including volume, polarity, accessible surface, exposure degree, isoelectric point, and phase transfer energy) contribute little to the folding kinetics of the proteins. Cysteine is a special residue: it triggers the two-state folding reaction but inhibits the multistate folding reaction. These findings not only provide new insight into protein structure prediction, but could also be used to direct point mutations that change the folding rate. Proteins 2014;. © 2014 Wiley Periodicals, Inc.
Multiple conformational states and gate opening of outer membrane protein TolC revealed by molecular dynamics simulations
Outer membrane protein TolC serves as an exit duct for exporting substances out of the cell. The occluded periplasmic entrance of TolC must open for substrate transport, although the opening mechanism remains elusive. In this study, systematic molecular dynamics (MD) simulations of wild-type TolC and six mutants were performed to explore the conformational dynamics of TolC. The periplasmic gate was shown to sample multiple conformational states with various degrees of gate opening. Gate opening was facilitated by all mutations except Y362F, which adopts an even more closed state than wild-type TolC. The interprotomer salt bridge R367–D153 turns out to be crucial for periplasmic gate opening. The mutations that disrupt the interactions at the periplasmic tip may affect the stability of the trimeric assembly of TolC. Structural asymmetry of the periplasmic gate was observed to depend on the size of the opening. Asymmetric conformations are found in moderately open states, while the most and the least open states are often more symmetric. Finally, it is shown that lowering the pH can remarkably stabilize the closed state of the periplasmic gate. Proteins 2014;. © 2014 Wiley Periodicals, Inc.