Modeling Protein Flexibility with Spatial and Energetic Constraints
Yi-Chieh Wu1, Amarda Shehu2, Lydia Kavraki2,3
1Dept. of Electrical and Computer Engineering, Rice University
2Dept. of Computer Science, Rice University
3Dept. of Bioengineering, Rice University
Conclusions
• Provided an approach to generating physical conformations of a protein
• Modeled flexibility of the binding site
Future work
• Investigate other modes of motion
• Incorporate multiple motion vectors
Acknowledgements
• Computing Research Association's Committee on the Status of Women in Computing Research Distributed Mentor Project
• W. M. Keck Center Undergraduate Research Training Program
• Physical and Biological Computing Group, Rice University
For questions, comments, and preprint requests: Yi-Chieh Wu [email protected]
References
• A. A. Canutescu and R. L. Dunbrack. Cyclic coordinate descent: A robotics algorithm for protein loop closure. Protein Science, 12: 963-972, 2003.
• A. Shehu. Sampling Biomolecular Conformations with Spatial and Energetic Constraints. MS Thesis, Rice University, 2004.
Discussion
• Modeled flap movement of HIV-1 protease using the first principal component
• Opened and closed the flaps while keeping the protein stable
• Movement is concentrated in the flaps
• Open-flap conformations are less constrained: recovered conformations have higher RMSDs
Spatial and Energetic Constraints
Spatial Constraints
• Inverse kinematics solved with CCD
• Features are defined along the backbone, so sidechains are kept rigid
• Displacement is only valid in a small neighborhood
Energetic Constraints
• Full conjugate gradient minimization of the CHARMM energy
• Energy cutoff of 600 kcal/mol
• "Rewind" to the previous conformation if a high-energy barrier is encountered
Problem Statement
Motivation
• Most current methods consider proteins as rigid structures
• Models incorporating protein flexibility provide better representations
Problem Definition
• Generate a set of conformations that capture the most important motions
• Follow along collective modes of motion starting from an initial structure
• Limited by local search: analysis fails far from the native structure
Model System
HIV-1 protease
• A viral protein that assists in HIV replication
• Target of drug design: single point of failure
• Native structure: fully minimized structure of 4HVP from the Protein Data Bank
Principal component analysis (PCA)
• Identifies major modes of motion
• Direct physical interpretations
• HIV-1 protease: the first eigenvector corresponds to the opening and closing of the flaps surrounding the binding site
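As a rough illustration of how PCA identifies major modes of motion, the sketch below extracts principal components from an ensemble of conformations with NumPy. The function name `principal_modes`, the toy ensemble, and the assumption that all conformations are already superimposed on a common reference are illustrative choices, not details taken from the poster.

```python
import numpy as np

def principal_modes(conformations):
    """Return (eigenvalues, modes) of the coordinate covariance, largest first.

    conformations: array of shape (n_conformations, n_atoms, 3), assumed to be
    already superimposed on a common reference structure.
    """
    X = np.asarray(conformations, dtype=float)
    n_conf = X.shape[0]
    flat = X.reshape(n_conf, -1)             # one row per conformation
    centered = flat - flat.mean(axis=0)      # remove the average structure
    # SVD of the centered data: the rows of vt are orthonormal modes of motion,
    # ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s ** 2 / (n_conf - 1)
    return eigenvalues, vt

# Toy ensemble (hypothetical data): 4 atoms, only atom 0 moves, along x.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 3))
amplitudes = np.linspace(-1.0, 1.0, 20)
ensemble = np.stack([base for _ in amplitudes])
ensemble[:, 0, 0] += amplitudes

eigenvalues, modes = principal_modes(ensemble)
first_mode = modes[0].reshape(4, 3)          # per-atom displacement vectors
```

Reshaping the first mode to `(n_atoms, 3)` gives one displacement vector per atom, which is the form in which a mode such as flap opening and closing can be read off.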
Figure 4. Backbone representation of HIV-1 protease bound to an inhibitor (orange).
Feature Definition
• Features: residues with constrained positions
• Choose atoms with the largest displacements (Figure 5)
• Internal features are moved along the first principal component to capture flap movement
• End features are left unmoved to keep the rest of the protein native-like and maintain low energy
Figure 5: Atom displacements along the first PCA. Red circles mark the indices of our chosen features.
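The feature choice described above can be sketched as follows: rank atoms by the size of their displacement along the first principal component, take the largest as internal features, and give end features targets equal to their current positions. The helper names `choose_internal_features` and `feature_targets`, and the toy coordinates, are hypothetical and only meant to make the idea concrete.

```python
import numpy as np

def choose_internal_features(mode, n_features, exclude=()):
    """Indices of the atoms that move most along `mode` (shape (n_atoms, 3))."""
    displacement = np.linalg.norm(mode, axis=1)
    excluded = np.asarray(list(exclude), dtype=int)
    displacement[excluded] = -np.inf         # never pick excluded (end) atoms
    ranked = np.argsort(displacement)[::-1]  # largest displacement first
    return sorted(int(i) for i in ranked[:n_features])

def feature_targets(coords, mode, internal, ends, step):
    """Target positions: internal features move along the mode, ends stay put."""
    targets = {i: coords[i] + step * mode[i] for i in internal}
    targets.update({i: coords[i].copy() for i in ends})
    return targets

# Toy example (hypothetical data): 6 atoms on a line, atoms 2 and 3 move most.
coords = np.zeros((6, 3))
coords[:, 0] = np.arange(6)
mode = np.zeros((6, 3))
mode[1] = [0.1, 0.0, 0.0]
mode[2] = [0.0, 0.6, 0.0]
mode[3] = [0.0, -0.6, 0.0]

ends = [0, 5]
internal = choose_internal_features(mode, 2, exclude=ends)
targets = feature_targets(coords, mode, internal, ends, step=0.5)
```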
Method
Results
RMSDs of Recovered Conformations along the First PCA
(The highest RMSD as measured against the native structure is given. RMSD is measured in angstroms.)
Direction  Step Size  Flap All-Atom RMSD  Flap Backbone RMSD  Flap Sidechain RMSD  Rest All-Atom RMSD  Total All-Atom RMSD
Close      0.1        2.125               2.856               3.188                0.483               1.117
Close      0.25       2.235               2.104               2.266                0.359               0.804
Close      0.5        2.097               2.032               2.113                0.337               0.7552
Close      1.0        2.289               2.166               2.319                0.298               0.798
Close      2.5        2.159               1.993               2.198                0.312               0.764
Open       0.1        4.668               4.027               4.814                0.517               1.599
Open       0.25       3.421               3.111               3.494                0.356               1.166
Open       0.5        3.340               3.171               3.454                0.375               1.164
Open       1.0        2.351               2.030               2.424                0.247               0.802
Open       2.5        1.643               1.434               1.691                0.240               0.582
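For readers unfamiliar with the measure, the sketch below computes an RMSD over all atoms or over a subset such as the flap or the rest of the protein. It assumes the two structures are already superimposed; whether the values in the table were computed with or without a prior alignment is not stated here, and the function name `rmsd` and the toy coordinates are illustrative.

```python
import numpy as np

def rmsd(a, b, atom_indices=None):
    """Root-mean-square deviation between two (n_atoms, 3) coordinate sets.

    Assumes the structures are already superimposed (no alignment is done).
    atom_indices optionally restricts the calculation to a subset of atoms.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if atom_indices is not None:
        a = a[atom_indices]
        b = b[atom_indices]
    squared = np.sum((a - b) ** 2, axis=1)   # squared distance per atom
    return float(np.sqrt(np.mean(squared)))

# Toy example (hypothetical data): only atom 0 is displaced, by 2 angstroms.
native = np.zeros((4, 3))
moved = native.copy()
moved[0, 0] += 2.0

total = rmsd(native, moved)                  # all atoms
flap = rmsd(native, moved, [0])              # subset containing the moved atom
rest = rmsd(native, moved, [1, 2, 3])        # subset that did not move
```

Splitting the calculation by subset is what allows a large flap RMSD to coexist with a small RMSD for the rest of the protein.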
Figure 6: Backbone representation of flap movement along the first PCA. Features used are shown as gray spheres.
Algorithm
Proteins as Robotic Manipulators
Rigid geometry model
• Dihedrals are the only degrees of freedom
• Reduces problem dimensionality
Figure 1†: Rigid geometry model. Only dihedral angles are used as degrees of freedom. Backbone dihedrals (phi and psi) are depicted.
Figure 2: A protein modeled as an articulated mechanism.
Figure 3†: Using CCD to satisfy spatial constraints. One joint (circled in green) is rotated at a time to bring the end-effector (blue) closer to the target position (red).
†Figures adapted from: I. Lotan. (2004). Algorithms exploiting the chain structure of proteins. PhD Thesis, Stanford University.
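In a rigid geometry model, changing a dihedral amounts to rotating every atom downstream of a bond about that bond's axis while bond lengths and bond angles stay fixed. A minimal sketch using Rodrigues' rotation formula is given below; the function name `rotate_about_bond`, its argument layout, and the four-atom toy chain are assumptions made for illustration.

```python
import numpy as np

def rotate_about_bond(coords, i, j, angle, moving):
    """Rotate the atoms listed in `moving` by `angle` radians about bond i -> j.

    coords: (n_atoms, 3) array. A rotated copy is returned; the input array
    is not modified.
    """
    coords = np.array(coords, dtype=float)
    origin = coords[j]
    axis = coords[j] - coords[i]
    axis = axis / np.linalg.norm(axis)       # unit vector along the bond
    v = coords[moving] - origin              # positions relative to the axis
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    # Rodrigues' rotation formula applied to every moving atom at once.
    rotated = (v * cos_a
               + np.cross(axis, v) * sin_a
               + np.outer(v @ axis, axis) * (1.0 - cos_a))
    coords[moving] = rotated + origin
    return coords

# Toy chain (hypothetical data): rotate atom 3 by 90 degrees about bond 1 -> 2.
chain = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0]])
turned = rotate_about_bond(chain, 1, 2, np.pi / 2, [3])
```

Because only the listed atoms move and the rotation axis passes through the bond, the length of the bond between atoms 2 and 3 is unchanged, which is the property the rigid model relies on.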
Flowchart: INPUT → PROGRAM → OUTPUT
INPUT
• Protein (Native)
• PCA Vector, Step Size
• Features
• Energy Cutoff
• Randomization, CCD, and Minimization Parameters
PROGRAM
• Initialize Protein and Features
• Move Features by PCA
• Use CCD to Satisfy Features
• Minimize Energy
• Check Energy
• Within cutoff: continue from the new conformation
• Outside cutoff: Rewind to Previous Conformation
OUTPUT
• Conformations
• Time Analysis
• Closure Satisfaction, Energies, RMSD
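The control flow of the program (move, satisfy constraints, minimize, check energy, rewind) can be sketched as a simple loop. Everything concrete here is a stand-in: `energy` and `satisfy_and_minimize` are caller-supplied callables standing in for the CHARMM energy and the CCD plus minimization stages, the toy versions below are invented for the example, and treating the cutoff as an absolute threshold on the energy value is an assumption.

```python
import numpy as np

def sample_along_mode(native, mode, step, n_steps, energy,
                      satisfy_and_minimize, cutoff=600.0, seed=0):
    """Step along `mode`, keeping conformations whose energy is within cutoff.

    energy(x) -> float and satisfy_and_minimize(current, target, rng) -> array
    are hypothetical placeholders for the real energy function and for the
    CCD + minimization stages.
    """
    rng = np.random.default_rng(seed)
    current = np.array(native, dtype=float)
    accepted = [current.copy()]
    rewinds = 0
    for _ in range(n_steps):
        target = current + step * mode               # move features by PCA
        candidate = satisfy_and_minimize(current, target, rng)
        if energy(candidate) <= cutoff:              # within cutoff: keep it
            current = candidate
            accepted.append(current.copy())
        else:                                        # outside cutoff: rewind
            rewinds += 1                             # `current` is unchanged
    return accepted, rewinds

# Toy stand-ins (hypothetical): quadratic energy around the native structure
# and a "minimizer" that returns the target plus a small random perturbation.
native = np.zeros((5, 3))
mode = np.zeros((5, 3))
mode[2] = [0.0, 1.0, 0.0]

def toy_energy(x):
    return 100.0 * float(np.sum((x - native) ** 2))

def toy_move(current, target, rng):
    return target + rng.normal(scale=0.05, size=target.shape)

accepted, rewinds = sample_along_mode(native, mode, 0.25, 30,
                                      toy_energy, toy_move)
```

Passing a random generator into the move step is one way to let a retry after a rewind differ from the failed attempt; the flowchart lists randomization parameters but does not say how they are used.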
Protein Flexibility
Applications
• Protein native state behavior
• Molecular interactions
• Drug design and discovery
Energy landscape
• Funnel-shaped → thermodynamically stable native structure
• Varying energetic constraints → non-symmetric for open- and close-flap conformations
• More conformations around the native
Cyclic Coordinate Descent (CCD)
• Iterative, heuristic approach to solving inverse kinematics
• Adjusts one dihedral at a time to move an atom to its constrained position
• Computationally fast and analytically simple
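To show the one-joint-at-a-time idea in its simplest form, the sketch below runs CCD on a planar chain of revolute joints rather than on protein dihedrals. This two-dimensional simplification, the function name `ccd_planar`, and the toy arm are assumptions for illustration; the protein case rotates about bond axes in three dimensions.

```python
import numpy as np

def ccd_planar(joints, target, tol=1e-3, max_sweeps=200):
    """Cyclic coordinate descent for a planar chain.

    joints: (n, 2) positions; joints[0] is the fixed base and joints[-1] is
    the end-effector. Returns the new positions and the remaining distance.
    """
    joints = np.array(joints, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(max_sweeps):
        # Visit joints from the one nearest the end-effector back to the base.
        for k in range(len(joints) - 2, -1, -1):
            to_end = joints[-1] - joints[k]
            to_target = target - joints[k]
            # Angle that turns the joint->end direction onto joint->target.
            angle = (np.arctan2(to_target[1], to_target[0])
                     - np.arctan2(to_end[1], to_end[0]))
            c, s = np.cos(angle), np.sin(angle)
            rot = np.array([[c, -s], [s, c]])
            # Rotate everything downstream of joint k about joint k.
            joints[k + 1:] = (joints[k + 1:] - joints[k]) @ rot.T + joints[k]
        if np.linalg.norm(joints[-1] - target) < tol:
            break
    return joints, float(np.linalg.norm(joints[-1] - target))

# Toy arm (hypothetical data): three unit links, initially stretched along x.
arm = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
solved, error = ccd_planar(arm, target=[1.5, 1.5])
link_lengths = np.linalg.norm(np.diff(solved, axis=0), axis=1)
```

Each update is a single closed-form rotation, which is why the method is cheap per step; as a heuristic it gives no general guarantee on how many sweeps a given target will need.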
Robotic Representation
• Atoms ≡ joints
• Bonds ≡ links
• Apply robotic techniques