bio molecular modeling 1
-
8/6/2019 Bio Molecular Modeling 1
Instructor: Prof. Jesús A. Izaguirre
Textbook: Tamar Schlick, Molecular Modeling and Simulation: An Interdisciplinary Guide, Springer-Verlag, Berlin-New York, 2002
Reference: C. Brooks, M. Karplus, B. Pettitt, Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics, Wiley, 1988
Computational Biology
Introduction to Biomolecular Modeling
-
Outline
1. What is biomolecular modeling?
2. Historical perspective
3. Theory and experiments
4. Protein characterization
5. Computational successes
6. Remaining challenges
-
What is biomolecular modeling?
Application of computational models to understand the structure, dynamics, and thermodynamics of biological molecules
The models must be tailored to the question at hand: the Schrödinger equation is not the answer to everything! A purely reductionist view is bound to fail!
This implies that biomolecular modeling must be both multidisciplinary and multiscale
-
Historical Perspective
1. 1946 MD calculation
2. 1960 force fields
3. 1969 Levinthal's paradox on protein folding
4. 1970 MD of biological molecules
5. 1971 Protein Data Bank
6. 1998 ion channel protein crystal structure
7. 1999 IBM announces Blue Gene project
-
Theoretical Foundations
1. Born-Oppenheimer approximation (fixed nuclei)
2. Force field parameters for families of chemical compounds
3. System modeled using Newton's equations of motion
4. Examples: hard-sphere simulations (Alder and Wainwright, 1959); liquid water (Rahman and Stillinger, 1970); BPTI (McCammon and Karplus); villin headpiece (Duan and Kollman, 1998)
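Illustration (not from the original slides): a minimal sketch of integrating Newton's equations of motion with the velocity Verlet scheme, which is widely used in MD codes. The test system, a single particle on a 1D harmonic "bond" in reduced units, and all names and parameter values (K_SPRING, MASS, DT, N_STEPS) are assumptions chosen only for illustration.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate mass * d2x/dt2 = force(x) with the velocity Verlet scheme."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # first half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory

# Hypothetical test system: F(x) = -k x, reduced units (assumed values).
K_SPRING = 1.0
MASS = 1.0
DT = 0.01
N_STEPS = 1000

def harmonic_force(x):
    return -K_SPRING * x

def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

traj = velocity_verlet(1.0, 0.0, harmonic_force, MASS, DT, N_STEPS)
e0 = total_energy(*traj[0])
max_energy_drift = max(abs(total_energy(x, v) - e0) for x, v in traj)

# Analytical solution for x(0) = 1, v(0) = 0 is x(t) = cos(omega * t).
x_exact = math.cos(math.sqrt(K_SPRING / MASS) * N_STEPS * DT)
x_final = traj[-1][0]
print(f"max energy drift: {max_energy_drift:.2e}")
print(f"final position: {x_final:.5f} (analytical {x_exact:.5f})")
```

In a real force field the scalar force above would be replaced by the gradient of the full potential (bonds, angles, torsions, electrostatics, van der Waals terms); the update pattern stays the same.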
-
Experimental Foundations I
1. X-ray crystallography
Analysis of the X-ray diffraction pattern produced when a beam of X-rays is directed onto a well-ordered crystal. The phase has to be reconstructed.
Phase problem solved by direct methods for small molecules
For larger molecules, the more sophisticated Multiple Isomorphous Replacement (MIR) technique is used
Current resolution below 2 Å
2. Protein crystallography
Difficult to grow well-ordered crystals
Early success in predicting alpha helices and beta sheets (Pauling, 1950s)
-
Experimental Foundations II
3. NMR Spectroscopy
Nuclear Magnetic Resonance provides structural and dynamic information about molecules. It is not as detailed as X-ray crystallography, and is limited to masses of up to about 35 kDa
Distances between neighboring hydrogens are used to reconstruct the 3D structure using global optimization
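Illustration (not from the original slides): a toy sketch of how pairwise distances determine 3D coordinates. Real NMR data give noisy, incomplete distance bounds, which is why optimization is needed in practice; the sketch below instead assumes exact, complete distances for four atoms and places them in closed form (trilateration). The function name embed_four_atoms and the test coordinates are hypothetical.

```python
import math

def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def embed_four_atoms(d):
    """Place 4 atoms in 3D from an exact 4x4 distance table d[i][j].

    Atom 0 goes to the origin, atom 1 on the x-axis, atom 2 in the
    xy-plane, and atom 3 above that plane (z >= 0). The result matches
    the true structure up to rotation, translation, and reflection.
    Assumes atoms 0, 1, 2 are not collinear.
    """
    p0 = (0.0, 0.0, 0.0)
    p1 = (d[0][1], 0.0, 0.0)
    x2 = (d[0][2] ** 2 - d[1][2] ** 2 + d[0][1] ** 2) / (2.0 * d[0][1])
    y2 = math.sqrt(max(d[0][2] ** 2 - x2 ** 2, 0.0))
    p2 = (x2, y2, 0.0)
    x3 = (d[0][3] ** 2 - d[1][3] ** 2 + d[0][1] ** 2) / (2.0 * d[0][1])
    y3 = (d[0][3] ** 2 - d[2][3] ** 2 + x2 ** 2 + y2 ** 2 - 2.0 * x2 * x3) / (2.0 * y2)
    z3 = math.sqrt(max(d[0][3] ** 2 - x3 ** 2 - y3 ** 2, 0.0))
    return [p0, p1, p2, (x3, y3, z3)]

# Hypothetical "true" coordinates, used only to generate exact distances.
true_coords = [(1.0, 2.0, 0.5), (2.2, 2.1, 0.4), (1.5, 3.0, 1.2), (1.8, 2.4, 2.0)]
d = [[dist(p, q) for q in true_coords] for p in true_coords]

model = embed_four_atoms(d)
max_error = max(
    abs(dist(model[i], model[j]) - d[i][j]) for i in range(4) for j in range(4)
)
print(f"largest distance mismatch: {max_error:.2e}")
```

The reconstructed model reproduces every input distance even though its coordinates differ from the originals: distances fix the shape, not the orientation.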
-
Proteins I
Polypeptide chains made up of amino acids or residues linked by peptide bonds
20 amino acids
50-500 residues, 1000-10000 atoms
Native structure believed to correspond to an energy minimum, since proteins unfold when temperature is increased
-
Protein Function in Cell
1. Enzymes: catalyze biological reactions
2. Structural role: cell wall, cell membrane, cytoplasm
-
Protein: The Machinery of Life
NH2-Val-His-Leu-Thr-Pro-Glu-Glu-
Lys-Ser-Ala-Val-Thr-Ala-Leu-Trp-
Gly-Lys-Val-Asn-Val-Asp-Glu-Val-
Gly-Gly-Glu-..
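Illustration (not from the original slides): the primary structure above can be handled as plain data. This small sketch converts the three-letter residue names into the standard one-letter codes. The N-terminal "NH2-" label and the trailing ".." are left out of the input, and the names THREE_TO_ONE and to_one_letter are hypothetical.

```python
# Standard three-letter to one-letter codes for the 20 amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(three_letter_sequence):
    """Convert a dash-separated three-letter sequence to one-letter codes."""
    residues = [r for r in three_letter_sequence.split("-") if r]
    return "".join(THREE_TO_ONE[r] for r in residues)

# The fragment shown on the slide, without the "NH2-" terminus label.
fragment = (
    "Val-His-Leu-Thr-Pro-Glu-Glu-"
    "Lys-Ser-Ala-Val-Thr-Ala-Leu-Trp-"
    "Gly-Lys-Val-Asn-Val-Asp-Glu-Val-"
    "Gly-Gly-Glu"
)
one_letter = to_one_letter(fragment)
print(one_letter, len(one_letter))
```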
-
Proteins II
Secondary structure: alpha helices, beta sheets, turns
Tertiary structure: proteins are tightly packed, with hydrophobic groups in the core and charged sidechains on the surface
Quaternary structure: protein domains may assemble into so-called quaternary structures
-
Protein Structure
-
Protein Structure
-
Model Molecule: Hemoglobin
-
Hemoglobin: Background
Protein in red blood cells
-
Red Blood Cell (Erythrocyte)
-
Hemoglobin: Background
Protein in red blood cells
Composed of four subunits, each containing a heme group: a ring-like structure with a central iron atom that binds oxygen
-
Heme Groups in Hemoglobin
-
Hemoglobin: Background
Protein in red blood cells
Composed of four subunits, each containing a heme group: a ring-like structure with a central iron atom that binds oxygen
Picks up oxygen in lungs, releases it in peripheral tissues (e.g. muscles)
-
Hemoglobin Quaternary Structure
Two alpha subunits and two beta subunits (141 amino acids per alpha, 146 amino acids per beta)
-
Hemoglobin Tertiary Structure
One beta subunit (8 alpha helices)
-
Hemoglobin Secondary Structure
alpha helix
-
Proteins III
Protein motions of importance are torsional oscillations about the bonds that link groups together
Substantial displacements of groups occur over long time intervals
Collective motions are either local (cage structure) or rigid-body (displacement of different regions)
What is the importance of these fluctuations for biological function?
-
Proteins IV
Effect of fluctuations:
Thermodynamics: equilibrium behavior is important; example: energy of ligand binding
Dynamics: displacements from the average structure are important; example: local sidechain motions that act as conformational gates in oxygen transport (myoglobin), enzymes, and ion channels
-
Proteins V: Local Motions
0.01-5 Å, 1 fs to 0.1 s
Atomic fluctuations: small displacements for substrate binding in enzymes; energy source for barrier crossing and other activated processes (e.g., ring flips)
Sidechain motions: opening pathways for ligand (myoglobin); closing active site
Loop motions: disorder-to-order transition as part of virus formation
-
Proteins VI: Rigid-Body Motions
1-10 Å, 1 ns to 1 s
Helix motions: transitions between substates (myoglobin)
Hinge-bending motions: gating of active-site region (liver alcohol dehydrogenase); increasing binding range of antigens (antibodies)
-
Proteins VII: Large Scale Motion
> 5 Å, 1 microsecond to 10000 s
Helix-coil transition: activation of hormones; protein folding transition
Dissociation: formation of viruses
Folding and unfolding transition: synthesis and degradation of proteins
Role of motions sometimes only inferred from two or more conformations in structural studies
-
Study of Dynamics I
The computational study of atomic fluctuations in BPTI and other proteins has shown that:
The directional character of active-site fluctuations in enzymes contributes to catalysis
Small-amplitude fluctuations act as a lubricant
It may be possible to extrapolate from short-time fluctuations to larger-scale protein motions
-
Study of Dynamics II
Collective motions are particularly important for biological function, e.g., displacements for the transition from the inactive to the active state
The extended nature of these motions makes them sensitive to the environment: great difference between vacuum and solution simulations
Collective motions transmit external solvent effects to the protein interior
-
Study of Dynamics III
For the related storage protein, myoglobin:
Fluctuations in the globin are essential to binding: the protein matrix in the X-ray structure is so tightly packed that there is no low-energy path for the ligand to enter or leave the heme pocket
Only through structural fluctuations can the barriers be lowered sufficiently
Demonstrated through energy minimization and molecular dynamics
-
Study of Dynamics IV
For the transport protein hemoglobin there are several important motions:
Oxygen binding produces a tertiary structural change
A quaternary structural change from the deoxy (low oxygen affinity) to the oxy configuration takes place. This transmits information over a long distance
From the X-ray deoxy and oxy structures, a stochastic reaction path has been found. Detailed ligand binding simulations have been performed using MD. A statistical mechanical model has provided the coupling between these two processes
-
Study of Dynamics VI
Three open problems are the following:
1. Ion channel gating: highly correlated fluctuations are likely to be of great importance. Long-time dynamics problem
2. Flexible docking: for MMP, enzymes, etc., fluctuations enter into the thermodynamics and kinetics of reactions. Sampling problem
3. Protein folding: too complicated for full treatment except for the smallest proteins; beyond current methodology. Coarsening problem
-
Possible topics for final projects
Applications: virtual screening; extend recommender for MD protocols
Algorithms: multiscale integrators or sampling methods; cellular automata solvers for diffusion, reaction, advection, etc.
Software: 3D visualization; extend simulation engines
-
How to create hierarchical, multiscale, multilevel algorithms?
Examples:
Algorithms for the N-body problem (linear complexity, multiple grids), e.g., Matthey and Izaguirre (2004), J. Par. Dist. Comp.
Multiscale integration (15 orders of magnitude gap in timescales), e.g., Ma and Izaguirre (2003), Multisc. Model. Simul.
Coarse approximations (use averaging, stochastic, or ensemble solutions), e.g., Izaguirre and Hampton (2004), J. Comp. Phys.
-
Lengthening scales: DPD
Dissipative Particle Dynamics combines coarsening of atoms into fluid packages with dissipative pair interactions and a stochastic pair interaction
Total momentum conserved
Self-organization of lipid bilayer, self-assembled aggregates formed by amphiphilic lipid molecules in water.
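Illustration (not from the original slides): a minimal sketch of why DPD conserves total momentum. Every conservative, dissipative, and stochastic contribution acts along the line joining a pair of beads and is applied with opposite signs to the two partners, so the forces sum to zero. The force expressions follow the commonly used Groot-Warren form; the parameter values (a, gamma, kT, rc, dt), the function name dpd_pair_forces, and the random bead configuration are assumptions chosen for illustration, and periodic boundaries are omitted.

```python
import math
import random

def dpd_pair_forces(pos, vel, a=25.0, gamma=4.5, kT=1.0, rc=1.0, dt=0.01, rng=random):
    """Return per-bead DPD forces (conservative + dissipative + random)."""
    n = len(pos)
    sigma = math.sqrt(2.0 * gamma * kT)  # fluctuation-dissipation relation
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rij = [pos[i][k] - pos[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in rij))
            if r >= rc or r == 0.0:
                continue  # beads interact only within the cutoff radius
            e = [c / r for c in rij]                   # unit vector from j to i
            vij = [vel[i][k] - vel[j][k] for k in range(3)]
            w = 1.0 - r / rc                           # weight function
            f_cons = a * w                             # soft repulsion
            f_diss = -gamma * w * w * sum(e[k] * vij[k] for k in range(3))
            f_rand = sigma * w * rng.gauss(0.0, 1.0) / math.sqrt(dt)
            f = f_cons + f_diss + f_rand
            for k in range(3):
                forces[i][k] += f * e[k]   # force on bead i
                forces[j][k] -= f * e[k]   # equal and opposite force on bead j
    return forces

rng = random.Random(0)  # seeded so the example is reproducible
positions = [[rng.uniform(0.0, 2.0) for _ in range(3)] for _ in range(20)]
velocities = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]

forces = dpd_pair_forces(positions, velocities, rng=rng)
net_force = [sum(f[k] for f in forces) for k in range(3)]
print("net force on the system:", net_force)
```

The net force vanishes up to floating-point rounding, so any integrator that applies these forces symmetrically leaves the total momentum unchanged.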
-
Lengthening Scales: SRP
Enzyme simulation of a millisecond using a stochastic reaction path
Disadvantage: need initial and final configuration
Finds a trajectory where global energy is minimized
-
How to predict protein interaction networks?
Goal:
Predict proteins in a genome that are likely to interact, thus giving a clue as to their function.
Our current solution starts from experimental interaction data and uses clustering and a set cover approach to predict novel interactions.
This is documented in Huang et al. (2004), IEEE/ACM TCBB, submitted
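Illustration (not from the original slides): a generic greedy set cover sketch, shown only to make the term concrete. It is not the method of Huang et al., whose details are not given here. The scenario is an assumption: each hypothetical candidate (labelled "A-B", "C-D", "A-D") could explain some of the observed protein-protein interactions, and the greedy rule repeatedly picks the candidate that explains the most interactions not yet covered.

```python
def greedy_set_cover(universe, candidates):
    """Pick candidate sets greedily until every element of universe is covered.

    candidates maps a label to the set of elements it covers. The greedy
    rule gives an approximate, not necessarily minimum, cover.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("remaining elements cannot be covered")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical observed interactions between proteins P1..P5.
interactions = {
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P5"), ("P3", "P5"),
}

# Hypothetical candidate explanations and the interactions each one covers.
candidates = {
    "A-B": {("P1", "P2"), ("P1", "P3"), ("P1", "P4")},
    "C-D": {("P2", "P5"), ("P3", "P5")},
    "A-D": {("P1", "P2"), ("P2", "P5")},
}

chosen = greedy_set_cover(interactions, candidates)
print(chosen)
```

Once a small set of explanations is chosen, protein pairs that fit those explanations but have no experimental record become candidates for novel interactions.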
-
How to create high-performance software that is easy to use?
Goals:
Encapsulate optimizations like parallelism and cluster/grid computing so that these can be used easily. MATLAB and Mathematica are examples of easy-to-use scientific software
Allow easy prototyping of algorithms and extensions of the software by computational scientists (not expert computer scientists)
Our current solutions use:
Generic and object-oriented programming
Design patterns
XML-based domain-specific languages
Related publications:
Matthey et al. (2004) ACM Trans. Math. Software, 30(3)
Cickovski et al. (2004) IEEE/ACM Trans. Comput. Biol. and Bioinformatics
Cickovski and Izaguirre (2004) ACM Trans. Prog. Lang. and Systems, in preparation
ProtoMol, CompuCell3D, Biologo
ProtoMol is open source and available at http://protomol.sourceforge.net
-
How to help users select software, algorithms, and parameters to solve their problems?
Diagram: simulation requirements; ProtoMol/MDSimAid server; optimal parameters via XML
Our solution uses performance models and machine learning to generate rules, and run-time optimization to fine-tune suggestions. We want to use agents and machine learning to update the rules.
This is documented in Ko (2002) and Crocker et al. (2004), J. Comp. Chem.
Goal:
Recommend optimal software and architectural parameters to solve particular problems
Make this easily available as web portals