bio molecular modeling 1

Upload: ashvin-sorani

Post on 07-Apr-2018


  • 8/6/2019 Bio Molecular Modeling 1


    Instructor: Prof. Jesús A. Izaguirre

    Textbook: Tamar Schlick, Molecular Modeling and Simulation: An Interdisciplinary Guide, Springer-Verlag, Berlin-New York, 2002

    Reference: C. Brooks, M. Karplus, B. Pettitt, Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics, Wiley, 1988

    Computational Biology

    Introduction to Biomolecular Modeling


    Outline

    1. What is biomolecular modeling?
    2. Historical perspective
    3. Theory and experiments
    4. Protein characterization
    5. Computational successes
    6. Remaining challenges


    What is biomolecular modeling?

    Application of computational models to understand the structure, dynamics, and thermodynamics of biological molecules

    The models must be tailored to the question at hand: the Schrödinger equation is not the answer to everything! A reductionist view is bound to fail!

    This implies that biomolecular modeling must be both multidisciplinary and multiscale


    Historical Perspective

    1. 1946 MM (molecular mechanics) calculation

    2. 1960 force fields

    3. 1969 Levinthal's paradox on protein folding
    4. 1970 MD of biological molecules
    5. 1971 Protein Data Bank
    6. 1998 ion channel protein crystal structure
    7. 1999 IBM announces Blue Gene project


    Theoretical Foundations

    1. Born-Oppenheimer approximation (fixed nuclei)
    2. Force field parameters for families of chemical compounds
    3. System modeled using Newton's equations of motion
    4. Examples: hard spheres simulations (Alder and Wainwright, 1959); liquid water (Rahman and Stillinger, 1970); BPTI (McCammon and Karplus); villin headpiece (Duan and Kollman, 1998)
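
    As a minimal sketch of what modeling a system with Newton's equations of motion can look like in practice, the block below integrates a single particle with the velocity Verlet scheme. The harmonic force, the parameter values, and all function names are illustrative assumptions, not taken from the slides; a real force field would supply the force instead.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force from the assumed potential U = 0.5 * k * (x - x0)**2."""
    return -k * (x - x0)


def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Integrate m * d2x/dt2 = F(x) for one particle in one dimension."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy for the harmonic example."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Start at rest, displaced by one length unit, and integrate to t = 10.
traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy drift over the run: {abs(e_end - e_start):.2e}")
```

    The total energy should stay nearly constant over the run, which is one reason schemes of this family are popular for molecular dynamics.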


    Experimental Foundations I

    1. X-ray crystallography
        Analysis of the X-ray diffraction pattern produced when a beam of X-rays is directed onto a well-ordered crystal. The phase has to be reconstructed.
        Phase problem solved by direct methods for small molecules
        For larger molecules, the sophisticated Multiple Isomorphous Replacement (MIR) technique is used
        Current resolution below 2 Å
    2. Protein crystallography
        Difficult to grow well-ordered crystals
        Early success in predicting alpha helices and beta sheets (Pauling, 1950s)


    Experimental Foundations II

    3. NMR spectroscopy
        Nuclear Magnetic Resonance provides structural and dynamic information about molecules. It is not as detailed as X-ray and is limited to masses up to 35 kDa
        Distances between neighboring hydrogens are used to reconstruct the 3D structure using global optimization


    Proteins I

    Polypeptide chains made up of amino acids or residues linked by peptide bonds
    20 amino acids
    50-500 residues, 1000-10000 atoms
    Native structure believed to correspond to energy minimum, since proteins unfold when temperature is increased
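
    The link between temperature and unfolding can be illustrated with a two-state (folded/unfolded) model, in which the folded population follows from the free energy of unfolding, dG = dH - T*dS. This is a hedged sketch, not a model taken from the slides: the two-state assumption, the enthalpy and entropy values, and the function name are all made up for illustration. Note that in this picture it is the free energy, rather than the energy alone, that decides which state dominates.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_folded(temp_k, dh_unfold=300.0, ds_unfold=0.9):
    """Folded population in an assumed two-state model.

    dh_unfold (kJ/mol) and ds_unfold (kJ/(mol*K)) are illustrative
    values for the unfolding transition, not measured data.
    """
    dg_unfold = dh_unfold - temp_k * ds_unfold  # free energy of unfolding
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R * temp_k)))


melting_temp = 300.0 / 0.9  # temperature at which dg_unfold is zero
for temp in (280.0, 320.0, melting_temp, 350.0):
    print(f"T = {temp:6.1f} K  fraction folded = {fraction_folded(temp):.3f}")
```

    With these assumed numbers the protein is almost fully folded well below the midpoint temperature and almost fully unfolded above it.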


    Protein Function in Cell

    1. Enzymes
        Catalyze biological reactions
    2. Structural role
        Cell wall
        Cell membrane
        Cytoplasm


    Protein: The Machinery of Life

    NH2-Val-His-Leu-Thr-Pro-Glu-Glu-

    Lys-Ser-Ala-Val-Thr-Ala-Leu-Trp-

    Gly-Lys-Val-Asn-Val-Asp-Glu-Val-

    Gly-Gly-Glu-..
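
    Sequences like the fragment above are usually stored in software as one-letter codes. The following sketch converts the visible part of the fragment using the standard three-letter to one-letter amino acid mapping; the function and variable names are illustrative, and the elided remainder of the sequence is left out.

```python
# Standard three-letter to one-letter codes for the 20 amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated three-letter sequence to one-letter codes."""
    residues = three_letter_sequence.split("-")
    return "".join(THREE_TO_ONE[residue] for residue in residues)


# Only the residues shown on the slide; the trailing ".." is not filled in.
fragment = (
    "Val-His-Leu-Thr-Pro-Glu-Glu-"
    "Lys-Ser-Ala-Val-Thr-Ala-Leu-Trp-"
    "Gly-Lys-Val-Asn-Val-Asp-Glu-Val-"
    "Gly-Gly-Glu"
)
one_letter = to_one_letter(fragment)
print(one_letter, len(one_letter))
```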


    Proteins II

    Secondary structure: alpha helices, beta sheets, turns
    Tertiary structure: proteins are tightly packed, with hydrophobic groups in the core and charged sidechains on the surface
    Quaternary structure: protein domains may assemble into so-called quaternary structures


    Protein Structure


    Protein Structure


    Model Molecule: Hemoglobin


    Hemoglobin: Background

    Protein in red blood cells


    Red Blood Cell (Erythrocyte)


    Hemoglobin: Background

    Protein in red blood cells
    Composed of four subunits, each containing a heme group: a ring-like structure with a central iron atom that binds oxygen


    Heme Groups in Hemoglobin


    Hemoglobin: Background

    Protein in red blood cells
    Composed of four subunits, each containing a heme group: a ring-like structure with a central iron atom that binds oxygen
    Picks up oxygen in lungs, releases it in peripheral tissues (e.g. muscles)


    Hemoglobin Quaternary Structure

    Two alpha subunits and two beta subunits

    (141 AA per alpha, 146 AA per beta)


    Hemoglobin Tertiary Structure

    One beta subunit (8 alpha helices)


    Hemoglobin Secondary Structure

    alpha helix


    Proteins III

    Protein motions of importance are torsional oscillations about the bonds that link groups together
    Substantial displacements of groups occur over long time intervals
    Collective motions are either local (cage structure) or rigid-body (displacement of different regions)
    What is the importance of these fluctuations for biological function?


    Proteins IV

    Effect of fluctuations:
        Thermodynamics: equilibrium behavior important; example, energy of ligand binding
        Dynamics: displacements from average structure important; example, local sidechain motions that act as conformational gates in oxygen transport (myoglobin), enzymes, and ion channels


    Proteins V: Local Motions

    0.01-5 Å, 1 fs - 0.1 s
    Atomic fluctuations
        Small displacements for substrate binding in enzymes
        Energy source for barrier crossing and other activated processes (e.g., ring flips)
    Sidechain motions
        Opening pathways for ligand (myoglobin)
        Closing active site
    Loop motions
        Disorder-to-order transition as part of virus formation


    Proteins VI: Rigid-Body Motions

    1-10 Å, 1 ns - 1 s
    Helix motions
        Transitions between substates (myoglobin)
    Hinge-bending motions
        Gating of active-site region (liver alcohol dehydrogenase)
        Increasing binding range of antigens (antibodies)


    Proteins VII: Large Scale Motion

    > 5 Å, 1 microsecond - 10000 s
    Helix-coil transition
        Activation of hormones
        Protein folding transition
    Dissociation
        Formation of viruses
    Folding and unfolding transition
        Synthesis and degradation of proteins
    Role of motions sometimes only inferred from two or more conformations in structural studies
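
    To get a feel for what these time ranges mean for simulation, the following sketch counts how many integration steps would be needed to reach the upper end of each range. The 1 fs step size is an assumption (the slides do not state one), and the names are illustrative.

```python
TIMESTEP_S = 1e-15  # assumed integration step of 1 fs (not stated in the slides)


def steps_needed(duration_s, timestep_s=TIMESTEP_S):
    """Number of integration steps needed to cover duration_s seconds."""
    return round(duration_s / timestep_s)


# Upper ends of the time ranges quoted for each class of motion.
upper_bounds_s = {
    "local motions": 0.1,
    "rigid-body motions": 1.0,
    "large-scale motions": 1.0e4,
}

for name, duration in upper_bounds_s.items():
    steps = steps_needed(duration)
    print(f"{name:20s} {duration:8.1e} s -> about {steps:.0e} steps")
```

    Under this assumption even a nanosecond already costs a million steps, which is why the slower motions are hard to reach by direct simulation.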


    Study of Dynamics I

    The computational study of atomic fluctuations in BPTI and other proteins has shown that:
        Directional character of active-site fluctuations in enzymes contributes to catalysis
        Small-amplitude fluctuations act as a lubricant
        It may be possible to extrapolate from short-time fluctuations to larger-scale protein motions


    Study of Dynamics II

    Collective motions particularly important for biological function, e.g., displacements for transition from inactive to active
    Extended nature of these motions makes them sensitive to environment: great difference between vacuum and solution simulations
    Collective motions transmit external solvent effects to protein interior


    Study of Dynamics III

    For the related storage protein, myoglobin:
        Fluctuations in the globin are essential to binding: the protein matrix in the X-ray structure is so tightly packed that there is no low energy path for the ligand to enter or leave the heme pocket
        Only through structural fluctuations can the barriers be lowered sufficiently
        Demonstrated through energy minimization and molecular dynamics


    Study of Dynamics IV

    For the transport protein hemoglobin there are several important motions:
        Oxygen binding produces tertiary structural change
        A quaternary structural change from deoxy (low oxygen affinity) to oxy configuration takes place. This transmits information over a long distance
        From the X-ray deoxy and oxy structures, a stochastic reaction path has been found. Detailed ligand binding simulations have been performed using MD. A statistical mechanical model has provided coupling between these two processes


    Study of Dynamics VI

    Three open problems are the following:

    1. Ion channel gating: highly correlated fluctuations are likely to be of great importance. Long time dynamics problem
    2. Flexible docking: for MMP, enzymes, etc., fluctuations enter into thermodynamics and kinetics of reactions. Sampling problem
    3. Protein folding: too complicated for full treatment except for the smallest proteins; beyond current methodology. Coarsening problem


    Possible topics for final projects

    Applications
        Virtual screening
        Extend recommender for MD protocols
    Algorithms
        Multiscale integrators or sampling methods
        Cellular automata solvers for diffusion, reaction, advection, etc.
    Software
        3D visualization
        Extend simulation engines


    How to create hierarchical, multiscale, multilevel algorithms?

    Examples:
        Algorithms for N-body problem (linear complexity, multiple grids), e.g., Matthey and Izaguirre (2004), J. Par. Dist. Comp.
        Multiscale integration (15 orders of magnitude gap on timescales), e.g., Ma and Izaguirre (2003), Multisc. Model. Simul.
        Coarse approximations (use averaging or stochastic or ensemble solutions), e.g., Izaguirre and Hampton (2004), J. Comp. Phys.


    Lengthening scales: DPD

    Dissipative Particle Dynamics combines coarsening of atoms into fluid packages with dissipative pair interactions and a stochastic pair interaction
    Total momentum conserved

    Self-organization of lipid bilayer, self-assembled aggregates formed by amphiphilic lipid molecules in water.
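
    The momentum-conservation property of DPD follows from the fact that every force acts along the line joining a pair and is applied with opposite sign to the two particles. The sketch below shows one pair interaction in a commonly used textbook form (soft repulsion, friction on the relative velocity, and a random kick). The weight function, the parameter values (a, gamma, kT, r_cut, dt), and the function name are assumptions for illustration and are not taken from the slides.

```python
import math
import random


def dpd_pair_forces(r_i, r_j, v_i, v_j, a=25.0, gamma=4.5, kT=1.0,
                    r_cut=1.0, dt=0.01, rng=random):
    """Return (force on i, force on j) for one DPD pair in 3D."""
    rij = [ri - rj for ri, rj in zip(r_i, r_j)]
    dist = math.sqrt(sum(c * c for c in rij))
    if dist >= r_cut or dist == 0.0:
        return [0.0] * 3, [0.0] * 3          # no interaction beyond the cutoff
    unit = [c / dist for c in rij]
    vij = [vi - vj for vi, vj in zip(v_i, v_j)]
    w = 1.0 - dist / r_cut                   # assumed linear weight function
    sigma = math.sqrt(2.0 * gamma * kT)      # fluctuation-dissipation relation
    f_cons = a * w                           # soft conservative repulsion
    f_diss = -gamma * w * w * sum(u * c for u, c in zip(unit, vij))
    f_rand = sigma * w * rng.gauss(0.0, 1.0) / math.sqrt(dt)
    magnitude = f_cons + f_diss + f_rand
    f_i = [magnitude * u for u in unit]
    f_j = [-c for c in f_i]                  # equal and opposite force
    return f_i, f_j


rng = random.Random(0)  # fixed seed so the example is repeatable
f_i, f_j = dpd_pair_forces([0.0, 0.0, 0.0], [0.5, 0.0, 0.0],
                           [0.1, 0.0, 0.0], [-0.1, 0.0, 0.0], rng=rng)
net = [x + y for x, y in zip(f_i, f_j)]
print("net force on the pair:", net)
```

    Because the net force on each pair is zero, summing over all pairs leaves the total momentum unchanged, including the random contribution.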


    Lengthening Scales: SRP

    Enzyme simulation of a ms using stochastic reaction path
    Disadvantage: need initial and final configuration
    Finds a trajectory where global energy is minimized


    How to predict protein interaction networks?

    Goal:
        Predict proteins in a genome that are likely to interact, thus giving a clue as to their function.

    Our current solution starts from experimental interaction data and uses clustering and a set cover approach to predict novel interactions.

    This is documented in Huang et al. (2004), IEEE/ACM TCBB, submitted
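
    For readers unfamiliar with set cover, the following is a generic greedy heuristic applied to toy data. It is not the method of Huang et al., whose details are not given in the slides; the protein identifiers, the clusters, and the function name are all hypothetical.

```python
def greedy_set_cover(universe, subsets):
    """Pick subsets greedily until every element of universe is covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Take the subset that covers the most still-uncovered elements.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical proteins and hypothetical clusters of proteins.
proteins = {"P1", "P2", "P3", "P4", "P5", "P6"}
clusters = {
    "c1": {"P1", "P2", "P3"},
    "c2": {"P3", "P4"},
    "c3": {"P4", "P5", "P6"},
    "c4": {"P1", "P6"},
}
cover = greedy_set_cover(proteins, clusters)
print("clusters chosen:", cover)
```

    The greedy rule does not always find the smallest cover, but it is simple and comes with a known approximation guarantee.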


    How to create high-performance software that is easy to use?

    Goals:
        Encapsulate optimizations like parallelism and cluster/grid computing so that these can be used easily. MATLAB and Mathematica are examples of easy-to-use scientific software
        Allow easy prototyping of algorithms and extensions of the software by computational scientists (not expert computer scientists)

    Our current solutions use:

    Generic and object-oriented programming

    Design patterns

    XML-based domain specific languages

    Related publications:

    Matthey et al. (2004) ACM Trans. Math. Software, 30(3)

    Cickovski et al. (2004) IEEE/ACM Trans. Comput. Biol. and Bioinformatics
    Cickovski and Izaguirre (2004) ACM Trans. Prog. Lang. and Systems, in preparation

    ProtoMol, CompuCell3D, Biologo

    ProtoMol is open source and available at http://protomol.sourceforge.net


    How to help users select software, algorithms, and parameters to solve their problems?

    [Diagram labels: Simulation Requirements; ProtoMol/MDSimAid Server; Optimal parameters via XML]

    Our solution uses performance models and machine learning to generate rules, and run-time optimization to fine-tune suggestions.
    We want to use agents and machine learning to update the rules.
    This is documented in Ko (2002) and Crocker et al. (2004), J. Comp. Chem.

    Goal:
        Recommend optimal software and architectural parameters to solve particular problems
        Make this easily available as web portals