TRANSCRIPT
Topics to be Covered
Introduction to Protein Folding
Mechanism of folding and misfolding
GroEL – biological machine (chaperones folding)
Molecular motors: Polymer physics and Myosin V motility
Many Facets of Folding
1. Structure Prediction
2. Protein & Enzyme Design
3. Folding Kinetics & Mechanisms
4. Crowding & confinement Effects
5. Relation to aggregation
6. Molecular Chaperones
7. Unfolded protein response (UPR)
Folding and clearance mechanisms are at the center stage
A Big Protein Folding Problem
Read the Genetic Code; Transcription; Produce Proteins, Function, Degradation
Length ≈ 220 nm ≈ 700 waters
Size ≈ 22 nm
A very large protein in water: a complex problem indeed! (about 100,000 waters)
Pictures, Models, Approximations & Reality
A Bit of Philosophy
Rich History in Condensed Matter Physics & Soft Matter (Analytic Theory)
• Ising model for magnetic systems (Ni/also biology; 1920)
• Spin glasses – Edwards-Anderson model (CuMn alloy; 1975)
• Polymer statistics (Flory; 1948)
• Liquid Crystals (TMV) (Onsager 1949)
• BCS Theory (1956…)
Folding Kinetics
Experiments
o Protein Engineering (TSE)
o SAXS/NMR (DSE)
o Fast Folding (T jump; P jump; Rapid Mixing)
o SM FRET (Folding/unfolding)
o LOT/AFM (Force Ramp, Force Quench)
Theory
o Statistical Mechanics (Energy Landscape)
o Minimal Models (Lattice/Off-Lattice)
o MD Simulations
o Bioinformatics (Evolutionary Imprint)
Outline
How far can we go using polymer physics? (no force)
Toy models and generic lessons
Finite size effects: Universal relations
Bringing “specificity” back: Phenomenological Models
Many facets of Protein Folding
How does a chain (a necklace with differently shaped pearls) fold up, and how fast?
Can things go wrong, and then what?
As structure gets organized, energy gets lowered
Minimum Free Energy (water, ions, cosolvents)
Anfinsen over 50 years ago; Nobel Prize 1972
Computational approaches to Biological problems: 2013 Nobel Chemistry
RNA and some Proteins
[Figure: landscape sketch; axis labels F and S]
$\Delta F_{i,\mathrm{NBA}} / \Delta F_{ij} \gg 1$
I: Gradient to NBA dominates: most likely event under folding conditions.
All other transitions less likely.
Page 881 of Textbook Chapter 18
Approximation to Reality!
Another Nobel Protein! (GFP)
Not all molecules take the same route: folding is stochastic!
At least 4 classes of folding trajectories (Reddy)
Complicated Energy Function
Thermodynamics of src-SH3 folding
Green = Urea
Red = MTM predictions
Black = Experiments (Baker)
$\Delta G_{NU}[C] = \Delta G_{NU}[0] + m[C]$
m = (1.3 - 1.5) kcal/(mol·M)
Exp. m = 1.5 kcal/(mol·M)
Excellent Agreement!
Z. Liu, G. Reddy, E. O’Brien and dt PNAS 2011
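A minimal sketch of how the linear relation ΔG_NU[C] = ΔG_NU[0] + m[C] turns into a denaturation midpoint and a native-state population. It assumes a two-state picture with ΔG_NU = G_N − G_U. The stability in water (`DG0`) is a hypothetical value chosen only for illustration; only the m-value comes from the slide.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def delta_g_nu(conc, dg0, m):
    """Linear extrapolation: dG_NU[C] = dG_NU[0] + m*[C], in kcal/mol."""
    return dg0 + m * conc


def midpoint(dg0, m):
    """Denaturant concentration C_m at which dG_NU[C_m] = 0."""
    return -dg0 / m


def fraction_native(conc, dg0, m, temp=298.0):
    """Two-state native population, assuming dG_NU = G_N - G_U."""
    return 1.0 / (1.0 + math.exp(delta_g_nu(conc, dg0, m) / (R_KCAL * temp)))


DG0 = -4.0  # hypothetical stability in water, kcal/mol (not from the slide)
M = 1.5     # experimental m-value quoted on the slide, kcal/(mol M)

c_m = midpoint(DG0, M)
print(f"C_m = {c_m:.2f} M, f_N(0 M) = {fraction_native(0.0, DG0, M):.3f}")
```

With these assumed numbers the midpoint is simply −ΔG_NU[0]/m, so a larger m-value shifts unfolding to lower denaturant concentration.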
Characteristic Temperatures in Proteins
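High T or [C]: Random Coil (Flory), $R_g \approx a_D N^{0.6}$
$T \approx T_\theta$ or [C] $\approx [C_\theta]$: Compact
$T \approx T_F$ or [C] $\approx [C_F]$: Native State, $R_g \approx a_N N^{0.33}$
Foldable: $\sigma = (T_\theta - T_F)/T_\theta$ small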
Estimating Protein Size as a Function of N
High denaturant concentration (GdmCl or Urea)
Good solvent for polypeptide chain, maybe!
Flory Theory: $F(R_g) \approx R_g^2/(N a^2) + v N^2/R_g^d$
(see de Gennes book)
$R_g \approx a N^{\nu}$, $\nu = 3/(d + 2)$
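As a rough numerical check on the Flory argument, the sketch below minimizes a Flory-type free energy, elastic term R²/(N a²) plus excluded-volume term v N²/R^d, and reads off the exponent from two chain lengths. The function names, the units a = v = 1, and the chain lengths are illustrative assumptions.

```python
import math


def flory_free_energy(r, n, d=3, a=1.0, v=1.0):
    """Elastic stretching term plus excluded-volume repulsion."""
    return r**2 / (n * a**2) + v * n**2 / r**d


def optimal_rg(n, d=3, a=1.0, v=1.0):
    """Minimize F over R_g by golden-section search on log(R_g)."""
    lo, hi = math.log(1e-3), math.log(1e6)
    phi = (math.sqrt(5.0) - 1.0) / 2.0

    def f(x):
        return flory_free_energy(math.exp(x), n, d, a, v)

    for _ in range(200):
        x1 = hi - phi * (hi - lo)
        x2 = lo + phi * (hi - lo)
        if f(x1) < f(x2):
            hi = x2  # minimum lies in [lo, x2]
        else:
            lo = x1  # minimum lies in [x1, hi]
    return math.exp(0.5 * (lo + hi))


def flory_exponent(d=3, n1=100, n2=10000):
    """Slope of log(R_g) against log(N) between two chain lengths."""
    return math.log(optimal_rg(n2, d) / optimal_rg(n1, d)) / math.log(n2 / n1)


for dim in (1, 2, 3):
    print(dim, round(flory_exponent(dim), 4), 3.0 / (dim + 2))
```

The fitted slope should agree with ν = 3/(d + 2), which gives the familiar 0.6 in three dimensions.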
Folded States Globular Proteins
• Maximally compact
• Largely Spherical
• $R_g \approx a_N N^{1/3}$
• So the sizes of proteins follow polymer laws: surprising!
Protein Collapse: Rg follows Flory law
“Unfolded”: $R_g^U = 2N^{0.6}$ (Kohn PNAS (05))
Folded: $R_g = 3N^{1/3}$ (Dima & dt JPCB (04))
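To get a feel for the two scaling laws, the short sketch below evaluates both for a few chain lengths. It assumes the prefactors 2 and 3 are in Å, which the slide does not state, and the chain lengths are arbitrary examples.

```python
def rg_unfolded(n, prefactor=2.0, nu=0.6):
    """Flory-like scaling for the denatured ensemble: Rg = 2 N^0.6."""
    return prefactor * n**nu


def rg_folded(n, prefactor=3.0, nu=1.0 / 3.0):
    """Compact-globule scaling for the native state: Rg = 3 N^(1/3)."""
    return prefactor * n**nu


for n in (50, 100, 200):
    print(n, round(rg_unfolded(n), 1), round(rg_folded(n), 1))
```

For N = 100 the unfolded estimate is a little more than twice the folded one, and the gap widens with N because the exponents differ.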
Tetrahymena ribozyme (... difficult)
RNA Folding: Tetrahymena ribozyme
RNA: Branched polymer
Ion valence, size, shape
Rg scaling works for RNA too, including the ribosome!
Size Dependence of RNA
$R_g \approx 5.5 N^{1/3}$
Fairly decent (due to Hyeon)
Exponent may be larger... analogy to branched polymers
Ben-Shaul, Gelbart, Knobler
Illustrating Key ideas using Lattice models
Seems like an Absurd Idea! Role of non-native attractions
Multiple Folding Nuclei
Fast and slow tracks
K. A. Dill Protein Science (1995)
Blues like each other. They gain one unit of energy.
Toy model: explains protein folding
Even simpler: folded lower in energy by one unit
Multiple paths!
A simple minded approach
4 types of monomers (H, P, +, -)
Monomer has 8 beads
# of sequences = $4^8$
(amylome)
# of conformations on cubic lattice = 1,841
http://dillgroup.org/#/code
HPSandbox
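The counting behind this toy model can be sketched by brute force. The code below is not HPSandbox; it is an independent, minimal enumeration written for illustration. It counts the 4^8 sequences and all self-avoiding placements of 8 beads on a cubic lattice with the first bead fixed at the origin. It does not remove rotations, reflections, or chain reversal, so its raw count is larger than the 1,841 distinct conformations quoted on the slide.

```python
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def count_walks(n_beads):
    """Count self-avoiding chains of n_beads on the cubic lattice.

    The first bead is pinned at the origin; symmetry-related copies
    are all counted separately.
    """
    def extend(path, occupied):
        if len(path) == n_beads:
            return 1
        x, y, z = path[-1]
        total = 0
        for dx, dy, dz in STEPS:
            nxt = (x + dx, y + dy, z + dz)
            if nxt not in occupied:
                occupied.add(nxt)
                path.append(nxt)
                total += extend(path, occupied)
                path.pop()
                occupied.remove(nxt)
        return total

    return extend([(0, 0, 0)], {(0, 0, 0)})


n_sequences = 4**8           # four monomer types at each of 8 positions
n_walks = count_walks(8)     # raw count, symmetry not removed
print(n_sequences, n_walks)
```

Because both numbers are small, every sequence can be threaded through every conformation exactly, which is what makes this kind of toy model useful.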
Order parameter description
Macroscopic System
Ferromagnetism: M
Nematic Phases: $S = P_2(\cos\theta)$
Smectic Phases: S, tilt angle
Spin Glasses: M; $q_{EA}$
Paramagnet: M = 0; $q_{EA} = 0$
Spin Glass: M = 0; $q_{EA} \neq 0$
Ferromagnet: $M \neq 0$; $q_{EA} \neq 0$
Physics dictates OP
Proteins: a lot of choices. OP is in the eye of the beholder.
$\rho = N/R_g^3$; $\chi$ (overlap)
“Unfolded”: ($\rho$ small, $\chi$ big)
Compact non-native: ($\rho \approx O(1)$, $\chi$ big)
Native: ($\rho \approx O(1)$, $\chi$ small)
Other choices: helix/sheet content; distribution of contacts ...
Folding reaction as a phase transition: A rationale N = number of amino acids
Order Parameter Description: $\rho = N/R_g^3$; $\chi$ = overlap with NBA (0 for NBA)
Unfolded (U); Collapsed Globules (CG); Folded (NBA)
U: $\rho$ small, $\chi$ large (“vapor”)
CG: $\rho \approx O(1)$, $\chi$ large (dense, no order: “liquid”)
NBA: $\rho \approx O(1)$, $\chi$ small (dense, ordered: “solid”)
Developing a “nucleation” picture
Free Energy of Creating a Droplet: $\Delta G(R) \approx -\alpha R^3 + \sigma R^2$
Driving force (bulk) + opposing force (surface)
What are these forces in proteins?
Driving force: hydrophobic collapse; burying H bonds
Opposing: “droplet with nonconstant $\sigma$”; entropy loss due to looping
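The droplet picture can be made concrete with a few lines of arithmetic. The sketch assumes the free energy has the classical form −αR³ + σR², where `alpha` (bulk driving force) and `sigma` (surface tension) are placeholder names and the numerical values are arbitrary. Setting the derivative to zero gives the critical radius and the barrier height.

```python
def droplet_free_energy(r, alpha, sigma):
    """Bulk gain (-alpha R^3) against surface cost (+sigma R^2)."""
    return -alpha * r**3 + sigma * r**2


def critical_radius(alpha, sigma):
    """Solve d(dG)/dR = -3 alpha R^2 + 2 sigma R = 0 for R > 0."""
    return 2.0 * sigma / (3.0 * alpha)


def barrier_height(alpha, sigma):
    """Free energy at the critical radius: 4 sigma^3 / (27 alpha^2)."""
    return 4.0 * sigma**3 / (27.0 * alpha**2)


ALPHA, SIGMA = 1.0, 1.5  # arbitrary illustrative values
r_star = critical_radius(ALPHA, SIGMA)
print(r_star, barrier_height(ALPHA, SIGMA))
```

Droplets smaller than the critical radius tend to dissolve and larger ones grow; a surface tension that varies with droplet size, as the slide suggests for proteins, would change both numbers.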
Tentative Models + Slight refinement
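Cost of creating a region with $N_R$ ordered residues out of N?
Rugged Landscape with Many Possibilities
Some Phenomenological Models
$\Delta G_{BW}(N_R) \approx -f(T) N_R + a^2 \sigma N_R^{2/3}$
$N_R^* \approx (8 a^2 \sigma / 3 f(T))^3$
$N_R^*$ too large for typical $\sigma$ and f(T) values
$\Delta G_{GT}(N_R) \approx -h(h - 1) N_R^2 + a^2 \sigma N_R^{2/3}$
$N_R^* \approx (8 a^2 \sigma / h)^{3/4}$
$N_R^* \approx 15$ or so...
Using experimental parameters, $N_R^* \approx 27$ or so...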
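The exponents 3 and 3/4 in the two nucleus-size estimates follow from one stationarity condition. The worked derivation below uses generic coefficients A (bulk) and B (surface) and a generic bulk power p; these are placeholder symbols, not the slide's notation. Numerical prefactors on the slide, such as the 8, presumably absorb geometric factors that are not shown, so only the exponents should be compared.

```latex
\begin{align*}
\Delta G(N_R) &\approx -A\,N_R^{p} + B\,N_R^{2/3}, \\
\left.\frac{d\,\Delta G}{dN_R}\right|_{N_R^*} = 0
  &\;\Rightarrow\; p\,A\,(N_R^*)^{p-1} = \tfrac{2}{3}\,B\,(N_R^*)^{-1/3}, \\
N_R^* &= \left(\frac{2B}{3pA}\right)^{\frac{1}{p - 2/3}}, \\
p = 1:&\quad N_R^* = \left(\frac{2B}{3A}\right)^{3}, \\
p = 2:&\quad N_R^* = \left(\frac{B}{3A}\right)^{3/4}.
\end{align*}
```

A bulk term that grows as $N_R^2$ gives a much weaker dependence on the ratio B/A, which is why the second model yields a far smaller critical nucleus.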
Folding trajectories to MFN to transition state ensemble (TSE)
Structures near barrier top or TSE
Simulations
Moving from one scenario to another – pressure jump…
Refinement (Hiding Ignorance)
G(NR) -1NR + NR
+ S (loop)
d small barrier (downhill folding)
Surface tension cannot be a constant
Multiple Folding Nuclei (Structural Plasticity)
Multi-domain proteins involve interfaces between globular parts..
Finite Size Effects on Folding: Order Parameters Matter
Scaling of $\Omega_c$ with N (number of aa)
Two points:
1) $T_F$ = max in $\Delta\chi$ (susceptibility)
$\Delta\chi = T\,(d\langle\chi\rangle/dh)$; h = ordering field (analogy to magnetic system); $\chi$ is dimensionless, so h ~ T (in proteins, or [C])
2) Efficient folding: $T_F \approx T_\theta$ (collapse temperature; Camacho & dt PNAS (1993)). $\Omega_c$ controlled by protein DSE at $T \approx T_F \approx T_\theta$
$R_g \sim (\Delta T/T_F)^{-\nu} \sim N^{\nu}$ (DSE a SAW & magnet analogy)
$\Delta T/T_F \sim 1/N$ (Result I)
Finite-size effects on TF
$\Delta T/T_F \sim 1/N$
Experiments
Lattice Models with Side Chains
Li, Klimov & DT Phys. Rev. Lett. (04)
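Scaling of $\Omega_c$ with N: Magnet-Polymer Analogy
$\Omega_c = (T_F/\Delta T)\,[T_F\,(d\langle\chi\rangle/dT)]$: “disp in $T_F$” × “susceptibility”
$\Omega_c \sim N^{\zeta}$; $\zeta = 1 + \gamma$ (universal); $\gamma \approx 1.2$. Result II
$T \approx T_F \approx T_\theta$
Universality in Cooperativity: Li, Klimov, dt PRL (04)
$\Omega_c \sim N^{\zeta}$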
Experiments
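One common way to test scaling predictions like these against data is a straight-line fit on log-log axes. The sketch below reads the slide's two results as a transition width that scales as 1/N and a cooperativity that scales as N^ζ with ζ = 1 + γ and γ ≈ 1.2. The data points are synthetic and noise-free, generated from those laws purely to show the fitting step; they are not experimental values.

```python
import math


def loglog_slope(ns, ys):
    """Least-squares slope of log(y) against log(N)."""
    xs = [math.log(n) for n in ns]
    ls = [math.log(y) for y in ys]
    mx = sum(xs) / len(xs)
    my = sum(ls) / len(ls)
    sxy = sum((x - mx) * (l - my) for x, l in zip(xs, ls))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


GAMMA = 1.2          # susceptibility exponent as read from the slide
ZETA = 1.0 + GAMMA   # predicted cooperativity exponent

ns = [20, 40, 80, 160]                  # hypothetical chain lengths
omega_c = [n**ZETA for n in ns]         # synthetic cooperativity values
width = [1.0 / n for n in ns]           # synthetic transition widths

print(loglog_slope(ns, omega_c), loglog_slope(ns, width))
```

With real measurements the fitted slopes carry scatter, so agreement with ζ ≈ 2.2 and −1 is judged within error bars rather than exactly.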
Residue-dependent melting $T_m$: Holtzer Effect
Consequences of finite size
$f_m(T_{m,i}) = 0.5$
Lattice Models with Side Chains
Klimov & dt J. Comp. Chemistry (2002)
Is the melting temperature Unique? Finite-size effects!
BBL
Holtzer: Leucine Zipper, Biophys J 1997
β-hairpin: PNAS 2000, Klimov & dt
ΔT large
Munoz: Nature 2006
Udgaonkar: Barstar, Monellin
Residue-dependent ordering: Protein L; O’Brien, Brooks & dt Biochemistry (2009)
Spread increases as N decreases ... finite-size effects
Summary So Far: really with little work on a complex problem
• Sizes of single domain proteins (folded and unfolded) roughly follow Flory’s expectation
• The same holds for folded RNA structures
• Nucleation Picture of Folding
• Finite size effects – theory matches experiments
Part II: Protein Folding Kinetics
Organization of structure
Fluctuations due to finite-size effects
Changes in distributions at various stages of folding ([C] or T)
A Few Questions
• Mechanisms of Structural organization
• Nature of the Folding Nuclei
• Interactions that guide folding (native vs non-native)
• Folding rates – dependence on N
Stages in folding
Random Coil → “Specific Collapse” → Native State
Time scales: $\tau_C$ (collapse), $\tau_F$ (folding)
$\tau_F/\tau_C \approx 100 - 1000$
Camacho and dt, PNAS (1993)
dt J. de. Physique (1995)
Need for Quantitative Models
Using mechanical force to trigger folding
smFRET trajectories
Fernandez, Rief.. Hyeon, Morrison, dt
Eaton, Schuler, Haran…
Non-native interactions early (time scales of collapse) in folding;
Subsequently native interactions dominate. Camacho & dt Proteins 22, 27-40 (1995); Cardenas-Elber (all atom simulations)
Dill type HP model: beads on a lattice
Native Centric (or Go) models appropriate!
Multiple protein folding nuclei and the transition state ensemble in two-state proteins
Proteins: Structure, Function, and Bioinformatics, Volume 43, Issue 4, pages 465-475, 17 APR 2001. DOI: 10.1002/prot.1058
http://onlinelibrary.wiley.com/doi/10.1002/prot.1058/full#fig5
LMSC: Exact Enumeration
MC simulations; 600 folding trajectories
Folding time: $\tau_A/\tau_A^{Go} \approx 3$
Klimov and dt (2001)
Transition State Ensemble: Neural Net
Go
Klimov and dt Proteins 2001
ES NSB 2000
Equivalent to pfold
Multiple protein folding nuclei and the transition state ensemble in two-state proteins
Proteins: Structure, Function, and Bioinformatics, Volume 43, Issue 4, pages 465-475, 17 APR 2001. DOI: 10.1002/prot.1058
http://onlinelibrary.wiley.com/doi/10.1002/prot.1058/full#fig9
Multiple Channels Carry Flux to the NBA
Multiple Transition States Connecting these Channels
Bottom line: to get semi-quantitative results, Go-type models may be enough...
Folding Rate versus N
$k_F \approx k_0 \exp(-N^{\beta})$ with $\beta = 0.5$
Barriers scale sublinearly with N
Proteins: hydrophobic residues buried in interior (chain compact); polar and charged residues want solvent exposure (extended states). Frustration between conflicting requirements.
$P(\Delta G^{\ddagger}) \approx \exp(-(\Delta G^{\ddagger})^2/2N)$
$\langle \Delta G^{\ddagger} \rangle \approx N^{0.5}$ (analogy to glasses)
Fit to Experiments (80 proteins; Dill, PNAS 2012)
Reasonable given data from so many different laboratories
Even better for RNA (Hyeon, 2012)
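The size dependence of the folding rate is easy to tabulate. The sketch below evaluates k_F ≈ k_0 exp(−N^β) with β = 0.5, together with the Gaussian barrier weight exp(−(ΔG‡)²/2N). The prefactor `k0` is set to 1 in arbitrary units because the slide gives no value, and the barrier is taken in units of k_BT, which is an assumption.

```python
import math


def folding_rate(n, k0=1.0, beta=0.5):
    """k_F = k0 * exp(-N^beta); barrier in units of k_B T (assumed)."""
    return k0 * math.exp(-(n**beta))


def barrier_weight(dg, n):
    """Unnormalized Gaussian weight exp(-dg^2 / (2 N)) for a barrier dg."""
    return math.exp(-(dg**2) / (2.0 * n))


for n in (50, 100, 200, 400):
    print(n, f"{folding_rate(n):.3e}")

slowdown = folding_rate(100) / folding_rate(400)
print(f"N = 100 folds about {slowdown:.0f} times faster than N = 400")
```

Because the barrier grows only as the square root of N, quadrupling the chain length doubles the barrier rather than quadrupling it.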
At high [C], is the DSE a Flory coil?
It appears that high [C] is a Θ-solvent!
$P(x) \sim x^{\delta} \exp(-x^{1/(1-\nu)})$
Protein collapse
CT = (C - Cm)/C
$\delta = 2 + (\gamma - 1)/\nu$
O’Brien PNAS 2008
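The difference between a Flory coil and a Θ-solvent chain shows up in the two exponents of the distance distribution. The sketch below computes δ = 2 + (γ − 1)/ν and the stretching exponent 1/(1 − ν) for two parameter sets. The Θ-solvent values ν = 0.5, γ = 1 are the standard ideal-chain values. The good-solvent pair ν = 0.6, γ = 1.2 uses the rounded numbers that appear elsewhere in these slides and is an assumption here.

```python
import math


def distribution_exponents(nu, gamma):
    """Return (delta, t) for P(x) ~ x^delta * exp(-x^t)."""
    delta = 2.0 + (gamma - 1.0) / nu
    t = 1.0 / (1.0 - nu)
    return delta, t


def p_unnormalized(x, nu, gamma):
    """Unnormalized scaled distance distribution."""
    delta, t = distribution_exponents(nu, gamma)
    return x**delta * math.exp(-(x**t))


theta = distribution_exponents(0.5, 1.0)   # ideal (Theta-solvent) chain
good = distribution_exponents(0.6, 1.2)    # rounded good-solvent values
print("Theta solvent:", theta)
print("Good solvent: ", good)
```

For the Θ-solvent values the form reduces to x² exp(−x²), the Gaussian-chain result, so comparing measured distributions against the two shapes is one way to ask which solvent regime applies.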
Toy Model (Is the fibril structure encoded in the monomer spectrum?) Prot Sci 2002; JCP 2008
4 types of monomers (H, P, +, -)
Monomer has 8 beads
# of sequences = $4^8$
(amylome)
# of conformations on cubic lattice = 1,841
Structure of “protofilament” + “fibril”: single and double layer
Interplay of $E_{+-}$ and $E_{HH}$
a: Monomers parallel
b: Monomers alternate
c: Double layer
d: No fibril, compact
Optimal growth temperature
$\tau_{fib} = (10^4 - 10^n)\,\tau_F$
Largest n about 9
Seeding speeds up the rate of fibril formation
Growth rate depends on N* population $P_{N^*}$
Depends on sequence
Sequence + N* ensemble → fibril kinetics; monomer landscape encodes structure + growth rate
Lifshitz-Slyozov Growth Law: Supersaturated Solution
J. Phys. Chem. Solids (1961)
G 0M1/3
Large clusters incorporate small oligomers
M Mn* [ PF Fibrils]