

Protein Kinases

There are 518 kinases encoded in the human genome, and they have been shown to play pivotal roles in virtually all aspects of cellular physiology (2). Dysregulation of kinase activity has been implicated in pathological conditions ranging from neuronal disorders to cellular transformation in leukemias (3). It is currently estimated that over a quarter of all pharmaceutical drug targets are protein kinases, an assessment that drives an eager search for new chemical scaffolds with the potential to become drugs (4). In the present work, the QM/MM hybrid method is tested in the affinity prediction of several protein kinase-ligand complexes. The protein kinase system under study was CDK2 in complex with two sets of compounds, one diverse and one related.

Among the different driving forces controlling the protein-ligand binding process, the electrostatic component is one of the most relevant. This driving force, which includes hydrogen bonds (H-bonds) and electrostatic (ionic) interactions, can be estimated quantitatively through computational methods. Nowadays, hybrid calculation approaches are the methods of choice for studying enzymatic catalysis or protein-ligand interactions in computer-aided drug design. These approaches combine well-established quantum mechanics (QM) methods with one of the many molecular mechanics force fields available today (1). The combination of these two powerful tools allows the computational (organic or medicinal) chemist to study and understand many biological processes in a relatively short time.

Abstract

Figure 3. CDK2/UCN-1 complex. Carbon atoms of the ligand and the protein are shown in yellow and grey, respectively. Hydrogen bonds between ligand and protein are shown as dashed yellow lines. Hinge region residues and other residues within 2.5 Å of the ligand are depicted in stick representation.

Compound Sets

Effect of Protein-Ligand Flexibility on the QM/MM Scoring Function

QM/MM Scoring Function at Near-X-ray Structures

Concluding Remarks

References

(1) Senn, H. M.; Thiel, W. Angewandte Chemie International Edition 48, 1198–1229 (2009).
(2) Manning, G., Whyte, D. B., Martinez, R., Hunter, T. & Sudarsanam, S. The protein kinase complement of the human genome. Science 298, 1912–1934 (2002).
(3) Hunter, T. The role of tyrosine phosphorylation in cell growth and disease. Harvey Lect. 94, 81–119 (1998).
(4) Cohen, P. Protein kinases – the major drug targets of the twenty-first century? Nat. Rev. Drug Discov. 1, 309–315 (2002).
(5) Fischmann, T. et al. Biopolymers 89, 372–379 (2007).
(6) Dobes, P. et al. J. Comput. Aided Mol. Des. 25, 223–235 (2011).
(7) QSite, version 5.5; Schrödinger, LLC: New York, NY, 2009.
(8) Desmond Molecular Dynamics System, version 2.2; D. E. Shaw Research, Schrödinger, LLC: New York, NY, 2009.

E-mail: [email protected]

Missing residues and loops in the X-ray structures were added with Schrödinger's Prime module. To incorporate protein-ligand flexibility and evaluate its impact on the QM/MM scoring function, molecular dynamics (MD) simulations of all inhibitors in compound set B, bound in the CDK2 active site, were performed with the OPLS-AA force field in explicit solvent with the SPC water model (OPLS-AA/SPC), using the Desmond package (8).

The strength of protein-ligand complexes was evaluated with a new quantum mechanics/molecular mechanics (QM/MM) scoring function, in which inhibitors were treated with DFT methodology and the protein was modeled with the OPLS-2005 force field. QM/MM single point calculations allowed us to correlate the potencies (measured as IC50 or Ki values) of the studied inhibitors with the QM/MM energies of the systems. The scoring function was tested on protein-ligand conformations close to the X-ray crystal structure and on MD-derived conformations. The preliminary results suggest that correlations could be improved by averaging over a representative set of MD protein-ligand conformations. Although only a small set of compounds was tested in the present work, ongoing computational experiments on additional protein kinase-ligand systems indicate that the new QM/MM scoring function could prove useful in ranking and predicting the potency of new protein kinase inhibitors.

Figure 2. Active site of CDK2 interacting with compound 9 (PDB code: 2R3M). Eleven compounds containing the related bicyclic heterocycles pyrazolopyrimidine and imidazopyrazine were ranked using the QM/MM scoring function. Figure taken from reference 5.

Acknowledgments

J.H.A.M. acknowledges financial support through project FONDECYT Nº 11100177. J.C. thanks “Becas Universidad de Talca” for financial support through a doctoral fellowship.

Figure 1, taken from http://www.icr.ac.uk

Figure 4. Correlation between the inhibitors' biological activity (Ki or IC50) and the QM/MM interaction energy. Panels A and B correspond to CDK2 interacting with compound set B; panels C and D correspond to CDK2 interacting with compound set A. In A and C, only the ligand was included in the QM shell. In B and D, the QM shell included the ligand, the hinge region (backbone of residues Glu81-Asp86) and the side chains of residues Lys89, Asp86, Gly131, Lys33 and Asp45.

Before applying the QM/MM approach, every protein-ligand complex was energy-minimized with the Embrace application of the MacroModel module in the Schrödinger Suite. The ligand and the residues within 5 Å of it were selected as a first shell, which was allowed to move during minimization. A second, restrained shell (force constant of 200 kJ/mol·Å²) comprised the residues within 10 Å of the first shell. The remaining atoms beyond this distance were kept frozen during minimization. The PRCG (Polak-Ribiere Conjugate Gradient) method was used, with a maximum of 15000 iterations and a convergence threshold of 0.05 kJ/mol. Subsequently, the QM/MM method was applied to each protein-ligand complex using the Q-Site module (7) of the Schrödinger Suite. A single point energy calculation was performed on each system, in which the ligands formed the quantum mechanical (QM) region and were modeled at the density functional theory (DFT) level with the B3LYP functional and the 6-31G** basis set. The protein and water molecules formed the molecular mechanics (MM) region and were modeled with the OPLS-2005 force field.
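The three-shell set-up described above (free, restrained, frozen) can be illustrated with a small geometric sketch. This is not the MacroModel/Embrace interface; the function name `assign_shells`, the input layout (a dictionary of residue names mapped to atom coordinates) and the example coordinates are all hypothetical, and the distance criterion (minimum atom-atom distance) is an assumption about how "within 5 Å" and "within 10 Å" are measured.

```python
import math

def min_distance(atoms_a, atoms_b):
    """Smallest atom-atom distance (in Å) between two sets of (x, y, z) points."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)

def assign_shells(ligand_atoms, residues, inner=5.0, outer=10.0):
    """Label each residue as 'free', 'restrained' or 'frozen'.

    ligand_atoms: list of (x, y, z) tuples.
    residues: dict mapping a residue name to its list of (x, y, z) tuples.
    inner: cutoff around the ligand for the freely moving first shell.
    outer: cutoff around the first shell for the restrained second shell.
    """
    labels = {}
    first_shell_atoms = list(ligand_atoms)
    # Pass 1: residues close to the ligand move freely.
    for name, atoms in residues.items():
        if min_distance(atoms, ligand_atoms) <= inner:
            labels[name] = "free"
            first_shell_atoms.extend(atoms)
    # Pass 2: residues close to the first shell are restrained, the rest frozen.
    for name, atoms in residues.items():
        if name in labels:
            continue
        if min_distance(atoms, first_shell_atoms) <= outer:
            labels[name] = "restrained"
        else:
            labels[name] = "frozen"
    return labels

# Hypothetical toy coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0)]
toy_residues = {
    "RES_A": [(3.0, 0.0, 0.0)],
    "RES_B": [(12.0, 0.0, 0.0)],
    "RES_C": [(30.0, 0.0, 0.0)],
}
print(assign_shells(ligand, toy_residues))
```

In this toy case the second residue lies outside the 5 Å cutoff around the ligand but within 10 Å of the first shell, so it ends up restrained rather than frozen.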

A default protocol of six steps, consisting of minimizations and short (12 and 24 ps) molecular dynamics simulations, was applied to relax the model systems before the final long simulations. After that, a 2 ns equilibration MD simulation was performed on each complex, followed by a 10 ns production MD simulation. For each protein-ligand complex, the first ten frames of the production simulation, ranked by lowest RMSD against the corresponding X-ray crystal structure, were chosen for the QM/MM study. The water solvent box and counter-ions were deleted from the systems to save computational time. A QM/MM single point energy calculation was performed on each frame according to the partition scheme described before. The resulting score for each protein-ligand system was the average over its ten frames.
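The frame selection and averaging step can be sketched as follows. This is a minimal illustration under stated assumptions: each frame is represented as a hypothetical `(rmsd, energy)` pair that has already been computed elsewhere, and the function name `average_lowest_rmsd` is not part of Desmond or Q-Site.

```python
from statistics import mean

def average_lowest_rmsd(frames, n=10):
    """Average the QM/MM energies of the n frames with the lowest RMSD.

    frames: list of (rmsd_to_xray, qmmm_energy) tuples, one per MD frame.
    n: number of frames to keep (ten in the protocol described above).
    """
    if len(frames) < n:
        raise ValueError("not enough frames to select from")
    # Rank frames by RMSD against the X-ray structure and keep the first n.
    selected = sorted(frames, key=lambda frame: frame[0])[:n]
    return mean([energy for _, energy in selected])

# Hypothetical trajectory: RMSD grows with the frame index, energies are made up.
toy_frames = [(0.1 * i, -float(i)) for i in range(1, 21)]
print(average_lowest_rmsd(toy_frames))
```

Averaging over several low-RMSD frames, rather than scoring a single structure, is what lets the score reflect some of the protein-ligand flexibility sampled by the MD run.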

Figure 5. A. Plot of QM/MM energy against Ln(Ki) for complete X-ray structures (missing residues and loops were completed with Prime). B. Plot of the averaged QM/MM energy (10 frames from the long MD simulations) against Ln(Ki) for structures after RMSD clustering. Only the ligand atoms were included in the QM shell. C. Alignment of different conformations of compound NU6127 (frame selected from the MD simulation against X-ray structure 1E1X) within the binding site. Nonpolar hydrogens are omitted for clarity.

Figure 4. Time dependence of the backbone RMSD from the starting structures during the equilibration process. The RMSD for the systems containing the most active compounds, staurosporine, UCN-1 and NU6102, is shown in blue, red and green, respectively.

Exploring the Use of QM/MM Hybrid Methods in the Prediction of Protein-Ligand Binding Affinities

Jans Alzate-Morales¹, Iñaki Tuñon², Julio Caballero¹, Francisco Adasme¹ and Camila Muñoz¹

¹ Centre for Bioinformatics and Molecular Simulations, School of Bioinformatic Engineering, University of Talca, Talca, Chile

² Chemistry-Physics Department, University of Valencia, Valencia, Spain

No. (a)   PDB Entry   QM/MM Interaction Energy (kcal/mol)   pIC50   IC50 (nM)
1         2r3f        -83.28                                6.30    500
2         2r3g        -83.07                                6.10    800
3         2r3h        -47.16                                4.70    20000
4         2r3i        -85.94                                6.00    1000
7         2r3k        -89.37                                7.00    100
8         2r3l        -84.40                                7.00    100
9         2r3m        -100.67                               8.00    10
10        2r3n        -84.30                                7.15    70
11        2r3o        -88.16                                6.22    600
12        2r3p        -94.35                                6.05    900
6         2r3g        -80.19                                6.52    300

Table 1. Compound set A, taken from Fischmann et al. (5). Biological activity was converted to pIC50 values, and the PDB structures were prepared with the Protein Preparation Wizard module of the Schrödinger Suite in order to evaluate the affinity by means of QM/MM methods. (a) Compounds are numbered according to reference 5.
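The conversion behind the pIC50 column of Table 1 can be checked with a few lines of code. The sketch below assumes the usual definition pIC50 = -log10(IC50 in mol/L); the function name `pic50_from_nm` is introduced here only for illustration.

```python
import math

def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)

# IC50 values (nM) taken from Table 1.
for ic50_nm in [500, 800, 20000, 1000, 100, 10, 70, 600, 900, 300]:
    print(f"{ic50_nm:>6} nM -> pIC50 = {pic50_from_nm(ic50_nm):.2f}")
```

A tenfold drop in IC50 raises pIC50 by one unit, which is why the most potent compound in the set (10 nM) has the highest value.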

PDB Entry   QM/MM Interaction Energy (kcal/mol)   Ln(Ki) (a)
1e1x        -61.65                                -13.55
2exm        -58.91                                -9.46
1h1s        -106.06                               -18.93
1h1p        -62.55                                -11.33
1pxp        -74.29                                -15.33
1pkd        -88.00                                -17.32
1pxj        -62.07                                -11.94
1pxl        -64.27                                -15.05
1pxm        -72.57                                -16.63
1a4l        -79.33                                -13.63
2fvd        -105.90                               -19.62
1aq1        -101.13                               -19.66
1pxn        -79.04                                -16.47
2x1n        -82.99                                -17.59
1ogu        -112.46                               -17.55

Table 2. Compound set B, compiled by Dobes et al. (6). Biological activity is expressed as Ln(Ki), and the PDB structures were prepared as for compound set A in order to evaluate the affinity by means of QM/MM methods. (a) Ln(Ki) is the natural logarithm of the inhibition constant (Ki in M).
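The kind of correlation reported for these data can be illustrated with an ordinary least-squares fit of Ln(Ki) against the QM/MM interaction energy. The sketch below uses the values listed in Table 2 and plain Python; the helper name `linear_fit` is an assumption made for this example, and the resulting R² need not match any particular entry of Table 3, since the minimization protocol and QM shell behind the tabulated energies are not stated.

```python
from statistics import mean

# QM/MM interaction energies (kcal/mol) and Ln(Ki) values from Table 2.
energies = [-61.65, -58.91, -106.06, -62.55, -74.29, -88.00, -62.07, -64.27,
            -72.57, -79.33, -105.90, -101.13, -79.04, -82.99, -112.46]
ln_ki = [-13.55, -9.46, -18.93, -11.33, -15.33, -17.32, -11.94, -15.05,
         -16.63, -13.63, -19.62, -19.66, -16.47, -17.59, -17.55]

def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept, plus the R^2 of the fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

slope, intercept, r_squared = linear_fit(energies, ln_ki)
print(f"Ln(Ki) = {slope:.3f} * E + {intercept:.3f}, R^2 = {r_squared:.2f}")
```

A positive slope is the expected sign here: a more negative (more favourable) interaction energy goes with a more negative Ln(Ki), that is, a tighter-binding inhibitor.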


                                CDK2-Set A                          CDK2-Set B
QM shell                        B3LYP/6-31G**   B3LYP/6-31G++**     B3LYP/6-31G**   B3LYP/lacvp++**
Minimization protocol 1 (a)
  Ligand                        0.46            ---                 0.64            0.61
  Ligand+Residues               0.43            ---                 0.61            ---
Minimization protocol 2 (b)
  Ligand                        0.60            0.59                0.72            0.71
  Ligand+Residues               0.60            0.66                0.67            ---

Table 3. Correlation coefficients (R2) obtained for different QM/MM partition schemes, protein-ligand minimization protocols and basis sets. (a) In this protocol the entire protein-ligand structure was minimized to an RMSD of 0.30 with respect to the X-ray structure. (b) In this protocol only a shell of 5 Å around the ligand was minimized, as described before.
