
Application of Quantum Mechanics and Molecular Mechanics in Chemoinformatics

Natalia Sizochenko, D. Majumdar, Szczepan Roszak, and Jerzy Leszczynski

Contents

Introduction
Quantum Chemical Parameters for Chemoinformatics
    Atomic Charges
    Other Charge-Related Properties
    Total and Orbital Energies and Related Properties
    Molecular Surface-Related Properties
Quantum Chemical Techniques as Source of Parameter Generations
Molecular Mechanics and QSAR
Conclusions
Bibliography

Abstract

Quantum chemical and molecular mechanics-generated structure and reactivity parameters comprise a part of chemoinformatics, where such parameters are stored and properly indexed for search-information of a related molecule or a set of molecular systems. The present review makes a general survey of the various computable quantum chemical parameters for molecules. These could be used for quantitative structure activity relation (QSAR) modeling. The applicability of various quantum chemical techniques for such property (QSAR parameters) is also discussed and density functional theory (DFT)-related techniques have been advocated to be quite useful for such purposes. Molecular mechanics methods, although mostly useful for less time consuming structure calculations and important in higher level molecular dynamics and Monte-Carlo simulations, are sometimes useful to generate structure-related descriptors for QSAR analysis. A brief discussion in this connection with molecular mechanics-related QSAR modeling is included to show the use of such descriptors.

N. Sizochenko • D. Majumdar • J. Leszczynski
Department of Chemistry, Jackson State University, Jackson, MS, USA

S. Roszak
Advanced Materials Engineering and Modelling Group, Faculty of Chemistry, Wroclaw University of Technology, Wroclaw, Poland

© Springer Science+Business Media Dordrecht 2016
J. Leszczynski et al. (eds.), Handbook of Computational Chemistry, DOI 10.1007/978-94-007-6169-8_52-1

Introduction

Chemoinformatics (also known as cheminformatics and chemical informatics) is the use of computer and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used mostly in drug discovery but can also be used in chemical and allied industries, which include paper, pulp, dyes, and other such industries. The term chemoinformatics was first properly defined by F.K. Brown in 1998 (Brown 1998), and according to this definition it is described as the mixing of information resources to transform data into information and information into knowledge for the intended purpose of making better and faster decisions in the area of drug lead identification and optimization. This basic definition was later extended to the other allied chemical fields. It combines the scientific working fields of chemistry, computer science, and information science in the areas of topology, chemical graph theory, information retrieval, and data mining in the chemical space (Leach and Gillet 2007).

The main goal in chemoinformatics involves the storage, indexing, and search of information on the related compounds. The related search topics include unstructured data mining (information retrieval and extraction), structured data mining (database, graph, molecule, sequence, and tree mining), and digital libraries. The in silico representations of chemicals (involving structure and properties) are usually stored in large chemical databases. They also include visual representations in two and three dimensions for studying physical interactions, modeling, and docking. These chemical databases include virtual libraries (data for real and virtual molecules) and facilities for computationally screening in silico libraries of compounds (commonly known as virtual screening). The results are used in the well-known quantitative structure-activity relationship (QSAR) studies for the prediction of the specific activity of compounds.

Since the inception of the QSAR technique (Hansch 1969), electronic parameters from molecular structures have been incorporated in the correlation through Hammett substituent constants (Hammett 1937; Hansch et al. 1991). This was possible since the molecules used were aromatic systems with substituents at the meta- and para-positions. Hansch (1969) used the parameters generated through such analysis as a measure of the electronic characteristics of molecules. An enormous number of QSAR equations have been reported in the literature, many having functional forms much more complicated than the original Hansch equation. Various parameters have been used in QSAR equations in this respect, most of which were designed to represent the hydrophobic, electronic, and steric characteristics of molecules. Extensive tables have been published with values of these common parameters for a wide variety of substituents.

Modern state-of-the-art quantum chemical techniques provide a meaningful representation of structures and are a source of related physical and electronic properties of molecules from their calculated wave functions. The usual electronic descriptors of molecules are atomic charges, dipole moment, electron affinity, ionization potential, and many other electronic properties related to orbital energies and molecular surfaces (e.g., molecular electrostatic potential (MEP)), as well as specific and thermodynamic parameters (related to the conformational properties and reactivities). These are of potential importance in QSAR analysis and are usually stored in chemoinformatics libraries as basic molecular information. Molecular mechanics (MM) calculations are also a useful tool for structure calculations in a less time consuming way (with lower accuracy) but, unlike quantum chemical methods, are not useful for the generation of electronic parameters. The main importance of the various MM approaches actually lies in higher level molecular dynamics and Monte-Carlo simulations. Still, in some specific cases, MM has been used to generate structure-related descriptors for QSAR methods.

The relation between the quantum chemically/MM derivable properties (usually stored in chemoinformatics libraries) and the related mathematical modeling is shown in Fig. 1. The mathematical models are usually of statistical origin and correlate the observed biological activities with molecular descriptors to understand their specific properties. They are also useful for predicting the properties of designed drug molecules within the statistical limitations of the derived models. In the present review, we discuss the connection of such in silico databases in chemoinformatics to drug discovery only. The review includes brief discussions on the usefulness of various quantum chemical parameters related to the biological properties of molecules (to be used in QSAR) and a brief résumé of the techniques for generating such quantum chemical parameters. The last section of this review briefly discusses the usefulness of MM-derived parameters for QSAR.

[Flow chart boxes: Chemoinformatics Database; 3D Molecular Structure; Atomic, Molecular, and Surface Properties; Mathematical Transformation]

Fig. 1 The use of derived molecular properties (quantum chemical origin) for mathematical drug design is schematically shown in the flow chart. The properties are either from a chemoinformatics database or obtained through direct calculations

Quantum Chemical Parameters for Chemoinformatics

Atomic Charges

The quantum chemically computed atomic charges are among the most fundamental parameters used in QSAR, because electrostatic interactions are central in controlling most chemical and biological reactions. The advantage of a quantum chemical technique is that it can compute the net charge on each atom in a molecule through gross population analysis. These parameters are also important for investigating the surface charge characteristics of a molecule. There are various techniques to generate atomic charges through population analysis. The successful techniques among the oldest ones are the Mulliken and Löwdin population analyses. In Mulliken's method (Mulliken 1955), the electron population ($\rho_A$) on an atom A of a molecule is defined in terms of the one-electron density matrix ($D_{\mu\nu}$) and the overlap matrix ($S_{\mu\nu}$) as

\[ \rho_A = \sum_{\mu \in A}^{\mathrm{AO}} \sum_{\nu}^{\mathrm{AO}} D_{\mu\nu} S_{\mu\nu} \tag{1} \]

The diagonal term $D_{\mu\mu}S_{\mu\mu}$ represents the number of electrons in the $\mu$th atomic orbital (AO), and the off-diagonal term $D_{\mu\nu}S_{\mu\nu}$ is (half) the number of electrons shared by the AOs $\mu$ and $\nu$ (there is an equivalent $D_{\nu\mu}S_{\nu\mu}$ element as the matrices are symmetric). The gross charge on atom A is defined as the sum of the nuclear and electronic contributions:

\[ Q_A = Z_A - \rho_A \tag{2} \]

The Löwdin technique (Löwdin 1970) uses the $S^{1/2} D S^{1/2}$ matrix for population analysis. This is equivalent to a population analysis of the density matrix in the orthogonalized basis set obtained by transforming the original set of functions by $S^{-1/2}$. The atomic charge then follows from the electron population associated with the atom through the equation:

\[ Q_A = Z_A - \sum_{\mu \in A}^{\mathrm{AO}} \left( S^{1/2} D S^{1/2} \right)_{\mu\mu} \tag{3} \]
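The two partitions in Eqs. (1)-(3) can be compared numerically with a minimal sketch. Everything in the example is an illustrative assumption rather than data from the review: the two-orbital, two-atom toy system, the overlap value 0.4, the orbital coefficients, and the function names `mulliken_charges` and `lowdin_charges` are all hypothetical.

```python
import numpy as np

def mulliken_charges(D, S, ao_to_atom, Z):
    """Eqs. (1)-(2): rho_A = sum_{mu in A} sum_nu D_mu,nu * S_mu,nu; Q_A = Z_A - rho_A."""
    pop_ao = np.einsum("mn,mn->m", D, S)            # gross population of each AO
    rho = np.bincount(ao_to_atom, weights=pop_ao, minlength=len(Z))
    return np.asarray(Z, dtype=float) - rho

def lowdin_charges(D, S, ao_to_atom, Z):
    """Eq. (3): populations are the diagonal of S^(1/2) D S^(1/2)."""
    w, U = np.linalg.eigh(S)                        # S is symmetric positive definite
    S_half = (U * np.sqrt(w)) @ U.T                 # symmetric square root of S
    pop_ao = np.diag(S_half @ D @ S_half)
    rho = np.bincount(ao_to_atom, weights=pop_ao, minlength=len(Z))
    return np.asarray(Z, dtype=float) - rho

# Hypothetical toy system: two atoms, one AO each, one doubly occupied MO.
S = np.array([[1.0, 0.4], [0.4, 1.0]])              # assumed overlap matrix
c = np.array([0.8, 0.4])                            # assumed (unnormalized) MO coefficients
c = c / np.sqrt(c @ S @ c)                          # normalize so that c^T S c = 1
D = 2.0 * np.outer(c, c)                            # closed-shell density matrix
ao_to_atom = np.array([0, 1])                       # AO index -> atom index
Z = [1.0, 1.0]                                      # nuclear charges

q_mulliken = mulliken_charges(D, S, ao_to_atom, Z)
q_lowdin = lowdin_charges(D, S, ao_to_atom, Z)
print("Mulliken:", q_mulliken)
print("Lowdin:  ", q_lowdin)
```

Both schemes conserve the total charge because the trace of DS equals the trace of $S^{1/2} D S^{1/2}$, yet they distribute the electrons differently between the two atoms.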

The Mulliken and Löwdin populations give different atomic charges, but mathematically there is nothing to indicate which partition gives the better result. A more realistic charge analysis, from a chemist's point of view, is obtained from the natural bond orbital (NBO) analysis of Weinhold and coworkers (Weinhold and Landis 2012). The method uses the one-electron density matrix to define the shape of atomic orbitals in the molecular environment and to derive molecular bonds from the electron density between atoms. The density matrix D is defined in terms of blocks of basis functions belonging to a particular center. The natural atomic orbitals (NAOs) in the molecular environment are defined as those which diagonalize the original blocks in the D matrix. These NAOs are, in general, not orthogonal and are orthogonalized to achieve a well-defined division of the electrons. The natural atomic charges are computed as the difference between the nuclear charge and the total natural population of the NAOs on the atom. There are several other atomic charge calculation techniques, viz., electrostatic potential derived charges (Bayly et al. 1993) and populations derived from the atoms in molecules (AIM) theory (Popelier 2000). They could be used in QSAR with varying success, but the most recommended charges are the natural atomic charges obtained through NBO analysis.

The atomic charges are usually used as static chemical reactivity indices (Franke 1984). Since electrostatic interactions play an important role in a chemical reaction, the orientations of the interacting moieties could be decided from the σ- and π-electron densities, and in such cases they could be used as directional reactivity indices. The orientations of the interacting moieties could, of course, be better interpreted from the molecular electrostatic potential (MEP) of the molecules. The atomic charges have also been used to explain various other properties. Examples include the octanol-water partition coefficients of organic compounds (Karelson et al. 1996; Ghose et al. 1988), the prediction of anti-HIV-1 activities of HEPT-analog compounds (Alves et al. 2000), hydrogen-bond donor strengths (Schwöbel et al. 2009), and atomic charge selectivity for optimal ligand-charge distribution at protein binding sites (Bhat et al. 2006). Atomic charges are also used as descriptors of molecular polarity, but again charge-derived properties like electrostatic moments would be a better choice.

Other Charge-Related Properties

These properties include frontier orbital densities, superdelocalizability, dipole moment, polarizability, and hyperpolarizability. The highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) are usually termed frontier orbitals (Fukui 1971). These orbitals are the key factors in explaining many organic reactions. Specifically, pericyclic reactions (Woodward and Hoffmann 1969) were explained from the symmetries of the HOMO and LUMO. The frontier orbital densities were designed later as QSAR descriptors for electrophilic ($f_r^{E}$) and nucleophilic ($f_r^{N}$) centers in terms of the HOMO ($C_{\mathrm{HOMO},n}$) and LUMO ($C_{\mathrm{LUMO},n}$) coefficients of the atomic orbitals n of atom X and were defined as $\sum (C_{\mathrm{HOMO},n})^2$ and $\sum (C_{\mathrm{LUMO},n})^2$, respectively. The superdelocalizabilities and atom-atom polarizabilities originated in Hückel theory and are not of much use in present-day quantum chemistry, although they are sometimes used as reactivity descriptors in QSAR analysis (Karelson et al. 1996).
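The frontier orbital densities defined above amount to summing squared HOMO or LUMO coefficients over the atomic orbitals of each atom. The sketch below illustrates this under stated assumptions: the AO basis is taken as orthonormal, and a simple Hückel model of butadiene (one π orbital per carbon, α = 0, β = -1) is used as a stand-in system. The Hückel example and the function name `frontier_orbital_densities` are illustrative choices, not taken from the review.

```python
import numpy as np

def frontier_orbital_densities(C, ao_to_atom, n_atoms, homo, lumo):
    """Sum squared HOMO/LUMO coefficients over the AOs belonging to each atom.

    C[:, k] holds the AO coefficients of molecular orbital k
    (an orthonormal AO basis is assumed).
    """
    f_E = np.bincount(ao_to_atom, weights=C[:, homo] ** 2, minlength=n_atoms)
    f_N = np.bincount(ao_to_atom, weights=C[:, lumo] ** 2, minlength=n_atoms)
    return f_E, f_N

# Assumed toy system: Hückel pi-model of butadiene (C1=C2-C3=C4).
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
beta = -1.0                                   # resonance integral in arbitrary units
energies, C = np.linalg.eigh(beta * adjacency)  # eigenvalues in ascending order

n_pi_electrons = 4
homo = n_pi_electrons // 2 - 1                # index of the highest doubly occupied MO
lumo = homo + 1

f_E, f_N = frontier_orbital_densities(C, np.arange(4), 4, homo, lumo)
print("f_E per carbon:", np.round(f_E, 4))
print("f_N per carbon:", np.round(f_N, 4))
```

In this model the terminal carbons carry the larger HOMO and LUMO densities, which is the kind of site ranking such descriptors are meant to capture.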

The dipole moment (μ) of a molecule can be used as a polarity descriptor; for example, molecular polarity accounts for chromatographic retention on a polar stationary phase (Grunenberg and Herges 1995). The dipole moment is important in several other respects as well, viz., it is an important parameter to account for the solvation properties of molecules (Milischuk and Matyushov 2002; Jensen 1999). In the case of directional orientation of molecules during interactions, the dipole-dipole interactions become quite important in the total electrostatic interaction if the components have non-zero dipole moments. The higher moments of the molecular systems also play an important role in shaping the orientation as well as the nature of interactions in specific cases. The importance of higher moments is seen in cation-π interactions (Ma and Dougherty 1997). These are very widely studied biological phenomena, and the nature of the interactions is believed to be controlled through electrostatics. Interactions through charge-dipole (C-D) and charge-quadrupole (C-Q) terms were believed to be the controlling features in such interactions. Studies on model cation-π interactions have shown (Kim et al. 1994; Kadlubanski et al. 2013) that octopole moments also have an important role in such interactions.

The model interactions were investigated for Mg²⁺, Ca²⁺, and NH₄⁺ ions (acceptors) interacting with benzene, p-methylphenol, and 3-methylindole (donors) (Kadlubanski et al. 2013). The calculated strong electron sharing in the complexes showed that it might have considerable influence in shaping the contribution of the total electrostatic interactions between the donors and acceptors. Figures 2 and 3 represent the nature of such multipolar components of the electrostatic interaction energies in the Mg²⁺, Ca²⁺, and NH₄⁺ complexes as a function of various interaction distances. The total electrostatic interaction energies (T-El) together with the charge-charge (C-C), dipole-charge (D-C), quadrupole-charge (Q-C), and octopole-charge (O-C) components were found to be important and are presented in the figures. The influences of the other multipolar components were found to be negligible. The results show that the total interaction energy curves, with few exceptions, are either repulsive or dispersive in nature. The T-El curves of the Mg²⁺ complexes are bound (i.e., they have a minimum), while the corresponding curves for the Ca²⁺ complexes are dispersive for benzene and 3-methylindole (Fig. 2a, c). In this respect, the interactions of p-methylphenol···Ca²⁺ show a weak minimum (Fig. 2c). The curves representing the Q-C and O-C components of both sets of complexes (Fig. 2b, d) show that they make significant repulsive or attractive contributions to the total electrostatic interactions around the equilibrium distances, thus revealing the importance of such interactions in strong cation-π complexes. The D-C and C-C components of these metal complexes also behave similarly (Fig. 2a–d), mostly offsetting the attractive components to generate the overall shape of the T-El curves.

[Figure 2: four panels (a–d) plotting the electrostatic interaction energy ΔE_int^el (kcal/mol) against r (Å), with T-El, C-C, D-C, Q-C, and O-C curves for Bz, PMP, and 3MI]

Fig. 2 Plots of total electrostatic interaction energies (T-El) and their multipolar components as a function of r (interaction distance) for the complexes of Mg²⁺ and Ca²⁺ ions with benzene (Bz), p-methylphenol (PMP), and 3-methylindole (3MI). Panels (a) and (b) represent the curves for the Mg²⁺ complexes, while panels (c) and (d) are for the Ca²⁺ ion complexes. In the figures, C-C, D-C, Q-C, and O-C represent the multipolar electrostatic interaction energy components (C-C charge-charge, D-C dipole-charge, Q-C quadrupole-charge, O-C octopole-charge) (Reproduced from Kadlubanski et al. 2013, with the kind permission of the American Chemical Society 2013)

The curves representing the electrostatic components of the cation-π complexes of the NH₄⁺ ion show an altogether different nature (Fig. 3). The D-C components are bound around the equilibrium interaction distances (r) (Kadlubanski et al. 2013). The rest of the components around the equilibrium r behave in such a way that the T-El curves are dispersive, although such components sometimes show bound character at larger r (not of much significance here). The differences in the nature of the curves of the NH₄⁺···aromatic complexes (Fig. 3) with respect to those of the metal ion complexes could be due to the anisotropic potential expansion of the NH₄⁺ ion. Similar studies were carried out using the dihydrated species of the metal cations (Kadlubanski et al. 2013). Although dihydrated metal ions are expected to have some anisotropic character in this respect, the curves showed that it is not strong enough to make much difference with respect to the non-hydrated cases (where the metal ions are strictly isotropic). Thus it is evident from these curves that the multipolar components of electrostatic interactions play a major role in the formation as well as in controlling the nature of interactions in the cation-π complexes. The rationale of this discussion is that although the dipole moment is a common descriptor in QSAR studies, the higher moments should also be treated as effective descriptors in many important biological phenomena.

[Figure 3: three panels (a–c) plotting the electrostatic interaction energy ΔE_int^el (kcal/mol) against r (Å), with T-El, C-C, D-C, Q-C, and O-C curves in each panel]

Fig. 3 Plots of total electrostatic interaction energies (T-El) and their multipolar components as a function of interaction distances for the complexes of the NH₄⁺ ion with benzene (Bz) (panel a), p-methylphenol (PMP) (panel b), and 3-methylindole (3MI) (panel c). See Fig. 2 for the definition of the curves representing the cases of C-C, D-C, Q-C, and O-C (Reproduced from Kadlubanski et al. 2013, with the kind permission of the American Chemical Society 2013)

The quadrupole moment gives a rough approximation of the molecular volume, but this volume parameter can also be observed through the effect of an electric field on the dipole moment (induced dipole moment) of a molecule. Generally, the induced dipole moment $\mu$ could be written as:

\[ \mu = \mu_0 + \alpha_{ij} V_j + \frac{1}{2}\beta_{ijk} V_j V_k + \frac{1}{6}\gamma_{ijkl} V_j V_k V_l + \cdots \tag{4} \]

The terms $\mu_0$ and $\alpha$ denote the dipole moment and polarizability, and $\beta$ and $\gamma$ are the higher-order polarizabilities. $V_j$, $V_k$, ... are the components of the applied field V. The terms $\beta$ and $\gamma$ are not known to make contributions in QSAR studies, but the term $\alpha$ is an important descriptor. There are two such descriptors, the isotropic and anisotropic $\alpha$, and they could be written as (Martin et al. 1979):

\[ \alpha_{\mathrm{isotropic}} = \frac{1}{3}\left( \alpha_{xx} + \alpha_{yy} + \alpha_{zz} \right) \tag{5} \]

and

\[ \alpha_{\mathrm{anisotropic}} = \frac{1}{\sqrt{2}} \left[ \left( \alpha_{xx} - \alpha_{yy} \right)^2 + \left( \alpha_{xx} - \alpha_{zz} \right)^2 + \left( \alpha_{yy} - \alpha_{zz} \right)^2 + 6\left( \alpha_{xy}^2 + \alpha_{xz}^2 + \alpha_{yz}^2 \right) \right]^{1/2} \tag{6} \]
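Equations (5) and (6) reduce a 3×3 polarizability tensor to two scalar descriptors. The short sketch below shows the arithmetic. The tensor values are made up for illustration and the function name `polarizability_descriptors` is hypothetical; in practice the tensor would come from a quantum chemical calculation.

```python
import numpy as np

def polarizability_descriptors(alpha):
    """Return (isotropic, anisotropic) polarizability following Eqs. (5) and (6)."""
    a = np.asarray(alpha, dtype=float)
    iso = (a[0, 0] + a[1, 1] + a[2, 2]) / 3.0
    diagonal_part = ((a[0, 0] - a[1, 1]) ** 2
                     + (a[0, 0] - a[2, 2]) ** 2
                     + (a[1, 1] - a[2, 2]) ** 2)
    off_diagonal_part = 6.0 * (a[0, 1] ** 2 + a[0, 2] ** 2 + a[1, 2] ** 2)
    aniso = np.sqrt(diagonal_part + off_diagonal_part) / np.sqrt(2.0)
    return iso, aniso

# Made-up symmetric tensor in Å^3, for illustration only.
alpha_example = np.diag([1.0, 2.0, 3.0])
iso, aniso = polarizability_descriptors(alpha_example)
print(f"isotropic = {iso:.4f} Å^3, anisotropic = {aniso:.4f} Å^3")

# A perfectly spherical response has zero anisotropy.
iso_sphere, aniso_sphere = polarizability_descriptors(2.0 * np.eye(3))
```

For a diagonal tensor with elements 1, 2, and 3 the isotropic value is 2 and the anisotropic value is √3, which is easy to check by hand against Eq. (6).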

The $\alpha_{\mathrm{isotropic}}$ is in a sense related to the molar volume (as it is expressed in terms of Å³) and has been found to be related to hydrophobicity and thus to other related biological properties (Hansch and Coats 1970). Furthermore, the electronic polarizability of molecules shares common features with the electrophilic superdelocalizability (Lewis 1987). This parameter also contains information regarding the inductive effects in a molecule (Gaudio et al. 1994). The anisotropic term is also expressed in volume units (Å³) and characterizes the properties of a molecule as an electron acceptor (Cartier and Rivail 1987).

Total and Orbital Energies and Related Properties

The total energy (E) of a molecule is generally used to account for several important molecular properties, which include the ionization potential (IP), electron affinity (EA), protonation energy, etc. The thermochemistry of a system is also important, and the related parameters are computed in quantum chemical calculations as part of vibrational frequency calculations, as the partition functions needed for Gibbs free energy (G), enthalpy (H), and entropy (S) calculations are related to the molecular vibrations. The changes of these thermodynamic parameters, viz., ΔG, ΔH, and ΔS, are important in accounting for experimental interaction energies between two molecular systems, for heats of reaction, and for calculating reaction rates. The last of these is probably not that important in QSAR, but the other two are quite important. The IP and EA of a molecule are computed as the difference in energy between the neutral and the corresponding cationic and anionic species, respectively. The HOMO ($E_{\mathrm{HOMO}}$) and LUMO ($E_{\mathrm{LUMO}}$) energies also approximate the IP and EA of a molecule (Koopmans' theorem), but the accuracies of such results are very low and they are no longer used for this purpose.

The important application of $E_{\mathrm{HOMO}}$ and $E_{\mathrm{LUMO}}$ in accounting for the reactivity of molecules has already been discussed in connection with frontier electron densities. The concepts of hardness and softness, however, are considered more reliable descriptors for reactivities, and they are also derived through $E_{\mathrm{HOMO}}$ and $E_{\mathrm{LUMO}}$. Parr and Pearson provided the analytical definitions of hardness ($\eta$) and softness (S) as (Parr and Pearson 1983):

\[ \eta = \frac{1}{2}\left( \frac{\partial^2 E}{\partial N^2} \right)_{v(r)} = \frac{1}{2}\left( \frac{\partial \mu}{\partial N} \right)_{v(r)} \tag{7} \]

\[ S = \frac{1}{2\eta} = \left( \frac{\partial N}{\partial \mu} \right)_{v(r)} \tag{8} \]

where E is the total energy, N is the number of electrons of the chemical species, $\mu$ represents the chemical potential, which is identified as the negative of the electronegativity (Parr and Pearson 1983), and v(r) is the external electrostatic potential felt by an electron at r due to the nuclei. The operational forms of $\eta$ and S could be written in terms of IP and EA as (Parr and Pearson 1983; Roy et al. 1998):

\[ \eta = \frac{IP - EA}{2}; \qquad S = \frac{1}{IP - EA} \tag{9} \]
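The operational forms in Eq. (9) can be evaluated directly once IP and EA are obtained as total-energy differences between the neutral, cationic, and anionic species, as described earlier in this section. The following is a minimal sketch of that arithmetic. The three total energies are invented numbers in eV, and the function names are hypothetical.

```python
def ip_ea_from_total_energies(e_neutral, e_cation, e_anion):
    """IP and EA as total-energy differences (same energy unit for all inputs)."""
    ip = e_cation - e_neutral      # energy needed to remove one electron
    ea = e_neutral - e_anion       # energy released on adding one electron
    return ip, ea

def hardness(ip, ea):
    """Eq. (9): eta = (IP - EA) / 2."""
    return 0.5 * (ip - ea)

def softness(ip, ea):
    """Eq. (9): S = 1 / (IP - EA), i.e. 1 / (2 * eta)."""
    return 1.0 / (ip - ea)

# Invented total energies in eV, for illustration only.
ip, ea = ip_ea_from_total_energies(e_neutral=-100.0, e_cation=-90.0, e_anion=-101.0)
eta = hardness(ip, ea)
S = softness(ip, ea)
print(f"IP = {ip:.2f} eV, EA = {ea:.2f} eV, eta = {eta:.2f} eV, S = {S:.4f} 1/eV")
```

With these numbers IP is 10 eV and EA is 1 eV, so the hardness is 4.5 eV and the softness is 1/9 eV⁻¹, consistent with S = 1/(2η) in Eq. (8).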

These definitions of hardness and softness lead to the hard and soft acid-base theory, which is quite useful for explaining chemical reactions. According to this principle, all other factors being equal, hard bases prefer to react with hard acids and soft bases prefer to react with soft acids, and atomic and molecular systems were classified as hard and soft systems based on Eq. (9). The concept was further extended to explain site-specific nucleophilic, electrophilic, and radical reactions through the development of the concepts of local softness (s(r)), local hardness ($\eta_D(r)$), and Fukui functions (f(r)). The local softness s(r) is defined as (Roy et al. 1998):

\[ s(r) = f(r)\, S = \left( \frac{\partial \mu}{\partial v(r)} \right)_{N} S \tag{10} \]

Three types of f(r) can be defined, which when multiplied by S generate three types of local softness, viz., $s_k^{+}$, $s_k^{-}$, and $s_k^{0}$, for nucleophilic, electrophilic, and radical attack, respectively, on the atom center k of a molecule:

\[ \begin{aligned} s_k^{+} &= \left[ \rho_k(N_0 + 1) - \rho_k(N_0) \right] S \\ s_k^{-} &= \left[ \rho_k(N_0) - \rho_k(N_0 - 1) \right] S \\ s_k^{0} &= \tfrac{1}{2}\left[ \rho_k(N_0 + 1) - \rho_k(N_0 - 1) \right] S \end{aligned} \tag{11} \]

$\rho_k(N_0)$ represents the electronic population on atom k for the $N_0$-electron system. The local hardness ($\tilde{\eta}(r)$) is defined using the electron density $\rho(r)$ as:

\[ \tilde{\eta}(r) = \left( \frac{\partial \mu}{\partial \rho(r)} \right)_{v(r)} \tag{12} \]

The local hardness is usually derived in a working form ($\eta_D(r)$) from this formula using the Thomas-Fermi-Dirac (TFD) approach to DFT and the electron density $\rho$ as:

\[ \eta_D(r) = \frac{V_{el}(r)}{2N} \tag{13} \]

where $V_{el}(r)$ is the electronic part of the electrostatic potential. The electron density $\rho$ is embedded in the explicit expression of $\eta_D(r)$ (Roy et al. 1998) and is not shown here.
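The condensed local softness values of Eq. (11) need only the atomic electron populations of the $N_0$, $N_0 + 1$, and $N_0 - 1$ electron systems and the global softness S. The sketch below works through the arithmetic for a hypothetical three-atom molecule. The populations, the value of S, and the function name `condensed_local_softness` are assumptions made for illustration. In practice the populations would come from a population analysis of three separate calculations at the same geometry.

```python
import numpy as np

def condensed_local_softness(pop_n, pop_n_plus_1, pop_n_minus_1, S):
    """Eq. (11): s_k^+, s_k^-, s_k^0 from atomic electron populations rho_k."""
    pop_n = np.asarray(pop_n, dtype=float)
    pop_plus = np.asarray(pop_n_plus_1, dtype=float)
    pop_minus = np.asarray(pop_n_minus_1, dtype=float)
    s_plus = (pop_plus - pop_n) * S                 # susceptibility to nucleophilic attack
    s_minus = (pop_n - pop_minus) * S               # susceptibility to electrophilic attack
    s_zero = 0.5 * (pop_plus - pop_minus) * S       # susceptibility to radical attack
    return s_plus, s_minus, s_zero

# Hypothetical populations for a three-atom molecule (8, 9, and 7 electrons in total).
rho_N = [3.0, 4.0, 1.0]
rho_N_plus_1 = [3.5, 4.3, 1.2]
rho_N_minus_1 = [2.6, 3.9, 0.5]
S_global = 0.2                                      # assumed global softness

s_plus, s_minus, s_zero = condensed_local_softness(
    rho_N, rho_N_plus_1, rho_N_minus_1, S_global)
print("s+ :", np.round(s_plus, 3))
print("s- :", np.round(s_minus, 3))
print("s0 :", np.round(s_zero, 3))
print("preferred site for nucleophilic attack:", int(np.argmax(s_plus)))
print("preferred site for electrophilic attack:", int(np.argmax(s_minus)))
```

Because the populations of neighboring charge states differ by exactly one electron in total, each set of condensed values sums to the global softness, which serves as a quick consistency check on the input populations.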

The applicability of the hard and soft acid-base (HSAB) principle together with its local counterparts (local hardness and softness) has been discussed in detail by Roos et al. (2009). The authors discussed the use of these parameters in enzyme catalysis, including phosphoenolpyruvate (the cofactor of several syntheses), the disulfide cascade mechanism in arsenate reductase, the preferential attack of Cys10 on dianionic arsenate, the mixed disulfide dissociation mechanism of thioredoxin, and hydride transfer in flavoenzymes. The local softness and electrophilicity parameters were found to be quite effective in explaining the site-specific reactivity involved in these complex reactions. QSAR combined with DFT concepts has been applied to histone deacetylase (HDAC) inhibitors, as HDACs are very promising targets for anticancer drugs. The active site of HDAC contains a Zn²⁺ ion (Fig. 4). The general QSAR analysis in this respect was found to be qualitative in nature (Chattaraj et al. 2006). The introduction of the electrophilicity concept in QSAR (Chattaraj et al. 2006) increased the interpretative character, and chemical hardness was found to be a useful descriptor in this respect, as the hardness of the zinc binding group of HDAC correlated well with the calculated relative interaction energies of these inhibitors (Roos et al. 2009). In a recent review (Schwöbel et al. 2011), the importance of chemical hardness descriptors has been discussed for electrophilic reactivity in toxicity prediction. An electrophilic index $\omega_{el}$ ($= \mu^2/2\eta$) has been proposed as an important descriptor for the QSAR analysis of such reactions.

[Figure 4 labels: His180, Asp178, Asp267, Zn²⁺, HDAC inhibitor]

Fig. 4 Ribbon diagram of the active site of human histone deacetylase (HDAC8) (PDB: 1W22) with a bound hydroxamic acid inhibitor (Reproduced from Roos et al. 2009, with the kind permission of the American Chemical Society 2009)
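The electrophilic index $\omega_{el} = \mu^2/2\eta$ can be estimated from IP and EA alone. The sketch below assumes the common finite-difference approximation $\mu \approx -(IP + EA)/2$ for the chemical potential (the negative of the Mulliken electronegativity). The text above does not state this approximation explicitly, so it is an assumption of the example, as are the compound names and the IP and EA values.

```python
def electrophilicity_index(ip, ea):
    """omega_el = mu^2 / (2 * eta), with mu ~ -(IP + EA)/2 and eta = (IP - EA)/2."""
    mu = -0.5 * (ip + ea)          # assumed finite-difference chemical potential
    eta = 0.5 * (ip - ea)          # hardness, as in Eq. (9)
    return mu ** 2 / (2.0 * eta)

# Hypothetical compounds with invented (IP, EA) pairs in eV.
compounds = {
    "compound_A": (10.0, 1.0),
    "compound_B": (9.0, 2.0),
    "compound_C": (11.0, 0.5),
}
omega = {name: electrophilicity_index(ip, ea) for name, (ip, ea) in compounds.items()}

# Rank from most to least electrophilic.
ranking = sorted(omega, key=omega.get, reverse=True)
for name in ranking:
    print(f"{name}: omega_el = {omega[name]:.3f} eV")
```

A ranking of this kind is how such an index would typically enter a QSAR model, as one column of a descriptor table alongside charges and polarizabilities.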

Hardness and electronegativity parameters have been further used to develop descriptors for hydrogen bond basicity and acidity for applications in QSAR. Two theoretical scales quantifying the hydrogen bond ability of compounds have been proposed: hydrogen bond basicity, B*, and hydrogen bond acidity, A* (Oliferenko et al. 2004). Both scales are based on clear physical considerations and use molecular topology and orbital valence-state energetic characteristics as parameters. These parameters are quantitatively defined, including the geometric features originating from molecular graphs. The total hydrogen bond basicity B* is defined as:

$$B^{*} = \frac{\chi_{lp}}{\eta_{lp}\, n^{*}} + \sum_{i}^{D} \frac{\chi_{\sigma i} - \chi^{ref}_{1}}{\eta_{i}\, d_{i}^{2}\, n^{*}} \tag{14}$$

where $\chi_{lp}$ and $\eta_{lp}$ are the lone pair electronegativity and the corresponding hardness, n* is the effective principal quantum number, $\chi^{ref}_{1}$ is the reference electronegativity of the basic center, $\chi_{\sigma i}$ and $\eta_{i}$ represent the electronegativity and hardness of the σ-orbitals, and $d_i$ is the topological distance for the ith substituent atom. The summation runs over all atoms positioned at particular topological distances from the basic center, up to D, where D is the diameter (the shortest maximum length) of the corresponding molecular graph. The total hydrogen bond acidity A* is defined using a similar set of parameters in the following way:

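$$A^{*} = \frac{\chi_{\sigma i} - \chi^{ref}_{1}}{\eta_{i}\, D_{1H}^{2}} + \sum_{i=2}^{N} \frac{\chi_{\sigma i} - \chi^{ref}_{i}}{\eta_{i}\, d_{iH}^{2}} \tag{15}$$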

where $\chi_{\sigma i}$ and $\eta_{i}$ are the electronegativity and hardness of the sigma orbital of the acidic center and the ith atom, respectively, $\chi^{ref}_{i}$ is the reference electronegativity of the ith atom, $D_{1H}$ is the geometric distance between the hydrogen and the nearest atom, and $d_{iH}$ is the topological distance between the ith atom and the hydrogen atom. These new HB descriptors were found to significantly improve the performance of QSAR software and to be useful in molecular modeling and drug design. The latter becomes possible if one considers the $A_i^{*}$ and $B_i^{*}$ values of each atom in the molecule rather than the overall molecular features. Two sets of such atomic values, $\{A_i^{*}\}$ and $\{B_i^{*}\}$, define two scalar potential fields of hydrogen bonding, which can be used as molecular fields in comparative molecular field analysis (CoMFA) and/or as scoring functions in ligand docking.

Molecular Surface-Related Properties

Molecular Electrostatic Potential

Molecular complementarity plays a central role in the molecular recognition process, and the molecular electrostatic potential (MEP) is one of the oldest known properties of molecular systems used to interpret such interactions. Since the introduction of the MEP by Scrocco and Tomasi (1973), different investigators have applied this property to predict biological recognition processes, and it has been shown that an MEP profile leads to direct inferences about the nature of the corresponding binding site (Scrocco et al. 1973) and about the interaction between these sites and an approaching molecule. The MEP of a molecule, $V(\vec{r})$, at a point $\vec{r}$ is defined as:

$$V(\vec{r}) = \sum_{A} \frac{Z_A}{\left|\vec{R}_A - \vec{r}\right|} - \int \frac{\rho(\vec{r}_1)}{\left|\vec{r}_1 - \vec{r}\right|}\, d\vec{r}_1 \tag{16}$$

The MEP has been used to study the molecular recognition of drug molecules and to investigate the nature of long-range interactions with the corresponding drug-binding/receptor sites (Scrocco and Tomasi 1978). This approach was used to account for the binding properties of the neurotransmitter γ-aminobutyric acid (GABA) (Guha et al. 1992), cardiotonic drugs (Bhattacharjee et al. 1992), various nucleic acid bases (Weiner et al. 1982), and various other molecular systems (Scrocco and Tomasi 1978; Weiner et al. 1982). In recent times, MEP analysis has been found to be quite useful for interpreting the difference in the nature of binding of acetylcholine (Ach) and various toxic nerve agents to the acetylcholinesterase (ACHE) binding sites (Majumdar et al. 2006). A typical example in this respect is the interaction of Ach and the R- and S-sarin enantiomers with the ACHE binding site (Fig. 5).

ACHE(I) and ACHE(II) represent two forms of the ACHE active cavity that arise from the two different orientations of the tautomeric His-440 moiety. The interactions in the ACHE(I)···Ach and ACHE(II)···Ach complexes (Fig. 5a, b) are dominated by the two-body electrostatic interactions between the Tyr-121 residue and Ach. The orientation of the -N+(CH3)3 group over the aromatic ring is ideal for a cation-π interaction. However, it cannot be classified as purely cation-π, as the MEP maps of Fig. 5a, b show that there is a possibility of electrostatic interactions of the -N+(CH3)3 electron density with the side chain and the -OH group of the Tyr-121 residue. This electrostatic effect, together with the dispersion interactions, makes this interaction more attractive than the normal cation-π interaction. The electrostatic interactions with the rest of the fragments are repulsive.


Fig. 5 Calculated MEP on the isodensity surfaces of the (a) ACHE(I)···Ach, (b) ACHE(II)···Ach, (c) ACHE(I)···(S)-sarin, and (d) ACHE(II)···(S)-sarin complexes. The colored regions on the surfaces are deep blue (highly positive, >0.1 au), light blue (between 0.05 and 0.1 au), green (0.0 au), yellow (between −0.1 and −0.05 au), and red (<−0.1 au). The minimum negative MEP values of the substrate are within the range −20 to −10 kcal/mol (greenish yellow region), while it is around −50 kcal/mol (reddish yellow region) for the inhibitor (Reproduced from Majumdar et al. 2006, with the kind permission of the American Chemical Society 2006)

The MEP map of the ACHE(I)···(S)-sarin complex (Fig. 5c) indicates strong hydrogen bonding interactions with the Ala-201, Ser-200, Glu-199 (X1), His-440, and Glu-327 (X3) fragments of ACHE (Majumdar et al. 2006). However, these are not pure hydrogen bonding interactions. The MEP isopotential surface of the ACHE(I)···(S)-sarin complex shows that attractive interactions are also possible from the other parts of the X1-Ach and X3-Ach fragments. These effects make the two-body interactions more attractive than an ordinary hydrogen bond. The nature of the interactions is totally changed in the ACHE(II)···(S)-sarin complex (Fig. 5d). Since the interaction of the P=O group of (S)-sarin with the -N-H of the His-440 residue is lost in this case, the complex becomes a weakly bound system. The MEP map further shows that the electrostatic terms are quite important.

The MEP properties can be used as QSAR descriptors, and important developments in this direction were made by Carbó and coworkers (1980) and later by Good and Richards (1996). These developments are concerned with molecular similarity and are used in 3D-QSAR modeling. The Carbó index for molecular similarity ($C_{AB}$) is defined as:

$$C_{AB} = \frac{\int P_A P_B \, dv}{\left(\int P_A^{2} \, dv\right)^{1/2} \left(\int P_B^{2} \, dv\right)^{1/2}} \tag{17}$$

where $P_A$ and $P_B$ are electron density dependent structural properties. The numerator measures property overlap, while the denominator normalizes the similarity result. Initially, electron density was used as the structural property P, and later the MEP was used to evaluate P (Eq. 18).

$$P_r = \sum_{i=1}^{n} \frac{q_i}{(r - R_i)} \tag{18}$$

where $P_r$ is the electrostatic potential at point r associated with the charge $q_i$ of the ith atom, and the summation runs over all n atoms. $R_i$ is the nuclear coordinate of atom i. In order to avoid singularities at the atomic nuclei, where $1/(r - R_i)$ tends to infinity, the MEP is normally determined outside the van der Waals (VDW) volume of the molecules in the calculation. The Carbó index is sensitive to the shape of the property distribution rather than to its magnitude. Hodgkin and Richards (Good and Richards 1996) introduced the following modified index ($H_{AB}$) to increase the sensitivity of the formula to property magnitude:

$$H_{AB} = \frac{2 \int P_A P_B \, dv}{\int P_A^{2} \, dv + \int P_B^{2} \, dv} \tag{19}$$
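The different behavior of the Carbó and Hodgkin indices can be illustrated with a small numerical sketch. It assumes that the integrals of Eqs. 17 and 19 are replaced by sums over equal-volume grid points (so that dv cancels) and that the property P is the point-charge potential of Eq. 18, evaluated with the distance between the grid point and the atom. The two point-charge "molecules", the grid points, and all function names are hypothetical.

```python
import math


def point_charge_potential(point, atoms):
    """Point-charge potential at `point`; atoms are (charge, (x, y, z)) pairs."""
    return sum(q / math.dist(point, pos) for q, pos in atoms)


def carbo_index(pa, pb):
    """Discrete Carbo index: overlap normalized by the geometric mean of self-overlaps."""
    overlap = sum(a * b for a, b in zip(pa, pb))
    norm = math.sqrt(sum(a * a for a in pa)) * math.sqrt(sum(b * b for b in pb))
    return overlap / norm


def hodgkin_index(pa, pb):
    """Discrete Hodgkin index: overlap normalized by the arithmetic mean of self-overlaps."""
    overlap = sum(a * b for a, b in zip(pa, pb))
    return 2.0 * overlap / (sum(a * a for a in pa) + sum(b * b for b in pb))


# Hypothetical models: B has the same geometry as A but doubled charges.
atoms_a = [(+0.5, (0.0, 0.0, 0.0)), (-0.5, (1.0, 0.0, 0.0))]
atoms_b = [(2.0 * q, pos) for q, pos in atoms_a]

# Hypothetical grid points, assumed to lie outside the van der Waals volume.
grid = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (-2.0, 1.0, 0.0)]

pa = [point_charge_potential(p, atoms_a) for p in grid]
pb = [point_charge_potential(p, atoms_b) for p in grid]

print("Carbo index:  ", carbo_index(pa, pb))
print("Hodgkin index:", hodgkin_index(pa, pb))
```

Because the two potentials differ only by a scale factor, the Carbó index reports essentially perfect similarity, whereas the Hodgkin index is lowered by the difference in magnitude.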

When considering receptor-drug binding, the electrostatic interaction energy of the system is proportional to the electrostatic potential exerted by the drug. The drug molecule similarity is related to the difference in binding energies (ignoring entropy and solvent effects), and this disparity is proportional to the difference in drug electrostatic potentials. Richards et al. (Good and Richards 1996) further developed two similarity indices based on this premise. They are known as the exponential ($E_{AB}$, Eq. 20) and linear ($L_{AB}$, Eq. 21) indices.

$$E_{AB} = \frac{\sum_{i=1}^{n} \exp\left(-\frac{|P_A - P_B|}{\max(|P_A|, |P_B|)}\right)}{n} \tag{20}$$


$$L_{AB} = \frac{\sum_{i=1}^{n} \left(1 - \frac{|P_A - P_B|}{\max(|P_A|, |P_B|)}\right)}{n} \tag{21}$$

The term $\max(|P_A|, |P_B|)$ ($= P_{max}$) equals the larger MEP magnitude between $P_A$ and $P_B$ at the grid point where similarity is being calculated. Although the $C_{AB}$ indices were used in QSAR studies, they were later replaced by the indices developed by Richards and coworkers. These were used to study dopamine agonistic properties, hypoglycemic agents, and np-apomorphine molecules, and were employed in QSAR studies of steroids (Good and Richards 1996). These 3D-QSAR indices were also extended to graphical structure-activity analysis using neural networks (Good and Richards 1996) and to QSAR models from partial least squares (PLS) studies (Good and Richards 1996). Klebe et al. (1994) further developed the MEP-based comparative molecular similarity indices analysis technique for 3D-QSAR studies. The method was used to examine the correlations between calculated physicochemical properties and in vitro activities of a series of human immunodeficiency virus type 1 (HIV-1) integrase inhibitors (Makhija and Kulkarni 2001).
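The exponential and linear indices of Eqs. 20 and 21 are simple averages over grid points and can be sketched as follows. The function names and the sample potentials are hypothetical, and the treatment of grid points where both potentials vanish (counted as identical) is an assumption, since the equations leave that case undefined.

```python
import math


def _relative_difference(a, b):
    """|a - b| / max(|a|, |b|), taken as 0 when both values are zero (assumption)."""
    p_max = max(abs(a), abs(b))
    return abs(a - b) / p_max if p_max > 0.0 else 0.0


def exponential_index(pa, pb):
    """Average of exp(-relative difference) over all grid points (Eq. 20)."""
    return sum(math.exp(-_relative_difference(a, b)) for a, b in zip(pa, pb)) / len(pa)


def linear_index(pa, pb):
    """Average of (1 - relative difference) over all grid points (Eq. 21)."""
    return sum(1.0 - _relative_difference(a, b) for a, b in zip(pa, pb)) / len(pa)


# Hypothetical MEP values of molecules A and B at two grid points.
pa = [1.0, -2.0]
pb = [1.0, -1.0]

print("E_AB =", exponential_index(pa, pb))
print("L_AB =", linear_index(pa, pb))
```

In this example the first grid point contributes full similarity to both indices, while the second contributes exp(−0.5) to the exponential index and 0.5 to the linear index.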

More advanced studies of the MEP properties of molecules have been directed to macromolecular systems. In recent times, average electrostatic potentials ($\xi_k$) have been computed using a Markov model to study the indirect interaction between amino acids placed at a topological distance (k) within a given protein backbone. A specific example in this context is the Arc repressor, a model protein of relevance for biochemical studies in bioorganic and medicinal chemistry. The approach was used to model the effect of alanine scanning on thermal stability (González-Díaz and Uriarte 2005). The potential $\xi_k$ is defined as:

$$\xi_k = \sum_{j=1}^{n} p^{A}_{k}(j) \cdot V(j) \tag{22}$$

The potential $\xi_k$, in this definition, depends on the absolute probability $p^{A}_{k}(j)$ with which the amino acids interact with other amino acids placed at a distance k. The potential also depends on the initial unperturbed MEP V(j) of the amino acid. The detailed mathematical analysis is available in González-Díaz and Uriarte (2005) and is not elaborated further here. Recently Murray, Politzer, and coworkers (Murray et al. 2014, 2015) have designed reactivity indices of molecules based on their MEP surface properties and have applied them to account for several interactions, but applications of such indices as descriptors in QSAR are not yet available.
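A minimal sketch of the averaging in Eq. 22 is given below. It assumes that the absolute probabilities at distance k are obtained by propagating an initial probability vector k times through a row-stochastic transition matrix; this is one common way to build such a Markov model, not necessarily the exact scheme of the cited work. The matrix, the initial vector, the potentials, and the function name are hypothetical.

```python
def markov_average_potential(p0, transition, potentials, k):
    """Return sum_j p_k(j) * V(j), where p_k = p0 * transition^k (assumed model)."""
    probs = list(p0)
    n = len(probs)
    for _ in range(k):
        # One Markov step: new probability of state j from all states i.
        probs = [sum(probs[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return sum(p * v for p, v in zip(probs, potentials))


# Hypothetical two-residue system.
p0 = [1.0, 0.0]                      # initial absolute probabilities
transition = [[0.5, 0.5],            # row-stochastic matrix: each row sums to 1
              [0.2, 0.8]]
potentials = [1.0, 3.0]              # unperturbed potentials V(j), arbitrary units

for k in range(3):
    print(k, markov_average_potential(p0, transition, potentials, k))
```

Each value of k yields one descriptor, so a protein is represented by a short vector of average potentials that can be fed into a QSAR model.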

Atoms in Molecule (AIM) Theory Based Descriptors

A 3D molecular structure representation based on Bader's quantum topological Atoms in Molecules (AIM) theory has been developed for use in QSAR by Alsberg et al. (2000). Critical points located on the electron density distribution of the molecules are central to this structure representation using quantum topology (StruQT). Other gradient fields, such as the Laplacian of the electron density distribution, can also be used. The type of critical point of particular interest is the bond critical point (BCP), which is characterized by three parameters, viz., the electron density ($\rho$), the Laplacian ($\nabla^2\rho$), and the ellipticity ($\varepsilon$). This representation has the advantage that there is no need to probe a large number of lattice points in 3D space to capture the important parts of the 3D electronic structure, as is necessary in, e.g., comparative molecular field analysis (CoMFA). The details of the AIM parameters are well documented (Popelier 2000) and are not discussed here within the short span of this review. Interested readers are advised to consult the cited references for details.
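The three BCP parameters can be assembled from the density value and the eigenvalues of the density Hessian at the critical point. The sketch below uses the standard AIM relations (the Laplacian is the sum of the three curvatures; the ellipticity is the ratio of the two negative curvatures minus one). The function name and the numerical values are hypothetical and serve only to show the bookkeeping.

```python
def bcp_descriptors(rho, hessian_eigenvalues):
    """Return (rho, Laplacian, ellipticity) for a bond critical point.

    A BCP has two negative curvatures (lam1 <= lam2 < 0) and one positive
    curvature (lam3 > 0) of the electron density.
    """
    lam1, lam2, lam3 = sorted(hessian_eigenvalues)
    if not (lam1 < 0.0 and lam2 < 0.0 and lam3 > 0.0):
        raise ValueError("eigenvalues do not describe a bond critical point")
    laplacian = lam1 + lam2 + lam3          # trace of the Hessian
    ellipticity = lam1 / lam2 - 1.0         # zero for a cylindrically symmetric bond
    return rho, laplacian, ellipticity


# Hypothetical values in atomic units.
rho, laplacian, ellipticity = bcp_descriptors(0.30, [-0.5, 0.3, -0.6])
print(rho, laplacian, ellipticity)
```

One such triple per bond gives a compact descriptor vector for each molecule, which is what makes the representation attractive compared with lattice-based fields.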

Alsberg et al. (2000) used the AIM-based structure representation to compute the wavelength of the lowest UV transition for a system of 18 anthocyanidins. Different QSAR models were also constructed using several chemometric/machine learning methods, such as standard partial least squares regression (PLS), truncated PLS variable selection, genetic algorithm-based variable selection, and genetic programming (GP). These models identified bonds that took part in either decreasing or increasing the dominant excitation wavelength (Alsberg et al. 2000). The models also correctly emphasized the involvement of the conjugated π system in predicting the wavelength by flagging the BCP ellipticity parameters as important for this particular data set. One advantage of the AIM formalism is that it can be used to compute various electrostatic moment terms (up to the octopole moment), and the MEP generated by the molecular electron distribution can be expanded using multipole moments to represent the potential accurately. AIM has the further advantage of generating transferable atomic contributions of the electrostatic potential to the total MEP of a molecule. This enables construction of potential maps of macromolecules (e.g., proteins) using the results for smaller fragments (Popelier 2000). The applications of the AIM formalism, although promising, are still at a developing stage and are not yet used on a regular basis.

Quantum Chemical Techniques as Source of Parameter Generations

The quantum chemical techniques available to compute the parameters described above are numerous, and many new techniques are still emerging to meet the challenge of breaking the molecular size barrier while retaining the accuracy of the parameters. The guide to the accuracy of the computed parameters lies in how well the techniques reproduce experimental observables (e.g., dipole moments, vibrational spectra, binding energies). In the early days of QSAR, mostly Hückel's method was used and the molecular systems were restricted to planar π-systems (Hansch 1969; Hansch and Coats 1970; Hansch et al. 1991). The development of various semiempirical techniques broke this barrier from the mid-1970s through the development of complete neglect of differential overlap (CNDO) (Jensen 1999), intermediate neglect of differential overlap (INDO) (Jensen 1999), and modified intermediate neglect of differential overlap (MINDO group of techniques) (Jensen 1999) of atomic orbitals in a molecule. These methods were derived from the Hartree-Fock (HF) equations through the approximations reflected in their names. The other variants, CNDO/S and ZINDO (Jensen 1999), were not used unless spectral parameters were needed in QSAR. The later developments of these semiempirical techniques, viz., modified neglect of diatomic differential overlap (MNDO), its AM1 version, and the PM3-PM6 (Jensen 1999) group of techniques, are still in use, especially PM6, for some specific cases. These methods were mostly derived from the earlier neglect of diatomic differential overlap (NDDO) technique (Jensen 1999). Most of the semiempirical techniques have not been considered state of the art since the early 1990s because of their inability to reproduce experimental observables within acceptable accuracy limits. The systematic error embedded in the approximations used in such techniques is the responsible factor, and the development of various HF theory-based ab initio methods from the mid-1960s started to replace the semiempirical computations.

The simple HF theory-based ab initio techniques were later improved to include electron correlation. These improvements include Møller-Plesset second-order perturbation theory (MP2) and its higher versions (MP3, MP4), configuration interaction (CI), and multiconfiguration self-consistent field (MCSCF) techniques; their variants at the multireference level; and various coupled cluster theory-based methods (viz., CCSD, CCSD(T), and their multireference models) (Jensen 1999). These are highly accurate methods for computing molecular properties for both the ground and excited states, but they are enormously time consuming and need large computer memories even for small molecules. Thus their use is not very welcome in QSAR, as such analysis needs a substantial molecular data set for a specific property analysis of mostly small and midsized molecules. Sometimes the molecular systems are quite large and the properties are not easily extractable through such higher-order techniques.

The compromise between accuracy and molecular size can be resolved through the use of density functional theory (DFT) (Jensen 1999). The Kohn-Sham variational principle for DFT is quite old (Kohn and Sham 1965), but the real computational use of DFT in molecular quantum chemistry started in the mid-1980s. At present it is one of the cheapest reliable techniques, and quite a large number of functionals have been designed (and are still under development) to resolve diverse molecular problems. DFT can be used for fairly large molecular systems with affordable computational time, and the technique is ideal for generating reliable parameters for QSAR. Several quantum chemical parameters for QSAR, discussed earlier, either have no semiempirical version or are not sufficiently accurate when computed semiempirically. The DFT technique is also very desirable in this context. The semiempirical techniques are still directly used for very large molecular systems or for systems where the use of a QM/MM (Jensen 1999) type of formalism is essential. The QM/MM technique, which divides a molecule/molecular system into two or three regions (called layers), e.g., in methods like ONIOM (Svensson et al. 1996), mostly uses a semiempirical formalism as the second layer in the total computational strategy. This layer distribution is, of course, flexible and depends on the level of ab initio technique chosen for the first layer and the accuracy of computations needed for the second and third layers. The final choice of quantum chemical technique in QSAR depends on the requirements of the model and, of course, on the user's intuition.

Molecular Mechanics and QSAR

Molecular mechanics (MM) as such is a tool for generating the structures of molecules using classical potentials, at a much lower cost than quantum chemical techniques. There are various forms of such potentials, and they are formulated through transferable parameters called force fields. The details of such potentials cannot be discussed in this short review, but the general views on them are well documented. Interested readers can refer to more recent books in this area (Schlick 2002; Warshel 1991). The most important aspect is that these potentials and the related force-field parameters are an essential step in designing strategies for large molecular structure simulations. The large structures are usually proteins, enzymes, DNA, and RNA-like molecules, and the simulation strategies involve classical molecular dynamics (MD) and Monte Carlo (MC) techniques. The literature on such calculation strategies, their use in large structure simulations, and simulations of free energy surfaces (involving phase transitions from one form to another, free energies of interaction with large and small molecular units, etc.) has been accumulating for the last 50 years with various levels of sophistication. It is out of scope to discuss such topics here, but they are well documented (Schlick 2002) for interested readers. MM calculations can also be used as an initial step to calculate the structures of larger species, like metal nanoclusters in a desired symmetry (Majumdar et al. 2012).

The other important aspect of MM calculations is that the methods are sometimes used as a cheap way to generate structure-dependent descriptors in QSAR analysis. Such descriptors can also be generated from other sources (Stanon et al. 2002), and we will wrap up this discussion by citing one important example of MM applications in this direction. The example is related to the generation of the surface area of a molecule from the van der Waals radii of atoms taken from force-field data (MMFF94 force field) after optimizing the geometry through MM calculations. For a given numerical property $P_i$ of each atom i in a molecule, the descriptor P_VSA(u,v) (VSA: van der Waals surface area) for a specific range [u, v) was defined using the sum of the atomic VSA contributions of each atom i as (Labute 2000):

$$P\_\mathrm{VSA}(u, v) = \sum_{i} V_i \, \delta\left(P_i \in [u, v)\right) \tag{23}$$

where $V_i$ is the atomic contribution of atom i to the VSA of the molecule. For a set of n descriptors associated with a property P,

$$P\_\mathrm{VSA}_k = \sum_{i} V_i \, \delta\left(P_i \in [a_{k-1}, a_k)\right), \qquad k = 1, 2, \ldots, n \tag{24}$$


where $a_0 < a_k < a_n$ are interval boundaries such that $[a_0, a_n]$ bounds all values of $P_i$ in any molecule. The mathematical expression shows that VSA-type descriptors correspond to a subdivision of the molecular surface area. The concept was used to construct QSAR/QSPR models for boiling point, vapor pressure, free energy of solvation in water, solubility in water, receptor class, and activity against thrombin, trypsin, and factor Xa (Labute 2000).
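The binning expressed by Eqs. 23 and 24 can be sketched in a few lines. The atomic property values (here imagined as partial charges), the atomic surface contributions, the interval boundaries, and the function names are hypothetical; in practice the per-atom surface areas would come from a force-field-based surface calculation.

```python
def p_vsa(values, areas, u, v):
    """Sum the surface contributions of atoms whose property lies in [u, v)."""
    return sum(area for p, area in zip(values, areas) if u <= p < v)


def p_vsa_descriptors(values, areas, boundaries):
    """Return one descriptor per interval [a_(k-1), a_k) of the sorted boundaries."""
    return [
        p_vsa(values, areas, boundaries[k - 1], boundaries[k])
        for k in range(1, len(boundaries))
    ]


# Hypothetical per-atom data for a four-atom fragment.
charges = [-0.4, 0.1, 0.3, -0.1]       # atomic property P_i
areas = [10.0, 5.0, 7.5, 12.5]         # atomic VSA contributions V_i
boundaries = [-0.5, 0.0, 0.5]          # a_0 < a_1 < a_2

descriptors = p_vsa_descriptors(charges, areas, boundaries)
print(descriptors)
print(sum(descriptors) == sum(areas))
```

As long as the boundaries enclose every atomic property value, the descriptors add up to the total surface area, which is the subdivision property noted in the text.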

Conclusions

Quantum chemical parameters are becoming increasingly useful in QSAR, as they are quite reliable: many key molecular observables are properly reproduced with respect to experiment. In the present review, we have discussed various quantum chemical parameters involving atomic charges in molecules; electrostatic moments and their implications; orbital-related properties like hardness and softness and their implications in reactivity; surface-related properties like the MEP; and the related important QSAR indices. AIM properties are also discussed, and examples have been cited regarding the use of all these parameters in QSAR to model drug-related molecular properties. The sources of such quantum chemical parameters are also discussed in terms of semiempirical and ab initio-related methods. DFT techniques have been suggested to be the most suitable ones after considering the pros and cons of these methods. The discussion of molecular mechanics (MM) also constitutes a part of the review; although MM is mostly used to construct molecular structures, the structural parameters generated through MM are also sometimes used in QSAR/QSPR-related modeling. Such possibilities are discussed in terms of van der Waals surface-related parameters. Various parameters for QSAR are still emerging through various quantum chemical/MM-related techniques, but ultimately it depends on the intuition of the user to generate meaningful structure-activity correlations for practical use in chemoinformatics.

Acknowledgments The authors acknowledge the support of NSF CREST (No.: HRD-0833178) grant. One of the authors (S.R.) acknowledges the financial support by a statutory activity subsidy from Polish Ministry of Science and Technology of Higher Education for the Faculty of Chemistry of Wroclaw University of Science and Technology and NCN grant no UMO-2013/09/B/ST4/00097.

Bibliography

Alsberg, B. K., Marchand-Geneste, N., & King, R. D. (2000). A new 3D molecular structure representation using quantum topology with application to structure-property relationships. Chemometrics and Intelligent Laboratory Systems, 54, 75.

Alves, C. N., Pinheiro, J. C., Camargo, A. J., Ferreira, M. M. C., & da Silva, A. B. F. (2000). A structure-activity relationship study of HEPT-analog compounds with anti-HIV activity. Journal of Molecular Structure (THEOCHEM), 530, 39.

Bayly, C. I., Cieplak, P., Cornell, W., & Kollman, P. A. (1993). A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: The RESP model. The Journal of Physical Chemistry, 97, 10269.


Bhat, S., Sulea, T., & Purisima, E. O. (2006). Coupled atomic charge selectivity for optimal ligand-charge distributions at protein binding sites. Journal of Computational Chemistry, 27, 1899.

Bhattacharjee, A. K., Majumdar, D., & Guha, S. (1992). Theoretical studies on the conformational properties and pharmacophoric pattern of several bipyridine cardiotonics. Journal of the Chemical Society, Perkin Transactions 2, 805.

Brown, F. K. (1998). Chemoinformatics: What is it and how does it impact drug discovery. Annual Reports in Medicinal Chemistry, 33, 375.

Carbó, R., Leyda, L., & Arnau, M. (1980). How similar is a molecule to another? An electron density measure of similarity between two molecular structures. International Journal of Quantum Chemistry, 17, 1185.

Cartier, A., & Rivail, J. L. (1987). Electronic descriptors in quantitative structure-activity relationships. Chemometrics and Intelligent Laboratory Systems, 1, 335.

Chattaraj, P. K., Sarkar, U., & Roy, D. R. (2006). Electrophilicity index. Chemical Reviews, 106, 2065.

Franke, R. (1984). Theoretical drug design methods. Amsterdam: Elsevier.

Fukui, K. (1971). Recognition of stereochemical paths by orbital interaction. Accounts of Chemical Research, 4, 57.

Gaudio, A. C., Korolkovas, A., & Takahata, Y. (1994). Quantitative structure-activity relationships for 1,4-dihydropyridine calcium channel antagonists (nifedipine analogues): A quantum chemical/classical approach. Journal of Pharmaceutical Sciences, 83, 1110.

Ghose, A. K., Pritchett, A., & Crippen, G. M. (1988). Atomic physicochemical parameters for three dimensional structure directed quantitative structure-activity relationships III: Modeling hydrophobic interactions. Journal of Computational Chemistry, 9, 80.

González-Díaz, H., & Uriarte, E. (2005). Proteins QSAR with Markov average electrostatic potentials. Bioorganic & Medicinal Chemistry Letters, 15, 5088.

Good, A. C., & Richards, W. G. (1996). The extension and application of molecular similarity calculations to drug design. Drug Information Journal, 30, 371.

Grunenberg, J., & Herges, R. (1995). Prediction of chromatographic retention values (rm) and partition coefficients (log Poct) using a combination of semiempirical self-consistent reaction field calculations and neural networks. Journal of Chemical Information and Computer Sciences, 35, 905.

Guha, S., Majumdar, D., & Bhattacharjee, A. K. (1992). Molecular electrostatic potential: A tool for the prediction of the pharmacophoric pattern of drug molecules. Journal of Molecular Structure (THEOCHEM), 256, 61.

Hammett, L. P. (1937). The effect of structure upon the reactions of organic compounds. Benzene derivatives. Journal of the American Chemical Society, 59, 96.

Hansch, C. (1969). Quantitative approach to biochemical structure-activity relationships. Accounts of Chemical Research, 2, 232.

Hansch, C., & Coats, E. (1970). α-Chymotrypsin: A case study of substituent constants and regression analysis in enzymic structure-activity relationships. Journal of Pharmaceutical Sciences, 59, 731.

Hansch, C., Leo, A., & Taft, R. W. (1991). A survey of Hammett substituent constants and resonance and field parameters. Chemical Reviews, 91, 165.

Jensen, F. (1999). Introduction to computational chemistry. New York: Wiley.

Kadlubanski, P., Calderón-Mojica, K., Rodriguez, W. A., Majumdar, D., Roszak, S., & Leszczynski, J. (2013). Role of the multipolar electrostatic interaction energy components in strong and weak cation-π interactions. The Journal of Physical Chemistry A, 117, 7989.

Karelson, M., Lobanov, V. S., & Katritzky, A. R. (1996). Quantum-chemical descriptors in QSAR/QSPR studies. Chemical Reviews, 96, 1027.

Kim, K. S., Lee, J. Y., Lee, S. J., Ha, T.-K., & Kim, D. H. (1994). On binding forces between aromatic ring and quaternary ammonium compound. Journal of the American Chemical Society, 116, 7399.


Klebe, G., Abraham, U., & Mietzner, T. (1994). Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. Journal of Medicinal Chemistry, 37, 4130.

Kohn, W., & Sham, L. J. (1965). Self-consistent equations including exchange and correlation effects. Physical Review, 140, A1133.

Labute, P. (2000). A widely applicable set of descriptors. Journal of Molecular Graphics and Modelling, 18, 464.

Leach, A. R., & Gillet, V. J. (2007). An introduction to chemoinformatics. Dordrecht: Springer.

Lewis, D. F. V. (1987). Molecular orbital calculations on solvents and other small molecules: Correlation between electronic and molecular properties �, αMOL, π*, and β. Journal of Computational Chemistry, 8, 1084.

Löwdin, P.-O. (1970). On the nonorthogonality problem. Advances in Quantum Chemistry, 5, 185.

Ma, J. C., & Dougherty, D. A. (1997). The cation-π interaction. Chemical Reviews, 97, 1303.

Majumdar, D., Roszak, S., & Leszczynski, J. (2006). Probing the acetylcholinesterase inhibition of sarin: A comparative interaction study of the inhibitor and acetylcholine with a model enzyme cavity. The Journal of Physical Chemistry B, 110, 13597.

Majumdar, D., Roszak, S., & Leszczynski, J. (2012). Theoretical studies on the structure and electronic properties of cubic gold nanoclusters. The Canadian Journal of Chemical Engineering, 90, 852.

Makhija, M., & Kulkarni, V. (2001). Molecular electrostatic potentials as input for the alignment of HIV-1 integrase inhibitors in 3D QSAR. Journal of Computer-Aided Molecular Design, 15, 961.

Martin, R. L., Davidson, E. R., & Eggers, D. F. (1979). Ab initio theory of the polarizability and polarizability derivatives in H2S. Chemical Physics, 38, 341.

Milischuk, A., & Matyushov, D. V. (2002). Dipole solvation: Nonlinear effects, density reorganization, and the breakdown of the Onsager saturation limit. The Journal of Physical Chemistry A, 106, 2146.

Mulliken, R. S. (1955). Electronic population analysis on LCAO–MO molecular wave functions I.The Journal of Chemical Physics, 23, 1833.

Murray, J. S., Macaveiu, L., & Politzer, P. (2014). Factors affecting the strengths of ¢-holeelectrostatic potentials. Journal of Computational Science, 5, 590.

Murray, J. S., Shields, Z. P.-I., Seybold, P. G., & Politzer, P. (2015). Intuitive and counterintuitive noncovalent interactions of aromatic π regions with the hydrogen and the nitrogen of HCN. Journal of Computational Science, 10, 209.

Oliferenko, A. A., Oliferenko, P. V., Huddleston, J. G., Rogers, R. D., Palyulin, V. A., Zefirov, N. S., & Katritzky, A. R. (2004). Theoretical scales of hydrogen bond acidity and basicity for application in QSAR/QSPR studies and drug design. Partitioning of aliphatic compounds. Journal of Chemical Information and Computer Sciences, 44, 1042.

Parr, R. G., & Pearson, R. G. (1983). Absolute hardness: Companion parameter to absolute electronegativity. Journal of the American Chemical Society, 105, 7512.

Popelier, P. L. A. (2000). Atoms in molecules: An introduction. New York: Prentice Hall.

Roos, G., Geerlings, P., & Messens, J. (2009). Enzymatic catalysis: The emerging role of conceptual density functional theory. The Journal of Physical Chemistry. B, 113, 13465.

Roy, R. K., Krishnamurti, S., Geerlings, P., & Pal, S. (1998). Local softness and hardness based reactivity descriptors for predicting intra- and intermolecular reactivity sequences: Carbonyl compounds. The Journal of Physical Chemistry. A, 102, 3746.

Schlick, T. (2002). Molecular modeling and simulation: An interdisciplinary guide. New York: Springer.

Schwöbel, J., Ebert, R.-U., Kühne, R., & Schüürmann, G. (2009). Modeling the H bond donor strength of -OH, -NH, and -CH sites by local molecular parameters. Journal of Computational Chemistry, 30, 1454.

Schwöbel, J. A. H., Koleva, Y. K., Enoch, S. J., Bajot, F., Hewitt, M., Madden, J. C., Roberts, D. W., Schultz, T. W., & Cronin, M. T. D. (2011). Measurement and estimation of electrophilic reactivity for predictive toxicology. Chemical Reviews, 111, 2562.

Scrocco, E., & Tomasi, J. (1973). The electrostatic molecular potential as a tool for the interpretation of molecular properties. Topics in Current Chemistry, 42, 95.

Scrocco, E., & Tomasi, J. (1978). Electronic molecular structure, reactivity and intermolecular forces: An euristic interpretation by means of electrostatic molecular potentials. Advances in Quantum Chemistry, 11, 115.

Stanon, D. L., Dimitrov, S., Gruncharov, V., & Mekenyan, O. G. (2002). Charged partial surface area (CPSA) descriptors QSAR applications. SAR and QSAR in Environmental Research, 13, 341.

Svensson, M., Humbel, S., Froese, R. D. J., Matsubara, T., Sieber, S., & Morokuma, K. (1996). ONIOM: A multilayered integrated MO + MM method for geometry optimizations and single point energy predictions. A test for Diels–Alder reactions and Pt(P(t-Bu)3)2 + H2 oxidative addition. The Journal of Physical Chemistry, 100, 19357.

Warshel, A. (1991). Computer modeling of chemical reactions in enzymes and solutions. New York: Wiley.

Weiner, P. K., Langridge, R., Blaney, J. M., Schaefer, R., & Kollman, P. A. (1982). Electrostatic potential molecular surfaces. Proceedings of the National Academy of Sciences of the United States of America, 79, 3754.

Weinhold, F., & Landis, C. R. (2012). Discovering chemistry with natural bond orbitals. Hoboken: Wiley.

Woodward, R. B., & Hoffmann, R. (1969). The conservation of orbital symmetry. Angewandte Chemie International Edition in English, 8, 781.