This journal is © The Royal Society of Chemistry 2013 Chemical Society Reviews, 2013, 0, 00–00 | 1
Simulation of NMR observables of carbohydrates
(FULL VERSION of Recent Advances in Computational Predictions of NMR Parameters for
Structure Elucidation of Carbohydrates: Methods and Limitations , DOI: 10.1039/b000000x)
Filip V. Toukach, Valentine P. Ananikov*
Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Leninsky Prospekt 47, Moscow, 119991, Russia. Fax: +7 499 135 5328;
E-mail: val@ioc.ac.ru
All living systems are composed of four fundamental classes of macromolecules: nucleic acids, proteins, lipids, and carbohydrates (glycans). Glycans play a unique role in joining three principal hierarchical levels of the living world: 1) the molecular level (pathogenic agent and vaccine recognition by the immune system; metabolic pathways involving saccharides that provide cells with energy; and energy accumulation via photosynthesis); 2) the nanoscale level (cell membrane mechanics; structural support of biomolecules; and glycosylation of macromolecules); 3) the microscale and macroscale levels (polymeric materials, such as cellulose, starch, glycogen, and biomass). NMR spectroscopy is the most powerful research approach for gaining insight into the solution structure and function of carbohydrates at all hierarchical levels, from monosaccharides to oligo- and polysaccharides. Recent progress in computational procedures has opened a novel opportunity to reveal the structural information available in the NMR spectra of saccharides and to advance our understanding of the corresponding biochemical processes. The ability to predict molecular geometry and NMR parameters is crucial for the elucidation of carbohydrate structures. In the present paper, we review the major NMR spectrum simulation techniques with regard to chemical shift, coupling constant, relaxation rate and nuclear Overhauser effect prediction applied to the three levels of glycomics. Outstanding development in the related fields of genomics and proteomics has clearly shown that it is the advancement of research tools (automated spectrum analysis, structure elucidation, synthesis, sequencing and amplification) that drives grand challenges in modern science. The combination of NMR spectroscopy and computational analysis of the structural information encoded in NMR spectra opens the way to automated elucidation of the structure of carbohydrates.
Contents
1. Introduction
2. Computation of the NMR parameters of carbohydrates
3. Empirical methods of NMR parameter prediction
3.1. Database approach
3.2. Usage of neural networks
3.3. Regression-based methods
3.4. CHARGE approach
3.5. Incremental approach at the residual level
4. Models and methods for carbohydrate 3D structural studies
4.1. Molecular mechanics and molecular dynamics
4.2. Semi-empirical methods
4.3. Ab initio and density functional modeling
4.4. Hybrid QM/MM, QM/QM and ONIOM approaches
4.5. Interaction with solvent
5. Computation of NMR chemical shifts
5.1. Monosaccharides and derivatives
5.2. Oligosaccharides and polysaccharides
6. Computation of NMR coupling constants
6.1. Intra-residue coupling constants
6.2. Inter-residue coupling constants
7. Computation of NMR relaxation rates
8. Computation of other NMR parameters
9. Conclusions
10. Abbreviations
11. Acknowledgements
12. References
1. Introduction
Glycochemical and glycobiological research has recently shown tremendous growth and has rapidly developed into one of the leading forces in modern science. Novel synthetic approaches and the rational design of carbohydrates and glycoconjugates have revealed new opportunities in drug and vaccine discovery [1-5]. Detailed insight has been gained into the key role of carbohydrates in biological recognition, the development of diseases and the control of the immune response [6-11]. Many new carbohydrate drugs are now licensed or in clinical testing [2-4, 6-13]. Glyco-nanomaterials are promising building blocks for applications such as biosensors or multivalent scaffolds for drug delivery [14]. With such outstanding progress demonstrated in recent decades, a new era has emerged in medicinal and pharmaceutical applications of carbohydrates.
The role of oligo- and polysaccharides and their conjugates in cellular biology can hardly be overestimated [15-19]. Carbohydrate functions in living organisms vary from energy storage and the maintenance of cellular shape to the provision of the immunological uniqueness of microorganisms. The high structural diversity of saccharide residues and their linkages allows carbohydrate-containing molecules to present a huge number of signals to their surroundings, making them well suited for the control of molecular recognition in living cells [20] and highly involved in signal transduction [21] and in multiple biosynthetic pathways [22]. Carbohydrate microarrays and other analytical techniques dedicated to probing glycan-related processes in cells have been developed [23-25].
Fig. 1. Representative 1H NMR spectra in D2O: (A) cyclic pentapeptide showing individual signals in a wide range of chemical shifts from 0 to 11 ppm [26]; (B) regular polymer with a pentasaccharide repeating unit showing signals in two narrow regions, 1.0–2.5 ppm and 3.0–6.0 ppm, including strong overlap in the range 3.5–4.5 ppm [27] (reproduced with permission, © Elsevier Ltd., 2005).
Cellulose and chitin are the two most abundant natural polymers on Earth, and their industrial utilization is a question of primary importance within the widely accepted concept of sustainability.
A diversity of industrial applications benefits from employing procedures developed in carbohydrate chemistry for biomass processing. Carbohydrates contribute up to 75% of the world's renewable biomass [28-30]. The development of practically useful and efficient procedures for the conversion of cellulose into platform chemicals and biofuels has been identified as one of the central research challenges of the coming century [31-34]. Estimates have shown that up to 30% of transportation fuel demand could be met by cellulose biomass [31-35].
However, in spite of the massive development of fascinating applications, carbohydrates remain the least structurally characterized among the major classes of biological molecules. Carbohydrates are very difficult to crystallize, and in most cases single crystals of sufficient quality for X-ray analysis cannot be obtained [36, 37]. Moreover, even for the minority of successful crystallizations, X-ray crystallography has been reported to give poorly resolved structures of glycan moieties [36-39]. The limited application of X-ray structure determination to carbohydrates is in sharp contrast to proteins, for which crystallization and X-ray structure elucidation have become a standard research tool [40-42].
Mass spectrometry of carbohydrates is a useful technique; however, it is not sufficient as a structural tool on its own, since the crucial issue of carbohydrate stereochemistry cannot be solved by routinely available methods [43].
Unlike many high-throughput analytical methods, NMR spectroscopy is tolerant of incomplete reference data and thus plays a key role in the primary structural elucidation of new natural glycans [44]. Besides its ubiquitous use in structural studies of carbohydrates, it provides significant insight into the mechanisms of their biological action [38, 43, 45, 46]. In fact, NMR spectroscopy has provided most of the experimental data on the solution structure of carbohydrates, complex equilibria and interconversions of sugar units, monitoring of chemical reactions involving carbohydrates, characterization of carbohydrate binding to other bioactive molecules, and other processes of biological relevance [38, 43, 45-48]. It has been recognized as a valuable tool for the quality control and characterization of carbohydrate-containing drugs [49]. NMR-based approaches have been incorporated into the World Health Organization recommendations on the production and quality control of glycoconjugate vaccines [50].
An important advantage of NMR spectroscopy is the determination of three-dimensional structure directly in solution (in water and organic solvents), where the processes of biological and chemical relevance occur. To achieve this goal, several experimental methods have been developed for the measurement of the key NMR parameters: chemical shifts, coupling constants, NOE data and relaxation rates. Highly sensitive and powerful 1D and 2D NMR experiments have been developed and optimized for measurements on carbohydrates [15, 36, 43, 51]. Rapid progress in NMR hardware and the development of new NMR experiments have made structural elucidation routinely available in everyday practice in chemical and biological research laboratories.
Such impressive development has clearly identified the state-of-the-art challenge in the field of structural studies of carbohydrates: further insight in this fascinating area of research is limited by difficulties in the interpretation of the NMR parameters rather than by the recording of the NMR spectra. Indeed, proving correct signal assignment and understanding the relationship between measured NMR parameters and molecular structure is still a tedious task, especially for such a chemically diverse class of compounds as carbohydrates.
In spite of the wide structural diversity of carbohydrates, the majority of their NMR studies are limited to 1H and 13C nuclei, in contrast to proteins (1H, 13C and 15N) and nucleic acids (1H, 13C, 15N and 31P). Isotope labeling, routinely used in protein NMR spectroscopy to enhance automated structure analysis, is of only limited applicability to carbohydrates [52-56]. Although the building blocks of carbohydrates are more diverse in nature than the structural units of nucleic acids and proteins, their NMR chemical shifts are located in a much narrower region (Fig. 1). Thus, assignment and interpretation of the NMR spectra remain a challenge in modern structural glycoscience.
Proper interpretation of the NMR parameters requires theoretical analysis. In particular, to correlate the time-averaged experimental NMR data with the primary and secondary chemical structure, the former can be computed by molecular modeling [48]. Modeling of carbohydrate structure and molecular properties has benefited from a variety of computational methods [57].
In the present review we discuss recent progress in the development of computational approaches for modeling the NMR parameters of carbohydrates. The review covers a set of topics important for structure elucidation: i) theoretical calculations and analysis of 1H, 13C, 15N, 17O, 31P chemical shifts;
Fig. 2. Selected monosaccharides used in this review, shown in pyranose form, and their IUPAC abbreviations. The monosaccharides that typically exhibit an equilibrium are shown in both forms (A). Various forms of monosaccharides exemplified by D-glucose (IUPAC abbreviations in red). Numbers stand for carbon atom enumeration (B). Schematic representation of some ⁴C₁ and ¹C₄ chair hydroxyl and hydroxymethyl rotamers of β-D-glucose. The idealized torsions are denoted by g+, t and g- for gauche clockwise (60°), anti (180°), and gauche counterclockwise (-60°), respectively. The idealized O5-C5-C6-O6 dihedral angles for the hydroxymethyl group are denoted by capital letters: G+, T, and G-. g++ or g-- denote torsions far from the idealized values for the ¹C₄ chair conformer [58] (C, reproduced with permission, © Elsevier Science B.V., 1996).
Fig. 3. Representative parts of 2D 1H,13C HSQC (A) and homonuclear NOESY (B) spectra of a sulfated trisaccharide recorded in D2O at 500 MHz [59]. In spite of strong signal overlap in 1D spectra, the cross-peaks are clearly resolved in 2D spectra. Reproduced with permission, © Elsevier Ltd., 2005.
ii) computations of chemical shielding tensors and chemical shift surfaces; iii) prediction of H-H, P-H, C-H and C-C coupling constants essential for structural studies; iv) modeling of relaxation parameters; v) prediction of nuclear Overhauser effects and other NMR parameters.
A discussion is provided of the scope and limitations of the available theoretical approaches, including ab initio, density functional, semiempirical, molecular mechanics, molecular dynamics, empirical and hybrid calculations, in relation to NMR structural analysis. Application of modern approaches for the theoretical prediction of NMR properties, together with experimental data, reveals key information about the molecular structure. One of the major goals of theoretical NMR calculations is the faithful reproduction and, subsequently, prediction of experimental data.
The present review focuses on the prediction and analysis of NMR parameters of carbohydrates and their derivatives using empirical methods and expert systems [60, 61], as well as calculations at the quantum-chemical level [61-65].
2. Computation of the NMR parameters of carbohydrates
Increasing demand for NMR structure analysis of carbohydrates has driven the development of a wide variety of computational approaches to predict and analyze chemical shifts, spin-spin coupling constants, relaxation rates and other parameters.
The first class of approaches comprises empirical methods, which treat molecules on the basis of the connectivity of atoms or residues. These methods do not require a thorough evaluation of atomic coordinates, apart from rough stereochemistry. A series of easy-to-use and computationally efficient tools have been developed based on empirical methods, and these tools are now routinely used in everyday research practice. A concise overview of the empirical methods is given in section 3.
Straightforward “first principles” modeling of the NMR parameters with ab initio and density functional methods requires calculation of a molecular structure followed by derivation of the NMR data. The necessary description of the models and methods important for NMR structural studies of carbohydrates is summarized in section 4. The detailed discussion of the application of computational methods to the structure elucidation of carbohydrates is divided into: a) calculation of NMR chemical shifts (section 5); b) calculation of spin-spin coupling constants (section 6); c) prediction of other NMR parameters (sections 7 and 8).
In spite of the rapid development of ab initio and density functional methods, facilitated by the increasing performance of computational hardware, it should be pointed out that empirical predictions are still widely used. A rough estimation of NMR prediction quality using standard “out-of-box” protocols shows that empirical methods produce good accuracy and are very fast.
As discussed below, several options are available to improve the performance of ab initio and density functional predictions of the NMR parameters of carbohydrates. However, these options are mostly described in specialized theoretical articles and are not widely known to researchers working on experimental data analysis and structure elucidation. Exchange of knowledge between these fields is an important goal of the present review.
Typical carbohydrate building blocks and characteristic geometrical features are shown in Fig. 2 (for the list of abbreviations see section 10). On the one hand, the diversity of building blocks and the variety of available inter-residue
connections generate a huge number of possible carbohydrate structures. On the other hand, this abundant structural information usually cannot be deduced from 1D 1H and 13C NMR spectra because of ambiguous assignment pathways and strong signal overlap. Nowadays, multidimensional spectroscopy allows determination of the NMR parameters and reliable structure elucidation of carbohydrates [43, 49, 51, 66]. Indeed, the addition of only one spectral dimension (2D NMR spectroscopy) results in clearly resolved signals (Fig. 3) as compared to 1D spectra (Fig. 1B). The aim of computational prediction and analysis of 2D NMR spectra highlights the most important challenge in the field, since accurate calculation of 1H and 13C NMR chemical shifts and coupling constants is required.
3. Empirical methods of NMR parameter prediction
Since 1975, a number of chemical shift collections dedicated solely to carbohydrates have evolved [43], encouraging many groups to develop algorithms that can utilize this information in the computational prediction of the NMR spectra of carbohydrates. Most of the chemical shift databases provided a signal search tool, making NMR data easily interpretable in terms of structure.
The simplest class of empirical methods relies on only a small reference database (so-called base values) and a multitude of additive rules and increments parameterized for every class of compounds. This approach has developed into a number of initiatives discussed in the present section of the review. As a representative example, in the case of 1H NMR a mean deviation of 0.2-0.3 ppm was observed for the prediction of 90% of all CHx-group chemical shifts in nonpolar solvents, and in the case of 13C NMR >95% of the chemical shifts were predicted by CHARGE with a mean deviation of 3.8 ppm [67].
Empirical methods, as well as neural networks, enable the fastest, fully automated calculations, which can generate up to 10,000 chemical shifts per second on a desktop computer with an accuracy of 1.6-1.8 ppm [68]. Programs utilizing statistical
processing of reference chemical shift databases provide similar or better accuracy at slower but still acceptable performance [69]. Every structural fragment is assigned a descriptor that correlates with its major structural peculiarities. When the database is queried with the descriptor, similar structures are identified, and the resulting values are weighted averages of the experimental NMR data corresponding to these structures.
However, the predictions are limited solely to the structural information deposited in the database. As a result, empirical methods have only limited application in the elucidation of secondary structure, as they are unable to predict non-averaged properties of molecules in a certain conformation or under conditions different from those represented in the database. Another known drawback is the inability to account for varying conditions of spectra recording. It was reported that the use of different solvents may strongly increase the deviations and degrade the accuracy of prediction [67].
In spite of these limitations, simple algorithms and very fast calculations with reasonable accuracy in basic cases account for the ubiquitous application of empirical methods in modern NMR structural analysis of carbohydrates. Incremental empirical or neural network methods of chemical shift prediction can be successfully used at the stage of selecting structural hypotheses, which are later verified by time-consuming molecular geometry optimization and ab initio calculations of chemical shifts [70].
Below we provide a brief review of empirical techniques useful for research and educational purposes in the field.
3.1. Database approach
Historically, the first database approach to chemical shift prediction was described by Bremser [71] and was called the hierarchical organization of spherical environments (HOSE). Since then it has been improved, and it remains the most popular structure description algorithm in database-oriented NMR predictors. In particular, this algorithm was used as one of the approaches in ACDLabs ACD/NMR and Modgraph Consultants Ltd. NMR prediction software [72, 73]. HOSE starts at the atom whose chemical shift has to be predicted, expands one bond away from the atom (“1st sphere”) and tries to find this environment in the reference database. If the search is successful, it moves two bonds away (“2nd sphere”) and tries again, and so on until either the fragment is not found in the database or the molecule boundary is reached. The HOSE approach gives good results for structures whose fragments are well represented in the reference collection. As a rule of thumb, if the analyzed atoms can be predicted using three or more spheres, the prediction is considered reliable. In modern implementations, HOSE is extended to treat stereochemistry (3D HOSE) by assigning higher weight to the structural database entries that describe the same stereochemistry as the fragments under analysis [72].
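The sphere-by-sphere search described above can be illustrated with a minimal Python sketch. It is not Bremser's actual HOSE code syntax: the descriptor format (central element, then the sorted elements of each sphere), the function names `sphere_codes` and `predict_shift`, and the toy database values are all assumptions made for illustration only, and stereochemistry, bond orders and hydrogens are ignored.

```python
from collections import deque
from statistics import mean

def sphere_codes(atoms, bonds, center, max_spheres):
    """Return one string per sphere: the sorted element symbols found
    exactly k bonds away from the central atom (k = 1..max_spheres)."""
    distance = {center: 0}
    queue = deque([center])
    while queue:  # breadth-first search over the bond graph
        atom = queue.popleft()
        if distance[atom] == max_spheres:
            continue
        for neighbor in bonds[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return ["".join(sorted(atoms[a] for a, d in distance.items() if d == k))
            for k in range(1, max_spheres + 1)]

def predict_shift(atoms, bonds, center, database, max_spheres=5):
    """Expand spheres until the environment is missing from the database
    or the molecule boundary is reached; return (mean shift, spheres used)."""
    spheres = sphere_codes(atoms, bonds, center, max_spheres)
    prediction, spheres_used = None, 0
    for k in range(1, max_spheres + 1):
        if not spheres[k - 1]:
            break  # no atoms at this distance: molecule boundary reached
        code = atoms[center] + "|" + "/".join(spheres[:k])
        if code not in database:
            break  # environment not deposited in the reference collection
        prediction, spheres_used = mean(database[code]), k
    return prediction, spheres_used

# Hypothetical heavy-atom graph C0-C1-O2 and illustrative (not experimental)
# reference shifts in ppm, keyed by the simplified descriptor.
ATOMS = {0: "C", 1: "C", 2: "O"}
BONDS = {0: [1], 1: [0, 2], 2: [1]}
DATABASE = {"C|C": [15.0, 20.0, 25.0], "C|C/O": [18.0], "C|CO": [57.0, 59.0]}
```

With these toy data, `predict_shift(ATOMS, BONDS, 0, DATABASE)` uses two spheres for the terminal carbon, while the oxygen-bearing carbon is matched at one sphere before the molecule boundary is reached. The number of spheres returned plays the role of the reliability indicator mentioned in the rule of thumb.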
As implemented in the ACDLabs 8.0 NMR predictor, this approach provided a standard error of 0.22 ppm per 1H resonance (tested on 54,608 organic molecules) and 2.33 ppm per 13C resonance (tested on 68,129 organic molecules). 62% of the predicted 1H NMR chemical shifts were within 0.1 ppm of the experimental values, and 64% of the 13C NMR chemical shifts were within 1 ppm of the experimental values [74]. Newer versions of the ACD/NMR predictors use a combined approach, in which the results from HOSE and a neural network algorithm (discussed in the next section) are compared to retrieve the best-fitting value.
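The two kinds of quality figures quoted above (an overall error per resonance and the share of predictions falling within a tolerance) can be computed as in the following sketch. It assumes that the "standard error" is a root-mean-square deviation, which the source does not state explicitly; the function names are illustrative.

```python
from math import sqrt

def rms_error(predicted, observed):
    """Root-mean-square deviation between predicted and observed shifts (ppm)."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))

def fraction_within(predicted, observed, tolerance):
    """Share of predictions that deviate from experiment by less than tolerance."""
    hits = sum(1 for p, o in zip(predicted, observed) if tolerance > abs(p - o))
    return hits / len(predicted)
```

For example, `fraction_within(pred, obs, 0.1)` corresponds to the "within 0.1 ppm" statistic reported for 1H shifts, and `fraction_within(pred, obs, 1.0)` to the "within 1 ppm" statistic for 13C.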
Table 1 presents statistics on the reference databases for several nuclei relevant to carbohydrate chemistry [75]. More details about the ACD/NMR predictor are available in a review by Elyashberg and coworkers [60].
Table 1. ACD/NMR reference databases available for NMR spectra prediction (data for version 11 [75]).
Nucleus   Number of structures   Number of chemical shifts
1H        210 000                1.7 million
13C       191 900                2.5 million
15N       9 287                  21 782
31P       27 578                 34 020
Another family of computational products utilizing the HOSE approach includes the Modgraph-based general-purpose 13C and heteronuclei NMR predictors [73] (implemented and tested in a number of software packages, such as MestreLabs NMRPredict [76]
and PerkinElmer ChemBioOffice [77]). Currently, Modgraph uses a HOSE code algorithm capable of analyzing up to five spheres and a database of the 193,352 most highly verified 13C records abstracted from the literature by Robien and coworkers [78]. This database is a further development of a product reported earlier [79]. Additionally, 185,517 13C and 86,480 heteronuclei records from Chemical Concepts are available as an option.
Modgraph automatically selects the better 13C NMR prediction for each atom from the HOSE and neural network prediction methods (see section 3.2 for the latter). The higher the number of HOSE spheres reached for an atom, the more emphasis is given to the HOSE code prediction. A target mean error of 0.18 ppm per resonance was reported after evaluation of ca. 90,000 structures, with the stereochemistry of the molecule taken into account.
Several other 13C and heteronuclei chemical shift databases have been reported: CSEARCH [80], Chemical Concepts SpecSurf / SpecInfo [81], WINDAT [82], and the freely accessible NMRshiftDB [83]. Some of these projects have been continuously developed and transformed into dedicated computational tools for empirical spectra prediction.
An alternative approach to encoding stereochemical information in HOSE-based predictors was developed by Satoh and coworkers [84]. This encoding scheme, called CAST (Canonical representation of stereochemistry), includes different descriptors at the planar, conformational and configurational levels for each atom. Although no use of CAST for carbohydrates has been reported, predictions of chemical shifts in the linear triol part of 20-hydroxyecdysone deviated from the experimental spectrum by an average of less than 0.5 ppm per resonance [85].
Kelleher and Simpson carried out 1H and 13C NMR predictions in the form of an HSQC spectrum for a 2D model of humic acid and compared it with HSQC spectra of soil samples, including the amylopectin carbohydrate moiety [86]. The predictions were based on HOSE code matches and incremental algorithms implemented in ACDLabs Spec Manager 9.06. Although this approach has been used to produce accurate predictions for non-carbohydrate soil components [87], there was generally poor correlation between the experimental signals and those simulated for the proposed structural model.
3.2. Usage of neural networks
A neural network is a mathematical construction that allows optimization of non-linear dependencies between input descriptors and output values [88-90]. It consists of artificial neurons organized in a number of layers, where each neuron is a function that transforms its input values into an output value. The first layer (“input layer”) gathers numerical atomic descriptors, and no calculations are performed on it. The input layer is fed with structural parameters that are converted to numbers using HOSE, increments or another structure description scheme. In chemical shift prediction, the last layer (“output layer”) contains a single neuron that produces the predicted chemical shift. The output value of each neuron in the hidden layers in between is an input to the neurons in the next layer. Different connections between neurons have different weight parameters, and the total output depends non-linearly on the input.
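The layered structure described above can be made concrete with a minimal forward pass through a network with one hidden layer and a single output neuron. This is only a sketch under stated assumptions: the `tanh` activation, the single hidden layer and the argument names are illustrative choices, not the architecture of any predictor cited in this review, and the weights would in practice come from training against a chemical shift database.

```python
from math import tanh

def forward(descriptors, hidden_weights, hidden_biases, output_weights, output_bias):
    """Predict one chemical shift from numerical atomic descriptors.

    descriptors     input layer values (no calculation is performed on them)
    hidden_weights  one list of connection weights per hidden neuron
    output_weights  one connection weight per hidden neuron
    """
    # Each hidden neuron applies a non-linear function to a weighted sum.
    hidden = [tanh(sum(w * x for w, x in zip(weights, descriptors)) + bias)
              for weights, bias in zip(hidden_weights, hidden_biases)]
    # The single output neuron combines the hidden-layer outputs linearly.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
```

Because of the non-linear activation, doubling every descriptor does not double the output, which is the property that distinguishes this model from the linear regression schemes discussed in section 3.3.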
Prediction approaches based on neural networks benefit from self-learning and the ability to model properties of compounds without an understanding of the underlying phenomena, which is especially valuable for the non-linear relationships typical of instrumental analytical chemistry [91]. To make use of a neural network in NMR data prediction, it must be trained against a database of known chemical shifts in order to optimize the weights of the neuron connections [88, 90, 91].
Radomski and coworkers showed the ability of neural networks to recognize and process spectra with a low signal-to-noise ratio, which could hardly be analyzed by regular visual inspection [92]. Since then, a number of applications of neural networks to the prediction of NMR chemical shifts, especially 13C, have been reported for general organic compounds [91, 93] and certain biomolecular classes, including proteins [94].
Gerbst and coworkers demonstrated that an ART1-type neural network is capable of identifying the class of fucoidan polysaccharides from the characteristic 13C NMR signals. However, the quality of the structure analysis was satisfactory only if the neural network training set contained exactly the residues present in the molecule to be identified [95].
A combination of the fragmental approach and a neural network is implemented in various computational tools. In particular, the ModGraph 13C NMR predictor includes a neural network algorithm to help with the prediction of molecules that are not well represented in the HOSE reference database. Testing of this neural network on 345,000 reference spectra showed an average deviation between experimental and calculated chemical shifts of below 2 ppm [96].
Purtuc and coworkers designed a neural network that makes extensive use of stereochemical information in 13C NMR chemical shift prediction without the need for 3D atomic coordinates [97]. The data used during training and evaluation of the network were selected from the CSEARCH database of ca. 230,000 13C NMR spectra (ca. 2,700,000 chemical shifts). A typical training set consisted of 400,000 examples selected at random to reduce resource consumption during network optimization [79].
Le Bret reported a neural network trained on 8,342 13C NMR chemical shifts described by 314 topological and chemical descriptors related to the atom itself and its nearest neighborhood. The average deviation of 4.5 ppm was claimed to be independent of the size and complexity of the molecule. However, only routine molecular types and molecules smaller than 64 carbons were considered [98].
Meiler and coworkers constructed a three-layer neural network that considered 28 atom types and two summarizing parameters in each of six spheres. The best results (standard deviation 2.1 ppm for ca. 15,000 test atoms) were achieved with between 5 and 20 hidden neurons [99]. Later this network was used to elucidate structures of up to 20 carbons by a genetic algorithm [100] and was improved by the introduction of an extended hybrid numerical description of the carbon atom environment. A genetic algorithm is an iterative search heuristic belonging to the family of evolutionary algorithms, in which generations of solutions undergo inheritance, mutation, selection and crossover [101]. The standard deviation for an independent test data set of ca. 42,500 carbons was reported as 2.4 ppm [102].
The neural network designed by Smurnyy and coworkers recognized 32 atom types and double bond stereochemistry (as a
separate sphere), and its output was additionally corrected with a rule-based algorithm that used increments shared by two or more substituents (“cross-increments”). The network was trained on a database of 190,000 structures and about two million chemical shifts, and validated on a database of 8,500 structures and ~118,000 chemical shifts. It is difficult to design a single network that covers the whole range of 13C NMR chemical shifts; thus, the reference database was split into subdatabases according to the nature of the central atom [103]. This neural network was used as one of the algorithms implemented in ACD Labs/NMR [68].
Smurnyy and coworkers compared the neural network and least-squares linear regression approaches in terms of prediction quality and performance, and optimized several parameters (number of subdatabases, number of structural descriptors, network parameters, etc.). As a result, they obtained an average error of 1.5 ppm for 13C and 0.2 ppm for 1H NMR chemical shifts, and suggested that further improvement depends much more on the choice of structural, and especially stereochemical, descriptors and on the quality of the training databases than on the regression method. Linear regression and the neural network produced results of similar accuracy; however, linear regression was 2-3 times faster [68].
3.3. Regression-based methods
Linking structural descriptors and chemical shifts (especially for carbons) by a mathematical relationship and obtaining weight factors has been a challenging task for several decades. In 1987 McIntyre and Small developed a methodology for simulation of the 13C NMR spectra of monosaccharides. Using experimental data from the literature and their own recorded spectra of 35 pyranoses and methyl pyranosides, the authors constructed models that related the observed chemical shifts to 2-6 numerical parameters encoding aspects of the chemical environment of the carbon atom (functions of distances, van der Waals energies, etc.). These parameters encoded the effects of multiple oxygen atoms in the surroundings of the carbon atom. They were derived from the atomic coordinates optimized by MM2 calculations of both chair conformations of every monosaccharide. The authors applied multiple linear regression analysis to construct chemical shift models independently for five carbon types in pyranose residues. The models were tested on 15 pyranoses and methyl pyranosides not included in the reference set. The standard prediction error ranged from 0.43 to 0.85 ppm, depending on the atom type. This pioneering study [104] encouraged further development of regression-based methods in the computational analysis of NMR structural data of carbohydrates.
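A multiple linear regression of the kind described above can be sketched in a few lines of Python. The sketch below fits a chemical shift as an intercept plus a weighted sum of numerical descriptors by ordinary least squares. The descriptors and training values are synthetic placeholders, not the parameters or data of McIntyre and Small, and the function names `fit_linear` and `predict` are assumptions introduced for illustration.

```python
def fit_linear(rows, targets):
    """Ordinary least squares with an intercept.

    rows     list of descriptor tuples, one per training atom
    targets  observed chemical shifts (ppm) in the same order
    Returns [b0, b1, ..., bk] for shift = b0 + b1*x1 + ... + bk*xk.
    """
    design = [[1.0] + list(row) for row in rows]
    n = len(design[0])
    # Augmented matrix of the normal equations (X^T X) b = X^T y.
    aug = [[sum(x[i] * x[j] for x in design) for j in range(n)]
           + [sum(x[i] * y for x, y in zip(design, targets))]
           for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def predict(coefficients, row):
    """Apply a fitted model to the descriptors of a new atom."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], row))

# Synthetic training set generated from shift = 60 + 5*x1 - 2*x2 (illustrative only).
ROWS = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
SHIFTS = [60.0, 65.0, 58.0, 63.0, 68.0]
COEFFICIENTS = fit_linear(ROWS, SHIFTS)
```

In a study of this type, one such model would be fitted separately for each carbon type, and its quality judged on compounds held out of the reference set, as in the test on 15 additional pyranoses and methyl pyranosides mentioned above.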
It was shown that a chemical shift can be represented as a function of variables describing characteristic molecular features. Within every proposed mathematical model, an experimental database can be used to calculate the regression parameters and to check the prediction. Least-square regression techniques, neural 50
network or HOSE approach were used to formulate the additivity rules within the NMR parameter prediction by incremental method on the atomic level68. In contrast to other incremental schemes, such combination requires a potentially smaller number of examples from which the necessary rules can be established, 55
followed by application to a broader range of chemical structures.
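The regression idea described above can be illustrated with a minimal sketch that fits a chemical shift as a linear function of numerical descriptors by ordinary least squares. This is not the model of any cited study: the descriptor values, the synthetic shifts and all function names are hypothetical placeholders chosen only to make the example self-contained.

```python
# Minimal sketch: fit delta = w0 + w1*x1 + ... + wk*xk by ordinary least squares.
# Descriptors and shifts below are synthetic placeholders, not measured data.

def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_shift_model(descriptors, shifts):
    """Return [intercept, w1, ..., wk] minimizing the squared prediction error."""
    rows = [[1.0] + list(d) for d in descriptors]  # prepend intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, shifts)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_shift(weights, descriptor):
    """Apply the fitted linear model to one descriptor vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], descriptor))


# Synthetic training set generated from delta = 60 + 5*x1 - 2*x2 (hypothetical).
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
train_y = [60.0 + 5.0 * x1 - 2.0 * x2 for x1, x2 in train_x]
weights = fit_shift_model(train_x, train_y)
```

In practice a separate model would be fitted per atom type, as in the studies above, and validated on structures excluded from the training set.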
A general-purpose atom-based regression scheme, derived using the least-squares method, was recently designed by Blinov and coworkers. Compared to the neural network approach, linear regression provided ultra-fast calculation (ca. 10,000 13C NMR chemical shifts per second on a desktop computer) with an average deviation of 1.85 ppm105. Within this scheme every atom surrounding the atom under consideration is characterized by 9 parameters (element, hybridization state, valence, etc.). The concept of “atom pairs” was introduced to supplement the single-atom increments and to add more descriptors to the structure encoding105.
Mitchell and Jurs developed linear-regression mathematical models to obtain 13C NMR chemical shifts from a number of atom-based structural descriptors of monosaccharides106. These descriptors included topological, geometric, and electronic information about carbon atoms in a conformation obtained by energy minimization using the MM2 force field. The training data set included 55 pyranoses and 56 furanoses. As a result of multiple linear regression analysis, an eleven-descriptor model was designed for pyranoses and an eight-descriptor model was designed for furanoses. The models were submitted to neural networks, giving improved results with final RMS deviations of 1.03 ppm for pyranoses and pyranosides and 1.58 ppm for furanoses and furanosides106.
A similar approach was used by Clouser and Jurs for prediction of 13C NMR chemical shifts of 17 ribonucleosides107. The atoms to predict were divided into two subsets, one for those inside the ribofuranose ring, and the other for those contained in nucleosides. Multiple linear regression allowed building of a four-descriptor model (three topological descriptors and one geometrical) for the former subset and an eleven-descriptor model (four electronic, three topological and two geometrical descriptors) for the latter one. Submission of the derived models to a three-layer fully-connected neural network made it possible to reach an accuracy of 0.39 ppm for the first subset and 0.98 ppm for the second one. The former value does not differ much from the regression model output, as there are not enough input descriptors to make use of the non-linearity of a neural network. For the second subset, the use of neural networks significantly improved the prediction accuracy compared to the regression model output107.
3.4. CHARGE approach
CHARGE is a semi-empirical incremental scheme based on electronic, steric and other effects parameterized for a variety of functional groups108, 109. CHARGE algorithms do not include geometry optimization and must be supplied with a predetermined molecular geometry. CHARGE is implemented as a part of the ModGraph 1H NMR chemical shift predictor73 (included in Mestrelab MestReNova and CambridgeSoft ChemBioOffice). This predictor starts with generation of all 3D conformers from a primary structure, followed by CHARGE prediction for each conformer, and results in a weighted average spectrum. The prediction includes the substituent chemical shifts approach, which is a general-purpose additive incremental scheme utilizing 3D structures. This approach is an extension of the Proton Shift program developed earlier110.
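The conformer-averaging step mentioned above can be sketched as follows. The text does not specify how the weights are obtained; the sketch assumes Boltzmann weights computed from relative conformer energies, which is a common choice but an assumption here. The function names and input values are hypothetical.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ/(mol*K)


def boltzmann_weights(energies_kj_mol, temperature_k=298.15):
    """Normalized Boltzmann populations from conformer energies (assumed weighting)."""
    e_min = min(energies_kj_mol)  # shift energies to avoid large exponents
    factors = [
        math.exp(-(e - e_min) / (GAS_CONSTANT_KJ * temperature_k))
        for e in energies_kj_mol
    ]
    total = sum(factors)
    return [f / total for f in factors]


def averaged_shifts(conformer_shifts, energies_kj_mol, temperature_k=298.15):
    """Population-weighted average of per-atom shifts over all conformers."""
    weights = boltzmann_weights(energies_kj_mol, temperature_k)
    n_atoms = len(conformer_shifts[0])
    return [
        sum(w * shifts[i] for w, shifts in zip(weights, conformer_shifts))
        for i in range(n_atoms)
    ]


# Two hypothetical conformers with equal energy: the result is a plain mean.
example = averaged_shifts([[3.50, 4.10], [3.70, 4.30]], [0.0, 0.0])
```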
The CHARGE approach combines short-range and long-range substituent effects. The short-range effects are reflected in the calculation of the partial atomic charge of the atom under consideration, based upon the electronegativity and polarizability of atoms in close proximity and the dihedral angles. The calculated α-, β- and γ-effects produce a partial charge on the given atom, which is converted to the charge-derived chemical shift using the equation δcharge = 160.84 × q − 6.68. The effects of more distant atoms on the 1H NMR chemical shifts are represented as a sum of steric, electric field, anisotropic, π-electron and ring current contributions.
CHARGE is considered less fundamental but faster and more convenient in routine use than ab initio calculations. No dedicated parameterization for carbohydrates has been reported; however, parameterization of CHARGE for polyhydric alcohols, including inositol, provided acceptable agreement with the experimental data109. Escalante-Sanchez and Pereda-Miranda used this approach to simulate oligosaccharide 1H NMR spectra and to find proper parameters for the first- and second-order analysis of the experimental 1D NMR data111. The scope of the study included batatin I, batatin II and two ester-type dimers of acylated plant pentasaccharides. The experimental NMR spectroscopic values registered for batatinoside I were used as a starting point for the NMR simulation of batatins I and II. Spectroscopic simulation carried out in Mestre-C was used to reproduce the registered 1H NMR data and thus permitted a correct assignment of the chemical shifts and coupling constants of all superimposed protons in batatins I and II111.
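The way CHARGE assembles a proton shift, as outlined earlier in this section, can be written as a short sketch. Only the linear charge-to-shift relation is taken from the text; the assumption that the long-range terms are simply added to it, the function names and the numerical inputs are illustrative and do not reproduce the actual CHARGE code.

```python
def charge_derived_shift(partial_charge):
    """Charge-derived shift (ppm) from the relation quoted in the text."""
    return 160.84 * partial_charge - 6.68


def total_proton_shift(partial_charge, long_range_terms):
    """Charge-derived shift plus long-range terms (assumed to be additive)."""
    return charge_derived_shift(partial_charge) + sum(long_range_terms.values())


# Hypothetical inputs: a partial charge and two long-range contributions in ppm.
example_shift = total_proton_shift(
    0.05, {"steric": 0.10, "ring_current": -0.05}
)
```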
3.5. Incremental approach at the residue level
General-purpose computational tools discussed above, based on incremental and neural network approaches, do not provide the accuracy sufficient for 13C NMR “fingerprint” of natural glycans. In contrast to the fragmental approach on the atomic level, algorithms that partition structures on the level of residues have been much better parameterized for carbohydrates. The latter approach implies application of the substitution effects to the spectra of monosaccharides or other small structural fragments. The substitution effects reflect chemical shift changes caused by addition of certain structural units to a known position in a monosaccharide. The more structural features of substituents are taken into consideration, the better the spectrum simulation accuracy is. Thus, the accuracy of computational chemical shift prediction depends significantly on the completeness of the spectroscopic databases for a given class of monosaccharides.
Toukach and Shashkov implemented an incremental 13C NMR prediction scheme developed earlier112 in the computational tool BIOPSEL, capable of predicting 13C NMR chemical shifts of regular glycopolymers in aqueous solutions113. The incremental approach was used in calculations to elucidate polymeric glycan structures based on 13C NMR data only. An empirical database of chemical shifts of mono-, di- and trisaccharide fragments was obtained from retrospective literature analysis and applied in calculations. A substitution effect database derived from published spectra of di- and trisaccharides was used to calculate chemical shifts of the unknown structural entities.
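The residue-level additive scheme described above can be sketched in a few lines: the spectrum of an unsubstituted residue is corrected by substitution effects taken from a database. The sketch below is not the BIOPSEL implementation; the function name is hypothetical and all shift values are round placeholder numbers, not database entries.

```python
def apply_substitution_effects(base_shifts, effects):
    """Add per-carbon substitution increments (ppm) to the shifts of a free residue."""
    predicted = dict(base_shifts)  # leave the input untouched
    for effect in effects:
        for carbon, increment in effect.items():
            if carbon not in predicted:
                raise KeyError(f"no base shift for {carbon}")
            predicted[carbon] += increment
    return predicted


# Placeholder 13C shifts of a free hexopyranose residue (illustrative only).
base = {"C1": 100.0, "C2": 70.0, "C3": 70.0, "C4": 70.0, "C5": 75.0, "C6": 60.0}
# Placeholder effect of glycosylation at O-3: the substituted carbon and its
# two neighbours are shifted (values are illustrative only).
glycosylation_at_o3 = {"C2": -1.0, "C3": 8.0, "C4": -1.0}
predicted = apply_substitution_effects(base, [glycosylation_at_o3])
```

Several effects can be passed in the list when a residue carries more than one substituent; the accuracy of such a scheme is then limited by how completely the effect database covers the structural context.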
Rigorous verification of BIOPSEL predictions was carried out on repeating units of Proteus bacterial polysaccharides113. The published experimental structures were found among the five highest ranked predicted structures in 80% of cases; in 60% of those cases the correct structure was ranked the highest. The simulated spectra showed an average deviation from the experimental data in the range from 0.13 to 0.45 ppm. Recently the chemical shift prediction module of this software became a part of the Bacterial Carbohydrate Structure Database114, received a web interface115, and was extended to predict 13C NMR chemical shifts and glycosylation effects for oligomeric or polymeric glycans, including those containing rare monosaccharides.
Widmalm and coworkers designed a web interface116 to the CASPER program for structure elucidation of oligo- and polysaccharides using 13C and 1H NMR data, including chemical shift correlation experiments117. They provided a scheme for structural elucidation of polysaccharides based solely on the NMR data118. The algorithm of CASPER, which uses an incremental approach to the calculation of 13C and 1H NMR chemical shifts, was developed earlier119. There are three data
Fig. 4. Conformation of a tetrasaccharide repeating unit of Shigella dysenteriae type 2 O-antigen predicted by MM3(1996) with the use of genetic algorithms120. Reproduced with permission, © Elsevier Ltd., 2005.
categories utilized in the simulation of NMR spectra: chemical shifts in monosaccharides, glycosylation shifts in disaccharides, and correction sets, which are the differences between the observed chemical shifts for spatially strained trisaccharide models and those calculated by the additive approach119.
The interface and the underlying program have been extensively tested using published data and proved able to simulate 13C NMR spectra for >200 structures with an average error of about 0.3 ppm/resonance. When applied to the repeating units of Escherichia coli bacterial polysaccharides, the published structures were found among the five highest ranked predicted structures in 75% of cases. The average deviation between calculated and experimental chemical shifts was 0.54 ppm and 0.06 ppm for 13C and 1H nuclei, respectively. Oligosaccharide 13C spectra were calculated with an average error of 0.23 ppm/resonance, and the correct structure was ranked first or second in all the cases examined121.
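The ranking of candidate structures by agreement between simulated and experimental spectra, used in the validations above, can be sketched as follows. This is not the CASPER or BIOPSEL scoring function: matching the peaks in sorted order and using the mean absolute deviation per resonance are simplifying assumptions, and the candidate names and shift values are placeholders.

```python
def mean_abs_deviation(simulated, experimental):
    """Mean absolute deviation (ppm/resonance) between two peak lists.

    Peaks are paired in sorted order, which is a simplifying assumption.
    """
    if len(simulated) != len(experimental):
        raise ValueError("peak lists must have the same length")
    pairs = zip(sorted(simulated), sorted(experimental))
    return sum(abs(s - e) for s, e in pairs) / len(experimental)


def rank_candidates(candidates, experimental):
    """Return (deviation, name) tuples, best-fitting candidate first."""
    scored = [
        (mean_abs_deviation(shifts, experimental), name)
        for name, shifts in candidates.items()
    ]
    return sorted(scored)


# Placeholder spectra: candidate "A" is constructed to fit better than "B".
experimental_peaks = [60.0, 70.0, 72.0, 75.0, 78.0, 100.0]
candidate_spectra = {
    "A": [60.2, 70.1, 71.8, 75.3, 78.0, 100.1],
    "B": [61.5, 68.0, 73.5, 76.0, 80.0, 103.0],
}
ranking = rank_candidates(candidate_spectra, experimental_peaks)
```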
4. Models and methods for carbohydrate 3D structural studies
In the present section we discuss only those theoretical models and methods that were coupled with the NMR structure analysis. Other computational approaches used in the studies of carbohydrates have been reviewed elsewhere122-125.
4.1. Molecular mechanics and molecular dynamics
Molecular mechanics (MM) uses Newtonian mechanics to model molecular systems and calculates the potential energy using sets of atomistic parameters derived from small model compounds (force fields). The basics of this method are described in a monograph by Burkert and Allinger126. Several MM force fields, such as CHARMM and GLYCAM, have been specially optimized for carbohydrates44. A multitude of force field parameterizations have been studied in order to account for the flexible nature of carbohydrates. Among general-purpose force fields, MM3 has been one of the most popular for the optimization of oligosaccharide structures. An example of atomic coordinates produced by MM3 energy calculations and genetic algorithms is depicted in Fig. 4. The search using
Fig. 5. Relative usage of modern carbohydrate force fields (based on citation index during 2005-2010)123. Reproduced with permission, © Elsevier Ltd., 2010.
GLYCAL software was performed in the conformational space of torsion angles of glycosidic bonds and exocyclic groups. Genetic algorithms use operators such as mutation and crossover to generate offspring from a random population of conformations evaluated by MM3 energy, and terminate after a fixed number of generations or when no further improvement occurs. This approach allows significant expansion of the conformational space that can be explored at reasonable computational cost120.
A brief guide to the MM force fields used for carbohydrate calculations is given in Table 2, and the usage statistics are depicted in Fig. 5. A more complete list of force fields ever used for carbohydrates is provided in a review by Gerbst and coworkers124. Useful classical force fields applicable to geometry optimization of carbohydrates were reviewed by Imberty and Perez36.
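The genetic-algorithm search over torsion angles described earlier in this section can be illustrated with a minimal sketch. It is not the GLYCAL implementation: the energy function is a hypothetical smooth surface standing in for an MM3 energy, and the operators, population size and stopping thresholds are arbitrary illustrative choices.

```python
import math
import random


def toy_energy(torsions):
    """Hypothetical energy surface with a minimum at (-60, 120) degrees."""
    targets = (-60.0, 120.0)
    return sum(
        1.0 - math.cos(math.radians(t - t0)) for t, t0 in zip(torsions, targets)
    )


def crossover(parent_a, parent_b, rng):
    """Take each torsion angle from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))


def mutate(torsions, rng, step=30.0, rate=0.3):
    """Randomly perturb some angles, wrapping the result into [-180, 180)."""
    return tuple(
        ((t + rng.uniform(-step, step) + 180.0) % 360.0) - 180.0
        if rng.random() < rate
        else t
        for t in torsions
    )


def genetic_search(energy, n_torsions=2, pop_size=30, max_generations=200,
                   patience=25, seed=0):
    """Stop after max_generations or after `patience` generations without gain."""
    rng = random.Random(seed)
    population = [
        tuple(rng.uniform(-180.0, 180.0) for _ in range(n_torsions))
        for _ in range(pop_size)
    ]
    best = min(population, key=energy)
    stale = 0
    for _ in range(max_generations):
        population.sort(key=energy)
        parents = population[: pop_size // 2]  # keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
        candidate = min(population, key=energy)
        if energy(candidate) < energy(best) - 1e-9:
            best, stale = candidate, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, energy(best)


best_torsions, best_energy = genetic_search(toy_energy)
```

In a real application the energy callback would evaluate a force-field energy for the conformation defined by the glycosidic and exocyclic torsions.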
Energy minimization procedures based on molecular mechanics and molecular dynamics are widely implemented in dedicated (Wavefunction Inc. Spartan127, Schrödinger MacroModel128, 129, MOSCITO130, 131, COSMOS132, 133 and others) or general-purpose (Gaussian Inc. Gaussian134, 135, GAMESS136-138, Hypercube Inc. HyperChem139, 140 and others) software.
Molecular dynamics (MD) is a form of computer simulation in which particles are allowed to interact for a period of time by approximations of known physics, giving a view of their motion. MM and MD usually share the same classical force fields, but unlike MM, MD may be based on quantum chemical levels of theory. However, an MD simulation capable of achieving convergence of the rotamer populations of the exocyclic C-C torsions with consideration of solvent requires a longer timescale than is affordable at a reasonable computational cost44. A more detailed view of MD methods is presented in a review by Adcock and McCammon141. The MD simulation technique is a good way to study the inherent flexibility of a molecule since all degrees of freedom are explored simultaneously, although barrier crossing may still require very long simulations. The ensemble of MD-generated conformations may subsequently be used for the prediction of parameters for which only limited quantum mechanical experience exists. MD is of particular importance for analyzing and predicting NOEs in the NMR spectra of carbohydrates142-144.
Replica-exchange molecular dynamics (REMD)145 employs a set of frequently exchanged simulations at different temperatures, allowing a one-dimensional random walk in temperature and potential energy space. The use of REMD for conformational studies of carbohydrates has been recently reviewed146.
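The exchange step of REMD can be sketched with the standard Metropolis criterion for swapping two replicas, which is not spelled out in the text and is added here as background. The function name, the energy units (kJ/mol) and the numerical inputs are illustrative assumptions.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ/(mol*K)


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of swapping the configurations of two replicas.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1/(R*T).
    """
    beta_i = 1.0 / (GAS_CONSTANT_KJ * temp_i)
    beta_j = 1.0 / (GAS_CONSTANT_KJ * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    if delta >= 0.0:
        return 1.0  # also avoids overflow in exp for large positive values
    return math.exp(delta)


# The colder replica holds the higher-energy configuration: always swapped.
p_downhill = exchange_probability(-90.0, 300.0, -100.0, 350.0)
# The colder replica already holds the lower energy: swapped only sometimes.
p_uphill = exchange_probability(-100.0, 300.0, -90.0, 350.0)
```

An exchange attempt is accepted when a uniform random number in [0, 1) falls below this probability, which lets configurations trapped at low temperature escape through the high-temperature replicas.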
4.2. Semi-empirical methods
Semi-empirical methods use sets of parameters derived from experimental data in order to simplify the approximation of the Schrödinger equation. Therefore, relatively low computational resources are required and the calculations can be practically applied to large molecules147, or used to obtain a starting point for subsequent ab initio calculations. Most semi-empirical methods are known to perform poorly on molecules with hydrogen bonding, transition structures, and molecules containing atoms for which they are poorly parameterized147. Among the semi-empirical methods employed in 3D structure elucidation of carbohydrates were AM1, PM3, and MNDO148, 149. Some of the studies used AM1 for the geometry optimization with subsequent DFT calculations of shielding in oligosaccharides150-152. Later publications often involved the PM5 and PM6 methods153 applied to carbohydrates and glycoconjugates154.
Bond polarization theory (BPT) is a semi-empirical approach, designed by Sternberg and coworkers in 1988155, which linearly correlates atomic charges and chemical shifts to bond polarization energies. It was applied to the calculation of 13C NMR chemical shift tensors with accuracy comparable to that of ab initio methods as they were in 1997156 and gave rise to a number of improvements such as the COSMOS force field132. This force field allowed calculation of solid state chemical shifts at reasonable
Table 2. MM force fields reported for calculations of carbohydrates.
Name | Description | Implementation a | Ref.
MM3, MM3(1992), MM3(1996), MM3(2000) | 2nd generation molecular mechanics force field for C, H, O and N atoms. It has been extensively used for carbohydrates. The MM3 force field takes into account the stretching, bending, stretch-bending, torsional and dipolar contributions and van der Waals interactions. It accounts for the anomeric and the exo-anomeric effects and has some provisions for estimation of hydrogen bonding159. | GAUSSIAN134, 135, PCModel160, Tinker161 | 162
MM+(91), MM+ | A variant of MM2 combining a functional form from MM2(77) and parameterization from MM2(91) with a number of extensions. | HyperChem139, 140 | 163
CHARMM, CHARMM22, CHARMM27 | Chemistry at Harvard macromolecular mechanics, a family of classical force fields for the calculation of macromolecules using molecular dynamics, and an associated software package. CHARMM22, originally designed for proteins, was parameterized for an explicit water model. CHARMM27 was reported to be suitable for sugars within nucleic acids. | CHARMM164, GROMACS165, 166, Tinker161 | 164, 165, 167, 168; 169 (review)
- | All-atom additive empirical force field consistent with CHARMM and parameterized for the hexopyranose monosaccharides and linkages between them. | CHARMM | 170, 171
- | Parameterization of the additive all-atom CHARMM force field for acyclic polyalcohols, acyclic carbohydrates, and inositol. | CHARMM | 172
PARM22/SU01 | CHARMM22 modified for pyranosidic carbohydrates. | CHARMM | 173
HSEA | Hard sphere approach with consideration of the exo-anomeric effect. It was shown to be able to predict the 3D structure and conformation of large oligosaccharides. | GESA, GEGOP | 174
CHEAT95 | Extended atom force field for hydrated oligosaccharides, a modification of CHARMM22 with a special atom type to account for hydrogen bonding. | CHARMM | 175
HGFB | A revised CHARMM-type molecular mechanics potential energy function specially developed for use in the dynamical simulation of simple carbohydrates in aqueous solution. The force field was shown to represent the vibrational spectrum and ring pucker of pyranoses. | CHARMM (?) | 176
PHLB | Molecular dynamics force field aimed at correcting the unrealistic flexibility of the HGFB carbohydrate model. Specific dihedral angle terms are parameterized to reproduce experimental vibrational frequency data and small molecule ab initio dihedral angle rotational energy profiles. | CHARMM | 177
GLYCAM_93, GLYCAM2000, GLYCAM06 | This generalizable biomolecular force field was initially designed to add carbohydrate functionality to AMBER. Later this dependence was removed, as well as all general or default parameters, and explicit water was accounted for. | AMBER178, 179 | 180, 181
GROMOS | This classical general-purpose force field associated with an MD simulation software package for the study of biomolecules (A-version) has been developed for application to aqueous or nonpolar solutions of proteins, nucleotides and sugars. A gas phase version (B-version) for simulation of isolated molecules is also available. | GROMOS182-184, GROMACS | 185, 186
45A4 | This parameter set, based on GROMOS, was developed for the explicit solvent simulation of hexopyranose-based carbohydrates. | GROMOS (?) | 187
OPLS-AA | Originally designed as optimized potentials for liquid simulations (all-atom), it was later extended for carbohydrates and parameterized to reproduce the ab initio calculation of energies of 4C1 pyranoses with explicit water. | MOE, Tinker, Towhee | 188
COSMOS-NMR | Hybrid QM/MM force field that uses localized bond orbitals with fast BPT formalism for semi-empirical calculation of atomic charges and NMR parameters. It was adapted to a variety of compounds including macromolecules and optimized for the NMR-based structure elucidation. Explicit quantum-mechanical calculation of electrostatic properties is utilized. | COSMOS132, 133 | 132
CSFF | A development of the PHLB and HGFB carbohydrate force fields optimized for carbohydrate solutions and having improved hydroxymethyl rotations. | CHARMM | 189
AMBER | A functional form from which a family of classical explicit-solvent force fields is derived for molecular dynamics of biomolecules (GAFF, GLYCAM). | MacroModel128, 129, AMBER178, 179, other | 190; 191 (review)
Amber-H | Derived from AMBER for conformational analysis of oligosaccharides. | Insight II192 | 193
BIO+ | A force field based on CHARMM22 and CHARMM27. | HyperChem139, 140 |
a Only implementations cited in carbohydrate studies are listed; other implementations used for carbohydrates implicitly are not covered here.
computational cost, as DFT methods under periodic boundary conditions demanded much higher computational power157. Later Sternberg and coworkers used the COSMOS force field in combination with 13C solid state chemical shift target functions to investigate the structure of cellulose I and II158. The parameters of the linear polarization model for BPT were determined from a least-squares fit to atomic charges in small molecules obtained by ab initio calculations using the 6-31G(d,p) basis set. The average deviation between calculated and experimental data, derived from reported chemical shifts, was 0.47 ppm, 0.89 ppm and 0.67 ppm for cellulose-II, cellulose-Iα and cellulose-Iβ, respectively.
Witter and coworkers investigated the spectrum assignment for 13C-enriched bacterial cellulose Iα194. The crystal structure was refined using the 13C NMR chemical shifts as target functions, giving a 0.37 Å RMS difference from the structure determined by neutron diffraction (for heavy atoms only). Starting with coordinates derived from neutron scattering, the MD simulations yielded four ensembles containing 800 structures. These four models were geometrically optimized with the given isotropic NMR chemical shift constraints and application of the crystallographic boundary conditions. 13C NMR chemical shift tensors were simulated for each model (using BPT with coordinate-dependent charges) and compared with the experimental chemical shift anisotropy information obtained by 2D iso-aniso RAI acquired at a magic angle spinning speed of 10 kHz. The calculations based on the COSMOS force field yielded isotropic chemical shifts with an average deviation of 0.59 ppm per resonance.
4.3. Ab initio and density functional modeling
A quantum chemistry modeling approach implies a combination of a theoretical method (level of theory) with a basis set. Each unique pairing of method with basis set represents a certain approximation of the Schrödinger equation. Results for different systems may only be compared when they have been predicted via the same model147. The more electron correlation is considered at a theory level and the bigger the basis set, the more accurate but also the more computationally expensive the calculation is. Hybrid functionals define the exchange functional as a linear combination of Hartree-Fock, local, and gradient-corrected exchange terms. The hybrid functionals most widely reported in structural studies of carbohydrates are Becke's three-parameter formulations (B3LYP195, 196 and B3PW91195, 197) and their modifications. A detailed description of functionals and basis sets is beyond the scope of this review and can be found elsewhere147.
During recent decades density functional theory (DFT) has gained increasing popularity in computations of various biomolecular systems. Good accuracy at a reasonable demand on computational resources is an important advantage of DFT calculations. Detailed descriptions of various functionals as well as of the scope and applications of DFT calculations have been published198-204. Time-dependent DFT was reported in the context of describing the interaction of electromagnetic fields with matter205, 206.
QM calculations are carried out in two stages to predict the NMR properties of molecules: 1) geometry optimization to obtain a three-dimensional structure; and 2) calculation of NMR parameters for that geometry. Very often different levels of theory are applied at these stages, and in most cases calculation of the NMR parameters (stage 2) requires a more sophisticated level than geometry optimization (stage 1). Choosing a proper combination of theory levels is an important question discussed in more detail below (sections 4.3 and 4.4).
Several computational approaches have been developed for prediction of the magnetic properties and NMR parameters. The gauge-independent atomic orbitals (GIAO) method for NMR shieldings proposed by Ditchfield in 1974207 implies that atomic orbitals have their own local gauge origins placed on the orbital center and defining the vector potential of the external magnetic field. Incorporation of such features of DFT as accurate non-local exchange-correlation functionals and bigger basis sets in GIAO calculations led to significant improvement of the shielding tensor calculation quality208.
Attempts to improve the efficiency of the magnetic property calculations have been undertaken by applying the gauge factors to localized molecular orbitals instead of every atomic orbital. These attempts were formalized in the individual gauge localized orbital (IGLO) method209 and the localized orbital/local origin (LORG) method210. The performance of IGLO was first studied on small organic molecules211, and later the method was combined with DFT calculations212.
A few studies reported the use of GIPAW for solid state chemical shift prediction in carbohydrates213. GIPAW is a theory for all-electron magnetic response within the pseudopotential approximation, based on an extension of Blöchl’s PAW approach. As a valuable feature, GIPAW is valid for both finite and periodic-boundary conditions214. Density functionals commonly used in GIPAW studies have been PBE215 and KT3216. The latter is a semiempirical exchange-correlation functional specially designed for the calculation of organic nuclei shielding tensors and reported to outperform hybrid functionals for molecules forming hydrogen bonds217.
Comparison of GIAO, IGLO and LORG calculations showed that GIAO is more efficient in terms of the required basis set and provides more accurate results218. GIAO internally extends the basis set with higher angular momentum orbitals, which are necessary for the correct description of the perturbed systems. In contrast, in the localized methods all atomic orbitals participating in a localized molecular orbital share the same gauge factor. As compared to localized methods, GIAO is less sensitive to the quality of the employed basis set, and thus provides faster convergence of the calculated chemical shielding and does not require polarization functions to achieve the same level of accuracy218.
Nowadays, the main drawback of GIAO as compared to the localized methods, i.e. lower calculation performance, has been largely compensated by the development of computer hardware. The performance of modern desktop computers is now sufficient to predict NMR properties of small and medium sized molecular systems with reasonable accuracy. As a result, GIAO calculations, combined with density functional theory since the early 1990s219, are often used to predict NMR properties of organic and biomolecular systems.
An important issue for reliable computational prediction of the NMR parameters is the selection of a proper theory level for geometry optimization. The NMR shielding tensor is a property that can be computed in the context of a single point energy calculation. HF/6-31G(d) on a geometry optimized with B3LYP/6-31G(d) was cited as the minimal model for predicting the NMR parameters220. Due to hydrogen bonding, the basis set properly describing energies of carbohydrates should include diffuse functions; B3LYP/6-311++G(2d,2p) was reported as minimal for accurate description of aldo- and keto-hexoses in both furanose and pyranose forms221.
Reduction of the scaling of QM calculations to lower powers of molecular size has been a challenge. It became possible to linearize the scaling for the geometrical222 and energetic (DFT) calculations223. Within a method for the calculation of NMR chemical shielding introduced by Ochsenfeld and coworkers224,
the cubic increase of the computational effort with molecular size is reduced to linear. This allowed treatment of large molecules (>1000 atoms with no need for molecular symmetry) at the HF and DFT levels.
According to a survey of approaches to chemical shielding tensor (CST) calculation by Sefzik and coworkers225, in most cases none of the DFT functionals performed better than HF in the calculation of chemical shielding tensor components in eight solid state
Fig. 6. The structure (A) and 13C NMR chemical shift surface for the anomeric carbon at the glycosidic bond (B) of α-D-Glcp-(1-4)-α-D-Glcp disaccharide in water obtained using the ONIOM(DFT:HF) method226. Reproduced with permission, © Elsevier Ltd., 2009.
1-methylpyranosides, erythritol and sucrose; however, absolute values were close to the experiment. The cc-pVDZ and cc-pVTZ basis sets were used.
A number of other methods to predict chemical shifts using quantum-mechanical calculations were summarized by Gregor and Mauri227. General topics related to the calculation of magnetic properties and NMR parameters are well reviewed in the scientific literature61, 64, 228, 229. The main scope of the present review is the NMR computational studies of carbohydrates and their limitations.
4.4. Hybrid QM/MM, QM/QM and ONIOM approaches
Recent development of hybrid theoretical approaches has made it possible to divide large molecular systems into several subsystems (layers) and to treat them at different levels230-233. In these hybrid calculations the most important and relatively small part of the molecule (higher layer) is treated at more accurate quantum mechanical theory levels, whereas other parts of the molecule are treated at less computationally demanding levels, such as MM or low-level QM. The molecules or molecular systems are usually partitioned into two (high and low) or three (high, medium and low) subsystems. In the two-layer approach the resulting hybrid methods are denoted as QM/MM, QM/QM, ONIOM(QM:MM) or ONIOM(QM:QM)234. In the three-layer approach the system of interest can be described as ONIOM(QM:QM:MM) with several combinations of theory levels for different layers.
Utilization of hybrid approaches significantly speeds up the calculation and overcomes the size limitation in computational studies. In the best case a hybrid approach combines the accuracy of high-level QM calculations with the speed of relatively fast low-level methods (MM, etc.). The scope and limitations of hybrid approaches for studying organic and biomolecular systems were reviewed in the literature230-233, including the description of developed computational tools235. ONIOM(DFT:MM) and ONIOM(DFT:HF) calculations have shown excellent performance in structure optimization and energy calculations, particularly for derivation of chemical shift surfaces of glycosidic bond carbons (example in Fig. 6, discussed below)226.
Two general strategies are explored in modern carbohydrate studies involving hybrid calculations. The first strategy is based on hybrid calculations only at the geometry optimization step, followed by derivation of the NMR properties with regular methods and treatment of the whole molecule at the same level (usually the highest QM level achievable with existing computational resources). This approach benefits mainly from the performance gain at the stage of molecular structure optimization. It is a reasonable and very useful combination since geometry optimization is often much more time-consuming than GIAO calculations of chemical shifts236.
The second strategy allows utilization of the hybrid approach features both in geometry optimization and in magnetic properties calculation. Morokuma and coworkers have demonstrated the
efficiency of the hybrid approach for calculating NMR chemical shifts using the two-layer ONIOM scheme237. In these calculations the small (model) system containing the atoms of interest was described at a higher level of theory, and the rest of the molecule was described at a lower level. The resulting shieldings were expressed as: σiso[ONIOM] = σiso(high level, model) + σiso(low level, whole molecule) − σiso(low level, model). A general recommendation for molecule partitioning is that a minimal model system for the NMR property calculation should include the nucleus for which high accuracy is needed and its closest heavy neighbors237.
The use of a combined QM/MM method for the validation of the geometrical modeling of the complex of E-selectin with sialyl Lewis X was reported by Ishida238. A combined modeling was proposed to identify complex sugar-chain conformations on the reduced free energy surface. The free energy profile was evaluated by classical MD simulation followed by ab initio QM/MM energy corrections. Flexible carbohydrate structures were mapped onto the reduced QM/MM 2D free energy surface, and the details of molecular interactions between each monosaccharide component and the amino acid residues at the
This journal is © The Royal Society of Chemistry 2013 Chemical Society Reviews, 2013, 0, 00–00 | 13
carbohydrate-recognition domain were identified. Using the computational procedure of the chemical shielding tensor evaluation239 the calculations for large molecules including a carbohydrate ligand were performed. This study confirmed the
modeling validity by evaluation of the 1H NMR chemical shifts 5
by ab initio QM/MM-GIAO computations at HF/6-31G*. 20 QM/MM-refined geometries sampled from the minimum free
Fig. 7. Two conformations of the IdoA2S residue of a heparin disaccharide in water: 1C4 (A) and 2S0 (B). The GlcN6S residue is in the 4C1 form. Violet dots represent sodium ions. Only a part of water molecules is shown for clarity240. Reproduced with permission, © American Chemical Society, 2011. 10
energy region in the free energy surface were used, and the averaged theoretical data were compared to the experimental NMR spectrum238. Although most proton chemical shifts were reasonably assigned by QM/MM-GIAO averaging, some resonances showed an upfield shift by 0.2-0.3 ppm, as compared 15
to the experiment. Most of these deviations were observed when monosaccharide units were exposed to a solvent-accessible region and had a relatively high flexibility. The study confirmed excellent potential of hybrid approach to study carbohydrates, as well as it pointed out the necessity of more accurate consideration 20
of solvent effects.
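The two-layer ONIOM shielding expression quoted earlier in this section reduces to simple arithmetic once the three component shieldings are available. The following minimal Python sketch illustrates this; the function name and all numerical values are hypothetical and are not taken from the cited studies.

```python
def oniom_shielding(high_model: float, low_real: float, low_model: float) -> float:
    """Two-layer ONIOM extrapolation of an isotropic shielding (ppm).

    high_model: high-level result for the small model system
    low_real:   low-level result for the whole molecule
    low_model:  low-level result for the small model system
    """
    return high_model + low_real - low_model


# Hypothetical isotropic shieldings (ppm) of one carbon nucleus.
sigma_high_model = 58.2  # e.g. MP2-GIAO on the model system
sigma_low_real = 66.9    # e.g. HF-GIAO on the whole molecule
sigma_low_model = 64.1   # e.g. HF-GIAO on the model system

sigma_oniom = oniom_shielding(sigma_high_model, sigma_low_real, sigma_low_model)
print(f"sigma_iso[ONIOM] = {sigma_oniom:.1f} ppm")
```

The low-level difference (whole molecule minus model) acts as an environment correction to the high-level model result, so the expression returns the low-level whole-molecule value whenever both levels agree on the model system.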
4.5. Interaction with solvent
The ability of carbohydrates, especially polysaccharides, to adopt a wide range of dynamic conformations in solution has been recognized as central to many of their biological functions, and thus interaction with solvent cannot be neglected. Not only the NMR properties but also the geometry should be simulated with consideration of the solvent effects. The multitude of hydroxyl groups present in carbohydrates leads to a noticeable contribution of the solvent-solute interaction and introduces visible differences between solution and X-ray structures36.
The structure of carbohydrates in solution is strongly influenced by the solvent, which in most cases is water. In classical simulations water is often represented using a three-site (TIP3P), a four-site or a five-site water model241. As implemented in CHARMM, the TIP3P model represents each atom in a water molecule by a point charge and a Lennard-Jones potential energy term, and the algorithm used does not allow the water molecule geometry to change throughout the simulation. Simple water models predominate in MD studies due to faster calculation and better correspondence with existing force fields242.
In contrast to rigid and non-polar molecules, carbohydrates possess strong and specific solute–solvent interactions due to hydrogen bonding and have conformational degrees of freedom, possibly with solvent-dependent distribution. Due to these factors the full dynamics of carbohydrate molecules in solution is a challenging topic243. A common approach to the description of the dynamics is to run an MD simulation of the solute surrounded by solvent molecules and then extract snapshots from the trajectory. Calculation of the NMR properties then implies averaging over these molecular clusters as well244. However, an MD simulation able to converge the rotamer populations of the exocyclic C-C torsions in the presence of solvent requires a timescale of more than 100 ns245. This timescale is longer than a reasonable computational cost allows44.
Fig. 8. 13C NMR chemical shift surfaces for two transglycosidic carbons of α-(1-4)-linked D-Glcp disaccharides, as a function of the glycosidic bond dihedrals151. Reproduced with permission, © Elsevier Ltd., 2005.
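The snapshot averaging described above can be sketched in a few lines of Python. The example below is only an illustration: the shielding values are hypothetical, and the convergence criterion (spread of the last few running means below a tolerance) is an assumption made for this sketch, not a criterion taken from the cited work.

```python
from statistics import mean


def running_means(values):
    """Cumulative mean after each snapshot, in trajectory order."""
    out, total = [], 0.0
    for n, v in enumerate(values, start=1):
        total += v
        out.append(total / n)
    return out


def converged(values, window=3, tol=0.05):
    """True if the last `window` running means span less than `tol` ppm."""
    rm = running_means(values)
    if len(rm) < window:
        return False
    tail = rm[-window:]
    return max(tail) - min(tail) < tol


# Hypothetical 13C isotropic shieldings (ppm) of one nucleus,
# one value per MD snapshot submitted to a QM shielding calculation.
snapshots = [98.4, 97.1, 99.0, 98.2, 97.8, 98.5, 98.1, 98.3]

average_shielding = mean(snapshots)
print(f"averaged shielding: {average_shielding:.3f} ppm")
print("running mean converged:", converged(snapshots))
```

In practice the number of snapshots is increased until such a running average stabilizes, which is a much weaker requirement than converging the underlying rotamer populations in the MD simulation itself.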
For quantum chemical calculations at the HF and DFT levels, a number of solvation models have emerged. To improve performance, the polarizable continuum model (PCM) represents the solvent as a continuum rather than as individual molecules246. Several modifications of the continuum model, differing in the interpretation of the solvent electric conductivity, led to the development of the DPCM (solvent is treated as a dielectric) and CPCM (solvent is treated as a conductor) models247. The performance of continuum models in various solvents and their influence on geometry optimization of solute molecules have been addressed248, 249. Marenich and coworkers presented several solvent-independent continuum solvation models, including those based on the quantum mechanical charge density of a solute and parameterized for various organic compounds250, 251. Among them, SM8 was claimed to be the most accurate continuum solvation model for prediction of the free energies of solvation of molecular solutes252.
The conductor-like screening model (COSMO) of solvation treats the solvent as a conducting continuum located outside the molecular cavity. The shape of the cavity depends on the particular implementation of the method and is usually constructed from the van der Waals radii of the atoms of the modeled compound. In contrast to PCM, COSMO derives the solvent polarization from the distribution of the electric charge of the solute. It is more accurate for solvents with higher permittivity, such as water, which are better approximated as a conductor253.
Bagno and coworkers tested the QM prediction of the NMR parameters of glucose in water for snapshots taken from an MD simulation of the target molecule with a water sphere of up to 5.5 Å. Application of COSMO at the last step of the DFT processing did not noticeably affect the accuracy of the chemical shift calculations254.
An explicit solvent model is physically appropriate for charged molecules with strong solute-solvent interactions255. As an example, explicit inclusion of water molecules and counterions allowed a comparative study of conformational, solvent, and counterion effects on coupling constants in a heparin unit (Fig. 7)240. An example of explicit inclusion of water in HF GIAO calculations of a large molecular system within a linear-scaling method has been described224. Hybrid implicit/explicit solvation was investigated by Lee and coworkers; it implies explicit hydration of the solute by a layer or a sphere of water molecules, while the bulk solvent is modeled as a continuum256. The ONIOM-PCM approach provides a good opportunity for investigation of hybrid solvation models257.
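The explicit part of such a hybrid model requires a rule for deciding which water molecules to keep. A minimal Python sketch of one possible rule is given below: waters are retained if their oxygen lies within a cutoff of the solute center of mass (the 5.5 Å default mirrors the sphere size mentioned above). The function names and the toy coordinates are hypothetical; a real workflow would read the coordinates from an MD trajectory.

```python
import math


def center_of_mass(atoms):
    """atoms: list of (mass, (x, y, z)); returns the mass-weighted center."""
    total = sum(m for m, _ in atoms)
    return tuple(sum(m * xyz[i] for m, xyz in atoms) / total for i in range(3))


def waters_within(solute_atoms, water_oxygens, cutoff=5.5):
    """Indices of waters whose oxygen is within `cutoff` (Å) of the solute COM."""
    com = center_of_mass(solute_atoms)
    return [i for i, o in enumerate(water_oxygens) if math.dist(com, o) <= cutoff]


# Hypothetical toy system (coordinates in Å): a two-atom "solute" and three waters.
solute = [(12.011, (0.0, 0.0, 0.0)), (15.999, (1.4, 0.0, 0.0))]
water_oxygens = [(3.0, 0.0, 0.0), (0.0, 6.5, 0.0), (-2.0, -2.0, 1.0)]

explicit_shell = waters_within(solute, water_oxygens)
print("waters kept explicitly:", explicit_shell)
```

Waters outside the sphere would then be dropped and their effect represented by the continuum model.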
For further details of particular solvation methods and their scope and limitations, please refer to the dedicated publications242, 258.
5. Computation of NMR chemical shifts
Chemical shifts have been recognized as characteristic indicators of primary and regular secondary structure of carbohydrates. This section summarizes recent applications of semi-empirical and quantum chemical computations to the prediction of the NMR shielding parameters in glycans and their derivatives. Techniques used to calculate chemical shift tensors in general organic chemistry were reviewed elsewhere259.
It should be noted that the direct output of chemical shift calculations (e.g. GIAO) is an anisotropic chemical shielding tensor, which can later be converted to the isotropic chemical shielding observed in liquids: σiso = (σ11 + σ22 + σ33)/3, where σii are the principal components of the magnetic shielding tensor expressed along three orthogonal axes in the molecule. The chemical shift is expressed as the difference between the shielding of a reference compound (normally TMS, processed at the same level of theory as the target molecule) and the calculated shielding. The conversion of the shielding tensor to the isotropic chemical shift is often implemented in programs providing the interface to quantum chemical software packages.
The term chemical shift surface (CSS, example in Fig. 8, discussed below) is used to reflect the dependence of the chemical shifts of the atoms in close proximity to the glycosidic bond on its φ and ψ torsion angles.
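The conversion from a computed shielding tensor to a chemical shift described above can be illustrated with a short Python sketch. Because the trace of a tensor does not depend on the choice of axes, the mean of the three diagonal elements equals the mean of the principal components. The tensor and the reference shielding below are hypothetical values chosen for illustration only.

```python
def isotropic_shielding(tensor):
    """One third of the trace of a 3x3 shielding tensor (ppm)."""
    return (tensor[0][0] + tensor[1][1] + tensor[2][2]) / 3.0


def chemical_shift(sigma_iso, sigma_ref):
    """Chemical shift (ppm) relative to a reference computed at the same level."""
    return sigma_ref - sigma_iso


# Hypothetical 13C shielding tensor (ppm) in an arbitrary molecular frame.
tensor = [
    [110.0, 4.0, -2.0],
    [3.0, 95.0, 1.5],
    [-1.0, 2.0, 80.0],
]
# Hypothetical isotropic shielding of the reference carbon (e.g. TMS).
sigma_ref = 183.0

sigma_iso = isotropic_shielding(tensor)
delta = chemical_shift(sigma_iso, sigma_ref)
print(f"sigma_iso = {sigma_iso:.1f} ppm, delta = {delta:.1f} ppm")
```

Any error in the reference shielding propagates as a constant offset to all predicted shifts of that nucleus type, which is why the reference is computed at the same level of theory as the target molecule.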
5.1. Monosaccharides and derivatives
The following analysis of available literature data and the corresponding discussion are ordered by increasing complexity of the studied system. The current section covers the results obtained for monosaccharides and their derivatives containing a single sugar ring, where the basic properties and relationships between the NMR data and molecular structure can be revealed (Table 3).
Table 3. GIAO prediction of chemical shifts in monosaccharides, their derivatives and conjugates

| Object (molecule) | Parameter a: nuclei | Calculation method: geometry | Calculation method: shielding | Software | Application | Ref. |
|---|---|---|---|---|---|---|
| α-D-Glcp, β-D-Glcp (population-weighted conformers in aqueous solution) | CS: 1H, 13C | B3LYP/6-31G(d,p); solvation energies: B3LYP/6-311++G(2d,2p) | B3LYP/pcJ | Gaussian 03 (refs. 134, 135) | analysis of the experimental data | 260 |
| β-D-Glcp (five conformers) | CS: 1H, 13C, 17O | MP2/cc-pVDZ | ONIOM [MP2 : HF/6-311++G(2d,2p)] | Gaussian 98, TURBOMOLE (refs. 261, 262) | validation of ONIOM and providing guidelines for the selection of ONIOM model systems | 263 |
| α-D-Glcp, β-D-Glcp | CS: 1H, 13C, 17O | MM+ | B3PW91/6-31+G(d), RHF | Gaussian 94, HyperChem 4.5 (refs. 139, 140) | validation of a DFT GIAO calculation on an MM+ geometry | 264 |
| β-D-Glcp | CS: 13C (solid state) | X-ray | B3LYP/6-311+G(2d,p) (GIAO-CHF procedure) | Gaussian 03 | theoretical investigation of effects of the conformation and hydrogen bonding on 13C isotropic chemical shifts | 265 |
| α-D-GlcpNH3+-1,4Me2 (chitosan monomer model) | CST: 1H, 15N, 17O (solid state) | X-ray; B3LYP/6-31++G(d,p) for protons only | B3LYP/6-311++G(d,p), B3LYP/6-31++G(d,p) | Gaussian 98 | investigation of the hydrogen-bonding effects on the CS tensors | 266 |
| α-D-GlcpN (chitosan monomer) | CST: 1H, 15N, 17O (solid state) | X-ray; B3LYP/6-31++G(d,p) for hydrogens only | B3LYP/6-311++G(d,p), B3LYP/6-31++G(d,p) | Gaussian 98 | investigation of the hydrogen-bonding effects on the CS tensors of an anhydrous crystalline structure | 267 |
| α-D-Glcp (gas phase) | CS: 1H, 13C | B3LYP/6-31G(d,p), B3LYP/6-31+G(d,p) | B3LYP/cc-pVTZ; B3LYP/aug-cc-pVTZ | Gaussian 03 (QM calculations); MOSCITO (refs. 130, 131) (MD simulations) | investigation of the solvent effects and comparison of calculation methods | 254 |
| α-D-Glcp (in aqueous solution) | CS: 1H, 13C | B3LYP/6-31G(d,p), MD (OPLS-AA-type) | glucose: B3LYP/cc-pVTZ; BP86/TZ2P; water: TZP.1s; B3LYP/6-31G** | same as above | same as above | 254 |
| (PhO)2-P-6)-α-D-Glcp-1OMe, PhO-P-6)-α-D-Glcp-1OMe, PhO-P-6)[L-Gly(1-3)]-α-D-Glcp-1OMe | CS: 13C, 1H, 31P | B3LYP/6-31G(d) | B3LYP/6-31G(d) | Gaussian 03 | clarifying the structural details of the synthesized esterified methyl α-D-glucopyranoside derivatives | 268 |
| β-D-Fucp × toluene, β-D-Glcp × 3-methylindole, β-D-Glcp × p-hydroxytoluene | CS: 1H | DFT-D BLYP/TZV2D | BLYP/TZV2D | Gaussian 03 (modified) | investigation of the carbohydrate–protein recognition on models | 269 |
| α-D-Lyxp-OMe, α-D-Lyxp-OMe × (H2O)1-3 | 13C (solid state, shielding constants) | X-ray data + PM3 | B3LYP/6-31G(d) | Gaussian 98 (QM calculations), HyperChem 5.02 (geometry) | confirmation of the 13C CP/MAS NMR and crystal structure analysis data and studies of the hydrogen bonding effects | 270 |
| α-D-Galp | CS: 1H, 13C, 17O (solid state) | PBE/planewave, KT3/planewave, Vanderbilt's "ultrasoft" pseudopotentials | GIPAW PBE/planewave, KT3/planewave | CASTEP (refs. 271-273) | distinguishing hydrogen bonding network patterns by 1H chemical shift analysis; comparison of PBE and KT3 | 274 |
| (PhB) β-D-Ribp2,4H-2, (PhB)2 β-D-ArapH-4, (PhB)2 α-D-XylfH-4, (PhB) α-D-Lyxf2,3H-2 (phenylboronic esters) | 13C (shielding constants) | B3LYP/6-31+G(2d,p), PCM | PBE1PBE/6-311++G(2d,p), PCM | Gaussian 03 | validation of a QM method as a tool for 13C NMR chemical shift prediction | 275 |
| α-D-Lyxf, α-D-Lyxp 1C4, α-D-Lyxp 4C1, α-D-Glcp 4C1, α-D-Glcf | CS: 13C, 1H (α-D-Glcp) | BP86/TZVP, B3LYP/TZVP, MP2/TZVP, AM1 | BP86/TZVP, B3LYP/TZVP, MP2/TZVP, HF SCF/TZVP | TURBOMOLE (QM calculations), HyperChem (semi-empirical calculations) | comparison and validation of theory levels and solvent models; the study of the C6-O6 torsion effect on 13C NMR chemical shifts | 276 |
| D-Manp-1OMe, D-Galp-1OMe, D-Glcp-1OMe, D-Xylp-1OMe, D-Frup-1OMe, L-Sorp-1OMe, L-Rhap-1OMe, erythritol (statistical study) | CST: 13C (solid state) | neutron diffraction data | RHF, HFB, HFS, BLYP, B3LYP, B3P86, BVWN, SVWN, MPW1PW91 with cc-pVDZ and cc-pVTZ | Gaussian 03 | comparison of DFT and HF functionals | 225 |
| P-3:5)β-D-Ribf-1U (cUMP in aqueous solution) | CS: 1H, 13C | B3LYP/6-31G(d,p) | B3LYP/cc-pVTZ | Gaussian 03 | testing the prediction method and selection of the appropriate solvent model | 277 |
| β-D-2-deoxy-Ribf-A, β-D-2-deoxy-Ribf-G, β-D-2-deoxy-Ribf-C, β-D-2-deoxy-Ribf-T | CST: 13C (C1′), 15N (sugar-linked nitrogen) | B3LYP/6-31G(d,p) | B3LYP/(9s,5p,1d/5s,1p) [6s,4p,1d/3s,1p] for C, N and O; B3LYP/(5s,1p) [3s,1p] for H (IGLO II) | Gaussian 03 | studying the dependence of N1/9 and C1′ chemical shielding tensors on the C1′-N torsion angle and sugar pucker | 278 |
| β-D-Glcp, α-D-Glcp, β-D-Glcp-2,3,6Ac, α-D-Glcp-2,3,6Ac | CS: 13C (C1) (solid state) | B3LYP/6-311+G(d,p) | B3LYP/6-311+G(d,p) | Gaussian 03 | studying molecular environment in the chiral cavities of commercial polysaccharide-based sorbents (CDMPC, ADMPC, ASMBC) | 279 |
| β-D-Xylp-OMe | CS: 13C (all); CST: 13C (all), O1, O5, H1 | BP86/TZVP, MM2 | PW91/IGLO-III | deMon-KS, deMon-NMR (refs. 280-282), MacroModel V5.0 (refs. 128, 129) (MM calculations) | calculation of chemical shielding dependence on the dihedral angle between C1 and methyl group | 283 |
| α-D-Xylp-OMe | CS: 13C (all); CST: 13C (all) | PW91/TZVP, MM | PW91/B-III | same as above | same as above | 284 |
| R-(1-4)-3,6-anhydro-α-D-Galp-OMe, R-(1-4)-3,6-anhydro-D-Gal-ol (R = 3,4-dideoxy-β-D-erythro-hexopyranose) | CS: 1H, 13C | MM3, B3LYP/6-31+G** (selected conformers) | GIAO | modified MM3 (1992-2000) (refs. 162, 285) (MM calculations), Gaussian 98W (QM calculations) | study of signal displacement upon transition from a pyranose to the open form of anhGal | 286 |

a Notations: CS – chemical shift; CST – chemical shift tensor.
Roslund and coworkers performed a complete assignment of the NMR spectra of α- and β-D-glucopyranose by iterative fitting using the PERCHit software. To support the experimental data they calculated the 13C NMR chemical shifts and the 1H NMR chemical shifts of the non-hydroxyl protons of glucose at the B3LYP/pcJ-2 (NMR data) and B3LYP/6-31G(d,p) (geometry) levels of theory. The authors chose a set of three conformers for α-D-glucose and five for β-D-glucose as the most stable in aqueous solution. They obtained the relative stability of the conformers (∆G° + solvation energies) and estimated their populations assuming a Boltzmann conformer distribution. The correlation between the population-weighted averages of the calculated 1H NMR chemical shifts and the corresponding experimental values was surprisingly good (linear correlation 0.976-0.977; MAD 0.11 ppm for α-D-Glcp and 0.07 ppm for β-D-Glcp). The correlation factor of calculated vs. experimental 13C NMR chemical shifts was also good (0.994-0.995); however, the calculated spectrum was systematically ~10 ppm downfield260. The coupling constants in glucose were also predicted (see details in section 6.1).
Rickard and coworkers263 compared 13C, 1H, and 17O NMR chemical shifts obtained by HF-GIAO, MP2-GIAO, and ONIOM(MP2-GIAO:HF-GIAO) for the five most stable conformers of β-D-Glcp and provided sample model systems for use in post-HF chemical shift predictions of larger carbohydrates. Six small model systems including 6 or 7 heavy atoms were taken out of the whole molecule of each conformer without changing its geometry. Severed bonds were saturated with hydrogens.
The results from HF-GIAO and MP2-GIAO differed dramatically, especially for the anomeric carbon and the ring oxygen. ONIOM(MP2-GIAO:HF-GIAO) with a three-carbon model system was able to yield chemical shieldings in good agreement with the results from the whole-molecule MP2-GIAO calculations, except for the ring oxygen. Maximal discrepancies for the 4C1 conformers were 2 ppm for 13C (C5), 0.09 ppm for 1H (hydroxyl group at C4) and 2.21 ppm for hydroxyl 17O (hydroxymethyl group). Maximal discrepancies for the 1C4 conformers were 1.15 ppm for 13C and 0.29 ppm for 1H (anomeric hydroxyl group).
The results for the ring oxygen in the 4C1 conformer indicated that a small model system, in which the severed bonds are only one bond away from the atom under calculation, was insufficient to model the shielding of this atom. In contrast, the 1C4 conformer exhibited good agreement for the ring oxygen chemical shift and poor agreement for hydroxyl oxygens that formed hydrogen bonds to non-neighboring centers. To resolve these issues the authors used a 9-atom model system and decreased the discrepancy to 1.40 ppm (4C1 ring oxygen) and to less than 0.5 ppm (1C4 hydroxyl oxygens). The authors concluded that a model system should preserve outgoing hydrogen bonds for the accurate prediction of the oxygen chemical shifts.
The best correlation between experimental and calculated 13C NMR chemical shifts was achieved for the 4C1 G+ and 4C1 G- conformers (see Fig. 2), as expected from the predominance of these two forms in aqueous solution287. Both the MP2 and ONIOM(MP2-GIAO:HF-GIAO) levels were found to represent proton and carbon chemical shifts well, whereas the gas-phase predictions of the oxygen chemical shifts correlated much more poorly with the solution experimental data263. These calculations confirmed the earlier work by Kupka et al., in which DFT GIAO demonstrated better convergence than the RHF method in application to 1H, 13C and 17O chemical shifts of glucopyranose and its 1-C-methyl and 1-O-methyl derivatives264.
Hydrogen bonds play an important role in shaping polysaccharide molecules, and their characterization can reveal biological properties of polysaccharides266, such as recognition of carbohydrate antigens by host antibodies. The nature of hydrogen bonds is strongly dependent on the electrostatic interaction, and the chemical shielding tensors at the magnetic nuclei were shown to be highly sensitive to hydrogen bond effects. High-level QM calculations were essential for the interpretation of the experimentally observed isotropic chemical shifts. It was suggested that analysis of the hydrogen bonding network with optimized proton positions and subsequent 1H chemical shift prediction can not only confirm, but also reveal a carbohydrate structure274.
Suzuki and coworkers conducted a theoretical investigation of the effects of conformation and hydrogen bonding on solid-state isotropic 13C NMR chemical shifts for β-D-glucose and its oligomers. The absolute values of the chemical shifts predicted for a β-D-Glcp molecule extracted from the X-ray structure without further optimization showed a systematic offset from the experimental CP/MAS 13C NMR data, but the relative resonance positions were in reasonable agreement. The experimental linear relationship288 between the C6 chemical shift and the C6-O6 torsion angle in the three predominant conformations of the hydroxymethyl group was reproduced computationally, as were the dependencies for C4 and C5. In order to examine the effect of intramolecular hydrogen bonding on 13C NMR chemical shifts in D-glucose (in the gt conformation, see Fig. 2), the authors calculated the chemical shifts of the ring carbons as a function of the torsion angle around the C3-O3 bond. C2 and C4 showed a strong dependency, which was explained by the γ-gauche effect produced by the hydrogen atom. In contrast to the well-known γC-gauche effect of approx. -5 ppm289, the γH-gauche effect increased the 13C NMR chemical shift by +3 to +5 ppm unless reduced by the formation of intramolecular hydrogen bonds265. The effects of various possible hydrogen bonds (including those with the hydroxymethyl group) on the chemical shifts of all carbons in β-D-Glcp were analyzed.
Khodaei and coworkers conducted a DFT study to calculate the solid-state NMR parameters in crystalline chitosan/HI type I salt and showed the hydrogen bonding effects on the CS tensors266. They calculated the CS tensors of 17O, 15N, 13C, and 1H nuclei for two model systems: the monomer (non-hydrogen-bonded α-D-GlcpNH3+-1,4Me2) and the target molecule in a cluster. Both models were created from the X-ray coordinates, with subsequent optimization of the proton positions at B3LYP/6-31++G(d,p). Esrafili et al. studied hydrogen bonding effects on the 17O, 15N, 13C and 1H CS tensors of crystalline anhydrous chitosan by comparing the chitosan hexameric cluster to the corresponding gas-phase monomer (α-D-GlcpN)267. Both studies were dedicated mainly to a cluster model, and the corresponding details are given in the next section.
Bagno and coworkers applied several computational protocols combining DFT and MD simulations to the prediction of the alkyl 1H and 13C NMR chemical shifts of α-D-glucose in water254. For gas-phase calculations, geometry optimizations were carried out at the B3LYP/6-31G** level. The B3LYP/6-31+G** level of theory was also used to test the effect of adding diffuse functions, which were reported to be very important for carbohydrates290. The NMR parameters were calculated using the adopted cc-pVTZ or aug-cc-pVTZ basis sets. The MAD, averaged over the three conformers according to their population distribution, amounted to 7.1 ppm (13C) and 0.14 ppm (1H). In spite of a satisfactory MAD on the absolute scale, the accuracy was insufficient to assign all signals. Although a good correlation was observed (R²=0.994), the 13C NMR chemical shifts were systematically overestimated. Having compared the calculated spectrum to the experimental one, the study showed that both the flexibility of the glucose molecule and the strong effect exerted by water should be taken into account.
In the case of solution-phase calculations, a structure can hardly be simulated by DFT calculations on the solute embedded in a small cluster of solvent molecules. The bias introduced by size effects and the flat potential energy surface is additionally complicated by a flexible solute, such as glucose. To separate structural and solvent effects, glucose shieldings were calculated at the B3LYP/cc-pVTZ level as averages over 50-100 MD snapshots (until convergence was reached) using a series of protocols, each of them emphasizing either the solute or the solvent.
In protocol a, the authors used the glucose geometry from the modified OPLS-AA force field calculation without explicit water molecules, but included the solvent effects using a PCM. Protocol b differed from protocol a by reoptimization of Glc at B3LYP/6-31G** prior to the NMR calculation. These two protocols allowed sampling of the conformations of the glucose hydroxyl groups and their rotameric distribution and included the solvent reaction field, but did not include specific solvent effects. Protocols c-f employed the geometry of glucose obtained from MD simulations. In protocol c the authors included water molecules surrounding glucose up to 5.5 Å from the glucose center of mass. Water molecules were modeled by TIP3P point charges and combined with glucose using ONIOM. In protocol d, glucose was simulated at BP86/TZ2P and water at BP86/TZP.1s. Protocol e utilized B3LYP/6-31G** for water and a PCM solvent; protocol f was the same as d plus a COSMO solvent.
The only protocol with DFT optimization of glucose, namely protocol b, demonstrated the best correlation for both 1H (R²=0.987) and 13C (R²=0.997) NMR chemical shift simulations, and also produced the lowest MAD for 13C (1.12 ppm). From these data the authors concluded that the most important factor affecting the accuracy of the computed 1H NMR chemical shifts was the solute geometry, while the solvent effect could be reasonably described by self-consistent reaction field models. As judged by comparison of the protocol performance, the glucose geometry could not be accurately modeled by MD simulations alone. Surprisingly, there was no need for explicit inclusion of water in the shielding constant calculation254. The 13C NMR chemical shifts exhibited only a minor dependence on the solvent.
Chelmeka and coworkers carried out DFT GIAO calculations of 1H, 13C and 31P NMR chemical shifts at the B3LYP/6-31G* level for three synthesized methyl α-D-glucopyranoside 6-phosphate derivatives esterified by phenyloxy groups and glycine. Based on the comparison of the calculated and experimental data, the authors concluded that the target compound bearing an amino group and a phosphate occurs in the neutral form rather than as a zwitterion, and selected one of two stereoisomers differing in the absolute configuration of the phosphorus atom. The details of the comparison, deviation values and correlation factors were not reported268.
Electronic structure calculations were performed on complexes
of β-D-fucose and β-D-glucose with toluene, p-hydroxytoluene and 3-methylindole in order to model carbohydrate–protein recognition269. The three aromatic molecules were used as analogues of phenylalanine, tyrosine and tryptophan, respectively. The work focused mainly on vibrational frequencies and energy predictions using a DFT model with an added empirical atom-atom dispersion term with r⁻⁶ distance dependence. The authors validated this combined model, known as DFT-D291, against MP2/aug-cc-pVTZ calculations and showed that the difference between DFT-D and high-level ab initio results was less than 1 kcal/mol for galactose-benzene and fucose-toluene complexes.
Proton chemical shifts for the geometries from DFT-D were calculated using DFT GIAO at the BLYP/TZV2D level and confirmed the observation that they are strongly perturbed by complexation with aromatic groups292. The reproduction of the experimental chemical shifts was poor; however, their values relative to those in the free sugars corresponded to the vibrational frequencies of the CH protons269.
Paradowska et al. utilized GIAO DFT calculations to confirm
the results of 13C CP/MAS NMR and crystal structure analysis of a series of methyl pentopyranosides. The authors used the GIAO CPHF approach at the B3LYP/6-31G* level to study the hydrogen bonding effect on α-D-Lyxp-OMe surrounded by water molecules forming mono- to tri-hydrates at C2, C3 and C4. The starting geometry of D-Lyxp-OMe was taken from the X-ray data and optimized with the PM3 semi-empirical method. The calculations yielded 13C shielding constants that correlated with the experimental chemical shifts with R² = 0.993 (isolated molecule) and R² = 0.992-0.997 (hydrates)270. The authors could not separate the effects produced by different hydrogen bonds but confirmed the increase of stability with every additional hydrogen bond observed earlier for α-D-lyxofuranoside293.
NMR observables of methyl D-xylopyranosides were predicted with the use of DFT for geometries optimized with a fixed dihedral angle φ of the C1-OMe bond283, 284. Comparison of the calculated chemical shifts with the experimental data for the α- and β-anomers, both in the solid state and in solution, allowed the authors to point out the basic dependencies of the chemical shifts on this dihedral angle. The derived dependence of 1J and 3J on the C1-O1 torsion angle was proposed as a conformational probe.
Reichvilser and coworkers studied four aldopentoses to test their suitability as linear linkers for the formation of covalent organic boronic ester networks. As judged by the X-ray structures of the reaction products with phenylboronic acid, arabinose and
xylose formed diesters, while lyxose and ribose formed 2,3- and 2,4-monoesters, respectively. 13C NMR shielding constants were calculated by DFT GIAO at the PBE1PBE/6-311++G(2d,p) level of theory in order to prove the applicability of QM methods to the prediction of 13C NMR chemical shifts of these and similar compounds. The structures were optimized at B3LYP/6-31+G(2d,p) with tight convergence criteria and an ultra-fine integration grid, and verified by frequency analyses. PCM was used to model solvation in DMSO during both geometry and NMR calculations. The authors achieved a linear correlation between experimental chemical shifts and predicted shielding constants (the correlation factor and numerical values of the shielding constants were not provided)275.
Taubert and coworkers studied the 13C NMR chemical shifts
for α-D-lyxofuranose, α-D-lyxopyranose 1C4, α-D-lyxopyranose 4C1, α-D-glucopyranose 4C1, and α-D-glucofuranose at ab initio and DFT levels of theory using the TZVP basis set276. Test calculations showed B3LYP/TZVP and BP86/TZVP to be cost-efficient levels of theory for calculation of the NMR chemical shifts in monosaccharides. Geometry and NMR parameter calculations were checked against ab initio HF SCF and MP2 predictions and X-ray data. The basis set convergence was checked on tetramethylsilane by employing a variety of basis sets, including large ones. Molecular structures and chemical shifts calculated at the B3LYP/TZVP level were similar to those obtained at the MP2 level (-0.6 to +0.6 ppm for pyranoses and +0.4 to +4.0 ppm for furanoses).
The MAD of the calculated 13C NMR chemical shifts (both at B3LYP and MP2; without solvent effects) from the measured values was 5.0-5.7 ppm, and 7.2 ppm at BP86. The authors pointed out that a better shielding reference, such as methanol, decreased the largest deviation to 4 ppm, and that subsequently adding an empirical constant shift (-1 to -2 ppm) to the calculated values improved the agreement further. As judged by the dedicated investigation of α-D-Glcf at fixed values of the C5–C6–O6–H dihedral, torsional movement of C6 introduced a chemical shift correction of up to ±2 ppm (to all carbon atoms) at those angles that existed in the equilibrium.
The authors also tested four explicit solvent models for α-D-Glcf: either a shell of 116 water molecules or a shell of the 11 water molecules forming hydrogen bonds with the solute, in each case either allowing or forbidding the whole system to relax after water addition. None of the models was good enough to reproduce the experimental 13C NMR chemical shifts with acceptable accuracy. The COSMO solvent model253 provided better results but still left a deviation of -9 ppm for C6, which made the model unsuitable.
As for 1H NMR chemical shifts, they were predicted for α-D-Glcp 4C1. The systematic deviation of ca. 3 ppm for hydroxyl protons was attributed to hydrogen bonding, whereas solvent effects on the 1H NMR chemical shifts of the aliphatic protons were small (less than 0.4 ppm, except 1.3 ppm for the anomeric proton)276.
Bagno and coworkers presented an experimental and quantum chemical NMR study of the mononucleotide cyclic
uridine monophosphate in water277. They calculated 1H and 13C NMR chemical shifts and 1H–1H, 13C–1H, 31P–13C and 31P–1H coupling constants using DFT. The NMR parameters and the conformer distribution were calculated at the B3LYP/cc-pVTZ level. The solvent reaction field was included using the PCM model for NMR only (protocol b), for both geometry and NMR (protocol c), or not at all (protocol a). cUMP has only two conformational degrees of freedom: the hydroxyl group on C2′ and the dihedral angle C6–N1–C1′–C2′ between the ribose residue and the nucleobase. Because of this, the search for conformers was done by scanning the potential energy surface rather than by a full MD simulation. After optimization at B3LYP/6-31G(d,p), the 24 structures obtained converged to three minima that were almost isoenergetic in aqueous solution.
Protocol c gave a good correlation with the experimental chemical shifts (R²=0.986 for 1H; R²=0.996 for 13C) and placed the signals in the correct order. Comparison of data from the different protocols showed that the solvent effects were essential for the calculation of the NMR properties but not important for the geometry optimization, as observed earlier for D-glucose in water254.
This study confirmed that the 1H and 13C spectra of polar, flexible molecules in aqueous solution can be predicted with the same accuracy as less complex systems. No explicit inclusion of water molecules was needed to achieve this accuracy, but the usage of PCM was necessary277. 20
Sychrovsky and coworkers applied QM methods to investigate the dependence of N1/N9 and C1′ CS tensors on the glycosidic torsion angle and sugar pucker in four standard
2′-deoxynucleosides (dAde, dGua, dCyt, dThy). The study aimed at prediction of the cross-correlated relaxation rates between the 25
shielding tensor of the sugar-linked nitrogen of a nucleobase and the C1′-H1′ dipole-dipole278 (see section 7). All geometrical parameters were gradient optimized at the B3LYP/6-31G(d,p) level, except the C1’-N torsion angle fixed at different values. The shielding tensors were calculated using 30
IGLO II basis sets specific for each nuclei and exhibited a significant degree of conformational dependence on C1’-N dihedral angle and sugar pucker. No numerical values for CST components and chemical shifts were provided except dependence of the isotropic 15N chemical shift on the glycosidic 35
torsion angle for C2’-endo and C3’-endo sugar puckers of every deoxynucleoside.
5.2. Oligosaccharides and polysaccharides
Combining monosaccharides into the more complex and diverse structures of oligo- and polysaccharides leads to various structural changes reflected by the NMR observables. Therefore, discussion of computational modeling and NMR structural studies of complex glycans and their derivatives (Table 4) is essential.
Table 4. Prediction of chemical shifts in oligo- and polysaccharides.
Object (molecule) | Parameter a: nuclei | Calculation method: geometry | Calculation method: shielding | Software | Application | Ref.
β-D-Glcp-(1-4)-β-D-Glcp (from 4-fold helical ASMBC); α-D-Glcp-(1-4)-α-D-Glcp (from 3-fold helical CDMPC) | CS: 13C (C1) (solid state) | B3LYP/6-311+G(d,p) | B3LYP/6-311+G(d,p) | Gaussian 03 (refs 134, 135) | studying molecular environment in the chiral cavities of commercial polysaccharide-based sorbents (CDMPC, ADMPC, ASMBC) | 279
β-D-Glcp-(1-1)-β-D-Glcp; α-D-Glcp-(1-1)-α-D-Glcp; α/β-D-Glcp-(1-2)-D-Glcp; α/β-D-Glcp-(1-3)-D-Glcp; α/β-D-Glcp-(1-4)-D-Glcp; α/β-D-GlcpNAc-(1-3)-L-Thr,1-NHMe,2Ac; α/β-D-GlcpNAc-(1-3)-L-Ser,1-NHMe,2Ac | CSS(φ,ψ): 13C (C1) | AM1 | HF/3-21G, HF/6-311G** | Tripos Sybyl 6.5 (ref. 294; model build), Spartan 5.0.1 (ref. 127; semi-empirical calculations), Gaussian 98 (QM calculations), Wolfram Mathematica 3.0 (ref. 295; trigonometric fit) | studying the dependence of the anomeric carbon chemical shift on the glycosidic bond dihedral angles in oligosaccharide and glycopeptide model compounds | 152
α-D-Glcp-(1-1)-α-D-Glcp; β-D-Glcp-(1-4)-β-D-Glcp; [-4)-β-D-Glcp-(1-]4; [-4)-β-D-Glcp-(1-]6; -4)-β-D-Glcp-(1-; -4)-α-D-Glcp-(1- | CSS(φ,ψ): 13C (glycosidic bond carbons) | AM1 (monosaccharides constrained in 4C1) | HF/3-21G, HF/6-311G** | Sybyl 6.5 (model build), Spartan 5.0.1 (semi-empirical calculations), Gaussian 98 (QM calculations), Mathematica 3.0 (trigonometric fit) | determination of a 3D structure | 151
D-Glcp-α-(1→4)-D-Glcp (α-, β-, γ-, ε-, and ι-cyclodextrins) | CSS(φ,ψ): 13C (C1) | X-ray, AM1 | HF/3-21G, 6-311G** | Sybyl 6.8 (model build), Spartan 5.0.1 (semi-empirical calculations), Gaussian 98 (QM calculations), AMBER 6.0 (refs 178, 179; MD simulations), Mathematica 3.0 (trigonometric fit) | testing the prediction methodology and computation of the anomeric carbon chemical shifts in cyclodextrins | 150
D-Glcp-α-(1→4)-D-Glcp (α-, β-, γ-, ε-, and ι-cyclodextrins) | CSS(φ,ψ): 13C (C1) | HF/6-31G*, ONIOM (B3LYP/6-31G*:HF/6-31G*) | B3LYP/6-31G*, ONIOM (B3LYP/6-31G*:HF/6-31G*) | Gaussian 03, Mathematica 3.0 (trigonometric fit) | computation of the anomeric carbon chemical shifts in cyclodextrins | 226
Table 4, continued
Object (molecule) | Parameter a: nuclei | Calculation method: geometry | Calculation method: shielding | Software | Application | Ref.
β-D-Glcp-(1-4)-β-D-Glcp (cellobiose) | CS: 13C | X-ray | B3LYP/6-311+G(2d,p) (GIAO-CHF procedure) | Gaussian 03 | theoretical investigation of effects of the conformation and hydrogen bonding on 13C isotropic chemical shifts | 265
β-D-Glcp-(1-4)-β-D-Glcp (cellobiose) | CS: 13C (C1-C4) | MD (GROMOS) | HF/6-31G(d) | Gaussian 09 (QM calculations), GROMACS 3.3 (ref. 166; MD simulations) | modelling of the conformational space of amorphous cellulose | 296
-4)-β-D-Glcp(1- (Iα and Iβ cellulose) | CS: 13C (solid state) | GIPAW PBE/planewave | mPW1PW91/6-31G(d) | VASP 5.4 (ref. 297; geometry), Gaussian 09 (NMR) | conformational studies of cellulose | 298
-4)-β-D-Glcp(1- (Iα and Iβ cellulose) | CS: 1H, 13C, 17O (solid state) | B3LYP/6-31+G* (hydrogens only) | B3LYP/6-31+G*, B3LYP/6-31++G** | Gaussian 03 | investigation of differences in crystalline structure and hydrogen bond pattern in Iα and Iβ cellulose | 299
α-D-Glcp-(1-4)-α-D-Glcp; α-D-Glcp-(1-4)-β-D-Glcp | CS: 1H, 13C (solid state) | PBE/planewave | GIPAW PBE/planewave | CASTEP (refs 271-273; geometry optimization), PARATEC code (refs 300, 301; QM calculations) | investigation of weak hydrogen bonding | 302
α-D-Glcp-(1-2)-β-D-Fruf (sucrose) | CST: 13C (solid state) | neutron diffraction data | RHF, HFB, HFS, BLYP, B3LYP, B3P86, BVWN, SVWN, MPW1PW91 with cc-pVDZ and cc-pVTZ | Gaussian 03 | comparison of DFT and HF functionals | 225
α-D-Glcp-(1-1)-α-D-Glcp (α,α-D-trehalose); β-D-Galp-(1-4)-β-D-Glcp; α-D-Glcp-(1-2)-β-D-Fruf (sucrose) | CSS(φ,ψ): 13C (C1) (amorphous state) | MM (BIO85, CHARMM27, AMBER) on fixed φ and ψ | TNDO, B3LYP/6-31+G(d,2p), B3LYP/3-21+G**, B3PW91/3-21+G** | HyperChem (refs 139, 140; geometry), Gaussian 03 (chemical shifts) | exploration of the local structure of sugars in glassy state | 303
complex of E-selectin with sialyl Lewis X b | CS: 1H | QM/MM (see section 4.4) | QM/MM-GIAO (HF/6-31G*) | own QM/MM program based on the HONDO package (ref. 304) | validation of the geometrical modeling | 238
α-D-Glcp-(1-2)-β-D-Fruf (sucrose); α-D-Glcp-(1-1)-α-D-Glcp (α,α-D-trehalose); α-D-Glcp-(1-4)-α-D-Glcp (D-maltose) | CS, CST: 13C (solid state) | X-ray and neutron diffraction data, PBE/planewave, Vanderbilt's "ultrasoft" pseudopotentials | GIPAW PBE/planewave, Troullier-Martins norm-conserving pseudopotentials | CASTEP (refs 271-273) | comparison of calculations to the chemical shift anisotropy amplification data | 305
α-D-GlcpNH3+ / I- (chitosan salt cluster) | CST: 1H, 13C, 15N, 17O (solid state) | X-ray; B3LYP/6-31++G(d,p) for hydrogens only | LD: B3LYP/6-311++G(d,p), B3LYP/6-31++G(d,p), 6-31G (other), LANL2DZ (iodine ions) | Gaussian 98 | investigation of the hydrogen bonding effects on the CS tensors | 266
α-D-GlcpN (chitosan cluster) | CST: 1H, 13C, 15N, 17O (solid state) | X-ray; B3LYP/6-31++G(d,p) for hydrogens only | B3LYP/6-311++G(d,p), B3LYP/6-31++G(d,p) | Gaussian 98 | investigation of the hydrogen bonding effects on the CS tensors of anhydrous crystalline structure | 267
β-L-Fucp-(1-4)-α-D-Galp-OMe; β-L-Fucp-(1-4)-α-D-Glcp-OMe; β-L-Fucp-(1-3)-α-D-Glcp-OMe | CS: 1H (hydroxyl protons) | B3LYP/6-31G(d) | HF/6-311++G(2d,2p), B3LYP/6-311++G(2d,2p) | MM3 (geometry), Gaussian 98 (QM calculations) | studying the effect of hydration on the chemical shift of hydroxyl protons | 306
a Notations: CS: chemical shift; CST: chemical shift tensor; CSS: chemical shift surface.
b Sialyl Lewis X is α-Neup5Ac-(2-3)-β-D-Galp-(1-4)-β-D-GlcpNAc-(3-1)-α-D-Fucp
13C NMR chemical shifts of the anomeric carbons in oligo- and polysaccharides can be used as conformational probes due to their periodic dependence on the glycosidic bond dihedral angles307. Kasat and coworkers used GIAO calculations at the DFT level to predict the 13C NMR chemical shifts of the anomeric carbons in a carbohydrate backbone, as well as of carbons in non-carbohydrate side chains, while studying the molecular environment in the chiral cavities of polysaccharide-based sorbents: cellulose tris(3,5-dimethylphenylcarbamate) (CDMPC), amylose tris(3,5-dimethylphenylcarbamate) (ADMPC), and amylose tris[(S)-α-methylbenzylcarbamate] (ASMBC)279. The authors computed the anomeric carbon 13C NMR chemical shielding in the monomers of cellulose, amylose, amylose acetate and cellulose acetate (summarized in Table 3) and in dimers extracted from an ASMBC octamer with a 4-fold helix and a CDMPC nonamer with a 3-fold helix, optimized by DFT methods. The geometry of the octa- and nonamers was constructed from the X-ray data using the linked-atom least-squares method308. The simulations showed that helicity strongly affects the C1 chemical shift, clarified the effects of the side chains on polymer conformations and supported the hypothesis of a 3-fold helical conformation of CDMPC and of a 4-fold one of ADMPC, which is important for explaining the enantioseparation of racemates on these sorbents by differences in their high-order structure309. The authors conclude that the strengths of the H-bonds of the C=O and NH groups in the chiral cavities of these polysaccharide-based polymers are significantly different, which may be a major factor affecting the selectivity for chiral solutes279.
Swalina and coworkers used GIAO calculations to study the dependence between the anomeric carbon chemical shift and the glycosidic bond ⟨φ,ψ⟩ dihedral angles in oligosaccharide and glycopeptide model compounds152. They computed full chemical shift surfaces (CSSs) versus φ and ψ for D-Glcp-D-Glcp disaccharides with (1→1), (1→2), (1→3), and (1→4) linkages in both α- and β-configurations. φ and ψ were fixed in 20° steps and the geometries were optimized using the AM1 semi-empirical Hamiltonian. To simulate an observed chemical shift, the CSSs were corrected by adding a correction factor of +7.1 ppm calculated from comparison of TMS and dioxane as references. After Boltzmann averaging of the CSS, which accounts for the distribution of conformers, the predicted chemical shift values exhibited an RMS deviation of 1.4 ppm from the experimental data. The authors derived an empirical equation of the form 13C δC1=f(φ,ψ) by fitting the raw ab initio data to trigonometric series expansions, following Le and coworkers310, and implemented it as a Perl script. For series of 91 and 325 terms, the RMS deviation between the raw and derived chemical shift values was 0.56 ppm and 0.31 ppm, respectively.
To reduce the computational cost, the CSSs were calculated using the 3-21G basis set and scaled using reference calculations at the 6-311G** level. To obtain the scaling factor, duplicate GIAO 13C calculations using the 3-21G and 6-311G** basis sets were performed on AM1-optimized models of eight disaccharides (96 carbons). The 13C NMR chemical shifts predicted using both basis sets were then correlated (R2=0.992), and the resulting linear relationship was employed to scale the 3-21G results. To test the approach, 13C CSSs were calculated using a locally dense basis set (6-311G** for the anomeric carbons and nearest neighbors; 3-21G for the remaining atoms311) or, in particular cases, the 6-311G** basis set. The RMS deviation between the scaled CSS and the test CSS obtained with the locally dense or large basis set was less than 1 ppm for disaccharides. Similar surfaces were also obtained for GlcNAc-Thr and GlcNAc-Ser model glycopeptides in α- and β-configurations.
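The scaling and averaging steps described above for the work of Swalina and coworkers can be pictured with a minimal sketch. It is not the authors' Perl implementation: the surface, the energies and the paired basis-set values below are synthetic placeholders, and the function names `fit_scaling` and `boltzmann_average` are hypothetical. Only the 20° grid step and the +7.1 ppm reference correction are taken from the text.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def fit_scaling(small_basis, large_basis):
    """Least-squares line mapping small-basis shifts onto large-basis shifts."""
    slope, intercept = np.polyfit(small_basis, large_basis, 1)
    return slope, intercept

def boltzmann_average(shifts, energies_kcal, temperature=298.15):
    """Population-weighted mean of a shift surface over a conformer grid."""
    shifts = np.asarray(shifts, dtype=float)
    energies = np.asarray(energies_kcal, dtype=float)
    weights = np.exp(-(energies - energies.min()) / (R_KCAL * temperature))
    return float(np.sum(weights * shifts) / np.sum(weights))

# Synthetic phi/psi grid in 20 degree steps (18 x 18 points).
angles = np.radians(np.arange(-180, 180, 20))
phi, psi = np.meshgrid(angles, angles, indexing="ij")

# Placeholder small-basis surface (ppm) and relative energies (kcal/mol).
css_small = 95.0 + 3.0 * np.cos(phi) + 2.0 * np.cos(psi)
energy = 2.0 * (1.0 - np.cos(phi - np.radians(60))) + 1.5 * (1.0 - np.cos(psi))

# Placeholder paired shifts computed with both basis sets for the same carbons.
paired_small = np.array([60.1, 68.4, 71.9, 75.3, 98.7])
paired_large = np.array([62.0, 70.9, 74.2, 78.1, 102.6])
slope, intercept = fit_scaling(paired_small, paired_large)

css_scaled = slope * css_small + intercept  # scale to the larger basis set
css_corrected = css_scaled + 7.1            # reference correction quoted in the text
delta_c1 = boltzmann_average(css_corrected, energy)
print(f"Boltzmann-averaged C1 shift: {delta_c1:.2f} ppm")
```

The averaged value always lies between the extremes of the corrected surface, so a prediction outside the range spanned by the surface would point to a bookkeeping error rather than to a conformational effect.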
Selection of any of three different conformations of the peptide moiety (freely relaxed, extended and α-helical) had virtually no effect on the CSSs. In contrast to the threonine derivative, the serine derivative possessed two CS maxima on the CSS. The authors explained this by the sterically induced polarization of the electron density around the anomeric carbon caused by the methyl group of threonine152.
The above methodology was used later in a number of studies. In particular, Sergeev and Moyna derived 13C CSSs to determine the spatial structure of glucose oligosaccharides in the solid state from the experimental 13C NMR data of glycosidic bond carbons (Fig. 8)151. During the CSS derivation the level of theory, basis set and scaling procedure were the same as reported for model glycopeptides152. In order to take into account the experimental chemical shifts of the glycosidic bond carbons during molecular modeling, the potential energy function of the MMFF94 force field was augmented with an NMR pseudopotential energy term. This term included a function derived from the CSS using a 91-term trigonometric fit, and a constant chosen to give the term a weight compatible with the force field energy. The authors validated the method on α-(1→1) and β-(1→4)-linked oligosaccharides by reproducing the three-dimensional structures obtained from X-ray studies with RMS deviations of heavy atom positions equal to 0.14 Å, 0.12 Å and 0.25 Å (trehalose, cellobiose and cellotetraose, respectively)151. In contrast, the lowest energy conformer of cellotetraose predicted in vacuo by MMFF94 without NMR constraints was significantly different. Further, the authors determined the spatial structure of cellohexaose and generated structural models for cellulose II and amylose V6, using hexasaccharides as models and CSSs obtained from disaccharides. These studies supported the φ and ψ estimates reported earlier312, 313.
O’Brien and Moyna tested the same method on cyclo-oligomaltoses (α-, β-, γ-, ε-, and ι-cyclodextrin) and α-cyclodextrin inclusion complexes with 1,4-disubstituted benzenes in the solid state and in solution. They used the same approach as Swalina and coworkers, and the same basis sets, for generation of the anomeric carbon CSSs and derivation of the empirical formula for the chemical shift. For the solid-state structures, the D-Glcp-α-(1→4)-D-Glcp glycosidic bond dihedral angles were taken directly from the X-ray data. The calculated solid-state 13C NMR chemical shifts of the anomeric carbons of all residues in α-, β- and γ-cyclodextrin overestimated the observed CP/MAS data by ca. 0.8 ppm (cyclodextrins) and 0.4-0.6 ppm (inclusion complexes). Calculations on ε- and ι-cyclodextrins also predicted an average chemical shift within +0.8 ppm of the solution data and allowed the band-flipped residues to be characterized by the abnormal upfield shift of the anomeric carbon signal.
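The NMR pseudopotential term that Sergeev and Moyna added to MMFF94, described earlier in this section, can be illustrated as follows. The text does not give the functional form, so the harmonic penalty below is an assumption, as are the stand-in surface `css_shift`, the parameter `weight` and all numerical values; the sketch only shows how a back-calculated shift can bias a force field energy toward conformers that agree with experiment.

```python
import numpy as np

def css_shift(phi_deg, psi_deg):
    """Hypothetical stand-in for a fitted C1 chemical shift surface (ppm)."""
    phi, psi = np.radians(phi_deg), np.radians(psi_deg)
    return 100.0 + 3.0 * np.cos(phi) + 2.0 * np.cos(psi)

def nmr_pseudoenergy(phi_deg, psi_deg, delta_exp, weight=1.0):
    """Assumed harmonic penalty: weight * (delta_calc - delta_exp)**2."""
    return weight * (css_shift(phi_deg, psi_deg) - delta_exp) ** 2

def restrained_energy(ff_energy, phi_deg, psi_deg, delta_exp, weight=1.0):
    """Force field energy plus the NMR restraint term (arbitrary energy units)."""
    return ff_energy + nmr_pseudoenergy(phi_deg, psi_deg, delta_exp, weight)

# Example: a conformer whose back-calculated shift misses experiment by 1 ppm.
print(restrained_energy(ff_energy=12.0, phi_deg=0.0, psi_deg=0.0,
                        delta_exp=104.0, weight=2.0))
```

The role of `weight` mirrors the constant mentioned in the text: it sets how strongly disagreement with the measured shift is penalized relative to the ordinary force field terms.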
O’Brien and Moyna also derived 13C NMR chemical shifts by averaging back-calculated 13C shift trajectories from a series of 5 ns MD simulations of α-, β- and γ-cyclodextrin with explicit TIP3P water molecules filling an octahedral buffer of 10 Å. Application of the empirical formula obtained from the solid-state CSS calculations gave excellent agreement with the solution 13C NMR data (MAD 0.36 ppm)150.
Lefort and coworkers studied the local structure and conformational disorder of selected disaccharides in the amorphous state by comparing CP/MAS data to 13C NMR chemical shift surfaces of C1 calculated by GIAO for MM-optimized geometries. They provided a numerical procedure to treat discontinuities in the CP/MAS spectrum, and demonstrated that force field geometry optimization did not critically hamper the accuracy of the results303.
Tafazzoli and Ghiasi studied the anomeric carbon chemical shifts of α-, β- and γ-cyclodextrins in solution using the two-layer ONIOM method237. The higher level of theory (B3LYP/6-31G*) included all atoms in the pyranose rings, and the lower one (HF/6-31G*) included all other atoms. The PCM model was employed to model the solvent effects. The 13C NMR chemical shift surfaces for C1 in the D-Glcp-α-(1→4)-D-Glcp fragment in the gas phase and in solution were calculated employing the GIAO B3LYP/6-31G* method and compared to the ONIOM (B3LYP/6-31G*:HF/6-31G*) results obtained for a disaccharide model. The empirical equation relating isotropic 13C shifts to the glycosidic bond φ and ψ dihedral angles was derived using a trigonometric expansion. The calculated average chemical shift in solution deviated from the experimental data by -0.4 to +0.8 ppm, and deviations of C1 chemical shifts for residues 1 and 2 in α-cyclodextrin were predicted with an accuracy of 0.6 ppm and 0.5 ppm, respectively226.
Conformations of cellulose fragments have been explored and probed against experimental NMR observables in a number of publications151, 265, 296, 303. Kubicki and colleagues achieved an RMS error of less than 3 ppm for 13C chemical shift simulation with GIPAW under periodic boundary conditions for tg/NetA conformations of cellulose Iα and Iβ298. Esrafili and coworkers obtained a MAD of less than 7% in DFT 13C calculations of cellulose spectra, and showed that 13C chemical shifts could serve as a probe for differentiation between the Iα and Iβ structures299. Suzuki and coworkers applied DFT calculations to reproduce experimental dependences of 13C NMR chemical shifts on the conformation of β-D-Glcp (see details in the previous section), cellobiose and cellobiose units of native cellulose capped with hydrogen atoms. The geometry was extracted from the X-ray structure without further optimization. D-Cellobiose and the cellobiose units revealed appreciable dependences of the predicted C1′ and C4 chemical shifts on the torsion angles in the (1→4)-β-glycosidic linkage. In the region of the crystalline conformational minimum, the C1′ chemical shift was found to depend mainly on φ, whereas the C4 shift depended on both φ and ψ265. The authors explained the calculated chemical shifts in disaccharide units on the basis of γH-gauche effects and their reduction by intra-residue hydrogen bonding. In contrast, inter-residue hydrogen bonding had almost no effect on 13C NMR chemical shifts.
Khodaei and coworkers used a molecule in a cluster as a model system for the chitosan/HI salt and calculated hydrogen bonding effects on the CST of 17O, 15N, 13C, and 1H nuclei (see details in the previous section). According to the locally dense basis set method314 used to speed up the calculation, the target molecule and the neighboring nuclei directly involved in its hydrogen bonding were calculated with the 6-311++G(d,p) and 6-31++G(d,p) basis sets, whereas the other nuclei were calculated with the 6-31G and LANL2DZ (iodine ions) basis sets. The authors observed that the theoretical B3LYP/6-311++G(d,p) isotropic 13C NMR chemical shifts overestimated the experimental values (MAD 5.8, least-squares linear fit with R2=0.97), while chemical shifts obtained from B3LYP/6-31++G(d,p) underestimated them (MAD 5.0, least-squares linear fit with R2=0.96). They report the results from the 6-311++G(d,p) basis set as more reliable than those from the 6-31++G(d,p) one. The difference in the isotropic shielding between the monomer and the target molecule in a cluster was analyzed with respect to O6H…O, O6H…I, NH…O and NH…I hydrogen bonding. The authors revealed a 40 ppm increase of the predicted O6 chemical shift due to the intermolecular hydrogen bonding in a cluster, as compared to the monomer, while the difference at other oxygen sites was not so dramatic. NH hydrogen bonds reduced the predicted isotropic 15N chemical shift by 18.39 ppm266.
Esrafili and coworkers investigated hydrogen-bonding effects on the 17O, 15N, 13C and 1H CS tensors of anhydrous chitosan as compared to its monomeric unit (α-D-GlcpN) in the gas phase. The DFT calculations were performed at B3LYP/6-311++G(d,p) and 6-31++G(d,p) for the X-ray geometry with protons reoptimized at B3LYP/6-31++G(d,p). The authors explained deviations in 17O, 15N, and 1H CST components and anisotropy by the formation of hydrogen bonds, primarily O3H…O and NH…O. Good correlation between the predicted and experimental isotropic 13C NMR chemical shifts (R2=0.985) indicated that hydrogen bonding effects in chitosan are sufficiently described when the neighboring chains are represented by monomeric units only. As followed from the QM calculations, the intra- and intermolecular hydrogen bonding played an essential role in determining the relative orientation of the oxygen and nitrogen CST principal components in the molecular frame axes267.
Bekiroglu and coworkers performed comparative QM calculations on the disaccharides β-L-Fucp-(1→4)-α-D-Galp-OMe, β-L-Fucp-(1→4)-α-D-Glcp-OMe, and β-L-Fucp-(1→3)-α-D-Glcp-OMe using HF and DFT methods. They calculated the chemical shift difference (∆δ) between the hydroxyl protons in the disaccharide and in the corresponding monosaccharide methyl glycoside. The lowest energy geometries from MM3 calculations were taken as starting conformers for a full DFT optimization. The ∆δ values obtained from HF and DFT calculations were similar, although the HF calculations gave systematically more upfield values than the DFT calculations. The calculations in vacuo showed that one or two OH protons in each disaccharide, which exhibit hydrogen bonding to the neighboring ring oxygens, are strongly deshielded (∆δ>0). In contrast, the experimental NMR data indicated shielding of these protons (∆δ<0)315. This discrepancy was attributed to solvent effects, which were confirmed by monitoring the chemical shift of the hydroxyl proton of methanol in water and other solvents, modeling the acetal groups of disaccharides. Intramolecular hydrogen bonding leads to reduced hydration of a particular hydroxyl proton, and to a consequent upfield shift306.
Fig. 9. Dependence of 3JC1-2OH on the glycosidic torsion angle ω and the C1/HO2 dihedral angle θ calculated by DFT for methyl α- and β-D-glucopyranoside mimics (A, B, respectively) and methyl α- and β-D-mannopyranoside mimics (C, D, respectively), all having deoxy functions at C3, C4, and C6316. Reproduced with permission, © Elsevier Ltd., 2009.
Yates and coworkers studied the anomeric forms of maltose by 1H-13C MAS-J-HMQC solid-state NMR spectroscopy. They used chemical shift calculations for the assignment of the 1H NMR spectrum. Further calculations showed that the difference in the calculated 1H NMR chemical shift between the crystal and an isolated molecule with the same geometry was a quantitative measure of weak intermolecular C-H⋅⋅⋅O hydrogen bonding302. Geometry optimizations were performed using the DFT code CASTEP271-273, which uses a planewave basis set to expand the charge density and electronic wave functions, and pseudopotentials to represent the core electrons. The PBE exchange-correlation functional215 and "ultrasoft" pseudopotentials with a maximum planewave cutoff of 30 Ryd were used. The NMR chemical shifts were computed using the PARATEC300, 301 code, which employs the GIPAW method214 based on DFT and the plane-wave pseudopotential approach301. The calculations used the PBE exchange-correlation functional, a plane-wave basis set with a maximum energy of 80 Ryd and Troullier-Martins317 norm-conserving pseudopotentials. The MAD of the calculated isotropic 13C NMR chemical shifts from the experimental values was 1.0 ppm for the α-anomer and 0.9 ppm for the β-anomer, the highest deviations being observed for the anomeric (up to +3.0 ppm) and C6 (up to -1.4 ppm) carbons.
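The agreement measures quoted throughout this section (MAD, RMS deviation, largest signed deviation) can be computed as in the short sketch below. It assumes that MAD denotes the mean absolute deviation; the function name `deviation_stats` and the shift values are hypothetical and are not data from any of the cited studies.

```python
import numpy as np

def deviation_stats(calculated, experimental):
    """Summary statistics for calculated versus experimental shifts (ppm)."""
    diff = np.asarray(calculated, dtype=float) - np.asarray(experimental, dtype=float)
    worst = int(np.argmax(np.abs(diff)))
    return {
        "mad": float(np.mean(np.abs(diff))),         # mean absolute deviation
        "rmsd": float(np.sqrt(np.mean(diff ** 2))),  # root-mean-square deviation
        "largest_signed": float(diff[worst]),        # signed worst-case error
        "largest_index": worst,                      # position of the worst case
    }

# Hypothetical calculated and experimental 13C shifts (ppm) for six carbons.
calc = [95.9, 72.8, 74.1, 70.9, 72.6, 60.2]
expt = [92.9, 72.3, 73.6, 70.5, 72.2, 61.6]
print(deviation_stats(calc, expt))
```

Reporting the signed worst case next to the MAD makes it visible when a small average error hides a large deviation at a single site, as for the anomeric and C6 carbons mentioned above.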
6. Computation of NMR coupling constants
Empirical relationships between the molecular geometry of saccharides, such as torsion angles, and the spin-spin coupling have historically been widely and successfully applied for NMR structure elucidation, especially for the identification of the monomeric composition of oligo- and polysaccharides43, 318, 319.
However, in spite of its wide scope and several useful applications, such an approach is prone to fail for compounds different from those used for the calibration of the empirical equations. Especially difficult are the cases of intermolecular aggregation and interaction with polar solvents and atypical functional groups. Undoubtedly, computational modeling of spin-spin coupling and its interdependencies with distinct structural parameters (like bond angles and dihedral angles) is one of the most demanding areas of research. Scalar couplings are averaged linearly among conformers in solution, and thus their interpretation in terms of conformationally flexible molecular structure is more straightforward320. The averaging allows easy connection with MD simulations and provides the NMR description of structural flexibility in carbohydrates.
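Linear averaging means that an observed coupling is the population-weighted mean of the conformer couplings, which in the two-state case can be inverted to estimate populations. The sketch below illustrates this under stated assumptions: the function names and the coupling values are hypothetical, and a real application would take per-conformer couplings from calculations or MD frames.

```python
def average_coupling(populations, couplings):
    """Population-weighted mean coupling constant (Hz)."""
    total = sum(populations)
    return sum(p * j for p, j in zip(populations, couplings)) / total

def two_state_population(j_observed, j_state_a, j_state_b):
    """Fraction of state A implied by a linearly averaged observed coupling."""
    return (j_observed - j_state_b) / (j_state_a - j_state_b)

# Hypothetical 3JHH values (Hz) for two ring conformers and their populations.
j_avg = average_coupling([0.25, 0.75], [2.0, 6.0])
print(j_avg)
print(two_state_population(j_avg, 2.0, 6.0))
```

For an MD trajectory the same average reduces to the plain mean of the couplings back-calculated for each frame, since every frame carries equal weight.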
A widely used non-relativistic approach to the simulation of nuclear coupling originates from the well-known Ramsey equations321. The indirect scalar nuclear spin–spin coupling constant is associated with four terms: Fermi contact (FC), diamagnetic spin-orbit (DSO), paramagnetic spin-orbit (PSO) and spin-dipole (SD). The Fermi contact contribution often dominates322, especially in saturated carbon-hydrogen systems. The computational cost can be significantly reduced by calculating the dominant FC term in a larger basis set and the computationally expensive but smaller contributions from the other terms in a smaller basis set323. Quantum-chemical approaches to the calculation of indirect spin-spin coupling constants have been reviewed in detail recently63.
As follows from the comparison of the computational cost of electronic wave function and electron density approaches in the calculation of spin couplings, the latter performs faster. DFT was recognized as a good tool for accurate prediction of coupling constants in medium and large molecules. In the calculation of coupling constants, DFT has been shown to give reasonable potential energy surfaces for aldo- and ketohexoses and to reduce the basis set superposition error in hydrogen-bonded systems, such as monosaccharides324. It was found that improved accuracy in spin–spin couplings could be obtained from DFT calculations at DFT-optimized geometries instead of experimental or higher-level geometries325. The choice of the level of theory for the geometry affects the accuracy of coupling calculations. As tested on methyl α-xylopyranoside, the MM3 geometry gave better results for the calculation of 1JCH and 2JHH couplings, while the DFT geometry produced slightly better results for 3JHH couplings284.
In contrast to chemical shifts, the quantitative prediction of coupling constants is known to suffer from a linear correlation with experiment that is far from ideal (intercept = 0 and slope = 1). This is usually associated with a lack of accuracy in the calculation of the Fermi contact term326. Accurate description of the electron density at the nuclei, which is needed to calculate this term, often requires a specially designed basis set327.
The dependence of simulated coupling constants on the geometrical parameters of molecules is often expressed in the form of a Karplus equation (3J = C0 + C1 cos φ + C2 cos 2φ or 3J = C3 + C4 cos φ + C5 cos²φ)328, which relates a vicinal coupling constant to the torsional angle around the central bond of the fragment. Typical values for C3, C4 and C5, obtained by averaging over different H-C-C-H fragments in heparin fragments, are 0.2, -0.6 and 9.6 Hz, respectively329. Nowadays, Karplus equations have been derived virtually for every
combination of nuclei and coupling pathways that occur in carbohydrates. Besides vicinal couplings, these combinations include atoms that are not connected by three bonds forming a dihedral angle but whose coupling depends on the torsion angles of substituents either inside or outside the coupling pathway. As an example, 3JC1-2OH coupling constant surfaces were obtained for methyl α- and β-D-glucopyranoside mimics with deoxy functions at C3, C4 and C6 (Fig. 9). These surfaces indicate a minimal dependence on the glycosidic torsion angle and a strong dependence on the C1/HO2 dihedral angle316.
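The Karplus relation quoted above in the form 3J = C3 + C4 cos φ + C5 cos²φ can be evaluated directly. The sketch below uses the heparin-derived coefficients given in the text (0.2, -0.6 and 9.6, taken here to be in Hz) and also shows the conversion to the equivalent cos 2φ form via cos 2φ = 2cos²φ - 1; the function names are hypothetical.

```python
import math

# Coefficients quoted in the text for H-C-C-H fragments of heparin (ref. 329).
C3, C4, C5 = 0.2, -0.6, 9.6

def karplus(phi_deg, c3=C3, c4=C4, c5=C5):
    """Vicinal coupling 3J = c3 + c4*cos(phi) + c5*cos(phi)**2."""
    c = math.cos(math.radians(phi_deg))
    return c3 + c4 * c + c5 * c * c

def to_cos2phi_form(c3, c4, c5):
    """Coefficients (C0, C1, C2) of 3J = C0 + C1*cos(phi) + C2*cos(2*phi)."""
    return c3 + c5 / 2.0, c4, c5 / 2.0

for angle in (0, 60, 90, 180):
    print(angle, round(karplus(angle), 2))
```

With these coefficients the curve is close to its minimum for a perpendicular arrangement and largest for the anti arrangement, which is the qualitative behaviour that makes 3JHH a useful probe of ring and linkage conformation.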
6.1. Intra-residue coupling constants
Analysis of NMR coupling constants is a canonical structural tool for characterizing carbohydrates. The brief review provided in the present section shows the outstanding potential of computational studies to provide valuable insight into the structural and electronic origins of the values measured experimentally. Computations of intra-residue coupling constants (within a single monosaccharide) are discussed in the present section and summarized in Table 5.
Table 5. Prediction of coupling constants in monosaccharides.
Columns: Object (molecule) | Coupling constant a (e.g. 3JH-N-C-H) | Calculation method: geometry | Calculation method: coupling | Software | Application | Ref.
β-D-Xylp-OMe 1,2,3JCH, 3JHH BP86/TZVP, MM2
PW91/IGLO-III deMon-KS, deMon-NMR280-282, MacroModel128, 129 5.0 (MM calculations)
calculation of coupling constant dependence on the dihedral angle between C1 and methyl group
283
α-D-Xylp-OMe 1,2,3JCH, 3JHH PW91/TZVP MM
PW91/B-III 284
1C4, 2S0, 4C1 α-L-IdopA2S-OMe, Na+, (4C1, 2S0), (4C1, 1C4)
α-D-GlcpNS6S-(1→4)-α-L-IdopA2S-OMe, (heparin disaccharide),
3JH,H B3LYP/6-31++G** B3LYP/6-31++G** JAGUAR 3.5
(geometry), Gaussian 03 (couplings)
derivation of a Karplus equation
329
2S0 α-L-IdopA2S 3JH,H B3LYP/6-31+G* B3LYP/6-31+G*
[-4)-β-D-GlcpNS6S-(1-4)-α-L-IdopA2S-(1-]3
(heparin hexasaccharide)
3JH,H (in all IdopA rings)
MD (GLYCAM03) Altona and Haasnoot formalism
AMBER 5.1, AMBER 6.0 178, 179
investigation of the conformational flexibility of IdopA rings
330
1C4 α-L-IdopA2S(1→4)-α-D-GlcpN6S-1OMe, 2S0 α-L-IdopA2S(1→4)-α-D-GlcpN6S-1OMe, (heparin disaccharide)
3JHH, 1JCH, 3JCH (in rings)
B3LYP/ 6-311++G**, M05-2X/ 6-311++G** (with explicit solvent, ONIOM)
B3LYP Gaussian 03, Gaussian 09134, 135
studying coupling constants variations upon counterion and solvent effects
240
α-L-IdopA2S 3JH,H (all in ring)
MD (GROMOS96, GLYCAM06), then HF/6-31G(d)
Altona and Haasnoot formalism
Gaussian 03 (QM calculations), GROMACS166 3.3, AMBER 9.0 (MD simulations)
comparison of the prediction force of two force fields
331
α-D-Glcp-(1→1)-α-D-Glcp 3JH5,C1, 3JHH (all vicinal)
MD, (CHARMM carbohydrate force field)
Karplus type equation with the Haasnoot-Altona parameterization
CHARMM164 establishing a comprehensive understanding of the hydration pattern of trehalose
332
α-D-Glcp, β-D-Glcp
3JH,H (except with OH)
B3LYP/6-31G(d,p) B3LYP/pcJ-2 Gaussian 03 support of the experimental data
260
2HOMe-THP (model) 2JH5H6, 2JC5,H6, 2JC6,H5, 3JC4,H6
HF, B3LYP/ 6-311++G (d,p)
FF-DPT (Fermi-contact), B3LYP/ [5s2p1d/3s1d]
Gaussian 03 derivation of the Karplus equations
333
β-D-Glcp-1OMe, β-D-Glcp-1SMe, β-D-Glcp-1Ethyl,
3JH1C1OC, 3JH1C1SC, 3JH1C1CC,
β-D-Glcp-1OMe, β-D-Glcp-1SMe, β-D-Glcp-1Ethyl,
3JH1C1OC, 3JH1C1SC, 3JH1C1CC, in aqueous and methanol solutions
HF, B3LYP/ 6-311++G (d,p)
FF-DPT (Fermi-contact), B3LYP/ 6-311++G (d,p)
Gaussian 03 derivation of the Karplus equations
334
Table 5, continued
2-hydroxymethyl-THP and other models
2JH6H6’,
3JH5,H6,
1JC5,H5, 1JC6,H6
B3LYP/6-31G(d) B3LYP/ [5s2p1d/3s1d]
Gaussian 94 derivation of the Karplus equations
335
β-D-Glcp-(1-3)-4H-pyran-4-one, β-D-4-deoxy-XylHexp-(1-3)-4H-pyran-4-one, (erigeroside and its model)
2JC5H6, 2JC6H5, 3JC4H6, 2JH6R–H6S, 3JH5H6
HF, B3LYP/ 6-311++G (d,p)
FF-DPT (Fermi-contact), B3LYP/ 6-311++G (d,p)
Gaussian 03 derivation of the Karplus equations and validation of the DFT methodology
336
β-D-Glcp-(1-3)-4H-pyran-4-one (erigeroside)
1JC1H1
2-deoxy-β-D-eryPenf 1JCH, 2JCH, 3JCH, 1JCC, 2JC3C5, 3JC1C5, 3JC2C5
B3LYP/6-31G(d) FF-DPT (Fermi-contact), B3LYP/[5s2p1d/2s]
Gaussian 94 validation of DFT methodology
320
2-deoxy-β-D-eryPenf 1-3JCH, 1-3JCC HF/6-31G(d)
B3LYP/6-31G(d) FF-DPT (Fermi-contact), B3LYP/[5s2p1d/2s]
Gaussian 94 investigation of the effect of hydroxymethyl conformation on the conformational energies and structure
337
2-deoxy-β-D-eryPenf-1NH2, 2-deoxy-β-D-eryPenf-1NH3+
1-3JCH, 1JCC 3JCC
HF/6-31G(d) B3LYP/6-31G(d)
FF-DPT (Fermi-contact), B3LYP/[5s2p1d/2s]
Gaussian 94 investigation of the effect of the amino group on molecular properties
338
P-3:5)β-D-Ribf-1U (cUMP in aqueous solution)
JHH, JCH, JPH, JPC
B3LYP/6-31G(d,p) B3LYP/cc-pVTZ Gaussian 03 testing the prediction method and selection of the appropriate solvent model
277
α- and β-L-Eryf-1OMe-2,3-epoxy
JH,H (all in ring)
MP2, DFT B3LYP/ 6-311++G(d,p)
coupled perturbed DFT; B3LYP/ 4-31G, 6-31G(d,p), 6-311G(d,p), 6-311++G(d,p), aug-cc-pVDZ, IGLO II, IGLO III
Gaussian (geometry optimization), Cologne 99 339 (coupling calculation)
interpretation of the 1H-1H coupling constants of synthesized compounds and comparison of prediction methods
340
α- and β-L-Eryf-1OMe 2,3-epicyclic derivatives with S, NH, NR
B3LYP/ 6-311++G(d,p)
coupled perturbed DFT; B3LYP/IGLO II
α-D-GlcpNAc 3JH-N-C-H B3LYP/6-31G(d,p)
with explicit solvent: MD snapshot optimized in MMX force field
SD, PSO, DSO terms: B3LYP/IGLO-III; FC term: B3LYP/HIIIsu3
Gaussian 03D (QM calculations), CHARMM (MD calculations)
studying the dependence of 3JH-N-C-H coupling on conformation, dynamics and solvent; derivation of the Karplus curve
341
β-D-GlcpNAc, α-GalpNAc
B3LYP/HIIIsu3 (FC term only)
a See Fig. 2 for atom numbering.
Gandhi and Mancera probed two MD force fields in unconstrained molecular dynamics simulations of the ring conformational flexibility of 2-O-sulfo-α-L-iduronic acid in aqueous solution331. The authors reported that the GROMOS96 force field186 with the SPC/E water potential successfully predicted the dominant skew-boat to chair conformational transition of IdoA2S in water, whereas the GLYCAM06 force field180 (augmented with non-bonded parameters for sulfates and sulfamates) with the TIP3P water potential sampled transitional conformations between the boat and chair forms. Simulations using GROMOS96 exhibited no pseudorotational equilibrium fluctuations and hence no interconversion between the boat and twist-boat ring conformers. Simulation of proton NMR coupling constants showed that, in contrast to GLYCAM06, the GROMOS96 force field predicted a 2S0 (skew-boat) to 1C4 (chair) conformational ratio (17:83) in better agreement with the experiment. Unlike GLYCAM93, which reproduced experimental couplings well330, GLYCAM06 does not have an explicit definition of the anomeric carbon, which was considered a reason for its poorer predictive power. Since a 2S0–1C4 transition was observed after 81 ps, the 2S0 coupling constants were averaged over the initial 81 snapshots of the GROMOS simulations, while the 1C4 coupling constants were averaged over 419 random snapshots from the remaining 419 ps. In both works330, 331 the averaged 3JH,H coupling constants were calculated using the Altona and Haasnoot formalism319 from the MD data with respect to the conformer ratio, and compared to the reported experimental values.
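As an illustration of the conformer averaging used in both works, the following minimal Python sketch computes population-weighted couplings for a two-state 2S0/1C4 equilibrium and, conversely, estimates the 2S0 population from a set of averaged couplings by least squares. The per-conformer coupling values are hypothetical placeholders, not data from the cited studies, and the function names are illustrative; only the 17:83 ratio is taken from the text above.

```python
import numpy as np

def two_state_average(j_a, j_b, p_a):
    """Population-weighted average of per-conformer couplings (Hz)."""
    j_a = np.asarray(j_a, dtype=float)
    j_b = np.asarray(j_b, dtype=float)
    return p_a * j_a + (1.0 - p_a) * j_b

def fit_population(j_a, j_b, j_obs):
    """Least-squares estimate of the population of conformer A.

    Minimizes sum((p*j_a + (1-p)*j_b - j_obs)**2) over p and clips to [0, 1].
    """
    j_a, j_b, j_obs = (np.asarray(x, dtype=float) for x in (j_a, j_b, j_obs))
    diff = j_a - j_b
    p = np.dot(diff, j_obs - j_b) / np.dot(diff, diff)
    return float(np.clip(p, 0.0, 1.0))

# Hypothetical ring 3J(H,H) values in Hz for each pure conformer
# (H1-H2, H2-H3, H3-H4, H4-H5); illustrative numbers only.
J_2S0 = [5.5, 9.5, 5.0, 3.5]
J_1C4 = [1.5, 2.5, 3.0, 2.0]

j_avg = two_state_average(J_2S0, J_1C4, p_a=0.17)  # 17:83 ratio from the text
p_back = fit_population(J_2S0, J_1C4, j_avg)       # recovers the input population
print(np.round(j_avg, 2), round(p_back, 2))
```

In practice the per-conformer values would come from a Karplus-type equation evaluated on the MD snapshots of each conformer, and the observed couplings would be experimental.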
A detailed look into the influence of counterions and solvent on the conformation and coupling constants of a heparin disaccharide was
reported by Hricovini240. He used B3LYP calculations on geometries obtained at the B3LYP and M05-2X342 levels of theory to compare the coupling constants of both conformers (2S0 IdoA2S + 4C1 GlcN6S and 1C4 IdoA2S + 4C1 GlcN6S) in explicit water solution and with Na+ and Ca2+ counterions. B3LYP was shown to predict the geometry better than M05-2X for a disaccharide with Na+ counterions. Comparison of the calculated averaged vicinal couplings with the experimental data indicated that the 1C4 conformation of IdoA2S (Ca2+) was nearly exclusively populated. In contrast to direct and transglycosidic H-C couplings, averaged proton couplings were hardly affected by solvent effects.
Engelsen and Perez calculated 3JH,H couplings for vicinal hydrogens in α,α-trehalose to study the dynamics of the disaccharide in water solution332. The authors calculated the interglycosidic coupling as well, and more details on geometry optimization are given in section 6.2. Intra-residue homonuclear couplings were calculated using a Karplus-type equation with the Haasnoot-Altona parameterization319, which accounts for the dependence of the coupling not only on the dihedral angle, but also on the electronegativity of the participating atoms and on the orientation of α- and β-substituents.
Roslund and coworkers calculated coupling constants for α- and β-D-glucopyranose, as well as chemical shifts (see section 5.1). All the terms contributing to the J-couplings were predicted at B3LYP/pcJ-2 on geometries optimized at B3LYP/6-31G(d,p). The authors estimated conformer populations by applying the Boltzmann distribution to the relative stabilities of the conformers (more details are discussed in the chemical shift prediction section above). The correlation coefficient between the population-weighted averages of the calculated couplings and the experimental results was 0.994-0.995 (MAD 0.49 Hz for α-D-Glcp, 0.62 Hz for β-D-Glcp), but certain deviations were quite significant. Among vicinal couplings, the largest differences were observed for the 3JH5,H6 values (73%), indicating that the selected subsets of hydroxymethyl rotamers and their relative stabilities did not fully reflect the equilibrium of D-glucose in aqueous solution. For vicinal couplings within the pyranose ring, the deviations confirmed that the selected method reproduces the experimental data well260.
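The Boltzmann weighting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of the cited work: the relative energies and per-rotamer couplings are hypothetical, the temperature of 298.15 K is assumed, and the function names are illustrative.

```python
import numpy as np

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

def boltzmann_populations(rel_energies_kj, temperature=298.15):
    """Normalized Boltzmann populations from relative energies (kJ/mol)."""
    e = np.asarray(rel_energies_kj, dtype=float)
    # Shift by the minimum so the largest weight is exactly 1 (numerical safety).
    weights = np.exp(-(e - e.min()) / (R_KJ * temperature))
    return weights / weights.sum()

def weighted_coupling(j_per_conformer, populations):
    """Population-weighted average of a coupling constant (Hz)."""
    return float(np.dot(populations, np.asarray(j_per_conformer, dtype=float)))

# Hypothetical relative energies (kJ/mol) and one vicinal coupling (Hz)
# for three hydroxymethyl rotamers; illustrative numbers only.
energies = [0.0, 0.5, 4.0]
j_values = [2.0, 6.0, 10.0]

pops = boltzmann_populations(energies)
j_avg = weighted_coupling(j_values, pops)
print(np.round(pops, 3), round(j_avg, 2))
```

Because the populations depend exponentially on the energies, small errors in the computed relative stabilities translate into sizeable errors in the averaged coupling, which is consistent with the large deviations reported for the hydroxymethyl couplings.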
More J-couplings than conformation-sensitive NOEs are observed in oligosaccharides. Thus, the ability to predict coupling constants as a function of the conformation of a glycosidic bond and of an exocyclic group of sugar residues may become a useful tool in conformational studies.
Tafazzoli and Giashi derived Karplus equations by least-squares parameterization from non-linear regression analysis of the simulated vicinal coupling constants related to the dihedral angles ω (C5-C6), θ (C6-O6) and ϕ (C1-X) in various glycosides of glucose and galactose333. These studies demonstrated the ability of DFT to predict J-couplings in aqueous solution. 2-Hydroxymethyltetrahydropyran was used as a carbohydrate model. The authors optimized the geometry using a hybrid HF-DFT scheme, the adiabatic connection method B3LYP/6-311++G(d,p) with no initial symmetry restrictions, and the PCM method for the solvent effects on the conformational equilibrium. Heteronuclear coupling constants involving a hydroxymethyl group were obtained by Fermi-contact FF-DPT calculations at the B3LYP level using the basis set [5s2p1d/3s1d] designed for the calculation of J-couplings in the exocyclic group of a carbohydrate model compound (2-hydroxymethyltetrahydropyran)335. Tafazzoli and Giashi used the same model to simulate 2JH5H6, 2JCH and 1JCH coupling constants. A multitude of factors affect the C-H bond length in the hydroxymethyl group and the direct 13C-1H coupling constants, which are almost inversely proportional to the bond length343; thus the Karplus equations for these couplings possess large RMS errors. For each of the three stable C5-C6 rotamers, the dependence of 2JC5,H6 on the θ angle was derived. In contrast to 2JC5,H6, the 2JC6,H5 values were almost insensitive to the θ angle because the C5-O5 torsion is fixed by the ring conformation.
Comparison of the 3JC4,H6 values calculated for the model compound with those calculated for the 4-hydroxy-substituted model did not reveal any correlation between substitution at position 4 and the ω / θ angular dependence of 3JC4,H6. The Karplus equations for the couplings above are given in eq. 3-8 of the original publication333.
These authors also studied the dependence of 3JCXC1H1 on the ϕ angle in 1-substituted glycosides. They used the [5s2p1d/3s1d] basis set for the calculation on β-D-Glcp derivatives (X = O, S, C)333 and obtained theoretical Karplus equations (e.g., for O-glycosides: $^3J_{\mathrm{COC1H1}} = 6.68\cos^2\phi + 0.89\cos\phi + 0.11$; RMS = 0.65 Hz) resembling an empirical equation proposed previously by Tvaroska and coworkers for 1-thioglycosides344.
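The parameterization of a Karplus equation from a series of computed couplings can be sketched as follows. Because the three-term form A cos²ϕ + B cos ϕ + C is linear in its coefficients, this sketch uses ordinary linear least squares rather than the non-linear regression mentioned above; that substitution is ours. The "computed" couplings are synthetic, generated from the O-glycoside equation quoted above plus hypothetical noise, and all function names are illustrative.

```python
import numpy as np

def karplus(phi_deg, a, b, c):
    """Three-term Karplus equation: a*cos^2(phi) + b*cos(phi) + c (Hz)."""
    phi = np.radians(phi_deg)
    return a * np.cos(phi) ** 2 + b * np.cos(phi) + c

def fit_karplus(phi_deg, j_values):
    """Linear least-squares fit of (a, b, c); returns coefficients and RMS error."""
    phi = np.radians(np.asarray(phi_deg, dtype=float))
    j = np.asarray(j_values, dtype=float)
    design = np.column_stack([np.cos(phi) ** 2, np.cos(phi), np.ones_like(phi)])
    coeffs, *_ = np.linalg.lstsq(design, j, rcond=None)
    rms = float(np.sqrt(np.mean((design @ coeffs - j) ** 2)))
    return coeffs, rms

# Synthetic data: torsion scan in 30 degree steps, couplings from the quoted
# O-glycoside equation with hypothetical Gaussian noise (0.1 Hz).
phi_grid = np.arange(0.0, 360.0, 30.0)
rng = np.random.default_rng(0)
j_synthetic = karplus(phi_grid, 6.68, 0.89, 0.11) + rng.normal(0.0, 0.1, phi_grid.size)

coeffs, rms = fit_karplus(phi_grid, j_synthetic)
print(np.round(coeffs, 2), round(rms, 2))
```

A real application would replace the synthetic couplings with DFT values computed along a torsion scan; a phase-shifted or multi-torsion Karplus form would require a genuinely non-linear fit.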
They also applied B3LYP/6-311++G(d,p) calculations to anomeric vicinal coupling constants of these compounds in order to model couplings in various derivatives of glucose and galactose (OMe-, SMe-, Et-, NHMe-, Cl- and F-glycosides) in PCM-modeled water and methanol334. Least-squares parameterization of the calculated series of coupling constants gave Karplus equations that differed slightly in the last constant term only and were close to the Karplus equations derived from the experiment315.
Stenutz and coworkers studied homo- and heteronuclear coupling constants involving the hydroxymethyl group of a carbohydrate model (2-hydroxymethyltetrahydropyran)335. Working on the DFT-optimized geometries of each of the three C5-C6 rotamers, the authors designed an extended basis set [5s2p1d/3s1d] as an improvement on the previously reported set [5s2p1d/2s]. The new basis set aimed at more accurate simulation of interproton spin couplings. Three Karplus equations were derived (2JH6S,H6R, 3JH5,H6R, 3JH5,H6S) and compared to those obtained from the experimental JHH values in 4,6-pyruvate derivatives of methyl glucosides and methyl galactosides. The largest deviation (0.5 Hz) was within the RMS error (0.3-0.9 Hz). The authors showed that the θ angle (C6-O6 torsion) affects 2JH6,H6 more significantly than the H-C-H bond angle. As for 1JCH, the authors found that fitting the calculated coupling constants to only two torsion angles ω / θ yields relatively large RMS errors, presumably due to a solvent effect on the C-O torsional behavior, which agrees with the known solvent dependence of 1JCH in saccharides345.
Tafazzoli and co-authors performed DFT simulations of the anomeric center and exocyclic group (in three staggered orientations) of the β-D-Glcp in erigeroside and
4-deoxy-β-D-xylo-hexopyranose residues in a model compound to support the data of a detailed NMR investigation of erigeroside from Satureja khuzistanica. The model differed from erigeroside by the absence of a hydroxyl group at position 4, which avoided intramolecular hydrogen bonding and allowed the effect of a hydroxyl group on couplings involving C4 to be studied. The authors calculated complete hypersurfaces for 1JC1H1, 2JC5H6, 2JC6H5, 3JC4H6, 2JH6R–H6S and 3JH5H6 and derived Karplus equations to correlate all these couplings to the C5–C6 (ω), C6–O6 (θ) and C1-O1 (ϕ) torsion angles with RMS deviations from 0.3 to 1.3 Hz. These calculated J-couplings were in agreement with experimental values, confirming nearly quantitative prediction of DFT-calculated heteronuclear coupling constants in aqueous solution modeled by PCM336.
The performance of DFT and a specially designed basis set
was demonstrated by Cloran and coworkers320 on the example of 2-deoxy-β-D-erythro-pentofuranose, the sugar constituent of DNA. These studies showed that DFT can be used to calculate reliable JCH and JCC values in carbohydrates without scaling, within -6% and +10% of experimental values, respectively.
Computed molecular parameters and 1JCH spin-spin coupling constants in ten geometrically optimized envelope forms were compared to the scaled values reported from HF and MP2 methods346. As a result, the authors concluded that DFT geometry optimization substantially contributed to the difference between the scaled couplings and the DFT-derived 1JCH values. Indirect JCH exhibited weaker (2JCH) or no (3JCH) dependence on the geometry optimization method. Computed JCH values were up to 10% larger in DFT calculations than the corresponding scaled HF/MP2 values, but the coupling trends predicted by both methods were almost identical.
1JCC, 2JCC and 3JCC were computed as a function of ring conformation, and theory level-dependent corrections were evaluated by comparison with HF calculations. With the accuracy achieved, these coupling constants can be used as monosaccharide conformational probes. All indirect couplings in the optimized structures were determined by finite (Fermi-contact) field double perturbation theory with the basis set [5s2p1d/2s] previously constructed to recover the Fermi-contact contribution to 13C-13C coupling constants347.
Later these authors investigated the effect of the hydroxymethyl group conformation on the molecular properties of the same ten geometries of 2-deoxy-β-D-erythro-pentofuranose337. Spin-spin coupling constants involving carbon were computed using the same methodology on DFT-reoptimized geometries of the gg rotamer typical of nucleic acids. The authors presented a detailed comparison of coupling magnitudes with those observed earlier320 for the gt rotamer in solution. 1JCH appeared to be most affected by C4-C5 bond rotation, presumably due to substantial changes in the C-H bond length accompanying the rotation. The results for 2JCH, 3JCH, 2JCC and 3JCC confirmed prior predictions of the dependence of the couplings on the ring conformation.
Cloran and coworkers also investigated the effect of amino substitution at C1 on the molecular properties of the same compound, including coupling constants338. They compared DFT predictions at B3LYP/[5s2p1d/2s] for both the protonated and unprotonated forms to those published for 2-deoxy-β-D-erythro-pentofuranose without an amino group320. These studies supported the suggestion that a different projection rule is required to predict 2JC2,H1 in nucleosides348. According to the reported findings, N-substitution of O1 exerts only a minor effect on the magnitudes of 2JC1,H2 and 3JC1,H3, as well as on the magnitudes of 3JCCCH, 3JCOCH and 3JCOCC, regardless of the state of N-protonation. In contrast, 2JCCH couplings are strongly modulated by substitution at the carbon bearing the coupled proton; a much smaller effect is observed when the substitution occurs at the coupled carbon. The direct couplings were predicted to increase by ca. 10 Hz (1JC1,H1) and to decrease by ca. 2-4 Hz (1JC1,C2) upon N-protonation, which makes them a probe of the protonation state of aminosugars in
solution338.
Calculation of homo- and heteronuclear coupling constants, as well as chemical shifts, of the mononucleotide cyclic uridine monophosphate was performed by Bagno and coworkers277. An overview of the computational method used, including the conformational search, is given above in section 5.1. The solvent was modeled using PCM in both the geometry and the coupling constant calculations. The calculation of coupling constants included all four Ramsey terms. In spite of a good correlation between calculated and experimental values (JHH: R² = 0.998, MAD = 0.90 Hz; JCH: R² = 0.974, MAD = 13.4 Hz), the slope and intercept of the best-fit line were not ideal, especially for JHH couplings. Nevertheless, this common problem in coupling constant calculation (see introduction to this section) did not prevent a qualitative agreement.
Bour and coworkers interpreted indirect spin-spin NMR 1H-1H coupling constants of synthetic erythrofuranose derivatives on the basis of ab initio modeling340. Epoxy, epithio, and epimino groups were attached to the sugars to limit their conformational flexibility. These restrictions improved the calculation performance and simplified the estimation of the dependence of the spin coupling on the molecular geometry.
Fully relaxed geometries were optimized with the HF, MP2, and DFT (B3LYP and BPW91) methods using the 4-31G, 6-31G(d,p), and 6-311++G(d,p) basis sets. The authors modeled benzene solution using COSMO248. To calculate spin-spin couplings they used the coupled perturbed approach of the DFT method349 in vacuum and various basis sets, including the NMR-optimized IGLO II and IGLO III209. The B3LYP/IGLO II computations included all four important magnetic terms in the Hamiltonian. Typically, all coupling constants within methyl 2,3-epoxy-L-erythrofuranoside exhibited only a minor (<15%) dependence on the selected conformer and on the basis set (IGLO II vs. IGLO III) and fitted the experimental data within 1 Hz. However, the only short-range coupling constant, 2J4R,4S, decreased by 2.1 Hz with the bigger basis set. The authors used a series of conformers optimized at MP2/6-311++G(d,p) to test the geminal and vicinal coupling prediction against different basis sets containing from 88 to 376 basis functions. Except for 4-31G, the calculation accuracy was similar, but the agreement with the experiment did not improve as the basis set size increased. This confirmed the complexity of modeling spin-spin coupling with all four Hamiltonian terms and agreed with the earlier findings of Helgaker and coworkers350.
Mobli and Almond used DFT methods to calculate coupling constants between HN and H2 in N-acetylated amino sugars and to derive Karplus equations for 3JH-N-C-H341. The ab initio calculations slightly overestimated the coupling constants. In contrast to an explicit solvent model explored by MD simulations, an implicit-solvent PCM method lowered the magnitude of the calculated values, bringing them closer to the experiment. The authors explained the worse results of explicit solvent inclusion by highly dynamic interactions with water, which were difficult to simulate by static DFT calculations. However, the models predicted with explicit solvent were more conformationally realistic.
The D-pyranose rings of the N-acetylated amino sugars were fixed in the 4C1 chair conformation and optimized at B3LYP/6-31G(d,p). For every amide group rotamer of α-D-GlcpNAc, the SD, PSO and DSO spin–spin coupling terms were calculated at B3LYP/IGLO-III (11s,7p,2d/6s,2p) [7s,6p,2d/4s,2p]209, while the FC term was calculated using the bigger HIIIsu3 basis set351, (14s,7p,2d/9s,2p) [14s,6p,2d/9s,2p]. Owing to the observed dominance of the FC term, β-D-GlcpNAc and α-GalpNAc were processed with the FC term only.
The Karplus equations derived by non-linear least-squares fitting (e.g. for the full calculation of α-D-GlcpNAc: $^3J_{\mathrm{H\text{-}N\text{-}C\text{-}H}} = 9.81\cos^2(\theta+\varphi) - 1.51\cos(\theta+\varphi) + 0.62$) exhibited trends similar to those previously reported for peptide amide groups, although the coupling constants were greater in magnitude. The authors showed that molecular dynamics must be taken into account in order to reproduce the experimental values of 3JH-N-C-H. Dynamical spreads at the acetamido groups were obtained by integration of a Karplus curve and subsequent analysis of the group libration range.
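To make the effect of libration on a Karplus-type coupling concrete, the short derivation below averages the three-term equation over a libration window. It is a sketch under an assumption that is ours and not the authors': the torsion is taken to be uniformly distributed over $[\theta_0-\Delta,\ \theta_0+\Delta]$ (half-width $\Delta$ in radians). The symbols $x_0$, $\theta_0$ and $\Delta$ are introduced here for illustration; $A = 9.81$, $B = -1.51$ and $C = 0.62$ Hz are the α-D-GlcpNAc coefficients quoted above.

```latex
\[
\langle \cos x \rangle
  = \frac{1}{2\Delta}\int_{x_0-\Delta}^{x_0+\Delta}\cos x \,\mathrm{d}x
  = \cos x_0 \,\frac{\sin\Delta}{\Delta},
\qquad
\langle \cos^2 x \rangle
  = \frac{1}{2} + \frac{1}{2}\cos(2x_0)\,\frac{\sin 2\Delta}{2\Delta},
\qquad x_0 = \theta_0 + \varphi
\]
\[
\langle {}^{3}J \rangle
  = A\left[\frac{1}{2} + \frac{1}{2}\cos(2x_0)\,\frac{\sin 2\Delta}{2\Delta}\right]
  + B\cos x_0\,\frac{\sin\Delta}{\Delta} + C
\]
```

For $\Delta \to 0$ the static Karplus value is recovered, whereas for free rotation ($\Delta = \pi$) both angular terms vanish and $\langle {}^{3}J \rangle = A/2 + C = 5.525$ Hz regardless of $\theta_0$. Libration therefore pulls large couplings down and small couplings up, which is why neglecting dynamics biases the comparison with experiment.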
6.2. Inter-residue coupling constants
The detailed analysis of various effects on intra-residue coupling constants given in the previous section makes it possible to address intriguing questions related to the structure and bonding of more complex carbohydrates. Indeed, long-range coupling constants across glycosidic bonds serve as a probe of oligo- and polysaccharide conformation. Karplus-type interpretation of these coupling constants (in addition to NOEs) provides spatial constraints for the glycosidic bond torsion angles352.
Representative examples of predictions of inter-residue coupling constants in carbohydrates are summarized in Table 6.
Table 6. Prediction of coupling constants in oligo- and polysaccharides
Object (molecule) | Coupling constant (e.g. 3JH-N-C-H) | Calculation method: geometry | Calculation method: coupling | Software | Application | Ref.
α-D-Glcp-(1→1)-α-D-Glcp, α-D-Glcp-(1→4)-D-Glcp, β-D-Galp-(1→4)-D-Glcp, β-D-Glcp-(1→4)-D-Glcp, β-D-Glcp-(1→6)-D-Glcp, α-D-Galp-(1→6)-D-Glcp, α-D-Glcp-(1→3)-D-Glcp, β-D-Glcp-(1→3)-D-Glcp | 3JCH (inter-glycosidic) | s-MD (Amber-H force field) | Karplus-type correlation curve derived by Tvaroska and coworkers353 | Insight II Molecular modeling program 192 (v. 4.0.0), molecular mechanics / dynamics package (v. 2.9) | testing the suitability of the molecular modeling approach | 354
1C4 α-L-IdopA2S(1→4)-α-D-GlcpN6S-1OMe, 2S0 α-L-IdopA2S(1→4)-α-D-GlcpN6S-1OMe (heparin disaccharide) | 3JCH (inter-glycosidic) | B3LYP/6-311++G**, M05-2X/6-311++G** (with explicit solvent, ONIOM) | B3LYP | Gaussian 03, Gaussian 09 134, 135 | studying coupling constant variations upon counterion and solvent effects | 240
α-D-Glcp-(1→1)-α-D-Glcp | 3JCH (inter-glycosidic) | MD (CHARMM carbohydrate force field) | Karplus-type correlation curve derived by Tvaroska and coworkers353 | CHARMM 164 | establishing a comprehensive understanding of the hydration pattern of trehalose | 332
β(1-4)-linked disaccharide models | 2JCOC (inter-glycosidic) | B3LYP/6-31G(d) | FF-DPT (Fermi-contact), B3LYP/[5s2p1d,2s] | Gaussian 94 | studying the influence of structural factors on transglycosidic 2JCOC | 355
β-D-Ribf-1OMe-(3-P-5)-β-D-Ribf-1OMe (RNA backbone) in 16 “experimental” conformations | 2J, 3J, 4J between all 1H, 13C and 31P | MM/Amber, B3LYP/6-31G(d,p) (with explicit solvent) | coupled perturbed DFT, B3LYP/IGLO II and IGLO III | Amber 178, 179 (geometry optimization), Gaussian 03 (NMR calculation) | interpretation of nucleic acid backbone conformation using coupling constants | 356
β-D-2-deoxy-Ribf-1(N-base), N-base=A,C,G,U,T | 3JC-H1’, 3JC-H1’, 1JC1’-H1’ | B3LYP/6-31G(d) | DFT/FPT PW86/IGLO-III (FC term); SOS-DFPT (PSO and DSO terms) | Gaussian 98, deMon-NMR 280-282 | study of relationship between spin coupling and the glycosidic torsion angle | 357
Cheetham and coworkers calculated the interglycosidic heteronuclear coupling constants (3JH1Cx and 3JC1Hx) for a series of eight α- or β-linked glucosyl- and galactosyl-glucopyranoses. The authors used the Karplus relationship of Tvaroska and coworkers353, $J = 5.7\cos^2\varphi - 0.6\cos\varphi + 0.5$, with $\cos^2\varphi$ and $\cos\varphi$ conformationally averaged over the s-MD trajectories. The aim of this calculation was to determine whether a molecular modeling approach would provide results similar to those obtained experimentally. The crystal conformation of each disaccharide was used as the starting geometry for the MD simulations in the Amber-Homans force field193, with the explicit inclusion of water. The authors showed that, except for the C1-O1-Cx-Hx dihedral angle in β(1-4)-linked disaccharides, their relatively simple modeling could reproduce results close to those of the experimental and other modeling studies354.
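The conformational averaging described above can be sketched numerically. The coefficients 5.7, -0.6 and 0.5 are those of the Karplus relationship quoted above; the torsion trajectory is a hypothetical Gaussian sample (mean 40°, standard deviation 15°) standing in for real MD output, and the function names are illustrative.

```python
import numpy as np

def karplus_tvaroska(phi_deg):
    """J = 5.7*cos^2(phi) - 0.6*cos(phi) + 0.5 (Hz), as quoted in the text."""
    phi = np.radians(phi_deg)
    return 5.7 * np.cos(phi) ** 2 - 0.6 * np.cos(phi) + 0.5

def trajectory_averaged_j(phi_deg_samples):
    """Average cos^2(phi) and cos(phi) over the samples, then apply the equation."""
    phi = np.radians(np.asarray(phi_deg_samples, dtype=float))
    mean_cos2 = np.mean(np.cos(phi) ** 2)
    mean_cos = np.mean(np.cos(phi))
    return float(5.7 * mean_cos2 - 0.6 * mean_cos + 0.5)

# Hypothetical torsion trajectory in degrees; a real one would be extracted
# from the MD frames with a trajectory analysis tool.
rng = np.random.default_rng(1)
phi_traj = rng.normal(loc=40.0, scale=15.0, size=5000)

j_dynamic = trajectory_averaged_j(phi_traj)          # <J> over the ensemble
j_static = float(karplus_tvaroska(phi_traj.mean()))  # J at the mean torsion
print(round(j_dynamic, 2), round(j_static, 2))
```

Because the Karplus equation is non-linear in the torsion angle, the ensemble average generally differs from the coupling evaluated at the mean torsion, which is the reason for averaging the cosine terms rather than the angle itself.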
Engelsen and Perez calculated interglycosidic heteronuclear coupling constants in α,α-trehalose within a study aimed at establishing a comprehensive understanding of the hydration pattern of this disaccharide and comparing it to sucrose332. Starting from the X-ray geometry, the authors ran a 2.5 ns MD simulation in the CHARMM carbohydrate force field with the explicit inclusion of 485 TIP3P water molecules. The heteronuclear coupling constant 3JH,C across the glycosidic linkage was calculated according to the obtained adiabatic map and the equation for the C-O-C-H fragment parameterized by Tvaroska and coworkers353. The derived value of 2.3 Hz was much closer to the experiment (2.5 or 3.3 Hz) than the 1.5 Hz obtained by the MD simulation in vacuum. In contrast, the precision of the intra-residue 3JH5,C1 calculation was not affected by the inclusion of solvent.
Since DFT calculations were shown to yield computed JCC within ∼10% of experiment without scaling320, 2JCOC values could be computed to within 0.2-0.3 Hz of the experimental couplings. This accuracy allowed Cloran and coworkers to study the influence of structural factors on trans-O-glycosidic 2JCOC using DFT calculations. Geometry optimizations were conducted at B3LYP/6-31G*. 13C-13C spin coupling constants were obtained by finite-field double perturbation theory calculations using a basis set previously constructed for similar systems347. Only the FC component was recovered, as it is the main component of JCC in saturated systems. The calculation supported the observation that JCOC depended mainly on the φ angle of the glycosidic bond. An increase in the valence angle at the oxygen produced a more negative JCOC coupling.
Sychrovsky and coworkers calculated indirect heteronuclear coupling constants and related them to the backbone torsion angles and sugar pucker of nucleic acids. The authors used 16 known conformations of the nucleic acid backbone, including well-characterized double-helical forms (B-DNA, A-DNA, A-RNA), and applied them to a baseless dinucleoside phosphate as a molecular model.
The initial models of the dinucleoside phosphates were constructed as reported for RNA fragments358 and for the most prevalent DNA conformations, BI and A359. These “experimental” geometries were relaxed by molecular mechanics with the torsion angles of the backbone restrained to keep the conformation close to a class-identifying state. After geometry relaxation, the nitrogenous bases were substituted by methyl groups and both the 5’ and 3’ ends were terminated by hydroxyl groups.
The coupled perturbed DFT method349 at the B3LYP/IGLO II and IGLO III levels of theory209 was used for the calculation of 1H, 13C, and 31P coupling constants, including all four coupling terms. Explicit hydration, the PCM solvent model and their combination were compared. The PCM hydration accounted for the dominant part of the calculated coupling difference between the in vacuo and hydrated models. The authors calculated all possible coupling constants across two, three or four bonds and correlated them to each of the seven torsion angles characterizing a β-D-Ribf-1OMe-(3-P-5)-β-D-Ribf-1OMe fragment, so that experimentally observed conformations of the nucleic acid backbone could be characterized with a specific set of J-couplings356.
Three of the torsion angles in the nucleic acid backbone have a sharp population distribution. According to the authors, 3J couplings correlated with “sharp” torsions are not properly described by the classical Karplus equation and should be parameterized with explicit consideration of other torsions, either as a multidimensional Karplus curve or as a curve parameterized with neighboring torsions fixed.
Munzarova and Sklenar357 studied the correlation of the non-backbone glycosidic torsion angle in deoxynucleosides with the heteronuclear coupling constants of the anomeric protons. The backbone torsion angles were frozen at their values in B-DNA. The authors derived a phase-shifted Karplus equation and parameterized it separately for every nucleoside, as purine and pyrimidine nucleosides exhibited different torsional dependences of the couplings.
7. Computation of NMR relaxation rates
Best and coworkers reported the results of molecular dynamics simulations compared to the data of NMR relaxation experiments for maltose and isomaltose. The 13C longitudinal relaxation time (T1) depends on the dipolar relaxation between a carbon and its directly attached proton. The relaxation parameters are a function of the spectral density. The most frequently used method of experimental characterization of the spectral density is the model-free formalism. Generalized order parameters may be calculated directly from the simulation. The equations linking these parameters together were published elsewhere360. Both the vacuum and solution simulations of maltose were started from the saddle point (0,0) between the two major energy wells on the adiabatic map. On the basis of the adiabatic map, the lowest minima were chosen as starting points for the simulations of isomaltose. As a result, longitudinal relaxation times were obtained for each carbon in maltose and isomaltose (see Table 2 in the original publication).
Later this group ran MD simulations with explicit water on the minimal model compounds for the α(1-6) branch point of amylopectin: the trisaccharide panose and the tetrasaccharide 6²-α-D-glucosylmaltotriose (maltotriose (1-6)-glucosylated at its middle glucose residue)361. Calculation of the NMR longitudinal relaxation times for panose showed good agreement with the experimental values, and validated the simulation dynamics used. As a check of the validity of the simulation dynamics,
longitudinal T1 relaxation times, which are sensitive to both the extent and the time scale of molecular motion, were calculated directly from the trajectories. The model-free formalism is most suitable for molecules whose intramolecular motion is much more rapid than molecular tumbling, an assumption that may not hold for small oligosaccharides. Besides this, the comparison between the experiment and the simulation is made through indirectly fitted parameters. To resolve these issues the authors adopted the approach they had used previously to calculate T1 relaxation times directly from the MD simulation360. Relaxation times were simulated separately for the CH and CH2 carbon atoms in each panose ring (see Table 3 in the original publication).
In both publications by Best and coworkers360, 361, all calculations were done using the PHLB force field specifically parameterized for carbohydrates177 and implemented in CHARMM. This parameter set addressed the excessive flexibility of the earlier force field (HGFB) and the incorrect preference of the primary alcohols in that force field for the tg rotamer. The TIP3P model was used for the explicit representation of water.
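The link between the spectral density and the 13C T1 mentioned above can be illustrated with the standard dipolar relaxation expression combined with a Lipari-Szabo model-free spectral density. This is a minimal sketch under stated assumptions, not the direct trajectory-based calculation used by Best and coworkers: only the dipolar interaction with directly bonded protons is included (chemical shift anisotropy is neglected), tumbling is isotropic, and the C-H distance (1.09 Å), field (11.74 T) and motional parameters are hypothetical inputs. Function names are illustrative.

```python
MU0_OVER_4PI = 1.0e-7        # T m A^-1
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_C = 6.728284e7         # 13C gyromagnetic ratio, rad s^-1 T^-1

def spectral_density(omega, tau_c, s2=1.0, tau_e=0.0):
    """Model-free J(omega) with the 2/5 normalization.

    tau_c: overall tumbling time (s); s2: generalized order parameter;
    tau_e: effective internal correlation time (s).
    """
    j = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    if tau_e > 0.0 and s2 < 1.0:
        tau = tau_c * tau_e / (tau_c + tau_e)
        j += (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * j

def dipolar_constant(r_ch=1.09e-10):
    """C-H dipolar coupling constant d in rad/s (r_ch is an assumed distance)."""
    return MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_C / r_ch ** 3

def t1_carbon(b0, tau_c, s2=1.0, tau_e=0.0, n_h=1, r_ch=1.09e-10):
    """13C T1 (s) from dipolar relaxation by n_h directly attached protons."""
    w_h, w_c = GAMMA_H * b0, GAMMA_C * b0
    d = dipolar_constant(r_ch)
    j_sum = (spectral_density(w_h - w_c, tau_c, s2, tau_e)
             + 3.0 * spectral_density(w_c, tau_c, s2, tau_e)
             + 6.0 * spectral_density(w_h + w_c, tau_c, s2, tau_e))
    return 1.0 / (n_h * d ** 2 / 4.0 * j_sum)

# Hypothetical example: CH and CH2 carbons at 11.74 T with 50 ps tumbling,
# S^2 = 0.8 and a 10 ps internal correlation time.
t1_ch = t1_carbon(11.74, 50e-12, s2=0.8, tau_e=10e-12, n_h=1)
t1_ch2 = t1_carbon(11.74, 50e-12, s2=0.8, tau_e=10e-12, n_h=2)
print(t1_ch, t1_ch2)
```

In a simulation-based workflow the order parameter and correlation times (or the full correlation function) would be extracted from the C-H vector dynamics in the trajectory instead of being supplied by hand.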
The cross-correlated relaxation rates of double- and zero-quantum coherences have widespread applications in structural and conformational studies of biomolecules, including probing of the O-glycosidic linkage conformation in carbohydrates362.
Sychrovsky and coworkers applied quantum chemical calculation methods to investigate the dependence of the N1/N9 and C1′ CS tensors on the glycosidic torsion angle and sugar pucker in 2′-deoxynucleosides (dAde, dGua, dCyt, dThy). They calculated cross-correlated relaxation rates using reduced equations published elsewhere363 and tested the applicability of Ravindranathan’s363 and Duchardt’s364 methods to deoxyribonucleic acids (DNAs). According to their results, these CS tensors exhibited a significant dependence on the C1’-N torsion angle and sugar pucker, which should be taken into consideration when interpreting cross-correlated relaxation rates between the N1/N9 CS tensor and the C1′-H1′ dipole-dipole interaction in DNAs278.
The geometry of all nucleosides was gradient optimized at the B3LYP/6-31G(d,p) level. All geometrical parameters, except the C1’-N torsion angle, were freely optimized. The NMR shielding tensors were calculated using GIAO B3LYP/(9s,5p,1d/5s,1p) [6s,4p,1d/3s,1p] for carbon, nitrogen and oxygen and B3LYP/(5s,1p) [3s,1p] for hydrogen (basis set IGLO II209). A summary of the CS tensor calculation is included in Table 3 (see above).
Longitudinal and transverse relaxivities are the inverse of the spin-lattice T1 and spin-spin T2 relaxation times, respectively. They contain information on both structure and molecular dynamics. Several attempts have been made to predict the relaxivities of hyaluronan oligomers. Calculations of 13C relaxivities based on the combination of MD simulation algorithms with diffusion theory and the mode-coupling diffusion (MCD) approach were performed on hyaluronan (HA)2 and (HA)4 oligomers in water solutions365. CHARMM168 was used to generate a hyaluronan structure and to place it in a water box. The authors achieved agreement between the calculations for the two oligomers and the experimental data obtained by the “inversion-recovery” technique. For details about mode-coupling diffusion please refer to the original publication365.
Furlan and coworkers presented the calculation of dynamic properties of hyaluronan (HA)4 with consideration of the hydrophobic effect in water solution and local hydrophilic effects due to hydrogen bonding with the solvent. Several configurational distributions and dynamical parameters related to nuclear magnetic relaxation, sensitive to both the molecular structure and the mobility, were calculated from replica-exchange Monte Carlo statistics at different temperatures. Diffusion theory was applied to the calculation of the longitudinal 13C relaxivities366. With the MD calculation method validated, the authors reported an investigation of the molecular structure and detection of the critical length of a hyaluronan polymer. They followed the protocol established for the quantitative description of the size and shape of biopolymer chains, including the construction of chain models by Monte Carlo simulation on the basis of conformational statistical weights of representative dimeric units. The OPLS-AA force field was used for the generation of the conformational energy landscape. The mode-coupling diffusion (MCD) theory with the RM2-II basis set was applied to the calculation of the NMR relaxivities. The computed relaxivities are within 25% of the experimental data, and in all cases they are larger than in the experiments367. Parameterization of the OPLS-AA force field for carbohydrates has been reported earlier188.
8. Computation of other NMR parameters
Averaged characteristics of molecular systems, including NOEs, can be calculated in studies of the conformational equilibrium. Once a conformational map has been calculated, the averaging is usually performed over energies under the assumption that the populations of conformers follow the Boltzmann distribution124. Comparison of the resulting predicted NOEs with the experimental data is a widely used approach to the validation of MD simulations.

Trajectories obtained from MD simulations in the CSFF force field with the TIP3P water model allowed analytical derivation of T1 and T2 relaxation times, cross-correlated relaxation rates and NOEs. The stochastic approach to processing MD simulation data was shown to be suitable for describing the diffusion dynamics of molecules with mainly torsional internal mobility, as demonstrated for γ-cyclodextrin368 and model tri- and pentasaccharides369.

MM3 was reported to be a good force field for producing conformation maps for NOE calculations370. Gerbst and coworkers confirmed the adequacy of molecular modeling of fucoidans by comparing the calculated NOEs of protons at a glycosidic bond with the experimental values. In the case of fucobiose, the absence of signal overlap allowed the use of steady-state NOEs370. Starting from MM3 conformation maps, the authors calculated NOEs using the iterative Noggle and Schirmer equation371 and averaged the results according to the Boltzmann distribution. Selection of conformational minima with energies within 10% of the global energy minimum gave satisfactory results: the average difference between experimental and calculated relative NOEs ranged from 1.3% to 2.5%, depending on the disaccharide sulfation. In the case of linear sulfated fucotriosides, overlap of the anomeric proton signals prevented the use of steady-state NOEs. Transient
NOEs were calculated as 1/r^6, r being an interatomic distance obtained from the optimized geometry. The transient NOEs were Boltzmann-averaged over the MM3 conformation maps calculated for every disaccharide fragment using the same methodology as for fucobiosides. The resulting NOESY cross-peak volumes showed weak correlation with experiment, which encouraged the authors to model the reducing-end sulfate group as undissociated rather than as an anion372.

Casset and coworkers examined the potential energy hypersurface of sucrose using molecular mechanics calculations in the MM3(92) force field interfaced with two different algorithms for conformational searching (a systematic grid-search approach and the CICADA procedure). The CICADA (Channels In Conformational space Analyzed by Driver Approach) method drives selected dihedral angles to explore the low-energy regions and permits full geometry relaxation. Using the grid-search approach, the relaxed adiabatic map of sucrose was calculated as a function of the glycosidic torsion angles, and three families of stable conformers were identified. The CICADA procedure found all minima and the low-energy conversion pathways for sucrose, in agreement with those located by the grid-search approach. Theoretical NOESY volumes were calculated by the full relaxation method from an ensemble-averaged relaxation matrix, as described earlier373. All accessible conformations derived from either the grid-search or the CICADA method were taken into account. Two sets of NOESY volumes were calculated using averaging methods appropriate for slow and fast internal motions. The agreement factors, which are relative deviations between the calculated and the experimental off-diagonal 400 MHz NOESY volumes, were 0.170 (grid-search, fast motion), 0.175 (grid-search, slow motion), 0.175 (CICADA, fast motion) and 0.163 (CICADA, slow motion). These values improved (0.118-0.139) when only intense NOE peaks were considered. The study demonstrated the ability of the CICADA method to
reproduce the potential energy surface of a flexible molecule and therefore to simulate its NOEs159.

Landstrom and Widmalm carried out an all-atom MD simulation of the branching region of the Aeromonas salmonicida O-specific polysaccharide, using the β-D-ManpNAc-(1→4)[α-D-Glcp-(1→3)]-α-L-Rhap-OMe trisaccharide as a model, with explicit solvent molecules. The MD simulations of 1 µs duration revealed a dynamic conformational process on the nanosecond time scale, which had previously received little attention. The results emphasized the predictive power of MD simulations in studies of biomolecular systems and explained an unusual NOE as arising from conformational exchange143. The MD simulations employed PARM22/SU01, a CHARMM22-type force field modified for carbohydrates173, as implemented in NAMD 2.6b1. Initial conditions were prepared by placing the model trisaccharide in a previously equilibrated cubic water box, followed by energy minimization and heating. The smooth particle-mesh Ewald method374 was used to calculate the full electrostatic interaction. The 500 MHz NOESY volumes for the H-2 (Rha) – H-2 (ManNAc) homonuclear interaction were simulated as a function of mixing time for two conformational states of the above trisaccharide model in Mestrelab Research MSpin375 using a full relaxation-matrix formalism376 and a molecular reorientation correlation time of 200 ps.
Blundell and coworkers reported the complete resolution and assignment of nuclei in hyaluronan oligosaccharides with seven different naturally occurring terminal rings. They simulated the non-first-order line shape of the H-2VII proton in HA6AN (structure 7 in the original publication) and used these data in the spectrum assignment. The GAMMA software377 used in the simulation employed multiple iterative rounds with floating values of 3JH,H and ∆δ(H-3,H-4) to find the best fit to the intensity data for the H-2VII proton. All five protons within the GlcA ring were used to model the spin system378.
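The Boltzmann averaging of 1/r^6 NOE estimates over a conformational map, as described in this section, can be sketched as follows. This is an illustration only: the conformer energies and interproton distances are hypothetical, the function names are not taken from any of the cited programs, and spin diffusion and motional effects handled by full relaxation-matrix methods are ignored.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann populations for relative energies in kcal/mol."""
    lowest = min(energies)
    factors = [math.exp(-(e - lowest) / (GAS_CONSTANT * temperature))
               for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def effective_distance(distances, weights):
    """Distance that corresponds to the population-weighted <r^-6> average."""
    mean_r6 = sum(w * r ** -6 for r, w in zip(distances, weights))
    return mean_r6 ** (-1.0 / 6.0)


# Hypothetical conformers: relative energies (kcal/mol) and H-H distances (angstrom)
energies = [0.0, 0.8, 1.5]
distances = [2.4, 3.1, 3.8]

weights = boltzmann_weights(energies)
r_eff = effective_distance(distances, weights)
print([round(w, 3) for w in weights], round(r_eff, 2))
```

Because the r^-6 average is dominated by short distances, the effective distance is always smaller than the population-weighted mean distance, which is why sparsely populated conformers with close contacts can still contribute noticeably to an observed NOE.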
9. Conclusions
Based on the literature data discussed in the present review, we compare the scope and performance of various computational approaches to predicting the NMR parameters of carbohydrates. For a representative view, the accuracy of 13C NMR chemical shift calculations by different methods is summarized for two simple monosaccharides, the α- and β-anomeric forms of D-glucose (Table 7). On average, empirical schemes provide better accuracy, with RMS errors in the range of 0.07–2.51 ppm (in many cases less than 1 ppm). DFT and ab initio calculations have shown varying performance, with RMS values from 1.75 to 9.81 ppm. The best accuracy was observed for the calculations at the B3PW91/6-31+G(d)//MM+ level (RMS = 1.75 ppm)264 and at the PBE/TZ2p level (RMS = 2.43 ppm)236.

Methyl 6-O-(diphenylphospho)-α-D-glucopyranoside [(PhO)2P(O)-O-6)α-D-Glcp-1OMe], a sugar derivative with an uncommon substituent, can serve as a crucial test. Chemical shift simulation for such a compound is a complicated task both for density functional theory (due to the complex electronic structure) and for empirical schemes (due to poor representation in chemical shift databases). A summary of theoretical predictions by different methods reveals important trends (Table 8). DFT calculations at the B3LYP level provide reasonable accuracy for the carbohydrate moiety (RMS = 3.28 ppm) and an unacceptable error for the rest of the molecule (RMS = 5.89 ppm). Changing the density functional to PBE improves the description of the non-carbohydrate part of the molecule (RMS = 3.03 ppm) but gives poor results for the monosaccharide core (RMS = 9.37 ppm), which suggests that B3LYP is better suited to conformationally flexible structures.

An important point for theoretical calculations at the density functional and ab initio levels is the choice of reference. As summarized in Table 8, a noticeable difference in accuracy may be observed upon changing the reference from benzene (RMS = 9.37, 3.03 ppm) to ethylene glycol (RMS = 9.31, 4.76 ppm).

Despite the rare nature of the molecule, empirical methods gave a good prediction, with an RMS deviation as small as 2.01 ppm using the carbohydrate-optimized BIOPSEL algorithm (Table 8). An obvious drawback of the specific empirical schemes is the limited range of structures they can analyze: BIOPSEL cannot predict chemical shifts of the aromatic part, while CASPER lacks an interface for phosphorus-bonded monosaccharides. Empirical algorithms with non-specific databases demonstrated a good predictive potential for chemical shifts of the carbohydrate moiety (RMS = 3.14 and 3.83 ppm) and excellent predictive potential for the rest of the molecule (RMS = 0.11 and 0.52 ppm). This difference in accuracy can be attributed to the rigid structure of the benzene ring and the presence of only a single stereoisomer with the given atom connectivity, a structural fragment that is highly populated in the chemical shift database.

The sucrose disaccharide serves as a test of chemical shift prediction for molecules containing glycosidic bonds and residues poorly populated in chemical shift databases, such as fructofuranose. Fig. 10 illustrates the results of empirical and quantum-mechanical calculations. Only the GIAO calculation at the B3LYP/6-311++G(2d,2p) level with the COSMO water model (Fig. 10C) could accurately predict the chemical shifts of carbons adjacent to the glycosidic bond, while the empirical prediction was the only one to produce the correct order of signals. The accuracy of the quantum-mechanical predictions was similar, except for the non-hydroxylated carbon of the fructose residue (F5). The statistical and performance data of these calculations are summarized in Table 9.

To conclude, recent outstanding progress in the development of theoretical methods and the rapid increase in hardware performance have greatly facilitated the application of computational tools to NMR-based structural studies of carbohydrates. Nowadays, prediction of the NMR parameters of mono- and small oligosaccharides benefits from formalized procedures and has become a routine task. Nevertheless, in spite of the wide range of potential applications, the limitations of computational approaches are still an important issue.

Table 7. Comparison of 13C NMR spectra of α- and β-glucopyranose calculated by various methods in solution or gas phase.

| Method (chemical shifts // geometry) | 13C NMR spectrum of D-Glcp a,c, ppm | RMS error b,c, ppm | Notes and reference |
|---|---|---|---|
| Non-empirical methods | | | |
| B3LYP/pcJ//B3LYP/6-31G(d,p) | 104.74 82.14 83.96 79.55 81.89 70.10 (α) | 9.81 (α) | ref. 260 |
| | 108.59 85.01 86.00 79.23 86.58 70.18 (β) | 9.68 (β) | |
| BP86/TZVP//BP86/TZVP | 103.16 79.17 81.03 74.10 79.53 65.28 | 6.81 (α) | for α-D-Glcp 4C1; ref. 276 |
| B3LYP/cc-pVTZ//B3LYP/6-31G(d,p) | 101.45 79.34 81.01 75.25 79.70 66.36 (α) | 6.70 (α) | gas phase, averaged through conformers; ref. 254 |
| ONIOM [MP2 : HF/6-311++G(2d,2p)]//MP2/cc-pVDZ | 102.88 79.37 79.65 80.59 76.55 70.37 (β) | 6.28 (β) | for β-D-Glcp 4C1 T; ref. 263 |
| HF SCF/TZVP//B3LYP/TZVP | 86.52 67.63 69.34 63.24 68.58 57.19 (α) | 5.37 (α) | for α-D-Glcp 4C1; ref. 276 |
| B3LYP/TZVP//B3LYP/TZVP | 100.06 77.88 79.77 72.60 77.91 64.23 (α) | 5.12 (α) | |
| MP2/TZVP//B3LYP/TZVP | 99.95 77.70 79.51 72.15 77.32 64.09 (α) | 4.87 (α) | |
| ONIOM [MP2 : HF/6-311++G(2d,2p)]//MP2/cc-pVDZ | 102.93 79.18 80.15 74.83 80.28 66.88 (β) | 4.50 (β) | for β-D-Glcp 4C1 G+; ref. 263 |
| MP2/TZVP//MP2/TZVP | 99.61 77.37 78.61 71.73 75.59 63.88 (α) | 4.25 (α) | for α-D-Glcp 4C1; ref. 276 |
| ONIOM [MP2 : HF/6-311++G(2d,2p)]//MP2/cc-pVDZ | 102.61 78.95 80.09 71.25 79.60 62.84 (β) | 3.39 (β) | for β-D-Glcp 4C1 G-; ref. 263 |
| PBE/TZ2p | 94.47 73.45 77.42 73.23 75.46 63.25 (α) | 2.43 (α) | normalized against ethylene glycol; ref. 236 |
| B3PW91/6-31+G(d)//MM+ | 91.6 70.6 72.3 72.1 70.4 63.9 (α) | 1.75 (α) | ref. 264 |
| | 94.8 74.6 74.6 72.4 74.6 63.5 (β) | 1.83 (β) | |
| Empirical methods | | | |
| HOSE + neural net (ACDLabs 10) | 95.33 74.67 76.68 70.71 76.66 62.10 (α) | 2.51 (α) | required manual setting of geometry d |
| | 95.33 74.67 76.68 70.71 76.66 62.10 (β) | 0.65 (β) | |
| HOSE (MestreNova / ModGraph) | 95.54 74.73 74.52 70.41 76.24 62.11 (α) | 2.15 (α) | results for α- and β-glucose were the same |
| | 95.54 74.73 74.52 70.41 76.24 62.11 (β) | 1.09 (β) | |
| HOSE (ACDLabs 10) | 95.56 73.08 73.73 70.13 74.93 62.38 (α) | 1.55 (α) | required manual setting of geometry d |
| | 95.56 73.08 73.73 70.13 74.93 62.38 (β) | 1.78 (β) | |
| Incremental (BIOPSEL / BCSDB) | 93.3 72.7 74.0 70.9 73.2 61.9 (α) | 0.42 (α) | reported accuracy was “best” (mark 4 of 0..4); ref. 114 |
| | 97.1 75.4 77.0 70.9 77.2 62.1 (β) | 0.30 (β) | |
| Incremental (CASPER) | 92.99 72.47 73.78 70.71 72.37 61.84 (α) | 0.08 (α) | reported expected error was 0.00; ref. 117 |
| | 96.84 75.20 76.76 70.71 76.76 61.84 (β) | 0.07 (β) | |
| Experimental data | | | |
| 30°C, in water | 92.77 72.15 73.43 70.32 72.10 61.27 (α) | | ref. 260 |
| | 96.59 74.81 76.43 70.27 76.61 61.42 (β) | | |
| in water | 93.3 72.7 74.0 70.9 72.7 61.9 (α) | | ref. 112 |
| | 97.1 75.4 77.0 70.9 77.2 62.1 (β) | | |
| 70°C, in water | 92.99 72.47 73.78 70.71 72.37 61.84 (α) | | ref. 119 |
| | 96.84 75.20 76.76 70.71 76.76 61.84 (β) | | |
| | 92.9 72.5 73.8 70.6 72.3 61.6 (α) | | ref. 318 |
| | 96.7 75.1 76.7 70.6 76.8 61.7 (β) | | |
| 25°C, in water | 92.48 71.87 73.15 70.04 71.82 60.99 (α) | | ref. 254 |
| | 93.3 73.1 74.4 71.2 72.9 62.4 (α) | | ref. 379 |

a Signals are in the order of atom enumeration (C1-C6).
b RMS error was calculated against the experimental spectrum averaged from six listed sources.
c Anomeric configurations are in parentheses.
d ACDLabs could not properly minimize the geometry starting from the template “Glc”. Prior to calculation, the starting geometry of α- and β-Glcp was manually set with subsequent built-in MM2 minimization. Results for α- and β-glucose in water were the same.
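The RMS errors in Table 7 are stated (footnote b) to be calculated against the experimental spectrum averaged over the six listed sources. A minimal sketch of that calculation for the α-anomer is given below. It assumes a plain root-mean-square over the six carbons (division by N) and an unweighted average of the experimental spectra; the variable and function names are illustrative.

```python
import math

# Experimental 13C shifts of α-D-Glcp (C1..C6, ppm) from the six sources in Table 7
EXPERIMENTAL_ALPHA = [
    [92.77, 72.15, 73.43, 70.32, 72.10, 61.27],  # ref. 260
    [93.3, 72.7, 74.0, 70.9, 72.7, 61.9],        # ref. 112
    [92.99, 72.47, 73.78, 70.71, 72.37, 61.84],  # ref. 119
    [92.9, 72.5, 73.8, 70.6, 72.3, 61.6],        # ref. 318
    [92.48, 71.87, 73.15, 70.04, 71.82, 60.99],  # ref. 254
    [93.3, 73.1, 74.4, 71.2, 72.9, 62.4],        # ref. 379
]


def mean_spectrum(spectra):
    """Average several spectra position by position (unweighted)."""
    return [sum(column) / len(column) for column in zip(*spectra)]


def rms_error(predicted, reference):
    """Root-mean-square deviation between two shift lists of equal length."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


reference_alpha = mean_spectrum(EXPERIMENTAL_ALPHA)

# Predicted α-D-Glcp spectra taken from Table 7
casper_alpha = [92.99, 72.47, 73.78, 70.71, 72.37, 61.84]
pbe_alpha = [94.47, 73.45, 77.42, 73.23, 75.46, 63.25]

rms_casper = rms_error(casper_alpha, reference_alpha)
rms_pbe = rms_error(pbe_alpha, reference_alpha)
print(f"CASPER: {rms_casper:.2f} ppm, PBE/TZ2p: {rms_pbe:.2f} ppm")
```

Under these assumptions the sketch gives 0.08 ppm for CASPER and 2.43 ppm for PBE/TZ2p, in line with the corresponding α-anomer entries of Table 7.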
Table 8. Prediction of 13C NMR chemical shifts (in ppm) for (PhO)2P(O)-O-6)α-D-Glcp-1OMe using different methods.

| Method | Software | C1 | C2 | C3 | C4 | C5 | C6 | Me | i-Ph | o-Ph | m-Ph | p-Ph | RMS error b, Glc part | RMS error b, other atoms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PBE/TZ2p (geometry, NMR) 236 | PRIRODA a 380, normalized against benzene (default) | 106.32 | 76.67 | 79.97 | 74.38 | 74.60 | 50.63 | 54.61 | 157.72, 158.47 | 120.08, 119.18 | 129.31, 128.89 | 124.39, 122.96 | 9.37 | 3.03 |
| PBE/TZ2p (geometry, NMR) 236 | PRIRODA a 380, normalized against ethylene glycol | 102.25 | 72.60 | 75.90 | 70.31 | 70.53 | 46.56 | 50.54 | 154.02 | 116.01, 115.11 | 125.24, 124.82 | 119.61 | 9.31 | 4.76 |
| B3LYP/6-31G(d) (geometry, NMR) 268 | Gaussian 03 135 | 98.02 | 72.42 | 74.11 | 77.16 | 69.90 | 69.31 | 55.20 | 145.25 | 114.95 | 122.55 | 118.02 | 3.28 | 5.89 |
| HOSE and its variations | Mestre Nova a 73, 76 | 103.31 | 72.66 | 74.67 | 70.79 | 75.08 | 66.89 | 55.81 | 151.31 | 120.53 | 130.07 | 126.15 | 3.83 | 0.52 |
| HOSE and its variations | ACDLabs a 68, 381 | 102.44 | 71.84 | 75.90 | 69.94 | 72.85 | 67.38 | 55.50 | 150.60 | 120.10 | 129.80 | 125.50 | 3.14 | 0.11 |
| Incremental at residual level | BIOPSEL 113, 115 (poor accuracy reported c) | 99.3 | 72.7 | 74.0 | 70.9 | 72.7 | 66.9 | no output | no output | no output | no output | no output | 2.01 | - |
| Incremental at residual level | CASPER 116, 121 | no output (P-linked sugars are not supported) | | | | | | | | | | | - | - |
| Experiment in CDCl3 268 | | 95.51 | 71.88 | 73.92 | 69.63 | 70.29 | 68.22 | 55.27 | 150.45 | 120.12 | 129.84 | 125.52 | 0 | 0 |

a Starting geometry was obtained by MM2, then re-optimized in the specified software.
b RMS was calculated against experimental data in CDCl3.
c BIOPSEL reported the calculation accuracy as poor (mark 1 of 0..4) for the saccharide part and as 0 of 0..4 for the non-saccharide part.
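The effect of the reference choice can be illustrated with the two PRIRODA rows of Table 8. For C1-C6 the benzene-referenced and ethylene glycol-referenced values differ by a constant 4.07 ppm, so re-referencing amounts to a uniform shift of the computed spectrum. The sketch below, with illustrative names and a plain root-mean-square over C1-C6 assumed for the “Glc part” error, checks this offset and recomputes the two RMS values.

```python
import math

# C1..C6 shifts (ppm) from Table 8
EXPERIMENT = [95.51, 71.88, 73.92, 69.63, 70.29, 68.22]           # in CDCl3
PBE_VS_BENZENE = [106.32, 76.67, 79.97, 74.38, 74.60, 50.63]
PBE_VS_ETHYLENE_GLYCOL = [102.25, 72.60, 75.90, 70.31, 70.53, 46.56]


def rms_error(predicted, reference):
    """Root-mean-square deviation between two shift lists of equal length."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rereference(shifts, offset):
    """Apply a uniform offset (ppm) that corresponds to a change of reference."""
    return [s + offset for s in shifts]


# Per-carbon difference between the two referencing schemes
offsets = [b - g for b, g in zip(PBE_VS_BENZENE, PBE_VS_ETHYLENE_GLYCOL)]
offset = sum(offsets) / len(offsets)

# Shifting the benzene-referenced values reproduces the ethylene glycol row
shifted = rereference(PBE_VS_BENZENE, -offset)

rms_benzene = rms_error(PBE_VS_BENZENE, EXPERIMENT)
rms_glycol = rms_error(PBE_VS_ETHYLENE_GLYCOL, EXPERIMENT)
print(f"offset = {offset:.2f} ppm")
print(f"RMS (benzene) = {rms_benzene:.2f} ppm, RMS (ethylene glycol) = {rms_glycol:.2f} ppm")
```

A uniform offset changes the RMS error but not the spread of the individual deviations, so the large C6 deviation of the PBE/TZ2p prediction persists with either reference.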
Empirical schemes are easy to use and provide good accuracy for compounds possessing a widespread structural motif, while their performance for molecules with atypical substituents (for which they are not parameterized) or with an unexpected secondary structure can be poor. Further extension of databases and algorithmic improvements are expected to broaden the applicability of empirical algorithms.

In contrast to the empirical methods, DFT and ab initio calculations should be suitable for computational prediction of the NMR parameters of any carbohydrate structure and substituent from first principles. A careful selection of the theory levels used for geometry optimization and NMR calculation allows reasonable accuracy to be achieved for small systems (monosaccharides). An outstanding advantage of DFT and ab initio calculations is the ability to predict NMR parameters other than chemical shifts. In particular, their successful application to the prediction of spin-spin coupling constants, relaxation rates, NOEs, chemical shifts in atypical solvents, and conformation-specific NMR parameters has been reported.

Finally, we anticipate rapid progress in the application of DFT and ab initio calculations to the prediction of the NMR parameters of carbohydrates, with a more focused impact on everyday practical needs in experimental structure determination. At the same time, empirical correlations are still in use for routine NMR predictions and for glycans built up of three or more sugar residues, since it is still too early to expect superior performance of DFT calculations in all aspects of carbohydrate structure analysis.
Table 9. The performance of empirical and density functional prediction of the 13C NMR spectrum of sucrose. a

| Parameters | BCSDB/BIOPSEL b 114 (empirical) | GIAO at PBE/TZ2p level c 236 | GIAO at B3LYP/6-311G(2d,p) level c | COSMO / GIAO at B3LYP/6-311++G(2d,2p) level d |
|---|---|---|---|---|
| RMS, ppm | 2.6 | 5.4 | 4.6 | 3.9 |
| Linear correlation | 0.994 | 0.981 | 0.953 | 0.981 |
| Calculation time e | <0.1 sec | 29 min | 12.5 hours | 67.8 hours |

a Reference experimental data are from the 13C NMR spectrum at 25°C in D2O.
b The chemical shift database used is dedicated to water solutions of carbohydrates.
c The spectrum was normalized against the chemical shift of ethylene glycol (63.4 ppm in CDCl3 379). The geometry was optimized in the same basis set.
d The calculations were carried out in the COSMO water model. The geometry was optimized in the same basis set. The spectrum was normalized against ethylene glycol in CDCl3 (63.4 ppm379), while normalization against ethylene glycol in D2O (67.3 ppm379) gave an RMS error of 7.4 ppm.
e Performance data were obtained on a personal computer with an Intel Core 2 quad-core processor running at 3.0 GHz.
Fig. 10. Deviation of 13C chemical shifts of the sucrose disaccharide predicted by different methods from the experimental spectrum. Dashed lines represent correlations between signals. Black values (A) were predicted empirically using an incremental scheme. The experimental spectrum (B) was recorded at 25°C in D2O. Signal assignment is denoted by F (fructose residue) or G (glucose residue) and the carbon number. Red values (C) were calculated at the PBE/TZ2p level, and green values (D) were calculated at the B3LYP/6-311++G(2d,2p) level using the COSMO water model. See calculation details in Table 9.
10. Abbreviations
Force fields in italic. QM theory levels in bold. QM functionals and basis sets in bold-italic.

3D – three-dimensional
3D HOSE – enhanced HOSE that utilizes stereochemistry information
AMBER – assisted model building with energy refinement
B3LYP – Becke three-parameter with Lee-Yang-Parr
B3PW91 – Becke three-parameter with Perdew-Wang 91
BD, BD(T), BD(TQ) – Brückner energies (including doubles, triples and quadruples)
BPT – bond polarization theory
CCS, CCSD, CCSDT – coupled cluster methods (including singles, doubles and triples)
CHARMM – chemistry at Harvard macromolecular mechanics
CHF – coupled Hartree-Fock
CI – configuration interaction
CICADA – channels in conformational space analyzed by driver approach
COSMO – conductor-like screening model
COSMOS – computer simulations of molecular structures
CPCM – conductor-like polarizable continuum model
CP/MAS – cross-polarization / magic angle spinning
CS – (isotropic) chemical shift
CSS – chemical shift surface
CST – chemical shielding tensor
CSFF – carbohydrate solution force field
CSGT – continuous set of gauge transformations
IGAIM – individual gauges for atoms in molecules
DFT – density functional theory
DFT-D – density functional theory with distance-dependent dispersive term
DPCM – dielectric polarizable continuum model
DSO – diamagnetic spin orbit (term)
FC – Fermi contact (term)
FF-DPT – finite-field double perturbation theory
GIAO – gauge-including atomic orbital
GIPAW – gauge-including projector augmented-wave
GLYCAM – glycan molecular mechanics force field
GROMOS – Groningen molecular simulation
HOSE – hierarchical organization of spherical environments
HF – Hartree-Fock
IGLO – individual gauge for localized orbitals
LANL2DZ – Los Alamos national laboratory 2-double-z
LD – locally dense
MAD – mean absolute deviation
MCD – mode-coupling diffusion
MD – molecular dynamics
MM2, MM3 – molecular mechanics
MNDO – modified neglect of differential overlap
MPx – Møller–Plesset perturbation theory at order x
NMR – nuclear magnetic resonance
NOE – nuclear Overhauser effect
ONIOM – our own n-layer integrated molecular orbital and molecular mechanics approach
OPLS-AA – optimized potentials for liquid simulations - all-atom
PBE – Perdew-Burke-Ernzerhof
PCM – polarizable continuum model
PHLB – Palma-Himmel-Liang-Brady
PSO – paramagnetic spin orbit (term)
QCISD, QCISD(T), QCISD(TQ) – quadratic configuration interaction
RAI – recoupling of anisotropy information
RMS – root mean square
SCF – self-consistent field
SD – spin dipolar (term)
TIP3P – transferable intermolecular potential, 3 point

Compound names used in the review:
2HOMe-THP – 2-hydroxymethyltetrahydropyran
2-deoxy-eryPen – 2-deoxy-erythropentose
Ara – arabinose
DMSO – dimethylsulphoxide
Gal – galactose
GalNAc – 2-acetamido-2-deoxygalactose
Glc – glucose
GlcNAc – 2-acetamido-2-deoxyglucose
Ery – erythrose
Fuc – fucose
Ido – idose
IdopA2S – 2-sulpho-iduronic acid
Lyx – lyxose
Rib – ribose
TMS – tetramethylsilane
Xyl – xylose
11. Acknowledgements
This review was written in the framework of the development of the NMR prediction engine of the Carbohydrate Structure Database, funded by the Russian Foundation for Basic Research, grants 05-07-90099 and 04-12-00324. The authors thank Prof. Y.A. Knirel and Prof. A.S. Shashkov for critical reading.
12. References
1. N. Gaidzik, U. Westerlind and H. Kunz, Chem Soc Rev, 2013, 42,
4421-4442.
2. R. D. Astronomo and D. R. Burton, Nat Rev Drug Discov, 2010, 9,
308-324.
3. T. J. Boltje, T. Buskas and G. J. Boons, Nat Chem, 2009, 1, 611-622.
4. B. Ernst and J. L. Magnani, Nat Rev Drug Discov, 2009, 8, 661-677.
5. M. A. Johnson and D. R. Bundle, Chem Soc Rev, 2013, 42, 4327-
4344.
6. A. W. Barb and J. H. Prestegard, Nat Chem Biol, 2011, 7, 147-153.
7. M. C. Gambetta, K. Oktaba and J. Müller, Science, 2009, 325, 93-96.
8. M. Molinari, Nat Chem Biol, 2007, 3, 313-320.
9. P. C. Pang, P. C. N. Chiu, C. L. Lee, L. Y. Chang, M. Panico, H. R.
Morris, S. M. Haslam, K. H. Khoo, G. F. Clark, W. S. B. Yeung and
A. Dell, Science, 2011, 333, 1761-1764.
10. G. A. Rabinovich and M. A. Toscano, Nat Rev Immun, 2009, 9, 338-352.
11. T. Yoshida-Moriguchi, L. Yu, S. H. Stalnaker, S. Davis, S. Kunz, M.
Madson, M. B. A. Oldstone, H. Schachter, L. Wells and K. P.
Campbell, Science, 2010, 327, 88-92.
12. O. Alper, Science, 2001, 291, 2338-2343.
13. Z. Shriver, S. Raguram and K. Sasisekharan, Nat Rev Drug Discov,
2004, 3, 863-873.
14. N. C. Reichardt, M. Martin-Lomas and S. Penades, Chem Soc Rev,
2013, 42, 4358-4376.
15. J. F. G. Vliegenthart and R. J. Woods, NMR spectroscopy and
computer modeling of carbohydrates: recent advances, 2006.
16. R. A. Dwek, Chem Rev, 1996, 96, 683-720.
17. C. J. Jones and C. K. Larive, Nat Chem Biol, 2011, 7, 758-759.
18. D. Mohnen and M. L. Tierney, Science, 2011, 332, 1393-1394.
19. S. M. Velasquez, M. M. Ricardi, J. G. Dorosz, P. V. Fernandez, A. D.
Nadra, L. Pol-Fachin, J. Egelund, S. Gille, J. Harholt, M. Ciancia, H.
Verli, M. Pauly, A. Bacic, C. E. Olsen, P. Ulvskov, B. L. Petersen, C.
Somerville, N. D. Iusem and J. M. Estevez, Science, 2011, 332,
1401-1403.
20. V. Wittmann and R. J. Pieters, Chem Soc Rev, 2013, 42, 4492-4503.
21. L. L. Kiessling and J. C. Grim, Chem Soc Rev, 2013, 42, 4476-4491.
22. C.-I. Lin, R. M. McCarty and H.-w. Liu, Chem Soc Rev, 2013, 42,
4377-4407.
23. S. Park, J. C. Gildersleeve, O. Blixt and I. Shin, Chem Soc Rev, 2013,
42, 4310-4326.
24. J. Hirabayashi, M. Yamada, A. Kuno and H. Tateno, Chem Soc Rev,
2013, 42, 4443-4458.
25. S. H. Rouhanifard, L. U. Nordstrom, T. Zheng and P. Wu, Chem Soc
Rev, 2013, 42, 4284-4296.
26. USA Pat., 6346604, 2002.
27. N. A. Kocharova, O. G. Ovchinnikova, I. S. Bushmarinov, F. V.
Toukach, A. Torzewska, A. S. Shashkov, Y. A. Knirel and A.
Rozalski, Carbohydr Res, 2005, 340, 775-780.
28. A. Corma, S. Iborra and A. Velty, Chem Rev, 2007, 107, 2411-2502.
29. H. Röper, Starch, 2002, 54, 89-99.
30. L. D. Schmidt and P. J. Dauenhauer, Nature, 2007, 447, 914-915.
31. T. Werpy and G. Petersen, Top value added chemicals from biomass.
Vol. I. Results of screening for potential candidates from sugars and
synthesis gas, U.S. Department of Energy, Golden, CO, 2004.
32. T. Ståhlberg, W. Fu, J. M. Woodley and A. Riisager, ChemSusChem,
2011, 4, 451-458.
33. S. Van de Vyver, J. Geboers, P. A. Jacobs and B. F. Sels,
ChemCatChem, 2011, 3, 82-94.
34. M. E. Zakrzewska, E. Bogel-Łukasik and R. Bogel-Łukasik, Chem
Rev, 2011, 111, 397-417.
35. M. Chidambaram and A. T. Bell, Green Chem, 2010, 12, 1253-1262.
36. A. Imberty and S. Pérez, Chem Rev, 2000, 100, 4567-4588.
37. S. Pérez, C. Gauthier and A. Imberty, in Oligosaccharides in
chemistry and biology: a comprehensive handbook, eds. B. Ernst, G.
Hart and P. Sinay, Wiley/VCH: Weinheim, 2000, pp. 969-1001.
38. M. Hricovíni, Curr Med Chem, 2004, 11, 2565-2583.
39. R. Stenutz, The structure and conformation of saccharides
determined by experiment and simulation, Stockholm University,
1997.
40. H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H.
Weissig, I. N. Shindyalov and P. E. Bourne, Nucleic Acids Res, 2000,
28, 235-242.
41. N. E. Chayen, Prog Biophys Mol Biol, 2005, 88, 329-337.
42. D. Stock, O. Perisic and J. Löwe, Prog Biophys Mol Biol, 2005, 88,
311-327.
43. J. Duus, C. H. Gotfredsen and K. Bock, Chem Rev, 2000, 100,
4589-4614.
44. M. Frank and S. Schloissnig, Cell Mol Life Sci, 2010, 67, 2749-2772.
45. J. Jiménez-Barbero, M. D. Díaz and P. M. Nieto, Anticancer Agents
Med Chem, 2008, 8, 52-63.
46. V. Roldós, F. J. Cañada and J. Jiménez-Barbero, ChemBioChem,
2011, 12, 990-1005.
47. F. Nicotra, L. Cipolla, B. La Ferla, C. Airoldi, C. Zona, A. Orsato, N.
Shaikh and L. Russo, J Biotechnol, 2009, 144, 234-241.
48. T. R. Rudd, E. A. Yates and M. Hricovíni, Curr Med Chem, 2009,
16, 4750-4766.
49. C. Jones, J Pharm Biomed Anal, 2005, 38, 840-850.
50. World Health Organisation, Tech. Rep. Ser. 927, World Health
Organisation, 2005.
51. W. A. Bubb, Concepts in Magn Reson A, 2003, 19A, 1-19.
52. H. S. Atreya and T. Szyperski, Methods Enzymol, 2005, 394, 78-108.
53. P. Guntert, Prog Nucl Magn Reson Spectrosc, 2003, 43, 105-125.
54. M. Kainosho, T. Torizawa, Y. Iwashita, T. Terauchi, A. M. Ono and
P. Guntert, Nature, 2006, 440, 52-57.
55. D. Malmodin and M. Billeter, Prog Nucl Magn Reson Spectrosc,
2005, 46, 109-129.
56. R. C. Tyler, D. J. Aceti, C. A. Bingman, C. C. Cornilescu, B. G. Fox,
R. O. Frederick, W. B. Jeon, M. S. Lee, C. S. Newman, F. C.
Peterson, G. N. Phillips, Jr., M. N. Shahan, S. Singh, J. Song, H. K.
Sreenath, E. M. Tyler, E. L. Ulrich, D. A. Vinarov, F. C. Vojtik, B. F.
Volkman, R. L. Wrobel, Q. Zhao and J. L. Markley, Proteins, 2005,
59, 633-643.
57. T. Lütteke, Beilstein J Org Chem, 2012, 8, 915-929.
58. G. I. Csonka, K. Elias and I. G. Csizmadia, Chem Phys Lett, 1996,
257, 49-60.
59. M. I. Bilan, A. G. Grachev, A. S. Shashkov, N. E. Nifantiev and A. I.
Usov, Carbohydr Res, 2006, 341, 238-245.
60. M. E. Elyashberg, A. J. Williams and G. E. Martin, Prog Nucl Magn
Reson Spectrosc, 2008, 53, 1-104.
61. M. W. Lodewyk, M. R. Siebert and D. J. Tantillo, Chem Rev, 2012,
112, 1839-1862.
62. L. B. Casabianca and A. C. de Dios, J Chem Phys, 2008, 128,
052201.
63. T. Helgaker, M. Jaszuński and M. Pecul, Prog Nucl Magn Reson
Spectrosc, 2008, 53, 249-268.
64. T. Helgaker, M. Jaszuński and K. Ruud, Chem Rev, 1999, 99, 293-
352.
65. J. Vaara, Phys Chem Chem Phys, 2007, 9, 5399-5418.
66. J. F. G. Vliegenthart, in NMR spectroscopy and computer modeling
of carbohydrates: recent advances, eds. J. F. G. Vliegenthart and R.
J. Woods, 2006, vol. 930, pp. 1-19.
67. _, Upstream Solutions. NMR prediction,
http://www.upstream.ch/products/nmr.html#Prediction%20Quality,
Accessed 2013 May 15.
68. Y. D. Smurnyy, K. A. Blinov, T. S. Churanova, M. E. Elyashberg
and A. J. Williams, J Chem Inf Model, 2008, 48, 128-134.
69. M. Elyashberg, K. Blinov, Y. D. Smurnyy, T. S. Churanova and A.
Williams, Magn Reson Chem, 2010, 48, 219-229.
70. M. Elyashberg, K. Blinov and A. Williams, Magn Reson Chem,
2009, 47, 371-389.
71. W. Bremser, Anal Chim Acta, 1978, 103, 355-365.
72. R. R. Sasaki and B. A. Lefebvre, Burlington, VT, 2006.
73. _, Modgraph. NMRPredict,
http://www.modgraph.co.uk/product_nmr.htm, Accessed 2013 May
15.
74. A. Williams, B. Lefebvre and R. Sasaki, Putting ACD/NMR
predictors to the test,
http://web.archive.org/web/20080902230404/http://www.acdlabs.com/products/spec_lab/predict_nmr/chemnmr/, Accessed 2012 Oct 20.
75. _, ACD/Labs. ACD/NMR databases,
http://www.acdlabs.com/products/dbs/nmr_db/, Accessed 2012 Oct
20.
76. _, MestreLab Research. Mnova NMRPredict,
http://mestrelab.com/software/mnova-nmrpredict-desktop/, Accessed
2013 May 15.
77. _, PerkinElmer. ChemBioOffice,
http://www.cambridgesoft.com/Ensemble_for_Chemistry/ChemBioOffice/, Accessed 2013 May 15.
78. N. Haider and W. Robien, NMRPREDICT-Server,
http://nmrpredict.orc.univie.ac.at/, Accessed 2013 May 15.
79. V. Schütz, V. Purtuc, S. Felsinger and W. Robien, Fresenius J Anal
Chem, 1997, 359, 33-41.
80. W. Robien, Nachr Chem Tech Lab, 1998, 46, 74-77.
81. W. Bremser and M. Grzonka, Microchim Acta, 1991, 104, 1-6.
82. S. V. Trepalin, A. V. Yarkov, L. M. Dolmatova, N. S. Zefirov and S.
A. E. Finch, J Chem Inf Comput Sci, 1995, 35, 405-411.
83. C. Steinbeck, S. Krause and S. Kuhn, J Chem Inf Comput Sci, 2003,
43, 1733-1739.
84. H. Satoh, H. Koshino, K. Funatsu and T. Nakata, J Chem Inf Comput
Sci, 2001, 41, 1106-1112.
85. H. Satoh, H. Koshino, J. Uzawa and T. Nakata, Tetrahedron, 2003,
59, 4539-4547.
86. B. P. Kelleher and A. J. Simpson, Environ Sci Technol, 2006, 40,
4605-4611.
87. M. W. I. Schmidt and A. G. Noack, Global Biogeochem Cycles,
2000, 14, 777-793.
88. H. M. Cartwright, in Artificial neural networks: methods and
applications, ed. D. J. Livingstone, 2008, vol. 458, pp. 1-13.
89. L. Terfloth and J. Gasteiger, Drug Discov Today, 2001, 6(15) Suppl.,
S102-S108.
90. J. Zou, Y. Han and S. S. So, in Artificial neural networks: methods
and applications, ed. D. J. Livingstone, 2008, vol. 458, pp. 14-22.
91. M. Jalali-Heravi, in Artificial neural networks: methods and
applications, ed. D. J. Livingstone, 2008, vol. 458, pp. 78-118.
92. J. P. Radomski, H. van Halbeek and B. Meyer, Nat Struct Biol, 1994,
1, 217-218.
93. J. Aires-de-Sousa, M. C. Hemmer and J. Gasteiger, Anal Chem, 2002,
74, 80-90.
94. Y. Shen and A. Bax, J Biomol NMR, 2010, 48, 13-22.
95. A. G. Gerbst, A. A. Grachev, N. E. Ustuzhanina, N. E. Nifantiev, A.
A. Vyboichtchik, A. S. Shashkov and A. I. Usov, J Carbohyd Chem,
2010, 29, 92-102.
96. _, Modgraph. Neural Network Prediction,
http://www.modgraph.co.uk/product_nmr_network.htm, Accessed
2013 May 15.
97. V. Purtuc, V. Schütz, S. Felsinger and W. Robien, Estimation of 13C-
NMR chemical shift values using neural network technology,
http://homepage.univie.ac.at/wolfgang.robien/wr_alpha.html,
Accessed 2013 May 15.
98. C. Le Bret, SAR QSAR Env Res, 2000, 11, 211-234.
99. J. Meiler, R. Meusinger and M. Will, J Chem Inf Comput Sci, 2000,
40, 1169-1176.
100. J. Meiler and M. Will, J Chem Inf Comput Sci, 2001, 41, 1535-1546.
101. J. H. Holland, Adaptation in natural and artificial systems, MIT
Press Cambridge, MA, USA, 1992.
102. J. Meiler, W. Maier, M. Will and R. Meusinger, J Magn Reson, 2002,
157, 242-252.
103. Y. D. Smurnyy, K. A. Blinov and B. A. Lefebvre, Pacific Grove,
CA, 2006.
104. M. K. McIntyre and G. W. Small, Anal Chem, 1987, 59, 1805-1811.
105. K. A. Blinov, Y. D. Smurnyy, T. S. Churanova, M. E. Elyashberg
and A. J. Williams, Chemometrics and Intelligent Laboratory
Systems, 2009, 97, 91-97.
106. B. E. Mitchell and P. C. Jurs, J Chem Inf Comput Sci, 1996, 36, 58-
64.
107. D. L. Clouser and P. C. Jurs, J Chem Inf Comput Sci, 1996, 36, 168-
172.
108. R. J. Abraham and M. Mobli, Spectrosc Eur, 2004, 4, 16-23.
109. R. J. Abraham, J. J. Byrne, L. Griffiths and R. Koniotou, Magn Reson
Chem, 2005, 43, 611-624.
110. R. Bürgin Schaller, M. E. Munk and E. Pretsch, J Chem Inf Comput
Sci, 1996, 36, 239-243.
111. E. Escalante-Sanchez and R. Pereda-Miranda, J Nat Prod, 2007, 70,
1029-1034.
112. G. M. Lipkind, A. S. Shashkov, Y. A. Knirel, E. V. Vinogradov and
N. K. Kochetkov, Carbohydr Res, 1988, 175, 59-75.
113. F. V. Toukach and A. S. Shashkov, Carbohydr Res, 2001, 335, 101-
114.
114. F. V. Toukach, J Chem Inf Model, 2011, 51, 159-170.
115. F. V. Toukach, Bacterial CSDB: 13C NMR prediction,
http://csdb.glycoscience.ru/help/nmr.html, Accessed 2013 May 15.
116. G. Widmalm, Casper, http://www.casper.organ.su.se/casper/,
Accessed 2013 May 15.
117. M. Lundborg and G. Widmalm, Anal Chem, 2011, 83, 1514-1517.
118. M. Lundborg, C. Fontana and G. Widmalm, Biomacromolecules,
2011, 12, 3851-3855.
119. P. E. Jansson, L. Kenne and G. Widmalm, J Chem Inf Comput Sci,
1991, 31, 508-516.
120. A. Nahmany, F. Strino, J. Rosen, G. J. Kemp and P. G. Nyholm,
Carbohydr Res, 2005, 340, 1059-1064.
121. P. E. Jansson, R. Stenutz and G. Widmalm, Carbohydr Res, 2006,
341, 1003-1010.
122. J. D. Dyekjaer and K. Rasmussen, Mini Rev Med Chem, 2003, 3,
713-717.
123. E. Fadda and R. J. Woods, Drug Discov Today, 2010, 15, 596-609.
124. A. G. Gerbst, A. A. Grachev, A. S. Shashkov and N. E. Nifant'ev,
Rus J Bioorg Chem, 2007, 33, 24-37.
125. R. J. Woods and M. B. Tessier, Curr Opin Struct Biol, 2010, 20, 575-583.
126. U. Burkert and N. Allinger, Molecular mechanics, American
Chemical Society, Washington, DC, 1982.
127. _, Wavefunction, Inc. Spartan software,
http://www.wavefun.com/products/spartan.html, Accessed 2013 May
15.
128. _, Schrödinger. MacroModel,
http://www.schrodinger.com/products/14/11/, Accessed 2013 May
15.
129. F. Mohamadi, N. G. J. Richards, W. C. Guida, R. Liskamp, M. Lipton,
C. Caufield, G. Chang, T. Hendrickson and W. C. Still, J Comput
Chem, 1990, 11, 440-467.
130. D. Paschek and A. Geiger, Department of Physical Chemistry
University of Dortmund, Dortmund, Germany., MOSCITO 4. edn.,
2002.
131. _, Moscito, 139.30.122.11/MOSCITO/, Accessed 2013 May 15.
132. M. Möllhoff and U. Sternberg, J Mol Mod, 2001, 7, 90-102.
133. U. Sternberg, F. T. Koch and P. Losso, COSMOS. Computer-
Simulation von Molekül-Strukturen, http://www.cosmos-software.de/,
Accessed 2013 May 15.
134. _, Gaussian Inc. Gaussian,
http://www.gaussian.com/g_prod/g09.htm, Accessed 2013 May 15.
135. M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A.
Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N.
Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V.
Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A.
Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J.
Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai,
M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, C. Adamo,
J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin,
R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma,
G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S.
Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A.
D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A.
G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A.
Liashenko, P. Piskorz, I. Komaromi, R. L. Martin, D. J. Fox, T.
Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M.
Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C.
Gonzalez and J. A. Pople, Gaussian Inc., Wallingford CT., Gaussian
03, Revision C.02. edn., 2004.
136. M. S. Gordon and M. W. Schmidt, in Theory and applications of the
computational chemistry: the first 40 years, eds. C. E. Dykstra, G.
Frenking, K. S. Kim and G. Scuseria, Elsevier Science, Amsterdam,
2005, pp. 1167-1189.
137. M. Gordon, Gamess, http://www.msg.ameslab.gov/GAMESS/,
Accessed 2013 May 15.
138. M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S.
Gordon, J. H. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su,
T. L. Windus, M. Dupuis and J. A. Montgomery, J Comput Chem,
1993, 14, 1347-1363.
139. _, Hypercube, Inc. HyperChem,
http://www.hyper.com/Products/tabid/354/Default.aspx, Accessed
2013 May 15.
140. M. Froimowitz, Biotechniques, 1993, 14, 1010-1013.
141. S. A. Adcock and J. A. McCammon, Chem Rev, 2006, 106, 1589-1615.
142. L. M. Kroon-Batenburg, J. Kroon and B. R. Leeflang, Carbohydr
Res, 1993, 245, 21-42.
143. J. Landstrom and G. Widmalm, Carbohydr Res, 2010, 345, 330-333.
144. G. Widmalm, R. A. Byrd and W. Egan, Carbohydr Res, 1992, 229,
195-211.
145. Y. Sugita and Y. Okamoto, Chem Phys Lett, 2000, 329, 261-270.
146. S. Re, W. Nishima, N. Miyashita and Y. Sugita, Biophys Rev, 2012,
4, 179-187.
147. J. B. Foresman and A. E. Frisch, Exploring chemistry with electronic
structure methods, 2nd ed, Gaussian Inc., 1996.
148. T. J. Rutherford, J. Wilkie, C. Q. Vu, K. D. Schnackerz, M. K.
Jacobson and D. Gani, Nucleosides Nucleotides Nucleic Acids, 2001,
20, 1485-1495.
149. M. Rahal-Sekkal, N. Sekkal, D. C. Kleb and P. Bleckmann, J Comput
Chem, 2003, 24, 806-818.
150. E. P. O’Brien and G. Moyna, Carbohydr Res, 2004, 339, 87-96.
151. I. Sergeev and G. Moyna, Carbohydr Res, 2005, 340, 1165-1174.
152. C. W. Swalina, R. J. Zauhar, M. J. DeGrazia and G. Moyna, J Biomol
NMR, 2001, 21, 49-61.
153. J. Stewart, J Mol Model, 2007, 13, 1173-1213.
154. P. J. Madeira, N. M. Xavier, A. P. Rauter and M. H. Florêncio, J
Mass Spectrom, 2010, 45, 1167-1178.
155. U. Sternberg, Mol Phys, 1988, 63, 249-267.
156. U. Sternberg and W. Priess, J Magn Reson, 1997, 125, 8-19.
157. D. Sebastiani, G. Goward, I. Schnell and M. Parrinello, Comput Phys
Commun, 2002, 147, 707-710.
158. U. Sternberg, F. T. Koch, W. Priess and R. Witter, Cellulose, 2003,
10, 189-199.
159. F. Casset, A. Imberty, C. Herve du Penhoat, J. Koca and S. Perez, J
Mol Struct, 1997, 395-396, 211-224.
160. _, Serena Software. PCModel, http://www.serenasoft.com/, Accessed
2013 May 15.
161. _, Tinker Molecular modelling, http://dasher.wustl.edu/ffe/, Accessed
2013 May 15.
162. N. L. Allinger, Y. H. Yuh and J. H. Lii, J Am Chem Soc, 1989, 111,
8551-8566.
163. A. Hocquet and M. Langgård, J Mol Model, 1998, 4, 94-112.
164. B. R. Brooks, C. L. Brooks, A. D. Mackerell, L. Nilsson, R. J.
Petrella, B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch,
A. Caflisch, L. Caves, Q. Cui, A. R. Dinner, M. Feig, S. Fischer, J.
Gao, M. Hodoscek, W. Im, K. Kuczera, T. Lazaridis, J. Ma, V.
Ovchinnikov, E. Paci, R. W. Pastor, C. B. Post, J. Z. Pu, M.
Schaefer, B. Tidor, R. M. Venable, H. L. Woodcock, X. Wu,
W. Yang, D. M. York and M. Karplus, J Comput Chem, 2009, 30,
1545-1614.
165. B. Hess, C. Kutzner, D. Van Der Spoel and E. Lindahl, J Chem
Theory Comput, 2008, 4, 435-447.
166. D. Van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and
H. J. Berendsen, J Comput Chem, 2005, 26, 1701-1718.
167. B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S.
Swaminathan and M. Karplus, J Comput Chem, 1983, 4, 187-217.
168. A. D. MacKerell, Jr., N. Banavali and N. Foloppe, Biopolymers, 2001,
56, 257-265.
169. A. D. MacKerell, Jr., B. Brooks, C. L. Brooks, III, L. Nilsson, B.
Roux, Y. Won and M. Karplus, in The encyclopedia of computational
chemistry, ed. P. V. R. Schleyer, John Wiley & Sons, Chichester,
1998, vol. 1, pp. 271-277.
170. O. Guvench, S. N. Greene, G. Kamath, J. W. Brady, R. M. Venable,
R. W. Pastor and A. D. Mackerell Jr, J Comput Chem, 2008, 29,
2543-2564.
171. O. Guvench, E. Hatcher, R. M. Venable, R. W. Pastor and A. D.
MacKerell, J Chem Theory Comput, 2009, 5, 2353-2370.
172. E. R. Hatcher, O. Guvench and A. D. MacKerell Jr, J Chem Theory
Comput, 2009, 5, 1315-1327.
173. R. Eklund and G. Widmalm, Carbohydr Res, 2003, 338, 393-398.
174. R. U. Lemieux, K. Bock, L. T. J. Delbaere, S. Koto and V. S. Rao,
Can J Chem, 1980, 58, 631-653.
175. M. L. C. E. Kouwijzer and P. D. J. Grootenhuis, J Phys Chem, 1995,
99, 13426-13436.
176. S. N. Ha, A. Giammona, M. Field and J. W. Brady, Carbohydr Res,
1988, 180, 207-211.
177. R. Palma, P. Zuccato, M. E. Himmel, G. Liang and J. W. Brady, in
Glycosyl hydrolases in biomass conversion, eds. M. E. Himmel, J. O.
Baker and J. N. Saddler, American Chemical Society, Washington,
DC, 2001, vol. 769, pp. 112-130.
178. _, AMBER Home page, http://ambermd.org/, Accessed 2013 May 15.
179. D. A. Case, T. E. Cheatham, T. Darden, H. Gohlke, R. Luo, K. M.
Merz, Jr., A. Onufriev, C. Simmerling, B. Wang and R. Woods, J
Comput Chem, 2005, 26, 1668-1688.
180. K. N. Kirschner, A. B. Yongye, S. M. Tschampel, J. González-
Outeiriño, C. R. Daniels, B. L. Foley and R. J. Woods, J Comput
Chem, 2008, 29, 622-655.
181. R. J. Woods, R. A. Dwek, C. J. Edge and B. Fraser-Reid, J Phys
Chem, 1995, 99, 3832-3846.
182. A. P. Eichenberger, J. R. Allison, J. Dolenc, D. P. Geerke, B. A. C.
Horta, K. Meier, C. Oostenbrink, N. Schmid, D. Steiner, D. Wang
and W. F. van Gunsteren, J Chem Theory Comput, 2011, 7, 3379-
3390.
183. A. P. E. Kunz, J. R. Allison, D. P. Geerke, B. A. C. Horta, P. H.
Hünenberger, S. Riniker, N. Schmid and W. F. van Gunsteren, J
Comput Chem, 2012, 33, 340-353.
184. N. Schmid, C. D. Christ, M. Christen, A. P. Eichenberger and W. F.
van Gunsteren, Comput Phys Commun, 2012, 183, 890-903.
185. M. Christen, P. H. Hünenberger, D. Bakowies, R. Baron, R. Bürgi, D.
P. Geerke, T. N. Heinz, M. A. Kastenholz, V. Kräutler, C.
Oostenbrink, C. Peter, D. Trzesniak and W. F. van Gunsteren, J
Comput Chem, 2005, 26, 1719-1751.
186. S. A. H. Spieser, J. Albert van Kuik, L. M. J. Kroon-Batenburg and J.
Kroon, Carbohydr Res, 1999, 322, 264-273.
187. R. D. Lins and P. H. Hünenberger, J Comput Chem, 2005, 26, 1400-
1412.
188. W. Damm, A. Frontera, J. Tirado-Rives and W. L. Jorgensen, J
Comput Chem, 1997, 18, 1955-1970.
189. M. Kuttel, J. W. Brady and K. J. Naidoo, J Comput Chem, 2002, 23,
1236-1243.
190. S. J. Weiner, P. A. Kollman, D. T. Nguyen and D. A. Case, J Comput
Chem, 1986, 7, 230-252.
191. J. W. Ponder and D. A. Case, Adv Prot Chem, 2003, 66, 27-85.
192. _, Accelrys, Inc. InsightII, http://lms.chem.tamu.edu/insightII.html,
Accessed 2013 May 15.
193. S. W. Homans, Biochemistry, 1990, 29, 9110-9118.
194. R. Witter, U. Sternberg, S. Hesse, T. Kondo, F. T. Koch and A. S.
Ulrich, Macromolecules, 2006, 39, 6125-6132.
195. A. D. Becke, Phys Rev A, 1988, 38, 3098-3100.
196. C. Lee, W. Yang and R. G. Parr, Phys Rev B, 1988, 37, 785-789.
197. J. P. Perdew and Y. Wang, Phys Rev B, 1992, 45, 13244-13249.
198. F. Jensen, Introduction to computational chemistry, 2nd ed, John
Wiley & Sons Ltd., 2007.
199. W. Koch and M. C. Holthausen, A chemist's guide to density
functional theory, 2nd ed, John Wiley & Sons Ltd., 2001.
200. E. G. Lewars, Computational chemistry. Introduction to the theory
and applications of molecular and quantum mechanics, 2nd ed,
Springer Science+Business Media B.V., 2011.
201. M. Orio, D. A. Pantazis and F. Neese, Photosynth Res, 2009, 102,
443-453.
202. D. Sholl and J. A. Steckel, Density functional theory: a practical
introduction, Wiley-Interscience, 2009.
203. Y. Zhao and D. G. Truhlar, Acc Chem Res, 2008, 41, 157-167.
204. Y. Zhao and D. G. Truhlar, Theor Chem Acc, 2008, 120, 215-241.
205. M. E. Casida and M. Huix-Rotllant, Annu Rev Phys Chem, 2012, 63,
287-323.
206. M. A. Marques and E. K. Gross, Annu Rev Phys Chem, 2004, 55,
427-455.
207. R. Ditchfield, Mol Phys, 1974, 27, 789-807.
208. G. Schreckenbach and T. Ziegler, J Phys Chem, 1995, 99, 606-611.
209. W. Kutzelnigg, U. Fleischer and M. Schindler, in NMR basic principles
and progress, Springer Verlag, Berlin/Heidelberg, 1991, vol. 213, pp.
165-262.
210. A. E. Hansen and T. D. Bouman, J Chem Phys, 1985, 82, 5035-5047.
211. M. Schindler and W. Kutzelnigg, J Chem Phys, 1982, 76, 1919-1933.
212. V. G. Malkin, O. L. Malkina, M. E. Casida and D. R. Salahub, J Am
Chem Soc, 1994, 116, 5898-5908.
213. C. Bonhomme, C. Gervais, F. Babonneau, C. Coelho, F. Pourpoint,
T. Azaïs, S. E. Ashbrook, J. M. Griffin, J. R. Yates, F. Mauri and C.
J. Pickard, Chem Rev, 2012, 112, 5733-5779.
214. C. J. Pickard and F. Mauri, Phys Rev B, 2001, 63, 245101.
215. J. P. Perdew, K. Burke and M. Ernzerhof, Phys Rev Lett, 1996, 77,
3865-3868.
216. T. W. Keal and D. J. Tozer, J Chem Phys, 2004, 121, 5654-5660.
217. J. Kongsted, K. Aidas, K. V. Mikkelsen and S. P. A. Sauer, J Chem
Theor Comput, 2008, 4, 267-277.
218. K. Wolinski, J. F. Hinton and P. Pulay, J Am Chem Soc, 1990, 112,
8251-8260.
219. K. Friedrich, G. Seifert and G. Grossmann, Z Phys D, 1990, 17, 45-
46.
220. J. R. Cheeseman, G. W. Trucks, T. A. Keith and M. J. Frisch, J Chem
Phys, 1996, 104, 5497-5509.
221. J. H. Lii, B. Ma and N. L. Allinger, J Comp Chem, 1999, 20, 1593-1603.
222. C. Ochsenfeld, Chem Phys Lett, 2000, 327, 216-223.
223. G. E. Scuseria, J Phys Chem A, 1999, 103, 4782-4790.
224. C. Ochsenfeld, J. Kussmann and F. Koziol, Angew Chem Int Ed,
2004, 43, 4485-4489.
225. T. H. Sefzik, D. Turco, R. J. Iuliucci and J. C. Facelli, J Phys Chem
A, 2005, 109, 1180-1187.
226. M. Tafazzoli and M. Ghiasi, Carbohydr Polym, 2009, 78, 10-15.
227. T. Gregor, F. Mauri and R. Car, J Chem Phys, 1999, 111, 1815-1822.
228. T. Helgaker, S. Coriani, P. Jørgensen, K. Kristensen, J. Olsen and K.
Ruud, Chem Rev, 2012, 112, 543-631.
229. R. Abraham and M. Mobli, Modelling 1H NMR Spectra of Organic
Compounds: Theory, Applications, and NMR Prediction Software,
Wiley, NY, 2008.
230. H. Lin and D. G. Truhlar, Theor Chem Acc, 2007, 117, 185-199.
231. A. Lodola, C. J. Woods and A. J. Mulholland, Ann Rep Comput
Chem, 2008, 4, 155-169.
232. H. M. Senn and W. Thiel, in Atomistic approaches in modern
biology, ed. M. Reiher, Springer, Berlin, 2007, vol. 268, pp. 173-290.
233. T. Vreven and K. Morokuma, Ann Rep Comput Chem, 2006, 2, 35-51.
234. M. Svensson, S. Humbel, R. D. J. Froese, T. Matsubara, S. Sieber
and K. Morokuma, J Phys Chem, 1996, 100, 19357-19363.
235. P. B. Karadakov, Annu Rep Prog Chem C, 2001, 97, 61-90.
236. P. A. Belyakov and V. P. Ananikov, Russ Chem Bull Int Ed, 2011,
60, 2626.
237. P. B. Karadakov and K. Morokuma, Chem Phys Lett, 2000, 317, 589-
596.
238. T. Ishida, J Phys Chem B, 2010, 114, 3950-3964.
239. Q. Cui and M. Karplus, J Phys Chem B, 2000, 104, 3721-3743.
240. M. Hricovíni, J Phys Chem B, 2011, 115, 1503-1511.
241. W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and
M. L. Klein, J Chem Phys, 1983, 79, 926-935.
242. W. L. Jorgensen and J. Tirado-Rives, Proc Natl Acad Sci USA, 2005,
102, 6665-6670.
243. G. Barone, D. Duca, A. Silvestri, L. Gomez-Paloma, R. Riccio and
G. Bifulco, Chem Eur J, 2002, 8, 3240-3245.
244. M. Pavone, G. Brancato, G. Morelli and V. Barone, ChemPhysChem,
2006, 7, 148-156.
245. J. Gonzalez-Outeirino, K. N. Kirschner, S. Thobhani and R. J.
Woods, Can J Chem, 2006, 84, 569-579.
246. S. Miertus and J. Tomasi, Chem Phys, 1982, 65, 239-245.
247. M. Cossi, N. Rega, G. Scalmani and V. Barone, J Comput Chem,
2003, 24, 669-681.
248. V. Barone, M. Cossi and J. Tomasi, J Comput Chem, 1998, 19, 404-
417.
249. B. Mennucci, J. Tomasi, R. Cammi, J. R. Cheeseman, M. J. Frisch, F.
J. Devlin, S. Gabriel and P. J. Stephens, J Phys Chem A, 2002, 106,
6102-6113.
250. A. V. Marenich, C. J. Cramer and D. G. Truhlar, J Phys Chem B,
2009, 113, 6378-6396.
251. A. V. Marenich, R. M. Olson, C. P. Kelly, C. J. Cramer and D. G.
Truhlar, J Chem Theory Comput, 2007, 3, 2011-2033.
252. C. J. Cramer and D. G. Truhlar, Acc Chem Res, 2008, 41, 760-768.
253. A. Klamt and G. Schüürmann, J Chem Soc Perkin Trans 2, 1993,
799-805.
254. A. Bagno, F. Rastrelli and G. Saielli, J Org Chem, 2007, 72, 7373-
7381.
255. V. Sychrovský, B. Schneider, P. Hobza, L. Zídek and V. Sklenár,
Phys Chem Chem Phys, 2003, 5, 734-739.
256. M. S. Lee, F. R. Salsbury and M. A. Olson, J Comput Chem, 2004,
25, 1967-1978.
257. R. I. Maurer and C. A. Reynolds, J Comput Chem, 2004, 25, 627-
631.
258. J. Tomasi, B. Mennucci and R. Cammi, Chem Rev, 2005, 105, 2999-
3094.
259. J. C. Facelli, Prog Nucl Magn Reson Spectrosc, 2011, 58, 176-201.
260. M. U. Roslund, P. Taehtinen, M. Niemitz and R. Sjoeholm,
Carbohydr Res, 2008, 343, 101-112.
261. _, TURBOMOLE GmbH. Program Package for ab initio Electronic
Structure Calculations, http://www.turbomole-gmbh.com/, Accessed
2013 May 15.
262. R. Ahlrichs, M. Bär, M. Häser, H. Horn and C. Kölmel, Chem Phys
Lett, 1989, 162, 165-169.
263. G. A. Rickard, P. B. Karadakov, G. A. Webb and K. Morokuma, J
Phys Chem A, 2003, 107, 292-300.
264. T. Kupka, G. Pasterna, P. Lodowski and W. Szeja, Magn Reson
Chem, 1999, 37, 421-426.
265. S. Suzuki, F. Horii and H. Kurosu, J Mol Struct, 2009, 921, 219-226.
266. S. Khodaei, N. L. Hadipour and M. R. Kasaai, Carbohydr Res, 2007,
342, 2396-2403.
267. M. D. Esrafili, F. Elmi and N. L. Hadipour, J Phys Chem A, 2007,
111, 963-970.
268. E. Chelmecka, K. Pasterny, M. Gawlik-Jedrysiak, W. Szeja and R.
Wrzalik, J Mol Struct, 2007, 834-836, 498-507.
269. R. K. Raju, A. Ramraj, M. Vincent, I. Hillier and N. Burton, Phys
Chem Chem Phys, 2008, 10, 6500-6508.
270. K. Paradowska, T. Gubica, A. Temeriusz, M. K. Cyranski and I.
Wawer, Carbohydr Res, 2008, 343, 2299-2307.
271. _, CASTEP Home page, http://www.castep.org/, Accessed 2013 May
15.
272. S. J. Clark, M. D. Segall, C. J. Pickard, P. J. Hasnip, M. J. Probert, K.
Refson and M. C. Payne, Z Kristallogr, 2005, 220, 567-570.
273. M. D. Segall, P. J. D. Lindan, M. J. Probert, C. J. Pickard, P. J.
Hasnip, S. J. Clark and M. C. Payne, J Phys Condens Matter, 2002,
14, 2717-2744.
274. M. Kibalchenko, D. Lee, L. Shao, M. C. Payne, J. J. Titman and J. R.
Yates, Chem Phys Lett, 2010, 498, 270-276.
275. M. M. Reichvilser, C. Heinzl and P. Klufers, Carbohydr Res, 2010,
345, 498-502.
276. S. Taubert, H. Konschin and D. Sundholm, Phys Chem Chem Phys,
2005, 7, 2561-2569.
277. A. Bagno, F. Rastrelli and G. Saielli, Magn Reson Chem, 2008, 46,
518-534.
278. V. Sychrovsky, N. Muller, B. Schneider, V. Smrecki, V. Spirko, J.
Sponer and L. Trantirek, J Am Chem Soc, 2005, 127, 14663-14667.
279. R. B. Kasat, N. H. Wang and E. I. Franses, Biomacromolecules,
2007, 8, 1676-1685.
280. _, deMon. A software package for density functional theory (DFT)
calculations, http://www.demon-software.com/public_html/program.html,
Accessed 2013 May 15.
281. D. R. Salahub, J. Weber, A. Goursot, A. M. Köster and A. Vela, in
Theory and applications of the computational chemistry: the first 40
years, eds. C. E. Dykstra, G. Frenking, K. S. Kim and G. Scuseria,
Elsevier Science, Amsterdam, 2005, pp. 1079-1097.
282. A. St-Amant and D. R. Salahub, Chem Phys Lett, 1990, 169, 387-
392.
283. M. Hricovíni, O. L. Malkina, F. Bízik, T. Nagy and V. G. Malkin, J
Phys Chem A, 1997, 101, 1756-1762.
284. O. L. Malkina, M. Hricovíni, F. Bízik and V. G. Malkin, J Phys
Chem A, 2001, 105, 9188-9195.
285. C. A. Stortz, J Comput Chem, 2005, 26, 471-483.
286. D. A. Navarro and C. A. Stortz, Carbohydr Res, 2008, 343, 2292-
2298.
287. Y. Nishida, H. Ohrui and H. Meguro, Tetrahedron Lett, 1984, 25,
1575-1578.
288. F. Horii, A. Hirai and R. Kitamaru, Polymer Bull, 1983, 10, 357-361.
289. M. Barfield and S. H. Yamamura, J Am Chem Soc, 1990, 112, 4747-
4758.
290. J. C. Corchado, M. L. Sánchez and M. A. Aguilar, J Am Chem Soc,
2004, 126, 7311-7319.
291. S. Grimme, J Comput Chem, 2004, 25, 1463-1473.
292. M. C. Fernandez-Alonso, F. J. Canada, J. Jimenez-Barbero and G.
Cuevas, J Am Chem Soc, 2005, 127, 7379-7386.
293. A. G. Evdokimov, J. M. L. Martin and A. J. Kalb, J Phys Chem A, 2000, 104,
5291-5297.
294. _, Tripos. Sybyl,
http://www.jprtechnologies.com.au/tripos/discovery-informatics/sybyl/, Accessed 2013 May 15.
295. _, Wolfram. Mathematica, http://www.wolfram.com/mathematica/,
Accessed 2013 May 15.
296. T. Mori, E. Chikayama, Y. Tsuboi, N. Ishida, N. Shisa, Y. Noritake,
S. Moriya and J. Kikuchi, Carbohydr Polym, 2012, 90, 1197-1203.
297. G. Kresse and J. Furthmüller, Phys Rev B, 1996, 54, 11169-11186.
298. J. Kubicki, M.-A. Mohamed and H. Watts, Cellulose, 2013, 20, 9-23.
299. M. D. Esrafili and H. Ahmadin, Carbohydr Res, 2012, 347, 99-106.
300. _, NERSC. PARATEC code,
https://www.nersc.gov/users/software/applications/materials-science/paratec/, Accessed 2013 May 15.
301. B. G. Pfrommer, J. Demmel and H. Simon, J Comp Phys, 1999, 150,
287-298.
302. J. R. Yates, T. N. Pham, C. J. Pickard, F. Mauri, A. M. Amado, A. M.
Gil and S. P. Brown, J Am Chem Soc, 2005, 127, 10216-10220.
303. R. Lefort, P. Bordat, A. Cesaro and M. Descamps, J Chem Phys,
2007, 126, 014510.
304. M. Dupuis, A. Marquez and E. R. Davidson, in Quantum Chemistry
Program Exchange (QCPE). Indiana University, Bloomington, IN
47405., HONDO edn.
305. L. Shao, J. R. Yates and J. J. Titman, J Phys Chem A, 2007, 111,
13126-13132.
306. S. Bekiroglu, A. Sandstrom, L. Kenne and C. Sandstrom, Org Biomol
Chem, 2004, 2, 200-205.
307. M. C. Jarvis, Carbohydr Res, 1994, 259, 311-318.
308. P. J. C. Smith and S. Arnott, Acta Cryst A, 1978, 34, 3-11.
309. C. Yamamoto and Y. Okamoto, Bull Chem Soc Jpn, 2004, 77, 227-257.
310. H. Le, J. G. Pearson, A. C. de Dios and E. Oldfield, J Am Chem Soc,
1995, 117, 3800-3807.
311. D. B. Chesnut and K. D. Moore, J Comput Chem, 1989, 10, 648-659.
312. P. Langan, Y. Nishiyama and H. Chanzy, Biomacromolecules, 2001,
2, 410-416.
313. Y. Nishiyama, P. Langan and H. Chanzy, J Am Chem Soc, 2002, 124,
9074-9082.
314. D. B. Chesnut, B. E. Rusiloski, K. D. Moore and D. A. Egolf, J
Comput Chem, 1993, 14, 1364-1375.
315. I. Ivarsson, C. Sandström, A. Sandström and L. Kenne, J Chem Soc
Perkin Trans 2, 2000, 2147-2152.
316. B. Coxon, in Adv Carbohydr Chem Biochem, Elsevier, 2009, vol. 62,
pp. 17-82.
317. N. Troullier and J. L. Martins, Phys Rev B, 1991, 43, 1993-2006.
318. K. Bock and H. Thøgersen, Annu Rep NMR Spectrosc, 1982, 13, 1-
57.
319. C. A. G. Haasnoot, F. A. A. M. de Leeuw and C. Altona,
Tetrahedron, 1980, 36, 2783-2792.
320. F. Cloran, I. Carmichael and A. S. Serianni, J Phys Chem A, 1999,
103, 3783-3795.
321. N. F. Ramsey, Phys Rev, 1953, 91, 303-307.
322. M. Pecul and J. Sadlej, in Computational chemistry: reviews of
current trends, ed. J. Leszczynski, World Scientific, 2003, vol. 8, pp.
131-160.
323. W. Deng, J. R. Cheeseman and M. J. Frisch, J Chem Theory Comput,
2006, 2, 1028-1037.
324. B. Bose, S. Zhao, R. Stenutz, F. Cloran, P. Bondo, G. Bondo, B.
Hertz, I. Carmichael and A. S. Serianni, J Am Chem Soc, 1998, 120,
11158-11173.
325. T. Helgaker, O. B. Lutnæs and M. Jaszuński, J Chem Theory
Comput, 2007, 3, 86-94.
326. T. Helgaker and M. Pecul, in Calculation of NMR and EPR
parameters: theory and applications, eds. M. Kaupp, M. Bühl and V.
G. Malkin, Wiley-VCH, Weinheim, 2004, p. 101.
327. F. Jensen, J Chem Theory Comput, 2006, 2, 1360-1369.
328. M. Karplus, J Am Chem Soc, 1963, 85, 2870-2871.
329. M. Hricovíni and F. Bízik, Carbohydr Res, 2007, 342, 779-783.
330. J. Angulo, P. M. Nieto and M. Martín-Lomas, Chem Commun, 2003,
1512-1513.
331. N. S. Gandhi and R. L. Mancera, Carbohydr Res, 2010, 345, 689-
695.
332. S. B. Engelsen and S. Perez, J Phys Chem B, 2000, 104, 9301-9311.
333. M. Tafazzoli and M. Ghiasi, Carbohydr Res, 2007, 342, 2086-2096.
334. M. Tafazzoli and M. Ghiasi, J Mol Struct, 2007, 814, 127-130.
335. R. Stenutz, I. Carmichael, G. Widmalm and A. S. Serianni, J Org
Chem, 2002, 67, 949-958.
336. M. Tafazzoli, M. Ghiasi and M. Moridi, Spectrochimica Acta Part A,
2008, 70, 350-357.
337. F. Cloran, I. Carmichael and A. S. Serianni, J Am Chem Soc, 2001,
123, 4781-4791.
338. F. Cloran, Y. Zhu, J. Osborn, I. Carmichael and A. S. Serianni, J Am
Chem Soc, 2000, 122, 6435-6448.
339. E. Kraka, J. Grafenstein, J. Gauss, F. Reichel, L. Olsson, Z. Konkoli
and D. Cremer, Goteborg University, Goteborg, Sweden., Program
package COLOGNE 99. edn., 1999.
340. P. Bour, I. Raich, J. Kaminsky, R. Hrabal, J. Cejka and V.
Sychrovsky, J Phys Chem A, 2004, 108, 6365-6372.
341. M. Mobli and A. Almond, Org Biomol Chem, 2007, 5, 2243-2251.
342. Y. Zhao, N. E. Schultz and D. G. Truhlar, J Chem Theory Comput,
2006, 2, 364-382.
343. A. S. Serianni, J. Wu and I. Carmichael, J Am Chem Soc, 1995, 117,
8645-8650.
344. I. Tvaroška, K. Mazeau, M. Blanc-Muesser, S. Lavaitte, H. Driguez
and F. R. Taravel, Carbohydr Res, 1992, 229, 225-231.
345. K. Bock and C. Pedersen, Carbohydr Res, 1979, 71, 319-321.
346. T. J. Church, I. Carmichael and A. S. Serianni, J Am Chem Soc,
1997, 119, 8946-8964.
347. I. Carmichael, J Phys Chem, 1993, 97, 1789-1792.
348. T. Bandyopadhyay, J. Wu, W. A. Stripe, I. Carmichael and A. S.
Serianni, J Am Chem Soc, 1997, 119, 1737-1744.
349. V. Sychrovsky, J. Gräfenstein and D. Cremer, J Chem Phys, 2000,
113, 3530-3547.
350. T. Helgaker, M. Jaszuński, K. Ruud and A. Górska, Theor Chem Acc,
1998, 99, 175-182.
351. O. B. Lutnæs, T. A. Ruden and T. Helgaker, Magn Reson Chem,
2004, 42, S117-S127.
352. C. A. Bush, M. Martin-Pastor and A. Imberty, Annu Rev Biophys
Biomol Struct, 1999, 28, 269-293.
353. I. Tvaroška, M. Hricovíni and E. Petráková, Carbohydr Res, 1989,
189, 359-362.
354. N. W. Cheetham, P. Dasgupta and G. E. Ball, Carbohydr Res, 2003,
338, 955-962.
355. F. Cloran, I. Carmichael and A. S. Serianni, J Am Chem Soc, 2000,
122, 396-397.
356. V. Sychrovsky, Z. Vokacova, J. Sponer, N. Spackova and B.
Schneider, J Phys Chem B, 2006, 110, 22894-22902.
357. M. L. Munzarová and V. Sklenár, J Am Chem Soc, 2003, 125, 3649-
3658.
358. B. Schneider, Z. Morávek and H. M. Berman, Nucleic Acids Res,
2004, 32, 1666-1677.
359. B. Schneider, S. Neidle and H. M. Berman, Biopolymers, 1997, 42,
113-124.
360. R. B. Best, G. E. Jackson and K. J. Naidoo, J Phys Chem B, 2001, 105,
4742-4751.
361. R. B. Best, G. E. Jackson and K. J. Naidoo, J Phys Chem B, 2002, 106,
5091-5098.
362. S. Ilin, C. Bosques, C. Turner and H. Schwalbe, Angew Chem Int Ed
Engl, 2003, 42, 1394-1397.
363. S. Ravindranathan, C. H. Kim and G. Bodenhausen, J Biomol NMR,
2003, 27, 365-375.
364. E. Duchardt, C. Richter, O. Ohlenschläger, M. Görlach, J. Wöhnert
and H. Schwalbe, J Am Chem Soc, 2004, 126, 1962-1970.
365. S. Letardi, G. La Penna, E. Chiessi, A. Perico and A. Cesàro,
Macromolecules, 2002, 35, 286-300.
366. S. Furlan, G. La Penna, A. Perico and A. Cesàro, Macromolecules,
2004, 37, 6197-6209.
367. S. Furlan, G. La Penna, A. Perico and A. Cesàro, Carbohydr Res,
2005, 340, 959-970.
368. M. Zerbetto, D. Kotsyubynskyy, J. Kowalewski, G. Widmalm and A.
Polimeno, J Phys Chem B, 2012, 116, 13159-13171.
369. D. Kotsyubynskyy, M. Zerbetto, M. Soltesova, O. Engström, R.
Pendrill, J. Kowalewski, G. Widmalm and A. Polimeno, J Phys
Chem B, 2012, 116, 14541-14555.
370. A. G. Gerbst, N. E. Ustuzhanina, A. A. Grachev, N. S. Zlotina, E. A.
Khatuntseva, D. E. Tsvetkov, A. S. Shashkov, A. I. Usov and N. E.
Nifantiev, J Carbohydr Chem, 2002, 21, 313-324.
371. R. E. Schirmer, J. H. Noggle, J. P. Davis and P. A. Hart, J Am Chem Soc, 1970,
92, 3266-3273.
372. A. G. Gerbst, N. E. Ustuzhanina, A. A. Grachev, E. A. Khatuntseva,
D. E. Tsvetkov, A. S. Shashkov, A. I. Usov, M. E. Preobrazhenskaya,
N. A. Ushakova and N. E. Nifantiev, J Carbohydr Chem, 2003, 22,
109-122.
373. D. A. Cumming and J. P. Carver, Biochemistry, 1987, 26, 6664-6676.
374. U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L.
Pedersen, J Chem Phys, 1995, 103, 8577-8593.
375. MestreLab Research, Mspin,
http://mestrelab.com/software/mspin/, Accessed 2013 May 15.
376. D. Neuhaus and M. P. Williamson, The nuclear Overhauser effect in
structural and conformational analysis, VCH Publishers, New York,
NY, 1989.
377. S. A. Smith, T. O. Levante, B. H. Meier and R. R. Ernst, J Magn
Reson A, 1994, 106, 75-105.
378. C. D. Blundell, M. A. Reed and A. Almond, Carbohydr Res, 2006,
341, 2803-2815.
379. H. O. Kalinowski, S. Berger and S. Braun, Carbon-13 NMR
spectroscopy, John Wiley & Sons Ltd., 1988.
380. D. N. Laikov and Y. A. Ustynyuk, Russ Chem Bull Int Ed, 2005, 54,
820-826.
381. ACD/Labs, ACD/NMR predictors,
http://www.acdlabs.com/products/adh/nmr/nmr_pred/, Accessed
2013 May 15.
Philip Toukach (Ph.D. 2001, associate professor rank 2010) has been Senior Scientist at the Zelinsky Institute of Organic Chemistry since 2005 and Associate Professor at the Moscow Academy of Fine Chemical Technology since 2008. He was International Scientist of the Year (2004) and Guest Scientist at the Borstel Biochemical Research Center (2005-2007) and the German Cancer Research Center (2008-2011). His major scientific interests are carbohydrate databases and NMR-based carbohydrate structure prediction. Further information can be found on his website, http://toukach.ru/nmr.htm
Valentine Ananikov (Ph.D. 1999; Habilitation 2003) was appointed Professor and Laboratory Head at the Zelinsky Institute of Organic Chemistry (2005), elected Member of the Russian Academy of Sciences (2008), and appointed Professor in the Chemistry Department of Moscow State University (2012). He received a Medal of the Russian Academy of Sciences (2000), the Russian State Prize for Outstanding Achievements in Science and Technology (2004), an Award of the Science Support Foundation (2005), and the Balandin Prize (2010), and was Liebig Lecturer of the German Chemical Society (2010). He serves on the international advisory boards of Advanced Synthesis & Catalysis, Organometallics and Chemistry An Asian Journal.