qm/mm tutorial

CP/MM Tutorial Jülich, 21/05/2013

Emiliano Ippoliti: [email protected]

Introduction

We want to study the effect of the solvent (water) and of the temperature on the dipole moment of an acetone molecule. To do so we will use several different software packages (MOLDEN, CPMD, VMD, Gaussian09, Amber, GROMOS, etc.) to give you a taste of everyday work in the biophysical field. The procedure can be summarized in the following steps:

1. Calculate the dipole moment Dv of acetone in vacuum.
2. Perform a QM/MM molecular dynamics simulation for the acetone in a classical water box at room temperature.
3. Calculate the dipole moment D for some random snapshots of this simulation and take the mean value.
4. Calculate the difference D - Dv.
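Steps 3 and 4 amount to a simple average and a subtraction. The following is a minimal Python sketch of that arithmetic, assuming that the dipole moments are compared as magnitudes in Debye. The function name `solvent_shift` and all the numbers are hypothetical placeholders for illustration, not results of this tutorial.

```python
from statistics import mean


def solvent_shift(snapshot_dipoles, vacuum_dipole):
    """Return (mean dipole over the snapshots, mean minus vacuum value)."""
    d_mean = mean(snapshot_dipoles)
    return d_mean, d_mean - vacuum_dipole


# Hypothetical dipole magnitudes in Debye, for illustration only.
snapshots = [4.1, 3.8, 4.4, 4.0]
d_vacuum = 3.0

d_mean, shift = solvent_shift(snapshots, d_vacuum)
print(f"D = {d_mean:.3f} Debye, D - Dv = {shift:.3f} Debye")
```

With real data, `snapshots` would hold the dipole moments computed for the selected frames of the QM/MM trajectory.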

Each file mentioned in this tutorial (in bold) can be found on the grsjuc local cluster in the folder:

/work/Tutorials/ACETONE

under the subfolder whose name starts with the number of the corresponding section of this tutorial. To do your tests and exercises, work in your own folder in the $WORK filesystem, which you can create with the command:

mkdir $WORK/QMMM_Tutorial

Then, to go to your folder:

cd $WORK/QMMM_Tutorial

1 - Where to start?

Acetone is the organic compound with the formula (CH3)2CO. This colorless, mobile, flammable liquid is the simplest example of the ketones:

Figure 1. Acetone

To obtain an initial structure we use the MOLDEN program.1 As for all the programs mentioned in this tutorial, the first time in each session you want to use molden you need to initialize the environment in order to be able to recall it. This can be done with the “module” command2:

module load molden

Then, you can run molden by simply typing:

molden &

In the Molden Control panel on the right, deselect the “Shade” button and press the “ZMAT Editor” button, which opens a tool to create and manipulate structures on screen and to optimize them at the force-field level. The use of the editor is rather intuitive.3 The steps to create the acetone molecule are, in brief:

• Press “Add line”
• Select Carbon from the Periodic Table
• Press “Substitute atom by Fragment” and without releasing the mouse button select the –HC=O
• Press “Add line”
• Select Hydrogen from the Periodic Table
• Click, in the Main screen, on the 4 atoms H,C,O,H in sequence to define the connectivity of the new H atom in the molecule
• Highlight a Hydrogen by pressing on the atom in the Main screen or in the list of atoms in Zmat Editor
• Press “Substitute atom by Fragment” and without releasing the mouse button select the –CH3
• Do the same for the other Hydrogen

Then, we can save the structure in the xyz format by writing the name of the file in the field “File name?”, choosing the “XYZ” format in the menu which appears when you press on the square “Cartesian” at the bottom right, and finally pressing “Write Z-Matrix”. Close the ZMAT Editor by clicking the “Close” button at the top right, and then exit Molden by pressing the button with a skull.
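It can be useful to check the saved file before using it. The sketch below reads text in the standard XYZ layout (atom count, comment line, then one `element x y z` line per atom) and counts the atoms per element. The function name `read_xyz` is an illustrative helper, not part of MOLDEN, and the coordinates are the initial acetone coordinates used later in this tutorial.

```python
from collections import Counter


def read_xyz(text):
    """Parse XYZ-format text into a list of (element, x, y, z) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])            # first line: number of atoms
    atoms = []
    for line in lines[2:2 + n_atoms]:  # the second line is a free comment
        element, x, y, z = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("atom count does not match the header line")
    return atoms


# Inline text standing in for the saved file (coordinates in Angstrom).
xyz_text = """10
acetone
C   0.000000  0.000000  0.000000
C   1.367073  0.000000 -0.483333
C  -0.683537 -1.183920 -0.483333
O   0.000000  0.000000  1.220000
H   1.367073  0.000000 -1.572333
H   1.880433  0.889165 -0.120333
H   1.880433 -0.889165 -0.120333
H  -0.683537 -1.183920 -1.572333
H  -0.170177 -2.073085 -0.120333
H  -1.710256 -1.183920 -0.120333
"""

atoms = read_xyz(xyz_text)
counts = Counter(element for element, *_ in atoms)
print(dict(counts))  # expected composition of (CH3)2CO: 3 C, 1 O, 6 H
```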

1 http://www.cmbi.ru.nl/molden/molden.html
2 The module command instructions can be found in the email you received with the credentials to access the grsjuc cluster.
3 An online general guide for the ZMAT Editor can be found at the web address: http://www.cmbi.ru.nl/molden/zmat/zmat.html#add

2 - Dipole moment in vacuum

We will use CPMD to calculate the dipole moment of acetone in gas phase. To initialize the environment and be able to use CPMD:

module load cpmd

Any program can be run interactively on the login node. In the following we report the command to do that. However, if your job is supposed to run for more than 5 minutes this is NOT a good practice: longer jobs should be run on the computational node by using the batch system. This can be accomplished by writing a simple batch script where you insert the information about your job and then submit the request to the batch system through the command:

qsub <your_batch_script>

Some batch script templates called launch_XXX.job can be found in:

/work/Tutorials/ACETONE

If you open them you will find some initial lines starting with the word #PBS, which are commands for the batch system:

#PBS -N Template ← Batch system’s job name
#PBS -o output ← Filename of the batch system’s output messages
#PBS -e error ← Filename of the batch system’s error messages
#PBS -l nodes=2:ppn=8 ← Number of cores (2*8=16) to be reserved for the job
#PBS -l walltime=0:15:10 ← Runtime limit: after the expiration of this time (here 15 minutes and 10 seconds) the job will be killed
#PBS -q route ← The queue to be used (route is a keyword to let the system decide automatically)

followed by some lines which are the commands that the batch system has to run on the computational nodes and that are similar to the ones you use in the interactive way. Refer to the PBS user guide for a more detailed description of all the useful commands available for the batch script. To check the status of your batched jobs you can use the command:

qstat

and to remove a job from the batch list you can use the command:

qdel <job_id>

where the job_id can be read from the list returned by the qstat command. To run CPMD with, for example, 2 cores (everything on the same line):

mpirun -np 2 cpmd.x <input_file> /home/ippoliti/PROGRAMS/archive/CPMD/PP > <output_file> &

/home/ippoliti/PROGRAMS/archive/CPMD/PP is the location where CPMD will look for the pseudopotential files specified in the input file. All the pseudopotentials we will use are already present there.

In order to calculate the dipole moment of acetone in vacuum we first have to optimize the geometry of the molecule we constructed in the previous step. So, we need to write an appropriate input file according to the syntax of CPMD.4 With your favorite editor, write the opt_geo.inp text file:

&INFO
Acetone molecule
Geometry optimization
&END

&CPMD
 OPTIMIZE GEOMETRY XYZ
 CONVERGENCE ORBITALS
  1.0d-7
 CONVERGENCE GEOMETRY
  7.0d-4
&END

&SYSTEM
 ANGSTROM
 SYMMETRY
  ORTHORHOMBIC
 CELL ABSOLUTE
  10.6 10.0 9.8 0.0 0.0 0.0
 CUTOFF
  70.
&END

&DFT
 FUNCTIONAL BLYP
&END

&ATOMS
*C_MT_BLYP.psp KLEINMAN-BYLANDER
 LMAX=D
 3
  0.000000  0.000000  0.000000
  1.367073  0.000000 -0.483333
 -0.683537 -1.183920 -0.483333
*O_MT_BLYP.psp KLEINMAN-BYLANDER
 LMAX=D
 1
  0.000000  0.000000  1.220000
*H_MT_BLYP.psp KLEINMAN-BYLANDER
 LMAX=P
 6
  1.367073  0.000000 -1.572333
  1.880433  0.889165 -0.120333
  1.880433 -0.889165 -0.120333
 -0.683537 -1.183920 -1.572333
 -0.170177 -2.073085 -0.120333
 -1.710256 -1.183920 -0.120333
&END

4 CPMD manual: http://www.cpmd.org/manual.pdf

CPMD Input file

Any CPMD input file is organized in sections that start with &NAME and end with &END. Everything outside those sections is ignored. All keywords have to be in upper case, or else they will be ignored. The sequence of the sections does not matter, nor does the order of keywords (except in some special cases reported in the manual). A minimal input file must have a &CPMD, a &SYSTEM, and an &ATOMS section.

This input file starts with an (optional) &INFO section. This section allows you to put comments about the calculation into the input file, and they will be repeated in the output file. This can be very useful to identify and match your input and output files.

The first part of the &CPMD section instructs the program to do a geometry optimization. The XYZ suboption specifies that you also want the final structure in xyz format in a file called GEOMETRY.xyz and a ’trajectory’ of the optimization in a file named GEO_OPT.xyz. The two CONVERGENCE keywords set a tight wavefunction convergence criterion and the geometry convergence criterion (defaults: 10^-5 and 5*10^-4, respectively).

The &SYSTEM section contains various parameters related to the simulation cell and the representation of the electronic structure. The keywords SYMMETRY, CELL and CUTOFF are required and define the (periodic) symmetry, shape, and size of the simulation cell, as well as the plane wave energy cutoff (i.e. the size of the basis set), respectively. The keyword ANGSTROM specifies that the atomic coordinates, the supercell parameters and several other parameters are read in Ångströms (pay attention: the default is atomic units (a.u.), which are always used internally). We define a primitive orthorhombic cell with the lattice constants obtained by adding 7 Å in each direction to the dimension of the system: the cell has to be large enough to avoid significant interactions of the acetone molecule and its electronic structure with its periodic neighbors. In CPMD all calculations are periodic. Always do a convergence test, looking for example at the total energy value when you increase the box size!

The &DFT section is used to select the density functional (FUNCTIONAL) and related parameters. In this case we use the gradient corrected BLYP functional5 (the local density approximation is the default).

Finally, the &ATOMS section is needed to specify the atom coordinates and the pseudopotential(s) that are used to represent them. The coordinates are taken from the structure written by MOLDEN. The input for a new atom type is started with a ``*'' in the first column. This line further contains the name of the file where the pseudopotential information is found, starting in column 2, and several labels such as KLEINMAN-BYLANDER, in our case, which specifies the method to be used for the calculation of the nonlocal parts of the pseudopotential (this method is extremely efficient but also accurate).

5 A.D. Becke, J. Chem. Phys. 98 (1993) 5648-5652; C. Lee, W. Yang, R.G. Parr, Phys. Rev. B 37 (1988) 785-789.

The next line contains information on the nonlocality of the pseudopotential: you can specify the maximum l-quantum number with ``LMAX= l '' where l is S, P or D.6 The following lines describe the atoms of this species: the first line gives the number of atoms of the current type, and the lines after it give their coordinates.

If you start the geometry optimization (better if you use the batch system):

mpirun -np 2 cpmd.x opt_geo.inp /home/ippoliti/PROGRAMS/archive/CPMD/PP > opt_geo.out

the calculation should be completed in less than 5 minutes (use

tail -f opt_geo.out

to monitor the different steps of the calculation reported in the output file).

CPMD Output file

Look at the output file to understand all the information that we can find there. At the beginning there is the header, where one can see when the run was started, which version of CPMD was used, and when it was compiled:

PROGRAM CPMD STARTED AT: Tue Jun 1 19:32:59 2010

SETCNST| USING: CODATA 2006 UNITS

[CPMD logo in ASCII art]

VERSION 3.13.2

COMPILED WITH GROMOS-AMBER QM/MM SUPPORT

COPYRIGHT IBM RESEARCH DIVISION

MPI FESTKOERPERFORSCHUNG STUTTGART

The CPMD consortium Home Page: http://www.cpmd.org

Mailing List: [email protected] E-mail: [email protected]

*** Jan 10 2011 -- 16:03:39 ***

6 If this is the only input, the program assumes that LMAX is the l for the local potential. You can use another local function by specifying ``LOC= ''. In addition it is possible to assign the local potential to a further potential with ``SKIP= ''.

Then, we find some technical information about the environment (machine, user, directory, input file, process id) where this job was run:

THE INPUT FILE IS: opt_geo.inp
THIS JOB RUNS ON: grsjuc.ju.grs-sim.de
THE CURRENT DIRECTORY IS: /work/Tutorials/ACETONE/2-CPMD
THE TEMPORARY DIRECTORY IS: /work/Tutorials/ACETONE/2-CPMD
THE PROCESS ID IS: 28735

Next are the contents of the &INFO section copied to the output:

******************************************************************************
* INFO - INFO - INFO - INFO - INFO - INFO - INFO - INFO - INFO - INFO - INFO *
******************************************************************************
* Acetone molecule                                                           *
* Geometry optimization                                                      *
******************************************************************************

The next section contains a summary of some of the parameters read in from the &CPMD section, or their respective default settings; for example the convergence threshold for wavefunction optimization (set manually) or the maximum number of iterations (default):

OPTIMIZATION OF IONIC POSITIONS
PATH TO THE RESTART FILES: ./
GRAM-SCHMIDT ORTHOGONALIZATION
MAXIMUM NUMBER OF STEPS: 10000 STEPS
MAXIMUM NUMBER OF ITERATIONS FOR SC: 10000 STEPS
PRINT INTERMEDIATE RESULTS EVERY 10001 STEPS
STORE INTERMEDIATE RESULTS EVERY 10001 STEPS
STORE INTERMEDIATE RESULTS EVERY 10001 SELF-CONSISTENT STEPS
NUMBER OF DISTINCT RESTART FILES: 1
TEMPERATURE IS CALCULATED ASSUMING EXTENDED BULK BEHAVIOR
FICTITIOUS ELECTRON MASS: 400.0000
TIME STEP FOR ELECTRONS: 5.0000
TIME STEP FOR IONS: 5.0000
CONVERGENCE CRITERIA FOR WAVEFUNCTION OPTIMIZATION: 1.0000E-06
WAVEFUNCTION OPTIMIZATION BY PRECONDITIONED DIIS
THRESHOLD FOR THE WF-HESSIAN IS 0.5000
MAXIMUM NUMBER OF VECTORS RETAINED FOR DIIS: 10
STEPS UNTIL DIIS RESET ON POOR PROGRESS: 10
FULL ELECTRONIC GRADIENT IS USED
CONVERGENCE CRITERIA FOR GEOMETRY OPTIMIZATION: 3.000000E-04
GEOMETRY OPTIMIZATION BY GDIIS/BFGS
SIZE OF GDIIS MATRIX: 5
GEOMETRY OPTIMIZATION IS SAVED ON FILE GEO_OPT.xyz
EMPIRICAL INITIAL HESSIAN (DISCO PARAMETRISATION)
SPLINE INTERPOLATION IN G-SPACE FOR PSEUDOPOTENTIAL FUNCTIONS
NUMBER OF SPLINE POINTS: 5000

The exchange correlation functionals are reported in the lines coming immediately after:

EXCHANGE CORRELATION FUNCTIONALS
LDA EXCHANGE: SLATER (ALPHA = 0.66667)
LDA CORRELATION: LEE, YANG & PARR
   [C.L. LEE, W. YANG, AND R.G. PARR, PRB 37 785 (1988)]
GRADIENT CORRECTED FUNCTIONAL
DENSITY THRESHOLD: 1.00000E-08
EXCHANGE ENERGY
   [A.D. BECKE, PHYS. REV. A 38, 3098 (1988)]
   PARAMETER BETA: 0.004200
CORRELATION ENERGY
   [LYP: C.L. LEE ET AL. PHYS. REV. B 37, 785 (1988)]

At this point of the output you find which and how many atoms (and their coordinates in a.u.), electrons and states (we are doing a closed shell calculation, so there are only doubly occupied states) are in the system, and what pseudopotentials were used with which settings:

*** DETSP| SIZE OF THE PROGRAM IS 3700/ 119392 kBYTES ***

***************************** ATOMS ****************************
  NR  TYPE     X(bohr)     Y(bohr)     Z(bohr)   MBL
   1   C      0.000000    0.000000    0.000000    3
   2   C      2.583394    0.000000   -0.913367    3
   3   C     -1.291698   -2.237285   -0.913367    3
   4   O      0.000000    0.000000    2.305466    3
   5   H      2.583394    0.000000   -2.971279    3
   6   H      3.553503    1.680278   -0.227396    3
   7   H      3.553503   -1.680278   -0.227396    3
   8   H     -1.291698   -2.237285   -2.971279    3
   9   H     -0.321588   -3.917563   -0.227396    3
  10   H     -3.231915   -2.237285   -0.227396    3

****************************************************************

NUMBER OF STATES: 12
NUMBER OF ELECTRONS: 24.00000
CHARGE: 0.00000
ELECTRON TEMPERATURE(KELVIN): 0.00000

OCCUPATION 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0

============================================================
| Pseudopotential Report     Tue Jan 14 09:55:25 1997      |
------------------------------------------------------------
| Atomic Symbol                   : C                      |
| Atomic Number                   : 6                      |
| Number of core states           : 1                      |
| Number of valence states        : 2                      |
| Exchange-Correlation Functional :                        |
|    Slater exchange              : .6667                  |
|    LDA correlation              : Lee-Yang-Parr          |
|    Exchange GC                  : Becke (1988)           |
|    Correlation GC               : Lee-Yang-Parr          |
| Electron Configuration          : N L Occupation         |
|                                   1 S 2.0000             |
|                                   2 S 2.0000             |
|                                   2 P 2.0000             |
| Full Potential Total Energy       -37.702121             |
| Trouiller-Martins normconserving PP                      |
|    n  l      rc      energy                              |
|    2  S   1.2300    -.49630                              |
|    2  P   1.2300    -.19186                              |
|    3  D    .7159    -.19186                              |
| Number of Mesh Points           : 615                    |
| Pseudoatom Total Energy           -5.370516              |
============================================================

============================================================
| Pseudopotential Report     Thu Nov 30 13:19:26 1995      |
------------------------------------------------------------
| Atomic Symbol                   : O                      |
| Atomic Number                   : 8                      |
| Number of core states           : 1                      |
| Number of valence states        : 2                      |
| Exchange-Correlation Functional :                        |
|    Slater exchange              : .6667                  |
|    LDA correlation              : Lee-Yang-Parr          |
|    Exchange GC                  : Becke (1988)           |
|    Correlation GC               : Lee-Yang-Parr          |
| Electron Configuration          : N L Occupation         |
|                                   1 S 2.0000             |
|                                   2 S 2.0000             |
|                                   2 P 4.0000             |
| Full Potential Total Energy       -75.023693             |
| Trouiller-Martins normconserving PP                      |
|    n  l      rc      energy                              |
|    2  S   1.0500    -.87404                              |
|    2  P   1.0500    -.33186                              |
|    3  D   1.0500    -.33186                              |
| Number of Mesh Points           : 631                    |
| Pseudoatom Total Energy           -15.775323             |
============================================================

============================================================
| Pseudopotential Report     Thu Nov 30 13:17:19 1995      |
------------------------------------------------------------
| Atomic Symbol                   : H                      |
| Atomic Number                   : 1                      |
| Number of core states           : 0                      |
| Number of valence states        : 1                      |
| Exchange-Correlation Functional :                        |
|    Slater exchange              : .6667                  |
|    LDA correlation              : Lee-Yang-Parr          |
|    Exchange GC                  : Becke (1988)           |
|    Correlation GC               : Lee-Yang-Parr          |
| Electron Configuration          : N L Occupation         |
|                                   1 S 1.0000             |
| Full Potential Total Energy       -.462611               |
| Trouiller-Martins normconserving PP                      |
|    n  l      rc      energy                              |
|    1  S    .5000    -.24002                              |
|    2  P    .5000    -.24002                              |
| Number of Mesh Points           : 511                    |
| Pseudoatom Total Energy           -.462591               |
============================================================

****************************************************************
*   ATOM     MASS     RAGGIO   NLCC       PSEUDOPOTENTIAL      *
*     C    12.0112    1.2000    NO     KLEINMAN   S  NONLOCAL  *
*                                                 P  NONLOCAL  *
*                                                 D  LOCAL     *
*     O    15.9994    1.2000    NO     KLEINMAN   S  NONLOCAL  *
*                                                 P  NONLOCAL  *
*                                                 D  LOCAL     *
*     H     1.0080    1.2000    NO     KLEINMAN   S  NONLOCAL  *
*                                                 P  LOCAL     *
****************************************************************
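The NUMBER OF ELECTRONS and NUMBER OF STATES reported above can be cross-checked by hand. The pseudopotentials treat only the valence electrons explicitly (2s2 2p2 for C, 2s2 2p4 for O, 1s1 for H, as listed in the pseudopotential reports), and in a closed shell calculation each state holds two electrons. A short sketch of that count, with illustrative variable names:

```python
# Valence electrons per atom, read off the pseudopotential reports above.
VALENCE_ELECTRONS = {"C": 4, "O": 6, "H": 1}

# Composition of acetone, (CH3)2CO.
ATOM_COUNTS = {"C": 3, "O": 1, "H": 6}

n_electrons = sum(VALENCE_ELECTRONS[el] * n for el, n in ATOM_COUNTS.items())
n_states = n_electrons // 2  # closed shell: doubly occupied states only

print(n_electrons, n_states)  # the output above reports 24 electrons, 12 states
```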

Then comes a section about how the calculation is distributed over the cores: here you can check whether something went wrong and you are, for example, running only a serial job! Below is an example of this section using 8 cores:

PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA
 NCPU    NGW     NHG   PLANES  GXRAYS  HXRAYS  ORBITALS  Z-PLANES
    0   4333   34682     13     246     976        1         1
    1   4335   34684     14     246     976        2         1
    2   4336   34639     13     243     975        1         1
    3   4336   34676     14     244     976        2         1
    4   4334   34682     13     244     976        1         1
    5   4332   34670     14     244     976        2         1
    6   4338   34676     13     244     976        1         1
    7   4342   34680     14     244     976        2         1
 G=0 COMPONENT ON PROCESSOR : 2
PARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARAPARA

*** LOADPA| SIZE OF THE PROGRAM IS 10260/ 120760 kBYTES ***

OPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPEN
NUMBER OF CPUS PER TASK 1
OPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPENMPOPEN

*** RGGEN| SIZE OF THE PROGRAM IS 11492/ 121976 kBYTES ***

In the following part of the output we see a summary of the settings read in from the &SYSTEM section of the input file, or their corresponding defaults, and some derived parameters (density cutoff, number of plane waves):

************************** SUPERCELL ***************************
SYMMETRY: ORTHORHOMBIC
LATTICE CONSTANT(a.u.): 20.03110
CELL DIMENSION: 20.0311 0.9434 0.9245 0.0000 0.0000 0.0000
VOLUME(OMEGA IN BOHR^3): 7010.16987
LATTICE VECTOR A1(BOHR): 20.0311 0.0000 0.0000
LATTICE VECTOR A2(BOHR): 0.0000 18.8973 0.0000
LATTICE VECTOR A3(BOHR): 0.0000 0.0000 18.5193
RECIP. LAT. VEC. B1(2Pi/BOHR): 0.0499 0.0000 0.0000
RECIP. LAT. VEC. B2(2Pi/BOHR): 0.0000 0.0529 0.0000
RECIP. LAT. VEC. B3(2Pi/BOHR): 0.0000 0.0000 0.0540
REAL SPACE MESH: 108 108 100
WAVEFUNCTION CUTOFF(RYDBERG): 70.00000
DENSITY CUTOFF(RYDBERG): (DUAL= 4.00) 280.00000
NUMBER OF PLANE WAVES FOR WAVEFUNCTION CUTOFF: 34686
NUMBER OF PLANE WAVES FOR DENSITY CUTOFF: 277389
****************************************************************
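Several of the numbers in the SUPERCELL block can be reproduced from the input, which is a quick way to confirm that the cell was read in the intended units. The sketch below assumes the conversion factor 1 bohr = 0.52917721 Å (the variable names are illustrative). It converts the 10.6 x 10.0 x 9.8 Å cell to atomic units, recomputes the volume and the density cutoff (DUAL times the wavefunction cutoff), and sums the NGW and NHG columns of the 8-core distribution table shown earlier.

```python
BOHR_IN_ANGSTROM = 0.52917721  # assumed conversion factor (1 bohr in Angstrom)

# Cell edges from the CELL ABSOLUTE line of the input, in Angstrom.
cell_angstrom = (10.6, 10.0, 9.8)
cell_bohr = [edge / BOHR_IN_ANGSTROM for edge in cell_angstrom]
volume_bohr3 = cell_bohr[0] * cell_bohr[1] * cell_bohr[2]

# CPMD prints the cell as A, B/A, C/A.
ratios = (cell_bohr[1] / cell_bohr[0], cell_bohr[2] / cell_bohr[0])

# Density cutoff = DUAL * wavefunction cutoff (values from the output).
density_cutoff_ry = 4.0 * 70.0

# Plane waves per task, copied from the NGW and NHG columns (8 cores).
ngw_per_task = [4333, 4335, 4336, 4336, 4334, 4332, 4338, 4342]
nhg_per_task = [34682, 34684, 34639, 34676, 34682, 34670, 34676, 34680]

print("cell (bohr):", [round(edge, 4) for edge in cell_bohr])
print("B/A, C/A:", [round(r, 4) for r in ratios])
print("volume (bohr^3):", round(volume_bohr3, 2))
print("density cutoff (Ry):", density_cutoff_ry)
print("plane waves:", sum(ngw_per_task), sum(nhg_per_task))
```

The per-task plane wave counts add up to the totals printed for the wavefunction and density cutoffs, which shows that the basis is split almost evenly over the cores.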

Here we see how CPMD generates the initial guess for the wavefunction optimization. In this case it uses a superposition of atomic wavefunctions using an (internal) minimal Slater basis:

*** RINFORCE| SIZE OF THE PROGRAM IS 16052/ 127264 kBYTES ***
*** FFTPRP| SIZE OF THE PROGRAM IS 19764/ 128656 kBYTES ***
GENERATE ATOMIC BASIS SET
C SLATER ORBITALS
   2S ALPHA= 1.6083 OCCUPATION= 2.00
   2P ALPHA= 1.5679 OCCUPATION= 2.00
O SLATER ORBITALS
   2S ALPHA= 2.2458 OCCUPATION= 2.00
   2P ALPHA= 2.2266 OCCUPATION= 4.00
H SLATER ORBITALS
   1S ALPHA= 1.0000 OCCUPATION= 1.00
INITIALIZATION TIME: 0.39 SECONDS
*** GMOPTS| SIZE OF THE PROGRAM IS 24656/ 141780 kBYTES ***
*** PHFAC| SIZE OF THE PROGRAM IS 24764/ 152468 kBYTES ***
*** ATOMWF| SIZE OF THE PROGRAM IS 25784/ 154300 kBYTES ***
ATRHO| CHARGE(R-SPACE): 24.000000 (G-SPACE): 24.000000
****************************************************************
*                      ATOMIC COORDINATES                      *
****************************************************************
   1   C      0.000000    0.000000    0.000000
   2   C      2.583394    0.000000   -0.913367
   3   C     -1.291698   -2.237285   -0.913367
   4   O      0.000000    0.000000    2.305466
   5   H      2.583394    0.000000   -2.971279
   6   H      3.553503    1.680278   -0.227396
   7   H      3.553503   -1.680278   -0.227396
   8   H     -1.291698   -2.237285   -2.971279
   9   H     -0.321588   -3.917563   -0.227396
  10   H     -3.231915   -2.237285   -0.227396
****************************************************************

From the following statement in the output you can see that the Hessian has been initialized from a simple guess assuming a molecule with some specified bonds. This behavior can be controlled with the keyword HESSIAN. For bulk systems or complicated molecules, it may be better to start from a unit Hessian instead:

INITIALIZE EMPIRICAL HESSIAN
<<<<< ASSUMED BONDS >>>>>
 2 <--> 1    3 <--> 1    4 <--> 1    5 <--> 2    6 <--> 2
 7 <--> 2    8 <--> 3    9 <--> 3   10 <--> 3
<<<<< HYDROGEN BONDS >>>>>
 5 <--> 1    5 <--> 3    6 <--> 1    6 <--> 4    7 <--> 1
 7 <--> 3    7 <--> 4    8 <--> 1    8 <--> 2    9 <--> 1
 9 <--> 2    9 <--> 4   10 <--> 1   10 <--> 4
TOTAL NUMBER OF MOLECULAR STRUCTURES: 1
****************************************************************
*                      ATOMIC COORDINATES                      *
****************************************************************
   1   C      0.000000    0.000000    0.000000
   2   C      2.583394    0.000000   -0.913367
   3   C     -1.291698   -2.237285   -0.913367
   4   O      0.000000    0.000000    2.305466
   5   H      2.583394    0.000000   -2.971279
   6   H      3.553503    1.680278   -0.227396
   7   H      3.553503   -1.680278   -0.227396
   8   H     -1.291698   -2.237285   -2.971279
   9   H     -0.321588   -3.917563   -0.227396
  10   H     -3.231915   -2.237285   -0.227396
****************************************************************

Now the program is ready to start the geometry optimization. You can follow the progress of the optimization in the output file:

CPU TIME FOR INITIALIZATION 0.81 SECONDS
================================================================
=                    GEOMETRY OPTIMIZATION                     =
================================================================
 NFI      GEMAX       CNORM         ETOT        DETOT     TCPU
EWALD| SUM IN REAL SPACE OVER 1* 1* 1 CELLS
   1   3.015E-02   4.345E-03   -35.648938   -3.565E+01   0.51
   2   1.889E-02   1.442E-03   -36.307687   -6.587E-01   0.46
   3   1.904E-02   7.725E-04   -36.379166   -7.148E-02   0.46
…
  26   1.294E-06   1.055E-07   -36.415776   -2.343E-09   0.48
  27   9.933E-07   7.580E-08   -36.415776   -1.233E-09   0.48
RESTART INFORMATION WRITTEN ON FILE ./RESTART.1
  ATOM        COORDINATES                GRADIENTS (-FORCES)
  1  C   0.0000   0.0000   0.0000   -4.443E-02   7.697E-02  -5.004E-02
  2  C   2.5834   0.0000  -0.9134   -3.769E-02  -4.835E-02   4.876E-02
  3  C  -1.2917  -2.2373  -0.9134    6.088E-02   8.661E-03   4.903E-02
  4  O   0.0000   0.0000   2.3055    3.158E-02  -5.469E-02  -6.270E-02
  5  H   2.5834   0.0000  -2.9713   -1.090E-02  -5.915E-05   8.858E-03
  6  H   3.5535   1.6803  -0.2274   -9.255E-03  -3.027E-03  -2.712E-04
  7  H   3.5535  -1.6803  -0.2274    6.917E-04   3.287E-03  -1.830E-03
  8  H  -1.2917  -2.2373  -2.9713    5.515E-03   9.402E-03   8.862E-03
  9  H  -0.3216  -3.9176  -0.2274   -3.189E-03   1.061E-03  -1.824E-03
 10  H  -3.2319  -2.2373  -0.2274    7.229E-03   6.496E-03  -2.472E-04
FILE GEO_OPT.xyz EXISTS, NEW DATA WILL BE APPENDED
****************************************************************
*** TOTAL STEP NR.     27        GEOMETRY STEP NR.      1    ***
*** GNMAX= 7.696852E-02          ETOT=  -36.415776           ***
*** GNORM= 3.227866E-02          DETOT=  0.000E+00           ***
*** CNSTR= 0.000000E+00          TCPU=      12.89            ***
****************************************************************
   1   2.130E-02   5.311E-03   -35.580799    8.350E-01   0.46
   2   1.574E-02   1.566E-03   -36.324351   -7.436E-01   0.46
   3   5.503E-03   8.729E-04   -36.400043   -7.569E-02   0.46
...

A geometry optimization is not much more than repeated wavefunction optimizations, where the positions of the atoms are updated according to the forces acting on them. In the first part above you can see the wavefunction optimization. The columns have the following meaning:

• NFI step number (number of finite iterations)
• GEMAX largest off-diagonal component
• CNORM average of the off-diagonal components
• ETOT total energy
• DETOT change in total energy
• TCPU time used for this step

One can see that the calculation stops after the convergence criterion of 1.0d-6 has been reached for the GEMAX value. After printing the positions and forces of the atoms you see a small report block and then another wavefunction optimization starts. The numbers for GNMAX, GNORM, and CNSTR stand for the largest absolute component of the force on any atom, the average force on the atoms, and the largest absolute component of a constraint force on the atoms, respectively. They allow you to monitor the progress of the convergence of the geometry optimization. Finally, at the end of the geometry optimization, you can see that the forces and the total energy have decreased from their initial values, as is to be expected:

****************************************************************
*** TOTAL STEP NR.   3737        GEOMETRY STEP NR.    214    ***
*** GNMAX= 2.827646E-04 [3.64E-02]  ETOT=  -36.489908        ***
*** GNORM= 1.395703E-04          DETOT= -1.769E-05           ***
*** CNSTR= 0.000000E+00          TCPU=       9.63            ***
****************************************************************
================================================================
=                 END OF GEOMETRY OPTIMIZATION                 =
================================================================

and we have the final summary of the results; the atom coordinates and a breakdown of the total energy into the various components:

RESTART INFORMATION WRITTEN ON FILE ./RESTART.1

****************************************************************
*                                                              *
*                        FINAL RESULTS                         *
*                                                              *
****************************************************************

ATOM COORDINATES GRADIENTS (-FORCES)

  1  C   0.3033  -0.3710   0.4996    2.238E-04  -3.750E-05  -1.867E-04
  2  C   2.8157   0.1207  -0.8206   -2.828E-04   1.186E-04   1.534E-04
  3  C  -1.4005  -2.3884  -0.6592   -1.866E-05  -5.129E-05   1.748E-04
  4  O  -0.3055   0.7755   2.4178    1.488E-05  -9.553E-05   2.593E-04
  5  H   2.5171   0.5750  -2.8280    7.045E-06  -1.029E-04  -2.289E-04
  6  H   3.8187   1.6721   0.1124   -6.949E-05   1.507E-04  -3.226E-05
  7  H   3.9948  -1.5943  -0.7676    1.188E-04  -1.722E-05   1.350E-04
  8  H  -1.5584  -2.1576  -2.7186    1.045E-05  -8.026E-05  -1.718E-04
  9  H  -0.5710  -4.2692  -0.3267   -5.208E-05  -4.544E-05   5.413E-05
 10  H  -3.2775  -2.3193   0.2103   -1.609E-04   1.615E-04  -2.245E-04

****************************************************************

ELECTRONIC GRADIENT: MAX. COMPONENT = 8.03517E-07 NORM = 6.81662E-08

NUCLEAR GRADIENT: MAX. COMPONENT = 2.82765E-04 NORM = 1.39570E-04

TOTAL INTEGRATED ELECTRONIC DENSITY
   IN G-SPACE = 24.000000
   IN R-SPACE = 24.000000

(K+E1+L+N+X)            TOTAL ENERGY =  -36.48990756 A.U.
(K)                   KINETIC ENERGY =   27.70023623 A.U.
(E1=A-S+R)      ELECTROSTATIC ENERGY =  -27.60801450 A.U.
(S)                            ESELF =   29.92067103 A.U.
(R)                              ESR =    1.71581582 A.U.
(L)    LOCAL PSEUDOPOTENTIAL ENERGY  =  -27.66180959 A.U.
(N)      N-L PSEUDOPOTENTIAL ENERGY  =    1.82151808 A.U.
(X)     EXCHANGE-CORRELATION ENERGY  =  -10.74183777 A.U.
        GRADIENT CORRECTION ENERGY   =   -0.58344734 A.U.

****************************************************************
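The final summary can be verified with a few lines of arithmetic. According to the labels printed by CPMD, the total energy is the sum K+E1+L+N+X, and the run ends when the largest components of the electronic and nuclear gradients fall below the thresholds echoed near the top of the output (1.0000E-06 and 3.000000E-04). The sketch below copies the numbers from the output above; the variable names are illustrative.

```python
# Energy components from the FINAL RESULTS block, in atomic units.
components = {
    "K": 27.70023623,    # kinetic energy
    "E1": -27.60801450,  # electrostatic energy
    "L": -27.66180959,   # local pseudopotential energy
    "N": 1.82151808,     # non-local pseudopotential energy
    "X": -10.74183777,   # exchange-correlation energy
}
reported_total = -36.48990756

total = sum(components.values())
print(f"sum of components = {total:.8f} a.u.")
print(f"reported total    = {reported_total:.8f} a.u.")

# Largest gradient components at the end of the run versus the
# convergence thresholds echoed in the output header.
electronic_max, electronic_threshold = 8.03517e-07, 1.0e-06
nuclear_max, nuclear_threshold = 2.82765e-04, 3.0e-04

wavefunction_converged = electronic_max < electronic_threshold
geometry_converged = nuclear_max < nuclear_threshold
print("wavefunction converged:", wavefunction_converged)
print("geometry converged:", geometry_converged)
```

The sum agrees with the printed total energy to within the rounding of the last printed digit.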

The last section of the output reports the memory allocation and the timing of the run:

================================================================
BIG MEMORY ALLOCATIONS
   SCR      785159      CATOM    190652
   PME      520040      PSI      330270
   YF       330270      GNL      240960
   ATWFR    301200      SCR      785149
   XF       330270      RHOE     165135
----------------------------------------------------------------
[PEAK NUMBER 99] PEAK MEMORY 4872541 = 39.0 MBytes
================================================================

****************************************************************
*                                                              *
*                            TIMING                            *
*                                                              *
****************************************************************
SUBROUTINE     CALLS     CPU TIME   ELAPSED TIME
INVFFTN        44858      3172.00       323.62
FFT-G/S       269168      2733.40       274.36
FWFFT          18690      1730.00       175.40
INVFFT         14953      1726.20       173.92
GCENER          3738      1265.70       128.33
FWFFTN         22436      1242.70       125.49
FFTCOM         33643       817.30       127.30
VPSI            3747       593.00        58.45
GRADEN          3738       460.30        47.00
RHOOFR          3737       448.90        45.11
N-FFTCOM       67294       389.10        84.91
ODIIS           3737       381.20        38.66
XCENER          3738       353.30        35.86
FNONLOC         3747       233.70        23.72
PHASE          33643       219.10        22.22
VOFRHOA         3738       172.50        17.50
VOFRHOB         3738       131.30        12.75
EICALC          3738        94.90         9.90
RNLSM1          3747        66.90         6.84
RGS             3950        38.60         3.57
RNLSM2           664        35.50         3.48
----------------------------------------------------------------
TOTAL TIME               16305.60      1738.41
****************************************************************

CPU TIME : 4 HOURS 33 MINUTES 47.00 SECONDS

ELAPSED TIME : 0 HOURS 30 MINUTES 10.50 SECONDS

*** CPMD| SIZE OF THE PROGRAM IS 41068/ 152348 kBYTES ***

PROGRAM CPMD ENDED AT: Tue Jan 10 20:03:09 2011

================================================================
= COMMUNICATION TASK     AVERAGE MESSAGE LENGTH  NUMBER OF CALLS =
= SEND/RECEIVE                   43407. BYTES          36127.    =
= BROADCAST                        321. BYTES          15557.    =
= GLOBAL SUMMATION                 360. BYTES          79725.    =
= GLOBAL MULTIPLICATION              0. BYTES              1.    =
= ALL TO ALL COMM               876851. BYTES         100937.    =
=                               PERFORMANCE        TOTAL TIME    =
= SEND/RECEIVE                 5675.900 MB/S        0.276 SEC    =
= BROADCAST                      94.488 MB/S        0.053 SEC    =
= GLOBAL SUMMATION                8.258 MB/S       10.435 SEC    =
= GLOBAL MULTIPLICATION           0.000 MB/S        0.001 SEC    =
= ALL TO ALL COMM               417.428 MB/S      212.029 SEC    =
= SYNCHRONISATION                                   0.025 SEC    =
================================================================

If you look in your directory, you will see the following files:

Name            Content
GEOMETRY        Final ionic positions and velocities (in a.u.)
GEOMETRY.xyz    Final ionic positions and velocities in xyz format
GEO_OPT.xyz     All the ionic positions along the geometry optimization
HESSIAN         Approximate Hessian used in geometry optimization
LATEST          Info on the last restart file: filename and # times written
RESTART.1       Restart binary file from which it is possible to restart a new calculation

Now we want to calculate the dipole of the acetone in the “zero temperature” optimized structure just found, using the RESTART.1 file to retrieve the information. We can do that by modifying the input file this way (see the dipole.inp file):

&CPMD
 PROPERTIES
 RESTART WAVEFUNCTION COORDINATES LATEST
&END

&PROP
 DIPOLE MOMENT
&END
…

The PROPERTIES keyword tells CPMD to look at the &PROP section, which specifies the properties it has to calculate. The RESTART keyword tells CPMD to read WAVEFUNCTION and COORDINATES from the restart file named in the LATEST file (which was generated in the previous calculation). The rest of the input file is the same as the previous one: the coordinates will not be read (since they are taken from the restart file), but some numbers have to be put there anyway to prevent the program from stopping with an error! Run the job (better if you use the batch system):

mpirun -np 2 cpmd.x dipole.inp /home/ippoliti/PROGRAMS/archive/CPMD/PP > dipole.out

In the output file you can see the following new section:

CALCULATE DIPOLE MOMENT
****************************************************************
*                    PROPERTY CALCULATIONS                     *
****************************************************************
RV30| WARNING! NO WAVEFUNCTION VELOCITIES
RESTART INFORMATION READ ON FILE ./RESTART.1
*** PHFAC| SIZE OF THE PROGRAM IS 26772/ 142336 kBYTES ***
CENTER OF INTEGRATION (CORE CHARGE): 0.37726 -0.65932 0.02482
DIPOLE MOMENT
      X          Y          Z        TOTAL
   0.35426   -0.61017   -0.98996    1.21566   atomic units
   0.90044   -1.55090   -2.51623    3.08990   Debye
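The two lines of the dipole report are related by a unit conversion, and the TOTAL column is the norm of the three components. The sketch below reproduces both from the components in atomic units, assuming the conversion factor 1 a.u. of dipole moment = 2.541746 Debye (the variable names are illustrative).

```python
import math

AU_TO_DEBYE = 2.541746  # assumed conversion factor (1 e*bohr in Debye)

# Dipole components from the output above, in atomic units.
dipole_au = (0.35426, -0.61017, -0.98996)

norm_au = math.sqrt(sum(c * c for c in dipole_au))
dipole_debye = [c * AU_TO_DEBYE for c in dipole_au]
norm_debye = norm_au * AU_TO_DEBYE

print(f"|D| = {norm_au:.5f} a.u. = {norm_debye:.5f} Debye")
print("components (Debye):", [round(c, 5) for c in dipole_debye])
```

The total in Debye is the value Dv needed in the last step of the procedure outlined in the Introduction.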

3 – The electrostatic grid

The next step is to create a box with an acetone molecule and several water molecules. For that we will use the xleap program belonging to the Amber11 molecular dynamics package.7 This package has been devised to deal with biological systems (amino acids, nucleic acids, etc.), so our acetone is not recognized by it. The same happens when we want to study a drug that interacts with “traditional” biological systems. In particular, xleap does not know how to assign the partial charges to each atom of acetone. To do that we will use a standard procedure that is suggested by the developers of the Amber force fields, and that they use to tune those parameters which cannot be determined experimentally. The procedure is based on calculating at the quantum level the electrostatic potential on a grid around the molecule, and then determining the so-called RESP charges to assign to each atom in order to reproduce the electrostatic field. According to the recipe of the Amber developers, we will use the Gaussian09 package8 to calculate the electrostatic grid with the 6-31G(d,p) localized basis set and the B3LYP exchange correlation functional. First, we need to optimize the geometry again: the level of theory is different and in principle the optimized structures could be different! Here is the input file opt_geom.com for Gaussian:

%NProcShared=2
%chk=./opt_geom.chk
%mem=600MB
#p opt b3lyp/6-31G(d,p) nosymm iop(6/7=3) gfinput

Acetone geometry optimization

0 1
C   0.000000  0.000000  0.000000
O   0.000000  0.000000  1.220000
C   1.367073  0.000000 -0.483333
C  -0.683537 -1.183920 -0.483333
H   1.367073  0.000000 -1.572333
H   1.880433  0.889165 -0.120333
H   1.880433 -0.889165 -0.120333
H  -0.683537 -1.183920 -1.572333
H  -0.170177 -2.073085 -0.120333
H  -1.710256 -1.183920 -0.120333

Note: Remember to leave two blank lines at the end of the file (do not ask me why :-)

Run the optimization:

1. Initialize the environment (this time we use the module command since the program was installed by the staff of the cluster and not by us):

module load gaussian

7 You can find the manual at the web address: http://ambermd.org/doc11/
8 Gaussian09 online manual: http://www.gaussian.com/g_tech/g_ur/g09help.htm

2. Run Gaussian (better if you use the batch system):

g09 opt_geom.com

3. After about one minute the job will end, and you can check that the optimization has completed successfully this way:

grep -B 1 Maximum opt_geom.log

Use Molden to retrieve the coordinates of the last optimized configuration:

molden opt_geom.log &

In the Molden Control panel press the “Geom. conv.” button and the Geom Convergence window will appear. In that window you can select any step of the optimization procedure; select the last one and save the structure. Now use Gaussian again to calculate the electrostatic grid with the electrostatic_grid.com input file:

%NProcShared=8
%chk=./electrostatic_grid.chk
%mem=600MB
#p b3lyp/6-31G(d,p) nosymm iop(6/33=2) pop(chelpg,regular) guess=read gfinput

Acetone electrostatic grid calculation

0 1
C 0.1066280 -0.1846610 0.1346160
O -0.3060700 0.5300980 1.0273310
C 1.4973060 -0.0081640 -0.4519990
C -0.7415970 -1.3007560 -0.4520680
H 1.4366050 0.1927160 -1.5279620      ← Optimized coordinates
H 2.0063930 0.8173940 0.0467940
H 2.0826090 -0.9274690 -0.3338980
H -0.8854310 -1.1476190 -1.5280030
H -0.2379310 -2.2672580 -0.3343160
H -1.7110070 -1.3291270 0.0468420

The electrostatic grid will be written in the log file: look for the string “ESP fit Center”.

4 – Partial charges

The program Antechamber included in the Amber10 and Amber11 packages9 is able to read the electrostatic grid from a Gaussian log file and calculate the RESP charges associated with the atoms of your molecule. To use Antechamber, or any other program of the Amber11 suite, we first need to initialize the environment:

module load amber/11

9 The current version (07-2012) of Antechamber in Amber12 is buggy. Do not use Amber12 for these calculations.

Copy the Gaussian log file with the electrostatic grid information to a new directory and use the following command:10

antechamber -i electrostatic_grid.log -fi gout -c resp -nc 0 -m 1 -o acetone.resp.prep -fo prepi -rn ACET

Actually, Antechamber calls several programs in sequence, which produce many different intermediate files in your directory. When the process ends correctly you should see the file acetone.resp.prep, which contains the partial RESP charges that we will use in the next step (copy it to a new directory and move there).

5 – Water box

It is time to create the water box containing the acetone and the files necessary to run a short classical Molecular Dynamics (MD) simulation. In fact, we want to perform a classical pre-equilibration of the system first, in order to relax it close to a “stable state”.11 For this purpose we will use Xleap, the Graphical User Interface of Amber11:

xleap -f $AMBERHOME/dat/leap/cmd/leaprc.ff99SB &

where the file leaprc.ff99SB is used to load all the libraries which contain the parameters of the Amber force field 99SB. This force field, like any other, has been tuned to describe amino acids, nucleic acids, etc., but it contains no information about acetone. For this reason we have calculated the partial charges, which we can now load in Xleap by typing the following command (in the Xleap main window):12

loadamberprep acetone.resp.prep

For the same reason some other parameters are still missing and have to be added manually. Most of them can be retrieved from the “general Amber force field for organic molecules” (GAFF):

loadamberparams $AMBERHOME/dat/leap/parm/gaff.dat

Other ones, however, have to be provided manually. It is not difficult to calculate them, but here we will not explain how to do it13. We have already prepared a file with these missing parameters (mostly bond and angle ones) and they can be loaded in Xleap with the following command:

loadamberparams /work/Tutorials/ACETONE/5-XLEAP/acetone.frcmod

10 Simply run the command “antechamber” to get a short description of the meaning of all the options recognized by antechamber.
11 Remember that the time step of a classical MD run is ~fs, while the time step of a Car-Parrinello MD run is ~0.1 fs and each step is far more expensive. So, the classical pre-equilibration is essential, since typical relaxation times are of the order of 10s-100s of ps!
12 The syntax of any Xleap command can be recalled by typing the command without any arguments.
13 Refer to the Amber manual, for example.

You can now have a look at your system with the command edit:

edit ACET

You can also check if all the parameters for the acetone have been loaded:

check ACET

Finally, you can save the topology and coordinates files for your ACET unit by typing:

saveamberparm ACET acetone.top acetone.rst

However, we want to study the acetone in water, and therefore we first need to solvate the system:

solvatebox ACET TIP3PBOX 14

TIP3PBOX specifies that we solvate with the classical water model “TIP3P”14; an orthorhombic box whose walls are at least 14 Å from any atom of the acetone has been created. You can look at it with:

edit ACET

From the new window you can also look at the partial charges and other parameters of your system:

• Select the atoms
• Edit/Edit selected atoms

Finally, we save the topology and coordinates files for our solvated system:

saveamberparm ACET acetone_solv.top acetone_solv.rst

Copy the two files to another folder. With the VMD software15 you can visualize your system:

vmd -parm7 acetone_solv.top -rst7 acetone_solv.rst

VMD is also installed on your laptop. You can transfer the files needed by VMD to your laptop and use them locally: this strategy can be worthwhile when your network connection is really slow.

14 http://en.wikipedia.org/wiki/Water_model
15 VMD User’s Guide: http://www.ks.uiuc.edu/Research/vmd/current/ug/

6 – Pre Equilibration

We now need to equilibrate the system at the force field level, in order to later start the QM/MM simulation from a configuration “not so far” from an equilibrium one (at the QM/MM level). To run the MD simulation we will use the “sander” program of the AMBER11 package. We proceed this way:16

1. First, a classical minimization of the system with the acetone molecule restrained to its initial position: this step is performed because the initial Xleap solvation is not physically reasonable (no hydrogen bonds, etc.), and in this way we help the water molecules to orient correctly around the acetone (better if you use the batch system):

mpirun -np 2 sander.MPI -O -i 1-restraint.inp -o eq_restraint.out -c acetone_solv.rst -p acetone_solv.top -r eq_restraint.rst -ref acetone_solv.rst -inf eq_restraint.info

Note: the command “tail eq_restraint.info” can be used to monitor the minimization and to verify that it does not reach the maximum number of steps but stops earlier, once the convergence criteria are satisfied.

2. Then, an unrestrained minimization is performed (better if you use the batch system):

mpirun -np 2 sander.MPI -O -i 2-minimization.inp -o eq_minimization.out -c eq_restraint.rst -p acetone_solv.top -r eq_minimization.rst -inf eq_minimization.info

3. Now, we bring the system to 300 K with an MD at constant volume and a linear heating. In this step we weakly restrain the acetone to its initial position, so that water can spread all around the acetone without forming “holes”. Since the relaxation time of liquid water is of the order of 10 ps, we need to perform an MD longer than 10 ps (better if you use the batch system):

mpirun -np 2 sander.MPI -O -i 3-heating.inp -p acetone_solv.top -c eq_minimization.rst -ref eq_minimization.rst -o eq_heating.out -r eq_heating.rst -x eq_heating.crd -e eq_heating.en -inf eq_heating.info

4. Finally, we couple our system simultaneously to a thermostat at 300 K and a barostat at 1 atm and perform an NPT simulation, to let the density of the system reach its equilibrium value at room conditions (~ 1 g/cm3, since the system is mostly water) (better if you use the batch system):

mpirun -np 2 sander.MPI -O -i 4-eq_density.inp -p acetone_solv.top -c eq_heating.rst -o eq_density.out -r eq_density.rst -x eq_density.crd -e eq_density.en -inf eq_density.info

16 All the input files for sander can be found in the folder:

/work/Tutorials/ACETONE/6-SANDER

Note: when the last step has ended, you can quickly analyze the behavior of all the physically relevant quantities of your system (the density, for example) by using the perl script “process_mdout.perl”:

• mkdir analysis
• cd analysis
• process_mdout.perl ../eq_density.out
• xmgrace summary.DENSITY

[Plot: Box Density]

Why is the density smaller than 1 g/cm3?

7 – Reimaging

Let’s have a look at the last configuration you obtained from the classical molecular dynamics equilibration:

cd ..

vmd -parm7 acetone_solv.top -rst7 eq_density.rst

What happened to our orthorhombic box?

The image shows your system without the periodic boundary conditions (PBC) that sander and all the other programs of Amber11 apply. So, molecules drift over time and may span multiple periodic cells; this is normal when you are working with Amber11 or with some other molecular dynamics package. However, we now want to move to CPMD in order to perform a QM/MM MD simulation, and CPMD does not apply PBC “automatically”. Consequently, we need to “reimage” the coordinates into the primary unit cell. We can use the “ptraj” program17 of the Amber11 suite to accomplish this task. Move the topology and coordinates files to another folder and then create the input file eq_density.ptraj for ptraj:

trajin eq_density.rst                       ← coordinates file to read
trajout eq_density_reimaged.rst restart     ← output file in the same format as the input
center :1                                   ← center the box on the geometric center of residue 1
image center                                ← force all the molecules into the primary unit cell

Run ptraj according to this syntax:

ptraj acetone_solv.top < eq_density.ptraj

Verify that the imaging has been correctly done:

vmd -parm7 acetone_solv.top -rst7 eq_density_reimaged.rst.1

8 – Convert MD files

Copy the topology and the “reimaged” coordinates files to a new directory and type the following commands:

module load cpmd

Conv_7.x acetone_solv.top eq_density_reimaged.rst.1 solvate

Conv_7.x is an in-house program written some years ago to convert the Amber MD files to the GROMOS format. This is needed because the QM/MM interface of CPMD works with that format. The option “solvate” lets us specify that the water molecules should be treated as solvent: this is useful only if you are interested in reading, in the log files of CPMD, energies and other quantities partitioned into solute and solvent components.

17 http://ambermd.org/doc10/AmberTools.pdf

If you get an error like this:

PGFIO-F-231/formatted read/unit=12/error on data conversion.
File name = acetone_solv.top formatted, sequential access record = 2538
In source file Conv.f, at line number 280

the reason is that you are using an Amber version later than 2010. Since the end of 2010, xleap writes two short sections in the topology files:

%FLAG SCEE_SCALE_FACTOR

and

%FLAG SCNB_SCALE_FACTOR

introduced to allow you to set the scaling coefficients for individual elements. These sections are not understood by our converter and should be removed from the topology file before processing it with Conv_7.x. This can simply be done by opening the topology file with a text editor:

vi acetone_solv.top

This procedure can also be done automatically by using the perl script Conv_7.pl in place of Conv_7.x:

Conv_7.pl acetone_solv.top eq_density_reimaged.rst.1 solvate

These two sections are irrelevant for our purposes, including classical molecular dynamics, so removing them is harmless. The converter will produce 5 files:

• gromos.top     the Gromos topology file for our system
• gromos.inp     the Gromos input file
• gromos.g96     the coordinates file in g96 format
• gromacs.top    the Gromacs topology file
• ffgro96.itp    the itp file for the Gromacs software

We need only the first 3 files, but some changes18 have to be made in order to correctly set up a QM/MM MD simulation:

18 see also http://www.cpmd.org/manual/node267.html

Changes in gromos.inp

1 - In the section SYSTEM the two numbers should be, in sequence:
Number of (identical) solute (not necessarily the QM part!) molecules
Number of (identical) solvent (not necessarily the MM part!) molecules
Such information can be retrieved, for example, by typing at the command prompt:

tail gromacs.top

2 - In the section BOUNDARY:

The first number should be 0 for an isolated system; >0 if PBC in a parallelepiped box have been used; <0 if PBC in an octahedral box have been used. The following 3 numbers are the sizes of the box, which can be read at the end of the gromos.g96 file. 90.0 is the angle between the x and z axes of the box. The last number is ignored by CPMD in QM/MM simulations.

3 - In the section SUBMOLECULES the numbers should be, in sequence:
Number of (different) solute molecules.
Index of the last atom of the first solute molecule.
Index of the last atom of the second solute molecule.
...
Such data can be read from the file gromos.g96:

vi gromos.g96

4 - In the section PRINT you may want to modify the first number, which gives the number of steps after which CPMD writes the energy information to the output file (20 is usually enough).

5 - In the section FORCE, under the line of 1's (which turns the various force components on), we have to put:
The number of different layers, usually 2 (solute and solvent)
Index of the last atom of layer 1
Index of the last atom of layer 2
...

Changes in gromos.top

In the section ATOMTYPENAME replace the names of the types of the atoms, from the standard generic force field library "gaff":

vi $AMBERHOME/dat/leap/parm/gaff.dat

to the Amber force field library:

vi $AMBERHOME/dat/leap/parm/parm99.dat

The correctly modified files gromos_mod.top and gromos_mod.inp can be found in:

/work/Tutorials/ACETONE/8-Conv_7

9 – QM/MM MD

Copy the modified files and the gromos.g96 file to a new directory. At this point, we have all the files that describe the MM part and its interactions; the last step is to write the input file for CPMD with the instructions for the simulation and the definition of the QM part. A CPMD input file for a QM/MM simulation is similar to the CPMD input file for a standard QM calculation that has been described in paragraph 2 (sections &INFO, &CPMD, &SYSTEM, &ATOMS, &DFT). However, there are 6 main differences that should always be taken into account when you deal with the QM/MM interface of CPMD:

1. In the &CPMD section the keyword QMMM has to be added.

2. A new &QMMM section, which we will explain in detail below, is mandatory.

3. The QM atoms are specified in the &ATOMS section as in normal CPMD calculations, but instead of explicit coordinates one has to provide the atom indices as given in the Gromos topology and coordinates files:

vi gromos.g96

4. The keyword ANGSTROM in the &SYSTEM section cannot be used, so any length has to be specified in a.u.

5. The option ABSOLUTE of the keyword CELL cannot be used. Therefore, the correct syntax for the size of a rectangular box AxBxC is

A B/A C/A 0 0 0

6. The QM system in a QM/MM calculation can only be treated as an isolated system, i.e. without explicit PBC, since the MM environment is all around it. Even though we are requesting an isolated system calculation (SYMMETRY keyword with the option ISOLATED SYSTEM or 0), the calculation is, in fact, still done in a periodic cell (we are still using a plane wave basis set!). Since acetone has a dipole moment, we have to take care of the long range interactions between periodic images, and there are methods implemented in CPMD (activated with the keyword POISSON SOLVER in the &SYSTEM section) to compensate for this effect. We will choose the TUCKERMAN Poisson solver19, since it has been proven to be the most effective one for the typical systems studied in biology. Decoupling the electrostatic images in the Poisson solver requires a box larger than the molecule: practical experience shows that 3.5 Å of space between the outermost atoms and the box walls is usually sufficient for typical biological systems.

&QMMM section

In this paragraph we will review the most relevant keywords to specify in the &QMMM section of the CPMD input file. See for example 20 for a complete list:

19 G. J. Martyna and M. E. Tuckerman, J. Chem. Phys. 110, 2810 (1999).
20 http://www.cpmd.org/manual/node258.html

TOPOLOGY: On the next line the name of a Gromos topology file has to be given.

COORDINATES: On the next line the name of a Gromos96 format coordinate file has to be given.

INPUT: On the next line the name of a Gromos input file has to be given.

AMBER: An Amber functional form for the classical force field is used.

ELECTROSTATIC COUPLING: The electrostatic interaction of the quantum system with the classical system is explicitly taken into account for all classical atoms at a distance r ≤ RCUT_NN from any quantum atom, and for all the MM atoms at a distance RCUT_NN < r ≤ RCUT_MIX with a charge larger than 0.1e (NN atoms). MM atoms with a charge smaller than 0.1e at a distance RCUT_NN < r ≤ RCUT_MIX, and all MM atoms with RCUT_MIX < r ≤ RCUT_ESP, are coupled to the QM system by an ESP coupling Hamiltonian (EC atoms). If the additional LONG RANGE keyword is specified, the interaction of the QM system with the rest of the classical atoms is explicitly taken into account through a multipole expansion for the QM system up to quadrupolar order. A file named MULTIPOLE is produced. If LONG RANGE is omitted, the quantum system is coupled to the classical atoms that are neither in the NN list nor in the EC list via the force-field charges. If the keyword ELECTROSTATIC COUPLING is omitted, all classical atoms are coupled to the quantum system by the force-field charges (mechanical coupling): computationally expensive calculation!

RCUT_NN: The cutoff distance for atoms in the nearest neighbor region from the QM system is read from the next line. We will use the default value of 10 a.u.

RCUT_MIX: The cutoff distance for atoms in the intermediate region is read from the next line. We will use the value of 15 a.u.

RCUT_ESP: The cutoff distance for atoms in the ESP area is read from the next line. We will use the value of 20 a.u.

UPDATE LIST: On the next line the number of MD steps between updates of the various lists of atoms for ELECTROSTATIC COUPLING is given. At every list update a file INTERACTING_NEW.pdb is created (and overwritten).

SAMPLE INTERACTING: The sampling rate for writing a trajectory of the interacting subsystem is read from the next line. With the additional keyword OFF or a sampling rate of 0, those trajectories are not written.

ARRAYSIZES: Parameters for the dimensions of various internal arrays can be given in this block. The syntax is one label and the corresponding dimension per line. Suitable parameters can be estimated using the script estimate_gromos_size.sh:

estimate_gromos_size.sh gromos_mod.top

This section of the input has to be terminated by a line containing END ARRAYSIZES.
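The rules given under ELECTROSTATIC COUPLING can be summarized in a few lines of Python. This is only a sketch of the decision logic as described in the text, not CPMD code: the function name classify_mm_atom and the label "FAR" are invented here, the charge is compared in absolute value, and a charge of exactly 0.1e is treated as "small" (the text does not say how these two cases are handled, so both are assumptions).

```python
def classify_mm_atom(distance, charge,
                     rcut_nn=10.0, rcut_mix=15.0, rcut_esp=20.0,
                     charge_threshold=0.1):
    """Assign an MM atom to a coupling region.

    distance: distance (a.u.) from the QM atoms used in the comparison.
    charge:   force-field partial charge of the MM atom (in units of e).
    Returns "NN" (explicit coupling), "EC" (ESP coupling Hamiltonian)
    or "FAR" (multipole expansion if LONG RANGE is set, otherwise
    force-field charges).
    """
    if distance <= rcut_nn:
        return "NN"
    if distance <= rcut_mix:
        # Intermediate shell: only sizeable charges stay in the NN list.
        # Using abs() is an assumption of this sketch.
        return "NN" if abs(charge) > charge_threshold else "EC"
    if distance <= rcut_esp:
        return "EC"
    return "FAR"


# TIP3P charges (O: -0.834 e, H: +0.417 e) at a few illustrative distances.
for r, q in [(8.0, 0.417), (12.0, -0.834), (12.0, 0.05), (18.0, 0.417), (25.0, -0.834)]:
    print(f"r = {r:5.1f} a.u., q = {q:+.3f} e -> {classify_mm_atom(r, q)}")
```

The default cutoffs are the values used in this tutorial (10, 15 and 20 a.u.).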

Now, we have all the elements to write a CPMD input file for a QM/MM simulation. But which steps do we need to perform a stable Car-Parrinello MD? We have the equilibrated coordinates at room conditions obtained from a classical MD: this is a good starting point, however we still need to equilibrate the system at QM/MM level since the two levels of theory are different.

Therefore, as we have done in the second paragraph, we should first optimize the geometry of the system. Unfortunately, none of the optimization algorithms in CPMD works together with the QM/MM interface. Consequently, we have to use some “trick” to find a minimum energy structure (at the QM/MM level). In particular, in this tutorial we will perform a simulated annealing (keyword ANNEALING IONS), i.e. we run a CP dynamics while gradually removing kinetic energy from the nuclei by multiplying the velocities by a factor (in our case it is set to 0.99, so 1% of the kinetic energy will be removed in every step). Here is the annealing.inp file which performs this preliminary step:

&QMMM
 TOPOLOGY
 gromos_mod.top
 COORDINATES
 gromos.g96
 INPUT
 gromos_mod.inp
 ELECTROSTATIC COUPLING LONG RANGE
 RCUT_NN
 10
 RCUT_MIX
 15
 RCUT_ESP
 20
 UPDATE LIST
 100
 SAMPLE_INTERACTING
 0
 AMBER
 ARRAYSIZES
 MAXATT 16
 MAXAA2 11
 MAXNRP 20
 MAXNBT 15
 MAXBNH 16
 MAXBON 13
 MAXTTY 14
 MXQHEH 22
 MAXTH 13
 MAXQTY 10
 MAXHIH 10
 MAXQHI 10
 MAXPTY 14
 MXPHIH 28
 MAXPHI 11
 MAXCAG 11
 MAXAEX 20024
 MXEX14 22
 END ARRAYSIZES
&END

&CPMD
 QMMM
 MOLECULAR DYNAMICS CP
 ISOLATED MOLECULE
 QUENCH BO
 ANNEALING IONS
 0.99
 TEMPERATURE
 300
 EMASS
 600.
 TIMESTEP
 5.0
 MAXSTEP
 10000
 TRAJECTORY SAMPLE
 0
 STORE
 100
 RESTFILE
 1
&END

&SYSTEM
 POISSON SOLVER TUCKERMAN
 SYMMETRY
 0
 CELL
 18.61 1.11 0.95 0 0 0
 CUTOFF
 70.
 CHARGE
 0.0
&END

&DFT
 FUNCTIONAL BLYP
&END

&ATOMS
*H_MT_BLYP.psp KLEINMAN-BYLANDER
 LMAX=S
 6
 4 5 6 8 9 10
*C_MT_BLYP.psp KLEINMAN-BYLANDER
 LMAX=P
 3
 2 3 7
*O_MT_BLYP.psp KLEINMAN-BYLANDER
 LMAX=P
 1
 1
&END

Some comments on the keywords of the &CPMD section that have not been explained yet:

MOLECULAR DYNAMICS CP: Perform a molecular dynamics run. CP stands for a Car-Parrinello type of MD.

ISOLATED MOLECULE: Calculate the ionic temperature assuming that the system consists of an isolated molecule or cluster.

QUENCH BO: The wavefunction is converged at the beginning of the MD run.

TEMPERATURE: The initial temperature for the atoms in Kelvin is read from the next line: we start from 300 K since it is the temperature at which we equilibrate the system classically.

EMASS: The fictitious electron mass in atomic units for the CP dynamics is read from the next line. We choose 600 a.u., but ideally a careful set of tests should be done to verify that the adiabaticity conditions are met21: this parameter and the following one are the only ones to tune in order to decouple the electronic and ionic degrees of freedom and minimize the energy transfer between them.

TIMESTEP: The time step in atomic units is read from the next line. We use the default time step of 5 a.u. ~ 0.12 fs.

MAXSTEP: The maximum number of steps for molecular dynamics to be performed. The value is read from the next line.

TRAJECTORY SAMPLE: Store the atomic positions, velocities and optionally forces every N time steps on the TRAJECTORY file. N is read from the next line. If N=0 the trajectory file will not be written.

STORE: The RESTART file is updated every N steps. N is read from the next line. Default is at the end of the run.

RESTFILE: The number of distinct RESTART files generated during CPMD runs is read from the next line. The restart files are written in turn. Default is 1.
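To get a feeling for the time scales implied by TIMESTEP and MAXSTEP, the short Python sketch below converts atomic units of time to femtoseconds. The conversion factor (1 a.u. of time ≈ 0.02418884 fs) is a standard constant, not something read from the CPMD input, and the helper name au_to_fs is introduced here for illustration.

```python
# One atomic unit of time expressed in femtoseconds.
AU_TIME_FS = 0.02418884


def au_to_fs(t_au):
    """Convert a time interval from atomic units to femtoseconds."""
    return t_au * AU_TIME_FS


timestep_au = 5.0   # TIMESTEP used in this tutorial
max_steps = 10000   # MAXSTEP of the annealing input

timestep_fs = au_to_fs(timestep_au)
total_ps = max_steps * timestep_fs / 1000.0

print(f"time step      : {timestep_fs:.4f} fs")
print(f"{max_steps} steps  : {total_ps:.3f} ps")
```

With these settings a full run of MAXSTEP steps covers only about a picosecond, which is why the classical pre-equilibration is done beforehand.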

21 http://www.theochem.ruhr-uni-bochum.de/research/marx/marx.pdf

Moreover, here is a simple procedure to obtain the sizes of the box to insert under the CELL keyword by using standard bash commands:

• Create a file with the coordinates of the atoms belonging to the QM part:

grep CET gromos.g96 > QM.coor

• Sort the lines on column 5 (6, 7) for the x (y, z) coordinate in increasing numerical (-n) order:

sort -k 5 -n QM.coor

• Take the last value (L) of the column (in nm), subtract the first one (F) from it, add 0.7 nm (i.e. 2 * 3.5 Å, for the Poisson solver’s requirements) and then convert the result to a.u.:

echo "(L - F + 0.7)*10/0.529" | bc -l

• Finally, remember that under the CELL keyword of the CPMD input file the size of the quantum cell has to be inserted according to the syntax:

size_x size_y/size_x size_z/size_x 0 0 0
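The procedure above can be sketched in Python as follows. The function names and the example extents (L - F for x, y and z, in nm) are assumptions made for illustration; with your own system you would take L and F from the sorted QM.coor columns. The conversion 10/0.529 is the same one used in the bc command.

```python
# Conversion used in the text: nm -> Å (x10) -> bohr (/0.529).
BOHR_PER_NM = 10.0 / 0.529


def qm_cell(extents_nm, margin_nm=0.7):
    """Return (A, B/A, C/A) for the CELL keyword.

    extents_nm: (L - F) along x, y, z in nm.
    margin_nm:  total padding per direction (2 * 3.5 Å = 0.7 nm).
    """
    a, b, c = [(e + margin_nm) * BOHR_PER_NM for e in extents_nm]
    return a, b / a, c / a


def format_cell(a, b_over_a, c_over_a):
    """Format the line to be written under the CELL keyword."""
    return f"{a:.2f} {b_over_a:.2f} {c_over_a:.2f} 0 0 0"


# Hypothetical extents of the QM acetone molecule, in nm.
extents = (0.2845, 0.3908, 0.2306)

a, b_over_a, c_over_a = qm_cell(extents)
print(format_cell(a, b_over_a, c_over_a))
```

Only the first number is an absolute length in a.u.; the other two are ratios, because the ABSOLUTE option of CELL cannot be used with the QM/MM interface.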

1 - ANNEALING

We are now ready to run the annealing (better through the batch system); copy all the files to a folder called ANNEALING and type:

mpirun -np 2 cpmd.x annealing.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > annealing.out &

While the simulation runs you can monitor the decreasing temperature (third column named TEMPP) this way:

tail -f annealing.out

When the temperature reaches about 2-4 K we can “softly” stop the calculation (i.e. in a way that makes it write a RESTART file) by typing at the command prompt:

touch EXIT
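A rough idea of how long the annealing takes can be obtained with the Python sketch below. It assumes that the ionic temperature shrinks by a constant factor at every step and ignores any exchange between kinetic and potential energy, so it is only an order-of-magnitude estimate; the function name annealing_steps is introduced here. The per-step factor of 0.99 is the value consistent with the first TEMPP values printed in the annealing output of this tutorial (297.0, 293.9, 290.9 K).

```python
import math


def annealing_steps(t_start, t_target, factor):
    """Steps needed to go from t_start to t_target (both in K) if the
    temperature is multiplied by `factor` (0 < factor < 1) each step."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be between 0 and 1")
    return math.ceil(math.log(t_target / t_start) / math.log(factor))


steps = annealing_steps(300.0, 3.0, 0.99)
print(f"about {steps} steps to cool from 300 K to 3 K")
```

In practice the temperature does not decay perfectly geometrically, so keep monitoring the TEMPP column rather than relying on this number.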

The final configuration will be stored in the RESTART.1 file. More files will be generated during a QM/MM CP simulation:

QMMM_ORDER  The first line specifies the total number of atoms (NAT) and the number of quantum atoms (NATQ). The subsequent NAT lines contain, for every atom, the gromos atom number, the internal CPMD atom number, the CP species number isp and the number in the list of atoms for this species NA(isp). The quantum atoms are specified in the first NATQ lines.

CRD_INI.g96  Contains the positions of all atoms in the first frame of the simulation in Gromos extended format (g96).

CRD_FIN.g96  Contains the positions of all atoms in the last frame of the simulation in Gromos extended format (g96).

INTERACTING.pdb  Contains (in a non-standard PDB-like format) all the QM atoms and all the MM atoms in the electrostatic coupling NN list. The 5th column in this file specifies the gromos atom number as defined in the topology file and in the coordinates file. The 10th column specifies the CPMD atom number as in the TRAJECTORY file. The quantum atoms are labeled by the residue name QUA.

INTERACTING_NEW.pdb  The same as before, but it is created if the file INTERACTING.pdb is detected in the current working directory of the CPMD run.

MM_CELL_TRANS  The QM system (atoms and wavefunction) is always re-centered in the given supercell. This file contains the trajectory of the re-centering offset for the QM box. The first column is the frame number (NFI), followed by the x-, y- and z-components of the cell-shift vector.

ENERGIES  Contains all the energies along the trajectory.

Let’s have a closer look at the output file annealing.out, which presents some new sections:

CAR-PARRINELLO MOLECULAR DYNAMICS
PATH TO THE RESTART FILES: ./
ITERATIVE ORTHOGONALIZATION
MAXIT: 30
EPS: 1.00E-06
MAXIMUM NUMBER OF STEPS: 10000 STEPS
MAXIMUM NUMBER OF ITERATIONS FOR SC: 10000 STEPS
PRINT INTERMEDIATE RESULTS EVERY 10001 STEPS
STORE INTERMEDIATE RESULTS EVERY 100 STEPS
STORE INTERMEDIATE RESULTS EVERY 10001 SELF-CONSISTENT STEPS
NUMBER OF DISTINCT RESTART FILES: 1
TEMPERATURE IS CALCULATED ASSUMING AN ISOLATED MOLECULE

In CPMD, atoms are frequently referred to as ions, which may be confusing. This is due to the pseudopotential approach, where you integrate the core electrons into the (pseudo)atom, which could then also be described as an ion. See for example the following output segment:

FICTITIOUS ELECTRON MASS: 600.0000
TIME STEP FOR ELECTRONS: 5.0000
TIME STEP FOR IONS: 5.0000
QUENCH SYSTEM TO THE BORN-OPPENHEIMER SURFACE
SIMULATED ANNEALING OF IONS WITH ANNERI = 0.990000
ELECTRON DYNAMICS: THE TEMPERATURE IS NOT CONTROLLED
ION DYNAMICS: THE TEMPERATURE IS NOT CONTROLLED

This part of the output tells us that the TIMESTEP keyword was recognized, as well as the output option, and that there will be no temperature control, i.e. we will do a microcanonical (NVE-ensemble) simulation. Then, several sections devoted to describing in detail the QM/MM interface and its data immediately follow:

INITIALIZATION TIME: 1.05 SECONDS

*** MDPT| SIZE OF THE PROGRAM IS 95528/ 288728 kBYTES ***
*** PHFAC| SIZE OF THE PROGRAM IS 97076/ 300980 kBYTES ***
*** ATOMWF| SIZE OF THE PROGRAM IS 98100/ 302796 kBYTES ***
ATRHO| CHARGE(R-SPACE): 24.000000 (G-SPACE): 24.000000

RE-CENTERING QM SYSTEM AT EVERY TIME STEP
BOX TOLERANCE [a.u.] 7.00000000000000

             BOX SIZE [a.u.]       QM SYSTEM SIZE [a.u.]
X DIRECTION: CELLDIM = 18.6100;    XMAX-XMIN= 5.3773
Y DIRECTION: CELLDIM = 20.6571;    YMAX-YMIN= 7.3868
Z DIRECTION: CELLDIM = 17.6795;    ZMAX-ZMIN= 4.3595
>>>>>>>> QUENCH SYSTEM TO THE BORN-OPPENHEIMER SURFACE <<<<<<<<
*** QUENBO| SIZE OF THE PROGRAM IS 112180/ 308536 kBYTES ***
*** MM_ELSTAT| SIZE OF THE PROGRAM IS 112288/ 308536 kBYTES ***
!!!!!!!!!!!!!!!!!! WARNING !!!!!!!!!!!!!!!!!!!
THE QM SYSTEM DOES NOT HAVE AN INTEGER CHARGE.
A COMPENSATING CHARGE OF 0.000040
HAS BEEN DISTRIBUTED OVER THE NN ATOMS.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
…

0.000040 e of charge to compensate: for all practical purposes this is zero! After the force initialization section the MD finally begins:

 NFI    EKINC   TEMPP        EKS   ECLASSIC       EHAM        EQM        DIS   TCPU
   1  0.00006   297.0  -54.34715  -51.01094  -51.01088  -36.46342  0.218E-04   1.53
   2  0.00056   293.9  -54.34671  -51.04477  -51.04421  -36.46309  0.863E-04   1.54
   3  0.00136   290.9  -54.34661  -51.07860  -51.07724  -36.46309  0.191E-03   1.52
   4  0.00201   287.9  -54.34663  -51.11193  -51.10992  -36.46319  0.335E-03   1.52
 ...

The individual columns have the following meaning:

NFI       Step number (number of finite iterations)
EKINC     Fictitious kinetic energy of the electronic degrees of freedom
TEMPP     Temperature (= kinetic energy / degrees of freedom) for atoms (ions)
EKS       Kohn-Sham energy; equivalent to the potential energy in classical MD
ECLASSIC  The total energy in a classical MD, but not the conserved quantity for CP dynamics (ECLASSIC = EHAM - EKINC)
EHAM      Energy of the total CP Hamiltonian; the conserved quantity
EQM       Energy of the QM part (electrons + nuclei contribution)
DIS       Mean squared displacement of the atoms from the initial coordinates
TCPU      Time needed for this step
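The relation ECLASSIC = EHAM - EKINC can be verified directly on the four MD steps printed above. The Python sketch below parses those rows (copied from the output) and prints the residual for each step; the column names follow the header of the output, while the helper parse_md_rows is introduced here for illustration.

```python
COLUMNS = ("NFI", "EKINC", "TEMPP", "EKS", "ECLASSIC",
           "EHAM", "EQM", "DIS", "TCPU")

# Rows copied from the annealing output shown above.
SAMPLE = """\
1 0.00006 297.0 -54.34715 -51.01094 -51.01088 -36.46342 0.218E-04 1.53
2 0.00056 293.9 -54.34671 -51.04477 -51.04421 -36.46309 0.863E-04 1.54
3 0.00136 290.9 -54.34661 -51.07860 -51.07724 -36.46309 0.191E-03 1.52
4 0.00201 287.9 -54.34663 -51.11193 -51.10992 -36.46319 0.335E-03 1.52
"""


def parse_md_rows(text):
    """Turn whitespace-separated MD rows into dicts keyed by column name.
    Lines that do not have exactly one value per column are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        rows.append(dict(zip(COLUMNS, (float(f) for f in fields))))
    return rows


rows = parse_md_rows(SAMPLE)
for row in rows:
    residual = row["ECLASSIC"] - (row["EHAM"] - row["EKINC"])
    print(f"step {int(row['NFI'])}: ECLASSIC - (EHAM - EKINC) = {residual:+.1e}")
```

The residuals should be zero within the five printed decimals.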

Finally we get a summary of some averages and root mean squared deviations for some of the monitored quantities. This is quite useful to detect unwanted energy drifts or too large fluctuations in the simulation:

RESTART INFORMATION WRITTEN ON FILE ./RESTART.1

****************************************************************
*                      AVERAGED QUANTITIES                     *
****************************************************************
                              MEAN VALUE    +/-  RMS DEVIATION
                                     <x>    [<x^2>-<x>^2]**(1/2)
ELECTRON KINETIC ENERGY         0.000867         0.300539E-03
IONIC TEMPERATURE                37.2640          52.8318
DENSITY FUNCTIONAL ENERGY     -56.788301         0.907159
CLASSICAL ENERGY              -56.369667          1.48329
CONSERVED ENERGY              -56.368800          1.48357
NOSE ENERGY ELECTRONS           0.000000          0.00000
NOSE ENERGY IONS                0.000000          0.00000
CONSTRAINTS ENERGY              0.000000          0.00000
RESTRAINTS ENERGY               0.000000          0.00000
ION DISPLACEMENT                0.306287         0.932230E-01
CPU TIME                          1.5641

2 – TEST

To verify that the configuration we have reached is physically "reasonable", a good test is to run a simulation in the NVE ensemble and monitor the temperature (column 3) and the physical energy (column 5): if after some steps these two quantities stabilize (usually oscillating around a temperature smaller than 100 K), then we can be confident that the RESTART.1 file previously obtained is a good minimum energy structure. On the other hand, if the energy and/or the temperature increase continuously, we do not have a good structure and another annealing procedure is required, starting from another point (for example after heating the system to 300 K in order to move it away from that “wrong” potential energy basin). The test can be accomplished by the following procedure:

• Create a new folder:

mkdir TEST

cd TEST

• Copy the following files from the previous calculation:

cp ../ANNEALING/gromos* .

cp ../ANNEALING/RESTART.1 RESTART

cp ../ANNEALING/annealing.inp test.inp

• Modify the test.inp file so that the &CPMD section reads:

&CPMD
 RESTART COORDINATES VELOCITIES WAVEFUNCTION
 QMMM
 MOLECULAR DYNAMICS CP
 ISOLATED MOLECULE
 EMASS
 600.
 TIMESTEP
 5.0
 MAXSTEP
 3000
 TRAJECTORY SAMPLE
 0
&END

Note: in the RESTART line there is no LATEST option: this way CPMD expects to read data from a file named exactly “RESTART”.

• Run the test (better through the batch system):

mpirun -np 8 cpmd.x test.inp /home/ippoliti/PROGRAMS/archive/CPMD/PP > test.out

• Monitor the simulation:

tail -f test.out

• When it ends you can plot the temperature and the physical energy with gnuplot:

gnuplot
p 'ENERGIES' u 1:3 w l
p 'ENERGIES' u 1:5 w l
quit
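Besides looking at the plots, the "does it keep increasing?" criterion can be checked numerically. The Python sketch below extracts columns 3 and 5 from text with the layout described above and fits a straight line to each, returning the drift per step. The helper names are invented here, the file is assumed to be plain whitespace-separated numbers, and the sample rows are the four annealing steps quoted earlier in this tutorial, reused only to show the format (for a real test you would read the ENERGIES file of the TEST run).

```python
def load_columns(text, columns=(3, 5)):
    """Extract the given 1-based columns from whitespace-separated rows.
    Rows that are too short or not numeric are skipped."""
    series = {c: [] for c in columns}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < max(columns):
            continue
        try:
            values = [float(fields[c - 1]) for c in columns]
        except ValueError:
            continue
        for c, v in zip(columns, values):
            series[c].append(v)
    return series


def drift_per_step(values):
    """Least-squares slope of the values against their step index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Format sample only: four rows from the annealing output of this tutorial.
SAMPLE = """\
1 0.00006 297.0 -54.34715 -51.01094 -51.01088 -36.46342 0.218E-04 1.53
2 0.00056 293.9 -54.34671 -51.04477 -51.04421 -36.46309 0.863E-04 1.54
3 0.00136 290.9 -54.34661 -51.07860 -51.07724 -36.46309 0.191E-03 1.52
4 0.00201 287.9 -54.34663 -51.11193 -51.10992 -36.46319 0.335E-03 1.52
"""

series = load_columns(SAMPLE)
temp_drift = drift_per_step(series[3])
energy_drift = drift_per_step(series[5])
print(f"temperature drift: {temp_drift:+.3f} K/step")
print(f"energy drift     : {energy_drift:+.5f} a.u./step")
```

For a good structure both slopes, evaluated over the last part of the TEST run, should be close to zero; what counts as "close" is a judgment call and is not fixed by this sketch.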

3 – HEATING

If the test went well, we can go back to the configuration obtained by annealing and start heating the system to room temperature. There are several methods implemented in CPMD to do that. We choose to increase the target temperature of a thermostat (coupled to the system) linearly at each step while performing a usual CP dynamics. A simple Berendsen-type thermostat22 will be applied. Two more keywords are required in the &CPMD section with respect to the previous input file:

1. TEMPERATURE with the option RAMP; three numbers have to be specified on the line below the keyword: the initial and target temperatures in K and the ramping speed in K per atomic time unit (to get the change per time step, multiply it by the value of TIMESTEP). Read the initial temperature from the output file of the annealing procedure.

2. BERENDSEN with the option IONS; two numbers have to be specified on the line below the keyword: the target temperature (the initial one in our case) and the time constant τ in a.u. (0.12 ps is a reasonable value).
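The unit conversions implied by these two keywords can be sketched in a few lines of Python. This is only an illustrative helper: the function names are hypothetical, and the numbers in the example (0.12 ps, a ramp of 1 K per a.u. of time, a time step of 5 a.u., heating from 3.8 K to 340 K) are sample inputs. The only physical constant used is the atomic unit of time, about 0.0241888 fs.

```python
import math

AU_TIME_FS = 0.024188843  # one atomic unit of time in femtoseconds


def ps_to_au(t_ps):
    """Convert a time in picoseconds to atomic units of time."""
    return t_ps * 1000.0 / AU_TIME_FS


def ramp_per_step(rate_k_per_au, timestep_au):
    """Temperature change per MD step for a ramp given in K per a.u. of time."""
    return rate_k_per_au * timestep_au


def steps_to_target(t_initial, t_target, rate_k_per_au, timestep_au):
    """Number of MD steps the ramp needs to go from t_initial to t_target."""
    per_step = ramp_per_step(rate_k_per_au, timestep_au)
    return math.ceil(abs(t_target - t_initial) / per_step)


if __name__ == "__main__":
    print("tau = 0.12 ps in a.u.:", round(ps_to_au(0.12)))
    print("K per step:", ramp_per_step(1.0, 5.0))
    print("steps needed:", steps_to_target(3.8, 340.0, 1.0, 5.0))
```

The first conversion shows why a time constant of about 5000 a.u. corresponds to roughly 0.12 ps.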

22 H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak, J. Chem. Phys. 81, 3684 (1984).

The procedure to accomplish the heating can be summarized this way:

• Create a new folder:

mkdir HEATING

cd HEATING

• Copy the following files from the previous calculation:

cp ../ANNEALING/gromos* .

cp ../ANNEALING/RESTART.1 RESTART

cp ../TEST/test.inp heating.inp

• Modify heating.inp according to the rules mentioned above:

vi heating.inp

by adding the following two keywords, each followed by its line of values, in the &CPMD section:

BERENDSEN IONS
 3.8 5000

TEMPERATURE RAMP
 3.8 340.0 1

• Run the simulation as done for the test, writing the output to heating.out, and monitor the temperature:

tail -f heating.out

• If the temperature approximately reaches the target temperature before the MAXSTEP number of steps has been performed, you can stop the simulation:

touch EXIT

4 – PRODUCTION RUN

We are finally ready to run a CP molecular dynamics at room conditions. To do that, as usual, we will create a new folder:

mkdir PRODUCTION-RUN

cd PRODUCTION-RUN

and then we will copy the necessary files there in order to start the calculation from the last configuration obtained from the heating procedure:

cp ../HEATING/gromos* .

cp ../HEATING/RESTART.1 RESTART

cp ../HEATING/heating.inp cpmd.inp

To run CP molecular dynamics we need to modify the previous input file:

• We want to restart from the previous wavefunction, coordinates and velocities, since we want to keep the temperature information stored in the RESTART file. Therefore we will keep the option VELOCITIES in the RESTART keyword and remove the TEMPERATURE keyword. We will replace the Berendsen thermostat with Nosé-Hoover chains23, because this second kind of thermostat preserves the Maxwell distribution of the velocities and is therefore more physically meaningful. In more technical terms, it provides an NVT ensemble for a system in equilibrium. The keyword that turns it on is NOSE, followed by the degrees of freedom to which you want to apply it (IONS); the target temperature in kelvin and the thermostat frequency in cm-1 are read from the next line:

NOSE IONS
 300 4000

For the choice of the frequency at which the energy transfer happens, you only have to take care not to select a resonant vibrational frequency of your system.

• We will reintroduce the keyword MAXSTEP to perform 10000 steps (or more if you have time: typical CPMD trajectories nowadays are tens of ps long!)

• Finally, we want to save 10 restart files (i.e. configurations from which we will calculate the dipole moment of acetone) equally spaced in time along the trajectory. We can do that by properly using the keywords STORE and RESTFILE:

STORE
 1000

RESTFILE
 10

This way CPMD will create ten restart files in sequence called RESTART.1, RESTART.2, …, RESTART.10 each one after 1000 steps of dynamics.
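The arithmetic behind these settings can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the time step of 5 a.u. is assumed to be the one inherited from the previous input files, and the conversion uses 1 a.u. of time ≈ 0.0241888 fs.

```python
AU_TIME_FS = 0.024188843  # one atomic unit of time in femtoseconds


def store_interval(maxstep, n_restart_files):
    """Steps between two restart files so that they are equally spaced."""
    if maxstep % n_restart_files:
        raise ValueError("maxstep is not a multiple of the number of files")
    return maxstep // n_restart_files


def trajectory_length_ps(maxstep, timestep_au):
    """Simulated time in picoseconds for maxstep steps of timestep_au each."""
    return maxstep * timestep_au * AU_TIME_FS / 1000.0


if __name__ == "__main__":
    interval = store_interval(10000, 10)
    print("steps between restart files:", interval)
    print("restart written at steps:", list(range(interval, 10001, interval)))
    print("trajectory length (ps):", trajectory_length_ps(10000, 5.0))
```

Under these assumptions, 10000 steps cover a little more than 1 ps, which puts the remark about trajectories of tens of ps into perspective.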

23 S. Nosé and M. L. Klein, Mol. Phys. 50, 1055 (1983); S. Nosé, Mol. Phys. 52, 255 (1984); S. Nosé, J. Chem. Phys. 81, 511 (1984); S. Nosé, Prog. Theor. Phys. Suppl. 103, 1 (1991); W. G. Hoover, Phys. Rev. A 31, 1695 (1985).

Running a meaningful Car-Parrinello dynamics simulation requires adiabaticity conditions to be met, i.e. the separation of the electronic and ionic degrees of freedom. Theoretically, such a separation can be achieved by separating the power spectrum of the orbital classical fields from the phonon spectrum of the ions (the gap between the lowest electronic frequency and the highest ionic frequency should be large enough). Since the electronic frequencies depend on the fictitious electron mass EMASS, one should carefully optimize its value and raise the lowest frequency appropriately. The adiabaticity can be observed by running test simulations and looking at the energy components. In particular, the fictitious kinetic energy of the electronic degrees of freedom (EKINC, second column in the ENERGIES file) might have a tendency to grow. However, after an initial transfer of a little kinetic energy, the electrons should be much “colder” than the ions, since only then will the electronic structure remain close to the Born-Oppenheimer surface and thus the wavefunction and the forces derived from it will be meaningful. Therefore, we must always monitor the behavior of EKINC in order to verify that the system remains in the adiabatic regime:

 TOTAL INTEGRATED ELECTRONIC DENSITY
    IN G-SPACE =     24.000000
    IN R-SPACE =     24.000000

 (K+E1+L+N+X+Q+M)     TOTAL ENERGY =  -57.87866446 A.U.
 (K+E1+L+N+X)      TOTAL QM ENERGY =  -36.47942711 A.U.
 (Q)            TOTAL QM/MM ENERGY =    0.00000000 A.U.
 (M)               TOTAL MM ENERGY =  -21.38112962 A.U.
                        DIFFERENCE =   -0.01810773 A.U.
 (K)                KINETIC ENERGY =   27.71754549 A.U.
 (E1=A-S+R)   ELECTROSTATIC ENERGY =  -27.62122216 A.U.
 (S)                         ESELF =   29.92067103 A.U.
 (R)                           ESR =    1.68558813 A.U.
 (L)  LOCAL PSEUDOPOTENTIAL ENERGY =  -29.41090607 A.U.
 (N)    N-L PSEUDOPOTENTIAL ENERGY =    3.57237860 A.U.
 (X)   EXCHANGE-CORRELATION ENERGY =  -10.73722297 A.U.
        GRADIENT CORRECTION ENERGY =   -0.58313547 A.U.
 NFI    EKINC   TEMPP        EKS   ECLASSIC       EHAM        EQM        DIS   TCPU
   1  0.00001   302.5  -57.87866  -54.48084  -54.48082  -36.47943  0.538E-05   1.50
   2  0.00011   302.3  -57.87721  -54.48093  -54.48082  -36.47936  0.214E-04   1.47
   3  0.00027   302.2  -57.87578  -54.48109  -54.48082  -36.47937  0.480E-04   1.35
   4  0.00037   302.0  -57.87431  -54.48118  -54.48081  -36.47937  0.849E-04   1.50
   5  0.00039   301.9  -57.87280  -54.48119  -54.48080  -36.47937  0.132E-03   1.35
   6  0.00039   301.8  -57.87125  -54.48118  -54.48079  -36.47937  0.190E-03   1.35
   7  0.00037   301.6  -57.86966  -54.48116  -54.48078  -36.47937  0.258E-03   1.67
   8  0.00037   301.5  -57.86803  -54.48115  -54.48078  -36.47936  0.336E-03   1.35
   9  0.00037   301.3  -57.86636  -54.48114  -54.48077  -36.47936  0.425E-03   1.46
 NBPML: 80733 ELEMENTS IN THE PAIRLIST
  10  0.00037   301.2  -57.86466  -54.48114  -54.48077  -36.47935  0.523E-03   1.37
  11  0.00036   301.0  -57.86291  -54.48113  -54.48077  -36.47934  0.632E-03   1.46
  12  0.00035   300.9  -57.86112  -54.48111  -54.48076  -36.47932  0.752E-03   1.35
  13  0.00035   300.7  -57.85927  -54.48107  -54.48073  -36.47931  0.881E-03   1.35
  14  0.00034   300.5  -57.85740  -54.48106  -54.48072  -36.47929  0.102E-02   1.46
  15  0.00033   300.4  -57.85551  -54.48106  -54.48073  -36.47928  0.117E-02   1.35
  16  0.00032   300.2  -57.85357  -54.48105  -54.48073  -36.47927  0.133E-02   1.46
  17  0.00031   300.0  -57.85159  -54.48103  -54.48072  -36.47926  0.150E-02   1.35
  18  0.00030   299.8  -57.84956  -54.48101  -54.48070  -36.47925  0.169E-02   1.51
  19  0.00030   299.7  -57.84751  -54.48099  -54.48070  -36.47924  0.188E-02   1.42

Ensuring adiabaticity of CP dynamics consists of decoupling the two subsystems and thus minimizing the energy transfer from ionic degrees of freedom to electronic ones. In this sense the system during CP dynamics simulation should be kept in a metastable state.
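Monitoring EKINC by eye can be complemented by a small script. The following Python sketch is only an illustration under stated assumptions: it parses step lines in the nine-column layout printed above (NFI EKINC TEMPP EKS ECLASSIC EHAM EQM DIS TCPU), skips any other line, and reports the least-squares slope of EKINC versus step number together with the spread of EHAM. The function names are hypothetical, and no threshold is implied: what counts as an acceptable drift depends on the system.

```python
SAMPLE = """\
 NFI EKINC TEMPP EKS ECLASSIC EHAM EQM DIS TCPU
 8 0.00037 301.5 -57.86803 -54.48115 -54.48078 -36.47936 0.336E-03 1.35
 9 0.00037 301.3 -57.86636 -54.48114 -54.48077 -36.47936 0.425E-03 1.46
 NBPML: 80733 ELEMENTS IN THE PAIRLIST
 10 0.00037 301.2 -57.86466 -54.48114 -54.48077 -36.47935 0.523E-03 1.37
 11 0.00036 301.0 -57.86291 -54.48113 -54.48077 -36.47934 0.632E-03 1.46
 12 0.00035 300.9 -57.86112 -54.48111 -54.48076 -36.47932 0.752E-03 1.35
"""


def parse_md_rows(lines, n_columns=9):
    """Keep only lines with n_columns numeric fields and an integer step."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != n_columns or not fields[0].isdigit():
            continue  # header, pair-list message, or anything else
        try:
            rows.append([float(x) for x in fields])
        except ValueError:
            continue
    return rows


def linear_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    rows = parse_md_rows(SAMPLE.splitlines())
    steps = [r[0] for r in rows]
    ekinc = [r[1] for r in rows]
    eham = [r[5] for r in rows]
    print("EKINC slope per step (a.u.):", linear_slope(steps, ekinc))
    print("EHAM spread (a.u.):", max(eham) - min(eham))
```

A persistently positive EKINC slope over a long stretch of the run would be the warning sign described above, while a flat or slowly oscillating EKINC together with a well-conserved EHAM is what one hopes to see.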

Hint: any time you notice something strange (and even if you do not!), a very useful suggestion is to ALWAYS look at your simulations with some visualization tool: most problems are immediately identified at a glance…

To visualize a CP-MM simulation you can use vmd:24

vmd -g96 CRD_INI.g96 -cpmd TRAJECTORY

5 – DIPOLE CALCULATION

As a last step we want to calculate the dipole moment for each snapshot saved in the ten restart files previously collected and then take the mean value, in order to obtain an estimate of the temperature and entropic effects due to the environment formed by the solvent. Create a new folder and copy the restart files and the other needed files into it:

mkdir DIPOLE_CALCULATION

cd DIPOLE_CALCULATION

cp ../PRODUCTION-RUN/RESTART.* .

cp ../PRODUCTION-RUN/gromos* .

cp ../PRODUCTION-RUN/cpmd.inp dipole.inp

cp ../PRODUCTION-RUN/LATEST .

and modify the input file to calculate the dipole moment as you learnt in the second section:

…
&CPMD
 QMMM
 RESTART WAVEFUNCTION COORDINATES LATEST
 PROPERTIES
 RESTFILE
  0
&END
&PROP
 DIPOLE MOMENT
&END
…

The keyword LATEST is inserted in order to perform the ten calculations in a comfortable way: if you want CPMD to read the RESTART.X file, you only need to put that name in the first line of the file LATEST. The keyword RESTFILE with value 0 guarantees that CPMD does not write a RESTART file at the end of each calculation (we do not need it now), so that it does not overwrite the ones we have copied from the PRODUCTION-RUN folder. So, you can perform (better through the batch system) the following sequence of very fast calculations (less than 3 s!), being careful to modify the name in the first line of the file LATEST each time:

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole1.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole2.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole3.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole4.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole5.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole6.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole7.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole8.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole9.out &

mpirun -np 2 cpmd.x dipole.inp /home/ei250498/PROGRAMS/SRC/cpmd/PP > dipole10.out &

24 Any vmd command can also be executed through the GUI of the program: refer to its manual for details.


Note: you could have skipped this procedure by using the keyword

DIPOLE DYNAMICS SAMPLE
 1000

in the previous CP calculation. Such a keyword makes CPMD save the components of the dipole moment every 1000 steps in a file called DIPOLE. An example is saved in the folder:

/work/Tutorials/ACETONE/9-CPMD-MM/4-PRODUCTION-RUN/DIPOLE/

However, these values are polarizations (in a.u.): they have to be multiplied by the volume to obtain the correct values of the dipole in a.u.

At this point you only have to extract each value of the dipole moment:

grep -B 1 -A 4 "X Y" dipole1.out

grep -B 1 -A 4 "X Y" dipole2.out …

and calculate the arithmetic mean D. What is the difference with respect to the value in vacuum that we calculated in the second section? What is the qualitative reason for such a difference? The solvent effect can be separated into two different components: the geometry variation of the solute with respect to the gas phase (temperature effect), and the different polarization of the wavefunction. How could you proceed if you wanted to identify the contributions of the two parts?
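Once the ten dipole vectors have been copied by hand from the output files, the averaging can be sketched in Python as follows. This is a minimal sketch, not a parser for the CPMD output: the function names are hypothetical, the vectors in the example are made-up placeholders rather than results, and the mean is taken over the magnitudes of the individual snapshots.

```python
import math
import statistics


def magnitude(vec):
    """Length of a dipole vector given as (x, y, z)."""
    return math.sqrt(sum(c * c for c in vec))


def mean_dipole(dipoles):
    """Mean of the dipole magnitudes and its standard error."""
    mags = [magnitude(d) for d in dipoles]
    mean = statistics.mean(mags)
    if len(mags) > 1:
        sem = statistics.stdev(mags) / math.sqrt(len(mags))
    else:
        sem = 0.0
    return mean, sem


if __name__ == "__main__":
    # Placeholder vectors (same units as the CPMD output you copy them from).
    snapshots = [(1.0, 0.5, 0.2), (1.1, 0.4, 0.1), (0.9, 0.6, 0.3)]
    d_vacuum = 1.0  # placeholder for the vacuum value Dv
    d_mean, d_err = mean_dipole(snapshots)
    print("D =", d_mean, "+/-", d_err)
    print("D - Dv =", d_mean - d_vacuum)
```

The standard error printed here treats the snapshots as independent, which is itself an assumption: configurations taken from the same short trajectory may still be correlated.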
