

Chemistry 348 Spring 2014

RUNNING & VISUALIZING TRAJECTORIES WITH AMBER & VMD

Developed by Ross Walker (one of the developers of the AMBER molecular simulation program) and adapted by D. Kohen

INTRODUCTION

This lab consists of a series of exercises that will allow you to become a novice user (but a user nevertheless) of AMBER (Assisted Model Building with Energy Refinement) and VMD (Visual Molecular Dynamics). While doing this you will learn a little more about running Molecular Dynamics trajectories in general and, in particular, about the effects of the solvent on the behavior of a 10-mer A/T DNA molecule.

As you work through the assignment you might need to consult the AMBER (http://ambermd.org/doc12/) and VMD (http://www.ks.uiuc.edu/Research/vmd/current/ug/ug.html) manuals.

Other important and potentially useful links are http://ambermd.org/ (AMBER main web page),

http://ambermd.org/tutorials/ (AMBER tutorials page), http://www.ks.uiuc.edu/Research/vmd/ (VMD main

web page) and http://www.ks.uiuc.edu/Research/vmd/current/ug/node15.html (rapid introduction to VMD

page).

You will be running AMBER both on the mu3c cluster and locally (on your Mac). When using your local machine you will use the Terminal utility of your Mac. The Terminal utility is a UNIX environment (UNIX is the operating system most often used to run scientific software). Most of this lab will be done only on your local machine; later, when you need it, I will let you know how to access and use AMBER on the cluster. To run a terminal, go to the Applications folder and find the Terminal utility in the Utilities folder. You will be typing most commands in that window, so unless I tell you otherwise, assume that is where you should write your commands.

Throughout this tutorial filenames and command lines will be written in courier or an

equivalent monospace font while program names such as sander will be written in the same font but

italicized. Input files will be colored in red and output files in green. Amber related instructions will be

shadowed in green and command lines that you need to type in yellow.

Before really starting you need to learn a few super useful unix commands by using them in your terminal window. To change directory (folder in mac speak) to the desktop type: cd Desktop and then hit return. Note that there are no spaces after Desktop, that unix is case sensitive and that you can copy from this word document into your terminal window by first copying (apple c) in the word document and then pasting (apple v) in your terminal window. Finally note that in unix there is a meaning behind the letters: in this case cd stands for change directory. It is important to try to see the logic, otherwise it is almost impossible to remember any unix command. To see what is in the directory, type ls -ltr (ls: list; - indicates that options are to follow; l gives the long format, one file per line with its details; t sorts the files by the time they were last modified; r reverses the order, so the last item in the list is the most recently modified one) and then hit return (you always need to hit return; I will not tell you to do it any more). Make a new directory by typing mkdir my_amber_1 (mkdir: make directory) and then change your directory to the new directory. (To go back up a directory type cd ..) Note how the “prompt” is displayed as something like “dkohen47010:my_amber_1 dkohen$”: first is the name of my computer, then after the colon is the name of the directory you are in, followed by my user name. Now look at the desktop and see how the folder (directory) you just created shows up. Another useful command is pwd (print working directory); it shows you the path of the directory you are in (in my case “/Users/dkohen/Desktop/my_amber_1”).
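If you want to practice these commands without touching your Desktop, the sketch below strings them together in a throwaway directory. The scratch directory created by mktemp is an assumption of this example and is not part of the lab; the directory name my_amber_1 is the one used above.

```shell
# Work in a throwaway directory so nothing on the Desktop is touched.
scratch=$(mktemp -d)
cd "$scratch"

mkdir my_amber_1        # mkdir: make directory
cd my_amber_1           # cd: change directory
here=$(pwd -P)          # pwd: print working directory (-P resolves symbolic links)
echo "$here"

cd ..                   # go back up one directory
listing=$(ls -ltr)      # long format, sorted by time, reversed (newest last)
echo "$listing"
```

The path printed by pwd will differ on every machine, but it should always end in my_amber_1.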

Another cool feature of unix: if you hit the up arrow, the last command you typed reappears and can be executed again by hitting return. You can even edit the line before hitting return!

Also, you need to tell your computer where the amber files are. To do that, at the prompt type export AMBERHOME=/Applications/Amber/amber12. Now, whenever you type $AMBERHOME your computer will interpret it as /Applications/Amber/amber12, which is the location of the amber files.
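A quick way to convince yourself that the shortcut works is sketched below. The variable name nab_path and the warning message are made up for this illustration; the path itself is the one given above. Keep in mind that export only lasts for the terminal window in which you typed it, so a new window needs the command again.

```shell
# Path taken from the handout; adjust it if AMBER lives elsewhere on your machine.
export AMBERHOME=/Applications/Amber/amber12

# The shell substitutes the value wherever $AMBERHOME appears.
nab_path="$AMBERHOME/bin/nab"
echo "$nab_path"

# Optional sanity check: warn if the directory is not actually there.
if [ ! -d "$AMBERHOME" ]; then
    echo "warning: $AMBERHOME does not exist on this machine" >&2
fi
```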

REPORTING

Your report will be a collection of figures and short answers. Those figures and questions will be asked for throughout the handout (in pink!). Your report should also include:

• A list of all the words that you do not understand.

• A flow chart indicating which programs needed which input files and how the output of those programs was used in the calculations that follow.

• An outline of the handout. Use the headings that are numbered, provide a very brief explanation of each sub-section’s point, and list the commands issued within it (do copy the commands).

• A list of all the programs within AMBER you used and a short description of each.

• A complete flow chart (a few names of output and input files are missing). Download the incomplete chart from the moodle page.

AMBER TUTORIALS

Before getting started with AMBER take a look at AMBER’s main page at http://ambermd.org/ and also

the portal for the tutorials at: http://ambermd.org/tutorials/

ADAPTED DNA TUTORIAL: SIMULATING A DNA POLYA-POLYT DECAMER

Copyright Ross Walker 2010

(http://ambermd.org/tutorials/basic/tutorial1/)

This tutorial will act as a basic introduction to LEaP, sander and ptraj, to build, solvate, run

molecular dynamics and analyze trajectories. It will also cover visualizing trajectories using VMD. This

tutorial is an adaptation of the main DNA tutorial provided with the AMBER 10 software. Its aim is to act

as a brief introduction to running classical molecular dynamics simulations using the AMBER software.


In this tutorial we will create an initial structure for a 10-mer of DNA and then we will run gas phase, implicit and explicit solvent simulations on it. Finally we will look at a practical example of how MD simulations can be used to investigate how A-DNA can convert to B-DNA.

(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite

to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do

not necessarily provide the optimal choice of parameters or methods for the particular application area.)

Pictured below is the average structure from a 1 nanosecond molecular dynamics simulation of a 10 base pair poly(A)-poly(T) DNA duplex. The calculation was run in explicit solvent using periodic boundaries and the particle mesh Ewald method of treating long range electrostatics. The average structure was generated using ptraj by Root Mean Square (RMS) fitting all of the DNA atoms in 1,000 snapshots at 1 ps intervals and then averaging the coordinates.

The paragraph above describing the figure below has several words referring to MD that you just learned

about. List them.

1) Introduction

The purpose of this tutorial is to provide an initial introduction to setting up and running simulations using

the AMBER software (It is based on AMBER 10 and AmberTools 1.2 but should work reasonably well with

AMBER 9 and with newer versions of the AMBER software as they are released). In this tutorial we run a

series of simulations on a poly(A)-poly(T) decamer of DNA. We will first figure out how to generate a

starting structure and then use this structure to construct the necessary input files for running sander,

the main molecular dynamics engine supplied with AMBER 10.

In order to run a classical molecular dynamics simulation with sander, a number of files are required.

These are (using their default filenames):


• prmtop - a file containing a description of the molecular topology (this is the top part of the name) and

the necessary force field parameters (this is the prm part of the name).

• inpcrd (name derives from input coordinates) (or a restrt [name derives from restart] from a previous

run) - a file containing a description of the atom coordinates and optionally velocities and current

periodic box dimensions.

• mdin (name derives from molecular dynamics input) the sander input file consisting of a series of

namelists (lists of names, duh!) and control variables that determine the options and type of

simulation to be run.

In the first section of this tutorial we shall use the tools provided with AmberTools 1.2 to create prmtop

and inpcrd files for both in vacuo and solvated systems. We will then run sander to perform minimization

followed by molecular dynamics and eventually get to the point where we can reproduce the picture

shown above.

Since running these simulations using explicit solvent can be expensive, we will also use some models

that include solvent effects implicitly.

The approximate order of this tutorial will be as follows:

1 Create the prmtop and inpcrd files: This is a description of how to generate the initial structure and

set up the molecular topology/parameter and coordinate files necessary for performing minimization

or dynamics with sander.

2 An introduction to minimization and molecular dynamics. Run short MD simulations in vacuo. Perform basic analysis such as calculating the RMS deviation and plotting various energy terms as a function of time, and visualize results with VMD.

3 Minimization and molecular dynamics in implicit solvent: Setting up and running minimization, equilibration and production molecular dynamics simulations for our DNA model using the generalized Born implicit solvent model.

4 Minimization and molecular dynamics in explicit solvent: Setting up and running equilibration and

production simulations for our DNA model using TIP3P explicit water.

2) Setting up duplex DNA: polyA-polyT

The first stage of the tutorial is to create the necessary input files required by sander for performing

minimization and molecular dynamics.

There are a number of methods for creating these input files. In this example we will use a program

written specifically for this purpose: LEaP. This is a program that reads in force field, topology and

coordinate information and produces the files necessary for production calculations (i.e. minimization,

molecular dynamics, analysis). There are two versions of this program provided with AMBER 10 & AmberTools 1.2: a graphical version called XLEaP and a terminal interface called tleap. (There are also completely new versions being produced, called gleap and sleap respectively, that will ultimately replace XLEaP and tleap, but discussion of this is beyond the scope of this tutorial.) Since we want to "see" graphical representations of our models we will use XLEaP in this tutorial. (If you were to do this step remotely, on the cluster, you would use tleap.)

The approximate ordering of this section is as follows:

1 Where do I get the coordinates?

2 What representation should I use and what should I simulate? And discussion of issues to consider

before starting...

3 Building the prmtop and inpcrd files... This will be done using XLEaP.

2.1) Generating the coordinates of the model structure

The first step in any modeling project is developing the initial model structure. Although in principle, one

could use XLEaP to build a model structure by hand this is only practical for the smallest of systems. The

difficulty in both manipulating and predicting the structure of large biomolecular systems means that

building a structure by hand is not usually a sensible undertaking. Instead, experimentally determined

structures are used. These can be found by searching through databases of crystal or NMR structures

such as the Protein Data Bank or the Cambridge Structural Database. With nucleic acids, users can also

search the Nucleic Acid Database.

When experimental structures are not available all hope is not lost since there are a variety of programs

that facilitate building model structures using homology modeling and predictive techniques; the list of

possible sources is beyond the scope of this tutorial. However, it is worth mentioning that for nucleic acid

structure prediction Dave Case and Tom Macke, formerly of The Scripps Research Institute have

developed the NAB molecular manipulation language, which facilitates the building of complex nucleic

acid structures. NAB is now included as part of the AmberTools package (see http://www.ambermd.org/

for more info). Numerous methods also exist for predicting the structure of proteins but in general such

structure prediction is still in its infancy. Thus a good experimental structure is typically preferred. If,

however, a predicted protein starting structure is all that is available it should be noted that these

typically require more elaborate minimization and equilibration procedures prior to production dynamics

simulations than do structures found by experimental methods.

Hurdle #1: Note: as pointed out in the later tutorials, sometimes dealing with Brookhaven PDB files (as

provided by the Protein Data Bank) can be rather tricky due to variations in naming conventions. The

naming, and often formatting issues, that "break" the programs are the first hurdle that must be

overcome in order to perform detailed molecular simulation. Generally within AMBER, if you are using

standard amino acid or nucleic acid residues, at most all that is necessary is some slight massaging of

the PDB format and atom names. If, however, you wish to simulate complex carbohydrates, lipids or non-

standard protein residues then you may be required to develop parameters and topology information.

This, however, is a more advanced issue that will be deferred until later tutorials.


2.1.1) Creating our DNA duplex using NAB

Since there is not an experimentally determined structure for our 10-mer DNA duplex, one has to be

created from scratch. A good resource is w3dna.rutgers.edu, which allows one to create a wide variety of

nucleic acid structures. Here, we will use the fd_helix() routine in NAB (part of AmberTools).

A note on force fields

A number of different force fields are supplied with AMBER. In AMBER v5.0 and v6.0 the default field was

the Cornell et al. (1995) or parm94.dat force field (referred to as FF94 in AMBER v8.0 and later). The

AMBER v10.0 force fields recommended for the simulation of proteins and nucleic acids in explicit solvent

are either the FF99SB or FF03 force fields which contain several improvements over the FF94 force field.

The most notable changes are new partial charges for proteins based on DFT quantum calculations in continuum solvent (FF03), as well as updated torsion terms for Phi-Psi angles (FF99SB), which reduce the overestimation of alpha helices that occurs when using the FF94, FF96 and FF99 force fields. It should be noted, however, that the FF99SB/FF03 force fields do not introduce any changes for nucleic acids.

The charges are still based on HF gas phase ab initio quantum calculations and the bond angle and

dihedral parameters are the same as the FF99 force field hence FF99SB and FF99 can be considered

equivalent in this context. In this tutorial we will be using the FF99SB all-atom force field. More details on

the available force fields can be found in the AMBER manual and the papers referenced within.

With the FF99SB force field, phosphates are considered to be part of the nucleotide (as is standard).

Because the terminal groups of DNA are different (i.e. 3'- and 5'- terminal residues), the names have to

be different to associate the appropriate topology (i.e. list of bonds, atoms, etc.) and parameters. For

DNA, the residue names used are DA5, DA and DA3 for a 5' terminal adenine, a non terminal adenine, or

a 3' terminal adenine respectively. [A single isolated adenine nucleotide is DAN.]

So, with this in mind we can construct the input file required by NAB to build our 10-mer polyA-polyT DNA

duplex in the Arnott B-DNA canonical structure. This input file is given below. Basically, this is building

two strands of A-T paired DNA. For more specific information about the various options, see the manual

at http://ambermd.org/doc12/

molecule m;

m = fd_helix( "abdna", "aaaaaaaaaa", "dna" );

putpdb( "nuc.pdb", m, "-wwpdb");

You should copy this text to a file named nuc.nab in your my_amber_1 directory. To do this open the TextWrangler application (in your applications folder). We will use TextWrangler because it allows you to save in a format that is compatible with unix. Make sure that the blinking cursor on the yellow line sits on the line following the last character you typed, that is, the file should end with a return (this should be the case for all the documents you make as input for Amber). Note that all text that you will need to copy to a file is within a box.
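If you prefer to stay in the terminal, the same file can be written with a "here document", as sketched below. This is only an alternative to TextWrangler, not the method the lab asks for; the three NAB lines are copied from the box above, and the variable names last_byte and line_count are made up for the check at the end.

```shell
# Write the three NAB lines from the handout into nuc.nab.
# The quoted 'EOF' keeps the shell from altering anything inside the block.
cat > nuc.nab <<'EOF'
molecule m;
m = fd_helix( "abdna", "aaaaaaaaaa", "dna" );
putpdb( "nuc.pdb", m, "-wwpdb");
EOF

# Check that the file ends with a return: od shows the last byte, which should be \n.
last_byte=$(tail -c 1 nuc.nab | od -An -c | tr -d ' ')
line_count=$(wc -l < nuc.nab | tr -d ' ')
echo "lines: $line_count, last byte: $last_byte"
```

A file written this way always ends with a return, which is the same condition the blinking-cursor check is meant to guarantee.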

Running NAB

To run NAB, you just type this:

$AMBERHOME/bin/nab nuc.nab (and then hit return! – this is the only time I will remind you to do so).

Then type

./a.out

The first line uses nab (located in $AMBERHOME/bin/; its full location is /Applications/Amber/amber12/bin, since $AMBERHOME is a shortcut for /Applications/Amber/amber12) to compile the “script” described in your nuc.nab input file into a little program, a.out, which is executed when you type the second line.

These commands produce a nuc.pdb file, which is the model structure of our DNA duplex. It contains the

Cartesian coordinates of all of the atoms in the duplex in locations determined from fibre-diffraction data.

The file should consist of 640 lines with a single line for each atom, 638 atoms in total as well as two lines

containing the word TER, one after the first 10mer chain and one at the end:

From TextWrangler open nuc.pdb. (I know, it is not very exciting…). Make sure you close the document.
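Rather than counting lines by eye, you can let the terminal do it. The sketch below defines a small helper, count_pdb (a name invented here), and tries it on a tiny made-up file so that it runs even without nuc.pdb. It assumes that atom lines begin with the word ATOM, as they do in standard PDB files.

```shell
# count_pdb FILE: print total lines, ATOM records, and TER records.
count_pdb() {
    total=$(wc -l < "$1" | tr -d ' ')
    atoms=$(grep -c '^ATOM' "$1")
    ters=$(grep -c '^TER' "$1")
    echo "$total lines, $atoms ATOM records, $ters TER records"
}

# Tiny made-up file (NOT real coordinates) just to show the function working.
demo=$(mktemp)
printf 'ATOM      1\nATOM      2\nTER\nATOM      3\nTER\n' > "$demo"
summary=$(count_pdb "$demo")
echo "$summary"

# On the real file you would run:  count_pdb nuc.pdb
# and, per the handout, expect 640 lines, 638 ATOM records and 2 TER records.
```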

2.1.2) Loading the structure into LEaP

The next step is to take a look at the model structure. It is always a good idea to look at the models

before trying to use them. In this way problems can often be identified before running expensive

calculations. XLEaP works fine for displaying models assuming that the appropriate residue definition files

are loaded into XLEaP and the residue names in the pdb file are consistent with what XLEaP expects (in

other words, it does not always work). Alternatively a range of freely available and commercial packages

exist for viewing pdb files. A very good program, although not the only choice, is VMD

(http://www.ks.uiuc.edu/Research/vmd/), which is freely available for academic research and should be

installed on your computers. For the moment we will stick to using XLEaP since this is what we will be

using to create the input files for our simulations. Other methods of visualization will be covered in later

sections of this tutorial.

Let's take a look at our DNA model. The first step is to start up the graphical version of XLEaP. To do this, while in my_amber_1, type after the prompt

$AMBERHOME/bin/XLEaP -s -f $AMBERHOME/dat/leap/cmd/leaprc.ff99SB

($AMBERHOME points the computer to where the AMBER files are; -s -f precedes the name of the force field file, in this case ff99SB)

This should load XLEaP and you should see something like this appear:

A quick note on the command line used to start XLEaP: The command line shown above contains a

couple of options, which are worthy of comment at this point. When XLEaP loads it initially has to open a

series of library and parameter files that define the force field parameters to be used and the residue

maps etc. Since AmberTools ships with a range of different force fields, each suited to different types of

simulation, it is important to tell XLEaP which force field we wish to use. This is what the command line

options used above are for. The "-s" switch tells XLEaP to ignore any user defined defaults which might

otherwise override our selection, while the "-f $AMBERHOME/dat/leap/cmd/leaprc.ff99SB" switch tells

XLEaP to execute the start-up script for the FF99SB force field. This script contains commands that cause

XLEaP to load all of the configuration files required for the AMBER FF99SB force field. If you look in the

$AMBERHOME/dat/leap/cmd/ directory you will find a number of different leaprc files such as leaprc.ff03

(FF03 force field), leaprc.ff02ep (FF02 Polarizable force field with lone pairs) etc.

XLEAP MENU BUG: Note: if you find that the menus in XLEaP are not working, please check that your numlock light is turned off. For some reason having numlock on prevents the XLEaP menus from operating correctly.

To load a pdb into XLEaP we will use the loadpdb command. This will create a new unit in XLEaP and load the specified pdb into that unit. We can then subsequently view and edit the new unit using the edit command.

So, to load the pdb file into a new unit called "model" type the following in the XLEaP window (make sure

the XLEaP window is highlighted and your mouse cursor is within the window):

model

then in the XLEaP menu bar choose File/Load PDB file and select nuc.pdb

The following informational message should appear:

To see the structure, in XLEaP, use the edit command and specify the unit you wish to edit, in this case

the one we just created "model": edit model

This should open the XLEaP editor window and display something similar to the following (the initial

molecular orientation may differ from that shown below):

If you have a mouse, within this window the LEFT mouse button allows atom selection (drag and drop, and break! Be careful), the RIGHT mouse button translates the molecule, and control-RIGHT mouse button zooms it. To rotate press the control key and hold your mouse (but do not use the mouse buttons). If you selected something and then broke your DNA, go to Unit and close the window, and then remake model (using loadpdb again). Make sure to close the window by selecting Unit -> Close from the top menu [Do not use the X button as it will quit all of XLEaP]. (If the menus don't work turn off NumLock.)

If you played with selecting atoms using the left mouse button you can unselect a region by holding down

the shift key while drawing the selection rectangle. To select everything, double click the LEFT button

(and to unselect, do the same while holding down the shift key).

Take a look at the structure and copy a pretty view into your report. Write a nice caption for your figure

(you should do this every time you copy a picture – this is the only time that I will ask you to do so

explicitly). You should be able to see the perfect symmetry of canonical B-form geometry DNA. The

perfect symmetry of canonical duplexes is based on analysis of long fibers of DNA. Real nucleic acids

don't necessarily adopt this perfect symmetry as will become apparent once we start to carry out

molecular dynamics on this 10-mer.

Note: for a list of available XLEaP commands type "help" in the main XLEaP window. For help on a

specific command type "help command". E.g. for help with loadpdb you should get the following on

typing help loadpdb:

Now, let's keep going and assume everything was set up properly; the distraction above was simply to give you a little insight into what goes on behind the scenes... Remember, using software as if it were a black box is dangerous, especially in research.

2.2) What level of simulation am I going to attempt?

Once you have got a suitable model structure the next step is to decide what level of simulation realism is to be used. The complexity of the calculation centers on the evaluation of the pair-wise non-bonded and Coulombic interactions. Extra complexity is introduced by using periodic boundaries and Ewald methods to treat long ranged electrostatics and/or by evaluating non-additive effects such as induced polarization.

Water is an integral part of nucleic acid structure and thus some representation of solvent effects is fairly

critical. Simulations in vacuo have been performed where the screening of solvent is modeled by distance

dependent or sigmoidal dielectric functions (the latter of which is not implemented in AMBER 10.0).

Additionally tricks have been applied to keep the base pairs from fraying, through the addition of Watson-

Crick base pair restraints and the reduction of the charges on the phosphate groups. Newer versions of

AMBER (6.0 and above) contain the generalized Born model for implicit solvation which, although more

expensive, provides a much better implicit solvent representation than simply using a distance

dependent dielectric constant.

However, even with advances in computer power and methodological improvements, such as application

of Ewald methods, which allow routine simulations of nucleic acids with explicit solvent and counterions

in the nanosecond time range, there is still dependence of the results on the molecular mechanical force

field. It is therefore important to understand the inadequacies of the force field being used.

Now, back to the point of this discussion, if you can afford it, include the solvent and the explicit net-

neutralizing counterions. Also pay attention to the force field applied and be aware of its limitations. Use

methods that properly treat long-range electrostatic interactions, such as Ewald methods, if you can.

However, remember that adding explicit water is expensive. While a nanosecond or so of in vacuo DNA

simulation can take only minutes on a 3GHz P4, adding a periodic box of water that surrounds the DNA

by roughly 10 angstroms, extends the simulation to several days. Given that it is normally necessary to

run the simulation a couple of times (due to errors, sampling issues, etc.), these simulations can get very

costly.

2.2.1) The types of simulations to be run in this tutorial

For the purpose of this tutorial we will build 3 different DNA models: an in vacuo model of the poly(A)-poly(T) structure (named polyAT_vac), an in vacuo model of the poly(A)-poly(T) structure with explicit counterions (named polyAT_cio), and a TIP3P (water) solvated model of the poly(A)-poly(T) structure in a periodic box (named polyAT_wat). The in vacuo model will be applied in simulations to get a feel for MD and then the solvated model will be used for periodic boundary simulations using a particle mesh Ewald treatment. The in vacuo model with the explicit ions will not be used for simulation but it is a good idea to build it in case it is needed for later analysis.


In order to simplify post simulation analysis of the trajectories it is useful to have all three sets of prmtop files. This is because, in the analysis of the trajectory, displaying the solvent is often not necessary, and the visualization packages will run much faster if the solvent is removed from the trajectory file before loading. Obviously the water is necessary for calculating radial distribution

functions, analyzing water structure, and other properties, however it isn't necessary for calculating

helicoidal parameters, determining average structures, etc. Therefore, to minimize disk space usage, and

speed the analysis, we often strip the water and/or counterions. The three separate prmtop files are

useful to have around since you need to use a prmtop that matches the structure of your (possibly

stripped) trajectory for programs such as ptraj, rdparm, VMD etc.

2.3) Building the prmtop and inpcrd files

Now that we have a starting pdb (nuc.pdb) and an understanding of some of the issues surrounding

different types of classical MD simulations we are ready to start building the input files necessary for the

MD engine in AMBER 10.0, sander.

The first step is the building of residues. Many proteins contain coenzymes as well as standard amino

acids. These coenzymes are not normally pre-defined in the AMBER database and so are considered to be

non-standard residues. It is necessary to provide structural information and force field parameters for all

of the non-standard residues that will be present in your simulation before you can create the sander

input files. Fortunately, if you are using standard nucleic acid or amino acid residues, as we will be in this

tutorial, this step is not necessary since all of the residues are pre-built in the AMBER database. Later

tutorials will cover what to do if you have non-standard residues.

2.3.1) LEaP

To create our first prmtop and inpcrd files, we simply issue the following command in the main XLEaP

window:

(you will need to type the following line in XLEaP; copy and paste does not work!)

saveamberparm model polyAT_vac.prmtop polyAT_vac.inpcrd

(that is, save amber parameters for our vacuum “model” by creating a parameter file, polyAT_vac.prmtop, and an input coordinate file, polyAT_vac.inpcrd)

This command should create the files: polyAT_vac.prmtop and polyAT_vac.inpcrd and give you the

following output (the warning concerns the fact that we did not neutralize our system - more on this

later):

To remind you about the inpcrd and prmtop files:

• prmtop: The parameter/topology file. This defines the connectivity and parameters for our current

model. This information is static, or in other words, it doesn't change during the simulation. The

prmtop we created above is called polyAT_vac.prmtop.

• inpcrd: The coordinates (and optionally box coordinates and velocities). This data is not static and changes during the simulations (although the file is unaltered). Above we created an initial set of coordinates called polyAT_vac.inpcrd.

Now we want to create a topology that has explicit net neutralizing counterions. There are a number of

different ways to add ions to a structure. In this example we shall use the addions command

implemented in XLEaP. This method works by constructing a Coulombic potential on a 1.0 angstrom grid

and then placing counterions one at a time at the points of lowest/highest electrostatic potential. The

command to do this is as follows (the '0' means 'neutralize'): addions model Na+ 0

This should add a total of 18 sodium cations to counteract the -18 charge of the DNA duplex. The output

from this command should be similar to the following:

Note: you should always check that the number of ions you were expecting have actually been added. It

is also a good idea to view the new structure to ensure that the charges have been placed as intended.
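As a quick sanity check on the expected ion count, the arithmetic can be sketched in Python. This is only an illustration: the function name is hypothetical, and it assumes that each strand carries one -1 phosphate per internucleotide linkage and no terminal phosphates, which is consistent with the -18 charge quoted above for the 10-mer duplex.

```python
def expected_monovalent_counterions(n_base_pairs, n_strands=2):
    """Number of +1 ions (e.g. Na+) needed to neutralize a DNA duplex.

    Assumption (not taken from LEaP itself): each strand of n nucleotides
    has n - 1 phosphate groups, each with charge -1, and no terminal
    phosphates.
    """
    if n_base_pairs < 1:
        raise ValueError("need at least one base pair")
    phosphates_per_strand = n_base_pairs - 1
    total_charge = -phosphates_per_strand * n_strands
    # One +1 ion is needed per unit of negative charge.
    return -total_charge


if __name__ == "__main__":
    print(expected_monovalent_counterions(10))
```

Compare the number printed here with the number of ions that XLEaP reports having added.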

By editing the "model" we can see where the ions have been added: edit model

Copy a cute view of the DNA and its counterions to your report

Now we are once again ready to write the prmtop and inpcrd files, this time for our neutralized system:

saveamberparm model polyAT_cio.prmtop polyAT_cio.inpcrd

Output files: polyAT_cio.prmtop, polyAT_cio.inpcrd

(note how the names of the files now include cio, for counter ions, as opposed to vac)

The final input files to create are for solvated DNA with explicit counterions. We have our "model" unit

already built with counterions so the next step is to solvate it with explicit water. This is done with the

command "solvatebox". For our DNA, we will put an 8 angstrom buffer of TIP3P water around the DNA in

each direction. In this way all atoms in the DNA starting structure will be no less than 8 angstroms from

the edge of the water box. Before we do this, however, for reasons that will become clear later we should

create a copy of our model and call it model2: model2 = copy model

To create a rectangular box of water around the DNA type: solvatebox model TIP3PBOX 8.0

This results in the following output (exact numbers may be slightly different due to round off differences

between different computer architectures),

editing the "model" (edit model) should show you the DNA in a water box:

The above output shows us that XLEaP added a total of 2638 water molecules to form a rectangular box

of 44.5 x 46.0 x 58.9 angstroms (120593.3 cubic angstroms). This is not cubic since DNA is a cylindrical

molecule. An issue here is that the DNA could rotate (via self diffusion) so that its long axis lies along

a short box dimension. Since this box is infinitely repeated in space by the periodic boundary method,

that would bring the ends of the DNA near their periodic images. One way to get around

this would be to make the box cubic, or 58.9 x 58.9 x 58.9 angstroms, by specifying a list of numbers to

the solvateBox command to force this to be cubic. However, this will add significantly more water to the

calculation and slow it down tremendously. Alternatively we can use a different shape box of water. While

a rectangular box is the obvious choice for tessellating in 3 dimensional space it is not the only shape

that can be replicated in 3 dimensions. A more efficient shape to use, in terms of reducing the problem of

solute rotation, and the one we will be using for this tutorial, is a truncated octahedron:

To add a truncated octahedral box of water around our DNA we use the solvateoct command. Since in

the course of this demonstration we have already solvated our "model" with a rectangular box of water

we shall use the copy we made "model2". Enter the following in XLEaP to create the water box:

solvateoct model2 TIP3PBOX 8.0

This should give the following output:

Editing "model2" allows you to view the truncated octahedron water box: edit model2

There you have it, a truncated octahedral shaped ice cube...
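To put rough numbers on the box-shape trade-off discussed above, the following Python sketch compares the rectangular box that XLEaP reported with the cubic box obtained by using the longest edge in all three directions. The function names are hypothetical. Note that the product of the rounded edge lengths differs slightly from the volume XLEaP printed, so treat the result as an estimate only.

```python
def box_volume(a, b, c):
    """Volume (cubic angstroms) of a rectangular box with edges a, b, c."""
    return a * b * c


def volume_ratio(rect_edges, cube_edge):
    """How many times larger a cube of edge cube_edge is than the given box."""
    return box_volume(cube_edge, cube_edge, cube_edge) / box_volume(*rect_edges)


if __name__ == "__main__":
    rect_edges = (44.5, 46.0, 58.9)   # box dimensions reported above
    ratio = volume_ratio(rect_edges, 58.9)
    print("cubic box is about {:.2f} times larger".format(ratio))
```

Since the number of water molecules scales roughly with the volume, this ratio gives a feel for how much extra solvent a cubic box would add.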

Copy a cute view of the solvated DNA with its counter ions in the octahedral cell to your report

Once again we save our AMBER prmtop and inpcrd files:

saveamberparm model2 polyAT_wat.prmtop polyAT_wat.inpcrd

Output files: polyAT_wat.prmtop, polyAT_wat.inpcrd

Now that we have our input files we can progress to the next section that introduces running minimization

and molecular dynamics.

But before doing that go back to the terminal window (phew, now you can copy and paste if you want)

and list your files. (if you do not remember how to do this go back to the beginning for a hint! Or simply

open the folder you made). Copy the list into your report and describe all the files in your directory in one

or two sentences.

3) Running Minimization and Molecular Dynamics (in vacuo)

This section will introduce sander and show how it can be used for minimization and molecular dynamics

of our previously created DNA models. We will initially run our simulations on the in vacuo model and

analyze the results before moving on to running simulations with implicit and explicit solvent. For this

section of the tutorial we shall be using the in vacuo prmtop and inpcrd files we created previously

(polyAT_vac.prmtop and polyAT_vac.inpcrd).

This section of the tutorial will consist of 3 stages:

1 Relaxing the system prior to MD

2 Molecular dynamics at constant temperature

3 Analyzing the results

3.1) Relaxing the System Prior to MD

In the previous section we used NAB to build us a starting structure. Since this "default" geometry may

not correspond to the actual minima in the force field we are using and may also result in conflicts and

overlaps with atoms in other residues, it is always a good idea to minimize the locations of these atoms

before commencing molecular dynamics. Failure to successfully minimize these atoms may lead to

instabilities when we run MD.

So, given the in vacuo prmtop (polyAT_vac.prmtop) and inpcrd (polyAT_vac.inpcrd) files we created,

we will now use sander to conduct a short minimization run. Since we just want to "fix up" the positions

of the atoms in order to remove any bad contacts that may lead to unstable molecular dynamics we will

run a short (500 steps) minimization. This will take us towards the closest local minimum. Minimization with

sander will only ever take you to the nearest minimum; it cannot cross transition states to reach lower

minima. This is fine for our purposes, however, since all we want to do is remove the largest strains in the

system.

The basic usage for sander is as follows:

sander [-O] -i mdin -o mdout -p prmtop -c inpcrd -r restrt

[-ref refc] [-x mdcrd] [-v mdvel] [-e mden] [-inf mdinfo]

• Arguments in []'s are optional

• If an argument is not specified, the default name will be used.

• -O    overwrite all output files (the default behavior is to quit if any output files already exist)

• -i      the name of the input file (which describes the simulation options), mdin by default.

• -o     the name of the output file, mdout by default.

• -p     the parameter/topology file, prmtop by default.

• -c     the set of initial coordinates for this run, inpcrd by default.

• -r     the final set of coordinates from this MD or minimization run, restrt by default.

• -ref  reference coordinates for positional restraints, if this option is specified in the input file, refc by

default.

• -x    the molecular dynamics trajectory file (if running MD), mdcrd by default.

• -v    the molecular dynamics velocities file (if running MD), mdvel by default.

• -e    a summary file of the energies (if running MD), mden by default.

• -inf  a summary file written every time energy information is printed in the output file for the current

step of the minimization of MD, useful for checking on the progress of a simulation, mdinfo by default.
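The options listed above can be assembled programmatically. The following Python sketch builds a sander command line from file names. The helper name is hypothetical and only the flags described above are used; the file names in the usage example are the ones used for the minimization later in this section.

```python
def build_sander_command(mdin, mdout, prmtop, inpcrd, restrt,
                         mdcrd=None, overwrite=True,
                         sander="$AMBERHOME/bin/sander"):
    """Return the sander command line as a list of arguments.

    Only flags described in the list above are used. The trajectory
    flag (-x) is added only when a trajectory file name is given.
    """
    args = [sander]
    if overwrite:
        args.append("-O")          # overwrite existing output files
    args += ["-i", mdin,           # simulation options
             "-o", mdout,          # human-readable output
             "-p", prmtop,         # parameter/topology file
             "-c", inpcrd,         # starting coordinates
             "-r", restrt]         # final coordinates
    if mdcrd is not None:
        args += ["-x", mdcrd]      # MD trajectory
    return args


if __name__ == "__main__":
    cmd = build_sander_command("polyAT_vac_init_min.in",
                               "polyAT_vac_init_min.out",
                               "polyAT_vac.prmtop",
                               "polyAT_vac.inpcrd",
                               "polyAT_vac_init_min.rst")
    # Print the command so it can be pasted into the terminal.
    print(" ".join(cmd))
```

Printing the command rather than executing it keeps the sketch independent of any particular AMBER installation.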

3.1.1) Looking at the mdin input file

Now that we have the prmtop and inpcrd files from XLEaP, all we need to run sander is the mdin file

which specifies the myriad of possible options for this run.

Note (but do not worry) that the run time input to control sander is via "namelist" variables (for more info

see the manual) specified in the mdin file. For example &cntrl is the name of the list that contains the

input value for the variable IMIN, NCYC and MAXCYC:

Test run 1
 &cntrl
     IMIN = 1, NCYC = 250, MAXCYC = 500
 /

In the absence of a variable specification in the input file, default values are chosen; every specified

variable, except the last one, needs to be followed by a comma. The sander manual describes all of these

inputs, for each of the possible namelists. Which namelist is used depends on the specification above,

such as &cntrl (see example above) or &ewald. At a minimum the &cntrl namelist must be specified. Also

notice the space or empty first column before specification of the namelist control variable; this is

necessary. It is also necessary to end each namelist with a forward slash /. Other namelists (such as the

&ewald namelist mentioned above) are optional. After the namelist some other information may be specified, such as

"GROUP" input, which allows atom selections for restraints. Note that the input variables and namelists

have changed somewhat from earlier versions of AMBER. Refer to the sander manual for more information.
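The formatting rules just described (a title line, a leading space before the namelist name, a comma after every variable except the last, and a closing forward slash) can be captured in a few lines of Python. This is a minimal sketch with a hypothetical function name; it writes only a single &cntrl namelist and does no validation of the variable names.

```python
def format_mdin(title, cntrl):
    """Return the text of a minimal sander input file.

    title : first line of the file (free text)
    cntrl : dict of &cntrl variable names and values, in the order
            they should appear
    """
    items = list(cntrl.items())
    lines = [title, " &cntrl"]          # note the leading space
    for index, (name, value) in enumerate(items):
        is_last = index == len(items) - 1
        # Every variable except the last one is followed by a comma.
        lines.append("  {} = {}{}".format(name, value, "" if is_last else ","))
    lines.append(" /")                  # the namelist ends with a slash
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_mdin("Test run 1", {"imin": 1, "ncyc": 250, "maxcyc": 500}))
```

Generating the file this way makes it harder to forget a comma or the closing slash when you later create several similar input files.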

Next we will build a minimal input file for performing minimization of our DNA. In theory it would

probably be best to run a dual stage minimization where we initially use position restraints on all the

heavy atoms so that in the first stage of minimization only the hydrogens that XLEaP added are

minimized. Then in the second stage we allow minimization of all atoms in the system. Since our system

is fairly small and simple it should be fine to skip the first stage and just minimize everything. An

example of such a two-stage minimization approach will be given when we run simulations on our more

complex solvated model in the next section.

Use textwrangler to create an input file (within your amber folder) by copying the text below to a file

named polyAT_vac_init_min.in. This is the minimal input file for performing minimization of our DNA:

polyA-polyT 10-mer: initial minimization prior to MD
 &cntrl
  imin = 1,
  maxcyc = 500,
  ncyc = 250,
  ntb = 0,
  igb = 0,
  cut = 12
 /

Copy this file into your report. Now let’s talk about it. To turn on minimization, we specify IMIN = 1. We

want a fairly short minimization since we don't actually need to reach the minima, just move away from

any local maxima, so we select 500 steps of minimization by specifying MAXCYC = 500. Sander supports

two different algorithms for minimization, steepest descent and conjugate gradient. The steepest descent

algorithm is good for quickly removing the largest strains in the system but converges slowly when close

to a minimum. Here the conjugate gradient method is more efficient. The use of these two algorithms can

be controlled using the NCYC flag. If NCYC < MAXCYC sander will use the steepest descent algorithm for

the first NCYC steps before switching to the conjugate gradient algorithm for the remaining (MAXCYC -

NCYC) steps. In this case we will run an equal number of steps with each algorithm so we set NCYC =

250. Since sander assumes that the system is periodic by default we need to explicitly turn this off (NTB

= 0). In this simulation we will be using a constant dielectric and not an implicit (or explicit) solvent

model so we set IGB = 0 (no generalized born solvation model), this is the default so we strictly don't

need to specify this but I will include it here so that we can see what differences in the input file we have

when we switch on implicit solvent later. We also need to choose a value for the non-bonded cut off. A

larger cut off introduces less error in the non-bonded force evaluation but increases the computational

complexity and thus calculation time. 12 angstroms is a good tradeoff so that is what we will use (CUT =

12). We are now ready to run sander for the minimization.
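Before running it, the NCYC/MAXCYC schedule described above can be sketched in Python. The function name is hypothetical and the sketch only reproduces the rule stated in the text (steepest descent for the first NCYC steps, conjugate gradient for the rest); it does not perform any minimization.

```python
def minimization_method(step, ncyc, maxcyc):
    """Return the algorithm used at a given 1-based minimization step.

    Follows the rule described above: steepest descent for the first
    ncyc steps, conjugate gradient for the remaining maxcyc - ncyc steps.
    """
    if not 1 <= step <= maxcyc:
        raise ValueError("step must be between 1 and maxcyc")
    return "steepest descent" if step <= ncyc else "conjugate gradient"


if __name__ == "__main__":
    for step in (1, 250, 251, 500):
        print(step, minimization_method(step, ncyc=250, maxcyc=500))
```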

3.1.2) Running sander for the first time

To run sander, we simply execute the following:

$AMBERHOME/bin/sander -O -i polyAT_vac_init_min.in -o polyAT_vac_init_min.out -c polyAT_vac.inpcrd -p polyAT_vac.prmtop -r polyAT_vac_init_min.rst

(remember that you can copy and paste into your terminal window). This should run very quickly (a

minute, maybe)

Copy this command into your report. Translate the command into words (make sure to include in your

explanation the file names of all the input and output files)

Note how the files that were green before are now red: we produced them as output but they are now

being used as input!

Take a look at the output file produced during the minimization (polyAT_vac_init_min.out). You will see

that the energy dropped considerably between the first and last steps:

In spite of this, however, the structure did not change very much as you will see soon. This is because, as

already mentioned, minimization will only find the nearest local minimum.

3.1.3) Creating PDB files from the AMBER coordinate files

You will want to generate a new pdb file so you can look at the structure using the minimized coordinates

so you can confirm that the structure did indeed not change much. You will do this starting from

polyAT_vac_init_min.rst. A pdb file can be created from the parm topology and coordinates (inpcrd

or restrt) using the program ambpdb.

$AMBERHOME/bin/ambpdb -p polyAT_vac.prmtop < polyAT_vac_init_min.rst > polyAT_vac_init_min.pdb

This will take the specified prmtop (in this case polyAT_vac.prmtop) and either an inpcrd file or an rst

file (in this case the final structure from the minimization, the polyAT_vac_init_min.rst file) and

create (on stdout, standard output) a pdb file. In this case we redirected stdout to the file -

polyAT_vac_init_min.pdb.

The pdb will look like the picture that follows and soon you will look at it using VMD. You will get a good

idea of how to use VMD by following the rapid introduction to VMD page link on the first page of this

document, but you will do that AFTER you send the runs that follow. After you do the “quick tutorial”

copy a figure like the one that follows in your report

Also create a PDB file from the start (polyAT_vac.inpcrd) structure and use VMD to generate a figure

with both structures superimposed, just like the one below (and copy it to your report)

Initial structure = Green, Minimized structure = Blue

In general, users should carefully inspect any starting structures and minimized structures; specifically

check to make sure the hydrogens were placed where you thought they should be, histidines are in the

correct protonation state, the terminal residues are properly terminated, stereochemistry is reasonable,

etc. There is nothing worse than finding out after you've run a nanosecond of solvated dynamics that an

H1' atom was on the wrong side and that you have simulated some strange anomer of DNA!

3.2) Running MD in-vacuo

In section 3.1 we used sander to minimize our system in order to remove any bad contacts introduced by

the hydrogenation step in XLEaP. This led to the creation of the coordinate file named

polyAT_vac_init_min.rst. For the next section of this tutorial we will use this coordinate file as the

starting structure for our in-vacuo MD simulation. The following in-vacuo simulation is designed to give a

flavor of how MD simulations are run. In general one would not run an in-vacuo simulation unless the

system was inherently gas phase. For liquid phase systems, such as our DNA, one would normally include

solvent either explicitly or implicitly. This type of simulation will be covered in section 4. For the time

being we will stick to gas phase simulations since they are simple and quick to run.

To make the calculations tractable for the purposes of this tutorial, we can only run short simulations of

the order of 100 ps. Although these are "short" simulations (in terms of what is likely required to answer

specific research questions), you will see that they are still rather costly, depending on what type of

machine you are using. However, to get a better idea of the issues involved and potential artifacts of the

various models that will be used (for the simulation of DNA) it is probably useful to run through these

examples, either running them yourself or just looking at the output files and trajectories supplied. You

may also want to try some different examples from the two given below, such as a distance dependent

dielectric constant, or see what happens if you increase the dielectric constant to say 80.0.

For this tutorial we will run two in-vacuo simulations and compare the results. The simulations we will run

will be:

1 polyAT_vac_md1_12Acut.in: 12.0 angstrom long range cutoff, dielectric = 1

2 polyAT_vac_md1_nocut.in: no long range cutoff, dielectric = 1

To run a molecular dynamics simulation with sander, we need to turn off minimization (IMIN=0). Since we

are running in-vacuo we also need to disable periodicity (NTB=0) and set IGB=0 since we are not using

implicit solvent. For these two examples we will write information to the output file and trajectory

coordinates file every 100 steps (NTPR=100, NTWX=100). For most simulations writing to the coordinate

file every 100 steps is too frequent: it both consumes excess disk space and can impact performance.

However, this simulation will show some interesting behavior over a short time scale that I would

like you to see and so we will use a value of 100 to ensure that this is suitably sampled. For

multi-nanosecond simulations of stable systems a more suitable value for NTWX would be in the 1000 to 2000

range. The CUT variable specifies the cut-off range for the long-range non-bonded interactions. Here two

different values for the cutoff will be used, one run will be with a cut off of 12 angstroms (CUT = 12.0)

and one run will be without a cutoff. To run without a cutoff we simply set CUT to be larger than the

extent of the system (e.g. CUT = 999). For temperature regulation we will use the Langevin thermostat

(NTT=3) to maintain the temperature of our system at 300 K. This temperature control method uses

Langevin dynamics with a collision frequency given by GAMMA_LN. This temperature control method is

significantly more efficient at equilibrating the system temperature than the Berendsen temperature

coupling scheme (NTT=1) that was the recommended method for older versions of AMBER. The biggest

problem with the Berendsen method is that the algorithm simply ensures that the kinetic energy is

appropriate for the desired temperature; it does nothing to ensure that the temperature is even over all

parts of the molecule. This can lead to the phenomenon of hot solvent, cold solute. To avoid this,

elaborate temperature scaling techniques for slowly heating the molecule over the course of the

simulation were recommended. The Langevin system is much more efficient, however, at equilibrating

the temperature and is now the recommended choice. The efficiency is such that if we have a reasonably

good structure, which we do in this case, we can actually start the system at 300 K and avoid the need to

slowly heat over say 20 ps from 0 K to room temperature. Thus for this simulation we shall set NTT=3

with GAMMA_LN=1. We shall also set the initial and final temperatures to 300 K (TEMPI=300.0,

TEMP0=300.0), which will mean our system's temperature should remain around 300 K. In these two

examples we will run a total of 100,000 steps each (NSTLIM=100000) with a 1 fs time step (DT=0.001)

giving simulation lengths of 100 ps (100,000 x 1fs). Create the following two input files:

polyAT_vac_md1_12Acut.in:

10-mer DNA MD in-vacuo, 12 angstrom cut off
 &cntrl
  imin = 0, ntb = 0,
  igb = 0, ntpr = 100, ntwx = 100,
  ntt = 3, gamma_ln = 1.0,
  tempi = 300.0, temp0 = 300.0,
  nstlim = 100000, dt = 0.001,
  cut = 12.0
 /

polyAT_vac_md1_nocut.in:

10-mer DNA MD in-vacuo, infinite cut off
 &cntrl
  imin = 0, ntb = 0,
  igb = 0, ntpr = 100, ntwx = 100,
  ntt = 3, gamma_ln = 1.0,
  tempi = 300.0, temp0 = 300.0,
  nstlim = 100000, dt = 0.001,
  cut = 999
 /

Copy them to your report and use the explanation above to translate into words three of the parameters

in each input file (a total of six “letter combinations”; I do not want to call NSTLIM a word).

So, to run the two jobs we issue the following two commands. Note that you should run them sequentially

since you are using your local mac machine. Also note we use the restrt file from the minimization as

our starting structure:

$AMBERHOME/bin/sander -O -i polyAT_vac_md1_12Acut.in -o polyAT_vac_md1_12Acut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_12Acut.rst -x polyAT_vac_md1_12Acut.mdcrd

$AMBERHOME/bin/sander -O -i polyAT_vac_md1_nocut.in -o polyAT_vac_md1_nocut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_nocut.rst -x polyAT_vac_md1_nocut.mdcrd

Note that we now have an additional option, the -x option that specifies the name of the output file for

the MD trajectory. This file will contain a snapshot of the entire system's Cartesian coordinates every

NTWX (100) steps.
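The relationship between NSTLIM, DT and NTWX determines how long the simulation is and how many snapshots end up in the trajectory file. The following Python sketch (hypothetical function name, values taken from the input files above) works through that arithmetic.

```python
def trajectory_summary(nstlim, dt_ps, ntwx):
    """Return (total time in ps, number of frames, ps between frames).

    nstlim : number of MD steps
    dt_ps  : time step in picoseconds (DT in the input file)
    ntwx   : a snapshot is written every ntwx steps
    """
    total_time_ps = nstlim * dt_ps
    n_frames = nstlim // ntwx
    frame_spacing_ps = ntwx * dt_ps
    return total_time_ps, n_frames, frame_spacing_ps


if __name__ == "__main__":
    total, frames, spacing = trajectory_summary(100000, 0.001, 100)
    print("length: {:.1f} ps, frames: {}, spacing: {:.1f} ps".format(
        total, frames, spacing))
```

The same arithmetic tells you how much simulated time a partially completed run covers: multiply the number of frames that were written by the spacing between frames.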

Please be aware that the nocut simulation will likely stop after around 22,900 steps. This is not a bug, it

is a problem with simulating DNA in vacuo which will become clear when we visualize the trajectory files.

More on this later.

These will take considerably longer to run than the minimization, so it is probably a good time to go off

and get a cup of coffee. You can follow the progress of the job by opening another terminal window and

following the output file with the following command: tail -f polyAT_vac_md1_12Acut.out

While you wait you should go ahead and learn how to use VMD.

3.3) Analyzing the results

So, now we've run the simulations what do we want to look at?

We definitely want to look at movies (which is fun :-) ). We will use vmd to load up the "parm and crd"

(i.e. AMBER prmtop and MD trajectory) and use the animation controls to view it, more on this later. We

also want to calculate RMSd (root mean square deviation) vs. time and also extract the various energies

as a function of time to plot total, kinetic, potential energies etc.

3.3.1) Visualizing the trajectories with VMD

Let's take a look at it in vmd:

Run vmd (based on vmd 1.9.1):

Open our 12 Å cutoff trajectory file. Select: File -> New Molecule.

VMD supports multiple trajectory files for a single molecule. Thus a molecule is defined by loading the

associated prmtop file. So, start by selecting Browse and finding the polyAT_vac.prmtop file.

Then under the heading where it states "Determine file type:" check that amber7 parm appeared.

Then hit 'Load'.

Next we need to choose which trajectory to load, in this case we will load the 12 Å cutoff trajectory file

polyAT_vac_md1_12Acut.mdcrd you created. Select browse again, browse for the file and then under

'determine file type' make sure you have AMBER Coordinates. Note: if this were a periodic boundary

simulation then we would select "crdbox" since the trajectory file would also contain information on the

box size. In this situation, however, there is no box information since this is a non-periodic in vacuo

simulation. When you hit "Load" again you should see all of the frames loaded into the main molecule

window. 1,000 frames in total.

We can now use the playback tools in the "VMD main" control panel to play our movie:

You should have a go at playing the video of the trajectory. Notice how the DNA holds its secondary

structure: there is considerable movement in the extremities but the overall structure is preserved.

Is this the correct answer though? Just because a trajectory is stable doesn't necessarily mean it is

correct. Is a strand of charged DNA in vacuum really likely to be stable? In solvent such electrostatic

repulsions are shielded by the solvent but in vacuum there is no such shielding and no external forces to

help hold the chains together. Let's take a look at the trajectory obtained from our no cutoff simulation.

Stop displaying the existing molecule by double clicking on the D associated with the 12 Å cutoff

trajectory and repeat the process as described above but this time select the no cut off trajectory file

polyAT_vac_md1_nocut.mdcrd. This time when you hit "Load" VMD should load the trajectory up until the

point where the simulation crashed (229 frames of 0.1ps each = 22.9 ps). Have a look at the trajectory;

the difference from the last simulation should be obvious. The instability of the DNA dimer is clear.

View the DNA dimer from the top during the movie of the MD trajectory since it is slightly easier to see

what is happening, the large repulsive charge on the two chains is causing them to uncoil and move

away from each other. You can see that the two chains have, due to the large electrostatic repulsion

forces, rapidly separated from each other and have started to drift apart. The simulation stopped when

the distance between the two strands exceeded pre-determined parameters within the code.

So, which simulation is the correct one? Well, since this simulation was in vacuo and we had no

neutralizing ions the conditions did not really represent laboratory conditions. Indeed in this "harsh"

environment, with no clustered water or ions, it is likely that the DNA 10-mer is going to be unstable and

so the behavior shown by our no cut off simulation is most likely the closest to reality.

Let's think about what is going on more carefully. Our DNA molecule consists of two chains held together

by hydrogen bonds. Each chain has a net charge of -9 electrons. So, our chains have a large electrostatic

repulsion as shown by the value of EELEC in the no cutoff simulation (EELEC = 1349.1697). Thus the two

chains are repelling each other. For the system to be stable the interaction between the chains, largely

due to hydrogen bonding, must counteract this large electrostatic repulsion. Now, in the case where we

used a 12 Å cutoff, interactions between charges more than 12 Å apart were assigned zero energy. Since the

average distance between the negative charges on the DNA backbone, due to the phosphate groups, is

15 Å the repulsion between the phosphate atoms on opposite chains was ignored when the 12 Å cutoff

was employed. The electrostatic attraction due to hydrogen bonding between opposite bases is due to a

much closer interaction, of the order of 2 angstroms. Thus while the main electrostatic repulsion

interaction was excluded when we used the 12 angstrom cutoff, the main attractive force between the

two chains was included. Thus during the simulation the chains held together and we obtained a stable

trajectory.
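The effect of the cutoff on a single pair of charges can be illustrated with a few lines of Python. This is a simplified sketch, not sander's actual non-bonded routine: it assumes a plain truncation (zero energy beyond the cutoff), point charges in units of the electron charge, a dielectric constant of 1, and an approximate conversion factor of 332.06 kcal·Å/(mol·e²). The function name is hypothetical.

```python
# Approximate Coulomb conversion factor in kcal*angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.06


def coulomb_energy(q1, q2, r, cutoff=None):
    """Coulomb energy (kcal/mol) of two point charges r angstroms apart.

    Assumption: a simple truncation scheme, in which pairs separated by
    more than `cutoff` contribute exactly zero. cutoff=None means that
    no cutoff is applied.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    if cutoff is not None and r > cutoff:
        return 0.0
    return COULOMB_CONSTANT * q1 * q2 / r


if __name__ == "__main__":
    # Two phosphate charges (-1 each) on opposite strands, 15 angstroms apart.
    print("with 12 A cutoff:", coulomb_energy(-1.0, -1.0, 15.0, cutoff=12.0))
    print("without cutoff:  ", coulomb_energy(-1.0, -1.0, 15.0))
```

With the cutoff, the repulsion between the two phosphates vanishes entirely, while any interaction at a separation of a couple of angstroms is unaffected. This is the imbalance described above.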

The take home lesson here is that you should think very carefully about what you are simulating. Are you

really simulating realistic conditions? How are the parameters you have chosen biasing your results? A

cutoff can be a good way to increase the speed of a simulation, but you need to be aware that it can

introduce very large artifacts into your simulation. So, think very carefully, and try out several scenarios

before you try to reach firm conclusions.

Summarize in your report the differences in the results of both runs (include one or two pictures from

each) and the reasons behind these results.

One way to improve considerably on our in vacuo simulations is to make our physical model of DNA much

closer to reality, i.e. include explicit neutralizing ions and also to include solvent effects, either implicitly

within our model or via the use of explicit solvent. This is the subject of next week's tutorial.

3.3.2) Calculating the RMSd vs. time

The next step in analyzing our results is to calculate the RMSd (root mean square deviation) as a function

of time. This will give us a quantitative measure of how much a molecule is changing its structure as the

simulation progresses. We will use the RMSD Trajectory Tool (an analysis program provided with VMD). The

precise parameter we will be calculating in this example is the mass weighted RMSd (Root Mean Square

deviation) fit between each successive structure and the first structure of our trajectory.
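To make the quantity concrete, here is a minimal Python sketch of a mass-weighted RMSd between one frame and a reference frame: the square root of the mass-weighted mean squared displacement of the atoms. It is an illustration only, with a hypothetical function name, and it deliberately omits the superposition (fitting) step, so it assumes that the two structures have already been aligned.

```python
import math


def mass_weighted_rmsd(coords, ref_coords, masses):
    """Mass-weighted RMSd between two sets of atomic coordinates.

    coords, ref_coords : sequences of (x, y, z) tuples, same atom order
    masses             : sequence of atomic masses, same atom order

    Assumption: the structures are already superimposed; no fit is done.
    """
    if not (len(coords) == len(ref_coords) == len(masses)):
        raise ValueError("all inputs must have the same length")
    if len(masses) == 0:
        raise ValueError("need at least one atom")
    weighted_sum = 0.0
    for (x, y, z), (xr, yr, zr), m in zip(coords, ref_coords, masses):
        weighted_sum += m * ((x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2)
    return math.sqrt(weighted_sum / sum(masses))


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    frame = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]   # second atom moved by 2
    print(mass_weighted_rmsd(frame, reference, masses=[1.0, 3.0]))
```

Because heavy atoms carry more weight, a displacement of a phosphorus atom contributes more to this number than the same displacement of a hydrogen atom.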

To do this select Extensions -> Analysis -> RMSD Trajectory Tool. Start by changing the word protein to all

(you want to calculate the RMSd for all the atoms) and also unclick noh (in the selection modifiers). Also

check the On/Off box in the weights and then choose mass in the Field drop menu - we want it to calculate a

MASS weighted RMSd. Finally make sure that in the trajectory field the On/Off box is ticked and that the

frame ref is set to 0 - we want to do the RMS fit to the FIRST structure.

Finally press RMSD.

To look at your results, on the upper left drop menu (the one where you can only see an F....) choose Plot Data.

Copy this graph into your report.

Hopefully in your plot you can see the problem. While the 12 angstrom cutoff simulation has a largely

constant RMSd around 2.2 angstroms the no cutoff simulation shows a steadily increasing RMSd which

after 20 ps is already over 30 angstroms!!! This would suggest that something has gone wrong with the

simulation since our system has essentially "blown up." However, now that you saw the movies and

thought about them for a tiny bit you also know that the RMSd results mask a larger problem with running

a simulation in vacuum. The use of a cutoff here results in an unrealistically stable trajectory while no cutoff

(which should in theory be more accurate) leads to a "blow up."

3.3.3) Extracting the energies, etc. from the mdout file

You will use a perl script to pull out the energies, etc. on your machine by processing mdout files and

creating a series of files with the various different pieces of information in them. process_mdout.perl, a

perl script, should be downloaded from the moodle page and saved to your working folder. (It is also in

the chem348 common folder). Before using this file you need to type chmod +x process_mdout.perl so

it becomes an executable script. The script uses a default output filename so it is best to create a sub

directory for each of your output files and move to there before running the script e.g.

mkdir polyAT_vac_md1_12Acut

cd polyAT_vac_md1_12Acut

../process_mdout.perl ../polyAT_vac_md1_12Acut.out

(here the ../ is to take care that the files are in the parent directory or folder) You should repeat this

process for the nocut output file, even though the simulation did not complete all 100,000 steps. (Make

sure you go back to your working directory by typing cd .. before you make the new directory.)

In any case, this script will take a whole series of mdout files and will create a series of output files whose

names begin with the prefix "summary." (for example "summary.EPTOT"). These files are just columns of the

time vs. the value for each of the energy components.
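If you prefer to inspect the numbers before plotting, a summary file can be read with a few lines of Python. This sketch assumes the layout described above (two whitespace-separated columns, time and value) and skips any line that does not fit that pattern. The function names are hypothetical and the sample numbers in the usage example are made up for illustration.

```python
def read_summary(text):
    """Parse 'time value' lines into a list of (time, value) floats.

    Lines with fewer than two fields, or with non-numeric fields,
    are skipped.
    """
    data = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            data.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue
    return data


def mean_value(data):
    """Average of the second column of the parsed data."""
    if not data:
        raise ValueError("no data points")
    return sum(value for _, value in data) / len(data)


if __name__ == "__main__":
    # Made-up sample in the assumed format; to use a real file,
    # read its contents with open("summary.EPTOT").read() instead.
    sample = "0.100  -2000.0\n0.200  -2010.0\n0.300  -1990.0\n"
    points = read_summary(sample)
    print(len(points), "points, mean =", mean_value(points))
```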

You can plot the summary files with excel.

A plot of Energy vs time (from summary.EPTOT) should look like:

Note that this figure does not have units, yours should have! The energy unit is kcal/mol but you might

need to look into the .out files to learn the time unit. (copy your plot to your report) The black line

represents the potential energy for the 12Å cutoff simulation vs. time while the red line represents the

potential energy for the simulation without a cutoff. As we can see the 12 Å cutoff simulation seems to be

fairly stable, the potential energy is fluctuating around a constant mean value. The simulation without a

cutoff, however, is significantly different. To begin with the potential energy is some 3,000 kcal/mol

higher than the 12Å cutoff simulation. The potential energy is so large in fact that it is actually positive.

The potential energy also decreases rapidly over the first 10 ps of simulation suggesting that a large

structural change is occurring. The simulation then ends abruptly after about 22.9 ps with the following error:

Why the difference in potential energy? Well, by looking at our two output files this can be tracked down

to the difference in the electrostatic energy. For the first step we have:

12 Å cutoff

no cutoff

Notice how the value of EELEC is so much larger in the no cutoff simulation.

Summarize your understanding of what is happening referring to all the results you analyzed in sections

3.3.1-3. (there should be specific references to your results)
