DL_POLY: Software and Applications I.T. Todorov & W. Smith ARC Group & CC Group CSED, STFC Daresbury Laboratory, Daresbury Warrington WA4 1EP, Cheshire, England, UK


DL_POLY: Software and Applications

I.T. Todorov & W. Smith
ARC Group & CC Group

CSED, STFC Daresbury Laboratory, Daresbury
Warrington WA4 1EP, Cheshire, England, UK


Where is Daresbury?


Molecular Dynamics: Definitions

• Theoretical tool for modelling the detailed microscopic behaviour of many different types of systems, including gases, liquids, solids, surfaces and clusters.
• In an MD simulation, the classical equations of motion governing the microscopic time evolution of a many body system are solved numerically, subject to the boundary conditions appropriate for the geometry or symmetry of the system.
• Can be used to monitor the microscopic mechanisms of energy and mass transfer in chemical processes, and dynamical properties such as absorption spectra, rate constants and transport properties can be calculated.
• Can be employed as a means of sampling from a statistical mechanical ensemble and determining equilibrium properties. These properties include average thermodynamic quantities (pressure, volume, temperature, etc.), structure, and free energies along reaction paths.


DL_POLY Project Background

• General purpose parallel (classical) MD simulation software
• It was conceived to meet the needs of CCP5 - The Computer Simulation of Condensed Phases (academic collaboration community)
• Written in modularised Fortran90 (NagWare & FORCHECK compliant) with MPI2 (MPI1+MPI-I/O), fully self-contained
• 1994 – 2011: DL_POLY_2 (RD) by W. Smith & T.R. Forester (funded for 6 years by EPSRC at DL) -> DL_POLY_CLASSIC
• 2003 – 2011: DL_POLY_3 (DD) by I.T. Todorov & W. Smith (funded for 4 years by NERC at Cambridge) -> DL_POLY_4
• Over 11,000 licences taken out since 1994
• Over 1000 registered FORUM members since 2005
• Available free of charge (under licence) to University researchers (provided as code) and at cost to industry


DL_POLY_DD Development Statistics


DL_POLY_DD Licence Statistics


DL_POLY Licence Statistics


DL_POLY Licence Statistics


DL_POLY Licence Statistics


DL_POLY Project Current State

• January 2011: DL_POLY_2 -> DL_POLY_CLASSIC on a BSD type Licence (BS retired but supporting GUI and fixes)
• October 2010: DL_POLY_3 -> DL_POLY_4 still under STFC Licence, over 1300 licences taken out since November 2010
• Rigid Body dynamics
• Parallel I/O & netCDF I/O – NAG dCSE (IJB & ITT)
• CUDA+OpenMP port (source, ICHEC) & MS Windows port (installers)
• SPME processor grid freed from 2^N decomposition – NAG dCSE (IJB)
• Load Balancer development (LJE, finished 30/03/2011)
• Continuous Development of DL_FIELD (pdb to DLP I/O, CY)


Current Versions

• DL_POLY_4 (version 1.2)
  – Dynamic Decomposition parallelisation, based on domain decomposition but with dynamic load balancing
  – Limits up to ≈2.1×10^9 atoms with inherent parallelisation
  – Full force field and molecular description with rigid body description
  – Free format (flexible) reading with some fail-safe features and basic reporting (but not fully fool-proofed)
• DL_POLY Classic (version 1.6)
  – Replicated Data parallelisation, limits up to ≈30,000 atoms with good parallelisation up to 64 (system dependent) processors (running on any processor count)
  – Full force field and molecular description
  – Hyper-dynamics: Temperature Accelerated Dynamics & Biased Potential Dynamics, Solvation Dynamics – Spectral Shifts, Metadynamics, Path Integral MD
  – Free format reading but somewhat strict
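The ≈2.1×10^9 atom limit quoted for DL_POLY_4 coincides with the largest value of a signed 32-bit integer. The slides do not state the cause, so the sketch below only illustrates that coincidence, assuming global particle indices are stored as 32-bit integers; `fits_in_int32` is a hypothetical helper, not a DL_POLY routine.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def fits_in_int32(n_particles: int) -> bool:
    """True if every global index 1..n_particles fits in a signed 32-bit integer."""
    return 0 <= n_particles <= INT32_MAX


print(INT32_MAX)                      # 2147483647
print(fits_in_int32(2_100_000_000))   # True
print(fits_in_int32(3_000_000_000))   # False
```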


Supported Molecular Entities

• Point ions and atoms
• Polarisable ions (core+shell)
• Flexible molecules
• Rigid bonds
• Rigid molecules
• Flexibly linked rigid molecules
• Rigid bond linked rigid molecules


Force Field Definitions – I

• particle: rigid ion or atom (charged or not), a core or a shell of a polarisable ion (with or without associated degrees of freedom), a massless charged site. A particle is a countable object and has a global ID index.

• site: a particle prototype that serves to define the chemical & physical nature (topology/connectivity/stoichiometry) of a particle (mass, charge, frozen-ness). Sites are not atoms; they are prototypes!

• Intra-molecular interactions: chemical bonds, bond angles, dihedral angles, improper dihedral angles, inversions. Usually, the members in a unit do not interact via an inter-molecular term. However, this can be overridden for some interactions. These are defined by site.

• Inter-molecular interactions: van der Waals, metal (EAM, Gupta, Finnis-Sinclair, Sutton-Chen), Tersoff, three-body, four-body. Defined by species.


Force Field Definitions – II

• Electrostatics: Standard Ewald*, Hautman-Klein (2D) Ewald*, SPM Ewald (3D FFTs), Force-Shifted Coulomb, Reaction Field, Fennell damped FSC+RF, Distance dependent dielectric constant, Fuchs correction for non charge neutral MD cells.

• Ion polarisation via Dynamic (Adiabatic) or Relaxed shell model.

• External fields: Electric, Magnetic, Gravitational, Oscillating & Continuous Shear, Containing Sphere, Repulsive Wall.

• Intra-molecular like interactions: tethers, core-shell units, constraint and PMF units, rigid body units. These are also defined by site.

• Potentials: parameterised analytical forms defining the interactions. These are always spherically symmetric!

• THE CHEMICAL NATURE OF PARTICLES DOES NOT CHANGE IN SPACE AND TIME!!!
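As an illustration of a spherically symmetric potential, the sketch below uses the standard 12-6 Lennard-Jones form, chosen here only as a familiar example of a van der Waals term. The function names and parameter values are assumptions for this sketch, not DL_POLY identifiers. Only the separation |r_i − r_j| enters, so rotating the pair leaves the energy unchanged.

```python
import math


def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """12-6 Lennard-Jones energy for a pair separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def pair_energy(ri, rj, **params) -> float:
    """Energy of one pair; only the distance |ri - rj| enters."""
    return lennard_jones(math.dist(ri, rj), **params)


def rotate_z(p, angle):
    """Rotate point p about the z axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)


a, b = (0.0, 0.0, 0.0), (1.1, 0.3, -0.2)
e_before = pair_energy(a, b)
e_after = pair_energy(rotate_z(a, 0.7), rotate_z(b, 0.7))
print(abs(e_before - e_after) < 1e-12)   # same energy after rotating the pair
```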


Force Field by Sums

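\[
\begin{aligned}
V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) ={}&
  \sum_{i,j}^{N'} U_{\mathrm{pair}}(|\mathbf{r}_i - \mathbf{r}_j|)
  + \frac{1}{4\pi\varepsilon_0} \sum_{i,j}^{N'} \frac{q_i q_j}{|\mathbf{r}_i - \mathbf{r}_j|} \\
&+ \sum_{i,j,k}^{N'} U_{\mathrm{Tersoff}}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k)
  + \sum_{i,j,k}^{N'} U_{\mathrm{3\text{-}body}}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k)
  + \sum_{i,j,k,n}^{N'} U_{\mathrm{4\text{-}body}}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k, \mathbf{r}_n) \\
&+ \varepsilon_{\mathrm{metal}} \left( \sum_{i,j}^{N'} V_{\mathrm{pair}}(|\mathbf{r}_i - \mathbf{r}_j|)
  + \sum_{i}^{N} F\!\left( \sum_{i,j}^{N'} \rho_{ij}(|\mathbf{r}_i - \mathbf{r}_j|) \right) \right) \\
&+ \sum_{i_{\mathrm{bond}}}^{N_{\mathrm{bond}}} U_{\mathrm{bond}}(i_{\mathrm{bond}}, \mathbf{r}_a, \mathbf{r}_b)
  + \sum_{i_{\mathrm{angle}}}^{N_{\mathrm{angle}}} U_{\mathrm{angle}}(i_{\mathrm{angle}}, \mathbf{r}_a, \mathbf{r}_b, \mathbf{r}_c) \\
&+ \sum_{i_{\mathrm{dihed}}}^{N_{\mathrm{dihed}}} U_{\mathrm{dihed}}(i_{\mathrm{dihed}}, \mathbf{r}_a, \mathbf{r}_b, \mathbf{r}_c, \mathbf{r}_d)
  + \sum_{i_{\mathrm{invers}}}^{N_{\mathrm{invers}}} U_{\mathrm{invers}}(i_{\mathrm{invers}}, \mathbf{r}_a, \mathbf{r}_b, \mathbf{r}_c, \mathbf{r}_d) \\
&+ \sum_{i_{\mathrm{tether}}}^{N_{\mathrm{tether}}} U_{\mathrm{tether}}(i_{\mathrm{tether}}, \mathbf{r}_t, \mathbf{r}_{t=0})
  + \sum_{i_{\mathrm{core\text{-}shell}}}^{N_{\mathrm{core\text{-}shell}}} U_{\mathrm{core\text{-}shell}}(i_{\mathrm{core\text{-}shell}}, |\mathbf{r}_i - \mathbf{r}_j|)
  + \sum_{i=1}^{N} \Phi_{\mathrm{external}}(\mathbf{r}_i)
\end{aligned}
\]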


Ensembles and Algorithms

Integration: available as velocity Verlet (VV) or leapfrog Verlet (LFV), generating flavours of the following ensembles:

• NVE
• NVT (Ekin) Evans
• NVT Andersen^, Langevin^, Berendsen, Nosé-Hoover
• NPT Langevin^, Berendsen, Nosé-Hoover, Martyna-Tuckerman-Klein^
• NσT/NPnAT/NPnγT Langevin^, Berendsen, Nosé-Hoover, Martyna-Tuckerman-Klein^

Constraints & Rigid Body Solvers:

• VV dependent – RATTLE, No_Squish, QSHAKE*
• LFV dependent – SHAKE, Euler-Quaternion, QSHAKE*
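The velocity Verlet scheme named above can be sketched for the plain NVE case as follows. This is a minimal one-particle illustration on a harmonic oscillator in reduced units, not DL_POLY's implementation; the function name, force law and step size are assumptions chosen for the example.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance one 1D particle with velocity Verlet; returns the final (x, v)."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # first half-kick
        x = x + dt * v_half                # drift with the half-step velocity
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-kick
    return x, v


k, m = 1.0, 1.0                            # spring constant and mass (reduced units)


def harmonic(x):
    return -k * x


def energy(x, v):
    return 0.5 * m * v * v + 0.5 * k * x * x


x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, harmonic, m, dt=0.01, n_steps=10_000)
drift = abs(energy(x1, v1) - energy(x0, v0))
print(drift < 1e-4)   # total energy stays close to its initial value
```

The total energy is not conserved exactly, but its error stays bounded over the run, which is the property that makes Verlet-type integrators suitable for NVE sampling.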


Assumed Parallel Architecture

DL_POLY is designed for homogeneous distributed parallel machines.

[Figure: processors P0 to P7, each paired with its own memory M0 to M7.]


Replicated Data

[Figure: four processors, A, B, C and D, each running the same sequence of stages: Initialize, Forces, Motion, Statistics, Summary.]
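A rough sketch of the replicated data idea: every process holds a full copy of the positions, computes the forces for its own share of the pairs, and a global sum then gives every process the complete force array. The round-robin split, the toy spring force and the `global_sum` stand-in for an MPI all-reduce are assumptions for illustration; the ranks are simulated serially here.

```python
def pair_force(xi, xj):
    """Toy 1D pair force on i due to j (a linear spring, for illustration only)."""
    return xj - xi


def forces_on_rank(positions, rank, n_ranks):
    """Partial forces from the pairs assigned to this rank (round-robin split)."""
    n = len(positions)
    partial = [0.0] * n
    pair_id = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if pair_id % n_ranks == rank:
                f = pair_force(positions[i], positions[j])
                partial[i] += f
                partial[j] -= f            # Newton's third law
            pair_id += 1
    return partial


def global_sum(partials):
    """Stand-in for an MPI all-reduce: element-wise sum over ranks."""
    return [sum(column) for column in zip(*partials)]


positions = [0.0, 1.0, 2.5, 4.0]           # every rank holds this full copy
n_ranks = 4                                # like processors A, B, C and D
total = global_sum([forces_on_rank(positions, r, n_ranks) for r in range(n_ranks)])
serial = forces_on_rank(positions, 0, 1)   # single-rank reference
print(all(abs(a - b) < 1e-12 for a, b in zip(total, serial)))
```

Because each rank needs the whole position and force arrays, memory and communication grow with system size, which is consistent with the modest atom counts quoted for the replicated data version.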


Bonded Forces within RD

[Figure: the molecular force field definition forms a global force field, which is divided among processors P0, P1 and P2 as local force terms.]


RD Scheme for long-ranged part of SPME

U. Essmann, L. Perera, M.L. Berkowitz, T. Darden, H. Lee, L.G. Pedersen, J. Chem. Phys., 103, 8577 (1995)

1. Calculate self interaction correction
2. Initialise FFT routine (FFT – 3D FFT)
3. Calculate B-spline coefficients
4. Convert atomic coordinates to scaled fractional units
5. Construct B-splines
6. Construct charge array Q
7. Calculate FFT of Q array
8. Construct array G
9. Calculate FFT of G array
10. Calculate net Coulombic energy
11. Calculate atomic forces
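Steps 4 and 5 of the SPME scheme can be sketched in one dimension as below, using the cardinal B-spline recursion from the Essmann et al. paper cited on this slide. The cubic box, mesh size and function names are assumptions for illustration; this is not DL_POLY code.

```python
def scaled_fractional(x: float, box_length: float, k_mesh: int) -> float:
    """Step 4: map a Cartesian coordinate to mesh units in [0, k_mesh)."""
    return ((x / box_length) % 1.0) * k_mesh


def cardinal_bspline(u: float, n: int) -> float:
    """Cardinal B-spline M_n(u) of order n, non-zero only for 0 < u < n."""
    if n == 2:
        return max(0.0, 1.0 - abs(u - 1.0))
    return (u * cardinal_bspline(u, n - 1)
            + (n - u) * cardinal_bspline(u - 1.0, n - 1)) / (n - 1)


def spline_weights(u: float, n: int):
    """Step 5: weights[k] is the share given to mesh point int(u) - k (mod mesh size)."""
    frac = u - int(u)
    return [cardinal_bspline(frac + k, n) for k in range(n)]


u = scaled_fractional(3.7, box_length=10.0, k_mesh=16)
weights = spline_weights(u, n=4)
print(abs(sum(weights) - 1.0) < 1e-12)   # the weights form a partition of unity
```

Because the weights sum to one, spreading a charge over its neighbouring mesh points (step 6) leaves the total charge on the mesh unchanged.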


Domain Decomposition

[Figure: the simulation cell divided into four domains, A, B, C and D.]
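In domain decomposition each process owns the particles located in its own region of the cell. The sketch below assigns particles to domains on a regular processor grid; the orthorhombic box, the row-major rank numbering and the name `domain_of` are assumptions made for this example.

```python
def domain_of(position, box, grid):
    """Rank owning a position, for an orthorhombic box split on a regular grid."""
    idx = []
    for x, length, n in zip(position, box, grid):
        frac = (x / length) % 1.0              # wrap into the periodic box
        idx.append(min(int(frac * n), n - 1))  # domain index along this axis
    ix, iy, iz = idx
    nx, ny, nz = grid
    return (ix * ny + iy) * nz + iz            # row-major numbering (an assumption)


box = (10.0, 10.0, 10.0)
grid = (2, 2, 1)                               # four domains, like A, B, C and D
atoms = [(1.0, 1.0, 5.0), (1.0, 9.0, 5.0), (6.0, 2.0, 5.0), (9.5, 9.5, 0.1)]
owners = [domain_of(a, box, grid) for a in atoms]
print(owners)   # [0, 1, 2, 3]
```

Unlike the replicated data approach, each process stores only its own particles (plus whatever boundary data it needs from neighbours), so memory per process does not grow with the total system size.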


Bonded Forces within DD

[Figure: the molecular force field definition forms a global force field, which has to be matched to the local atomic indices held by the processor domains P0, P1 and P2. Annotated "Tricky!".]


DD Scheme for long-ranged part of SPME

U. Essmann, L. Perera, M.L. Berkowitz, T. Darden, H. Lee, L.G. Pedersen, J. Chem. Phys., 103, 8577 (1995)

1. Calculate self interaction correction
2. Initialise FFT routine (FFT – IJB’s DaFT: 3M^2 1D FFT)
3. Calculate B-spline coefficients
4. Convert atomic coordinates to scaled fractional units
5. Construct B-splines
6. Construct partial charge array Q
7. Calculate FFT of Q array
8. Construct partial array G
9. Calculate FFT of G array
10. Calculate net Coulombic energy
11. Calculate atomic forces

I.J. Bush, I.T. Todorov, W. Smith, Comp. Phys. Commun., 175, 323 (2006)


Performance Weak Scaling on IBM p575 2005-2011

[Figure: Speed Gain versus Processor Count, both axes 0 to 1000. Series: Solid Ar (32'000 atoms per CPU), NaCl (27'000 ions per CPU), SPC Water (20'736 ions per CPU). Annotations: max load 700'000 atoms per 1GB/CPU, max load 220'000 ions per 1GB/CPU, max load 210'000 ions per 1GB/CPU; 21 million atoms, 28 million atoms, 33 million atoms.]


Rigid Bodies versus Constraints: 450,000 particles with DL_POLY_4

[Figure: Scaling. Steps per second (0 to 10) versus Np (0 to 600). Series: ICE7, ICE7_CB.]


I/O Weak Scaling on IBM p575 2005-2007

[Figure: Time [s] (0 to 800) versus Processor Count (0 to 1000). Series: Solid Ar, NaCl, SPC Water. Dashed lines show shut-down times; solid lines show start-up times.]


Benchmarking BG/L Jülich 2007

[Figure: Speed Gain versus Processor count, both axes 2000 to 16000, for a 14.6 million particle Gd2Zr2O7 system. Series: Perfect, MD step total, Link cells, van der Waals, Ewald real, Ewald k-space.]


Benchmarking XT4/5 UK 2010

[Figure: Speed Gain versus Processor count, both axes 1000 to 8000, for a 14.6 million particle Gd2Zr2O7 system. Series: Perfect, MD step total, Link cells, van der Waals, Ewald real, Ewald k-space.]


Benchmarking on Various Platforms

[Figure: Evaluations [s^-1] (0 to 9) versus Processor count (0 to 2000) for a 3.8 million particle Gd2Zr2O7 system. Series: CRAY XT4 SC, CRAY XT4 DC, CRAY XT3 SC / IBM P6+, CRAY XT3 DC, BG/L, BG/P, IBM p575, 3GHz Woodcrest DC.]


Importance of I/O - I

Types of MD studies most dependent on I/O
• Large length-scales (10^9 particles), short time-scale, such as screw deformations
• Medium big length-scales (10^6–10^8 particles), medium time-scale (ps-ns), such as radiation damage cascades
• Medium length-scale (10^5–10^6 particles), long time-scale (ns-μs), such as membrane and protein processes

Types of I/O:

              portable   human readable   loss of precision   size
ASCII            +             +                  –             –
Binary           –             –                  +             +
XDR Binary       +             –                  +             +
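The size and precision trade-off between ASCII and native binary can be seen on a single coordinate triple. The sketch below uses Python's `struct` module; the 20-character fixed-width text layout is an assumption for illustration, not the DL_POLY CONFIG format.

```python
import struct

coords = (1.0 / 3.0, -2.0 / 7.0, 12345.678901234567)

# Native binary: three 8-byte doubles, exact but with a machine-dependent layout.
binary_record = struct.pack("3d", *coords)

# ASCII: fixed-width text, portable and readable but larger and rounded.
ascii_record = "%20.10f%20.10f%20.10f\n" % coords

restored_binary = struct.unpack("3d", binary_record)
restored_ascii = tuple(float(ascii_record[i:i + 20]) for i in (0, 20, 40))

print(len(binary_record), len(ascii_record))   # 24 61
print(restored_binary == coords)               # True
print(restored_ascii == coords)                # False
```

The exact size ratio depends on how many digits the text format keeps; printing more digits narrows the precision gap but widens the size gap.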


Importance of I/O - II

Example: 15 million particle system simulated with 2048 MPI tasks

MD time per timestep ~0.7 (2.7) seconds on Cray XT4 (BG/L)

Configuration read ~100 sec. (once during the simulation)

Configuration write ~600 sec. for 1.1 GB with the fastest I/O method: MPI-I/O for Cray XT4 (parallel direct access for BG/L).

BG/L with 16,000 MPI tasks: MD time per timestep 0.5 sec., with a configuration (frame) write taking ~18,000 sec.

I/O in native binary is only 3-5 times faster and 3-7 times smaller

Some unpopular solutions
• Saving only the important fragments of the configuration
• Saving only fragments that have moved more than a given distance between two consecutive dumps
• Distributed dump – separated configuration in separate files for each MPI task (CFD)
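The quoted timings can be turned into two rough figures: the effective write bandwidth, and how many MD timesteps fit into the time of one configuration write. The sketch takes 1 GB = 1024 MB, which is an assumption; all other numbers are the ones quoted on this slide.

```python
# Figures quoted for the Cray XT4 case with 2048 MPI tasks.
step_time_s = 0.7        # MD time per timestep
write_time_s = 600.0     # one configuration write
file_size_gb = 1.1       # size of that configuration

# Effective write bandwidth (taking 1 GB = 1024 MB, an assumption).
bandwidth_mb_s = file_size_gb * 1024 / write_time_s

# How many MD timesteps fit into the time of a single write.
steps_per_write = write_time_s / step_time_s

# Same ratio for the 16,000-task BG/L case: 0.5 s per step, ~18,000 s per frame.
bgl_steps_per_write = 18_000 / 0.5

print(round(bandwidth_mb_s, 2))      # 1.88
print(round(steps_per_write))        # 857
print(round(bgl_steps_per_write))    # 36000
```

On these numbers a single frame costs as much wall-clock time as several hundred to tens of thousands of timesteps, which is why I/O dominates for studies that dump configurations often.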


I/O Solutions in DL_POLY_4

1. Serial read and write (sorted/unsorted), where a single MPI task, the master, handles all I/O: when reading, the master broadcasts the data to the other tasks; when writing, the other tasks send their data to the master in turn while it writes out a configuration of the time evolution.

2. Parallel write via direct access or MPI-I/O (sorted/unsorted), where ALL / SOME MPI tasks print to the same file in an orderly manner so that no overlapping occurs. For Fortran direct access printing, it should be noted that the behaviour of this method is not defined by the Fortran standard, and in particular we have experienced problems when the disk cache is not coherent with the memory.

3. Parallel read via MPI-I/O or Fortran

4. Serial NetCDF read and write using NetCDF libraries for machine-independent data formats of array-based scientific data (widely used by various scientific communities).
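The idea behind the sorted parallel write in option 2 can be sketched with fixed-length records: if every particle's record has the same length, each writer can compute the file offset from the global index alone, so writers never overlap and the file ends up sorted whatever the arrival order. The record layout and helper names below are hypothetical, and the "MPI tasks" are simulated serially with ordinary file seeks.

```python
import os
import tempfile

RECORD_LEN = 32  # bytes per record, newline included (hypothetical layout)


def format_record(global_id: int, x: float) -> bytes:
    """One fixed-length ASCII record for a particle."""
    text = "%10d%21.12e" % (global_id, x)
    return text.encode("ascii") + b"\n"


def write_chunk(path, records):
    """What one writer would do: place each record at its own offset."""
    with open(path, "r+b") as fh:
        for global_id, x in records:
            fh.seek((global_id - 1) * RECORD_LEN)
            fh.write(format_record(global_id, x))


# Particles as held by three hypothetical MPI tasks, in no particular order.
chunks = [[(5, 0.5), (2, 0.2)], [(1, 0.1), (6, 0.6)], [(4, 0.4), (3, 0.3)]]

fd, path = tempfile.mkstemp()
os.close(fd)
for chunk in reversed(chunks):          # arrival order does not matter
    write_chunk(path, chunk)

with open(path, "rb") as fh:
    ids = [int(line[:10].decode("ascii")) for line in fh]
os.remove(path)
print(ids)   # [1, 2, 3, 4, 5, 6]
```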


Performance for 216,000 Ions of NaCl on XT5


MPI-I/O Write Performance for 216,000 Ions of NaCl on XT5

                       3.09      3.10      3.09       3.10
Cores   I/O Procs    Time/s    Time/s    Mbyte/s    Mbyte/s
   32       32       143.30      1.27      0.44      49.78
   64       64        48.99      0.49      1.29     128.46
  128      128        39.59      0.53      1.59     118.11
  256      128        68.08      0.43      0.93     147.71
  512      256       113.97      1.33      0.55      47.60
 1024      256       112.79      1.20      0.56      52.47
 2048      512       135.97      0.95      0.46      66.39
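A quick consistency check on the write table: time multiplied by rate should give the same file size in every row, and the ratio of the two time columns gives the speed-up from 3.09 to 3.10. The sketch uses three rows copied from the table; the variable names are illustrative.

```python
# (cores, time 3.09 [s], time 3.10 [s], rate 3.09 [Mbyte/s], rate 3.10 [Mbyte/s])
rows = [
    (32, 143.30, 1.27, 0.44, 49.78),
    (64, 48.99, 0.49, 1.29, 128.46),
    (2048, 135.97, 0.95, 0.46, 66.39),
]

summary = []
for cores, t_old, t_new, r_old, r_new in rows:
    summary.append({
        "cores": cores,
        "size_old_mb": t_old * r_old,   # implied file size from the 3.09 columns
        "size_new_mb": t_new * r_new,   # implied file size from the 3.10 columns
        "speedup": t_old / t_new,       # write-time ratio, 3.09 over 3.10
    })

for s in summary:
    print(s["cores"], round(s["size_old_mb"], 1),
          round(s["size_new_mb"], 1), round(s["speedup"]))
```

Both pairs of columns imply a file of roughly 63 Mbyte in each row, and the write time drops by about two orders of magnitude between the two versions.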


MPI-I/O Read Performance for 216,000 Ions of NaCl on XT5

                       3.10       New      3.10        New
Cores   I/O Procs    Time/s    Time/s    Mbyte/s    Mbyte/s
   32       16         3.71      0.29     17.01     219.76
   64       16         3.65      0.30     17.28     211.65
  128       32         3.56      0.22     17.74     290.65
  256       32         3.71      0.30     16.98     213.08
  512       64         3.60      0.48     17.53     130.31
 1024       64         3.64      0.71     17.32      88.96
 2048      128         3.75      1.28     16.84      49.31


DL_POLY Project Background

• Rigid body dynamics and decomposition freed SPME

• no topology and calcite potentials

• Fully parallel I/O: reading and writing in ASCII, optionally including netCDF binary in AMBER format

• CUDA (ICHEC) and Windows ports

• New GUI (Bill Smith)

• Over 1,300 licences taken out since November 2010

• DL_FIELD field builder (Chin Yong) – 300 licences


DL_FIELD

[Figure: a protonated xyz or PDB structure is passed through the DL_FIELD ‘black box’ to give the DL_POLY FIELD and CONFIG files.]

• AMBER & CHARMM to DL_POLY

• OPLSAA & Dreiding to DL_POLY


DL_POLY Roadmap

• August 2011 – March 2012: PRACE-1IP-WP7 funds effort by ICHEC towards CUDA+OpenMP port, SC@WUT towards OpenCL+OpenMP port, and FZ Jülich for FMP library testing

• October 2011 – October 2012: EPSRC’s dCSE funds effort by NAG Ltd.
  • OpenMP within MPI vanilla
  • Beyond 2.1 billion particles

• October 2011 – September 2012: 2 Temperature Thermostat Models, Fragmented I/O, On-the-Fly properties

• November 2011 – September 2013: MMM@HPC, Gentle thermostat, Hyperdynamics


Acknowledgements

Thanks to

• Bill Smith (retired)
• Ian Bush (NAG Ltd.)
• Christos Kartsaklis (ORNL), Ruairi Nestor (ICHEC)

http://www.ccp5.ac.uk/DL_POLY/