machine learning the born-oppenheimer potential energy surface: from...
TRANSCRIPT
Machine learning the Born-Oppenheimer potential energy surface:
from molecules to materialsGábor Csányi
Engineering Laboratory
Interatomic potentials for molecular dynamics
Transferability biomolecular force fields
(biochemistry)
Reactive solid state materials
(physics & materials)
Accuracy small molecules
in gas phase (quantum chemistry)
AMBER CHARMM AMOEBA
GROMACSOPLS
...
Tersoff Brenner
EAM BOP
...
Bowman Szalewicz
Paesani ...
Ingredients
• Representation of atomic neighbourhood
• Interpolation of functions
• Databaseof configurations
smoothness, faithfulness, continuity
flexible but smooth functional form, few sensible parameters
predictive power non-domain specific
Many-body (cluster) expansion
= + + + +
+ + + +
higher multi-body terms...+
+ analytic long range termsE =X
i
"(q(i)1 , q2,(i) . . . , q(i)M )
qi are descriptors of local atomic neighbourhood
Wea
kSt
rong
Interpolating the solution of an eigenproblem
�i~@ @t
= H Hel =1
2
X
i
r2ri +
X
ij
Zj(�1)
|Rj � ri| +X
ij
1
|ri � rj |Lowest eigenstate: E0 ⌘ Vel(R1, R2, . . .) has spatial locality
+ analytic long range terms
Parametrisation of the H2O dimer
• Standard representation: 6 atoms ⇒ 15 interatomic distances : x = {rij}
• Symmetrize kernel function: S: symmetry group of molecules
• Use analytic forces if available:
K̃(x, x0) =X
p2S
K(p(x), x0)
K
0(x, x0) =@
@x
K(x, x0)
H2O dimer corrections
Roo
Water hexamers - relative energies - no ZPC
CCSD(T) BLYP DMC Bowman BLYP+GAP
0.00 0.00 0.00 0.00 0.00
0.24 -0.59 0.33 0.46 0.15
0.70 -2.18 0.56 2.39 0.52
1.69 -2.44 1.53 4.78 1.77
[kcal / mol]
GAP error < 0.03 kcal/mol / H2O(B3LYP optimised geometries from
Tschumper)
Ice polymorphs
[meV / H2O] Expt DMC±5 PBE0+vdWTS BLYP+GAP±3
Ih 0 0 0 0
II 1 -4 6 -5
IX 4 3 -3
VIII 33 30 76 30
Relative energies
error < 0.15 kcal/mol / H2O
Ice polymorphs
[meV / H2O] Expt DMC ±5
PBE0+vdWTS BLYP+GAP ±3
Ih 610 605 672 667
II 609 609 666 672
IX
VIII 577 575 596 637
Absolute binding energies
BLYP+GAP overbinds uniformly by ~ 60 meV ⇒ 3-body terms!
Many-body (cluster) expansion
= + + + +
+ + + +
higher multi-body terms...+
+ analytic long range termsE =X
i
"(q(i)1 , q2,(i) . . . , q(i)M )
qi are descriptors of local atomic neighbourhood
Wea
kSt
rong
Interpolating the solution of an eigenproblem
�i~@ @t
= H Hel =1
2
X
i
r2ri +
X
ij
Zj(�1)
|Rj � ri| +X
ij
1
|ri � rj |Lowest eigenstate: E0 ⌘ Vel(R1, R2, . . .) has spatial locality
+ analytic long range terms
Quantum mechanics is many-body
-1
0
1
2
3
4
5
6
7
12 14 16 18 20 22
Ener
gy [e
V / a
tom
]
Volume [A3 / atom]
Tungsten, Finnis-Sinclairbccfccsc
bccfccsc
DFT
Tungsten: DFT, Embedded Atom ModelCarbon: tight binding
Quantum mechanics has some locality
force F
r
neighbourhood
far field
Force errors around Si self interstitial
Force errors around O in water with QM/MM
r
L̂i = qi + pi ·⇥i + . . .Vel
(R1
, R2
, . . .) ⇡atomsX
i
"(R1
�Ri, R2
�Ri, . . .) +1
2
X
ij
L̂iL̂j1
Rij+
�ij
|Rij |6
Finite range atomic energy function
Is there a finite range atomic energy function that gives the correct total energy and forces?
Traditional ideas for functional forms
• Pair potentials: Lennard-Jones, RDF-derived, etc.
• Three-body terms: Stillinger-Weber, MEAM, etc.
• Embedded Atom (no angular dependence)
• Bond Order Potential (BOP) Tight-binding-derived attractive term with pair-potential repulsion
• ReaxFF: kitchen-sink + hundreds of parameters
These are NOT THE CORRECT functions. Limited accuracy, not systematic
"i =1
2
X
j
V2(|rij |)
"i = ��P
j ⇢(|rij |)�
given byGOAL: potentials based on quantum mechanics
Representation is implicit
Representing the atomic neighbourhood in strongly bound materials
• What are the arguments of the atomic energy ε ? Need a representation, i.e. a coordinate transformation
- Exact symmetries: • Global Translation
• Global Rotation
• Reflection
• Permutation of atoms
- Faithful: different configurations correspond to different representations
- Continuous, differentiable, and smooth (i.e. slowly changing with atomic position) (“Lipschitz diffeomorphic”)
• Rotational invariance by itself is easy: q ≡ Rij = ri ⋅ rj (Weyl)
- Complete, but not invariant permutationally
- Not continuous with changing number of neighbours
r1
r2
r3
r4
Matrix of interatomic distances
• Equivalent to Weyl matrix
• Rotationally invariant: no alignment needed
• Permutation problem: Weyl matrix is not permutation invariant
2
6664
r1 · r1 r1 · r2 · · · r1 · rNr2 · r1 r2 · r2 · · · r2 · rN
......
. . ....
rN · r1 rN · r2 · · · rN · rN
3
7775
Drop atom ordering ?
• Permutation: too costly for ~ 20 neighbours
• Unordered list of distances? Distance histograms?
Not unique!
Atomic neighbour density function
• Translation ✓
• Permutation ✓
• Need rotationally invariant features of ρ(r)
⇥i(r) = s�(r) +�
j
�(r� rij)fcut(|rij |)
"(r1 � ri, r2 � ri, . . .) ⌘ "[⇢i(r)]
⇥(r, �,⇤) =⇥�
n=1
⇥�
l=0
l�
m=�l
cml,ngn(r)Y m
l (�, ⇤)
Rotational Invariance
Transformation of coefficients:
• Wigner matrices are unitary:
• Construct invariants:
• Equivalent to Steinhardt bond order Q2 , Q4 , Q6 … parameters
• Higher order invariants possible, (generalisation of Steinhardt W)
D�1 = D†
First few terms of power spectrum for 3 atoms
• Polynomials of Weyl invariants
• Highly oscillatory for large l : not smooth
• Different l channels uncoupledf1 = Y22 + Y2−2 + Y33 + Y3−3
f2 = Y21 + Y2−1 + Y32 + Y3−2
Uniqueness: environment reconstruction tests
Number of neighbours
Error in distinguishing
1. Select target configuration
2. Randomise neighbours
3. Try to reconstruct target by matching descriptors
RM
SD
Construct smooth similarity kernel directly
• Overlap integral
k(⇢i, ⇢i0) =X
n,n0,l
p(i)nn0lp(i0)nn0l
• After LOTS of algebra: SOAP kernel pnl = c†nlcnl
pnn0l = c†nlcn0l
S(⇢i, ⇢i0) =
Z⇢i(r)⇢i0(r)dr,
k(⇢i, ⇢i0) =
Z ���S(⇢i, R̂⇢i0)���2dR̂ =
ZdR̂
����Z
⇢i(r)⇢i0(R̂r)dr
����2
cutoff: compact support• Integrate over all 3D rotations:
4 6 8 10 12 14 16 18 20
10−2
10−1
100
101
n
Average
dREF
SOAPlmax= 2
l max= 4
l max= 6
Smooth Overlap of Atomic Positions (SOAP)
Number of neighbours
Erro
r in
dis
tingu
ishi
ng
RM
SD
Stability of fit: elastic constants of Si
Order of angular momentum expansion
How to generate databases?
• Target applications: large systems
• Capability of full quantum mechanics (QM): small systems
QM MD on “representative” small systems:
sheared primitive cell ➝ elasticitylarge unit cell ➝ phononssurface unit cells ➝ surface energygamma surfaces ➝ screw dislocationvacancy in small cell ➝ vacancy vacancy @ gamma surface ➝ vacancy near dislocation
Iterative refinement
1. QM MD ➝ Initial database2. Model MD3. QM ➝ Revised database
What is the acceptable validation protocol? How far can the domain of validity be extended?
Existing potentials for tungsten
DFT reference
Error
0
15%
C11 C12 C44
50%
Elastic const.
Vacancy
energy Surface energy
(100) (110) (111) (112)
BOPMEAM
FS
GAP
Building up databases for tungsten (W)
Improvement for tungsten (W)
Peierls barrier for screw dislocation glide
Vacancy-dislocation binding energy
(~100,000 atoms in 3D simulation box)
Outstanding problems
• Accuracy on database accuracy in properties?
• Database contents region of validity ?
• Alloys - permutational complexity? Chemical variability?
• Systematic treatment of long range effects
• Electronic temperature