![Page 1: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/1.jpg)
ANTON
D.E Shaw Research
![Page 2: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/2.jpg)
Force Fields: Typical Energy Functions
20
20
12 6
1( )
2
1( )
2
[1 cos( )]2
( )
[ ]
rbonds
angles
n
torsions
improper
i j
elec ij
ij ij
LJ ij ij
U k r r
k
Vn
V improper torsion
q q
r
A B
r r
Bond stretches
Angle bending
Torsional rotation
Improper torsion (sp2)
Electrostatic interaction
Lennard-Jones interaction
![Page 3: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/3.jpg)
Molecular Dynamics
• Solve Newton’s equation for a molecular system:
amF
![Page 4: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/4.jpg)
Integrator: Verlet Algorithm
)()()()(2)( 42 tOtatttrtrttr
)()(2
1)()()( 32 tOtatttvtrttr
)()(2
1)()()( 32 tOtatttvtrttr
Start with {r(t), v(t)}, integrate it to {r(t+t), v(t+t)}:
{r(t), v(t)}
{r(t+t), v(t+t)}
The new position at t+t:
Similarly, the old position at t-t:
(1)
(2)
Add (1) and (2):
Thus the velocity at t is:
(3)
)())()((2
1)()( 2tOttrttr
ttrtv
(4)
![Page 5: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/5.jpg)
Molecular Dynamics
IterateIterate
Iterate... and iterate
Iterate... and iterate
Integrate Newton’s laws of motion
![Page 6: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/6.jpg)
Two Distinct Problems
Problem 1: Simulate many short trajectories
Problem 2: Simulate one long trajectory
![Page 7: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/7.jpg)
Simulating Many Short Trajectories
Can answer surprising number of interesting questions
Can be done using– Many slow computers– Distributed processing approach– Little inter-processor communication
E.g., Pande’s Folding at Home project
![Page 8: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/8.jpg)
Simulating One Long Trajectory
Harder problem
Essential to elucidate many biologically interesting processes
Requires a single machine with– Extremely high performance– Truly massive parallelism– Lots of inter-processor communication
![Page 9: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/9.jpg)
DESRES Goal
Single, millisecond-scale MD simulations (long trajectories)
– Protein with 64K or more atoms
– Explicit water molecules
Why?
– That’s the time scale at which many biologically interesting things start to happen
![Page 10: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/10.jpg)
Image: Istvan Kolossvary & Annabel Todd, D. E. Shaw Research
Protein Folding
![Page 11: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/11.jpg)
What Will It Take to Simulate a Millisecond?
We need an enormous increase in speed– Current (single processor): ~ 100 ms / fs– Goal will require < 10 s / fs
Required speedup:
> 10,000 x faster than current single-processor speed
~ 1,000x faster than current parallel implementations
Can’t accept >10,000x the power (~5 Megawatts)!
![Page 12: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/12.jpg)
What Takes So Long?
Inner loop of force field evaluation looks at all pairs of atoms (within distance R)
On the order of 64K atoms in typical system
Repeat ~1012 times
Current approaches too slow by several orders of magnitude
What can be done?
![Page 13: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/13.jpg)
• Parallelization• (getting an idea of the level of computation
needed)
• For every time step, every atom must communicate within its cutt-off radius with every other atom.
• 2) A lot of inter-processor communication that can be scaled well is needed.
MD Simulator Requirements
![Page 14: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/14.jpg)
• Parallelization• (getting an idea of the level of computation
needed)
• Whole System is broken down into boxes (processing nodes)
• Each node handles the bonded interactions within
• NT method for non-bonded interactions (much more common).
• NT method for Atom Migration
MD Simulator Requirements
![Page 15: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/15.jpg)
• 1) Need a huge number of arithmetic processing elements
• 2) A lot of inter-processor communication that can be scaled well is needed.
• 3) Memory is not an issue– With 25,000 atoms (64bytes
each) total=1.6MB over 512 nodes=3.2KB/node which is < most L1
Why Specialized Hardware?
Memory
Communication
Computation
Needs
![Page 16: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/16.jpg)
• Consider Moore’s Law on 10X improvement in 5 years vs. Anton’s 1000X in 1 year.
• Can great discoveries wait?• Can use custom pipelines
with more precision, increased datapath logic speed, over less silicon area.
• Have Tailored ISA’s for geometric calculations+• Programmability for accommodating various force fields and
integration algorithms• Dedicated memory for each particle to accumulate forces
Why Specialized Hardware?Memory
Communication
Computation
Needs
![Page 17: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/17.jpg)
ANTON Strategy
New architectures– Design a specialized machine– Enormously parallel architecture– Based on special-purpose ASICs– Dramatically faster for MD, but less flexible– Projected completion: 2008
New algorithms– Applicable to
• Conventional clusters• Our own machine
– Scale to very large # of processing elements
![Page 18: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/18.jpg)
Interdisciplinary Lab
Computational Chemists and Biologists
Computer Scientists and Applied Mathematicians
Computer Architects and Engineers
![Page 19: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/19.jpg)
Alternative Machine Architectures
Conventional cluster of commodity processors
General-purpose scientific supercomputer
Special-purpose molecular dynamics machine
![Page 20: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/20.jpg)
Conventional Cluster of Commodity Processors
Strengths:
– Flexibility
– Mass market economies of scale
Limitations
– Doesn’t exploit special features of the problem
– Communication bottlenecks
• Between processor and memory
• Among processors
– Insufficient arithmetic power
![Page 21: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/21.jpg)
General-Purpose Scientific Supercomputer
E.g., IBM Blue Gene
More demanding goal than ours– General-purpose scientific supercomputing– Fast for wide range of applications
Strengths:– Flexibility– Ease of programmability
Limitations for MD simulations– Expensive– Still not fast enough for our purposes
![Page 22: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/22.jpg)
Anton: Special-Purpose MD Machine
Strengths:
– Several orders of magnitude faster for MD
– Excellent cost/performance characteristics
Limitations:
– Not designed for other scientific applications
• They’d be difficult to program
• Still wouldn’t be especially fast
– Limited flexibility
![Page 23: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/23.jpg)
Anton System-Level Organization
Multiple segments (probably 8 in first machine)
512 nodes (each consists of one ASIC plus DRAM) per segment– Organized in an 8 x 8 x 8 toroidal mesh
Each ASIC equivalent performance to roughly 500 general purpose microprocessors– ASIC power similar to a single
microprocessor
![Page 24: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/24.jpg)
3D Torus Network
![Page 25: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/25.jpg)
Why a 3D Torus?
Topology reflects physical space being simulated:– Three-dimensional nearest neighbor
connections– Periodic boundary conditions
Bulk of communications is to near neighbors– No switching to reach immediate neighbors
![Page 26: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/26.jpg)
Source of Speedup on Our Machine
Judicious use of arithmetic specialization
– Flexibility, programmability only where needed
– Elsewhere, hardware tailored for speed• Tables and parameters, but not
programmable
Carefully choreographed communication
– Data flows to just where it’s needed
– Almost never need to access off-chip memory
![Page 27: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/27.jpg)
Two Subsystems on Each ASIC
SpecializedSubsystem
FlexibleSubsystem
Programmable, general-purpose
Efficient geometric operations
Modest clock rate
Pairwise point interactions
Enormously parallel
Aggressive clock rate
![Page 28: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/28.jpg)
28
Anton• 33M gate ASIC• Two computational
subsystems connected by communication ring
• Hardware datapaths compute over 25 billion interactions/s
• Full machine has 512 ASICs in a 3D torus
• 13 embedded processors
Flexible Subsystem
High-Throughput Interaction Subsystem
Me
mo
ry C
on
tro
ller
Host Interface
Channel Channel Channel
Ch
an
ne
l C
ha
nn
el
Ch
an
ne
l
Ro
ute
r
Router Router
Ro
ute
r Memory Controller
Ro
ute
r
Router
DRAM
DR
AM
+Z
+Y +X
-Z
-X
-Y Host Processor
Communication Ring
![Page 29: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/29.jpg)
Where We Use Specialized Hardware
Specialized hardware (with tables, parameters) where:
Inner loop
Simple, regular algorithmic structure
Unlikely to change
Examples:
Electrostatic forces
Van der Waals interactions
![Page 30: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/30.jpg)
Example: Particle Interaction Pipeline (one of 32)
![Page 31: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/31.jpg)
– Executes Non-bonded MD interaction calculations (Charge Spreading & Force Interpolation)
– Accumulates forces on each particle as data streams through.– ICB Controls flow of data through the HTIS, programmable ISA
extensions, acts as a buffering, pre-fetching, synchronization, and write back
controller
High-Throughput Interaction Subsystem
![Page 32: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/32.jpg)
Array of 32 Particle Interaction Pipelines
![Page 33: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/33.jpg)
Advantages of Particle Interaction Pipelines
Save area that would have been allocated to
– Cache
– Control logic
– Wires
Achieve extremely high arithmetic density
Save time that would have been spent on
– Cache misses,
– Load/store instructions
– Misc. data shuffling
![Page 34: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/34.jpg)
Where We Use Flexible Hardware
– Use programmable hardware where:• Algorithm less regular• Smaller % of total computation
- E.g., local interactions (fewer of them)• More likely to change
– Examples:• Bonded interactions• Bond length constraints• Experimentation with
- New, short-range force field terms- Alternative integration techniques
![Page 35: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/35.jpg)
Forms of Parallelism in Flexible Subsystem
The Flexible Subsystem exploits three forms of parallelism:
– Multi-core parallelism (4 Tensilicas, 8 Geometry Cores)
– Instruction-level parallelism
– SIMD parallelism – calculate on 3D and 4D vectors as single operation
![Page 36: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/36.jpg)
Overview of the Flexible Subsystem
GC = Geometry Core
(each a VLIW processor)
![Page 37: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/37.jpg)
Geometry Core(one of 8; 64 pipelined lanes/chip)
+
X
+ + + +
InstructionMemory
Decode
FromTensilica
Core
X X X
X Y Z W
PC
+
X
+ + + +
X X X
X Y Z W
DataMemory
f f f f f f f f
![Page 38: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/38.jpg)
But Communication is Still a Bottleneck
Scalability limited by inter-chip communication
To execute a single millisecond-scale simulation,
– Need a huge number of processing elements
– Must dramatically reduce amount of data transferred between these processing elements
Can’t do this without fundamentally new algorithms:
– A family of Neutral Territory (NT) methods that reduce pair interaction communication load significantly
– A new variant of Ewald distant method, Gaussian Split Ewald (GSE) which simplifies calculation and communication for distant interactions
– These are the subject of a different talk.
![Page 39: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/39.jpg)
39
Anton in Action
![Page 40: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/40.jpg)
• 500X NAMD 80-100X Desmond 100X Blue Matter
Simulation Evaluations
![Page 41: ANTON D.E Shaw Research. Force Fields: Typical Energy Functions Bond stretches Angle bending Torsional rotation Improper torsion (sp2) Electrostatic interaction](https://reader035.vdocument.in/reader035/viewer/2022062421/56649e445503460f94b380a4/html5/thumbnails/41.jpg)
GPU+FPGA ???
GPU
6*GDDR5
FPGA
HIGH SPEED SERIAL I/O
UP TO 2 Tbit/S
LVDS
FFT and LJ
16*PCIe