simulation of lattice qcd with a gpu cluster

49
Simulation of QCD with a GPU cluster GPU Computing Seminar January 22, 2010 Ting-Wai Chiu Physics Department and Center for Quantum Science and Engineering National Taiwan University, Taipei, Taiwan (趙挺偉)

Upload: others

Post on 26-Jan-2022

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Simulation of Lattice QCD with a GPU Cluster

Simulation of QCD with a GPU cluster

GPU Computing Seminar

January 22, 2010

Ting-Wai Chiu

Physics Department and

Center for Quantum Science and Engineering

National Taiwan University, Taipei, Taiwan

(趙挺偉)

Page 2: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

2

High Energy Physics

Goal: To understand the fundamental constituents of matter

and their fundamental interactions

Fundamental Interactions: Gravitational Force

Electromagnetic Force

Weak Interaction

Strong Interaction

Fundamental Particles: Atom Electron, Nucleus

Proton, Neutron

Quarks

Page 3: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

3

Page 4: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

4

Fundamental Interactions

Page 5: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

5

Fundamental Particles

Page 6: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

6

Page 7: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

7

Large Hadron Collider, CERN, Switzerland

HEP Experiments Huge Accelerator

Page 8: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

8

Tunnel of the Large Hadron Collider

The tunnel is buried around 50 to 175 m. underground

Page 9: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

9

ATLAS Detector

Page 10: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

10

LHC experiments will produce roughly 15 petabytes of data annually.

CERN is collaborating with institutions in 33 different countries (including

Taiwan) to operate a distributed computing and data storage infrastructure

known as the LHC Computing Grid

Page 11: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

11

ABSTRACT

Quantum Chromodynamics (QCD) is the quantum field

theory of the strong interaction, describing the interactions of

the quarks and gluons making up hadrons (e.g., proton,

neutron, and pion).

Quantum Chromodynamics (QCD) accounts for the

nuclear energy inside an atom, and plays an important role in

the evolution of the early universe.

To solve QCD is a grand challenge among all sciences.

The most promising approach to solve QCD nonperturbatively

is to discretize the continuum space-time into a 4 dimensional

lattice (lattice QCD), and to compute physical observables

with Monte Carlo simulation.

Page 12: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

12

ABSTRACT(cont)

To solve QCD is a grand challenge among all sciences.

The most promising approach to solve QCD nonperturbatively

is to discretize the continuum space-time into a 4 dimensional

lattice (lattice QCD), and to compute physical observables

with Monte Carlo simulation.

Page 13: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

13

For lattice QCD with exact chiral symmetry,

it often requires supercomputers (e.g., 10 racks of IBM

BlueGene supercomputer) to perform the simulations.

ABSTRACT (cont)

10 racks of IBM BG/L at KEK, Japan (57 TFLOPS peak)

Page 14: Simulation of Lattice QCD with a GPU Cluster

TWQCD Collaboration is the first lattice QCD group

around the world to use a GPU cluster (128 TFLOPS) to

perform large-scale unquenched simulations of lattice

QCD with exact chiral symmetry.

ABSTRACT (cont)

14T.W. Chiu, GPU Computing Seminar, January 22, 2010

Page 15: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Introduction

Optimal Domain-Wall Fermion

HMC Simulation with ODWF

GPU Strategies

Research Highlights

Conclusion and Outlook

Outlines

15

Page 16: Simulation of Lattice QCD with a GPU Cluster

is the quantum field theory for the strong interaction,

describing the interactions of the quarks and gluons making

up hadrons (e.g. proton, neutron, and pion). It accoun

nu

ts for

the clear energy

CQ D

evolution of the early u

inside the nucleus of an atom, and plays

an important role in the .

:

Gauge group gluons have self-interactions.

Asymptotic freed

(3)

( ) 0 a

n

om:

i

s

verse

Salient features

SU

g r r

15

.

IR slavory: quark/color confinement

No exact

0

( ) 1 a

analytic solu

t 10 m

tions

g r r

Quantum Chromodynamics (QCD)

16T.W. Chiu, GPU Computing Seminar, January 22, 2010

Page 17: Simulation of Lattice QCD with a GPU Cluster

Quarks

Quarks are spin fermions carrying color,

and there are 6 species (flavors) of quarks.

1

2

u c t

d s b

u c t

d s b

u c t

d s b

Hadrons are color singlets composed of quarks

antisym. in color, ProtonuudP

antisym. in color, NeutronuddN

, Piond uudu d

The nuclear force between nucleons emerges as

residual interactions of QCD

17T.W. Chiu, GPU Computing Seminar, January 22, 2010

Page 18: Simulation of Lattice QCD with a GPU Cluster

The Challenge of QCD

At the hadronic scale, , perturbation theory isincapable to extract any quantities from QCD, nor to tacklethe most interesting physics, namely, the spontaneouslychiral symmetry breaking and the color confinement

1g r

To extract any physical quantities from the first principlesof QCD, one has to solve QCD nonperturbatively.

A viable nonperturbative formulation of QCD was firstproposed by K. G. Wilson in 1974.

But, the problem of lattice fermion, and how to formulate exact chiral symmetry on the lattice had not been resolved until 1992-98 [Kaplan, Neuberger, Narayanan,…],

i.e., Lattice QCD with exact chiral symmetry.

18T.W. Chiu, GPU Computing Seminar,

January 22, 2010

Page 19: Simulation of Lattice QCD with a GPU Cluster

Basic notions of Lattice QCD

1. Perform Wick rotation: , then ,and the expectation value of any observable O

4t ix exp( ) exp( )EiS S

1

, , ESO dA d d O A eZ

ESZ dA d d e

2. Discretize the space-time as a 4-d lattice with latticespacing a. Then the path integral in QFT becomes a well-definedmultiple integral which can be evaluated via Monte Carlo

44L Na

1

, , E

i j

i j

S

k

k

eO dA d d O AZ

19

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Page 20: Simulation of Lattice QCD with a GPU Cluster

Gluon fields on the Lattice

Then the gluon action on the lattice can be written as

where

ˆx a ˆˆx a a

x ˆx a

The color gluon field are defined on each linkconnecting and , through the link variable

A x 3SU

x ˆx a

ˆexp2

aU x iagA x

4

2plaquette

6 1 11 Re 0

3 2g pS U tr U a d x tr F x F x

g

† †ˆˆpU U x U x a U x a U x

20T.W. Chiu, GPU Computing Seminar, January 22, 2010

Page 21: Simulation of Lattice QCD with a GPU Cluster

Lattice QCD

,

The QCD action

where is the action of the gluon fields

( ) ( )

( )

( ) ( )

, , , , ,

fl v a r o

G

G

f f

a x f a x b y b y

S S U D U

S U

D U D U

f u d s c b t

sites

, 1,2,3

, =1,

index

color index

2,3,4

, 1

Dirac

, , =

inde

s

x

x y z t

a b

x y N N N N N

3

1

16 32

1,572,864 1,572,864

ite index

For example, on the lattice, is a complex matrix

of size

det( )

d

( , , ) ( , )( ,

et( ), )

G

G

SS

SS

dUd d U e dU D U eU

dUd d e d

D

U D e

D

21T.W. Chiu, GPU Computing Seminar,

January 22, 2010

Page 22: Simulation of Lattice QCD with a GPU Cluster

The Challenge of Lattice QCD

So far, the lightest u/d quark cannot be put on the lattice.

Rely on ChPT to extrapolate lattice results to physical ones.

To have lattice volume large enough such tha t 1.m L

l To have attice spacing small enough such that 1.qm a

3

The lattice size should be at least .

The computing power should be at least

To meet the above two conditions:

100 200

Petaflops year

22T.W. Chiu, GPU Computing Seminar,

January 22, 2010

Page 23: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

23

The Hardware

Page 24: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

One unit (1U) of Tesla S1070

4 GPUs, each with 4 GB

# of Tesla GPUs 4

# of Streaming Processor

Cores960 (240 per processor)

Frequency of processor

cores1.296 to 1.44 GHz

Single Precision floating

point performance (peak)3.73 to 4.14 TFlops

Double Precision floating

point performance (peak)311 to 345 GFlops

Floating Point Precision IEEE 754 single & double

Total Dedicated Memory 16GB

Memory Interface 512-bit

Memory Bandwidth 408GB/sec

Max Power Consumption 800 W

System Interface PCIe x16 or x8

Software Development

ToolsC-based CUDA Toolkit

24

Page 25: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Two GTX285 on one motherboard

25

Page 26: Simulation of Lattice QCD with a GPU Cluster

The Hardware (cont)

16 units of Nvidia Tesla S1070 (total 64 GPUs, 64 x 4 GB)

connected to 16 servers (total 32 Intel QC Xeon, 16 x 32 GB)

Peak performance is 128 TFLOPS

Attaining 15 TFLOPS (sustained) with a price $220,000.

Developed efficient CUDA codes for unquenched lattice QCD.

64 Nvidia GTX285 (total 64 GPUs, 64 x 2/1 GB), connected to

32 servers (total 32 Intel i7, 32 x 12 GB)

Hard disk storage > 300 TB, Lustre cluster file system

26T.W. Chiu, GPU Computing Seminar,

January 22, 2010

140/100 Gflops per each GTX285/T10

Page 27: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Optimal Domain-Wall Fermion

[ TWC, Phys. Rev. Lett. 90 (2003) 071601 ]

, , , 1 , 1 ,, ,, 1 ,

odwf

odwf

sN

x s w s s w s s s s x sx x x xs x

s s

s x

I D I D P PA

D

4

0 01

, 0,2wD t W m m

†, ,

1, ( )

2x x x xt x x U x U x

4

†, , ,

1

1, 2

2x x x x x xW x x U x U x

0

,0 ,2

q

s

mP x P x N

m

with boundary conditions

0

, 1 ,12

q

s

mP x N P x

m

27

Page 28: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

The weights are fixed such that the effective 4D Dirac operator

possesses the optimal chiral symmetry,

s

2 2

min

11 ; , 1, ,s s ssn v s N

where is the Jacobian elliptic function with argument

and modulus ,

and are lower and upper bounds of the eigenvalues of

;ssn v sv2 2

min max1

Optimal Domain-Wall Fermion (cont.)

The action for Pauli-Villars fields is similar to

, ,, , 1 , 1, ,, 1 ,

s

x s x s

N

PV w s s w s s s sx x x xs

s ss x x

A I D I D P P

but with boundary conditions:

2

min 2

max2

wH

odwfA

PV 0,0 , , 2 sP x P x N m m

, 1 ,1sP x N P x

28

Page 29: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

odwf PVexp det qd d d d A A D m

Optimal Domain-Wall Fermion (cont.)

0 52 1q q q opt wD m m m m S H

, 2

1, 2

, 2 1

, 2

w

n n

Z w s

opt w n n

w Z w s

H R H N nS H

H R H N n

Zolotarev optimal rational approx. for2

1

wH

(1847 – 1878)

(1877)

The effective 4D Dirac operator

29

Page 30: Simulation of Lattice QCD with a GPU Cluster

( , )1 n m

ZxR x

log x

The salient feature of optimal rational approximation

Has alternate

change of sign in ,

and attains its max. and min.

(all with equal magnitude)

( 2)n m min max,x x

[1,1000]

In the figure, 6,

it has 14 alternate change

of sign in

n m

T.W. Chiu, GPU Computing Seminar, January 22, 2010

30

Page 31: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Even-Odd Preconditioning of ODWF Matrix

, , 1 , 1, ,, ; ,odwf

=

s w s s s w s s s sx x x xx s x s

eo

w

oe

w

I D I D P P

X D Y

D Y X

D

, , 1 , 1s s s s s sL P P

with boundary conditions:, ,1

0

1, ,

0

2

2

s

s

q

N s s

q

s s N

mP L P

m

mP L P

m

, ,

, 0 , ,

( )

(4 ) ( ) ( )

s s s s s

s s s s s s s

Y I L

X m I L I L

31

Page 32: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Even-Odd Preconditioning of ODWF Matrix (cont)

For 2-flavor QCD, the pseudofermion action is

1 1

odwfdet det( ) detoe eo

w wI D YX D YX CD

1 1oe eo

w wC I D YX D YX

† † † 1( )PV PVPF C CC CA

1

1 1

0 0

0 0

eo eo

w w

oe oe eooew w ww

I XX D Y I X D Y

D YX I X D YX D YD Y X I

Schur complement

0( 2 )PV qC C m m

32

Page 33: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Hybrid Monte Carlo (HMC) for 2 flavor QCD

33

† † † 1

2

1. Initial gauge configuration

2. Generate with probability distribution

3. Generate with probability distributi exp( )

R

{ }

{ } exp[ ( ) / 2]

ecall: exp[ ( )

n

]

o

=PV

l

PV

l

a a

l

U

P P

C CC C

1

1†

exp[ ]

Omelyan integrator with multiple-time scale

4. Fixing the pseudofermion field

5. Molecular dynamics (

( ) ( ( ))

)

th e most expensive part of HMC

(

)

l

PV

DD U

U

C C D

† †

gauge

( ) ( ), ( ) ( )

( ) ( ) ( ) (

6. Accept with the probab

)

ility

7. Go t

{ }

( )

min 1,

o

ex

2

p( )

.

a a

l l l l

a a a

l l

l

l

AU P

iP U P

H

P T

P D A U D DD U

H

Page 34: Simulation of Lattice QCD with a GPU Cluster

†, Ax b A CC

, ,

, ,

k k k k

k

k k k k

r r r r

p Ap Cp Cp

Conjugate Gradient algorithm †( )CC

1k k k kx x p

1k k k kr r Ap

1 1,

,

k k

k

k k

r r

r r

1 1k k k kp r p

0 0 0 0 00, , x r b Ax p r

1If | | | |, then stopkr b

T.W. Chiu, GPU Computing Seminar, January 22, 2010

PVC †

PVC

34

The most time-consuming operations

are the matrix-vector multiplications:

(

GPU computes in SP D

)

P

kC C p

Page 35: Simulation of Lattice QCD with a GPU Cluster

CG Algorithm with Mixed Precision

1

1

1

1 1

1.

2. If | | | |, then stop

3.

4.

5. Go to 1.

Pr

Let

the

Solve in

| | |

single precision to an accuracy 1

,

oo

|

f of convergence:

|

|n

|

|

k k

k

k

k k

k

k k

k k k k

k k

At r

r b Ax

r b

x x

s r A s r

r b A

t

x

t

1| | | | | | | |k kkk kb Ax A s rt r

T.W. Chiu, GPU Computing Seminar, January 22, 2010

35

Page 36: Simulation of Lattice QCD with a GPU Cluster

36

Hardware Model

T10/ GTX285

Global memory: 4/1/2 Gbytes

# of multiprocessors: 30

# of cores: 240

Constant memory: 65536 bytes

Shared memory/block: 16384 bytes

# of registers/block: 16384

Warp size: 32

Max. # threads/block: 512

Block-size limits x,y,z: 512, 512, 64

Grid-size limits in x,y: 65535, 65535

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Page 37: Simulation of Lattice QCD with a GPU Cluster

CUDA Programming Model

A kernel is executed by a grid

of thread blocks

A thread block is a set of threads that can cooperate with each other by:

Sharing data through shared memory

Synchronizing their execution

Threads from different blocks cannot cooperate

Host

Kernel

1

Kernel

2

Device

Grid 1

Block

(0, 0)

Block

(1, 0)

Block

(2, 0)

Block

(0, 1)

Block

(1, 1)

Block

(2, 1)

Grid 2

Block (1, 1)

Thread

(0, 1)

Thread

(1, 1)

Thread

(2, 1)

Thread

(3, 1)

Thread

(4, 1)

Thread

(0, 2)

Thread

(1, 2)

Thread

(2, 2)

Thread

(3, 2)

Thread

(4, 2)

Thread

(0, 0)

Thread

(1, 0)

Thread

(2, 0)

Thread

(3, 0)

Thread

(4, 0)

T.W. Chiu, GPU Computing Seminar, January 22, 2010

37

CPU GPU

Page 38: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

1 1oe eo

w wC I D YX D YX

Features of CG Kernel for ODWF

†( )CC

• 2-dim Block

One thread takes care of all computations at each

with a loop going over all (even/odd).

( , , , )s x y z

t3

s xX YNthread Nthread Nblock N N

16 (1,2,...)

64

sX

X Y

Nthread N

Nthread Nthread

• 1-dim Grid

38

Page 39: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

• in constant memory space 1M YX

39

Tuning the CG Kernel for ODWF

• Use texture memory for link variables and vectors

• Reorder data in the device memory

(coalescing; Ns threads share the same link variables)

• Reuse forward/backward data (in t) for neighboring sites

• Unroll short loops

• …

Attaining 140/120/100 Gflops for GTX285/GTX280/T10

Page 40: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

Salient Features of TWQCD’s HMC Simulation

Even-Odd Preconditioning for the 4D Quark Matrix

Conjugate Gradient with Mixed Precision

HMC with Multiple Time Scale Integration

and Mass Preconditioning

Omelyan Integrator for the Molecular Dynamics

New Algorithm for (2+1)-flavor QCD

The First Large-Scale Simulation of Unquenched Lattice

QCD with Exact Chiral Symmetry using a GPU Cluster.

Chiral Symmetry is preserved exactly with ODWF

All Topological Sectors are sampled ergodically.

40

Page 41: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

41

The First Large-Scale Dynamical Lattice QCD

Simulation with a GPU Cluster

The lattice QCD group (TWQCD) based at NTU

is the first group around the world to use a GPU cluster

to perform large-scale simulations of lattice QCD with

exact chiral symmetry. [T.W. Chiu et al., PoS (LAT2009)

034, arXiv:0911.0529].

Currently, there are only 3 groups (RBC/UKQCD,

JLQCD, and TWQCD) around the world who are capable

to perform large scale simulation of lattice QCD with

exact chiral symmetry.

Technical Breakthrough

Page 42: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

42

Research Highlights

Zero Temperature QCD ( 2)fN

ChPT to NLO2

3 2

412 1 ln ,

2 (4 )

q

q

BmmB x x c x x

m f

Page 43: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

43

Research Highlights (cont)

4Fitting data to 1 ln 0.0 1 6 9 ( )f f x x c x f a

1Using as in131 MeV p 1.45(9u ) GeVtf a

Page 44: Simulation of Lattice QCD with a GPU Cluster

• The vacuum (ground state) of QCD constitutes various

quantum fluctuations, which are the origin of many

interesting and important nonperturbative physics

• In QCD, each gauge configuration possesses a

well-defined topological charge Q with integer value.

• It is important to determine the topological charge

fluctuations in the QCD vacuum. One of the

outstanding problems is to identify the vacuum

fluctuations which are crucial to the mechanism of

color confinementT.W. Chiu, GPU Computing Seminar,

January 22, 201044

Research Highlights (cont)

Topological structure of the QCD vacuum

Page 45: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

45

Quantum fluctuations in the QCD vacuum

1tQ

244 10 sect

316 32

16

33 15

1.4 10 m

2.2 10 m

x a

L

Page 46: Simulation of Lattice QCD with a GPU Cluster

46

Finite Temperature QCD

The nature of finite temperature phase transition in QCD is

relevant to the evolution of the early universe from the

quark-gluon plasma to the hadrons.

However this issue has remained controversial for many

years. The Wuppertal group finds the deconfining transition

[170(7) MeV] and the chiral transition [146(5) MeV] are at

widely separated temperatures. On the other hand,the

Brookhaven/Bielefeld collaboration claims that both

temperatures coincide at 196(3) MeV. Since both groups

used the staggered fermion, it is vital to resolve this issue in

the framework of lattice QCD with exact chiral symmetry.

We are in a good position to resolve this problem.

Research Highlights (cont)

Page 47: Simulation of Lattice QCD with a GPU Cluster

47

Page 48: Simulation of Lattice QCD with a GPU Cluster

Time Era Temperature Characteristics of the Universe

0 to 10-43 s Big Bang infiniteinfinitely small, infinitely densePrimeval fireball 1 force in nature - Supergravity

10-43 s Planck Time 1032 KEarliest known time that can be described by modern physics 2 forces in nature, gravity, GUT

10-35 s End of GUT 1027 K

3 forces in nature, gravity, strong nuclear, electroweak Quarks and leptons form (along with their anti-particles)

10-35 to 10-33 s Inflation 1027 KSize of the Universe drastically increased, by factor of 1030to 1040

10-12 s End of unified forces 1015 K4 fundamental forces in nature,protons and neutrons start forming from quarks

10-7 s Heavy Particle 1014 Kproton, neutron production in full swing

10-4 s Light particle 1012 K electrons and positrons form100 s (a few minutes)

Nucleosynthesis era 109 - 107 Khelium, deuterium, and a few other elements form

380,000 years Recombination (Decoupling) 3000 KMatter and radiation seperateEnd of radiation domination, start of matter domination of the Universe

500 million yrs Galaxy formation 10 Kgalaxies and other large structures form in the universe

14 billion years or so

Now 3 KYou are reading this table, that's what's happening.

Big Bang Timeline

Page 49: Simulation of Lattice QCD with a GPU Cluster

T.W. Chiu, GPU Computing Seminar, January 22, 2010

49

Conclusion and Outlook

GPU has emerged as a revolutionary device for

lattice QCD as well as any computational science.

It is crucial to TWQCD’s dynamical DWF project.

Currently, there are many lattice QCD groups

around the world building large GPU clusters

dedicated to QCD. This will have significant

impacts to lattice QCD, leading to new

discoveries in the strong interaction physics.