ICISA 2010 conference presentation



Introduction Stable Fluids NVIDIA Compute Unified Device Architecture (CUDA) Results Conclusions

CUDA-based Linear Solvers for Stable Fluids

G. Amador and A. Gomes

Departamento de Informática, Universidade da Beira Interior

Covilhã, Portugal

[email protected], [email protected]

April, 2010


1 Introduction
2 Stable Fluids
    The Eulerian approach
    Physics Model
3 NVIDIA Compute Unified Device Architecture (CUDA)
    Workflow
    Iterative solvers
        Jacobi
        Gauss-Seidel red-black
        Conjugate gradient
4 Results
    Jacobi performance
    Gauss-Seidel performance
    Conjugate gradient performance
5 Conclusions
    Conclusions
    Future Work


Overview

The study of fluid simulation (e.g., water) is important for two industries:

(real-time ≥ 30 fps) (off-line ≤ 30 fps)

Problems:
    How to implement (specifically for 3D stable fluids) the CUDA-based versions of the Jacobi, Gauss-Seidel, and conjugate gradient iterative solvers?
    What are the real-time performance limitations of these solver implementations?


The Eulerian approach

Space partitioning:

Variations of velocity and density are observed at the center of each cell.
Velocities and densities are updated through an implicit method (Stam stable fluids, 1999), i.e., unconditionally stable for any time step.
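As a minimal sketch of how such a cell-centred grid might be stored, the snippet below keeps one value per cell in a flat array, with one extra layer of boundary cells on each side. The names `make_grid` and `idx` and the (n+2)^3 layout are assumptions for illustration; the slides do not show the actual data layout.

```python
# Hypothetical helpers: a flat array with one boundary layer per side is a
# common layout for grid-based fluid solvers, assumed here for illustration.

def make_grid(n, value=0.0):
    """Flat list holding (n+2)^3 cell-centred samples (interior plus boundary)."""
    return [value] * ((n + 2) ** 3)

def idx(n, i, j, k):
    """Map 3D cell coordinates (each in 0..n+1) to a flat array offset."""
    return i + (n + 2) * (j + (n + 2) * k)

n = 4
density = make_grid(n)
density[idx(n, 2, 3, 1)] = 1.0  # density sampled at the centre of cell (2, 3, 1)
```

A flat array of this kind maps directly onto a single block of GPU global memory, which is one reason the layout is a plausible choice for a CUDA port.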


Physics Model

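Navier-Stokes equations for incompressible fluids

Mass conservation: $\nabla \cdot \vec{u} = 0$

Velocity evolution: $\frac{\partial \vec{u}}{\partial t} = -(\vec{u} \cdot \nabla)\vec{u} + v\nabla^2\vec{u} + \vec{f}$

Density evolution: $\frac{\partial \rho}{\partial t} = -(\vec{u} \cdot \nabla)\rho + k\nabla^2\rho + S$

$\vec{u}$: velocity field.
$v$: fluid's viscosity.
$\rho$: density of the field.
$k$: density diffusion rate.
$\vec{f}$: external forces added to the velocity field.
$S$: external sources added to the density field.
$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$: gradient.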


Navier-Stokes equations implementation

Update velocity:
    Add external forces ($\vec{f}$).
    Velocity diffusion ($v\nabla^2\vec{u}$).
    Move ($-(\vec{u} \cdot \nabla)\vec{u}$ and $\nabla \cdot \vec{u} = 0$).

Update density:
    Add external sources ($S$).
    Density advection ($-(\vec{u} \cdot \nabla)\rho$).
    Density diffusion ($k\nabla^2\rho$).


Diffusion

Exchanges of density or velocity between neighbours (2D).

Solve a sparse linear system (Ax = b), using an iterative method (e.g., Jacobi, Gauss-Seidel, conjugate gradient, etc.).
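To make the origin of the system Ax = b concrete, the following is a sketch of the usual implicit (backward Euler) discretisation of the density diffusion term on a uniform 3D grid. The cell size $h$, the time step $\Delta t$ and the coefficient $a$ are symbols assumed here; they do not appear on the slides.

```latex
% Implicit diffusion step: the new field x is unknown, the old field x^0 is given.
x - k\,\Delta t\,\nabla^2 x = x^0

% Seven-point discretisation of the Laplacian, with a = k\,\Delta t / h^2:
(1 + 6a)\,x_{i,j,k}
  - a\,\bigl( x_{i-1,j,k} + x_{i+1,j,k} + x_{i,j-1,k} + x_{i,j+1,k}
            + x_{i,j,k-1} + x_{i,j,k+1} \bigr) = x^0_{i,j,k}
```

Each row of A therefore has at most seven non-zero entries, which is why the system is sparse and why matrix-free iterative solvers are a natural fit. In 2D the same construction gives $(1 + 4a)$ on the diagonal and four neighbours.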


Move

Ensure mass conservation and the fluid's incompressibility.
Hodge decomposition: mass-conserving field = our field − gradient field.

Determine the gradient using diffusion's iterative method (e.g., Jacobi, Gauss-Seidel, conjugate gradient, etc.).
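The decomposition can be written out as follows. This is the standard projection step, sketched with assumed symbols: $\vec{w}$ for the velocity field before projection and $p$ for the scalar field whose gradient is removed.

```latex
% Hodge decomposition of the intermediate field w:
\vec{w} = \vec{u} + \nabla p, \qquad \nabla \cdot \vec{u} = 0

% Taking the divergence of both sides gives a Poisson equation for p:
\nabla^2 p = \nabla \cdot \vec{w}

% Once p is known, the mass-conserving field is recovered by subtraction:
\vec{u} = \vec{w} - \nabla p
```

Discretising the Poisson equation on the grid yields another sparse linear system, so the same iterative solver used for diffusion can be reused to find $p$.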


Workflow


Iterative solvers

Jacobi
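The slide's figure did not survive in this transcript. As a stand-in, here is a minimal CPU sketch in Python of a Jacobi iteration for a diffusion-type system (1 + 6a) x − a·(sum of six neighbours) = x0 on a 3D grid. It is not the authors' CUDA kernel: the function names, the (n+2)^3 layout with a fixed boundary layer, and the coefficient `a` are all assumptions.

```python
def idx(n, i, j, k):
    """Flat offset of cell (i, j, k) in an (n+2)^3 grid with one boundary layer."""
    return i + (n + 2) * (j + (n + 2) * k)

def neighbour_sum(n, x, i, j, k):
    """Sum of the six face neighbours of cell (i, j, k)."""
    return (x[idx(n, i - 1, j, k)] + x[idx(n, i + 1, j, k)] +
            x[idx(n, i, j - 1, k)] + x[idx(n, i, j + 1, k)] +
            x[idx(n, i, j, k - 1)] + x[idx(n, i, j, k + 1)])

def jacobi(n, x0, a, iterations):
    """Approximate the solution of (1 + 6a) x - a * neighbours = x0 (interior cells)."""
    x = list(x0)
    for _ in range(iterations):
        x_new = list(x)  # Jacobi reads only the previous iterate
        for k in range(1, n + 1):
            for j in range(1, n + 1):
                for i in range(1, n + 1):
                    c = idx(n, i, j, k)
                    x_new[c] = (x0[c] + a * neighbour_sum(n, x, i, j, k)) / (1 + 6 * a)
        x = x_new  # swap buffers
    return x

def max_residual(n, x0, a, x):
    """Largest |x0 - A x| over the interior cells."""
    return max(abs(x0[idx(n, i, j, k)]
                   - ((1 + 6 * a) * x[idx(n, i, j, k)] - a * neighbour_sum(n, x, i, j, k)))
               for k in range(1, n + 1)
               for j in range(1, n + 1)
               for i in range(1, n + 1))

n, a = 4, 0.5
x0 = [0.0] * (n + 2) ** 3
x0[idx(n, 2, 2, 2)] = 1.0  # a single unit source in an otherwise empty grid
x = jacobi(n, x0, a, 200)
```

Because every cell update reads only the previous iterate, all cells can be updated independently. On a GPU this typically maps to one thread per cell with two buffers that are swapped after each iteration.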


Gauss-Seidel red-black
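The slide's figure is missing from this transcript. The sketch below illustrates the red-black ordering on the CPU in Python, for a diffusion-type system (1 + 6a) x − a·(sum of six neighbours) = x0. Cells are coloured by the parity of i + j + k, and each sweep updates all cells of one colour in place before moving to the other. The function names, grid layout and coefficient `a` are assumptions, and this is not the authors' CUDA code.

```python
def idx(n, i, j, k):
    """Flat offset of cell (i, j, k) in an (n+2)^3 grid with one boundary layer."""
    return i + (n + 2) * (j + (n + 2) * k)

def neighbour_sum(n, x, i, j, k):
    """Sum of the six face neighbours of cell (i, j, k)."""
    return (x[idx(n, i - 1, j, k)] + x[idx(n, i + 1, j, k)] +
            x[idx(n, i, j - 1, k)] + x[idx(n, i, j + 1, k)] +
            x[idx(n, i, j, k - 1)] + x[idx(n, i, j, k + 1)])

def red_black_gauss_seidel(n, x0, a, iterations):
    """In-place relaxation of (1 + 6a) x - a * neighbours = x0, one colour at a time."""
    x = list(x0)
    for _ in range(iterations):
        for colour in (0, 1):  # 0: "red" cells, 1: "black" cells
            for k in range(1, n + 1):
                for j in range(1, n + 1):
                    for i in range(1, n + 1):
                        if (i + j + k) % 2 != colour:
                            continue
                        c = idx(n, i, j, k)
                        # All six neighbours have the opposite colour, so cells of
                        # the current colour never read each other's new values.
                        x[c] = (x0[c] + a * neighbour_sum(n, x, i, j, k)) / (1 + 6 * a)
    return x

n, a = 4, 0.5
x0 = [0.0] * (n + 2) ** 3
x0[idx(n, 2, 2, 2)] = 1.0
x = red_black_gauss_seidel(n, x0, a, 100)
```

Plain Gauss-Seidel updates cells sequentially, which does not parallelise directly. The colouring removes that dependency within each half-sweep, so a GPU version can process each colour in parallel, for example with one kernel launch per colour.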


Conjugate gradient
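The slide's figure is not available in this transcript. As an illustration only, here is a matrix-free conjugate gradient sketch in Python for a diffusion-type system, where A x at a cell is (1 + 6a) x − a·(sum of six neighbours). The function names, the (n+2)^3 layout, the coefficient `a`, and the requirement that `b` be zero on boundary cells are all assumptions; the authors' CUDA implementation may differ.

```python
def idx(n, i, j, k):
    """Flat offset of cell (i, j, k) in an (n+2)^3 grid with one boundary layer."""
    return i + (n + 2) * (j + (n + 2) * k)

def apply_a(n, a, x):
    """Matrix-free product y = A x on interior cells; boundary entries stay zero."""
    y = [0.0] * len(x)
    for k in range(1, n + 1):
        for j in range(1, n + 1):
            for i in range(1, n + 1):
                s = (x[idx(n, i - 1, j, k)] + x[idx(n, i + 1, j, k)] +
                     x[idx(n, i, j - 1, k)] + x[idx(n, i, j + 1, k)] +
                     x[idx(n, i, j, k - 1)] + x[idx(n, i, j, k + 1)])
                y[idx(n, i, j, k)] = (1 + 6 * a) * x[idx(n, i, j, k)] - a * s
    return y

def dot(u, v):
    """Inner product of two flat vectors."""
    return sum(p * q for p, q in zip(u, v))

def conjugate_gradient(n, b, a, max_iterations, tolerance=1e-10):
    """Solve A x = b, assuming b is zero on the boundary layer."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rr = dot(r, r)
    for _ in range(max_iterations):
        if rr <= tolerance ** 2:
            break
        ap = apply_a(n, a, p)
        alpha = rr / dot(p, ap)                       # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr                            # direction update factor
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

n, a = 4, 0.5
b = [0.0] * (n + 2) ** 3
b[idx(n, 2, 2, 2)] = 1.0
x = conjugate_gradient(n, b, a, 100)
```

Each iteration needs one matrix-vector product, two inner products and three vector updates. The inner products are global reductions, which on a GPU involve extra passes over global memory compared with the purely local updates of Jacobi and Gauss-Seidel.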


Jacobi performance


Gauss-Seidel performance


Conjugate gradient performance


Conclusions

The CUDA-based implementation of the Gauss-Seidel solver allows more iterations than the CPU-based implementation; however, it converges two times slower.
The CUDA-based implementations of the Jacobi and Gauss-Seidel iterative solvers achieved better performance (i.e., faster processing time) than the CPU-based implementations.
The CUDA-based implementation of the conjugate gradient, for grid sizes greater than 64³, performs worse than the CPU-based version due to global memory latency.


Future Work

Search for ways, implementable using CUDA, to reduce global memory accesses (e.g., data structures, dynamic memory, etc.).
Implement CPU-based multi-core versions of the solvers and compare their performance with the CUDA-based versions.
Search for new solvers implementable using CUDA, with a better convergence rate than relaxation techniques (Jacobi and Gauss-Seidel) and without significant extra computational effort such as that incurred by the conjugate gradient.


Questions???