ICISA 2010 conference presentation
Introduction Stable Fluids NVIDIA Compute Unified Device Architecture (CUDA) Results Conclusions
CUDA-based Linear Solvers for Stable Fluids
G. Amador and A. Gomes
Departamento de Informatica, Universidade da Beira Interior
Covilha, Portugal
[email protected], [email protected]
April, 2010
1 Introduction
2 Stable Fluids
  The Eulerian approach
  Physics Model
3 NVIDIA Compute Unified Device Architecture (CUDA)
  Workflow
  Iterative solvers
    Jacobi
    Gauss-Seidel red-black
    Conjugate gradient
4 Results
  Jacobi performance
  Gauss-Seidel performance
  Conjugate gradient performance
5 Conclusions
  Conclusions
  Future Work
Overview
The study of fluid simulation (e.g., water) is important for two industries:
  (real-time ≥ 30 fps) (off-line ≤ 30 fps)
Problems:
  How to implement (specifically for 3D stable fluids) the CUDA-based versions of the Jacobi, Gauss-Seidel, and conjugate gradient iterative solvers?
  What are the real-time performance limitations of these solver implementations?
The Eulerian approach
Space partitioning:
  Variations of velocity and density are observed at the center of each cell.
  Velocities and densities are updated through an implicit method (Stam, stable fluids, 1999), i.e., unconditionally stable for any time step.
Physics Model
Navier-Stokes equations for incompressible fluids
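Mass conservation: $\nabla \cdot \vec{u} = 0$
Velocity evolution: $\frac{\partial \vec{u}}{\partial t} = -(\vec{u} \cdot \nabla)\vec{u} + v\nabla^{2}\vec{u} + \vec{f}$
Density evolution: $\frac{\partial \rho}{\partial t} = -(\vec{u} \cdot \nabla)\rho + k\nabla^{2}\rho + S$
$\vec{u}$: velocity field.
$v$: fluid's viscosity.
$\rho$: density of the field.
$k$: density diffusion rate.
$\vec{f}$: external forces added to the velocity field.
$S$: external sources added to the density field.
$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$: gradient.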
Navier-Stokes equations implementation
Update velocity:
  Add external forces ($\vec{f}$).
  Velocity diffusion ($v\nabla^{2}\vec{u}$).
  Move ($-(\vec{u} \cdot \nabla)\vec{u}$ and $\nabla \cdot \vec{u} = 0$).
Update density:
  Add external sources ($S$).
  Density advection ($-(\vec{u} \cdot \nabla)\rho$).
  Density diffusion ($k\nabla^{2}\rho$).
Diffusion
Exchanges of density or velocity between neighbours (2D).
Solve a sparse linear system (Ax = b), using an iterative method (e.g., Jacobi, Gauss-Seidel, conjugate gradient).
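
To make the origin of the sparse system concrete, the following is a sketch of how an implicit (backward Euler) diffusion step on a 2D grid leads to Ax = b. It assumes a uniform grid spacing $h$, a time step $\Delta t$, and the standard 5-point stencil; the symbols $h$, $\Delta t$, $a$ and the superscript $n$ (time level) are assumptions introduced here, not notation from the slides.

```latex
\begin{align*}
\frac{x^{n+1}_{i,j} - x^{n}_{i,j}}{\Delta t}
  &= k\,\frac{x^{n+1}_{i-1,j} + x^{n+1}_{i+1,j} + x^{n+1}_{i,j-1} + x^{n+1}_{i,j+1} - 4\,x^{n+1}_{i,j}}{h^{2}} \\[4pt]
(1 + 4a)\,x^{n+1}_{i,j}
  - a\left(x^{n+1}_{i-1,j} + x^{n+1}_{i+1,j} + x^{n+1}_{i,j-1} + x^{n+1}_{i,j+1}\right)
  &= x^{n}_{i,j},
  \qquad a = \frac{k\,\Delta t}{h^{2}}
\end{align*}
```

Each row of A therefore has at most five non-zero entries (seven in 3D), and the right-hand side b is the field from the previous time step.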
Move
Ensure mass conservation and the fluid’s incompressibility.
Hodge decomposition:
  Mass-conserving field = our field - gradient field
Determine the gradient using the same iterative method as for diffusion (e.g., Jacobi, Gauss-Seidel, conjugate gradient).
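
The projection described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the presenters' implementation: it works in 2D, uses periodic boundaries and centred differences, and solves the Poisson equation with plain Jacobi sweeps. The function names (`divergence`, `project`, `sample_field`) are hypothetical.

```python
import math


def divergence(u, v, h):
    """Centred-difference divergence of (u, v) on a periodic n x n grid."""
    n = len(u)
    return [[(u[(i + 1) % n][j] - u[(i - 1) % n][j]
              + v[i][(j + 1) % n] - v[i][(j - 1) % n]) / (2.0 * h)
             for j in range(n)] for i in range(n)]


def project(u, v, h, iterations=200):
    """Subtract the gradient of p from (u, v), where laplacian(p) = div(u, v)."""
    n = len(u)
    div = divergence(u, v, h)
    p = [[0.0] * n for _ in range(n)]
    for _ in range(iterations):
        # Jacobi sweep for the 5-point Poisson equation (assumed solver).
        p = [[(p[(i - 1) % n][j] + p[(i + 1) % n][j]
               + p[i][(j - 1) % n] + p[i][(j + 1) % n]
               - h * h * div[i][j]) / 4.0
              for j in range(n)] for i in range(n)]
    # Mass-conserving field = our field - gradient field.
    u_new = [[u[i][j] - (p[(i + 1) % n][j] - p[(i - 1) % n][j]) / (2.0 * h)
              for j in range(n)] for i in range(n)]
    v_new = [[v[i][j] - (p[i][(j + 1) % n] - p[i][(j - 1) % n]) / (2.0 * h)
              for j in range(n)] for i in range(n)]
    return u_new, v_new


def sample_field(n):
    """Hypothetical test field: u = sin(2*pi*x), v = 0 (strongly divergent)."""
    u = [[math.sin(2.0 * math.pi * i / n) for _ in range(n)] for i in range(n)]
    v = [[0.0] * n for _ in range(n)]
    return u, v
```

With this particular discretisation the divergence is strongly reduced rather than driven exactly to zero, because the centred-difference gradient and the 5-point Laplacian are not exact discrete counterparts of each other.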
Workflow
Iterative solvers
Jacobi
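
The slide content for this solver did not survive extraction, so the following is only a minimal CPU-side sketch of the Jacobi idea, written in Python rather than CUDA. It assumes a 2D implicit diffusion system of the form (1 + 4a) x[i][j] - a * (sum of four neighbours) = b[i][j], with cells outside the grid treated as zero; the coefficient `a` and the names `neighbour_sum`, `jacobi` and `max_residual` are hypothetical.

```python
def neighbour_sum(x, i, j):
    """Sum of the four neighbours of cell (i, j); cells outside the grid count as 0."""
    n = len(x)
    total = 0.0
    if i > 0:
        total += x[i - 1][j]
    if i < n - 1:
        total += x[i + 1][j]
    if j > 0:
        total += x[i][j - 1]
    if j < n - 1:
        total += x[i][j + 1]
    return total


def jacobi(b, a, iterations):
    """Approximate the solution of (1 + 4a) x - a * neighbour_sum(x) = b."""
    n = len(b)
    x = [row[:] for row in b]  # initial guess: the right-hand side
    for _ in range(iterations):
        # Every new value reads only the previous iterate, so all cells
        # could be updated in parallel (e.g., one GPU thread per cell).
        x = [[(b[i][j] + a * neighbour_sum(x, i, j)) / (1.0 + 4.0 * a)
              for j in range(n)] for i in range(n)]
    return x


def max_residual(x, b, a):
    """Largest absolute entry of A x - b, used to check convergence."""
    n = len(b)
    return max(abs((1.0 + 4.0 * a) * x[i][j] - a * neighbour_sum(x, i, j) - b[i][j])
               for i in range(n) for j in range(n))
```

Because a sweep never reads values written in the same sweep, Jacobi needs two copies of the grid, which is the price paid for its straightforward parallelism.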
Gauss-Seidel red-black
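
As a rough illustration of the red-black ordering (the original slide content is not available here), the Python sketch below colours the grid like a checkerboard and updates one colour at a time. With a 5-point stencil, cells of one colour depend only on cells of the other colour, so each colour pass could run in parallel. The model system, the coefficient `a`, and all function names are assumptions chosen for the example.

```python
def neighbour_sum(x, i, j):
    """Sum of the four neighbours of cell (i, j); cells outside the grid count as 0."""
    n = len(x)
    total = 0.0
    if i > 0:
        total += x[i - 1][j]
    if i < n - 1:
        total += x[i + 1][j]
    if j > 0:
        total += x[i][j - 1]
    if j < n - 1:
        total += x[i][j + 1]
    return total


def red_black_gauss_seidel(b, a, iterations):
    """Solve (1 + 4a) x - a * neighbour_sum(x) = b with red-black ordering."""
    n = len(b)
    x = [row[:] for row in b]  # initial guess: the right-hand side
    for _ in range(iterations):
        for colour in (0, 1):  # 0: "red" cells, 1: "black" cells
            for i in range(n):
                for j in range(n):
                    if (i + j) % 2 != colour:
                        continue
                    # In-place update: the neighbours all have the other
                    # colour, so the order within one pass does not matter.
                    x[i][j] = (b[i][j] + a * neighbour_sum(x, i, j)) / (1.0 + 4.0 * a)
    return x


def max_residual(x, b, a):
    """Largest absolute entry of A x - b, used to check convergence."""
    n = len(b)
    return max(abs((1.0 + 4.0 * a) * x[i][j] - a * neighbour_sum(x, i, j) - b[i][j])
               for i in range(n) for j in range(n))
```

Unlike Jacobi, the update is done in place, so a single copy of the grid suffices.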
Conjugate gradient
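
For orientation, here is a small matrix-free conjugate gradient sketch in Python; it is not the CUDA implementation from the talk. It assumes the same kind of symmetric positive-definite system as an implicit 2D diffusion step, (1 + 4a) x - a * (sum of four neighbours) = b on an n x n grid stored as a flat list, with out-of-grid cells treated as zero. The names `apply_a`, `dot` and `conjugate_gradient` are hypothetical.

```python
import math


def apply_a(x, n, a):
    """Return A x for the assumed 5-point diffusion operator on an n x n grid."""
    y = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            s = 0.0
            if i > 0:
                s += x[(i - 1) * n + j]
            if i < n - 1:
                s += x[(i + 1) * n + j]
            if j > 0:
                s += x[i * n + j - 1]
            if j < n - 1:
                s += x[i * n + j + 1]
            y[i * n + j] = (1.0 + 4.0 * a) * x[i * n + j] - a * s
    return y


def dot(p, q):
    """Inner product; on a GPU this would be a parallel reduction."""
    return sum(pi * qi for pi, qi in zip(p, q))


def conjugate_gradient(b, n, a, tol=1e-10, max_iter=200):
    """Solve A x = b, starting from x = 0, until the residual norm drops below tol."""
    x = [0.0] * len(b)
    r = b[:]          # residual b - A x for x = 0
    p = r[:]          # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rr) < tol:
            break
        ap = apply_a(p, n, a)
        alpha = rr / dot(p, ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

Each iteration needs one operator application, two inner products and three vector updates, all of which read and write whole vectors.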
Jacobi performance
Gauss-Seidel performance
Conjugate gradient performance
Conclusions
The CUDA-based implementation of the Gauss-Seidel solver allows more iterations than the CPU-based implementation; however, it converges two times slower.
The CUDA-based implementations of the Jacobi and Gauss-Seidel iterative solvers achieved better performance (i.e., shorter processing time) than the CPU-based implementations.
The CUDA-based implementation of the conjugate gradient performs worse than the CPU-based version for grid sizes larger than $64^{3}$, due to global memory latency.
Future Work
Search for ways, implementable in CUDA, to reduce global memory accesses (e.g., data structures, dynamic memory).
Implement multi-core CPU-based versions of the solvers and compare their performance with the CUDA-based versions.
Search for new solvers, implementable in CUDA, with a better convergence rate than relaxation techniques (Jacobi and Gauss-Seidel) and with no significant extra computational effort, such as the conjugate gradient requires.