GPU based Global Illumination of Point Models using FMM


Post on 22-Feb-2016

[Equation: kernel factorization written as a triple sum (over n from 0 to infinity, j, and m) of spherical-harmonic terms Y in r_x and r_y; the full expression is not recoverable from the extracted text.]

Rhushabh Goradia, Prekshu Ajmera, Sharat Chandran | ViGIL, IIT Bombay

{rhushabh, prekshu, sharat}@cse.iitb.ac.in

Motivation

Scans of cultural heritage structures involve millions of points.

Seamless, geometrically consistent mesh creation is virtually impossible because it is difficult to automatically segment the different objects.

We wish to preserve these structures in a virtual museum and view them under novel, complex lighting conditions.

Problem Statement

Capture radiosity-based inter-reflection effects in a scene when the inputs are point samples of hard-to-segment entities.

Exploit the inherent parallelism in the proposed solution using CUDA-based GPU techniques to achieve fast solutions.

Global Illumination and Visibility

Computing visibility between a point-pair is difficult due to lack of intermediate surface information

Contributions

A view-independent visibility solution between point pairs, required for computing correct global illumination.

A radiosity solution for complex point models using the Fast Multipole Method.

Exploiting parallelism in the proposed algorithm to achieve up to 20x speed-up using GPUs.

Fast Multipole Method

FMM is concerned with evaluating the effect of a set of source points Y on a set of evaluation points X.

Total complexity of direct evaluation: O(NM)

FMM attempts to reduce this seemingly irreducible complexity to O(N+M). The three main insights that make this possible are:

1. Factorization of the kernel into source and receiver terms

2. Many application domains can tolerate approximations up to a certain level

3. Clever use of a hierarchical structure (octree); each node has an associated interaction list

Radiosity-based Global Illumination: an N-body problem
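
To make the first insight concrete, here is a minimal Python sketch of kernel factorization. It uses an illustrative one-dimensional kernel (x - y)^2, chosen only because it separates exactly; it is not the radiosity kernel, and the function names are assumptions made for this example. Once the kernel is split into receiver and source terms, the source sums are computed once and reused for every receiver, so the cost drops from O(NM) to O(N+M).

```python
import random


def direct_sum(xs, ys):
    """O(N*M): evaluate f(x_i) = sum_j (x_i - y_j)^2 pair by pair."""
    return [sum((x - y) ** 2 for y in ys) for x in xs]


def factorized_sum(xs, ys):
    """O(N+M): use (x - y)^2 = x^2 - 2*x*y + y^2 so source sums are shared."""
    m = len(ys)
    s1 = sum(ys)                  # sum_j y_j, computed once
    s2 = sum(y * y for y in ys)   # sum_j y_j^2, computed once
    return [m * x * x - 2.0 * x * s1 + s2 for x in xs]


random.seed(0)
xs = [random.random() for _ in range(200)]   # evaluation points X
ys = [random.random() for _ in range(300)]   # source points Y
slow = direct_sum(xs, ys)
fast = factorized_sum(xs, ys)
max_err = max(abs(a - b) for a, b in zip(slow, fast))
```

Real kernels rarely separate exactly, which is where the second insight comes in: a truncated series gives an approximate factorization whose error can be bounded.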

FMM: The Technique

Phases of FMM

1. Octree Construction

2. Interaction List Construction

3. Point-Pair Visibility

4. Upward Pass

5. Downward Pass

6. Final Summation Stage

[Karapurkar A., Chandran S., ICVGIP 2004]

Assigning Weights to Points

Radiosity computation requires integration over a surface area.

For point-based models we do not have any surface information, so we approximate this integration.

Weights are assigned to each point and signify the contribution of the point to the reconstruction of the surface.

The weight is a local property based on the normals available at the points. As the number of points increases, the integration is computed more accurately.
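
The idea of replacing a surface integral by a weighted sum over points can be sketched as follows. This is only an illustration under stated assumptions: the points lie on a unit sphere (a Fibonacci lattice), every point gets the same weight 4*pi/n, and the integrand is z^2, whose exact integral over the sphere is 4*pi/3. The poster's actual weights are derived locally from point normals and are not reproduced here; all names are hypothetical.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly uniformly spread points on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n        # evenly spaced heights in (-1, 1)
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        pts.append((r * math.cos(phi), r * math.sin(phi), z))
    return pts


def weighted_integral(points, weights, f):
    """Approximate the surface integral of f by sum_i w_i * f(p_i)."""
    return sum(w * f(p) for p, w in zip(points, weights))


def estimate(n):
    pts = fibonacci_sphere(n)
    weights = [4.0 * math.pi / n] * n        # assumption: equal area share per point
    return weighted_integral(pts, weights, lambda p: p[2] ** 2)


exact = 4.0 * math.pi / 3.0
err_coarse = abs(estimate(100) - exact)
err_fine = abs(estimate(10000) - exact)
```

Comparing err_coarse with err_fine shows the behaviour described above: the denser sampling gives the more accurate integral.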

Visibility Map Construction

In general, it is advantageous to have links at a high level in the tree.

Leaf-Leaf Visibility

Parallel Octree Construction on GPU

Parallel V-map Construction on GPU

Parallel FMM on GPU

Qualitative Results

Quantitative Results

The visibility map (or V-map) for a tree is a collection of visibility links for every node in the tree. The visibility link for any node p is a list L of nodes at the same level; every point in any node in L is guaranteed to be visible from every point in p

The visibility map is constructed recursively by a variation of DFS
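
A depth-first construction of this kind might look like the sketch below. It is a simplified illustration, not the authors' algorithm: it assumes both nodes of a pair sit at the same level of a complete tree, and it takes leaf-to-leaf visibility from a caller-supplied predicate (leaf_visible is a hypothetical stand-in for the point-pair visibility test). A pair is linked at the highest level where visibility is complete; when visibility is only partial, links are placed between the fully visible descendants instead.

```python
from itertools import product


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.links = []          # names of same-level nodes fully visible from this node


def add_link(a, b):
    a.links.append(b.name)
    b.links.append(a.name)


def classify(a, b, leaf_visible):
    """Depth-first: return 'full', 'none' or 'partial' for the pair (a, b)."""
    if not a.children and not b.children:
        return "full" if leaf_visible(a.name, b.name) else "none"
    results = [(ca, cb, classify(ca, cb, leaf_visible))
               for ca, cb in product(a.children, b.children)]
    states = {s for _, _, s in results}
    if states == {"full"}:
        return "full"            # caller links at this (higher) level
    if states == {"none"}:
        return "none"
    for ca, cb, s in results:    # partial: keep the links one level down
        if s == "full":
            add_link(ca, cb)
    return "partial"


def build_links(a, b, leaf_visible):
    if classify(a, b, leaf_visible) == "full":
        add_link(a, b)


def make_tree():
    return (Node("A", [Node("a1"), Node("a2")]),
            Node("B", [Node("b1"), Node("b2")]))


A, B = make_tree()
build_links(A, B, lambda p, q: True)                       # everything visible
full_case = (A.links, A.children[0].links)

A2, B2 = make_tree()
build_links(A2, B2, lambda p, q: (p, q) == ("a1", "b1"))   # one visible leaf pair
partial_case = (A2.links, A2.children[0].links, A2.children[1].links)
```

In the fully visible case a single link between A and B replaces four leaf-level links, which is why links high in the tree are preferred.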

[Figure: V-map construction on a tree with a root, nodes A and B, and children a1 to a4 and b1. Panels: Compute Visibility; Lookup; Complete Visibility (new visibility link); Partial Visibility (new visibility link).]

[Figure: leaf-leaf visibility between points p and q; cells labelled c1, c2, cz, cz1, cz2.]

[Figure: parallel octree construction. Twelve points (1 to 12), stored in array slots 0 to 11, are reordered as node N is split into N1 to N4 and then into sub-nodes (N13, N14, N33, N34, N42, N43, N44).]
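
The octree construction panel shows point indices being reordered as a node is split into children. One common way to realize that layout is sketched below, under loud assumptions: this is a sequential Python illustration rather than the poster's GPU kernel, the octant numbering and dictionary layout are invented for the example, and the stopping rule uses a maximum number of points per leaf. Each node ends up owning a contiguous range of the index array.

```python
import random


def build_octree(points, order, start, end, center, half, max_per_leaf, nodes):
    """Reorder order[start:end] by child octant; each node records its range."""
    node = {"start": start, "end": end, "children": []}
    nodes.append(node)
    if end - start <= max_per_leaf or half < 1e-9:
        return node                                   # leaf

    def octant(i):
        x, y, z = points[i]
        return (x >= center[0]) | ((y >= center[1]) << 1) | ((z >= center[2]) << 2)

    # Group the indices of this node by octant (stable sort keeps it simple).
    order[start:end] = sorted(order[start:end], key=octant)
    counts = [0] * 8
    for i in order[start:end]:
        counts[octant(i)] += 1

    pos = start
    q = half / 2.0
    for k in range(8):
        if counts[k]:
            child_center = (center[0] + (q if k & 1 else -q),
                            center[1] + (q if k & 2 else -q),
                            center[2] + (q if k & 4 else -q))
            node["children"].append(
                build_octree(points, order, pos, pos + counts[k],
                             child_center, q, max_per_leaf, nodes))
        pos += counts[k]
    return node


random.seed(1)
points = [(random.random(), random.random(), random.random()) for _ in range(200)]
order = list(range(len(points)))      # permutation of point indices
nodes = []
root = build_octree(points, order, 0, len(points), (0.5, 0.5, 0.5), 0.5, 8, nodes)
```

Because every node is just a (start, end) range into one shared array, sibling subtrees touch disjoint ranges, which is the property a parallel build can exploit.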

[Figure: parallel V-map construction on the same tree (root, A, B, a1 to a4, b1). Panels: Compute Visibility in Parallel & Store in LVS; Lookup from LVS; Complete Visibility (new visibility link); Partial Visibility (new visibility link).]

[Figure: several sources (X) acting on one receiver, and one source acting on several receivers (O).]

$$X = \{x_1, x_2, \ldots, x_N\}, \quad x_i \in \mathbb{R}^3, \quad i = 1, \ldots, N$$

$$Y = \{y_1, y_2, \ldots, y_M\}, \quad y_j \in \mathbb{R}^3, \quad j = 1, \ldots, M$$

$$B(x) = E(x) + \rho(x) \int_{y \in S} \frac{[\vec{n}_y \cdot (\vec{r}_x - \vec{r}_y)]\,[\vec{n}_x \cdot (\vec{r}_y - \vec{r}_x)]}{\pi\,|\vec{r}_y - \vec{r}_x|^4}\, B(y)\, dy$$
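
As a rough illustration of how the radiosity equation is evaluated on point samples, the Python sketch below computes the geometric kernel between two oriented points and performs one direct gathering step for a single receiver. It assumes unit normals, per-point weights w standing in for the surface integral, and full mutual visibility (the V-map is not modelled). The function names and data layout are hypothetical, and this is the O(N^2) direct form, not the FMM evaluation.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def kernel(rx, nx, ry, ny):
    """[n_y.(r_x - r_y)] [n_x.(r_y - r_x)] / (pi * |r_y - r_x|^4)."""
    d = sub(ry, rx)
    dist2 = dot(d, d)
    term_x = dot(nx, d)              # n_x . (r_y - r_x)
    term_y = dot(ny, sub(rx, ry))    # n_y . (r_x - r_y)
    return (term_x * term_y) / (math.pi * dist2 * dist2)


def gather(i, pos, nrm, B, E, rho, w):
    """One direct evaluation of B_i = E_i + rho_i * sum_j w_j K_ij B_j."""
    total = sum(w[j] * kernel(pos[i], nrm[i], pos[j], nrm[j]) * B[j]
                for j in range(len(pos)) if j != i)
    return E[i] + rho[i] * total


# Two points facing each other along z, two units apart.
pos = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
nrm = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
k01 = kernel(pos[0], nrm[0], pos[1], nrm[1])
b0 = gather(0, pos, nrm, B=[0.0, 1.0], E=[0.5, 0.0], rho=[0.5, 0.5], w=[1.0, 1.0])
```

The 1/r^4 factor in this kernel is the part that the FMM factorizes into source and receiver terms.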

[Chart: Bunny (124531 points). Time taken (ms) vs. octree level (5 to 9), CPU vs. GPU. Bar values as extracted: 1001, 1231, 1742, 2117, 2323 and 993, 1421, 2521, 3981, 7851.]

[Chart: Ganpati (165646 points). Time taken (ms) vs. octree level (5 to 9), CPU vs. GPU. Bar values as extracted: 1321, 1536, 2009, 2654, 3658 and 1200, 1981, 2997, 4521, 8001.]

[Chart: Bunny in Cornell Room (up to 19x speed-up). Time taken (mins) vs. octree level, CPU vs. GPU. GPU: 68.25, 83.47, 95.98. CPU: 652.51, 1159.04, 1839.96.]

[Chart: Ganesha in Cornell Room (up to 17x speed-up). Time taken (mins) vs. octree level, CPU vs. GPU. GPU: 76.53, 94.37, 101.12. CPU: 587.02, 998.26, 1747.85.]

Bunny in Cornell Room (up to 20x speed-up): time taken (hrs)

Maximum number of points per leaf    200      150      100      50       25
CPU                                  15.96    19.18    21.11    23.81    25.87
GPU                                  1.01     1.09     1.16     1.21     1.30
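
The quoted speed-up can be checked directly from the chart values. The short Python sketch below assumes the larger series is the CPU timing and the smaller one the GPU timing, which the chart legend implies but the extracted text does not state outright. The best ratio, at 25 points per leaf, is about 19.9, consistent with the "up to 20x" caption.

```python
# Timings in hours for Bunny in Cornell Room, read off the chart.
max_points_per_leaf = [200, 150, 100, 50, 25]
cpu_hours = [15.96, 19.18, 21.11, 23.81, 25.87]   # assumed CPU series
gpu_hours = [1.01, 1.09, 1.16, 1.21, 1.30]        # assumed GPU series

# Speed-up per setting is simply CPU time divided by GPU time.
speedups = [c / g for c, g in zip(cpu_hours, gpu_hours)]
best = max(speedups)
best_setting = max_points_per_leaf[speedups.index(best)]
```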

Ganesha in Cornell Room (up to 19x speed-up): time taken (hrs)

Maximum number of points per leaf    200      150      100      50       25
CPU                                  14.54    16.58    20.81    23.15    26.37
GPU                                  1.11     1.16     1.21     1.28     1.41

Octree Construction

V-map Construction

Phases of FMM

[Figure: phases of FMM on octree levels 2 and 3: upward pass (steps 1 and 2), downward pass (steps 1 and 2), and final summation step; some cells are marked empty.]

[Figure: visibility between points p and q in polygonal models and in point models.]

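[Figure: geometry of the radiosity kernel: points x and y with position vectors r_x and r_y from origin o, the difference vector r_x - r_y, and normals n_x and n_y.]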

[Figure: visibility links of a node N across tree levels 1 to 3; nodes V1, V2, V3 and V4 are marked visible, giving the link list N: V1, V2, V3, V4.]

$$f(x_i) = \sum_{j=1}^{M} \phi(x_i, y_j), \quad i = 1, \ldots, N$$