
Multiscale Geometric Dictionaries for Point-cloud Data

William K. Allard¹, Guangliang Chen¹, Mauro Maggioni¹²
Departments of Mathematics¹ and Computer Science², Duke University, PO Box 90320, Durham, NC 27708

Abstract

We develop a novel geometric multiresolution analysis for analyzing intrinsically low-dimensional point clouds in high-dimensional spaces, modeled as samples from a d-dimensional set M (in particular, a manifold) embedded in R^D, in the regime d ≪ D. This type of situation has been recognized as important in various applications, such as the analysis of sounds, images, and gene arrays. In this paper we construct data-dependent multiscale dictionaries that aim at efficient encoding and manipulation of the data. Unlike existing constructions, our construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa. In addition, data points have a guaranteed sparsity in terms of the dictionary.

Keywords: Wavelets. Multiscale Analysis. Sparse Coding. Point Clouds.

Introduction

Data sets are often modeled as point clouds in R^D, for D large, but having some interesting low-dimensional structure, for example that of a d-dimensional manifold M, with d ≪ D. When M is simply a linear subspace, one may exploit this assumption to encode the data efficiently by projecting onto a dictionary of d vectors in R^D (for example found by SVD), at a cost of (n + D)d for n data points. When M is nonlinear, there are no "explicit" constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries, or dictionaries obtained by black-box optimization. This type of situation has been recognized as important in various applications, and has been at the center of much investigation in the applied mathematics and machine learning communities during the past several years.
We formalize this approach by asking for a dictionary Φ of size I, built from training data, such that every point (at least from the training data set) may be represented, up to a certain precision ε, by at most m elements of the dictionary. This sparsity requirement is very natural from the viewpoints of statistics, signal processing, and interpretation of the representation. Of course, the smaller I and m are for a given ε, the better the dictionary. Current constructions of such dictionaries, e.g. K-SVD [1], k-flats [5], and Bayesian methods [6], have several deficiencies. First, they cast the requirements above as an optimization problem with many local minima, and little is known about the computational complexity of the iterative algorithms used. Second, no guarantees are provided on the sizes of I and m (as functions of ε). Lastly, the dictionaries found in this way are in general highly overcomplete and unstructured. As a consequence, there is no fast algorithm for computing the coefficients of a data point with respect to the dictionary, and appropriate sparsity-seeking algorithms are required instead.

In this paper we construct data-dependent dictionaries based on a geometric multiresolution analysis (GMRA) of the data, inspired by multiscale techniques in geometric measure theory [4]. These dictionaries are structured in a multiscale fashion; the expansion of a data point on the dictionary elements is guaranteed to have a certain degree of sparsity m; both the dictionary elements and the coefficients may be computed by a fast algorithm; and the growth of the number of dictionary elements I as a function of ε is controlled theoretically and easy to estimate in practice. We call the elements of these dictionaries geometric wavelets, since in some respects they generalize wavelets from vectors that analyze functions to affine vectors that analyze point clouds.
Geometric Wavelets

Let (M, g) be a d-dimensional compact Riemannian manifold isometrically embedded in R^D, with d ≪ D. Assume we have n samples drawn i.i.d. from M according to the natural volume measure dvol on M. We use such training data to present how to construct geometric wavelets; the construction easily extends to any point-cloud data by using locally adaptive dimensions d_{j,k} (rather than a fixed d).

Multiscale decomposition. We start by constructing a multiscale nested partition of M into dyadic cells {C_{j,k}}_{k∈Γ_j, 0≤j≤J} that satisfy the usual properties of dyadic cubes in R^D. There is a natural tree T associated to the family: for any j ∈ Z and k ∈ Γ_j, we let children(j, k) = {k' ∈ Γ_{j+1} : C_{j+1,k'} ⊆ C_{j,k}}. Also, for x ∈ M, we denote by C_{j,x} the unique cell at scale j that contains x (similar notation P_{j,x}, Φ_{j,x}, Ψ_{j,x} for the objects associated to C_{j,x} is used later).

Multiscale SVD. For every C_{j,k} we define the mean (in R^D)

  c_{j,k} := E[x | x ∈ C_{j,k}] = (1 / vol(C_{j,k})) ∫_{C_{j,k}} x dvol(x)

and the covariance

  cov_{j,k} := E[(x − c_{j,k})(x − c_{j,k})^* | x ∈ C_{j,k}].

Let the rank-d Singular Value Decomposition (SVD) of cov_{j,k} be cov_{j,k} = Φ_{j,k} Σ_{j,k} Φ_{j,k}^*, where Φ_{j,k} is orthonormal and Σ_{j,k} is diagonal. The subspace spanned by the columns of Φ_{j,k}, translated to pass through c_{j,k}, i.e. ⟨Φ_{j,k}⟩ + c_{j,k}, is an approximate tangent space to M at location c_{j,k} and scale 2^{−j}. We think of {Φ_{j,k}}_{k∈Γ_j} as a family of geometric scaling functions at scale j. Let P_{j,k} be the associated affine projection

  P_{j,k}(x) = Φ_{j,k} Φ_{j,k}^* (x − c_{j,k}) + c_{j,k},  ∀x ∈ C_{j,k}. (1)

We define the coarse approximations, at scale j, to the manifold M and to any point x ∈ M by

  M_j := ∪_{k∈Γ_j} P_{j,k}(C_{j,k}),  x_j ≡ P_{M_j}(x) := P_{j,x}(x). (2)

Multiscale geometric wavelets. We introduce our wavelet encoding of the difference between M_j and M_{j+1}, for j < J. Fix x ∈ C_{j+1,k'} ⊂ C_{j,k}.
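The Multiscale SVD step above — cell mean, rank-d SVD, and the affine projection (1) — can be sketched for a single cell as follows (a minimal numpy illustration on synthetic data; the function names are ours, not from the paper):

```python
import numpy as np

def scaling_functions(X, d):
    """Mean c_{j,k} and geometric scaling functions Phi_{j,k} of one cell:
    the top-d principal directions of the centered points, i.e. the rank-d
    SVD of the empirical covariance."""
    c = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - c, full_matrices=False)
    return c, Vt[:d].T              # c in R^D; Phi is D x d with orthonormal columns

def affine_projection(x, c, Phi):
    """P_{j,k}(x) = Phi_{j,k} Phi_{j,k}^*(x - c_{j,k}) + c_{j,k}, as in eq. (1)."""
    return Phi @ (Phi.T @ (x - c)) + c

# Noisy samples near a 2-plane in R^10: the affine projection maps a point
# (approximately) back onto the estimated local tangent plane
rng = np.random.default_rng(1)
D, d, n = 10, 2, 200
B = np.linalg.qr(rng.standard_normal((D, d)))[0]        # true tangent basis
X = rng.standard_normal((n, d)) @ B.T + 0.01 * rng.standard_normal((n, D))
c, Phi = scaling_functions(X, d)
x_proj = affine_projection(X[0], c, Phi)
```

The projection is idempotent on its range, and the residual ‖x − P(x)‖ is on the order of the noise level, consistent with Φ_{j,k} approximating the tangent plane.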
The difference x_{j+1} − x_j is a high-dimensional vector in R^D; however, it may be decomposed into a sum of vectors lying in certain well-chosen low-dimensional spaces, shared across multiple points, in a multiscale fashion. We proceed as follows:

  Q_{M_{j+1}}(x) := P_{M_{j+1}}(x) − P_{M_j}(x)
    = x_{j+1} − P_{j,k}(x_{j+1}) + P_{j,k}(x_{j+1}) − P_{j,k}(x)
    = (I − Φ_{j,k} Φ_{j,k}^*)(x_{j+1} − c_{j,k}) + Φ_{j,k} Φ_{j,k}^*(x_{j+1} − x). (3)

Define the wavelet subspace and translation

  W_{j+1,k'} := (I − Φ_{j,k} Φ_{j,k}^*) ⟨Φ_{j+1,k'}⟩; (4)
  w_{j+1,k'} := (I − Φ_{j,k} Φ_{j,k}^*)(c_{j+1,k'} − c_{j,k}). (5)

Clearly dim W_{j+1,k'} ≤ d. Let Ψ_{j+1,k'} be an orthonormal basis for W_{j+1,k'}, which we call a geometric wavelet basis. Then we may rewrite (3) as

  Q_{M_{j+1}}(x) = Ψ_{j+1,k'} Ψ_{j+1,k'}^*(x_{j+1} − c_{j+1,k'}) + w_{j+1,k'} − Φ_{j,k} Φ_{j,k}^*(x − x_{j+1}). (6)

The last term x − x_{j+1} can be closely approximated by x_J − x_{j+1} = Σ_{l=j+1}^{J−1} Q_{M_{l+1}}(x) as the finest scale J → +∞, under general conditions. These operators are "detail" operators analogous to the wavelet projections in wavelet theory, and satisfy, by construction, the crucial multiscale relationship

  P_{M_{j+1}}(x) = P_{M_j}(x) + Q_{M_{j+1}}(x),  ∀x ∈ M. (7)

We have therefore constructed a multiscale family of projection operators P_{M_j} (one for each node C_{j,k}) onto approximate local tangent planes, and detail projection operators Q_{M_{j+1}} (one for each edge) encoding the differences, collectively referred to as a GMRA structure. The cost of encoding the GMRA structure is dominated by that of the scaling functions {Φ_{j,k}}, which is O(dD ε^{−d/2}), and the time complexity of the algorithm is O(Dn log n) [2].

Theorem 1 (Geometric Wavelet Decomposition). Let (M, g) be a C² manifold of dimension d in R^D, and {P_{M_j}, Q_{M_{j+1}}} a GMRA. For any x ∈ M, there exist a constant C = C(x) and a scale j_0 = j_0(reach(B_x(1))) such that for any j ≥ j_0,

  ‖x − P_{M_{j_0}}(x) − Σ_{l=j_0+1}^{j} Q_{M_l}(x)‖ ≤ C · 2^{−2j}. (8)

Geometric Wavelet Transforms (GWT).
Given a GMRA structure, we may compute a discrete Forward GWT for a point x ∈ M that maps it to a sequence of wavelet coefficient vectors

  q_x = (q_{J,x}, q_{J−1,x}, ..., q_{1,x}, q_{0,x}) ∈ R^{d + Σ_{j=1}^{J} d^w_{j,x}}, (9)

where q_{j,x} := Ψ_{j,x}^*(x_j − c_{j,x}) and d^w_{j,x} := dim W_{j,x} ≤ d. Note that, for a fixed precision ε > 0, q_x has a maximum possible length (1 + ½ log₂(1/ε)) d, which is independent of D and nearly optimal in d [3]. On the other hand, we may easily compute the wavelets at all scales using the GMRA structure and the wavelet coefficients, by a discrete Inverse GWT:

  Q_{M_J}(x) = Ψ_{J,x} q_{J,x} + w_{J,x}; (10)
  Q_{M_j}(x) = Ψ_{j,x} q_{j,x} + w_{j,x} − Φ_{j−1,x} Φ_{j−1,x}^* Σ_{ℓ>j} Q_{M_ℓ}(x), (11)

for j decreasing from J − 1 to 1. The finest approximation of x is x̂_J = x_0 + Σ_{j≥1} Q_{M_j}(x).

Experiments

[Figure 1: We sample 10,000 points from a 2-D Oscillating2DWave embedded in R^50 and compute the GWT of the data. Panels show the data, a reconstruction at scale 11 (error 0.01699), the magnitudes of the wavelet coefficients (log10 scale) arranged in the natural tree (x-axis: points; y-axis: scales from coarsest, 1, to finest, 11), and the approximation error against scale (log10 scale; fitted line y = −1.8x + 0.12).]

[Figure 2: We apply the GMRA to 2414 (cropped) face images from 38 human subjects in fixed frontal pose under varying illumination angles. Panels show the dimensions of the wavelet subspaces (x-axis: points; y-axis: scales 1 to 9) and the approximation error against scale (log10 scale; fitted line y = −1.4x − 6).]
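The forward and inverse transforms (9)–(11), together with definitions (4)–(5), can be exercised end-to-end along a single root-to-leaf path of cells (a toy sketch with our own naming and synthetic nested cells; when x lies exactly on the finest plane, the reconstruction is exact):

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 4, 1

def ortho(A):
    """Orthonormalize the columns of A via QR."""
    return np.linalg.qr(A)[0]

def psi_w(Phi_c, c_c, Phi_f, c_f):
    """Wavelet basis Psi and translation w for one parent/child pair, eqs. (4)-(5)."""
    perp = lambda v: v - Phi_c @ (Phi_c.T @ v)   # apply I - Phi_c Phi_c^*
    return ortho(perp(Phi_f)), perp(c_f - c_c)

# A chain of nested cells at scales j = 0, 1, 2 with random means / tangent bases
c = [rng.standard_normal(D) for _ in range(3)]
Phi = [ortho(rng.standard_normal((D, d))) for _ in range(3)]
P = lambda j, x: Phi[j] @ (Phi[j].T @ (x - c[j])) + c[j]   # eq. (1)

x = c[2] + Phi[2] @ rng.standard_normal(d)    # x lies on the finest plane: x = x_2
x0, x1, x2 = P(0, x), P(1, x), P(2, x)

# Forward GWT along the path: q_j = Psi_j^* (x_j - c_j), as in eq. (9)
Psi1, w1 = psi_w(Phi[0], c[0], Phi[1], c[1])
Psi2, w2 = psi_w(Phi[1], c[1], Phi[2], c[2])
q1, q2 = Psi1.T @ (x1 - c[1]), Psi2.T @ (x2 - c[2])

# Inverse GWT, eqs. (10)-(11): finest detail first, then corrected coarser details
Q2 = Psi2 @ q2 + w2
Q1 = Psi1 @ q1 + w1 - Phi[0] @ (Phi[0].T @ Q2)
x_hat = x0 + Q1 + Q2                          # reconstruction of x
```

Here Q2 equals x_2 − x_1 exactly, and the correction term Φ_0 Φ_0^* Q2 in the second line of the inverse transform plays the role of Φ_{j−1,x} Φ_{j−1,x}^* Σ_{ℓ>j} Q_{M_ℓ}(x) in (11).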
The bottom row of Figure 2 shows approximations of a fixed point (image) and the corresponding dictionary elements used.

Acknowledgements. MM is grateful for partial support from NSF (DMS 0650413, CCF 0808847, IIS 0803293), ONR N00014-07-1-0625, the Sloan Foundation, and Duke. GC was partially supported by ONR N00014-07-1-0625 and NSF CCF 0808847.

References.
[1] M. Aharon, M. Elad, and A. M. Bruckstein. The K-SVD algorithm. In SPARSE '05.
[2] W. K. Allard, G. Chen, and M. Maggioni. Multiscale geometric methods for data sets II: Geometric wavelets. In preparation.
[3] G. Chen and M. Maggioni. Multiscale geometric wavelets for the analysis of point clouds. In CISS '10.
[4] G. David. Wavelets and Singular Integrals on Curves and Surfaces. Springer, 1991.
[5] A. Szlam and G. Sapiro. Discriminative k-metrics. In ICML '09.
[6] M. Zhou, H. Chen, J. Paisley, L. Ren, G. Sapiro, and L. Carin. Non-parametric Bayesian dictionary learning for sparse image representations. In NIPS '09.

