
University of British Columbia - SLIM

Curt Da Silva and Felix J. Herrmann
June 3, 2015

Off the grid tensor completion for seismic data interpolation

Motivation

3D seismic experiments - 5D data
• expensive to acquire, store
• sample at sub-Nyquist rates

Data exhibits low-rank structure
• exploit structure for interpolation

Fully sampled data
• simultaneous sources in wave-equation based inversion
• mitigating multiples

Motivation

Standard matrix, tensor completion
• defined on a regularly spaced grid
• sampling removes points from this regular grid

Degraded results in practice
• very difficult to sample regularly in practice
• irregular "off-the-grid" sampling destroys low-rank behaviour

7.34 Hz - 90% missing receivers - irregular sampling
Common source gather

[Figure: true data vs. subsampled data; axes receiver x, receiver y, 20-200.]

7.34 Hz - 90% missing receivers - irregular sampling
Common source gather

[Figure: true data vs. regularized tensor completion (SNR 11 dB); axes receiver x, receiver y.]

Context

Low-rank matrix/tensor completion via nuclear norm projection [1]
• requires SVDs on huge data matrices
• not scalable to large problem sizes

Data completion via Toeplitz embedding [2]
• problem size - (# data points)²
• ad-hoc windowing - can degrade quality, as demonstrated in [3]

[1] Kreimer and Sacchi, "A tensor higher-order singular value decomposition for prestack seismic data noise reduction and interpolation." (2012)
[2] Gao, Vicente, and Sacchi, "Evaluation of a fast algorithm for the eigen-decomposition of large block Toeplitz matrices with application to 5D seismic data interpolation." (2011)
[3] Da Silva, Kumar, et al., "Efficient matrix completion for seismic data reconstruction." (To appear)

Goals

Review Hierarchical Tucker tensor format, principles of low-rank tensor recovery

Compensate for lack of low-rank behaviour in irregular sampling domain
• introduce a new domain in which the data is low rank

Compressive sensing with sparsity promotion

Successful reconstruction scheme

Signal structure
• sparsity

Sampling
• subsampling decreases sparsity

Optimization
• look for sparsest solution

Multidimensional interpolation with Hierarchical Tucker

Successful reconstruction scheme

Signal structure
• Hierarchical Tucker

Sampling
• subsampling, noise increases hierarchical rank

Optimization
• fit data in the Hierarchical Tucker format

Hierarchical Tucker format

"SVD"-like decomposition

$X \in \mathbb{R}^{n_1 \times n_2 \times n_3 \times n_4}$ tensor

Top-level split of the matricization $X^{(1,2)}$ (rows of size $n_1 n_2$, columns of size $n_3 n_4$):
$$X^{(1,2)} = U_{12} B_{1234} U_{34}^T, \qquad U_{12} \in \mathbb{R}^{n_1 n_2 \times k_{12}}, \; U_{34} \in \mathbb{R}^{n_3 n_4 \times k_{34}}$$

Each factor splits recursively: reshaping $U_{12}$ into an $n_1 \times n_2 \times k_{12}$ tensor, each of its $k_{12}$ slices factorizes in turn as
$$U_1 B_{12}^{(j)} U_2^T, \qquad U_1 \in \mathbb{R}^{n_1 \times k_1}, \; U_2 \in \mathbb{R}^{n_2 \times k_2}, \; B_{12}^{(j)} \in \mathbb{R}^{k_1 \times k_2}$$

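This top-level split can be sketched in a few lines of numpy; the sizes and rank below are illustrative, not from the talk. We matricize a 4D tensor along dimensions (1, 2) and recover the factors of the "SVD"-like decomposition with a truncated SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, n4, k = 4, 5, 6, 7, 3

# Build a tensor whose (1,2)-matricization has rank k by construction.
L = rng.standard_normal((n1 * n2, k))
R = rng.standard_normal((k, n3 * n4))
X = (L @ R).reshape(n1, n2, n3, n4)

# Matricization X^(1,2): dims (1,2) as rows, dims (3,4) as columns.
X12 = X.reshape(n1 * n2, n3 * n4)

# A truncated SVD yields the top-level factors U12, B1234, U34.
U, s, Vt = np.linalg.svd(X12, full_matrices=False)
U12, B1234, U34 = U[:, :k], np.diag(s[:k]), Vt[:k, :].T

# The rank-k factorization reproduces the matricization.
err = np.linalg.norm(X12 - U12 @ B1234 @ U34.T) / np.linalg.norm(X12)
print(err < 1e-10)  # True
```

In the actual HT format the factors $U_{12}$ and $U_{34}$ are not stored; they are themselves factorized recursively.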

Hierarchical Tucker format

Intermediate matrices don't need to be stored

$U_t, B_t$ - small parameter matrices
• specify the tensor completely

Separating groups of dimensions from each other
• dimension tree

Example

A. Uschmajew, B. Vandereycken. The geometry of algorithms using hierarchical tensors. Linear Algebra and its Applications, 2013.

[Fig. 3.1: a dimension tree of {1, 2, 3, 4, 5}. Root $t_r = \{1,2,3,4,5\}$ with sons $t_1 = \{1,2,3\}$ and $t_2 = \{4,5\}$; $\{1,2,3\}$ splits into $\{1\}$ and $\{2,3\}$, which splits into $\{2\}$ and $\{3\}$; $\{4,5\}$ splits into $\{4\}$ and $\{5\}$.]

3 The hierarchical Tucker decomposition

In this section, we define the set of HT tensors of fixed rank using the HT decomposition and the hierarchical rank. Much of this and related concepts were already introduced in [24,22,23] but we establish in addition some new relations on the parametrization of this set. In order to better facilitate the derivation of the smooth manifold structure in the next section, we have adopted a slightly different presentation compared to [24,22].

3.1 The hierarchical Tucker format

Definition 3.1 Given the order $d$, a dimension tree $T$ is a non-trivial, rooted binary tree whose nodes $t$ can be labeled (and hence identified) by elements of the power set $\mathcal{P}(\{1, 2, \ldots, d\})$ such that

(i) the root has the label $t_r = \{1, 2, \ldots, d\}$; and,
(ii) every node $t \in T$, which is not a leaf, has two sons $t_1$ and $t_2$ that form an ordered partition of $t$, that is,
$$t_1 \cup t_2 = t \quad \text{and} \quad \mu < \nu \text{ for all } \mu \in t_1, \nu \in t_2. \qquad (3.1)$$

The set of leaves is denoted by $L$. An example of a dimension tree for $t_r = \{1, 2, 3, 4, 5\}$ is depicted in Fig. 3.1.

The idea of the HT format is to recursively factorize subspaces of $\mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ into tensor products of lower-dimensional spaces according to the index splittings in the tree $T$. If $X$ is contained in such subspaces that allow for preferably low-dimensional factorizations, then $X$ can be efficiently stored based on the next definition. Given dimensions $n_1, n_2, \ldots, n_d$, called spatial dimensions, and a node $t \subseteq \{1, 2, \ldots, d\}$, we define the dimension of $t$ as $n_t = \prod_{\mu \in t} n_\mu$.

Definition 3.2 Let $T$ be a dimension tree and $k = (k_t)_{t \in T}$ a set of positive integers with $k_{t_r} = 1$. The hierarchical Tucker (HT) format for tensors $X \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ is defined as follows.

(i) To each node $t \in T$, we associate a matrix $U_t \in \mathbb{R}^{n_t \times k_t}$.
(ii) For the root $t_r$, we define $U_{t_r} = \mathrm{vec}(X)$.

[Parameter tree for the example: root transfer tensor $B_{12345}$; internal nodes $U_{123}, B_{123}$ and $U_{45}, B_{45}$; below them $U_1$, $U_{23}, B_{23}$, $U_4$, $U_5$; leaves $U_2$, $U_3$.]
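Definition 3.1 is easy to check mechanically. A small sketch (plain Python nested tuples, not from the paper) encoding the tree of Fig. 3.1 and verifying the ordered-partition condition (3.1) at every internal node:

```python
# Leaves are frozensets; internal nodes are (label, left_son, right_son).
leaf = frozenset

def label(t):
    return t[0] if isinstance(t, tuple) else t

def join(t1, t2):
    return (label(t1) | label(t2), t1, t2)

# The dimension tree of {1, 2, 3, 4, 5} from Fig. 3.1.
tree = join(join(leaf({1}), join(leaf({2}), leaf({3}))),
            join(leaf({4}), leaf({5})))

def is_dimension_tree(t):
    """Check condition (ii): the sons of each internal node form an
    ordered partition, i.e. t1 u t2 = t and mu < nu for mu in t1, nu in t2."""
    if not isinstance(t, tuple):
        return True
    lbl, t1, t2 = t
    ordered = max(label(t1)) < min(label(t2))
    partition = (label(t1) | label(t2) == lbl) and not (label(t1) & label(t2))
    return ordered and partition and is_dimension_tree(t1) and is_dimension_tree(t2)

print(label(tree) == {1, 2, 3, 4, 5}, is_dimension_tree(tree))  # True True
```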

Hierarchical Tucker format

Storage: $dNK + (d-2)K^3 + K^2$

Compare to $N^d$ storage for the full tensor

Effectively breaking the curse of dimensionality when $K \ll N$, $d \geq 4$

Low frequency data compresses in HT
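Plugging in illustrative numbers (a 200⁴ frequency slice at hierarchical rank K = 20; these sizes are assumptions, not the talk's figures) shows the size of the gap:

```python
# Parameter count for the HT format from the slide: d*N*K for the leaf
# matrices, (d-2)*K^3 for the internal transfer tensors, and K^2 at the
# root, versus N^d entries for the dense tensor.
def ht_storage(d, N, K):
    return d * N * K + (d - 2) * K**3 + K**2

# Illustrative sizes (assumed): a 200^4 frequency slice at rank 20.
d, N, K = 4, 200, 20
print(ht_storage(d, N, K), N**d)  # 32400 1600000000
```

Roughly a 50,000x reduction at these sizes, and the gap widens rapidly with $d$.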

Seismic Hierarchical Tucker

We consider a 3D seismic survey with coordinates (src x, src y, rec x, rec y, time)

We take a Fourier transform in time and restrict ourselves to a single frequency slice

Seismic Hierarchical Tucker

For a frequency slice with coordinates (src x, src y, rec x, rec y), there are essentially two choices of dimension splitting (by reciprocity)

First explored in [1] - low rank solution operators for wave equations

Canonical decomposition:
{src x, src y, rec x, rec y}
  {src x, src y} {rec x, rec y}
  {src x} {src y} {rec x} {rec y}

Non-canonical decomposition:
{src x, rec x, src y, rec y}
  {src x, rec x} {src y, rec y}
  {src x} {rec x} {src y} {rec y}

[1] L. Demanet. Curvelets, Wave Atoms, and Wave Equations. PhD Thesis, MIT.

Matricizations

(Rec x, Rec y) matricization - Canonical ordering

[Figure: full data, matricized with rows (x_rec, y_rec) and columns (x_src, y_src), alongside its normalized singular value decay.]

Matricizations

(Src x, Rec x) matricization - Noncanonical ordering

[Figure: full data, matricized with rows (x_src, x_rec) and columns (y_src, y_rec), alongside normalized singular value decay for the (Rx, Ry) and (Sx, Rx) matricizations.]
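A synthetic numpy illustration of why the splitting matters (toy sizes, not the BG data): a tensor built to have rank 3 in the (src x, rec x) x (src y, rec y) splitting is generically full rank when matricized with (rec x, rec y) rows.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 10, 3

# Tensor built to be rank-3 in the (src x, rec x) x (src y, rec y) splitting.
# Axis order: (xsrc, xrec, ysrc, yrec).
A = rng.standard_normal((n * n, k)) @ rng.standard_normal((k, n * n))
X = A.reshape(n, n, n, n)

def matricize(T, row_axes):
    """Group row_axes as rows and the remaining axes as columns."""
    col_axes = [a for a in range(T.ndim) if a not in row_axes]
    M = np.transpose(T, list(row_axes) + col_axes)
    return M.reshape(int(np.prod([T.shape[a] for a in row_axes])), -1)

rank_sr = np.linalg.matrix_rank(matricize(X, (0, 1)))  # (xsrc, xrec) rows
rank_rr = np.linalg.matrix_rank(matricize(X, (1, 3)))  # (xrec, yrec) rows
print(rank_sr, rank_rr)  # 3 vs (generically) 100
```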

Multidimensional interpolation with Hierarchical Tucker

Successful reconstruction scheme

Signal structure
• Hierarchical Tucker

Sampling
• subsampling, noise increases hierarchical rank

Optimization
• fit data in the Hierarchical Tucker format

Matrix Completion

$X$ (fully sampled):
$$X = \begin{bmatrix} * & * & * & * & * \\ * & * & * & * & * \\ * & * & * & * & * \\ * & * & * & * & * \\ * & * & * & * & * \end{bmatrix}$$

[Plot: normalized singular values of $X$, decaying from $10^0$ to below $10^{-4}$.]

Matrix Completion

$\mathcal{A}(X)$ (subsampled):
$$\mathcal{A}(X) = \begin{bmatrix} * & * & * & 0 & * \\ * & 0 & 0 & * & 0 \\ * & * & * & * & * \\ * & * & 0 & * & * \\ 0 & * & * & * & 0 \end{bmatrix}$$

[Plot: normalized singular values of $\mathcal{A}(X)$ on the same $10^{-4}$ to $10^0$ scale.]

Sampling

Sampling $(x_{src}, y_{src}, x_{rec}, y_{rec})$ points from the data
• idealized recovery
• impossible to physically implement

Sampling $(x_{rec}, y_{rec})$ points from the data
• less idealized
• possible to acquire data - e.g. ocean bottom nodes

Realistic recovery - 50% random receivers removed

(Rec x, Rec y) matricization - Canonical ordering

[Figure: matricization with rows (x_rec, y_rec) and columns (x_src, y_src), and normalized singular value curves comparing no subsampling with 50% missing sources.]

Realistic recovery - 50% random receivers removed

(Src x, Rec x) matricization - Noncanonical ordering

[Figure: matricization with rows (x_src, x_rec) and columns (y_src, y_rec), and normalized singular value curves comparing no subsampling with 50% missing sources.]

Data organization

(rec x, rec y) organization
• high rank
• missing receivers operator removes rows
• poor recovery scenario

(src x, rec x) organization
• low rank
• missing receivers operator removes blocks
• closer to ideal recovery scenario
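The row-versus-block distinction can be seen directly in numpy (toy 4 x 4 x 4 x 4 volume; the removed receiver index is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
# Axis order: (xsrc, ysrc, xrec, yrec); remove the receiver at (xrec, yrec) = (1, 2).
X = rng.standard_normal((n, n, n, n))
X[:, :, 1, 2] = 0.0

# (rec x, rec y) organization: rows indexed by (xrec, yrec).
M_rr = np.transpose(X, (2, 3, 0, 1)).reshape(n * n, n * n)
# (src x, rec x) organization: rows indexed by (xsrc, xrec).
M_sr = np.transpose(X, (0, 2, 1, 3)).reshape(n * n, n * n)

zero_rows_rr = int(np.sum(~M_rr.any(axis=1)))  # entire rows wiped out
zero_rows_sr = int(np.sum(~M_sr.any(axis=1)))  # zeros spread out in blocks
print(zero_rows_rr, zero_rows_sr)  # 1 0
```

In the (rec x, rec y) organization the missing receiver deletes a whole row, which matrix completion cannot recover; in the (src x, rec x) organization the same missing entries are scattered as blocks and every row retains observations.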

Nonuniform sampling

BG data set
• 68 x 68 sources, 150m spacing
• 401 x 401 receivers, 25m spacing

Consider two sampling situations
• regular 201 x 201 grid, 50m spacing
• irregular 201 x 201 grid
  • random 25m perturbations from the regular 201 x 201 grid
  • load true data from the irregular grid, avoid generating inverse crime data
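A sketch of generating the two grids in one dimension (the ±25 m uniform perturbation model is an assumption for illustration; the slide only states "random 25m perturbations"):

```python
import numpy as np

rng = np.random.default_rng(3)

# Regular 201-point grid at 50 m spacing, and an irregular grid obtained
# by perturbing each point by up to 25 m (the fine acquisition spacing).
# The uniform +/-25 m perturbation model is an assumption for illustration.
n, dx = 201, 50.0
regular = dx * np.arange(n)
irregular = regular + rng.uniform(-25.0, 25.0, size=n)

print(regular.shape, float(np.max(np.abs(irregular - regular))) <= 25.0)
```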

Regular grid
[Figure: receivers on the regular 201 x 201 grid.]

Unstructured grid
[Figure: receivers on the irregularly perturbed 201 x 201 grid.]

Regular vs irregular grid - singular values

[Figure: normalized singular values vs. singular value index for the (source x, receiver x), source x, and receiver x matricizations. Black - regular grid; blue - irregular grid.]

Off the grid tensor interpolation

The data volume is no longer low rank when irregularly sampled
• standard tensor completion framework won't work well

Solution
• construct a domain where the data is low rank
• choose an appropriate transform: low rank domain -> sampling domain
• incorporate the transform into the optimization problem

Multidimensional interpolation with Hierarchical Tucker

Successful reconstruction scheme

Signal structure
• Hierarchical Tucker

Sampling
• subsampling, noise increases hierarchical rank

Optimization
• fit data in the Hierarchical Tucker format

Optimization program

$\phi(x)$ : parameter space → full-tensor space

Optimization problem

The standard problem we solve is
$$\min_x \; \|A\phi(x) - b\|_2^2$$

Our sampling operator is typically $A = RP$, where
R : regular full grid → subsampled grid
P : (src x, rec x, src y, rec y) → (src x, src y, rec x, rec y)
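The operators P and R act on the data volume by axis permutation and restriction; a minimal numpy sketch (the receiver index set `keep` is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
x = rng.standard_normal((n, n, n, n))  # axes: (src x, rec x, src y, rec y)

# P: permute (src x, rec x, src y, rec y) -> (src x, src y, rec x, rec y).
def P(t):
    return np.transpose(t, (0, 2, 1, 3))

# R: restrict to a subsampled set of receivers (hypothetical index set).
keep = [(0, 1), (2, 3), (1, 0)]
def R(t):
    return np.stack([t[:, :, i, j] for (i, j) in keep])

b = R(P(x))      # A = RP applied to the full volume
print(b.shape)   # (3, 4, 4): one (src x, src y) slice per kept receiver
```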

Optimization problem

In the irregular grid case, the subsampling operator is in fact
R : irregular full grid → subsampled grid

In order to take this discrepancy into account, we introduce an operator
F : regular full grid → irregular full grid

This is an extension of [1] to the tensor case.

[1] R. Kumar, O. Lopez, E. Esser, and F. J. Herrmann. Matrix completion on unstructured grids: 2-D seismic data regularization and interpolation. EAGE 2015.

Optimization problem

The sequence of operators is then
P : (src x, rec x, src y, rec y) → (src x, src y, rec x, rec y)
F : regular full grid → irregular full grid
R : irregular full grid → subsampled grid

We set $A = RFP$ and use the same optimization code as previously in [1]

In our examples, we use the non-uniform Fourier transform [2]

[1] C. Da Silva and F. J. Herrmann. Optimization on the hierarchical Tucker manifold - applications to tensor completion. Linear Algebra and its Applications.
[2] L. Greengard and J.-Y. Lee. Accelerating the nonuniform fast Fourier transform. SIAM Review.
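A direct (slow) evaluation of the transform behind F: a signal known by its Fourier coefficients on a regular frequency grid is evaluated at irregular spatial points. The 1D setup and dense matrix are simplifications for illustration; in practice the fast algorithm of [2] replaces the O(NM) loop.

```python
import numpy as np

rng = np.random.default_rng(5)

# F: evaluate f(x) = sum_k c_k exp(2*pi*i*k*x) at arbitrary points x.
# Direct O(N*M) evaluation; a fast NUFFT would be used in practice.
def F(coeffs, points):
    k = np.arange(len(coeffs))
    return np.exp(2j * np.pi * np.outer(points, k)) @ coeffs

N = 16
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Sanity check: on the regular grid j/N this reduces to an inverse DFT.
regular = np.arange(N) / N
err = np.linalg.norm(F(c, regular) - N * np.fft.ifft(c))
print(err < 1e-10)  # True
```

Perturbing `regular` gives the regular-to-irregular map F used inside $A = RFP$.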

Optimization problem

Main computational costs
• Gauss-Newton method -> convergence in ~15 iterations
• ~4 objective evaluations per iteration
• ~60 applications of the interpolation operator
  • main source of computational costs
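The talk's solver works on the HT parameter manifold, which is not reproduced here; the shape of a Gauss-Newton iteration itself can be sketched on a generic small nonlinear least-squares problem (toy residual, assumed for illustration):

```python
import numpy as np

# Generic Gauss-Newton for min_x ||r(x)||^2: each step solves a linearized
# least-squares problem. This is a sketch, not the HT manifold solver.
def gauss_newton(r, J, x, iters=15):
    for _ in range(iters):
        # Step d solves min_d ||J(x) d - r(x)||, then x <- x - d.
        x = x - np.linalg.lstsq(J(x), r(x), rcond=None)[0]
    return x

# Toy problem: fit (a, b) in y = a * exp(b * t).
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)
r = lambda x: x[0] * np.exp(x[1] * t) - y
J = lambda x: np.stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)], axis=1)

sol = gauss_newton(r, J, np.array([1.0, 0.0]))
print(np.allclose(sol, [2.0, -1.5]))  # True
```

For a zero-residual problem like this, Gauss-Newton converges quadratically near the solution, consistent with the ~15 iterations cited above.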

Results

Synthetic BG Group data

Unknown model
• 68 x 68 sources with 401 x 401 receivers, data at 7.34 Hz, 12.3 Hz

Receivers sampled on a randomly perturbed 201 x 201 grid

We compare to 'vanilla' tensor completion, where we just bin the data to a regular grid
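The SNR values quoted on the results slides are presumably the standard reconstruction SNR in dB; a minimal sketch under that assumption:

```python
import numpy as np

# Reconstruction SNR in dB (standard definition, assumed here):
# SNR = 20 * log10(||true|| / ||true - recovered||).
def snr_db(true, recovered):
    return 20.0 * np.log10(np.linalg.norm(true) / np.linalg.norm(true - recovered))

true = np.ones(100)
recovered = true + 0.1  # uniform 10% error
print(round(float(snr_db(true, recovered)), 1))  # 20.0
```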

7.34 Hz - 75% missing receivers
Common source gather

[Figure: true data vs. subsampled data; axes receiver x, receiver y.]

7.34 Hz - 75% missing receivers
Common source gather

[Figure: true data vs. vanilla tensor completion (SNR 9.46 dB); axes receiver x, receiver y.]

7.34 Hz - 75% missing receivers
Common source gather

[Figure: true data vs. regularized tensor completion (SNR 15.7 dB); axes receiver x, receiver y.]

7.34 Hz - 75% missing receivers
Common source gather

[Figure: vanilla tensor completion difference vs. regularized tensor completion difference; axes receiver x, receiver y.]

12.3 Hz - 75% missing receivers
Common source gather

[Figure: true data vs. subsampled data; axes receiver x, receiver y.]

12.3 Hz - 75% missing receivers
Common source gather

[Figure: true data vs. vanilla tensor completion (SNR 4.45 dB); axes receiver x, receiver y.]

12.3 Hz - 75% missing receivers
Common source gather

[Figure: true data vs. regularized tensor completion (SNR 11.4 dB); axes receiver x, receiver y.]

12.3 Hz - 75% missing receivers
Common source gather

[Figure: vanilla tensor completion difference vs. regularized tensor completion difference; axes receiver x, receiver y.]

7.34 Hz - 90% missing receivers
Common source gather

[Figure: true data vs. subsampled data; axes receiver x, receiver y.]

7.34 Hz - 90% missing receivers
Common source gather

[Figure: true data vs. vanilla tensor completion (SNR 6.95 dB); axes receiver x, receiver y.]

7.34 Hz - 90% missing receivers
Common source gather

[Figure: true data vs. regularized tensor completion (SNR 11 dB); axes receiver x, receiver y.]

Conclusion

3D seismic data has an underlying structure that we can exploit for interpolation (Hierarchical Tucker format)

Different schemes for organizing data - important for recovery

Conclusion

We can interpolate HT tensors with missing entries using the Riemannian manifold structure of the HT format

It is important and worthwhile to account for the off-grid nature of sampling when interpolating data
• binning just doesn't cut it

Acknowledgements

This work was in part financially supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant (22R81254) and the Collaborative Research and Development Grant DNOISE II (375142-08). This research was carried out as part of the SINBAD II project with support from the following organizations: BG Group, BGP, BP, CGG, Chevron, ConocoPhillips, ION, Petrobras, PGS, Total SA, WesternGeco, Statoil, and Woodside.

Thank you for your attention
