Page 1: Pattern Decomposition with Scaling - Cornell Universitylebras/publications/pattern... · 2012. 1. 27. · Motivation Ronan Le Bras - CP’11, Sept 15, 2011 2 Cornell Fuel Cell Institute

Constraint Reasoning and Kernel Clustering for Pattern Decomposition With Scaling

Ronan Le Bras (Computer Science)

Theodoros Damoulas (Computer Science)

Ashish Sabharwal (Computer Science)

Carla P. Gomes (Computer Science)

John M. Gregoire (Materials Science / Physics)

Bruce van Dover (Materials Science / Physics)

Sept 15, 2011, CP’11

Page 2

Motivation

Ronan Le Bras - CP’11, Sept 15, 2011 2

Cornell Fuel Cell Institute

Mission: develop new materials for fuel cells.

An Electrocatalyst must:

1) Be electronically conducting

2) Facilitate both reactions

Platinum is the best known metal to fulfill that role, but:

1) The reaction rate is still considered slow (causing energy loss)

2) Platinum is fairly costly, intolerant to fuel contaminants, and has a short lifetime.

Goal: Find an intermetallic compound that is a better catalyst than Pt.

Page 3

Motivation

Recipe for finding alternatives to Platinum

1) In a vacuum chamber, place a silicon wafer.

2) Add three metals.

3) Mix until smooth, using three sputter guns.

4) Bake for 2 hours at 650 °C

[Figure: Ta-Rh-Pd composition diagram; example composition (38% Ta, 45% Rh, 17% Pd). Source: Pyrotope, Sebastien Merkel]

• Deliberately inhomogeneous composition on Si wafer

• Atoms are intimately mixed

Page 4

Motivation

Identifying crystal structure using X-Ray Diffraction at CHESS

• XRD pattern characterizes the underlying crystal fairly well

• Expensive experiments: Bruce van Dover’s research team has access to the facility one week every year.

[Figure: Ta-Rh-Pd composition diagram with the sample at (38% Ta, 45% Rh, 17% Pd) and its XRD pattern (horizontal axis ticks 15 to 60, vertical axis ticks 0 to 120). Source: Pyrotope, Sebastien Merkel]

Page 5

Motivation

[Figure: Ta-Rh-Pd composition diagram with regions labeled α, β, δ, γ]

Page 6

Motivation

[Figure: Ta-Rh-Pd composition diagram with regions labeled α, β, α+β, δ, γ, γ+δ]

Page 7

Motivation

[Figure: Ta-Rh-Pd composition diagram with regions labeled α, β, δ, γ]

Page 8

Motivation

[Figure: Ta-Rh-Pd composition diagram with regions labeled α, β, δ, γ, α+β, and two XRD patterns labeled Pi and Pj (horizontal axis ticks 15 to 60, vertical axis ticks 0 to 120)]

Page 9

Motivation

[Figure: Ta-Rh-Pd composition diagram with regions labeled α, β, α+β, δ, γ, and three XRD patterns labeled Pi, Pk, Pj]

Page 10

Motivation

[Figure: Fe-Al-Si composition diagrams with two example XRD patterns; labeled areas: pure phase region, mixed phase region]

INPUT: XRD patterns measured on the Fe-Al-Si diagram (figure)

OUTPUT:

• m phase regions: k pure regions, m-k mixed regions

• XRD pattern characterizing pure phases

Additional physical characteristics:

• Peaks shift by 15% within a region

• Phase connectivity

• Mixtures of 3 phases

• Small peaks might be discriminative

• Peak locations matter more than peak intensities

Page 11

Motivation

Figure 1: Phase regions of Ta-Rh-Pd

Figure 2: Fluorescence activity of Ta-Rh-Pd

Page 12

Outline

• Motivation

• Problem Definition

    Abstraction

    Hardness

• CP Model

• Kernel-based Clustering

• Bridging CP and Machine learning

• Conclusion and Future work

Page 13

Problem Abstraction: Pattern Decomposition with Scaling

[Figure: Input and Output panels]

Page 14

Problem Abstraction: Pattern Decomposition with Scaling

[Figure: Input and Output panels; input points v1, v2, v3, v4, v5]

Page 15

Problem Abstraction: Pattern Decomposition with Scaling

[Figure: Input and Output panels; input points v1 to v5 with patterns P1 to P5]

Page 16

Problem Abstraction: Pattern Decomposition with Scaling

[Figure: Input and Output panels; input points v1 to v5 with patterns P1 to P5]

M = K = 2, δ = 1.5

Page 17

Problem Abstraction: Pattern Decomposition with Scaling

[Figure: Input and Output panels; input points v1 to v5 with patterns P1 to P5; output basis patterns B1, B2]

M = K = 2, δ = 1.5

Page 18

Problem Abstraction: Pattern Decomposition with Scaling

Input: points v1 to v5 with patterns P1 to P5; M = K = 2, δ = 1.5

Output: basis patterns B1, B2 and scale factors ski (basis pattern Bk at point vi):

        v1     v2     v3     v4     v5
B1      1.00   0.95   0.88   0.84   0
B2      0.68   0.78   0.85   0.96   1.00
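The example on this slide can be checked with a short sketch. It assumes, as the problem name suggests, that each observed pattern Pi is the union of the basis patterns' peak locations multiplied by their scale factors, that a factor of 0 means the basis pattern is absent, that nonzero factors lie in [1/δ, δ], and that at most M basis patterns are used per point. These readings, the peak locations in `B`, and all function names are illustrative assumptions, not taken from the slides.

```python
from typing import List

DELTA = 1.5  # maximum scaling, from the slide (δ = 1.5)
M = 2        # assumed meaning: at most M basis patterns per point

# Scale factors from the slide: S[k][i] is the factor of B(k+1) at v(i+1).
S: List[List[float]] = [
    [1.00, 0.95, 0.88, 0.84, 0.00],
    [0.68, 0.78, 0.85, 0.96, 1.00],
]

# Hypothetical peak locations for B1 and B2 (not given on the slides).
B: List[List[float]] = [[20.0, 35.0], [30.0, 50.0]]


def is_valid_scale(s: float, delta: float) -> bool:
    """A factor is either 0 (pattern absent) or within [1/delta, delta]."""
    return s == 0 or 1.0 / delta <= s <= delta


def phases_used(i: int) -> int:
    """Number of basis patterns with a nonzero factor at point i."""
    return sum(1 for k in range(len(B)) if S[k][i] > 0)


def compose(i: int) -> List[float]:
    """Peak locations expected at point i: union of scaled basis peaks."""
    peaks = set()
    for k, basis in enumerate(B):
        if S[k][i] > 0:
            peaks.update(round(S[k][i] * q, 2) for q in basis)
    return sorted(peaks)
```

With these hypothetical basis patterns, `compose(4)` contains only the peaks of B2, because the factor of B1 at v5 is 0.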

Page 19

Problem Hardness

Assumptions: each Bk appears by itself in some vi, and there is no experimental noise.

    The problem can be solved* in polynomial time.

Assumption: no experimental noise only (a basis pattern need not appear by itself).

    The problem becomes NP-hard (reduction from the “Set Basis” problem).

Assumption used in this work: experimental noise in the form of missing elements in Pi.

[Figure: the example with points v1 to v5, patterns P1 to P5, basis patterns B1 and B2, M = K = 2, δ = 1.5, and the scale factors s11 to s25, plus a label “P0=”]

Page 20

Outline

• Motivation

• Problem Definition

• CP Model

    Model

    Experimental Results

• Kernel-based Clustering

• Bridging CP and Machine learning

• Conclusion and Future work

Page 21

CP Model

Page 22

CP Model (continued)

Advantage: Captures physical properties and relies on peak location rather than height.

Drawback: Does not scale to realistic instances; poor propagation in the presence of experimental noise.

Page 23

CP Model – Experimental Results

[Figure: Number of unknown phases vs. running times (Al-Li-Fe with |P| = 24). Horizontal axis: number of unknown phases (K’), 0 to 6. Vertical axis: running time (in s), 0 to 1400. Series: N=10, N=15, N=28, N=218]

For realistic instances, K’ = 6 and N ≈ 218...

Page 24

Outline

• Motivation

• Problem Definition

• CP Model

• Kernel-based Clustering

• Bridging CP and Machine learning

• Conclusion and Future work

Page 25

Kernel-based Clustering

Set of features: X =

Similarity matrices: [X·X^T]

Better approach: take shifts into account

Method: K-means on Dynamic Time-Warping kernel

Goal: Select groups of samples that belong to the same phase region to feed the CP model, in order to extract the underlying phases of these sub-problems.

[Figure: similarity matrices. Red: similar; blue: dissimilar]
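A minimal sketch of how shift-tolerant similarities could feed a clustering step. The slides do not specify the kernel form or the clustering variant, so the `exp(-d / sigma)` similarity, the kernel k-means update, the toy patterns, and all names below are assumptions. Note also that a DTW-based similarity is not guaranteed to be a positive semi-definite kernel.

```python
import math
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def dtw_similarity_matrix(patterns: Sequence[Sequence[float]],
                          sigma: float = 1.0) -> List[List[float]]:
    """Assumed similarity: exp(-DTW distance / sigma)."""
    return [[math.exp(-dtw_distance(p, q) / sigma) for q in patterns]
            for p in patterns]


def kernel_kmeans(K: List[List[float]], labels: Sequence[int],
                  n_clusters: int, max_iter: int = 20) -> List[int]:
    """Reassign each sample to the cluster whose feature-space mean is nearest."""
    n = len(K)
    labels = list(labels)
    for _ in range(max_iter):
        members = [[j for j in range(n) if labels[j] == c]
                   for c in range(n_clusters)]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, idx in enumerate(members):
                if not idx:
                    continue  # skip empty clusters
                cross = sum(K[i][j] for j in idx) / len(idx)
                within = sum(K[j][l] for j in idx for l in idx) / len(idx) ** 2
                d = K[i][i] - 2.0 * cross + within
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy intensity profiles: three shifted copies of one pattern, two of another.
patterns = [
    [0, 0, 5, 0, 0, 0, 3, 0, 0, 0],
    [0, 0, 0, 5, 0, 0, 0, 3, 0, 0],
    [0, 5, 0, 0, 0, 3, 0, 0, 0, 0],
    [0, 4, 0, 0, 0, 0, 0, 0, 6, 0],
    [0, 0, 4, 0, 0, 0, 0, 0, 6, 0],
]
K = dtw_similarity_matrix(patterns)
clusters = kernel_kmeans(K, [0, 0, 1, 1, 1], n_clusters=2)
```

Unconstrained DTW tolerates arbitrarily large shifts; bounding the warping window would be one way to reflect a limited physical shift, but that choice is not taken from the slides.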

Page 26

Bridging CP and ML

Goal: a robust, physically meaningful, scalable, automated solution method that combines:

• Machine Learning for a “global data-driven view”: similarity “kernels” & clustering

• A language for constraints enforcing “local details”: Constraint Programming model

• Underlying physics

Page 27

Bridging Constraint Reasoning and Machine Learning: Overview of the Methodology

[Figure: pipeline from INPUT (XRD patterns on the Fe-Al-Si diagram) to OUTPUT]

1) Peak detection

2) Machine Learning: kernel methods, Dynamic Time Warping

3) Machine Learning: partial “clustering”

4) CP Model & Solver on sub-problems (two “only” regions and one “+” region)

5) Full CP Model guided by partial solutions

• Fix errors in data
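The slides do not say how the peak detection step is done. As a purely illustrative sketch, a local-maximum rule with a height threshold could turn an intensity profile into a set of peak locations; the function name, the threshold, and the sample values below are assumptions.

```python
from typing import List, Sequence


def detect_peaks(angles: Sequence[float], intensities: Sequence[float],
                 min_height: float) -> List[float]:
    """Return the angles of local maxima that reach at least min_height.

    A point counts as a peak if it is strictly higher than its left
    neighbor and at least as high as its right neighbor, so a flat top
    is reported once (at its left edge).
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y >= min_height and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append(angles[i])
    return peaks


# Hypothetical profile sampled at six angles.
angles = [15, 20, 25, 30, 35, 40]
intensities = [0, 10, 80, 20, 60, 5]
peaks = detect_peaks(angles, intensities, min_height=50)
```

Because peak locations matter more than intensities in this problem, the sketch keeps only the angles and discards the heights.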

Page 28

Experimental Validation

Al-Li-Fe instance with 6 phases:

• Ground truth (known)

• Our CP + ML hybrid approach is much more robust

• Previous work (NMF) violates many physical requirements

Page 29

Conclusion

• Hybrid approach to clustering under constraints

    More robust than data-driven “global” ML approaches

    More scalable than a pure CP model “locally” enforcing constraints

• An exciting application in close collaboration with physicists

    Best inference out of expensive experiments

    Towards the design of better fuel cell technology

Page 30

Future work

Ongoing work:

• Spatial clustering, to further enhance cluster quality

• Bayesian approach, to better exploit prior knowledge about local smoothness and available inorganic libraries

Active learning:

• Where to sample next, assuming we can interfere with the sampling process?

• When to stop sampling if sufficient information has been obtained?

Correlating catalytic properties across many thin films:

• In order to understand the underlying physical mechanism of catalysis and to find promising intermetallic compounds

Page 31

The end

Thank you!

Page 32

Extra slides

Page 33

Experimental Sample

Example on Al-Li-Fe diagram:

Page 34

Applications with similar structure

Flight Calls / Bird conservation

Identifying bird populations from sound recordings at night.

Analogy:

    basis pattern = species

    samples = recordings

    physical constraints = spatial constraints, species and season specificities…

Fire Detection

Detecting/locating fires.

Analogy:

    basis pattern = heat sources

    samples = temperature recordings

    physical constraints = gradient of temperatures, material properties…

Page 35

Previous Work 1: Cluster Analysis (Long et al., 2007)

xi = (Feature vector)

Steps:

1) Pearson correlation coefficients

2) Distance matrix

3) PCA – 3 dimensional approx

4) Hierarchical Agglomerative Clustering

Drawback: Requires sampling of pure phases, detects phase regions (not phases), overlooks peak shifts, may violate physical constraints (phase continuity, etc.).
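To illustrate why a correlation-based distance overlooks peak shifts, here is a small sketch of the first two steps. The slide does not give the exact distance definition, so `1 - r` (with r the Pearson correlation coefficient), the function names, and the toy patterns are assumptions.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


def correlation_distance_matrix(patterns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Assumed distance: 1 - Pearson correlation."""
    return [[1.0 - pearson(p, q) for q in patterns] for p in patterns]


base = [0, 0, 5, 0, 0, 3, 0, 0]
scaled = [2 * v for v in base]        # same peaks, double intensity
shifted = [0, 0, 0, 5, 0, 0, 3, 0]    # same peaks, moved by one position
dist = correlation_distance_matrix([base, scaled, shifted])
```

In this toy case the intensity-scaled copy stays at distance (numerically) zero from the original, while the copy shifted by a single position ends up farther away than an uncorrelated pattern would be, even though it plausibly comes from the same phase.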

Page 36

Previous Work 2: NMF (Long et al., 2009)

xi = (Feature vector)

$X = A S + E$ (linear positive combination (A) of basis patterns (S))

$\min \|E\|_F^2$ (minimizing squared Frobenius norm)

Drawback: Overlooks peak shifts (linear combination only), may violate physical constraints (phase continuity, etc.).
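For reference, a factorization of the form X = A·S + E with non-negative factors can be sketched with the standard multiplicative update rules. This is a generic NMF sketch in plain Python, not necessarily the procedure used in the cited work; the function names, the random initialization, and the toy matrix are assumptions.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A: Matrix) -> Matrix:
    return [list(col) for col in zip(*A)]


def frob_sq(X: Matrix, A: Matrix, S: Matrix) -> float:
    """Squared Frobenius norm of the residual E = X - A.S."""
    R = matmul(A, S)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))


def nmf(X: Matrix, k: int, iters: int = 200, seed: int = 0,
        eps: float = 1e-9) -> Tuple[Matrix, Matrix]:
    """Multiplicative updates for X ~ A.S with A, S >= 0."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    A = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    S = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # S <- S * (A^T X) / (A^T A S), element-wise
        At = transpose(A)
        num, den = matmul(At, X), matmul(matmul(At, A), S)
        S = [[s * nu / (de + eps) for s, nu, de in zip(sr, nr, dr)]
             for sr, nr, dr in zip(S, num, den)]
        # A <- A * (X S^T) / (A S S^T), element-wise
        St = transpose(S)
        num, den = matmul(X, St), matmul(A, matmul(S, St))
        A = [[a * nu / (de + eps) for a, nu, de in zip(ar, nr, dr)]
             for ar, nr, dr in zip(A, num, den)]
    return A, S


# Toy data: two basis-like rows and one equal mixture of them.
X = [[5.0, 0.0, 3.0, 0.0],
     [0.0, 4.0, 0.0, 6.0],
     [2.5, 2.0, 1.5, 3.0]]
A0, S0 = nmf(X, k=2, iters=0)   # random initialization only
A, S = nmf(X, k=2, iters=200)
```

Each row of S is tied to fixed positions along the pattern, so a peak that moves between samples cannot be represented by rescaling a single basis row, which is the limitation named in the drawback above.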
