Statistical Regimes Across Constrainedness Regions Carla P. Gomes, Cesar Fernandez Bart Selman, and Christian Bessiere Cornell University Universitat de Lleida LIRMM-CNRS CP 2004 Toronto

Post on 19-Dec-2015


Page 1

Statistical Regimes Across Constrainedness Regions

Carla P. Gomes, Cesar Fernandez, Bart Selman, and Christian Bessiere

Cornell University, Universitat de Lleida, LIRMM-CNRS

CP 2004, Toronto

Page 2

Motivation

Bring together recent results on:

• Typical Case Analysis

• Randomized Complete Search Methods

• Heavy-Tailed Phenomena

• Random CSP Models

Page 3

Typical Case Analysis: Beyond NP-Completeness

[Figure: mean computational cost and % of solvable instances, plotted against constrainedness]

Phase transition phenomenon: discriminating “easy” vs. “hard” instances

Hogg et al 96

Page 4

Exceptionally Hard Instances

Seem to defy the “easy-hard” pattern:

– such instances occur in the under-constrained area;

– they are considerably harder than other similar instances and even harder than instances from the critically constrained area.

Gent and Walsh 94; Hogg and Williams 94; Smith and Grant 97

Page 5

Are Exceptionally Hard Instances Truly Hard?

• Different algorithms encounter different exceptionally hard instances.

• The “hardness” of exceptionally hard instances is not necessarily a property of the instances themselves, but rather of the combination of the instance with the details of the search method.

Gent and Walsh 94; Hogg and Williams 94; Selman and Kirkpatrick 96; Smith and Grant 97

Page 6

Randomized Backtrack Search

What if we introduce a tiny element of randomness into the search heuristic (e.g., by breaking ties randomly) and run this (still complete) randomized search procedure on the same instance over and over again?

Study of runtime distributions of a randomized backtrack search on the same instance: a way of isolating the variance caused solely by the algorithm.

Gomes et al CP 97
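The experiment described above can be sketched in a few lines. This is a minimal illustration, not the setup used in the talk: it assumes n-queens as a stand-in instance, randomizes only the value order, and uses the hypothetical names `randomized_backtrack` and `runtime_sample`.

```python
import random

def randomized_backtrack(n, rng):
    """Solve n-queens by complete backtracking with a random value order.

    Returns (solution, backtracks). Randomness only changes the order in
    which columns are tried, so the search remains complete.
    """
    cols = []          # cols[r] = column of the queen placed in row r
    backtracks = 0

    def consistent(c):
        r = len(cols)
        return all(c != pc and abs(c - pc) != r - pr
                   for pr, pc in enumerate(cols))

    def extend():
        nonlocal backtracks
        if len(cols) == n:
            return True
        values = list(range(n))
        rng.shuffle(values)            # the "tiny element of randomness"
        for c in values:
            if consistent(c):
                cols.append(c)
                if extend():
                    return True
                cols.pop()
                backtracks += 1        # undo an assignment that led nowhere
        return False

    found = extend()
    return (list(cols) if found else None), backtracks

def runtime_sample(n, runs, seed=0):
    """Sorted backtrack counts of `runs` independent runs on the same instance."""
    rng = random.Random(seed)
    return sorted(randomized_backtrack(n, rng)[1] for _ in range(runs))

if __name__ == "__main__":
    sample = runtime_sample(n=10, runs=100)
    print("min", sample[0], "median", sample[len(sample) // 2], "max", sample[-1])
```

Because the instance is fixed, any spread between the minimum and maximum backtrack counts comes from the algorithm's random choices alone.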

Page 7

Extreme Variance in Runtime of Randomized Backtrack Search

Easy instance – 15 % preassigned cells

[Figure: run-time labels, extracted as “Time: >20003011 >20007” (spacing lost in extraction)]

Gomes et al 97

Page 8

Heavy-tailed distributions

Exponential decay for standard distributions, e.g. Normal, Lognormal, exponential:

$$\Pr[X > x] \sim C e^{-x^{2}}, \quad \text{for some } C > 0$$

Heavy-tailed power-law decay, e.g. Pareto-Levy:

$$\Pr[X > x] \sim C x^{-\alpha}, \quad x > 0$$

[Figure; surviving curve label: Normal]

(Frost et al 97; Gomes et al 97; Hoos 1999; Walsh 99)
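To make the difference between the two decay rates concrete, the sketch below evaluates both tails at a few points. The standard normal and a Pareto distribution with alpha = 0.5 are illustrative choices, not values taken from the talk; `normal_tail` and `pareto_tail` are hypothetical helper names.

```python
import math

def normal_tail(x, mu=0.0, sigma=1.0):
    """Pr[X > x] for a Normal(mu, sigma) variable, via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def pareto_tail(x, alpha, x_min=1.0):
    """Pr[X > x] for a Pareto variable: (x_min / x) ** alpha for x >= x_min."""
    return 1.0 if x < x_min else (x_min / x) ** alpha

# The normal tail collapses within a few standard deviations,
# while the power-law tail shrinks only polynomially.
for x in (2, 5, 10, 20):
    print(x, normal_tail(x), pareto_tail(x, alpha=0.5))
```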

Page 9

Visualization of Heavy-tailed Phenomenon (Log-Log Plot of Tail of Distribution)

[Figure: log-log plot of the tail. Y-axis: 1-F(x), unsolved fraction. X-axis: runtime (number of backtracks), log scale. Curves: heavy-tailed distribution, Normal(2,1000000), Normal(2,1). Annotations: median = 2 (50%); 0.1% > 200000.]

Page 10

Formal Results

Abstract Search Tree Models with provably heavy-tailed behavior (Chen, Gomes, Selman 2001)

Generalization and Assignment of Semantics to the Abstract Search Tree Models

(Williams, Gomes, Selman 2003)

Provably Polytime Restart Strategies

(Williams, Gomes, Selman 2003)

Page 11

What about concrete CSP models? (So far there is no good characterization of the runtime distributions of concrete CSP models.)

Page 12

Research Questions:

1. Can we provide a characterization of heavy-tailed behavior: when it occurs and when it does not?

2. Can we identify different tail regimes across different constrainedness regions?

3. Can we get further insights into the tail regime by analyzing the concrete search trees produced by the backtrack search method?

Concrete CSP Models

Complete Randomized Backtrack Search

Page 13

Outline of the Rest of the Talk

• Random Binary CSP Models

• Encodings of CSP Models

• Randomized Backtrack Search Algorithms

• Search Trees

• Statistical Tail Regimes Across Constrainedness Regions

– Empirical Results

– Theoretical Model

• Conclusions

Page 14

Binary Constraint Networks

• A finite binary constraint network P = (X, D, C):

– a set of n variables X = {x1, x2, …, xn};

– a set of finite domains, one for each variable, D = {D(x1), D(x2), …, D(xn)};

– a set C of binary constraints between pairs of variables; a constraint Cij on the ordered pair of variables (xi, xj) is a subset of the Cartesian product D(xi) x D(xj) that specifies the allowed combinations of values for the variables xi and xj.

– Solution to the constraint network: an instantiation of the variables such that all constraints are satisfied.
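The definition above maps directly onto plain data structures. The following sketch uses a made-up three-variable network; the variable names, domains, constraints, and the helpers `is_solution` and `all_solutions` are all illustrative assumptions.

```python
from itertools import product

# Hypothetical toy network: 3 variables, each with domain {0, 1, 2}.
domains = {"x1": {0, 1, 2}, "x2": {0, 1, 2}, "x3": {0, 1, 2}}

# Each constraint C_ij is the set of ALLOWED value pairs for (xi, xj).
constraints = {
    ("x1", "x2"): {(a, b) for a, b in product(range(3), repeat=2) if a != b},
    ("x2", "x3"): {(a, b) for a, b in product(range(3), repeat=2) if a < b},
}

def is_solution(assignment, domains, constraints):
    """True if every variable takes a domain value and all constraints hold."""
    if set(assignment) != set(domains):
        return False
    if any(assignment[x] not in domains[x] for x in domains):
        return False
    return all((assignment[xi], assignment[xj]) in allowed
               for (xi, xj), allowed in constraints.items())

def all_solutions(domains, constraints):
    """Enumerate solutions by brute force (only sensible for tiny networks)."""
    names = sorted(domains)
    for values in product(*(sorted(domains[x]) for x in names)):
        candidate = dict(zip(names, values))
        if is_solution(candidate, domains, constraints):
            yield candidate

print(is_solution({"x1": 0, "x2": 1, "x3": 2}, domains, constraints))
```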

Page 15

Random Binary CSP Models

Model B <N, D, c, t>

N – number of variables;

D – size of the domains;

c – number of constrained pairs of variables; p1 – proportion of binary constraints included in the network; $c = p_1 N(N-1)/2$;

t – tightness of constraints; p2 – proportion of forbidden tuples; $t = p_2 D^2$

Model E <N, D, p>

N – number of variables;

D – size of the domains;

p – proportion of forbidden pairs (out of $D^2 N(N-1)/2$)

(Achlioptas et al 2000)

(Gent et al 1996)

N – from 15 to 50; (Xu and Li 2000)
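A Model B instance can be generated as follows. This is a sketch under the usual reading of the model: exactly c pairs of variables are chosen uniformly at random, and each gets exactly t forbidden tuples. The function name `model_b` and the rounding of c and t to integers are assumptions of this example.

```python
import random
from itertools import combinations, product

def model_b(n, d, p1, p2, seed=None):
    """Generate a Model B instance <N, D, c, t>.

    Returns a dict mapping each constrained pair (i, j), i < j, to its set
    of FORBIDDEN value pairs. Rounding c and t to the nearest integer is an
    assumption of this sketch.
    """
    rng = random.Random(seed)
    c = round(p1 * n * (n - 1) / 2)      # number of constrained pairs
    t = round(p2 * d * d)                # forbidden tuples per constraint
    pairs = rng.sample(list(combinations(range(n), 2)), c)
    tuples = list(product(range(d), repeat=2))
    return {pair: set(rng.sample(tuples, t)) for pair in pairs}

instance = model_b(n=20, d=10, p1=0.5, p2=0.3, seed=0)
print(len(instance), "constraints,",
      len(next(iter(instance.values()))), "forbidden tuples each")
```

Sweeping p2 (or t) while holding the other parameters fixed is one way to move an instance family across constrainedness regions.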

Page 16

Encodings

• Direct CSP Binary Encoding

• Satisfiability Encoding (direct encoding)

Walsh 2000
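As an illustration of the satisfiability encoding, the sketch below builds a direct encoding with one Boolean per (variable, value) pair and three clause families: at-least-one, at-most-one, and conflict clauses. The function name `direct_encoding`, the input layout, and the DIMACS-style literal numbering are assumptions of this example, not details from the slides.

```python
from itertools import combinations

def direct_encoding(domains, forbidden):
    """Direct SAT encoding of a binary CSP.

    domains:   list of domain sizes; variable i takes values 0..domains[i]-1
    forbidden: dict {(i, j): set of forbidden (a, b) value pairs}
    Returns (num_vars, clauses) with DIMACS-style integer literals.
    """
    offset, total = [], 0
    for size in domains:
        offset.append(total)
        total += size

    def lit(i, a):
        return offset[i] + a + 1         # Boolean "variable i takes value a"

    clauses = []
    for i, size in enumerate(domains):
        clauses.append([lit(i, a) for a in range(size)])       # at least one value
        for a, b in combinations(range(size), 2):
            clauses.append([-lit(i, a), -lit(i, b)])           # at most one value
    for (i, j), nogoods in forbidden.items():
        for a, b in sorted(nogoods):
            clauses.append([-lit(i, a), -lit(j, b)])           # conflict clause
    return total, clauses

num_vars, clauses = direct_encoding([2, 2], {(0, 1): {(0, 0)}})
print(num_vars, clauses)
```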

Page 17

Backtrack Search Algorithms

• Look-ahead performed:

– no look-ahead (simple backtracking, BT);

– removal of values directly inconsistent with the last instantiation performed (forward-checking, FC);

– arc consistency and propagation (maintaining arc consistency, MAC).

• Different heuristics for variable selection (the next variable to instantiate):

– random (random);

– variables pre-ordered by decreasing degree in the constraint graph (deg);

– smallest domain first, ties broken by decreasing degree (dom+deg).

• Different heuristics for value selection:

– random;

– lexicographic.

• For the SAT encodings we used the simplified Davis-Putnam-Logemann-Loveland procedure, with variable/value selection static and random.
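One combination from the list above, forward checking with smallest-domain-first variable selection, might look like the sketch below. It is an assumption-laden illustration: ties are broken at random rather than by degree, values are tried in random order, constraints are given as forbidden pairs, and the name `forward_checking` is hypothetical.

```python
import random

def forward_checking(domains, forbidden, rng=None):
    """Backtrack search with forward checking (FC).

    domains:   dict {var: set of values}
    forbidden: dict {(x, y): set of forbidden (a, b) value pairs}
    Variable order: smallest current domain first, ties broken at random.
    Value order: random. Returns (solution or None, failed value attempts).
    """
    rng = rng or random.Random()
    # conflicts[x][y] holds the forbidden (value of x, value of y) pairs.
    conflicts = {x: {} for x in domains}
    for (x, y), nogoods in forbidden.items():
        conflicts[x].setdefault(y, set()).update(nogoods)
        conflicts[y].setdefault(x, set()).update((b, a) for a, b in nogoods)

    backtracks = 0

    def search(assignment, current):
        nonlocal backtracks
        if len(assignment) == len(domains):
            return dict(assignment)
        unassigned = [x for x in current if x not in assignment]
        smallest = min(len(current[x]) for x in unassigned)
        var = rng.choice([x for x in unassigned if len(current[x]) == smallest])
        values = sorted(current[var])
        rng.shuffle(values)
        for a in values:
            # Look ahead: drop future values that conflict with var = a.
            pruned = dict(current)
            wiped_out = False
            for y, nogoods in conflicts[var].items():
                if y in assignment:
                    continue
                pruned[y] = {b for b in current[y] if (a, b) not in nogoods}
                if not pruned[y]:
                    wiped_out = True
                    break
            if not wiped_out:
                assignment[var] = a
                result = search(assignment, pruned)
                if result is not None:
                    return result
                del assignment[var]
            backtracks += 1
        return None

    start = {x: set(vals) for x, vals in domains.items()}
    return search({}, start), backtracks
```

Passing a seeded `random.Random` makes a run reproducible; varying the seed on a fixed instance gives the runtime distribution of this particular search strategy.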

Page 18

Inconsistent Subtrees

Bessiere et al 2004

Page 19

Distributions

• Runtime distributions of the backtrack search algorithms;

• Distribution of the depth of the inconsistent subtrees found during the search;

All runs were performed without censorship.

Page 20

Main Results

1 – Runtime distributions

2 – Inconsistent Sub-tree Depth Distributions

Dramatically different statistical regimes across the constrainedness regions of CSP models.

Page 21

Runtime distributions

Page 22

Distribution of Depth of Inconsistent Subtrees

Page 23

Applet

Page 24

Depth of Inconsistent Search Tree vs. Runtime Distributions

Page 25

Other Models and More Sophisticated Consistency Techniques

[Figure panels: BT, MAC (Model B)]

Heavy-tailed and non-heavy-tailed regions. As the “sophistication” of the algorithm increases, the heavy-tailed region extends to the right, getting closer to the phase transition.

Page 26

SAT encoding: DPLL

Page 27

Theoretical Model

Page 28

Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model

X – search cost (runtime);

ISTD – depth of an inconsistent sub-tree;

Pistd[ISTD = N] – probability of finding an inconsistent sub-tree of depth N during search;

P[X > x | N] – probability of the search cost being larger than x, given an inconsistent sub-tree of depth N.
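One way to read how these quantities fit together is the law of total probability: conditioning the search cost on the depth N of the inconsistent sub-tree and summing over all depths gives the overall tail. This is a sketch of the structure only; the exact formulation is in the paper.

```latex
P[X > x] \;=\; \sum_{N} P_{istd}[ISTD = N] \; P[X > x \mid N]
```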

Page 29

Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model

See paper for proof details

Page 30

Regressions for B1, B2, k

[Figure panels: regression for B1 and B2; regression for k]

Page 31

Validation: Theoretical Model vs. Runtime Data

α = 0.26 using the model;

α = 0.27 using runtime data.

Page 32

Summary of Results

1 As constrainedness increases, the runtime distributions change from a heavy-tailed to a non-heavy-tailed regime.

This holds for both models (B and E), for the CSP and SAT encodings, and for the different backtrack search strategies.

Page 33

Summary of Results

2 Threshold from the heavy-tailed to non-heavy-tailed regime

– Dependent on the particular search procedure;

– As the efficiency of the search method increases, the extension of the heavy-tailed region increases: the heavy-tailed threshold gets closer to the phase transition.

Page 34

Summary of Results

3 Distribution of the depth of inconsistent search sub-trees

Exponentially distributed inconsistent sub-tree depth (ISTD) combined with exponential growth of the search space as the tree depth increases implies heavy-tailed runtime distributions.

As the ISTD distributions move away from the exponential distribution, the runtime distributions become non-heavy-tailed.
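The first implication can be checked numerically with a toy model. The sketch assumes a depth N with P[N = n] proportional to exp(-b2 * n) and a cost of exactly k ** N for refuting a sub-tree of depth N; under these two assumptions Pr[X > x] decays as x ** (-b2 / ln k). The parameter names echo the B2 and k of the regression slide, but their roles here, the values 0.5 and 2, and the helper names are assumptions of this example.

```python
import math
import random

def sample_cost(b2, k, rng):
    """Draw a depth N with P[N = n] proportional to exp(-b2 * n); return k ** N."""
    q = math.exp(-b2)
    depth = 0
    while rng.random() < q:      # go one level deeper with probability q
        depth += 1
    return k ** depth

def tail_exponent(costs, k, max_depth):
    """Least-squares slope of log Pr[X > x] vs. log x at x = k**0 .. k**max_depth."""
    n = len(costs)
    points = []
    for depth in range(max_depth + 1):
        x = k ** depth
        surv = sum(c > x for c in costs) / n
        if surv > 0:
            points.append((math.log(x), math.log(surv)))
    mx = sum(p[0] for p in points) / len(points)
    my = sum(p[1] for p in points) / len(points)
    slope = (sum((px - mx) * (py - my) for px, py in points)
             / sum((px - mx) ** 2 for px, _ in points))
    return -slope                # alpha in Pr[X > x] ~ C * x ** (-alpha)

rng = random.Random(0)
b2, k = 0.5, 2
costs = [sample_cost(b2, k, rng) for _ in range(50_000)]
print("estimated alpha:", tail_exponent(costs, k, max_depth=8))
print("predicted alpha:", b2 / math.log(k))
```

On a log-log plot the sampled tail is close to a straight line, which is the signature of power-law decay; replacing the exponential depth distribution with a faster-decaying one bends the line downward.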

Page 35

Research Challenges

How to exploit these results in terms of the design of more efficient search procedures?

– Randomization and restart strategies;

– Search heuristics;

– Look ahead and look back strategies.

Very exciting and promising research area!

Page 36

Demos and papers:

www.cs.cornell.edu/gomes/

http://fermat.eup.udl.es/~cesar/

www.cs.cornell.edu/selman/

http://www.lirmm.fr/~bessiere/