Optimization by Model Fitting
Chapter 9
Luke, Essentials of Metaheuristics, 2011
Byung-Hyun Ha
R1
2
Outline
Introduction
Model fitting by classification
Model fitting with a distribution
Summary
3
Introduction
Exploring and/or exploiting the solution space
  • Construction or composition
  • Tweak or mutation
  • Recombination or crossover
  • ... other ways?
From the perspective of statistics
  • Population and sampling
    • e.g., the set of all students, and a sample of students for examining their height
  • Tweaking in search (metaheuristics)
    • Sampling the space of candidate solutions to select high-quality ones
An alternative to selecting and tweaking: using a (statistical) model
  • Classification model
    • Graduate students ;-), decision trees, neural networks, …
  • Probability distribution
4
Introduction
Example: T-problem with 5 jobs
  • Training by sampling 15 solutions from the population of 120
  • Question: what is the quality of 2-5-3-1-0?
How?
  • By classification or by using a probability distribution
0-2-3-4-1(23) 4-1-0-3-2(15) 1-2-3-4-0(12) 0-3-1-2-4(19) 1-4-2-3-0(11)
1-2-4-3-0(11) 1-3-4-0-2(15) 2-1-4-3-0(10) 0-3-2-4-1(24) 1-3-2-0-4(16)
3-4-2-1-0(15) 0-1-3-2-4(19) 2-0-3-1-4(16) 2-4-3-0-1(16) 3-1-0-2-4(19)
4-3-2-1-0(15) 0-4-2-3-1(20) 3-4-0-1-2(15) 4-1-3-0-2(15) 3-1-4-2-0(12)
4-1-3-0-2(15) 1-0-2-3-4(18) 0-4-3-1-2(15) 1-0-2-4-3(17) 3-4-2-1-0(15)
4-2-0-1-3(16) 1-2-3-4-0(12) 4-1-2-3-0(11) 0-4-2-1-3(19) 1-2-4-3-0(11)
0-1-4-3-2(15) 3-2-4-1-0(15) 4-2-3-0-1(17) 0-4-3-2-1(21) 3-1-2-4-0(13)
1-4-2-3-0(11) 0-3-1-2-4(19) 4-1-3-0-2(15) 3-0-1-2-4(19) 2-4-3-1-0(13)
2-3-4-0-1(17) 0-3-4-2-1(21) 0-2-4-3-1(22) 0-4-1-2-3(14) 4-3-2-0-1(18)
3-4-2-0-1(18) 1-4-2-0-3(11) 4-0-2-1-3(19) 0-1-2-3-4(18) 4-3-2-1-0(15)
...
0-1-3-2-4(19) 2-4-3-0-1(16) 2-4-3-1-0(13)
1-0-2-3-4(18) 1-0-2-4-3(17) 3-4-1-0-2(15)
0-3-4-2-1(21) 0-4-1-2-3(14) 2-1-3-0-4(14)
4-1-0-3-2(15) 0-3-1-2-4(19) 2-0-4-1-3(18)
3-2-4-1-0(15) 0-4-3-2-1(21) 4-0-3-1-2(15)
[Figure labels: the solution space as the population → sampling → a sample as representatives of the population. Something we can do?]
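The sampling step in this example can be sketched in a few lines. This is only an illustration: the slide does not say how the 15 solutions were drawn, so uniform sampling without replacement and the fixed seed are assumptions, and no fitness values are computed because the T-problem data is not given.

```python
import random
from itertools import permutations

# The whole solution space: every ordering of jobs 0..4 (5! = 120 solutions).
population = list(permutations(range(5)))

def draw_sample(pop, k, seed=0):
    """Draw k distinct solutions uniformly at random (seed only for repeatability)."""
    return random.Random(seed).sample(pop, k)

sample = draw_sample(population, 15)
for solution in sample:
    print("-".join(str(job) for job in solution))
```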
5
Model Fitting by Classification
Classification problem
  • Given a collection of records, find a model for the class attribute as a function of the values of the other attributes
  • Also called fitting a model, model induction, or machine learning
0-2-3-4-1(23) 4-1-0-3-2(15) 1-2-3-4-0(12)
1-2-4-3-0(11) 1-3-4-0-2(15) 2-1-4-3-0(10)
3-4-2-1-0(15) 0-1-3-2-4(19) 2-0-3-1-4(16)
4-3-2-1-0(15) 0-4-2-3-1(20) 3-4-0-1-2(15)
[Figure: training set → induction → a classification model]
  • Query or test: Is 2-5-3-1-0 a good solution?
  • Generation: Give me a good solution!
6
Model Fitting by Classification
Examples of classification algorithms
  • Graduate students, by the nagging of professors ;-)
  • Decision trees, by C4.5 and ID3
    • c.f., http://www-users.cs.umn.edu/~kumar/dmbook/ch4.pdf
  • k-nearest-neighbor (kNN), by the kNN algorithm
  • Neural networks, by the backpropagation algorithm
Records (training set):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

A decision tree for classification (or prediction):

Refund?
  Yes: NO
  No: MarSt?
    Married: NO
    Single, Divorced: TaxInc?
      < 80K: NO
      > 80K: YES
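As a quick check of how the tree is used for prediction, the sketch below hard-codes the tree from this slide as a function and applies it to the ten training records. The function and variable names are illustrative, and the handling of an income of exactly 80K (which the slide leaves open) is an assumption.

```python
# Training records from the slide: (refund, marital_status, taxable_income_in_K, cheat).
RECORDS = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]

def predict_cheat(refund, marital_status, income_k):
    """Walk the decision tree: Refund, then MarSt, then TaxInc."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    # Single or Divorced: split on taxable income.
    # Assumption: exactly 80K is treated as the "< 80K" branch.
    return "Yes" if income_k > 80 else "No"

correct = sum(predict_cheat(r, m, t) == c for r, m, t, c in RECORDS)
print(f"{correct}/{len(RECORDS)} training records classified correctly")
```

On these ten records the tree reproduces every class label, which is what one expects of a tree induced from this same training set.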
7
Model Fitting by Classification
Classification problem (revisited)
  • Given a collection of records, find a model for the class attribute as a function of the values of the other attributes
Application of classification to search
  • Given a collection of solutions, find a model for fitness as a function of the values of the components of solutions
Generating children from the model
  • Rejection sampling with discriminative models
    • Algorithms 115 and 117
  • Region-based sampling with generative models
    • Algorithm 116
Learnable Evolution Model (LEM)
  • Algorithm 114
[Figure: a classification model used in two ways. As a discriminative model it answers "Is 2-5-3-1-0 a good solution?" (rejection sampling); as a generative model it answers "Give me a good solution!" (region-based sampling).]
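Rejection sampling with a discriminative model can be sketched as follows. This is not a transcription of Algorithm 115 or 117; it is a minimal illustration that assumes a kNN classifier as the model, real-valued solutions in the unit square, and a made-up `toy_fitness` used only to label the training set.

```python
import random

def is_good(candidate, training, k=3):
    """Discriminative model: majority vote of the k nearest labeled solutions."""
    def sq_dist(record):
        point, _label = record
        return sum((a - b) ** 2 for a, b in zip(point, candidate))
    nearest = sorted(training, key=sq_dist)[:k]
    return sum(label for _point, label in nearest) * 2 > k

def rejection_sample(training, n_children, rng, max_tries=10_000):
    """Propose random candidates and keep only those the model calls good."""
    children = []
    for _ in range(max_tries):
        if len(children) == n_children:
            break
        candidate = (rng.random(), rng.random())
        if is_good(candidate, training):
            children.append(candidate)
    return children

def toy_fitness(p):
    # Hypothetical fitness: higher is better, with its peak at (0.8, 0.8).
    return -((p[0] - 0.8) ** 2 + (p[1] - 0.8) ** 2)

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(60)]
cutoff = sorted(toy_fitness(p) for p in points)[-15]  # the 15 fittest count as good
training = [(p, toy_fitness(p) >= cutoff) for p in points]
children = rejection_sample(training, 10, rng)
print(children)
```

The model is only asked yes/no questions here, which is why any classifier will do. The cost is wasted proposals when the good region is small.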
8
Model Fitting by Classification
Examples
  • Inducing a decision tree
  • Generating children from a decision tree
[Figure: the unit square (x and y from 0.0 to 1.0) partitioned into "good" and "bad" regions, next to the induced decision tree, which splits on x at 0.3, 0.5, and 0.7 and on y at 0.4, 0.6, and 0.7.]
9
Model Fitting by Classification
Example (cont’d)
  • A model that specifies the probability
[Figure: the same partition of the unit square and decision tree over x and y, with probabilities at the leaves in place of good/bad labels; values shown: 0.00, 0.75, 0.80, 0.25, 0.14, 0.00, 1.00.]
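Generating children from a tree whose leaves carry probabilities might look like the sketch below. The leaf rectangles and their probabilities are hypothetical (the figure's exact layout is not recoverable from this transcript), and weighting each leaf by probability times area is one plausible design choice, not necessarily what Algorithm 116 does.

```python
import random

# Hypothetical leaves: (x_lo, x_hi, y_lo, y_hi, probability that a point there is good).
LEAVES = [
    (0.0, 0.3, 0.0, 1.0, 0.00),
    (0.3, 0.7, 0.0, 0.6, 0.80),
    (0.3, 0.7, 0.6, 1.0, 0.25),
    (0.7, 1.0, 0.0, 1.0, 1.00),
]

def sample_child(leaves, rng):
    """Pick a leaf with weight probability * area, then a uniform point inside it."""
    weights = [p * (x1 - x0) * (y1 - y0) for x0, x1, y0, y1, p in leaves]
    x0, x1, y0, y1, _p = rng.choices(leaves, weights=weights, k=1)[0]
    return rng.uniform(x0, x1), rng.uniform(y0, y1)

rng = random.Random(0)
children = [sample_child(LEAVES, rng) for _ in range(20)]
print(children)
```

Unlike rejection sampling, no proposal is thrown away: the model itself says where to look, and leaves with probability 0.00 are never chosen.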
10
Model Fitting by Classification
Example (Talbi, 2009)
  • Application of a rule-based classifier to the crossover operator
  • Rules
    • If X4 = 5 and X5 < 2, then class = best
    • ...
  • Patterns matching the rules
    • 5 1
    • ...
  • Possible crossover?
    • 2 1 7 2 1 3 4 3    2 1 7 5 8 1 7 4
    • 3 2 5 7 8 0 7 4    3 2 5 5 1 1 4 3
11
Model Fitting with a Distribution
An alternative form of model
  • A distribution over an infinite-sized population
    • A set of candidate solutions: a sample from the population
  • Working with the sample distribution
Estimation of Distribution Algorithm
  • Representing the distribution of an infinite population with a number of samples
  • Loop: sample a set of individuals, assess them, and adjust the distribution to reflect the new fitness results
  • Algorithm 118: An Abstract Estimation of Distribution Algorithm (EDA)
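The sample, assess, adjust loop can be written as a small generic skeleton. This is a sketch in the spirit of the abstract EDA, not a transcription of Algorithm 118; the callback names (`sample_from`, `update`) and the one-gene toy problem are assumptions made for illustration.

```python
import random

def eda(dist, sample_from, update, fitness, n_samples, n_iters, rng):
    """Abstract EDA loop: sample individuals, assess them, adjust the distribution."""
    best = None
    for _ in range(n_iters):
        individuals = [sample_from(dist, rng) for _ in range(n_samples)]
        scored = [(fitness(ind), ind) for ind in individuals]
        top = max(scored, key=lambda pair: pair[0])
        if best is None or top[0] > best[0]:
            best = top
        dist = update(dist, scored)
    return best, dist

# Toy instance (all hypothetical): one gene with values 0..9, fitness peaking at 7.
def sample_from(dist, rng):
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]

def update(dist, scored):
    # Re-estimate the distribution from the better half, with add-one smoothing.
    keep = sorted(scored, key=lambda pair: pair[0], reverse=True)[: len(scored) // 2]
    counts = [1] * len(dist)
    for _fit, value in keep:
        counts[value] += 1
    total = sum(counts)
    return [c / total for c in counts]

best, dist = eda([0.1] * 10, sample_from, update,
                 lambda v: -(v - 7) ** 2, 20, 10, random.Random(0))
print(best, [round(p, 2) for p in dist])
```

Note that no population is carried from one iteration to the next; the distribution is the only state.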
12
Model Fitting with a Distribution
Representing distributions for a genotype with n genes
  • Using an n-dimensional histogram
    • A fairly high-resolution grid is needed to represent the distribution accurately
      • c.f., kd-tree or quadtree
    • A fairly large number of grid points
      • a^n when the distribution of each gene is discretized into a pieces
  • Using a parametric distribution
    • e.g., m Gaussian curves
      • How many Gaussian curves?
    • n-dimensional Gaussian: a mean vector of size n and a covariance matrix of size n^2
      • 1,000 genes? 1,000,000 numbers
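The storage counts above are easy to verify with a few lines of arithmetic. The function names are illustrative only.

```python
def histogram_cells(n_genes, pieces):
    """Grid points of an n-dimensional histogram: a^n for a pieces per gene."""
    return pieces ** n_genes

def gaussian_numbers(n_genes):
    """Mean vector (n entries) plus full covariance matrix (n^2 entries)."""
    return n_genes, n_genes ** 2

print(histogram_cells(10, 10))    # 10 genes, 10 pieces each
print(gaussian_numbers(1000))     # 1,000 genes
```

Because a covariance matrix is symmetric, only n(n+1)/2 of its n^2 entries are distinct, but the count still grows quadratically in n.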
13
Model Fitting with a Distribution
Representing distributions (cont’d)
  • Using marginal distributions
    • Projecting the full distribution onto a single dimension for each gene
    • Representing a one-dimensional distribution, again
      • 1-dimensional array as a histogram
      • 1-dimensional Gaussians as a parametric representation
    • Size of representation?
    • Problems (very big)?
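One way to see the size of a marginal representation is to build it. The sketch below estimates one histogram per gene from a tiny, made-up sample of discrete genotypes; with n genes and a values per gene it stores n times a numbers instead of a^n.

```python
from collections import Counter

def marginal_histograms(sample, n_values):
    """One histogram per gene: the fraction of individuals holding each value."""
    n_genes = len(sample[0])
    marginals = []
    for j in range(n_genes):
        counts = Counter(ind[j] for ind in sample)
        marginals.append([counts[v] / len(sample) for v in range(n_values)])
    return marginals

# Hypothetical sample: 4 individuals, 3 genes, each gene taking a value in {0, 1, 2}.
sample = [(0, 2, 1), (0, 1, 1), (1, 2, 1), (0, 2, 0)]
marginals = marginal_histograms(sample, 3)
for j, hist in enumerate(marginals):
    print(f"gene {j}: {hist}")
print("numbers stored:", sum(len(h) for h in marginals), "versus full grid:", 3 ** 3)
```

What the projection discards is any information about which values occur together in the same individual.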
14
Model Fitting with a Distribution
Univariate Estimation of Distribution Algorithms
  • Population-Based Incremental Learning (PBIL)
    • Genes have finite discrete values
    • n marginal distributions for n genes, initially uniform
    • Representation?
    • Truncation selection of good solutions sampled from the distribution
    • Gradual update of the marginal distributions
    • Algorithm 119: Population-Based Incremental Learning
  • Univariate Marginal Distribution Algorithm (UMDA)
    • A variation on PBIL
    • Any selection procedure is allowed
    • Entirely replaces the distribution D each time around (α = 1)
    • A large sample is required (why?)
  • Compact Genetic Algorithm (cGA)
    • Genes have boolean values
    • Updates each marginal distribution by pairwise comparison of individuals
    • c.f., models a finite population instead of an infinite one
    • Algorithm 120: The Compact Genetic Algorithm
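A PBIL-style loop for boolean genes can be sketched as below. It follows the bullets above (uniform initial marginals, truncation selection, gradual update) but is not a transcription of Algorithm 119; the parameter names, their default values, and the OneMax-style fitness (`sum`) are assumptions for illustration.

```python
import random

def pbil(n_genes, fitness, pop_size=50, n_best=10, alpha=0.1, n_iters=100, seed=0):
    """PBIL sketch for boolean genes; p[j] is the marginal probability that gene j is 1."""
    rng = random.Random(seed)
    p = [0.5] * n_genes                      # initially uniform marginals
    best = None
    for _ in range(n_iters):
        pop = [[1 if rng.random() < pj else 0 for pj in p] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
        elite = pop[:n_best]                 # truncation selection
        for j in range(n_genes):
            freq = sum(ind[j] for ind in elite) / n_best
            p[j] = (1 - alpha) * p[j] + alpha * freq   # gradual update
    return best, p

best, p = pbil(20, fitness=sum)
print(sum(best), [round(pj, 2) for pj in p])
```

Setting `alpha=1` replaces each marginal outright with the frequencies in the selected sample, as in the UMDA bullet. A marginal that reaches 0 or 1 then stays there, which hints at why a large sample matters in that setting.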
15
Model Fitting with a Distribution
Univariate Estimation of Distribution Algorithms (cont’d)
  • Real-valued representations
    • By discretization of each marginal distribution
      • Histogram approach
      • Using PBIL directly
    • By parametric approach
      • e.g., using a single Gaussian
        • Unbiased estimators of mean and variance for parameter estimation
        • Updating each marginal distribution by linear combination
      • Using multiple distributions
        • c.f., Expectation Maximization (EM) algorithm
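The single-Gaussian parametric approach can be sketched as follows: each gene keeps a mean and a variance, both re-estimated from the selected individuals with unbiased estimators and blended into the old values by linear combination. The function name, the initial parameters, and the toy fitness are assumptions; `statistics.variance` is the sample variance with an n-1 denominator.

```python
import random
from statistics import mean, variance

def gaussian_eda(fitness, n_genes, pop_size=60, n_best=20, alpha=0.5, n_iters=40, seed=0):
    """Univariate EDA sketch with one Gaussian marginal (mean, variance) per gene."""
    rng = random.Random(seed)
    mu = [0.0] * n_genes       # hypothetical starting means
    var = [4.0] * n_genes      # hypothetical starting variances
    for _ in range(n_iters):
        pop = [[rng.gauss(m, v ** 0.5) for m, v in zip(mu, var)] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        elite = pop[:n_best]   # truncation selection
        for j in range(n_genes):
            column = [ind[j] for ind in elite]
            # Linear combination of the old parameters and the new unbiased estimates.
            mu[j] = (1 - alpha) * mu[j] + alpha * mean(column)
            var[j] = (1 - alpha) * var[j] + alpha * variance(column)
    return mu, var

# Toy fitness (hypothetical): maximized when every gene equals 1.0.
mu, var = gaussian_eda(lambda ind: -sum((x - 1.0) ** 2 for x in ind), n_genes=3)
print([round(m, 3) for m in mu], [round(v, 6) for v in var])
```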
16
Model Fitting with a Distribution
Multivariate Estimation of Distribution Algorithms
  • Problems in univariate estimation (using marginal distributions)
    • Assumption of no linkage between genes
    • c.f., cooperative coevolution
  • An alternative
    • Using bivariate distributions
      • One distribution for every pair of genes
      • Using three genes per distribution, using four …
  • A better way
    • Multivariate distributions for strongly linked genes, selectively
      • e.g., Bayes Network
      • c.f., not only about how good, but also about why it is good
    • (Hierarchical) Bayesian Optimization Algorithm
    • Algorithm 121: An Abstract Version of the Bayesian Optimization Algorithm (BOA)
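The linkage problem can be made concrete with two boolean genes that are always equal in the selected sample. The sketch below, built on made-up data, compares children drawn from two independent marginals with children drawn from one bivariate (joint) distribution over the pair. It illustrates the motivation only and is not part of BOA.

```python
import random
from collections import Counter

rng = random.Random(0)

# Hypothetical selected sample: genes 0 and 1 are linked (always equal).
selected = [(b, b) for b in (rng.randint(0, 1) for _ in range(200))]

# Univariate model: one marginal per gene, which assumes no linkage.
p0 = sum(g0 for g0, _g1 in selected) / len(selected)
p1 = sum(g1 for _g0, g1 in selected) / len(selected)

def sample_univariate():
    return int(rng.random() < p0), int(rng.random() < p1)

# Bivariate model: one joint distribution over the pair of genes.
joint = Counter(selected)
pairs, weights = zip(*joint.items())

def sample_bivariate():
    return rng.choices(pairs, weights=weights, k=1)[0]

def linked_fraction(sampler, n=1000):
    """Fraction of generated children in which the two genes still agree."""
    return sum(a == b for a, b in (sampler() for _ in range(n))) / n

uni = linked_fraction(sample_univariate)
biv = linked_fraction(sample_bivariate)
print("univariate:", uni, "bivariate:", biv)
```

The joint model keeps the linkage in every child, while the marginals break it in a substantial share of children. The price is one distribution per pair of genes, which is why the slide suggests modeling only strongly linked genes jointly.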
17
Hybrid Metaheuristics (Talbi, 2009)
Combining with X
  • Mathematical programming approaches
    • Enumeration algorithms
    • Relaxation and decomposition methods
    • Branch and cut and price algorithms
  • Constraint programming
  • Data mining techniques
  • Multiobjective optimization
Classical hybrid approaches
  • Low-level relay hybrids
  • Low-level teamwork hybrids
  • High-level relay hybrids
  • High-level teamwork hybrids
18
Summary
Exploring and/or exploiting the solution space
  • From the perspective of statistics
Model fitting by classification
  • Employing decision trees, kNN, neural networks
  • Generating children from the model
Model fitting with a distribution
  • Estimation of Distribution Algorithm
  • Representing distributions
    • n-dimensional histogram, parametric distributions, marginal distributions
  • Univariate Estimation of Distribution Algorithms
    • Problems
  • Multivariate Estimation of Distribution Algorithms
    • Bayes Network