lecture8 - from cbr to ibk
Introduction to Machine Learning
Lecture 8: Instance-Based Learning and Case-Based Reasoning
Albert Orriols i Puig
[email protected]
Artificial Intelligence – Machine Learning
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
Recap of Lecture 7
kNN
[Figure panels: 15-NN and 1-NN]
Key aspects
Value of k
Distance functions
Recap of Lecture 7
Where is learning in kNN?
Retrieval system
No global model
No generalization
…
No learning!
But still, it is able to create accurate classification models
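The recap above can be made concrete with a minimal sketch of kNN as a pure retrieval system. The function name `knn_classify`, the Euclidean distance, and the toy data are assumptions made for illustration; they do not come from the lecture. Note that there is no training step: the stored examples are the whole "model".

```python
import math
from collections import Counter

def knn_classify(memory, query, k=3):
    """Classify `query` by majority vote among its k nearest stored cases.

    `memory` is a list of (features, label) pairs. Nothing is fitted:
    the stored examples themselves play the role of the model.
    """
    # Sort the stored cases by Euclidean distance to the query, keep the k closest.
    neighbours = sorted(memory, key=lambda case: math.dist(case[0], query))[:k]
    # Majority vote over the labels of the retrieved cases.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups.
memory = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
          ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b")]
prediction_near_a = knn_classify(memory, (0.05, 0.1), k=3)
prediction_near_b = knn_classify(memory, (1.0, 0.95), k=3)
```

All the work happens at query time, which is why the prediction cost grows with the size of the memory.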
Today’s Agenda
Formalizing the framework: From kNN to CBR
Incorporating learning in different phases:
Learn prototypes
Organize the memory in clusters
Learn the best distance function
Provide explanations
From kNN to CBR
kNN provides a retrieval system
Much work on different phases of kNN
Prototype selection
Distance function selection
…
CBR provides a general framework based on kNN
Schema of CBR
[Figure: the CBR cycle (Aamodt & Plaza, 1994). A problem enters the cycle, a solution leaves it, and the four phases are arranged around the case memory.]
Retrieve: similarity function
Reuse: select a solution
Revise: revise the solution
Retain: retain the new knowledge
Case memory: coherence and relevance of the attributes; structure and grouping of the cases
Phases of CBR
Five key phases
Preprocess the training instance: So that it meets the requirements of the system
Retrieve: Use kNN with the selected distance function
Reuse: Vote-based scheme
Revise: Adapt the solution if necessary
Retain: Remove examples from or add examples to the case memory
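The phases listed above can be sketched as one small class. This is only an illustration under stated assumptions: the class `CaseMemory`, its method names, the Euclidean distance, and the `feedback` argument (a known correct solution used in the revise step) are all hypothetical, and the preprocessing phase is omitted. The retain policy shown here (store a case only when the proposed solution had to be corrected) is one possible choice among many.

```python
import math
from collections import Counter

class CaseMemory:
    """Hypothetical minimal CBR loop: retrieve, reuse, revise, retain."""

    def __init__(self, cases, k=3):
        self.cases = list(cases)  # list of (problem, solution) pairs
        self.k = k

    def retrieve(self, problem):
        # kNN with the selected distance function (Euclidean assumed here).
        return sorted(self.cases, key=lambda c: math.dist(c[0], problem))[:self.k]

    def reuse(self, retrieved):
        # Vote-based scheme over the solutions of the retrieved cases.
        votes = Counter(solution for _, solution in retrieved)
        return votes.most_common(1)[0][0]

    def revise(self, proposed, feedback=None):
        # Adapt the solution if necessary: here, accept external feedback if given.
        return proposed if feedback is None else feedback

    def retain(self, problem, proposed, final):
        # Assumed policy: add the case only if the memory got it wrong.
        if proposed != final:
            self.cases.append((problem, final))

    def solve(self, problem, feedback=None):
        retrieved = self.retrieve(problem)
        proposed = self.reuse(retrieved)
        final = self.revise(proposed, feedback)
        self.retain(problem, proposed, final)
        return final

memory = CaseMemory([((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b")], k=1)
first = memory.solve((4, 4))                  # nearest case is (5, 5)
second = memory.solve((2, 2), feedback="b")   # corrected, so the case is retained
third = memory.solve((2.1, 2.0))              # now answered by the retained case
```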
Challenges in CBR
Hot areas
Reduce the cost of matching
Reduce the total number of examples in the case memory
Organize the case memory in clusters and only consult examples of some clusters
Automatically create distance functions that are suited to your problem
Extraction of explanations:
CBR does not extract legible models (actually, does not learn any model)
Prototype Selection
Training data sets contain a large number of instances
Increase the prediction time
May contain noisy instances
Prototype selection
Select the representative examples to form the case base
Remove all the other examples
How?
Learn which examples are the ones that maximize CBR accuracy
Prototype Selection
Possible sets of prototypes
[Figure: the training data set is split into a training set and a validation set. Each candidate selection (Sel. Proto 1, Sel. Proto 2, Sel. Proto 3, …) keeps a different subset of the training set (Proto 1, Proto 2, Proto 3, …), and kNN over each subset is evaluated on the validation set. The figure also shows a test data set.]
How do we know which is the best selection of prototypes?
Does it sound familiar to you?
Problem: Search for the best selection of prototypes (SP)
It's just an optimization problem
For robustness, use cross-validation or similar validation procedures
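Treating prototype selection as an optimization problem can be sketched as a search over keep/drop masks, scored by 1-NN accuracy on a validation set. The optimizer below is a simple bit-flip hill climber, swapped in for brevity; it is not one of the evolutionary methods named in the lecture. All names (`nn_accuracy`, `select_prototypes`, `sample`) and the synthetic data are assumptions, and a single train/validation split is used where cross-validation would be more robust.

```python
import math
import random

def nn_accuracy(prototypes, validation):
    """1-NN accuracy on `validation` using only `prototypes` as the case base."""
    if not prototypes:
        return 0.0
    hits = 0
    for x, y in validation:
        _, label = min(prototypes, key=lambda p: math.dist(p[0], x))
        hits += (label == y)
    return hits / len(validation)

def select_prototypes(train, validation, iterations=200, seed=0):
    """Bit-flip hill climbing over subsets of `train` (hypothetical sketch)."""
    rng = random.Random(seed)
    mask = [True] * len(train)  # start from the full case base

    def score(m):
        subset = [case for case, keep in zip(train, m) if keep]
        # Higher accuracy first; on ties, prefer fewer prototypes.
        return nn_accuracy(subset, validation), -len(subset)

    best = score(mask)
    for _ in range(iterations):
        i = rng.randrange(len(train))
        mask[i] = not mask[i]          # propose: add or drop one example
        candidate = score(mask)
        if candidate >= best:
            best = candidate           # accept the move
        else:
            mask[i] = not mask[i]      # reject: undo the flip
    return [case for case, keep in zip(train, mask) if keep], best[0]

# Synthetic two-class data, generated only for this illustration.
data_rng = random.Random(1)

def sample(n):
    data = []
    for _ in range(n):
        label = data_rng.choice(["a", "b"])
        centre = 0.0 if label == "a" else 3.0
        data.append(((data_rng.gauss(centre, 1.0), data_rng.gauss(centre, 1.0)), label))
    return data

train, validation = sample(40), sample(40)
full_accuracy = nn_accuracy(train, validation)
prototypes, selected_accuracy = select_prototypes(train, validation)
```

Because the search starts from the full case base and only accepts moves that do not lower the score, the selected subset is never worse than the full set on the validation data. Performance on unseen data still has to be checked on the held-out test set.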
Prototype Selection
Optimization methods used so far
Genetic algorithms (Holland, 1975)
Genetic Programming (Koza et al., 1989)
Grammatical Evolution (Ryan & O'Neill, 1998)
Case-Based Memory Clustering
Training data sets contain a large number of instances
Clustering: Place instances in different clusters
Only retrieve from the same cluster or clusters that are close to you
Case-Based Memory Clustering
[Figure: the CBR cycle (Retrieve, Reuse, Revise, Retain) around the case memory]
Retrieve phase
1. Compare with all the prototypes
2. Compare only with the examples of the closest cluster
Reuse phase
Propose a solution with the retrieved cases
Revise phase
Revise if the solution is potentially valid
Retain phase
Update the organization. It may imply the update of the clusters
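The two-step retrieve phase can be sketched as follows. The clustering method is an assumption: plain k-means with centroids acting as the cluster prototypes, since the lecture does not say which clustering algorithm is used. The function names and the toy cases are hypothetical as well.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def assign(cases, centres):
    """Place every case in the cluster of its nearest centre."""
    clusters = [[] for _ in centres]
    for case in cases:
        j = min(range(len(centres)), key=lambda j: math.dist(case[0], centres[j]))
        clusters[j].append(case)
    return clusters

def kmeans(cases, centres, rounds=10):
    """Plain k-means over the case features; returns (centres, clusters)."""
    for _ in range(rounds):
        clusters = assign(cases, centres)
        # Keep the old centre if a cluster ends up empty.
        centres = [centroid([f for f, _ in cl]) if cl else c
                   for cl, c in zip(clusters, centres)]
    return centres, assign(cases, centres)

def retrieve(centres, clusters, query, k=3):
    # Step 1: compare with the cluster prototypes only.
    j = min(range(len(centres)), key=lambda j: math.dist(query, centres[j]))
    # Step 2: compare only with the examples of the closest cluster.
    return sorted(clusters[j], key=lambda c: math.dist(c[0], query))[:k]

cases = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b")]
centres, clusters = kmeans(cases, centres=[(0, 0), (9, 8)])
retrieved = retrieve(centres, clusters, query=(7.5, 8.0), k=3)
retrieved_labels = [label for _, label in retrieved]
```

With this organization a query is compared against the prototypes plus the members of one cluster instead of the whole memory. The price is that a true nearest neighbour lying just across a cluster border can be missed, which is why consulting the few closest clusters is a common variation.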
Generation of Distance Functions
How does the distance function influence learning?
It may be the difference between success and failure!
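A tiny worked example shows how much the choice matters: the same query over the same two stored cases gets a different nearest neighbour under Euclidean and Manhattan distance. The data and labels are made up for illustration.

```python
import math

def euclidean(a, b):
    return math.dist(a, b)

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(memory, query, distance):
    """Return the stored case closest to `query` under `distance`."""
    return min(memory, key=lambda case: distance(case[0], query))

memory = [((3.0, 3.0), "diagonal"), ((0.0, 5.0), "axis")]
query = (0.0, 0.0)

# Euclidean: sqrt(18) ~ 4.24 for "diagonal" versus 5.0 for "axis".
by_euclidean = nearest(memory, query, euclidean)[1]
# Manhattan: 6.0 for "diagonal" versus 5.0 for "axis".
by_manhattan = nearest(memory, query, manhattan)[1]
```

With k = 1, the predicted class flips from "diagonal" to "axis" just by changing the distance function.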
Generation of Distance Functions
Can I find a distance function that makes kNN perform the best in all cases?
No way. Actually, the No Free Lunch (NFL) theorem says so (Wolpert, 1992)
Different distances are suited to different domains
May I try to create a new distance function for each specific problem?
Of course. Again, an optimization problem
Generation of Distance Functions
Split the training data set into
Training set'
Validation set
Optimization problem
Assume a parametric form
Optimize the parameters of the underlying function
Being more ambitious?
Do not assume any parametric form
Optimize both the function structure and the parameters
Examples:
(Fornells et al., 2005)
(Camps et al., 2003)
[Figure: candidate distance functions (Dist. function 1, Dist. function 2, …, Dist. function n) are each used by kNN over the training set' and evaluated on the validation set, giving error 1, error 2, …, error n.]
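The parametric case can be sketched by assuming a feature-weighted Euclidean distance and searching for the weights that minimize the 1-NN error on the validation set. Both the parametric form and the optimizer (plain random search) are assumptions chosen for brevity; they are not the methods of the cited works. Function names and the synthetic data, where only the first feature is informative, are hypothetical.

```python
import math
import random

def weighted_distance(weights):
    """Build a feature-weighted Euclidean distance (the assumed parametric form)."""
    def distance(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    return distance

def validation_error(weights, train, validation):
    """1-NN error rate on `validation` using the weighted distance."""
    distance = weighted_distance(weights)
    errors = 0
    for x, y in validation:
        _, label = min(train, key=lambda case: distance(case[0], x))
        errors += (label != y)
    return errors / len(validation)

def optimise_weights(train, validation, n_features, trials=100, seed=0):
    """Random search over the weights, starting from the unweighted distance."""
    rng = random.Random(seed)
    best_w = [1.0] * n_features
    best_err = validation_error(best_w, train, validation)
    for _ in range(trials):
        candidate = [rng.random() for _ in range(n_features)]
        err = validation_error(candidate, train, validation)
        if err < best_err:
            best_w, best_err = candidate, err
    return best_w, best_err

# Synthetic data: feature 0 carries the class, feature 1 is large-scale noise.
data_rng = random.Random(1)

def sample(n):
    data = []
    for _ in range(n):
        label = data_rng.choice([0, 1])
        data.append(((label + data_rng.gauss(0, 0.3), data_rng.gauss(0, 5.0)), label))
    return data

train, validation = sample(40), sample(40)
baseline_error = validation_error([1.0, 1.0], train, validation)
weights, error = optimise_weights(train, validation, n_features=2)
```

Since the search starts from the unweighted distance and only accepts improvements, the returned error on the validation set is never above the baseline. The more ambitious option of also evolving the structure of the function would replace the weight vector with an expression to be searched.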
Extraction of Explanations
One of the main drawbacks of CBR is that it does not provide any explanation
Prediction based on nearest neighbors
New techniques to provide explanations
Based on the instances used
Building of partial models
Not studied in more detail here
Next Class
Probabilistic-based learning