05.1-learningobserv-object oriented analysis and design


Upload: studentscorners

Post on 29-Mar-2015


Page 1: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Learning From Observations

“In which we describe agents that can improve their behavior through diligent study of their own experiences.”

-Artificial Intelligence: A Modern Approach

Prepared by: San Chua, Natalie Weber, Henry Kwong

Page 2: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Outline

• Learning agents
• Inductive learning
• Learning decision trees
  – Example of a decision tree
  – Decision-tree-learning algorithm
  – Assessing the performance
• Learning general logical descriptions
  – Current-best hypothesis search algorithm
  – Version space learning algorithm
• Computational learning theory
• Summary

Page 3: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Learning Agent

• Four Components
  1. Performance Element: a collection of knowledge and procedures used to decide on the next action. E.g. walking, turning, drawing, etc.
  2. Learning Element: takes in feedback from the critic and modifies the performance element accordingly.

Page 4: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Learning Agent (con’t)

  3. Critic: provides the learning element with information on how well the agent is doing, based on a fixed performance standard. E.g. the audience.
  4. Problem Generator: provides the performance element with suggestions on new actions to take.

Page 5: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Designing a Learning Element

• Depends on the design of the performance element

• Four major issues
  1. Which components of the performance element to improve
  2. The representation of those components
  3. Available feedback
  4. Prior knowledge

Page 6: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Components of the Performance Element

• A direct mapping from conditions on the current state to actions

• Information about the way the world evolves
• Information about the results of possible actions the agent can take
• Utility information indicating the desirability of world states

Page 7: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Representation

• A component may be represented using different representation schemes

• Details of the learning algorithm will differ depending on the representation, but the general idea is the same

• Functions are used to describe a component

Page 8: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Feedback & Prior Knowledge

• Supervised learning: inputs and outputs available

• Reinforcement learning: evaluation of action
• Unsupervised learning: no hint of correct outcome
• Background knowledge is a tremendous help in learning


Page 10: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Inductive Learning

• Key idea:
  – To use specific examples to reach general conclusions
• Given a set of examples, the system tries to approximate the evaluation function.
• Also called Pure Inductive Inference

Page 11: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Recognizing Handwritten Digits

[Figure labels: Learning Agent, Training Examples]

Page 12: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Recognizing Handwritten Digits

Different variations of handwritten 3’s

Page 13: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Bias

• Bias: any preference for one hypothesis over another, beyond mere consistency with the examples.

• Since there are almost always a large number of possible consistent hypotheses, all learning algorithms exhibit some sort of bias.

Page 14: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Example of Bias

Is this a 7 or a 1? Some may be more biased toward 7 and others more biased toward 1.

Page 15: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Formal Definitions

• Example: a pair (x, f(x)), where
  – x is the input,
  – f(x) is the output of the function applied to x.

• Hypothesis: a function h that approximates f, given a set of examples.

Page 16: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Task of Induction

• The task of induction: Given a set of examples, find a function h that approximates the true evaluation function f.
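As a rough illustration of the task, the sketch below searches a tiny, hand-picked hypothesis space for functions h that agree with every example (x, f(x)). The example pairs and the three candidate hypotheses are made up for this illustration and are not from the slides.

```python
# Examples (x, f(x)) produced by a target function the learner does not know.
# Here the hidden target is f(x) = 2x + 1 (an assumption for illustration).
examples = [(0, 1), (1, 3), (2, 5), (3, 7)]

# A small, hypothetical hypothesis space.
hypotheses = {
    "x + 1": lambda x: x + 1,
    "2x + 1": lambda x: 2 * x + 1,
    "x^2 + 1": lambda x: x * x + 1,
}


def consistent_hypotheses(examples):
    """Return the names of all hypotheses that agree with every example."""
    return [
        name
        for name, h in hypotheses.items()
        if all(h(x) == fx for x, fx in examples)
    ]


# With one example, every candidate is still consistent, so only a bias
# (a preference beyond consistency) could pick one of them.
print(consistent_hypotheses(examples[:1]))
# With all four examples, a single candidate survives.
print(consistent_hypotheses(examples))
```

With only the first example all three candidates remain consistent, which is the situation the Bias slide describes.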


Page 18: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Decision Tree Example

Goal Predicate: Will wait for a table?

[Figure: decision tree rooted at Patrons? (none / some / full). Further tests shown: WaitEst? (>60, 30-60, 10-30, 0-10), Alternate?, Hungry?, Reservation?, Fri/Sat? (each no / yes), with Yes and No leaves.]

http://www.cs.washington.edu/education/courses/473/99wi/

Page 19: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Logical Representation of a Path

[Figure: the path Patrons? = full, WaitEst? = 10-30, Hungry? = yes, ending in a Yes leaf]

∀r [Patrons(r, full) ∧ Wait_Estimate(r, 10-30) ∧ Hungry(r, yes)] ⇒ Will_Wait(r)

Page 20: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Expressiveness of Decision Trees

• Any Boolean function can be written as a decision tree
• Limitations
  – Can only describe one object at a time.
  – Some functions require an exponentially large decision tree.
    • E.g. parity function, majority function
• Decision trees are good for some kinds of functions, and bad for others.
• There is no one efficient representation for all kinds of functions.
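The size difference can be seen in a small sketch. It builds a decision tree for a Boolean function by testing the attributes in a fixed order (an assumption of this sketch) and drops any test whose two branches turn out identical. The tuple-based tree format and the helper names are hypothetical.

```python
def build_tree(f, n, prefix=()):
    """Build a tree for Boolean function f over n attributes.

    A leaf is a bool; an inner node is (attribute_index, low, high).
    Attributes are tested in index order; a test is dropped when both
    of its subtrees are identical, because it is then irrelevant.
    """
    if len(prefix) == n:
        return bool(f(prefix))
    low = build_tree(f, n, prefix + (False,))
    high = build_tree(f, n, prefix + (True,))
    if low == high:
        return low
    return (len(prefix), low, high)


def count_leaves(tree):
    """Count the leaves of a tree built by build_tree."""
    if isinstance(tree, bool):
        return 1
    _, low, high = tree
    return count_leaves(low) + count_leaves(high)


def parity(bits):
    """True when an odd number of attributes are true."""
    return sum(bits) % 2 == 1


def first_attribute(bits):
    """A function that depends on one attribute only."""
    return bits[0]


for n in range(1, 7):
    print(n,
          count_leaves(build_tree(parity, n)),
          count_leaves(build_tree(first_attribute, n)))
```

For parity every attribute has to be tested on every path, so the leaf count doubles with each extra attribute, while the single-attribute function always needs just two leaves.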

Page 21: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Principle Behind the Decision-Tree-Learning Algorithm

• Uses a general principle of inductive learning often called Ockham’s razor: “The most likely hypothesis is the simplest one that is consistent with all observations.”


Page 23: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Decision-Tree-Learning Algorithm

• Goal: Find a relatively small decision tree that is consistent with all training examples, and will correctly classify new examples.

• Note that finding the smallest decision tree is an intractable problem. So the Decision-Tree-Learning algorithm uses some simple heuristics to find a “smallish” one.

Page 24: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Getting Started

• Come up with a set of attributes to describe the object or situation.

• Collect a complete set of examples (training set) from which the decision tree can derive a hypothesis to define (answer) the goal predicate.

Page 25: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

The Restaurant Domain

Will we wait, or not?

          Attributes                                                Goal
Example   Fri   Hun   Pat    Price   Rain   Res   Type      Est     WillWait
X1        No    Yes   Some   $$$     No     Yes   French    0-10    Yes
X2        No    Yes   Full   $       No     No    Thai      30-60   No
X3        No    No    Some   $       No     No    Burger    0-10    Yes
X4        Yes   Yes   Full   $       No     No    Thai      10-30   Yes
X5        Yes   No    Full   $$$     No     Yes   French    >60     No
X6        No    Yes   Some   $$      Yes    Yes   Italian   0-10    Yes
X7        No    No    None   $       Yes    No    Burger    0-10    No
X8        No    Yes   Some   $$      Yes    Yes   Thai      0-10    Yes
X9        Yes   No    Full   $       Yes    No    Burger    >60     No
X10       Yes   Yes   Full   $$$     No     Yes   Italian   10-30   No
X11       No    No    None   $       No     No    Thai      0-10    No
X12       Yes   Yes   Full   $       No     No    Burger    30-60   Yes

http://www.cs.washington.edu/education/courses/473/99wi/

Page 26: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Splitting Examples by Testing on Attributes

+ X1, X3, X4, X6, X8, X12 (Positive examples)
- X2, X5, X7, X9, X10, X11 (Negative examples)

Page 27: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Splitting Examples by Testing on Attributes (con’t)

+ X1, X3, X4, X6, X8, X12 (Positive examples)
- X2, X5, X7, X9, X10, X11 (Negative examples)

Patrons?
  none: + (empty); - X7, X11
  some: + X1, X3, X6, X8; - (empty)
  full: + X4, X12; - X2, X5, X9, X10

Page 28: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Splitting Examples by Testing on Attributes (con’t)

+ X1, X3, X4, X6, X8, X12 (Positive examples)
- X2, X5, X7, X9, X10, X11 (Negative examples)

Patrons?
  none: + (empty); - X7, X11 → No
  some: + X1, X3, X6, X8; - (empty) → Yes
  full: + X4, X12; - X2, X5, X9, X10

Page 29: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Splitting Examples by Testing on Attributes (con’t)

+ X1, X3, X4, X6, X8, X12 (Positive examples)
- X2, X5, X7, X9, X10, X11 (Negative examples)

Patrons?
  none: + (empty); - X7, X11 → No
  some: + X1, X3, X6, X8; - (empty) → Yes
  full: + X4, X12; - X2, X5, X9, X10 → Hungry?
    no: + (empty); - X5, X9
    yes: + X4, X12; - X2, X10
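The splitting steps above can be sketched as a recursive procedure over the twelve restaurant examples. The sketch chooses each test by information gain, which is the heuristic used in the Russell and Norvig textbook and is an assumption here because the slides do not name it. The tuple-based tree format and the function names are hypothetical.

```python
import math
from collections import Counter

ATTRS = ["Fri", "Hun", "Pat", "Price", "Rain", "Res", "Type", "Est"]
# Rows X1..X12 of the restaurant table; the last column is WillWait.
TABLE = """\
No Yes Some $$$ No Yes French 0-10 Yes
No Yes Full $ No No Thai 30-60 No
No No Some $ No No Burger 0-10 Yes
Yes Yes Full $ No No Thai 10-30 Yes
Yes No Full $$$ No Yes French >60 No
No Yes Some $$ Yes Yes Italian 0-10 Yes
No No None $ Yes No Burger 0-10 No
No Yes Some $$ Yes Yes Thai 0-10 Yes
Yes No Full $ Yes No Burger >60 No
Yes Yes Full $$$ No Yes Italian 10-30 No
No No None $ No No Thai 0-10 No
Yes Yes Full $ No No Burger 30-60 Yes
"""
EXAMPLES = [(dict(zip(ATTRS, row[:-1])), row[-1])
            for row in (line.split() for line in TABLE.splitlines())]


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain(examples, attr):
    """Information gained by splitting the examples on attr."""
    remainder = 0.0
    for value in {x[attr] for x, _ in examples}:
        subset = [y for x, y in examples if x[attr] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy([y for _, y in examples]) - remainder


def majority(examples):
    """Most common label among the examples."""
    return Counter(y for _, y in examples).most_common(1)[0][0]


def learn(examples, attrs, default):
    """Return a leaf label or a node (attribute, branches, default_label)."""
    if not examples:
        return default
    labels = {y for _, y in examples}
    if len(labels) == 1:              # all positive or all negative: stop
        return labels.pop()
    if not attrs:                     # nothing left to test
        return majority(examples)
    best = max(attrs, key=lambda a: gain(examples, a))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in {x[best] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[best] == value]
        branches[value] = learn(subset, rest, majority(examples))
    return (best, branches, majority(examples))


def classify(tree, x):
    """Follow the tests from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        attr, branches, default = tree
        tree = branches.get(x[attr], default)
    return tree


tree = learn(EXAMPLES, ATTRS, "No")
print(tree[0])  # attribute tested at the root
```

On this table the root test comes out as Pat, with the none branch labelled No and the some branch labelled Yes, as in the walkthrough. Several attributes tie below the full branch, so the lower levels depend on tie-breaking and need not match the slides.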

Page 30: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

What Makes a Good Attribute?

Better attribute:

Patrons?
  none: + (empty); - X7, X11
  some: + X1, X3, X6, X8; - (empty)
  full: + X4, X12; - X2, X5, X9, X10

Not as good an attribute:

Type?
  French: + X1; - X5
  Italian: + X6; - X10
  Thai: + X4, X8; - X2, X11
  Burger: + X3, X12; - X7, X9

http://www.cs.washington.edu/education/courses/473/99wi/
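One common way to put a number on "better" is information gain, the measure used in the Russell and Norvig textbook; the slide itself does not name a measure, so treat this as an assumed choice. The sketch below takes only the positive and negative counts in each branch of the two splits shown above.

```python
import math


def entropy(pos, neg):
    """Entropy in bits of a set with pos positive and neg negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(branches):
    """Gain of a split, given (positives, negatives) counts per branch."""
    pos = sum(p for p, _ in branches)
    neg = sum(n for _, n in branches)
    total = pos + neg
    remainder = sum((p + n) / total * entropy(p, n) for p, n in branches)
    return entropy(pos, neg) - remainder


# (positives, negatives) per branch, read off the two splits above.
patrons = [(0, 2), (4, 0), (2, 4)]          # none, some, full
rest_type = [(1, 1), (1, 1), (2, 2), (2, 2)]  # French, Italian, Thai, Burger

print(round(information_gain(patrons), 3))    # about 0.541 bits
print(round(information_gain(rest_type), 3))  # 0.0 bits
```

Patrons leaves two branches pure and so gains about half a bit, while every Type branch is still an even mix and gains nothing.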

Page 32: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Original Decision Tree Example

Goal Predicate: Will wait for a table?

[Figure: the original decision tree, rooted at Patrons? (none / some / full), with further tests WaitEst? (>60, 30-60, 10-30, 0-10), Alternate?, Hungry?, Reservation?, Fri/Sat? and Yes / No leaves]

http://www.cs.washington.edu/education/courses/473/99wi/


Page 34: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Assessing the Performance of the Learning Algorithm

• A learning algorithm is good if it produces hypotheses that do a good job of predicting the classifications of unseen examples

• Test the algorithm’s prediction performance on a set of new examples, called a test set.

Page 35: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Methodology in Assessing Performance

1. Collect a large set of examples.
2. Divide it into two disjoint sets: the training set and the test set. It is very important that these two sets are separate so that the algorithm doesn’t cheat. Usually this division of examples is done randomly.
3. Use the learning algorithm with the training set as examples to generate a hypothesis H.

Page 36: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Methodology (con’t)

4. Measure the percentage of examples in the test set that are correctly classified by H.

5. Repeat steps 1 to 4 for different sizes of training sets and different randomly selected training sets of each size.
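The five steps can be sketched as a small experiment loop. Everything concrete here is an assumption made for illustration: the hidden target function, the six Boolean attributes, and the use of a 1-nearest-neighbour learner as a stand-in for the decision-tree learner. For brevity the same collected example set is re-split on each trial rather than re-collected.

```python
import random
from itertools import product


def target(x):
    """Hypothetical true function, unknown to the learner."""
    return x[0] and (x[1] or x[2])


def train_1nn(training_set):
    """Stand-in learner: predict the label of the closest training example."""
    def h(x):
        nearest = min(training_set,
                      key=lambda ex: sum(a != b for a, b in zip(ex[0], x)))
        return nearest[1]
    return h


def accuracy(h, test_set):
    """Step 4: fraction of test examples that h classifies correctly."""
    return sum(h(x) == y for x, y in test_set) / len(test_set)


def learning_curve(examples, sizes, trials, rng):
    """Steps 2 to 5: average test accuracy for each training-set size."""
    curve = {}
    for size in sizes:
        scores = []
        for _ in range(trials):
            shuffled = examples[:]
            rng.shuffle(shuffled)                 # random division
            training_set = shuffled[:size]        # disjoint by construction
            test_set = shuffled[size:]
            scores.append(accuracy(train_1nn(training_set), test_set))
        curve[size] = sum(scores) / trials
    return curve


# Step 1: collect examples (here, all 64 settings of six Boolean attributes).
examples = [(x, target(x)) for x in product([False, True], repeat=6)]
curve = learning_curve(examples, sizes=[4, 8, 16, 32], trials=20,
                       rng=random.Random(0))
for size, score in curve.items():
    print(size, round(score, 2))
```

Plotting the averaged scores against the training-set size gives a learning curve of the kind shown on the next slide.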

Page 37: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Analyzing the Results

[Figure: Learning Curve for the Decision Tree Algorithm (on examples in the restaurant domain). x-axis: training set size (0 to 100); y-axis: % correct on test set (0.4 to 1.0).]

Happy Graph

“Artificial Intelligence: A Modern Approach”, Stuart Russell and Peter Norvig

Page 38: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Overfitting

• Overfitting is what happens when a learning algorithm finds meaningless “regularity” in the data.

• Caused by irrelevant attributes.
• Solution: decision tree pruning.
  – The resulting decision tree is:
    • Smaller.
    • More tolerant to noise.
    • More accurate in its predictions.

Page 39: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Practical Uses of Decision Tree Learning

• Designing oil platform equipment.
• Learning to fly a plane.
• Diagnosing heart attacks.


Page 41: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Learning General Logical Description

• Key idea:
  – Look at inductive learning generally
  – Find a logical description that is equivalent to the (unknown) evaluation function
• Make our hypothesis more or less specific to match the evaluation function.


Page 43: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Current-best-hypothesis Search

• Key idea:
  – Maintain a single hypothesis throughout.
  – Update the hypothesis to maintain consistency as a new example comes in.

Page 44: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Definitions

• Positive example: an instance of the hypothesis

• Negative example: not an instance of the hypothesis

• False negative example: the hypothesis predicts it should be a negative example but it is in fact positive

• False positive example: the hypothesis predicts it should be a positive example but it is in fact negative

Page 45: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Current-best-hypothesis Search Algorithm

1. Pick a random example to define the initial hypothesis
2. For each example,
  – In case of a false negative:
    • Generalize the hypothesis to include it
  – In case of a false positive:
    • Specialize the hypothesis to exclude it
3. Return the hypothesis
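A minimal sketch of these steps, under several assumptions that are not in the slides: hypotheses are conjunctions of attribute = value tests stored as a dict, the search starts from the empty conjunction (which accepts everything) instead of a randomly picked example, generalizing means dropping conjuncts, and specializing means adding one conjunct. Depth-first backtracking is included because a greedy choice can reach a dead end. The four toy examples are made up.

```python
def predicts(h, x):
    """A conjunction h accepts x when every attribute test in h holds."""
    return all(x[a] == v for a, v in h.items())


def consistent(h, examples):
    """True when h classifies every given example correctly."""
    return all(predicts(h, x) == y for x, y in examples)


def generalizations(h, x):
    """Drop the conjuncts that the false negative x violates."""
    return [{a: v for a, v in h.items() if x[a] == v}]


def specializations(h, x, values):
    """Add one conjunct that excludes the false positive x."""
    return [{**h, a: v}
            for a in x if a not in h
            for v in values[a] if v != x[a]]


def current_best(examples):
    """Return a conjunction consistent with all examples, or None."""
    values = {a: sorted({x[a] for x, _ in examples}) for a in examples[0][0]}

    def search(h, index):
        if index == len(examples):
            return h
        x, y = examples[index]
        seen = examples[:index + 1]
        if predicts(h, x) == y:
            return search(h, index + 1)
        # False negative: generalize. False positive: specialize.
        options = generalizations(h, x) if y else specializations(h, x, values)
        for candidate in options:
            if consistent(candidate, seen):
                result = search(candidate, index + 1)
                if result is not None:
                    return result
        return None  # dead end: backtrack to the previous choice point

    return search({}, 0)


# Made-up toy data; the hidden concept is Pat = Some.
toy = [
    ({"Hun": "Yes", "Pat": "Some"}, True),
    ({"Hun": "No", "Pat": "Full"}, False),
    ({"Hun": "No", "Pat": "Some"}, True),
    ({"Hun": "Yes", "Pat": "None"}, False),
]
print(current_best(toy))
```

On this data the first specialization tried (Hun = Yes) fails at the third example, so the search backtracks and settles on Pat = Some.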

Page 46: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

How to Generalize

a) Replacing Constants with Variables: Object(Animal, Bird) → Object(X, Bird)

b) Dropping Conjuncts: Object(Animal, Bird) & Feature(Animal, Wings) → Object(Animal, Bird)

c) Adding Disjuncts: Feature(Animal, Feathers) → Feature(Animal, Feathers) v Feature(Animal, Fly)

d) Generalizing Terms: Feature(Bird, Wings) → Feature(Bird, Primary-Feature)

http://www.pitt.edu/~suthers/infsci1054/8.html

Page 47: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

How to Specialize

a) Replacing Variables with Constants: Object(X, Bird) → Object(Animal, Bird)

b) Adding Conjuncts: Object(Animal, Bird) → Object(Animal, Bird) & Feature(Animal, Wings)

c) Dropping Disjuncts: Feature(Animal, Feathers) v Feature(Animal, Fly) → Feature(Animal, Fly)

d) Specializing Terms: Feature(Bird, Primary-Feature) → Feature(Bird, Wings)

http://www.pitt.edu/~suthers/infsci1054/8.html

Page 48: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

What do all these mean?

• Let’s look at some examples...

Page 49: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Generalize and Specialize

• Must be consistent with all other examples
• Non-deterministic
  – At any point there may be several possible specializations or generalizations that can be applied.

Page 50: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Potential Problem of Current-best-hypothesis Search

• The extensions made do not necessarily lead to the simplest hypothesis.
• May lead to an unrecoverable situation where no simple modification of the hypothesis is consistent with all of the examples.
• The program must then backtrack to a previous choice point.

Page 51: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Problem of Backtracking

• Requires a large amount of space to store all examples
• Need to check all previous instances after each modification of the hypothesis.
• Searching and checking all these previous instances over again after each modification is very expensive


Page 53: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Version Space Learning Algorithm

• Least-Commitment Search
• No backtracking
• Key idea:
  – Maintain the most general and the most specific hypotheses at any point in learning. Update them as new examples come in.

Page 54: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Definitions

• Version space: a set of all hypotheses consistent with examples seen so far

• Boundary sets: sets of hypotheses defining the boundary on which hypotheses are consistent with the examples
  – Most general (G-set) and most specific (S-set) boundary sets

Page 55: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Requirement

• The version space usually contains an enormous number of hypotheses, too many to record one by one.

• An assumption: a partial ordering (more-specific-than ordering) exists on all of the hypotheses in the space
  – hierarchical

• Boundary sets circumscribe the space of possible hypotheses.
  – G-set (the most general boundary)
  – S-set (the most specific boundary)

Page 56: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Version Space Learning Algorithm

1. Initially, the G-set is True, and the S-set is False

2. For each new example, there are 6 possible cases:

a) false positive for Si in S
  • Si is too general - no consistent specializations.
  • Throw it out.

b) false negative for Si in S
  • Si is too specific.
  • Replace it with its generalizations.

Page 57: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Version Space Learning Algorithm (con’t)

c) false positive for Gi in G
  • Gi is too general.
  • Replace it with its specializations.

d) false negative for Gi in G
  • Gi is too specific - no consistent generalizations.
  • Throw it out.

e) Si more general than some other hypothesis in S or G
  • Throw it out.

f) Gi more specific than some other hypothesis in S or G
  • Throw it out.

Page 58: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Version Space Learning Algorithm (con’t)

3. Repeat the process until one of three things happens:

a) Only one hypothesis left in the version space.
  • This is the answer we want.

b) The version space collapses, i.e. either G or S becomes empty.
  • This means there are no consistent hypotheses.

c) We run out of examples while the version space still has several hypotheses.
  • Use their collective evaluation (breaking disagreements with majority vote).
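The update cases above can be sketched for one simple hypothesis language. The assumptions here are not from the slides: a hypothesis is a tuple with one slot per attribute, where "?" accepts any value and None accepts nothing, so the all-None tuple plays the role of False and the all-"?" tuple the role of True. The four training examples are the EnjoySport toy data from Mitchell's Machine Learning textbook, used only as a familiar test case.

```python
def covers(h, x):
    """h accepts x when every slot is '?' or equals the value in x."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))


def more_general(h1, h2):
    """True when h1 accepts everything that h2 accepts."""
    return all(a == "?" or a == b or b is None for a, b in zip(h1, h2))


def generalize(s, x):
    """Smallest generalization of s that also accepts x."""
    return tuple(xv if sv is None else (sv if sv == xv else "?")
                 for sv, xv in zip(s, x))


def specialize(g, x, domains, S):
    """Smallest specializations of g that reject x and stay above S."""
    out = []
    for i, values in enumerate(domains):
        if g[i] != "?":
            continue
        for v in values:
            if v != x[i]:
                cand = g[:i] + (v,) + g[i + 1:]
                if any(more_general(cand, s) for s in S):
                    out.append(cand)
    return out


def version_space(examples, domains):
    """Return the boundary sets (S, G) after processing all examples."""
    n = len(domains)
    S = [(None,) * n]   # most specific boundary: "False"
    G = [("?",) * n]    # most general boundary: "True"
    for x, positive in examples:
        if positive:
            G = [g for g in G if covers(g, x)]                        # case d
            S = [s if covers(s, x) else generalize(s, x) for s in S]  # case b
            S = [s for s in S if any(more_general(g, s) for g in G)]
        else:
            S = [s for s in S if not covers(s, x)]                    # case a
            new_G = []
            for g in G:
                if covers(g, x):
                    new_G.extend(specialize(g, x, domains, S))        # case c
                else:
                    new_G.append(g)
            new_G = list(dict.fromkeys(new_G))                        # dedupe
            G = [g for g in new_G                                     # case f
                 if not any(h != g and more_general(h, g) for h in new_G)]
    return S, G


domains = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]
data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
S, G = version_space(data, domains)
print(S)
print(G)
```

Case e needs no code in this language because the S-set never holds more than one hypothesis. The run ends in situation c): examples run out while several hypotheses still lie between S and G.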

Page 59: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Advantages of the Algorithm

• Never favors one possible hypothesis over another; all remaining hypotheses are consistent

• Never requires backtracking

Page 60: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Potential Problems

• Does not deal with noise
  – Not very practical in real-world learning problems

• Unlimited disjunction in the hypothesis space leads to
  – The S-set always containing a single most specific hypothesis
  – The G-set always containing a single most general hypothesis


Page 62: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Why Learning Works

• Problem: How can you know if a theory will accurately predict the future?

OR

How can you know that a hypothesis is close to the target function if you don’t know what the target function is?

• Answers provided by Computational Learning Theory

Page 63: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Computational Learning Theory

• Main principle: “any hypothesis that is seriously wrong will almost certainly be ‘found out’ with high probability after a small number of examples, because it will make an incorrect prediction.”

• Assumes that the training and test sets are drawn randomly from the same population of examples
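The main principle can be made quantitative with the standard bound from the textbook's treatment of this topic, which the slide does not state: a hypothesis with error greater than ε agrees with m random examples with probability at most (1 − ε)^m, so with a finite hypothesis space H, m ≥ (1/ε)(ln(1/δ) + ln|H|) examples are enough for every consistent hypothesis to have error at most ε with probability at least 1 − δ. The function names and the numbers below are chosen only for illustration.

```python
import math


def chance_bad_hypothesis_survives(epsilon, m):
    """Upper bound on the probability that a hypothesis with error above
    epsilon is still consistent with m independently drawn examples."""
    return (1 - epsilon) ** m


def sample_complexity(epsilon, delta, hypothesis_space_size):
    """Number of examples m with m >= (1/epsilon)(ln(1/delta) + ln|H|)."""
    return math.ceil(
        (math.log(1 / delta) + math.log(hypothesis_space_size)) / epsilon
    )


# A hypothesis that is wrong 10% of the time rarely survives 50 examples.
print(chance_bad_hypothesis_survives(0.1, 50))
# Examples needed for error <= 0.1 with probability >= 0.95 when |H| = 1000.
print(sample_complexity(0.1, 0.05, 1000))
```

The bound grows only logarithmically in |H| and 1/δ, which is why a seriously wrong hypothesis is found out after relatively few examples.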

Page 64: 05.1-LearningObserv-OBJECT ORIENTED ANALYSIS AND DESIGN

Summary

• Learning agents
• Inductive learning
• Learning decision trees
• Learning general logical descriptions
  – Current-best hypothesis search algorithm
  – Version space learning algorithm
• Computational learning theory