CS 782 Machine Learning
1 2003, G.Tecuci, Learning Agents Laboratory
Learning Agents LaboratoryComputer Science Department
George Mason University
Prof. Gheorghe Tecuci
3. Inductive Learning from Examples:Version space learning
2 2003, G.Tecuci, Learning Agents Laboratory
Overview
Concept learning from examples
Version spaces and the candidate elimination algorithm
The LEX system
Discussion
Instances, concepts and generalization
Recommended reading
The learning bias
3 2003, G.Tecuci, Learning Agents Laboratory
Basic ontological elements: instances and concepts
An instance is a representation of a particular entity from the application domain.
A concept is a representation of a set of instances.
government_of_US_1943, government_of_Britain_1943 —instance_of→ state_government
“state_government” represents the set of all entities that are governments of states. This set includes “government_of_US_1943” and “government_of_Britain_1943” which are called positive examples.
“instance_of” is the relationship between an instance and the concept to which it belongs.
An entity which is not an instance of a concept is called a negative example of that concept.
5 2003, G.Tecuci, Learning Agents Laboratory
Concept generality
A concept P is more general than another concept Q if and only if the set of instances represented by P includes the set of instances represented by Q.
Example:
state_government
  democratic_government
    representative_democracy
    parliamentary_democracy
  totalitarian_government

“subconcept_of” is the relationship between a concept and a more general concept.
Example: democratic_government —subconcept_of→ state_government
7 2003, G.Tecuci, Learning Agents Laboratory
A generalization hierarchy
feudal_god_king_government
totalitarian_government
democratic_government
theocratic_government
state_government
military_dictatorship
police_state
religious_dictatorship
representative_democracy
parliamentary_democracy
theocratic_democracy
monarchy
governing_body
other_state_government
dictator
deity_figure
chief_and_tribal_council
autocratic_leader
democratic_council_or_board
group_governing_body
other_group_governing_body
government_of_Italy_1943
government_of_Germany_1943
government_of_US_1943
government_of_Britain_1943
ad_hoc_governing_body    established_governing_body
other_type_of_governing_body
fascist_state
communist_dictatorship    religious_dictatorship
government_of_USSR_1943
9 2003, G.Tecuci, Learning Agents Laboratory
Overview
Concept learning from examples
Version spaces and the candidate elimination algorithm
The LEX system
Discussion
Instances, concepts and generalization
Recommended reading
The learning bias
10 2003, G.Tecuci, Learning Agents Laboratory
Empirical inductive concept learning from examples
Illustration
Given:
  Positive examples of cups: P1, P2, ...
  Negative examples of cups: N1, ...
Learn:
  A description of the cup concept: has-handle(x), ...
Approach:Compare the positive and the negative examples of a concept, in terms of their similarities and differences, and learn the concept as a generalized description of the similarities of the positive examples.
Why is concept learning important?
Concept learning allows the agent to recognize other entities as being instances of the learned concept.
11 2003, G.Tecuci, Learning Agents Laboratory
The learning problem
Given
• a language of instances;
• a language of generalizations;
• a set of positive examples (E1, ..., En) of a concept;
• a set of negative examples (C1, ..., Cm) of the same concept;
• a learning bias;
• other background knowledge.
Determine
• a concept description which is a generalization of the positive examples and does not cover any of the negative examples.
Purpose of concept learning: predict whether an instance is an example of the learned concept.
12 2003, G.Tecuci, Learning Agents Laboratory
Generalization and specialization rules
A generalization rule is a rule that transforms an expression into a more general expression.
A specialization rule is a rule that transforms an expression into a less general expression.
The reverse of any generalization rule is a specialization rule.
Learning a concept from examples is based on generalization and specialization rules.
13 2003, G.Tecuci, Learning Agents Laboratory
Discussion
Indicate several generalizations of the following sentence:
Students who have lived in Fairfax for more than 3 years.
Indicate several specializations of the following sentence:
Students who have lived in Fairfax for more than 3 years.
14 2003, G.Tecuci, Learning Agents Laboratory
Generalization (and specialization) rules
Climbing the generalization hierarchy
Dropping condition
Generalizing numbers
Adding alternatives
Turning constants into variables
15 2003, G.Tecuci, Learning Agents Laboratory
Turning constants into variables
Generalizes an expression by replacing a constant with a variable.
?O1 is multi_group_force
    number_of_subgroups 5
The set of multi_group_forces with 5 subgroups.

generalization: 5 → ?N1        specialization: ?N1 → 5

?O1 is multi_group_force
    number_of_subgroups ?N1
The set of multi_group_forces with any number of subgroups
(e.g. Allied_forces_operation_Husky, Axis_forces_Sicily, Japan_1944_Armed_Forces).
17 2003, G.Tecuci, Learning Agents Laboratory
Climbing the generalization hierarchies
Generalizes an expression by replacing a concept with a more general one.
?O1 is single_state_force
    has_as_governing_body ?O2
?O2 is representative_democracy
The set of single state forces governed by representative democracies.

generalization: representative_democracy → democratic_government
specialization: democratic_government → representative_democracy

?O1 is single_state_force
    has_as_governing_body ?O2
?O2 is democratic_government
The set of single state forces governed by democracies.

(In the hierarchy, democratic_government is the parent of representative_democracy and parliamentary_democracy.)
19 2003, G.Tecuci, Learning Agents Laboratory
Dropping conditions
Generalizes an expression by removing a constraint from its description.
?O1 is multi_member_force
    has_international_legitimacy “yes”
The set of multi-member forces that have international legitimacy.

generalization: drop the condition        specialization: add the condition

?O1 is multi_member_force
The set of multi-member forces (that may or may not have international legitimacy).
20 2003, G.Tecuci, Learning Agents Laboratory
Extending intervals
Generalizes an expression by replacing a number with an interval, or by replacing an interval with a larger interval.
?O1 is multi_group_force
    number_of_subgroups 5
The set of multi_group_forces with exactly 5 subgroups.

generalization: 5 → [3 .. 7]        specialization: [3 .. 7] → 5

?O1 is multi_group_force
    number_of_subgroups ?N1
?N1 is-in [3 .. 7]
The set of multi_group_forces with at least 3 and at most 7 subgroups.

generalization: [3 .. 7] → [2 .. 10]        specialization: [2 .. 10] → [3 .. 7]

?O1 is multi_group_force
    number_of_subgroups ?N1
?N1 is-in [2 .. 10]
The set of multi_group_forces with at least 2 and at most 10 subgroups.
22 2003, G.Tecuci, Learning Agents Laboratory
Adding alternatives
Generalizes an expression by replacing a concept C1 with the union (C1 U C2), which is a more general concept.

?O1 is alliance
    has_as_member ?O2
The set of alliances.

generalization        specialization

?O1 is alliance OR coalition
    has_as_member ?O2
The set including both the alliances and the coalitions.
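The rules above can be sketched as operations on simple attribute-based descriptions. This is a minimal illustrative sketch; the dictionary representation and the function names are assumptions, not the lecture's notation:

```python
# Illustrative sketch: generalization rules as operations on concept
# descriptions represented as attribute -> constraint dictionaries.
# "?" plays the role of a variable. All names here are assumptions.

ANY = "?"

def turn_constant_into_variable(concept, attr):
    """Replace a constant value of attr with a variable."""
    g = dict(concept)
    g[attr] = ANY
    return g

def climb_hierarchy(concept, attr, parent):
    """Replace a concept with its parent in a generalization hierarchy."""
    g = dict(concept)
    g[attr] = parent[g[attr]]
    return g

def drop_condition(concept, attr):
    """Remove a constraint from the description."""
    g = dict(concept)
    del g[attr]
    return g

def extend_interval(concept, attr, low, high):
    """Replace a number (or interval) with a larger interval."""
    g = dict(concept)
    g[attr] = (low, high)
    return g

def add_alternative(concept, attr, alternative):
    """Replace a value C1 with the union C1 U C2."""
    g = dict(concept)
    old = g[attr]
    g[attr] = (old if isinstance(old, set) else {old}) | {alternative}
    return g

c = {"type": "multi_group_force", "number_of_subgroups": 5}
assert turn_constant_into_variable(c, "number_of_subgroups") == \
    {"type": "multi_group_force", "number_of_subgroups": ANY}
assert climb_hierarchy(
    {"gov": "representative_democracy"}, "gov",
    {"representative_democracy": "democratic_government"},
) == {"gov": "democratic_government"}
assert drop_condition(c, "number_of_subgroups") == {"type": "multi_group_force"}
assert extend_interval(c, "number_of_subgroups", 3, 7) == \
    {"type": "multi_group_force", "number_of_subgroups": (3, 7)}
assert add_alternative({"type": "alliance"}, "type", "coalition") == \
    {"type": {"alliance", "coalition"}}
```

Each function returns a strictly more general description; applying any of them in reverse yields the corresponding specialization rule.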
23 2003, G.Tecuci, Learning Agents Laboratory
Generalization and specialization rules
Climbing the generalization hierarchies
Dropping conditions
Extending intervals
Adding alternatives
Turning constants into variables
Descending the generalization hierarchies
Adding conditions
Reducing intervals
Dropping alternatives
Turning variables into constants
24 2003, G.Tecuci, Learning Agents Laboratory
Types of generalizations and specializations
Operational definition of generalization/specialization
Generalization/specialization of two concepts
Least general generalization of two concepts
Minimally general generalization of two concepts
Maximally general specialization of two concepts
25 2003, G.Tecuci, Learning Agents Laboratory
Operational definition of generalization
Operational definition:
A concept P is said to be more general than another concept Q if and only if Q can be transformed into P by applying a sequence of generalization rules.
Non-operational definition:
A concept P is said to be more general than another concept Q if and only if the set of instances represented by P includes the set of instances represented by Q.
Why isn’t this an operational definition?
This definition is not operational because it requires showing that each instance from a potentially infinite set Q is also in the set P.
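In a simple attribute-vector language, such as the one used in the examples later in this lecture, the operational definition reduces to a purely syntactic test. A sketch (the tuple encoding and the "any" marker are assumptions):

```python
# Operational "more general" test for attribute vectors: p is more
# general than (or equal to) q if p can be obtained from q by turning
# constants into variables, so p covers every instance q covers.
ANY = "any"

def more_general_or_equal(p, q):
    return all(pv == ANY or pv == qv for pv, qv in zip(p, q))

assert more_general_or_equal(("ball", ANY), ("ball", "large"))
assert more_general_or_equal((ANY, ANY), ("cube", "small"))
assert not more_general_or_equal(("ball", "large"), ("ball", ANY))
```

The test runs in time linear in the number of attributes, with no need to enumerate the (potentially infinite) sets of instances.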
26 2003, G.Tecuci, Learning Agents Laboratory
Generalization of two concepts
How would you define this?
Definition:
The concept Cg is a generalization of the concepts C1 and C2 if and only if Cg is more general than C1 and Cg is more general than C2.
Is the above definition operational?
Operational definition:
The concept Cg is a generalization of the concepts C1 and C2 if and only if both C1 and C2 can be transformed into Cg by applying generalization rules (assuming the existence of a complete set of rules).
Example: MANEUVER-UNIT is a generalization of ARMORED-UNIT and INFANTRY-UNIT.
27 2003, G.Tecuci, Learning Agents Laboratory
Generalization of two concepts: example
C1:
?O1 IS COURSE-OF-ACTION
    TOTAL-NUMBER-OF-OFFENSIVE-ACTIONS 10
    TYPE OFFENSIVE
C2:
?O1 IS COURSE-OF-ACTION
    TOTAL-NUMBER-OF-OFFENSIVE-ACTIONS 5
C:
?O1 IS COURSE-OF-ACTION
    TOTAL-NUMBER-OF-OFFENSIVE-ACTIONS ?N1
?N1 IS-IN [5 .. 10]
From C1: generalize 10 to [5 .. 10] and drop “?O1 TYPE OFFENSIVE”.
From C2: generalize 5 to [5 .. 10].
Remark: COA = Course of Action.
28 2003, G.Tecuci, Learning Agents Laboratory
Specialization of two concepts
Operational definition:
The concept Cs is a specialization of the concepts C1 and C2 if and only if both C1 and C2 can be transformed into Cs by applying specialization rules (or Cs can be transformed into both C1 and into C2 by applying generalization rules). This assumes a complete set of rules.
Definition:
The concept Cs is a specialization of the concepts C1 and C2 if and only if Cs is less general than C1 and Cs is less general than C2.
Example: PENETRATE-MILITARY-TASK is a specialization of MILITARY-MANEUVER and MILITARY-ATTACK.
29 2003, G.Tecuci, Learning Agents Laboratory
Other useful definitions
The concept G is a minimally general generalization of A and B if and only if G is a generalization of A and B, and G is not more general than any other generalization of A and B.
If there is only one minimally general generalization of two concepts A and B, then this generalization is called the least general generalization of A and B.
The concept C is a maximally general specialization of two concepts A and B if and only if C is a specialization of A and B and no other specialization of A and B is more general than C.
Minimally general generalization
Least general generalization
Maximally general specialization
Specialization of a concept with a negative example
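For tree-structured background hierarchies like the ones in this lecture, the minimally general generalization of two concepts is simply their closest common ancestor. A sketch (the parent table encodes the color hierarchy used in the later exercises; its exact shape is an assumption):

```python
# Minimally general generalization in a tree hierarchy: the closest
# common ancestor of the two concepts.
PARENT = {
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "warm-color": "any-color", "cold-color": "any-color",
}

def ancestors(c):
    """c together with all its ancestors, most specific first."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def minimally_general_generalization(a, b):
    up = set(ancestors(a))
    return next(c for c in ancestors(b) if c in up)

assert minimally_general_generalization("red", "orange") == "warm-color"
assert minimally_general_generalization("red", "blue") == "any-color"
assert minimally_general_generalization("red", "warm-color") == "warm-color"
```

In a tree the closest common ancestor is unique, so this minimally general generalization is also the least general generalization; in a general lattice there may be several incomparable candidates.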
30 2003, G.Tecuci, Learning Agents Laboratory
Concept learning: another illustration

Positive examples:
Allied_Forces_1943 is equal_partner_multi_state_alliance
    has_as_member US_1943
European_Axis_1943 is dominant_partner_multi_state_alliance
    has_as_member Germany_1943

Negative examples:
Somali_clans_1992 is equal_partner_multi_group_coalition
    has_as_member Isasq_somali_clan_1992

Concept learned by a cautious learner:
?O1 is multi_state_alliance
    has_as_member ?O2
?O2 is single_state_force
A multi-state alliance that has as member a single state force.
31 2003, G.Tecuci, Learning Agents Laboratory
Discussion
What could be said about the predictions of a cautious learner?
[Figure: the concept learned by a cautious learner lies inside the concept to be learned.]
33 2003, G.Tecuci, Learning Agents Laboratory
Concept learning: yet another illustration

Positive examples:
Allied_Forces_1943 is equal_partner_multi_state_alliance
    has_as_member US_1943
European_Axis_1943 is dominant_partner_multi_state_alliance
    has_as_member Germany_1943

Negative examples:
Somali_clans_1992 is equal_partner_multi_group_coalition
    has_as_member Isasq_somali_clan_1992

Concept learned by an aggressive learner:
?O1 is multi_member_force
    has_as_member ?O2
?O2 is single_state_force
A multi-member force that has as member a single state force.
34 2003, G.Tecuci, Learning Agents Laboratory
Discussion
What could be said about the predictions of an aggressive learner?
[Figure: the concept learned by an aggressive learner extends beyond the concept to be learned.]
36 2003, G.Tecuci, Learning Agents Laboratory
Discussion
How could one synergistically integrate a cautious learner with an aggressive learner, taking advantage of their qualities to compensate for each other’s weaknesses?
[Figure: the concept to be learned lies between the concept learned by a cautious learner (inside it) and the concept learned by an aggressive learner (outside it).]
37 2003, G.Tecuci, Learning Agents Laboratory
Overview
Concept learning from examples
Version spaces and the candidate elimination algorithm
The LEX system
Discussion
Instances, concepts and generalization
Recommended reading
The learning bias
38 2003, G.Tecuci, Learning Agents Laboratory
Basic idea of version space concept learning

Consider the examples E1, …, En in sequence.

1. Initialize the lower bound to the first positive example (LB = E1) and the upper bound UB to the most general generalization of E1.
2. If the next example is a positive one, then generalize LB as little as possible to cover it.
3. If the next example is a negative one, then specialize UB as little as possible to uncover it while remaining more general than LB.
4. Repeat steps 2 and 3 with the rest of the examples until UB = LB. This is the learned concept.
40 2003, G.Tecuci, Learning Agents Laboratory
The candidate elimination algorithm (Mitchell, 1978)
Let us suppose that we have an example e1 of a concept to be learned. Then any sentence of the representation language that is more general than this example is a plausible hypothesis for the concept.
The version space is:
H = { h | h is more general than e1 }
[Figure: the version space as a region of hypotheses, bounded above by G: {eg} (most general) and below by S: {e1} (most specific).]
41 2003, G.Tecuci, Learning Agents Laboratory
The candidate elimination algorithm (cont.)
[Figure: the version space between UB (more general) and LB (more specific), shrinking as candidate concepts are eliminated.]
As new examples and counterexamples are presented to the program, candidate concepts are eliminated from H.
In practice, this is done by updating the set G (the set of the most general elements in H) and the set S (the set of the most specific elements in H).
43 2003, G.Tecuci, Learning Agents Laboratory
The candidate elimination algorithm
1. Initialize S to the first positive example and G to its most general generalization.
2. Accept a new training instance I.
   • If I is a positive example, then:
     - remove from G all the concepts that do not cover I;
     - generalize the elements in S as little as possible to cover I while remaining less general than some concept in G;
     - keep in S only the minimally general concepts.
   • If I is a negative example, then:
     - remove from S all the concepts that cover I;
     - specialize the elements in G as little as possible to uncover I while remaining more general than at least one element of S;
     - keep in G only the maximally general concepts.
3. Repeat step 2 until G = S and they contain a single concept C (this is the learned concept).
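The algorithm can be sketched for the flat attribute language used on the next slide. This is a minimal implementation under assumptions: attributes take a specific value or "any", and the function names are illustrative:

```python
# Candidate elimination for conjunctive concepts over flat attribute
# domains. Each hypothesis is a tuple; "any" matches every value.
ANY = "any"
DOMAINS = [("ball", "brick", "cube"), ("large", "small")]  # shape, size

def covers(h, x):
    return all(hv == ANY or hv == xv for hv, xv in zip(h, x))

def more_general_or_equal(h1, h2):
    """h1 covers everything h2 covers (syntactic test)."""
    return all(a == ANY or a == b for a, b in zip(h1, h2))

def generalize(s, x):
    """Minimal generalization of s covering the positive example x."""
    return tuple(sv if sv == xv else ANY for sv, xv in zip(s, x))

def specializations(g, x):
    """Minimal specializations of g that exclude the negative example x."""
    out = []
    for i, dom in enumerate(DOMAINS):
        if g[i] == ANY:
            for v in dom:
                if v != x[i]:
                    out.append(g[:i] + (v,) + g[i + 1:])
    return out

def candidate_elimination(examples):
    positives = [x for x, cls in examples if cls == "+"]
    S = {positives[0]}                       # step 1: first positive example
    G = {(ANY,) * len(DOMAINS)}              # its most general generalization
    for x, cls in examples[1:]:
        if cls == "+":
            G = {g for g in G if covers(g, x)}
            S = {generalize(s, x) for s in S}
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                else:
                    for cand in specializations(g, x):
                        if any(more_general_or_equal(cand, s) for s in S):
                            new_G.add(cand)
            # keep only the maximally general concepts
            G = {g for g in new_G
                 if not any(h != g and more_general_or_equal(h, g) for h in new_G)}
    return S, G

examples = [(("ball", "large"), "+"), (("brick", "small"), "-"),
            (("cube", "large"), "-"), (("ball", "small"), "+")]
S, G = candidate_elimination(examples)
assert S == G == {("ball", ANY)}
```

Run on the (shape, size) example of the next slide, S and G converge to the single concept (ball, any-size), exactly as in the slide's trace.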
44 2003, G.Tecuci, Learning Agents Laboratory
Illustration of the candidate elimination algorithm
Language of instances: (shape, size); shape: {ball, brick, cube}; size: {large, small}
Language of generalizations: (shape, size); shape: {ball, brick, cube, any-shape}; size: {large, small, any-size}

Input examples:
shape  size   class
ball   large  +
brick  small  -
cube   large  -
ball   small  +

Learning process:
1. +(ball, large):  S = {(ball, large)},  G = {(any-shape, any-size)}
2. -(brick, small): S = {(ball, large)},  G = {(ball, any-size), (any-shape, large)}
3. -(cube, large):  S = {(ball, large)},  G = {(ball, any-size)}
4. +(ball, small):  S = {(ball, any-size)} = G
48 2003, G.Tecuci, Learning Agents Laboratory
Overview
Concept learning from examples
Version spaces and the candidate elimination algorithm
The LEX system
Discussion
Instances, concepts and generalization
Recommended reading
The learning bias
49 2003, G.Tecuci, Learning Agents Laboratory
The LEX system
LEX is a system that uses the version space method to learn heuristics suggesting when the integration operators should be applied in solving symbolic integration problems.
The problem of learning control heuristics
Given
Operators for symbolic integration:
OP1: ∫ r f(x) dx → r ∫ f(x) dx
OP2: ∫ u dv → uv - ∫ v du, where u = f1(x) and dv = f2(x) dx
OP3: 1 · f(x) → f(x)
OP4: ∫ (f1(x) + f2(x)) dx → ∫ f1(x) dx + ∫ f2(x) dx
OP5: ∫ sin(x) dx → -cos(x) + C
OP6: ∫ cos(x) dx → sin(x) + C
Find
Heuristics for applying the operators, as, for instance, the following one:
To solve ∫ rx transc(x) dx, apply OP2 with u = rx and dv = transc(x) dx
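The operators can be viewed as rewrite rules on expression trees. A toy sketch under assumptions (the tuple encoding and function names are illustrative; only OP1 and OP5 are shown), solving ∫ 3 sin(x) dx:

```python
# Toy sketch: integration operators as rewrite rules on tuple-encoded
# expressions. ("int", e) is the integral of e dx, ("*", a, b) a product,
# and "sin"/"cos" stand for sin(x)/cos(x). The encoding is an assumption.

def op1(e):
    """OP1: integral of r*f(x) dx -> r * integral of f(x) dx (r numeric)."""
    if (isinstance(e, tuple) and e[0] == "int" and isinstance(e[1], tuple)
            and e[1][0] == "*" and isinstance(e[1][1], (int, float))):
        r, f = e[1][1], e[1][2]
        return ("*", r, ("int", f))
    return None

def op5(e):
    """OP5: integral of sin(x) dx -> -cos(x) + C."""
    if e == ("int", "sin"):
        return ("+", ("neg", "cos"), "C")
    return None

def rewrite(e, op):
    """Apply op at the root or, failing that, inside a subexpression."""
    result = op(e)
    if result is not None:
        return result
    if isinstance(e, tuple):
        for i, sub in enumerate(e):
            r = rewrite(sub, op) if isinstance(sub, tuple) else None
            if r is not None:
                return e[:i] + (r,) + e[i + 1:]
    return None

problem = ("int", ("*", 3, "sin"))   # integral of 3 sin(x) dx
step1 = rewrite(problem, op1)        # 3 * integral of sin(x) dx
step2 = rewrite(step1, op5)          # 3 * (-cos(x) + C)
assert step1 == ("*", 3, ("int", "sin"))
assert step2 == ("*", 3, ("+", ("neg", "cos"), "C"))
```

A heuristic, in these terms, is a learned precondition telling the solver which rule to try on which expressions, instead of searching blindly over all applicable rules.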
50 2003, G.Tecuci, Learning Agents Laboratory
Remarks
The integration operators assure a satisfactory level of competence for the LEX system. That is, LEX is able in principle to solve a significant class of symbolic integration problems. In practice, however, it may not be able to solve many of these problems because doing so would require too much time and space.
The description of an operator shows when the operator is applicable, while a heuristic associated with an operator shows when the operator should be applied, in order to solve a problem.
LEX tries to discover, for each operator OPi, the definition of the concept: situations in which OPi should be used.
51 2003, G.Tecuci, Learning Agents Laboratory
The architecture of LEX

Version space of a proposed heuristic:
S: ∫ 3x cos(x) dx → Apply OP2 with u = 3x, dv = cos(x) dx
G: ∫ f1(x) f2(x) dx → Apply OP2 with u = f1(x), dv = f2(x) dx

One of the suggested positive training instances:
∫ 3x cos(x) dx → Apply OP2 with u = 3x, dv = cos(x) dx

Solution trace:
∫ 3x cos(x) dx
  —OP2 (u = 3x, dv = cos(x) dx)→  3x sin(x) - ∫ 3 sin(x) dx
  —OP1→  3x sin(x) - 3 ∫ sin(x) dx
  —OP5→  3x sin(x) + 3 cos(x) + C
[Architecture figure: the PROBLEM GENERATOR proposes a problem; the PROBLEM SOLVER solves it, applying operators such as OP1 and OP5; the CRITIC extracts training instances; the LEARNER refines the heuristics.]
1. What search strategy to use for problem solving?
2. How to characterize individual problem solving steps?
3. How to learn from these steps? How is the initial VS defined?
4. How to generate a new problem?
53 2003, G.Tecuci, Learning Agents Laboratory
Generalization hierarchy for functions
[Figure: a generalization hierarchy over functions f, with branches including op (+, -, *, /, ^), prim, transc (trig: sin, cos, tan; explog: ln, exp), and poly (monom: r x^n, k x^n, r x, k x, 3x, …).]
54 2003, G.Tecuci, Learning Agents Laboratory
Illustration of the learning process

Continue learning the heuristic for applying OP2:
The problem generator generates a new problem to solve that is useful for learning.
The problem solver solves this problem.
The critic extracts positive and negative examples from the problem solving tree.
The learner refines the version space of the heuristic.
57 2003, G.Tecuci, Learning Agents Laboratory
Overview
Concept learning from examples
Version spaces and the candidate elimination algorithm
The LEX system
Discussion
Instances, concepts and generalization
Recommended reading
The learning bias
58 2003, G.Tecuci, Learning Agents Laboratory
The learning bias
A bias is any basis for choosing one generalization over another, other than strict consistency with the observed training examples.
Types of bias:
- restricted hypothesis space bias;
- preference bias.
59 2003, G.Tecuci, Learning Agents Laboratory
Restricted hypothesis space bias
The hypothesis space H (i.e. the space containing all the possible concept descriptions) is defined by the generalization language. This language may not be capable of expressing all possible classes of instances. Consequently, the hypothesis space in which the concept description is searched is restricted.
Some of the restricted spaces investigated:
- logical conjunctions (i.e. the learning system will look for a concept description in the form of a conjunction);
- linear threshold functions (for exemplar-based representations);
- three-layer neural networks with a fixed number of hidden units.
60 2003, G.Tecuci, Learning Agents Laboratory
Restricted hypothesis space bias: example
The language of instances consists of triples of bits, for example: (0, 1, 1), (1, 0, 1).
The language of generalizations consists of triples of 0, 1, and *, where * means any bit, for example: (0, *, 1), (*, 0, 1).
How many concepts are in this space? The total number of subsets of the 8 instances is 2^8 = 256.
How many concepts can be represented in this language? The hypothesis space consists of 3 × 3 × 3 = 27 elements.
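Both counts can be checked by enumeration (a quick sketch; the `extension` helper is illustrative):

```python
from itertools import product

instances = list(product([0, 1], repeat=3))       # the 8 bit-triples
patterns = list(product([0, 1, "*"], repeat=3))   # the 27 generalizations

def extension(p):
    """The set of instances a pattern covers."""
    return frozenset(x for x in instances
                     if all(pv == "*" or pv == xv for pv, xv in zip(p, x)))

assert len(instances) == 8
assert 2 ** len(instances) == 256                 # all subsets of instances
assert len(patterns) == 3 ** 3 == 27
assert len({extension(p) for p in patterns}) == 27  # 27 distinct concepts
```

So only 27 of the 256 possible classes of instances are expressible: the language itself rules out the remaining 229, which is exactly the restricted hypothesis space bias.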
61 2003, G.Tecuci, Learning Agents Laboratory
Preference bias
A preference bias places a preference ordering over the hypotheses in the hypothesis space H. The learning algorithm can then choose the most preferred hypothesis f in H that is consistent with the training examples, and produce this hypothesis as its output.
Most preference biases attempt to minimize some measure of syntactic complexity of the hypothesis representation (e.g. shortest logical expression, smallest decision tree). These are variants of Occam's Razor, the bias first formulated by William of Occam (1300-1349):
Given two explanations of the data, all other things being equal, the simpler explanation is preferable.
62 2003, G.Tecuci, Learning Agents Laboratory
Preference bias: representation
How could the preference bias be represented?
In general, the preference bias may be implemented as an order relationship 'better(f1, f2)' over the hypothesis space H. The system then chooses the "best" hypothesis f according to the "better" relationship.
An example of such a relationship: "less-general-than", which produces the least general expression consistent with the data.
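A preference bias over the bit-pattern space of the earlier example can be sketched by ordering the consistent hypotheses and returning the most preferred one. The function names and the concrete "less-general-than" ordering are illustrative assumptions:

```python
from itertools import product

instances = list(product([0, 1], repeat=3))
patterns = list(product([0, 1, "*"], repeat=3))

def covers(p, x):
    return all(pv == "*" or pv == xv for pv, xv in zip(p, x))

def consistent(p, positives, negatives):
    return (all(covers(p, x) for x in positives)
            and not any(covers(p, x) for x in negatives))

def less_general(p):
    """'less-general-than' bias: prefer hypotheses covering fewer instances."""
    return sum(covers(p, x) for x in instances)

def learn(positives, negatives, better):
    """Return the most preferred consistent hypothesis under 'better'."""
    candidates = [p for p in patterns if consistent(p, positives, negatives)]
    return min(candidates, key=better)

h = learn(positives=[(0, 1, 1), (0, 1, 0)], negatives=[(1, 1, 1)],
          better=less_general)
assert h == (0, 1, "*")
```

Swapping in a different `better` function changes the output hypothesis without changing the hypothesis space, which is precisely what distinguishes a preference bias from a restricted hypothesis space bias.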
63 2003, G.Tecuci, Learning Agents Laboratory
Overview
Concept learning from examples
Version spaces and the candidate elimination algorithm
The LEX system
Discussion
Instances, concepts and generalization
Recommended reading
The learning bias
64 2003, G.Tecuci, Learning Agents Laboratory
Problem

Set of examples:
color   shape      size   class
orange  square     large  +  (i1)
blue    ellipse    small  -  (i2)
red     triangle   small  +  (i3)
green   rectangle  small  -  (i4)
yellow  circle     large  +  (i5)

Background knowledge:
any-color: warm-color (red, yellow, orange), cold-color (black, blue, green)
any-shape: polygon (triangle, rectangle, square), round (circle, ellipse)
any-size: large, small

Language of instances: an instance is defined by a triplet of the form (specific-color, specific-shape, specific-size)
Language of generalizations: (color-concept, shape-concept, size-concept)

Task: apply the candidate elimination algorithm to learn the concept represented by the above examples.
65 2003, G.Tecuci, Learning Agents Laboratory
Solution:

+i1: (color = orange) & (shape = square) & (size = large)
S: {[(color = orange) & (shape = square) & (size = large)]}
G: {[(color = any-color) & (shape = any-shape) & (size = any-size)]}

-i2: (color = blue) & (shape = ellipse) & (size = small)
S: {[(color = orange) & (shape = square) & (size = large)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)],
    [(color = any-color) & (shape = any-shape) & (size = large)]}

+i3: (color = red) & (shape = triangle) & (size = small)
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}

-i4: (color = green) & (shape = rectangle) & (size = small)
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)]}

+i5: (color = yellow) & (shape = circle) & (size = large)
S: {[(color = warm-color) & (shape = any-shape) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)]}

The concept is: (color = warm-color) & (shape = any-shape) & (size = any-size)  ; a warm-colored object
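The learned concept can be checked against the examples with a small coverage test over the background hierarchies (a sketch; the parent table and function names are assumptions):

```python
# Check that the learned concept covers the positives (i1, i3, i5)
# and none of the negatives (i2, i4), using the background hierarchies.
PARENT = {
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "triangle": "polygon", "rectangle": "polygon", "square": "polygon",
    "circle": "round", "ellipse": "round",
    "warm-color": "any-color", "cold-color": "any-color",
    "polygon": "any-shape", "round": "any-shape",
    "large": "any-size", "small": "any-size",
}

def ancestors_or_self(v):
    chain = {v}
    while v in PARENT:
        v = PARENT[v]
        chain.add(v)
    return chain

def covers(concept, instance):
    """Each concept value must be the instance value or one of its ancestors."""
    return all(cv in ancestors_or_self(iv) for cv, iv in zip(concept, instance))

learned = ("warm-color", "any-shape", "any-size")
positives = [("orange", "square", "large"),
             ("red", "triangle", "small"),
             ("yellow", "circle", "large")]
negatives = [("blue", "ellipse", "small"),
             ("green", "rectangle", "small")]
assert all(covers(learned, x) for x in positives)
assert not any(covers(learned, x) for x in negatives)
```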
66 2003, G.Tecuci, Learning Agents Laboratory
Does the order of the examples count? Why and how?
Consider the following order:
color   shape      size   class
orange  square     large  +  (i1)
red     triangle   small  +  (i3)
yellow  circle     large  +  (i5)
blue    ellipse    small  -  (i2)
green   rectangle  small  -  (i4)
67 2003, G.Tecuci, Learning Agents Laboratory
Discussion
What happens if there are not enough examples for S and G to become identical?
Could we still learn something useful?
How could we classify a new instance?
When could we be sure that the classification is the same as the one made if the concept were completely learned?
Could we be sure that the classification is correct?
68 2003, G.Tecuci, Learning Agents Laboratory
What happens if there are not enough examples for S and G to become identical?
Let us assume that one learns only from the first 3 examples:
color   shape     size   class
orange  square    large  +  (i1)
blue    ellipse   small  -  (i2)
red     triangle  small  +  (i3)

The final version space will be:
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}
69 2003, G.Tecuci, Learning Agents Laboratory
Assume that the final version space is:
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}

How could we classify the following examples, how certain are we about the classification, and why?
color   shape    size   classification
blue    circle   large  -           (covered by no element of G)
orange  square   small  +           (covered by S)
red     ellipse  large  don't know  (covered by one element of G but not by S)
blue    polygon  small  don't know  (covered by one element of G but not by S)
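The classification rule can be sketched directly: an instance covered by an element of S is covered by every hypothesis in the version space (+); one covered by no element of G is covered by none (-); anything else is undetermined. A sketch reusing the hierarchy coverage test (the parent table and names are assumptions):

```python
# Classifying with a partially learned version space.
PARENT = {
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "triangle": "polygon", "rectangle": "polygon", "square": "polygon",
    "circle": "round", "ellipse": "round",
    "warm-color": "any-color", "cold-color": "any-color",
    "polygon": "any-shape", "round": "any-shape",
    "large": "any-size", "small": "any-size",
}

def ancestors_or_self(v):
    chain = {v}
    while v in PARENT:
        v = PARENT[v]
        chain.add(v)
    return chain

def covers(concept, instance):
    return all(cv in ancestors_or_self(iv) for cv, iv in zip(concept, instance))

S = [("warm-color", "polygon", "any-size")]
G = [("warm-color", "any-shape", "any-size"),
     ("any-color", "polygon", "any-size")]

def classify(instance):
    if any(covers(s, instance) for s in S):
        return "+"   # every hypothesis in the version space covers it
    if not any(covers(g, instance) for g in G):
        return "-"   # no hypothesis in the version space covers it
    return "don't know"

assert classify(("blue", "circle", "large")) == "-"
assert classify(("orange", "square", "small")) == "+"
assert classify(("red", "ellipse", "large")) == "don't know"
assert classify(("blue", "polygon", "small")) == "don't know"
```

The "+" and "-" answers are guaranteed to agree with the completely learned concept, whichever hypothesis in the version space it turns out to be; only the "don't know" cases depend on the missing examples.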
70 2003, G.Tecuci, Learning Agents Laboratory
Discussion
Could the examples contain errors?
What kind of errors could be found in an example?
What will be the result of the learning algorithm if there are errors in the examples?
What could we do if we know that there are errors?
71 2003, G.Tecuci, Learning Agents Laboratory
Discussion
Could the examples contain errors?
What kind of errors could be found in an example?
- Classification errors:
  - positive examples labeled as negative
  - negative examples labeled as positive
- Measurement errors:
  - errors in the values of the attributes
72 2003, G.Tecuci, Learning Agents Laboratory
What will be the result of the learning algorithm if there are errors in examples?
Let us assume that the 4th example is incorrectly classified:
color   shape      size   class
orange  square     large  +  (i1)
blue    ellipse    small  -  (i2)
red     triangle   small  +  (i3)
green   rectangle  small  +  (i4)  (incorrect classification)
yellow  circle     large  +  (i5)

The version space after the first three examples is:
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}
Continue learning
73 2003, G.Tecuci, Learning Agents Laboratory
What could we do if we know that there might be errors in the examples?
If we cannot find a concept consistent with all the training examples, then we may try to find a concept that is consistent with all but one of the examples.
If this fails, then we may try to find a concept that is consistent with all but two of the examples, and so on.
What is a problem with this approach?
Combinatorial explosion.
74 2003, G.Tecuci, Learning Agents Laboratory
What happens if we extend the generalization language to include conjunction, disjunction and negation of examples?
Set of examples:
color   shape      size   class
orange  square     large  +  (i1)
blue    ellipse    small  -  (i2)
red     triangle   small  +  (i3)
green   rectangle  small  -  (i4)
yellow  circle     large  +  (i5)

Background knowledge:
any-color: warm-color (red, yellow, orange), cold-color (black, blue, green)
any-shape: polygon (triangle, rectangle, square), round (circle, ellipse)
any-size: large, small

Task: learn the concept represented by the above examples by applying the version space method.
75 2003, G.Tecuci, Learning Agents Laboratory
Set of examples:
color   shape      size   class
orange  square     large  +  (i1)
blue    ellipse    small  -  (i2)
red     triangle   small  +  (i3)
green   rectangle  small  -  (i4)
yellow  circle     large  +  (i5)
G = {all the examples}            S = { i1 }
G = { ¬i2 }  ; all the examples except i2            S = { i1 }
G = { ¬i2 }            S = { i1 or i3 }
G = { ¬i2 and ¬i4 }  ; all the examples except i2 and i4            S = { i1 or i3 }
G = { ¬i2 and ¬i4 }            S = { i1 or i3 or i5 }
These are the minimal generalizations and specializations.
76 2003, G.Tecuci, Learning Agents Laboratory
The futility of bias-free learning
A learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instance.
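With a bias-free (fully expressive) language, this can be seen directly: S collapses to exactly the positives and G to everything but the negatives, so the version space never commits on an unseen instance. A sketch representing each hypothesis as a set of instances (the instance space and names are illustrative):

```python
from itertools import product

# Bias-free learning over all subsets of a tiny instance space.
instances = set(product([0, 1], repeat=2))
positives = {(0, 0), (0, 1)}
negatives = {(1, 1)}

# With every subset expressible, the tightest consistent bounds are:
S = set(positives)                # exactly the positives ("i1 or i2 or ...")
G = instances - set(negatives)    # everything except the negatives

unseen = instances - positives - negatives   # here: {(1, 0)}
for x in unseen:
    # x is excluded by S but included by G: the version space cannot
    # classify it, so the learner has no rational basis for a prediction.
    assert x not in S and x in G

assert unseen == {(1, 0)}
```

No amount of consistent data short of the full instance space changes this: generalization beyond the observed examples comes entirely from the bias.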
77 2003, G.Tecuci, Learning Agents Laboratory
What happens if we extend the generalization language to include internal disjunction? Does the algorithm still generalize over the observed data?
Set of examples:
color   shape      size   class
orange  square     large  +  (i1)
blue    ellipse    small  -  (i2)
red     triangle   small  +  (i3)
green   rectangle  small  -  (i4)
yellow  circle     large  +  (i5)

Background knowledge:
any-color: warm-color (red, yellow, orange), cold-color (black, blue, green)
any-shape: polygon (triangle, rectangle, square), round (circle, ellipse)
any-size: large, small

Task: learn the concept represented by the above examples by applying the version space method.

Generalization(i1, i3): (orange or red, square or triangle, large or small)
Is it different from: i1 or i3?
78 2003, G.Tecuci, Learning Agents Laboratory
How is the generalization language extended by the internal disjunction?
Consider the following generalization hierarchy:
any-shape
  polygon: triangle, rectangle
  circle
79 2003, G.Tecuci, Learning Agents Laboratory
How is the generalization language extended by the internal disjunction?
The above hierarchy is replaced with the following lattice of internal disjunctions:
triangle or rectangle or circle  (= any-shape = polygon or circle)
  triangle or rectangle (= polygon)    triangle or circle    rectangle or circle
    triangle    rectangle    circle
80 2003, G.Tecuci, Learning Agents Laboratory
Consider now the following generalization hierarchy:
any-color
  warm-color: red, yellow, orange
  cold-color: black, blue, green
Which is the corresponding hierarchy containing disjunctions?
81 2003, G.Tecuci, Learning Agents Laboratory
Could you think of another approach to learning a disjunctive concept with the candidate elimination algorithm?
Find a concept (concept1) that is consistent with some of the positive examples and none of the negative examples.
Remove the covered positive examples from the training set and repeat the procedure for the rest of the examples, computing another concept (concept2) that covers some of the remaining positive examples, and so on, until no positive example is left.
The learned concept is “concept1 or concept2 or …”.
Could you specify this algorithm better?
Hint: Initialize S with the first positive example, …
82 2003, G.Tecuci, Learning Agents Laboratory
Exercise
Consider the following:
Instance language:
  color: {red, orange, yellow, blue, green, black}
Generalization language:
  color: {red, orange, yellow, blue, green, black, warm-color, cold-color, any-color}
a sequence of positive and negative examples of a concept:
  example1(+): orange    example2(-): blue    example3(+): red
and the background knowledge represented by the following hierarchy:
any-color
  warm-color: red, yellow, orange
  cold-color: black, blue, green

Task: apply the candidate elimination algorithm to learn the concept represented by the above examples.
83 2003, G.Tecuci, Learning Agents Laboratory
• In its original form, it learns only conjunctive descriptions.
• However, it can be applied successively to learn disjunctive descriptions.
• It requires an exhaustive set of examples.
• It conducts an exhaustive bi-directional breadth-first search.
• The sets S and G can be very large for complex problems.
• It is very important from a theoretical point of view, clarifying the process of inductive concept learning from examples.
• It has very limited practical applicability because of the combinatorial explosion of the S and G sets.
• It is the basis of the powerful Disciple multistrategy learning method, which has practical applications.
Features of the version space method
84 2003, G.Tecuci, Learning Agents Laboratory
Recommended reading
Mitchell T.M., Machine Learning, Chapter 2: Concept learning and the general to specific ordering, pp. 20-51, McGraw Hill, 1997.
Mitchell, T.M., Utgoff P.E., Banerji R., Learning by Experimentation: Acquiring and Refining Problem-Solving Heuristics, in Readings in Machine Learning.
Tecuci, G., Building Intelligent Agents, Chapter 3: Knowledge representation and reasoning, pp. 31-75, Academic Press, 1998.
Barr A. and Feigenbaum E. (Eds.), The Handbook of Artificial Intelligence, vol III, pp.385-400, pp.484-493.