CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNING
11/20/2013 Santiago Ontañón [email protected] https://www.cs.drexel.edu/~santi/teaching/2013/CS380/intro.html
Machine Learning Summary:
• Several types of learning:
  • Learning from examples:
    • Supervised Learning
    • Unsupervised Learning
  • Reinforcement Learning
  • Learning from Demonstration (imitation)
  • Etc.
• Today:
  • Learning decision trees
Inductive Learning From Examples
• f: unknown function that we want to learn
• f: X → Y, where “X” is the input space and “Y” is the target space
For example:
• If we want to use machine learning to learn the evaluation function for “Othello”:
  - X: space of Othello boards
  - Y: real numbers
• If we want to use machine learning to learn how to read hand-written characters:
  - X: 16x16 pixel images
  - Y: characters
• Training set:
  • Set of examples from which to learn. Each example pairs an input with its target value: e = (x, f(x)), e.g. e1 = (x1, f(x1))
For example, if we want to use machine learning to learn the evaluation function for “Othello”:
  e1 = (board1, +15)
  e2 = (board2, -5)
  …
• Learning algorithm:
  • Method that, given the training set, generates a hypothesis h that fits the data
  • Different learning algorithms explore different hypothesis spaces:
    • Hypothesis space: set of all possible hypotheses that can be formulated
    • The learning algorithm explores this search space looking for the simplest hypothesis that fits the data
Induction of Decision Trees
• One of the earliest forms of machine learning
• Algorithm: ID3:
  • Hypothesis space: decision trees
  • Example representation: feature vectors
  • Explores the space of decision trees, trying to find one that fits the data
Decision Tree Example
• Target function: “is it a good day to play tennis?”

Outlook
├── Sunny → Humidity
│   ├── High → No
│   └── Normal → Yes
├── Overcast → Yes
└── Rain → Wind
    ├── Strong → No
    └── Weak → Yes
Training Set
f([sunny, hot, high, weak]) = no
f([sunny, hot, high, strong]) = no
etc.
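The feature-vector representation above can be sketched in Python as follows. The slide's table did not survive extraction, so the rows below are an assumption: they are the standard 14-example PlayTennis data set from Mitchell's Machine Learning textbook, which agrees with the two examples shown above and with the 9 yes / 5 no counts used later in the deck. The names `ATTRIBUTES` and `TRAINING_SET` are illustrative only.

```python
from collections import Counter

# Order of the features inside each feature vector (assumed).
ATTRIBUTES = ["outlook", "temperature", "humidity", "wind"]

# Each example is a pair (x, f(x)): a feature vector and its target value.
# Assumption: standard PlayTennis rows, not recovered from the slide itself.
TRAINING_SET = [
    (["sunny", "hot", "high", "weak"], "no"),
    (["sunny", "hot", "high", "strong"], "no"),
    (["overcast", "hot", "high", "weak"], "yes"),
    (["rain", "mild", "high", "weak"], "yes"),
    (["rain", "cool", "normal", "weak"], "yes"),
    (["rain", "cool", "normal", "strong"], "no"),
    (["overcast", "cool", "normal", "strong"], "yes"),
    (["sunny", "mild", "high", "weak"], "no"),
    (["sunny", "cool", "normal", "weak"], "yes"),
    (["rain", "mild", "normal", "weak"], "yes"),
    (["sunny", "mild", "normal", "strong"], "yes"),
    (["overcast", "mild", "high", "strong"], "yes"),
    (["overcast", "hot", "normal", "weak"], "yes"),
    (["rain", "mild", "high", "strong"], "no"),
]

# Count how often each target value occurs in the training set.
label_counts = Counter(label for _, label in TRAINING_SET)
print(label_counts)
```

Keeping the target value separate from the feature vector mirrors the e = (x, f(x)) notation used earlier.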
Learning Decision Trees
• Generating a hypothesis from examples: given the training set, the learner outputs a decision tree, here the Outlook / Humidity / Wind tree of the tennis example.
ID3 Algorithm

ID3(examples, attributes_left)
    Tree = new Node()
    If all examples have the same target value vt
        Tree.target = vt
        Return Tree
    If attributes_left = empty list
        Tree.target = most common target value in examples
        Return Tree
    Tree.attribute = A = best attribute in attributes_left
    For each possible value v of A
        If “examples where A = v” is empty
            SubTree = leaf node with most common value in examples
        Else
            SubTree = ID3(examples where A = v, attributes_left – A)
        add SubTree to Tree in a branch labeled “A = v”
    Return Tree
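The pseudocode above can be turned into a short Python sketch. This is one possible rendering, not the course's reference implementation: it assumes examples are (feature dict, target) pairs, represents leaves as plain target values and internal nodes as dicts, takes "best attribute" to mean highest information gain, and introduces a hypothetical `domains` argument that lists every possible value of each attribute.

```python
import math
from collections import Counter


def entropy(examples):
    """Entropy in bits of the target values in a list of (features, target) pairs."""
    counts = Counter(target for _, target in examples)
    total = len(examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(examples, attribute):
    """Reduction in entropy obtained by splitting the examples on one attribute."""
    remainder = 0.0
    for value in {features[attribute] for features, _ in examples}:
        subset = [e for e in examples if e[0][attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(examples) - remainder


def most_common_target(examples):
    return Counter(target for _, target in examples).most_common(1)[0][0]


def id3(examples, attributes_left, domains):
    """Return a leaf (a target value) or a node {"attribute": A, "branches": {...}}."""
    targets = {target for _, target in examples}
    if len(targets) == 1:        # all examples have the same target value
        return targets.pop()
    if not attributes_left:      # no attribute left to split on
        return most_common_target(examples)
    best = max(attributes_left, key=lambda a: information_gain(examples, a))
    remaining = [a for a in attributes_left if a != best]
    branches = {}
    for value in domains[best]:  # every possible value, even unseen ones
        subset = [e for e in examples if e[0][best] == value]
        if not subset:
            branches[value] = most_common_target(examples)
        else:
            branches[value] = id3(subset, remaining, domains)
    return {"attribute": best, "branches": branches}


# Tiny hypothetical data set: the target is "yes" exactly when the wind is weak.
toy = [
    ({"wind": "weak", "humidity": "high"}, "yes"),
    ({"wind": "weak", "humidity": "normal"}, "yes"),
    ({"wind": "strong", "humidity": "high"}, "no"),
    ({"wind": "strong", "humidity": "normal"}, "no"),
]
domains = {"wind": ["weak", "strong"], "humidity": ["high", "normal"]}
tree = id3(toy, ["humidity", "wind"], domains)
print(tree)
```

On the toy data, splitting on wind separates the two classes completely, so the sketch returns a one-level tree that tests wind only.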
ID3 Algorithm: example trace

Call 1: ID3(examples, attributes_left)
    Examples = [ … ]
    Attributes_left = [day, outlook, temperature, humidity, Wind]
    Tree: Outlook, with branches Sunny, Overcast, Rainy

Call 2, for the branch “Outlook = Sunny”:
    Examples = [ … ]
    Attributes_left = [day, temperature, humidity, Wind]
    Tree: Humidity, with branches High, Normal

Call 3, for the branch “Humidity = High”:
    Examples = [ … ]
    Attributes_left = [day, temperature, Wind]
    Tree: No (leaf)

Call 3 returns, and the leaf “No” is added to the Humidity node in the branch labeled “Humidity = High”. The branch “Humidity = Normal” is processed in the same way and yields the leaf “Yes”.

Call 2 returns, and the Humidity subtree is added to the Outlook node in the branch labeled “Outlook = Sunny”. The Overcast and Rainy branches are then processed in the same way.
ID3 Output

Outlook
├── Sunny → Humidity
│   ├── High → No
│   └── Normal → Yes
├── Overcast → Yes
└── Rain → Wind
    ├── Strong → No
    └── Weak → Yes

This tree can now be used to predict the target value for examples that were not in the original training set: generalization
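Prediction with the learned tree can be sketched as follows. The nested-dict encoding, the name `predict`, and the unseen example are illustrative assumptions; only the tree structure itself comes from the slide.

```python
# The learned tree as nested dicts: internal nodes test an attribute,
# leaves are plain strings holding the predicted target value.
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity",
                  "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain": {"attribute": "wind",
                 "branches": {"strong": "no", "weak": "yes"}},
    },
}


def predict(tree, example):
    """Follow the branch matching the example's attribute values until a leaf."""
    while isinstance(tree, dict):
        tree = tree["branches"][example[tree["attribute"]]]
    return tree


# Hypothetical day used only to illustrate prediction on a new input.
unseen = {"outlook": "rain", "temperature": "hot",
          "humidity": "high", "wind": "weak"}
print(predict(TREE, unseen))  # yes
```

Note that the tree never consults temperature: attributes that ID3 did not select have no influence on the prediction.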
Which Attribute is Best?
• In the original training set, we have:
  • 9 examples with “YES”
  • 5 examples with “NO”
• If we start with “outlook” (9 yes, 5 no at the root):
  • sunny: 2 yes, 3 no
  • overcast: 4 yes, 0 no
  • rainy: 3 yes, 2 no
• If we start with “wind” (9 yes, 5 no at the root):
  • strong: 3 yes, 3 no
  • weak: 6 yes, 2 no

We want examples to be classified as well as possible. Ideally, all “yes” examples end up in one branch, and all “no” examples in another branch.
Entropy
• Given a sequence of symbols S, drawn from an alphabet B
• Entropy: expected number of bits needed to encode the next symbol (the amount of information that knowing one more symbol provides us)
• If the symbols in S are drawn uniformly at random from B, entropy is maximal
• If S is always the same symbol from B, entropy is minimal (we know which symbol will come next, so there is no new information)
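Entropy
• Example for binary variable:

[Figure: Entropy(S) plotted against p+ from 0.0 to 1.0; the curve is 0.0 at both ends and reaches its maximum of 1.0 at p+ = 0.5]

$$H(X) = -\sum_i p(x_i) \log_2 p(x_i)$$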
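The binary case of the entropy formula can be checked numerically with a few lines of Python. This is a minimal sketch; the function name `binary_entropy` is illustrative, and the logarithm is taken base 2 so that the result is in bits.

```python
import math


def binary_entropy(p_plus):
    """Entropy in bits of a binary variable with P(positive) = p_plus."""
    total = 0.0
    for p in (p_plus, 1.0 - p_plus):
        if p > 0.0:  # by convention, 0 * log(0) is treated as 0
            total -= p * math.log2(p)
    return total


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"p+ = {p:.2f}  H = {binary_entropy(p):.3f}")
```

The values are symmetric around p+ = 0.5, where the entropy peaks at exactly one bit.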
Entropy for Attribute Selection
• The entropy in a node of the tree measures how well “grouped” the examples in that node are (the lower the entropy, the purer the node):

Outlook: 9 yes, 5 no, H = 0.94
  sunny: 2 yes, 3 no, H = 0.97
  overcast: 4 yes, 0 no, H = 0.00
  rainy: 3 yes, 2 no, H = 0.97

Wind: 9 yes, 5 no, H = 0.94
  strong: 3 yes, 3 no, H = 1.00
  weak: 6 yes, 2 no, H = 0.81
![Page 42: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/42.jpg)
Entropy for Attribute Selection
• Information Gain: reduction in entropy due to selecting attribute A
• Idea: the best attribute is the one that maximizes information gain
$$\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} H(S_v)$$
where S_v is the subset of S for which attribute A has value v.
![Page 43: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/43.jpg)
Information Gain
Outlook: 9 yes, 5 no (H = 0.94)
  • sunny: 2 yes, 3 no (H = 0.97)
  • overcast: 4 yes, 0 no (H = 0.00)
  • rainy: 3 yes, 2 no (H = 0.97)
Wind: 9 yes, 5 no (H = 0.94)
  • strong: 3 yes, 3 no (H = 1.00)
  • mild: 6 yes, 2 no (H = 0.81)
Gain(Outlook) = 0.94 - 5/14 * (0.97) - 4/14 * (0.00) - 5/14 * (0.97) = 0.25
Gain(Wind) = 0.94 - 6/14 * (1.00) - 8/14 * (0.81) = 0.05
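The gain computation can be reproduced directly from the yes/no counts shown on this slide. This is a minimal sketch, not the course's reference code: the function names `entropy` and `information_gain` are chosen for illustration, and the values are computed from unrounded entropies, so they may differ in the last digit from a hand calculation with rounded intermediates.

```python
import math


def entropy(counts):
    """Entropy in bits of a node, given its class counts, e.g. (yes, no)."""
    total = sum(counts)
    # Classes with zero examples contribute nothing (0 * log 0 = 0).
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent, children):
    """Gain(S, A): parent entropy minus the size-weighted child entropies."""
    total = sum(parent)
    remainder = sum(
        sum(child) / total * entropy(child) for child in children.values()
    )
    return entropy(parent) - remainder


# (yes, no) counts as read off the slide.
root = (9, 5)
outlook = {"sunny": (2, 3), "overcast": (4, 0), "rainy": (3, 2)}
wind = {"strong": (3, 3), "mild": (6, 2)}

gain_outlook = information_gain(root, outlook)
gain_wind = information_gain(root, wind)
print(f"Gain(Outlook) = {gain_outlook:.3f}")  # 0.247
print(f"Gain(Wind)    = {gain_wind:.3f}")     # 0.048
```

Outlook yields the larger gain, so it would be preferred over Wind as the first attribute to test.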
![Page 44: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/44.jpg)
ID3 Does Best-First Search
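[Figure: search tree over partial decision trees. The search starts from a single node and each step extends the current tree with a further attribute test (A1, A2, A3, A4, ...); + and - mark the labels of the examples at the leaves.]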
ID3 searches the hypothesis space of decision trees, starting from a tree with a single node and using Information Gain as the heuristic function. We can conceive better search strategies (e.g. A*), but their computational cost might be too large (although they might learn much better trees!)
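The search described above can be sketched as a short recursive procedure. This is a minimal illustration under stated assumptions, not the course's implementation: it commits to the highest-gain attribute at each node without backtracking, the function names are made up, and the rows are assumed to be the classic 14-example PlayTennis data with the temperature column omitted for brevity.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Entropy in bits of the target labels in a set of rows."""
    counts = Counter(row[target] for row in rows)
    return sum(-(c / len(rows)) * math.log2(c / len(rows)) for c in counts.values())


def gain(rows, attr, target):
    """Information gain of splitting the rows on one attribute."""
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row for row in rows if row[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attrs, target):
    """Build a tree as nested dicts: {attribute: {value: subtree_or_label}}."""
    labels = [row[target] for row in rows]
    # Stop when the node is pure or no attributes remain: emit a leaf.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: commit to the attribute with the highest information gain.
    best = max(attrs, key=lambda a: gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {value: id3([r for r in rows if r[best] == value], rest, target)
                   for value in sorted({r[best] for r in rows})}}


# Assumed training set (PlayTennis, temperature column left out).
FIELDS = ("outlook", "humidity", "wind", "play")
DATA = """
sunny high weak no
sunny high strong no
overcast high weak yes
rain high weak yes
rain normal weak yes
rain normal strong no
overcast normal strong yes
sunny high weak no
sunny normal weak yes
rain normal weak yes
sunny normal strong yes
overcast high strong yes
overcast normal weak yes
rain high strong no
"""
rows = [dict(zip(FIELDS, line.split())) for line in DATA.strip().splitlines()]

tree = id3(rows, ["outlook", "humidity", "wind"], "play")
print(tree)
```

Because each choice is final, the procedure is cheap, but it can miss trees that a search with backtracking would find.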
![Page 45: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/45.jpg)
ID3
• Better heuristics than “Information Gain” exist:
  • E.g. Gain Ratio, GINI index, RLDM distance, etc.
• Many alternative search strategies (e.g. decision forests)
• Over-fitting:
  • What happens if we have an example in the training set that is noise (e.g. that is mislabeled)?
  • ID3 will try to force it into the decision tree!
  • Strategies exist to avoid over-fitting (e.g. prevent leaves that have only a very small number of examples)
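The minimum-leaf-size idea mentioned above could look roughly like this. It is a hedged sketch: `allowed_split`, `majority_label`, and the `min_leaf_size` parameter are hypothetical names, and the node contents are invented to show a single mislabeled example.

```python
from collections import Counter


def majority_label(labels):
    """Most frequent class label among the examples reaching a node."""
    return Counter(labels).most_common(1)[0][0]


def allowed_split(partition, min_leaf_size):
    """Return True only if every branch of the split keeps enough examples.

    `partition` maps each attribute value to the labels of the examples sent
    down that branch. `min_leaf_size` is a hypothetical tuning parameter.
    """
    return all(len(labels) >= min_leaf_size for labels in partition.values())


# A hypothetical node: one mislabeled "no" sits among otherwise "yes" examples.
labels = ["yes"] * 9 + ["no"]
# A candidate split that isolates the noisy example in its own branch.
partition = {"v1": ["yes"] * 9, "v2": ["no"]}

if allowed_split(partition, min_leaf_size=2):
    decision = "split"
else:
    # Some branch would be too small: stop here and predict the majority class.
    decision = majority_label(labels)

print(decision)  # "yes": the noisy example is outvoted instead of memorized
```

The trade-off is that a genuine but rare pattern is discarded in the same way as noise, so the threshold has to be tuned.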
![Page 46: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/46.jpg)
ID3
• Converting a tree to rules:
  • Each branch of the tree (a path from the root to a leaf) is a rule
  • A decision tree is just a compact way to represent a set of rules
Outlook
  • Sunny → Humidity
    • High → No
    • Normal → Yes
  • Overcast → Yes
  • Rain → Wind
    • Strong → No
    • Weak → Yes
Outlook = rain and Wind = strong → playtennis = no
![Page 47: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/47.jpg)
Thus, ID3 can be used to extract knowledge (e.g. rules) from large databases of examples
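Rule extraction amounts to enumerating the root-to-leaf paths of the tree. The sketch below assumes the PlayTennis tree from the slide is encoded as nested dictionaries of the form {attribute: {value: subtree_or_label}}; that encoding and the function name `tree_to_rules` are choices made for this example.

```python
def tree_to_rules(tree, conditions=()):
    """Yield one (conditions, label) rule per root-to-leaf path."""
    if not isinstance(tree, dict):
        # Leaf: the tests accumulated along the path form the antecedent.
        yield list(conditions), tree
        return
    # An internal node holds exactly one attribute and its branches.
    (attribute, branches), = tree.items()
    for value, subtree in branches.items():
        yield from tree_to_rules(subtree, conditions + ((attribute, value),))


# The tree from the slide, in the assumed nested-dict encoding.
tree = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

rules = list(tree_to_rules(tree))
for conds, label in rules:
    antecedent = " and ".join(f"{a} = {v}" for a, v in conds)
    print(f"{antecedent} -> playtennis = {label}")
```

The tree has five leaves, so five rules are produced, including the one shown on the slide for Outlook = Rain and Wind = Strong.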
![Page 48: CS 380: ARTIFICIAL INTELLIGENCE DECISION TREE LEARNINGsanti/teaching/2013/CS380/... · 2013-11-20 · Machine Learning Summary: • Several types of learning: • Learning from examples:](https://reader033.vdocument.in/reader033/viewer/2022042222/5ec916cd233920076327a315/html5/thumbnails/48.jpg)
Other Supervised ML Methods
• Lazy Methods
  • Instance-based Learning
  • Case-Based Reasoning
• Bayesian Learning:
  • Naïve Bayes
  • Bayesian Networks
• Regression methods (when target function is numerical)
• Neural Networks
• Boosting
• Bagging
• Support Vector Machines
• etc.