overfitting, cross-validation2005/02/02 · overfitting, cross-validation recommended reading: •...
TRANSCRIPT
![Page 1: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/1.jpg)
Overfitting, Cross-Validation
Recommended reading:
• Neural nets: Mitchell Chapter 4
• Decision trees: Mitchell Chapter 3
Machine Learning 10-701
Tom M. MitchellCarnegie Mellon University
![Page 2: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/2.jpg)
Overview
• Followup on neural networks– Example: Face classification
• Cross validation– Training error– Test error– True error
• Decision trees– ID3, C4.5– Trees and rules
![Page 3: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/3.jpg)
![Page 4: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/4.jpg)
![Page 5: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/5.jpg)
![Page 6: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/6.jpg)
# of gradient descent steps
![Page 7: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/7.jpg)
# of gradient descent steps
![Page 8: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/8.jpg)
# of gradient descent steps
![Page 9: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/9.jpg)
![Page 10: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/10.jpg)
![Page 11: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/11.jpg)
Cognitive Neuroscience Models Based on ANN’s[McClelland & Rogers, Nature 2003]
![Page 12: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/12.jpg)
![Page 13: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/13.jpg)
![Page 14: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/14.jpg)
How should we choose the number of weight updates?
![Page 15: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/15.jpg)
![Page 16: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/16.jpg)
How should we choose the number of weight updates?
How should we allocate N examples to training, validation sets?
How will curves change if we double training set size?
How will curves change if we double validation set size?
What is our best unbiased estimate of true network error?
![Page 17: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/17.jpg)
Overfitting and Cross ValidationOverfitting: a learning algorithm overfits the
training data if it outputs a hypothesis, h 2 H, when there exists h’ 2 H such that:
where
![Page 18: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/18.jpg)
Three types of errorTrue error:
Train set error:
Test set error:
![Page 19: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/19.jpg)
Bias in estimates
Gives a biased (optimistically) estimate for
Gives an unbiased estimate for
![Page 20: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/20.jpg)
Leave one out cross validationMethod for estimating true error of h’• e=0• For each training example z
– Training on {data – z}– Test on single example z; if error, then e e+1
Final error estimate (for training on sample of size |data|-1) is: e / |data|
![Page 21: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/21.jpg)
Leave one out cross validationThe leave-one-out error, e / |data|, gives an almost
unbiased estimate for
![Page 22: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/22.jpg)
Leave one out cross validationIn fact, the e / |data| estimate of leave-one-out cross
validation is a slightly pessimistic estimate of
![Page 23: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/23.jpg)
How should we choose the number of weight updates?
How should we allocate N examples to training, validation sets?
How will curves change if we double training set size?
How will curves change if we double validation set size?
What is our best unbiased estimate of true network error?
![Page 24: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning](https://reader035.vdocument.in/reader035/viewer/2022070221/6138a6ad0ad5d2067649633b/html5/thumbnails/24.jpg)
What you should know:
• Neural networks– Hidden layer representations
• Cross validation– Training error, test error, true error– Cross validation as low-bias estimator