Machine learning - HT 2016
1. Introduction
Varun Kanade
University of Oxford
January 20, 2016
What is machine learning?
Machine learning and Artificial intelligence
What does intelligence entail?
- Reasoning, planning, representation, learning
- Courses: Intelligent Systems (MT), Knowledge Representation & Reasoning (HT)
- Open AI Initiative: http://openai.com/
Outline
History of Machine Learning
This Class
Some Machine Learning Applications
Some Practical Concerns
History of Machine Learning
Statistics: Ronald Fisher
- Three types of iris: setosa, versicolour, virginica (1936)
- For each flower: sepal width ($x_1$), sepal length ($x_2$), petal width ($x_3$), petal length ($x_4$)
Visualize Iris Data: Setosa vs Versicolor
[Figure: three plots of the iris data. Plot 1: x-axis is sepal length (cm). Plot 2: x-axis is sepal width (cm). Plot 3: sepal length (cm) on the x-axis against sepal width (cm) on the y-axis.]
History of Machine Learning
Statistics: Ronald Fisher
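- Find $X = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4$ that maximizes $D^2 / S$
- $D = \sum_{i=1}^{4} w_i \left( E[x_i \mid \text{setosa}] - E[x_i \mid \text{versicolour}] \right)$
- $\mu_i = E[x_i]$
- $S = \sum_{i=1}^{4} \sum_{j=1}^{4} w_i w_j \, E[(x_i - \mu_i)(x_j - \mu_j)]$
- We will see that this is basically linear regression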
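The ratio $D^2/S$ on this slide can be explored numerically. The following is a minimal sketch, not Fisher's original computation: it assumes NumPy, uses synthetic stand-in data rather than the iris measurements, and the helper names `fisher_direction` and `ratio` are hypothetical. It relies on the standard fact that, when the covariance matrix $C$ is invertible, $w \propto C^{-1} d$ maximizes $(w \cdot d)^2 / (w^\top C w)$, where $d$ is the difference of the class means.

```python
import numpy as np

def fisher_direction(class_a, class_b):
    """Return (w, d, c): weights maximizing D^2 / S, the mean gap d, the covariance c."""
    d = class_a.mean(axis=0) - class_b.mean(axis=0)   # E[x | a] - E[x | b]
    pooled = np.vstack([class_a, class_b])
    c = np.cov(pooled, rowvar=False)                  # estimates E[(x_i - mu_i)(x_j - mu_j)]
    w = np.linalg.solve(c, d)                         # any nonzero multiple gives the same ratio
    return w, d, c

def ratio(w, d, c):
    """D^2 / S for the weight vector w."""
    return (w @ d) ** 2 / (w @ c @ w)

# Synthetic stand-in data (NOT Fisher's measurements): two clusters in 4 dimensions.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[5.0, 3.5, 1.5, 0.3], scale=0.3, size=(50, 4))
class_b = rng.normal(loc=[6.0, 2.8, 4.3, 1.3], scale=0.4, size=(50, 4))

w, d, c = fisher_direction(class_a, class_b)
print("weights:", w)
print("D^2 / S:", ratio(w, d, c))
```

Comparing `ratio(w, d, c)` against the same ratio for a few random weight vectors is a quick sanity check that no other direction does better.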
History of Machine Learning
Computer Science: Alan Turing
Turing Test: The Imitation Game
Learning Machines (Computing Machinery and Intelligence. Mind (1950))
"Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child's? If this were then subjected to an appropriate course of education one would obtain the adult brain."
Quantitative computational considerations
- How much memory would be required?
- How much computational power would be required?
History of Machine Learning
Neuroscience: Frank Rosenblatt
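- Perceptron: neurally inspired
- Simple training (learning) algorithm
- Built using specialized hardware
[Figure: perceptron diagram with inputs $1, x_1, \dots, x_4$, weights $w_0, \dots, w_4$, a node labelled ϕ, and output $\mathrm{sign}(w_0 + w_1 x_1 + \cdots + w_4 x_4)$]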
Perceptron Training Algorithm
Setting
- Get a sequence of points $(x_t, y_t)$ (where only $x_t$ is observed at first)
- After the prediction is made, $y_t$ is revealed
- Start with $w_0$, some arbitrary starting weights for the perceptron
Algorithm
1. Suppose $w_{t-1}$ are the weights after $t-1$ steps
2. Predict $\hat{y}_t = \mathrm{sign}(w_{t-1} \cdot x_t)$
3. Update:
   - If $\hat{y}_t = y_t$, do nothing
   - Else set $w_t = w_{t-1} - \eta (1 - 2 y_t) x_t$
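The update rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lecturer's code: the factor $(1 - 2 y_t)$ only makes sense for labels in {0, 1}, so the sketch assumes that encoding and maps the sign to 1 when the dot product is positive and 0 otherwise. Making several passes over the data, the default `eta=1.0`, the synthetic points, and the names `predict` and `perceptron` are all additions for illustration.

```python
import numpy as np

def predict(w, x):
    # Assumption: sign(.) is mapped to the label set {0, 1}.
    return 1 if w @ x > 0 else 0

def perceptron(points, labels, eta=1.0, max_passes=1000):
    """Run the mistake-driven update until one full pass makes no mistakes."""
    w = np.zeros(points.shape[1])              # arbitrary starting weights w_0
    for _ in range(max_passes):
        mistakes = 0
        for x, y in zip(points, labels):
            if predict(w, x) != y:
                w = w - eta * (1 - 2 * y) * x  # adds eta*x if y == 1, subtracts if y == 0
                mistakes += 1
        if mistakes == 0:
            break
    return w

# Synthetic, linearly separable points in [-8, 8]^2 with a margin around x1 + x2 = 0.
rng = np.random.default_rng(0)
raw = rng.uniform(-8, 8, size=(200, 2))
raw = raw[np.abs(raw.sum(axis=1)) > 1.0]
points = np.hstack([np.ones((len(raw), 1)), raw])   # constant 1 input carries the bias weight
labels = (raw.sum(axis=1) > 0).astype(int)

w = perceptron(points, labels)
print("learned weights:", w)
```

Because the synthetic data is separable with a margin, the classical mistake bound guarantees that the loop stops after finitely many updates; on non-separable data it would simply run until `max_passes`.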
Perceptron Training Algorithm in Action
[Figure: sequence of 10 plots stepping through the algorithm; both axes run from −8 to 8]
What is machine learning?
Some Definitions
- Kevin Murphy: "..., we define machine learning as a set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty ..."
- Tom Mitchell: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
- The same (or similar) programs work across a range of learning tasks (though not universally).
Machine Learning
- Intersection of computer science, statistics, neuroscience/biology, engineering, optimization, etc.
- Statistics
  - How much data is needed?
  - When can we be confident in our predictions?
- Computer Science
  - Design algorithms for automated pattern discovery. How fast do these run?
  - How much computational power is needed?
About this course
- Pre-requisites: Basic linear algebra, calculus, probability, algorithms, programming.
- Mathematical foundations of ML; not computational or statistical learning theory
- Regression, support vector machines, neural networks, deep learning, clustering [video]
- Conceptual programming assignments; not scaling to real-world systems
About this course
- Discussion forum on Piazza (link on webpage)
- Classes in Weeks 3-7 (Mon, Wed, Fri - 6 groups)
- Practicals in Weeks 2-8 (Tue, Thu - 2 groups)
- Final examination over Easter break
- Office Hours: Tue 15:30-16:30 (449 Wolfson Building)
Application: Boston Housing Dataset
Real attributes
- Crime rate per capita
- Non-retail business fraction
- Nitric Oxide concentration
- Age of house
- Floor area
- Distance to city centre
Integer attributes
- Number of rooms
Categorical attributes
- On the Charles river?
- Index of highway access (1-5)
Predict house cost
Source: UCI repository
Application: Breast Cancer
Integer attributes
- Clump thickness
- Uniformity of cell size
- Uniformity of cell shape
- Marginal adhesion
- Single epithelial cell size
- Bare nuclei
- Bland Chromatin
- Normal nucleoli
- Mitoses
Predict: Benign vs Malignant
Source: UCI repository
Application: Object Detection and Localization
- 200 basic-level categories
- Dataset contains over 400,000 images
- ImageNet competition (2010-15)
Application: Object Detection and Localization
Source: DeepLearning.net (top); Brain-Maps.com (bottom)
Application: Object Detection and Localization
Source: Zeiler and Fergus (2013)
Supervised Learning
- Training data has inputs (x) as well as outputs (y)
- Regression: When the output is real-valued, e.g., Housing data
- Classification: Output is a category
  - Binary classification: only two classes, e.g., Cancer, spam
  - Multi-class classification: several classes, e.g., Object detection
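To make the distinction between regression and classification concrete, here is a minimal sketch on synthetic one-dimensional data, assuming NumPy. The least-squares fit and the nearest-class-mean rule are stand-ins chosen for brevity, not methods prescribed by the slides, and the name `classify` is hypothetical. The inputs are identical in both cases; only the type of the output changes.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)                  # one input feature per example
features = np.column_stack([np.ones_like(x), x])  # add a constant column for the intercept

# Regression: the output is a real number (synthetic: y = 3 + 2x + noise).
y_real = 3.0 + 2.0 * x + rng.normal(0, 0.1, size=100)
coef, *_ = np.linalg.lstsq(features, y_real, rcond=None)
print("fitted intercept and slope:", coef)

# Binary classification: the output is one of two categories (synthetic: is x above 5?).
y_class = (x > 5).astype(int)
class_means = np.array([x[y_class == k].mean() for k in (0, 1)])

def classify(value):
    """Assign the class whose training mean is closest to the input."""
    return int(np.argmin(np.abs(class_means - value)))

print("class for x = 1.0:", classify(1.0))
print("class for x = 9.0:", classify(9.0))
```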
Unsupervised Learning : Grouping News Articles
- Group items into categories: sports, music, business, etc.
- Labels are not known
- Algorithm cannot know "label names"
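The point that a clustering algorithm cannot know "label names" shows up directly in code: it returns arbitrary integer group ids, never words like "sports". Below is a minimal sketch using k-means (Lloyd's algorithm) with a farthest-first initialization, assuming NumPy. The slides do not say which clustering method is meant, and the 2-D synthetic points merely stand in for article feature vectors; the function name `kmeans` and its parameters are illustrative.

```python
import numpy as np

def kmeans(data, k, iters=20):
    """Lloyd's algorithm with farthest-first initialization; returns (assignments, centroids)."""
    centroids = [data[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen centroid.
        nearest = np.min([np.linalg.norm(data - c, axis=1) for c in centroids], axis=0)
        centroids.append(data[np.argmax(nearest)])
    centroids = np.array(centroids, dtype=float)

    for _ in range(iters):
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)              # index of the closest centroid
        for j in range(k):
            if np.any(assign == j):                # keep the old centroid if a cluster empties
                centroids[j] = data[assign == j].mean(axis=0)
    return assign, centroids

# Synthetic stand-in for article features: two well-separated groups of 50 points.
rng = np.random.default_rng(0)
group_one = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_two = rng.normal(loc=[10.0, 10.0], scale=0.5, size=(50, 2))
data = np.vstack([group_one, group_two])

assign, centroids = kmeans(data, k=2)
print("cluster ids found:", sorted(set(assign.tolist())))
```

Which group receives id 0 and which receives id 1 depends only on the initialization; attaching a human-readable name to each group is left to a person inspecting the clusters.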
Unsupervised Learning : Genetic Data of European Populations
Source: Novembre et al., Nature (2008)
Active and Semi-Supervised Learning
Active Learning
- Data is unlabelled
- Learning algorithm can ask for a label (from a human)
Semi-supervised Learning
- Some data is labelled, a lot more unlabelled
- Can using the two together help?
Anomaly Detection or One-class Classification
Examples
- Detect possible malfunction at nuclear reactors
- Detect fraudulent transactions for credit cards
- Supervised learning vs anomaly detection
- Anomalous events much rarer, possibly not related to each other
Recommendation Systems
| Movie / User | Alice | Bob | Charlie | Dean | Eve |
|---|---|---|---|---|---|
| The Shawshank Redemption | 7 | 9 | 9 | 5 | 2 |
| The Godfather | 3 | ? | 10 | 4 | 3 |
| The Dark Knight | 5 | 9 | ? | 6 | ? |
| Pulp Fiction | ? | 5 | ? | ? | 10 |
| Schindler’s List | ? | 6 | ? | 9 | ? |

- Netflix competition to predict user ratings (2008-09)
- Applications to all kinds of product recommendations
- No user will have tried more than a small fraction of the products; take advantage of the large number of users
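The ratings table can be represented as a matrix with missing entries, and each "?" filled in with a prediction. The sketch below is a deliberately naive baseline, not the approach used in the Netflix competition: it assumes NumPy and predicts each missing rating as the average of that movie's mean rating and that user's mean rating. The variable names are illustrative.

```python
import numpy as np

users = ["Alice", "Bob", "Charlie", "Dean", "Eve"]
movies = ["The Shawshank Redemption", "The Godfather", "The Dark Knight",
          "Pulp Fiction", "Schindler's List"]

# Rows are movies, columns are users; np.nan marks the "?" entries of the table.
ratings = np.array([
    [7,      9,      9,      5,      2],
    [3,      np.nan, 10,     4,      3],
    [5,      9,      np.nan, 6,      np.nan],
    [np.nan, 5,      np.nan, np.nan, 10],
    [np.nan, 6,      np.nan, 9,      np.nan],
])

movie_mean = np.nanmean(ratings, axis=1)   # mean known rating per movie
user_mean = np.nanmean(ratings, axis=0)    # mean known rating per user

# Naive baseline: average the movie's mean and the user's mean.
baseline = (movie_mean[:, None] + user_mean[None, :]) / 2
filled = np.where(np.isnan(ratings), baseline, ratings)

row, col = movies.index("The Godfather"), users.index("Bob")
print("predicted rating, Bob on The Godfather:", filled[row, col])
```

Every known rating contributes to a movie mean and a user mean, which is one simple way a system can "take advantage of the large number of users" even when each individual column is mostly empty.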
Reinforcement Learning
- Automatic flying helicopter; self-driving cars
- Cannot program by hand
- Stochastic environment (hard to define precisely)
- Must take sequential decisions
- Can define reward functions
- Fun: Playing Atari breakout! [video]
Cleaning up data
Spam Classification
- Look for words such as Nigeria, millions, Viagra, etc.
- Features such as the IP address and other metadata
- Whether the email is addressed by name
Getting Features
- Often hand-crafted features by domain experts
- This class mainly assumes we already have features
- Feature learning using deep networks
Some pitfalls
Sample Email
"To build a spam classifier, we look for words such as Nigeria, millions, etc."
Training vs Test Data
- Future data should look like past data
- Not true for spam classification
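The sample email illustrates the pitfall: it is a legitimate message that nevertheless contains the words a keyword rule looks for. The sketch below is a toy rule using only Python's standard library; the word list comes from the examples in these slides, while the threshold of two hits and the function names are hypothetical choices made for illustration.

```python
import re

# Keywords taken from the examples in the slides; a real filter would use many more features.
SPAM_WORDS = {"nigeria", "millions", "viagra"}

def keyword_hits(text):
    """Return the set of spam keywords that occur in the text (case-insensitive)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return words & SPAM_WORDS

def looks_like_spam(text, threshold=2):
    # Hypothetical rule: flag the message if at least `threshold` keywords occur.
    return len(keyword_hits(text)) >= threshold

sample = ("To build a spam classifier, we look for words such as "
          "Nigeria, millions, etc.")

print("keywords found:", sorted(keyword_hits(sample)))
print("flagged as spam:", looks_like_spam(sample))
```

A rule tuned on past spam can misfire on new, legitimate mail like this, and spammers can change their wording once the rule is known, which is why future data does not look like past data in this setting.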
Cats vs Dogs
Next Class
Linear Regression
- Brush up your linear algebra and calculus!