![Page 1: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/1.jpg)
Introduction to Machine Learning and Data Mining
Advanced Information Systems and Business Analytics for Air Transportation
M.Sc. Air Transport Management
May 16-21, 2016
Slides prepared by Prof. N. Kemal Üre
![Page 2: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/2.jpg)
A Framework for Business Analytics
![Page 3: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/3.jpg)
What is Machine Learning?
Study of algorithms that can learn and make predictions from data
Data → Model → Prediction
• Also referred to as predictive modeling or predictive analytics
• Strong ties with statistics, computer science and optimization
• A wide range of applications: spam filtering, optical character recognition (OCR), search engines and computer vision
![Page 4: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/4.jpg)
What is Machine Learning?
• How is Machine Learning (ML) different from Data Mining and Statistics?
• Statistics
– Sub-field of mathematics
– Inference of probabilistic models
– The main objective is understanding the underlying data generation process
• Data Mining (DM)
– Carried out by a person, using methods from statistics and ML
– Usually works with massive datasets with problematic entries
– Gains preliminary insight and makes predictions
![Page 5: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/5.jpg)
ML/DM Process
Source: Kantardzic
![Page 6: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/6.jpg)
ML/DM Process
Source: Kantardzic
![Page 7: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/7.jpg)
Types of Data
![Page 8: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/8.jpg)
Data Preparation
• Transformations
– Normalization
• Decimal Scaling
• Min-max normalization
• Standard Deviation Normalization
– Smoothing
Source: Kantardzic
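The three normalization transformations listed above can be sketched in a few lines of Python. This is only an illustration under common textbook definitions; the function names and the sample `delays` values are hypothetical and do not come from the slides.

```python
import math

def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every |v| below 1."""
    max_abs = max(abs(v) for v in values)
    k = 0
    while max_abs / (10 ** k) >= 1:
        k += 1
    return [v / (10 ** k) for v in values]

def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly map the observed [min, max] range onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]

def z_score(values):
    """Standard deviation normalization: subtract the mean, divide by the std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in values]

# Hypothetical flight delays in minutes, used only for illustration.
delays = [5, 20, 45, 120, 310]
print(decimal_scaling(delays))
print(min_max(delays))
print(z_score(delays))
```

Min-max normalization is sensitive to extreme values, since a single large entry compresses all others toward the lower bound; standard deviation normalization is usually less affected.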
![Page 9: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/9.jpg)
Data Preparation
• Missing Data
• Time Dependent Data
Source: Kantardzic
![Page 10: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/10.jpg)
Data Preparation
• Outliers
Source: Kantardzic
![Page 11: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/11.jpg)
Primary ML/DM Problems
• Supervised Learning
– Data is labeled <x_i,y_i>
– Learn the association between x and y
• Unsupervised Learning
– Data is unlabeled, we only have x_i
– Learn the structure and patterns in x
• Reinforcement Learning
– Learn how to "control" a dynamic system
![Page 12: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/12.jpg)
Supervised Learning
• Classification
• Regression
![Page 13: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/13.jpg)
Classification
• Predict the class of the input variable
• Function approximation approach y = f(x)
• Probabilistic approach P(y|x)
Source: Murphy 2011
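The difference between the two approaches can be illustrated with a tiny logistic model. This is a minimal sketch: the feature (inbound delay in minutes), the weight, and the bias are hand-picked hypothetical values, not parameters learned from data or taken from the slides.

```python
import math

# Hypothetical, hand-picked parameters (not learned from data).
WEIGHT = 0.08   # contribution of each minute of inbound delay
BIAS = -2.0

def prob_delayed(inbound_delay_min):
    """Probabilistic approach: return P(y = 'delayed' | x)."""
    score = WEIGHT * inbound_delay_min + BIAS
    return 1.0 / (1.0 + math.exp(-score))

def classify(inbound_delay_min, threshold=0.5):
    """Function approximation approach: return a hard label y = f(x)."""
    if prob_delayed(inbound_delay_min) >= threshold:
        return "delayed"
    return "on time"

for x in (10, 60):
    print(x, round(prob_delayed(x), 3), classify(x))
```

The probabilistic output carries a confidence level that the hard label discards, which matters when the costs of the two kinds of error differ.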
![Page 14: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/14.jpg)
Classification Examples
Document Classification, Spam Filtering, Hand-written Digit Recognition
![Page 15: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/15.jpg)
Classification Examples
Face Detection
Credit Risk Calculation
![Page 16: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/16.jpg)
Classification for Delay Prediction
Source: Rebollo, Balakrishnan 2014
![Page 17: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/17.jpg)
Regression
• Like classification, but the predicted variable is continuous
• Curve fitting and model selection
![Page 18: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/18.jpg)
Regression
![Page 19: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/19.jpg)
Regression
![Page 20: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/20.jpg)
Regression
![Page 21: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/21.jpg)
Regression
![Page 22: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/22.jpg)
Regression
Beware of the noise in the data!
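Curve fitting, model selection, and the warning about noise can be tied together in one small experiment. The sketch below fits polynomials of several degrees by least squares and compares training error with error on held-out points. The data-generating curve, the noise level, the candidate degrees, and all function names are assumptions made for illustration.

```python
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations (A^T A) w = A^T y."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    m = [row + [b] for row, b in zip(ata, aty)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w

def predict(w, x):
    return sum(c * x ** i for i, c in enumerate(w))

def mse(w, xs, ys):
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical noisy data drawn from a quadratic curve.
random.seed(0)
xs = [i / 10 for i in range(-10, 11)]
ys = [1.0 + 2.0 * x - 3.0 * x ** 2 + random.gauss(0, 0.2) for x in xs]
train_x, train_y = xs[::2], ys[::2]      # used for fitting
val_x, val_y = xs[1::2], ys[1::2]        # held out for model selection

errors = {}
for degree in (1, 2, 8):
    w = fit_polynomial(train_x, train_y, degree)
    errors[degree] = (mse(w, train_x, train_y), mse(w, val_x, val_y))
    print(degree, errors[degree])

# Model selection: pick the degree with the lowest held-out error.
best_degree = min(errors, key=lambda d: errors[d][1])
print("selected degree:", best_degree)
```

Training error can only go down as the degree grows, because a high-degree polynomial also fits the noise. Selecting the model on held-out error guards against that.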
![Page 23: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/23.jpg)
Regression Examples
• Predict tomorrow’s stock market price given current market conditions and other possible side information.
• Predict the age of a viewer watching a given video on YouTube.
• Predict the location in 3D space of a robot arm end effector, given control signals (torques) sent to its various motors.
• Predict the amount of prostate specific antigen (PSA) in the body as a function of a number of different clinical measurements.
• Predict the temperature at any location inside a building using weather data, time, door sensors, etc.
Source: Murphy 2011
![Page 24: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/24.jpg)
Regression for Predicting Ticket Prices
Source: Gini 2011
![Page 25: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/25.jpg)
Unsupervised Learning
• Clustering
• Learning Graphs
• Matrix Completion
![Page 26: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/26.jpg)
Clustering
• Segment the data into different groups
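One common way to segment data into groups is k-means (Lloyd's algorithm). The slides do not name a specific clustering algorithm, so k-means is an assumption here, and the points and starting centroids are hypothetical delivery coordinates invented for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical delivery locations (x, y) and initial centroid guesses.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
centroids, clusters = kmeans(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

The result depends on the number of clusters and on the initial centroids, both of which the analyst must choose.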
![Page 27: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/27.jpg)
Clustering Examples
Astronomy, Social Networks
![Page 28: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/28.jpg)
Clustering for Delivery Network
![Page 29: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/29.jpg)
Clustering for Delivery Network
![Page 30: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/30.jpg)
Clustering for Delivery Network
![Page 31: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/31.jpg)
Clustering for Delivery Network
![Page 32: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/32.jpg)
Clustering for Delivery Network
![Page 33: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/33.jpg)
Clustering for Delivery Network
![Page 34: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/34.jpg)
Clustering for Delivery Network
![Page 35: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/35.jpg)
Clustering for Delivery Network
![Page 36: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/36.jpg)
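Bayesian Networks
Nodes: smart, study, prepared, fair, pass
p(smart) = .8, p(study) = .6, p(fair) = .9

p(prep | smart, study):
| | smart | ¬smart |
|---|---|---|
| study | .9 | .7 |
| ¬study | .5 | .1 |

p(pass | smart, prep, fair):
| | smart, prep | smart, ¬prep | ¬smart, prep | ¬smart, ¬prep |
|---|---|---|---|---|
| fair | .9 | .7 | .7 | .2 |
| ¬fair | .1 | .1 | .1 | .1 |

Query: What is the probability that a student is smart, given that they pass the exam?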
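The query can be answered by enumeration: sum the joint probability over every assignment consistent with pass = true, then take the share in which smart = true. The sketch below assumes that the second row or column of each table on the slide is the negated variable (the negation marks appear to have been lost in extraction). The function and variable names are illustrative.

```python
from itertools import product

P_SMART, P_STUDY, P_FAIR = 0.8, 0.6, 0.9

# P(prepared = True | smart, study), read from the slide's first table.
P_PREP = {
    (True, True): 0.9, (False, True): 0.7,
    (True, False): 0.5, (False, False): 0.1,
}

# P(pass = True | smart, prepared, fair), read from the slide's second table.
P_PASS = {
    (True, True, True): 0.9, (True, False, True): 0.7,
    (False, True, True): 0.7, (False, False, True): 0.2,
    (True, True, False): 0.1, (True, False, False): 0.1,
    (False, True, False): 0.1, (False, False, False): 0.1,
}

def bernoulli(p_true, value):
    """Probability of a binary variable taking `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(smart, study, prep, fair, passed):
    """Joint probability, factored according to the network structure."""
    return (bernoulli(P_SMART, smart)
            * bernoulli(P_STUDY, study)
            * bernoulli(P_FAIR, fair)
            * bernoulli(P_PREP[(smart, study)], prep)
            * bernoulli(P_PASS[(smart, prep, fair)], passed))

def prob_smart_given_pass():
    numerator = denominator = 0.0
    for smart, study, prep, fair in product([True, False], repeat=4):
        p = joint(smart, study, prep, fair, True)
        denominator += p          # accumulates P(pass)
        if smart:
            numerator += p        # accumulates P(smart, pass)
    return numerator / denominator

print(round(prob_smart_given_pass(), 3))  # about 0.886 under the table reading above
```

Under this reading, observing a pass raises the belief that the student is smart from the prior of .8 to roughly .886.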
![Page 37: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/37.jpg)
Bayesian Networks
“Asia” network (nodes): Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Bronchitis, Abnormality in Chest, X-Ray, Dyspnea
![Page 38: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/38.jpg)
BN Application: Fare Value and Passenger Behavior
What is the expected fare value for a specific passenger behavior?
Can predictive models be developed for reservation changes and no-show rates for individual passengers on individual itineraries?
Source: Booz Allen
![Page 39: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/39.jpg)
Matrix Completion
Source: Murphy 2011
![Page 40: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/40.jpg)
MC for Image Recovery
Source: Murphy 2011
![Page 41: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/41.jpg)
MC for Product Recommendation
• Filtering: Given my purchase history, what is my next likely purchase?
• Collaborative Filtering: Given the purchase history of customers similar to me, what is my next likely purchase?
Source: Murphy 2011
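The collaborative filtering question can be sketched with a user-based approach: score each item I have not bought by the similarity of the customers who did buy it. Cosine similarity is one common choice and is an assumption here. The customers, items, and purchase matrix are hypothetical.

```python
import math

# Hypothetical purchase history: 1 = bought, 0 = not bought.
ITEMS = ["seat upgrade", "extra baggage", "lounge access", "wifi"]
PURCHASES = {
    "me":    [1, 1, 0, 0],
    "alice": [1, 1, 1, 0],
    "bob":   [0, 0, 0, 1],
    "carol": [1, 0, 1, 0],
}

def cosine(u, v):
    """Cosine similarity between two purchase vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(user):
    """Return the unbought item with the highest similarity-weighted score."""
    mine = PURCHASES[user]
    scores = [0.0] * len(ITEMS)
    for other, theirs in PURCHASES.items():
        if other == user:
            continue
        similarity = cosine(mine, theirs)
        for i, bought in enumerate(theirs):
            if bought and not mine[i]:
                scores[i] += similarity
    best = max(range(len(ITEMS)), key=lambda i: scores[i])
    return ITEMS[best] if scores[best] > 0 else None

print(recommend("me"))
```

A customer with no overlap with anyone else gets no recommendation from this scheme, which is a small-scale instance of the data sparsity problem.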
![Page 42: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/42.jpg)
Collaborative Filtering Challenges
• Data Sparsity
• Scalability
• Synonymy
• Gray Sheep
• Attacks
![Page 43: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/43.jpg)
Beyond the User-Item Matrix
Source: Shi 2014
![Page 44: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/44.jpg)
Beyond the User-Item Matrix
Source: Shi 2014
![Page 45: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/45.jpg)
Product Recommendation System For Airlines
Source: Barth 2014
![Page 46: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/46.jpg)
Reinforcement Learning
![Page 47: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/47.jpg)
Maze Exploration
Source: Geramifard 2011
![Page 48: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/48.jpg)
RL Application - Maintenance Optimization
• A machine/component degradation model
• Maintenance costs money but restores the machine to its original state
• If not maintained, the machine eventually breaks down
• At which degradation state is it optimal to repair the machine?
Source: Bertsekas 2006
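When the degradation model is known, the repair question can be answered with value iteration over a small Markov decision process. The sketch below is a planning example rather than a learning one: a reinforcement learning method would have to estimate the same values from observed transitions. The number of wear levels, degradation probability, costs, and discount factor are all hypothetical.

```python
# Hypothetical model parameters, chosen only for illustration.
N_STATES = 5            # wear levels 0 (new) .. 4 (broken)
P_DEGRADE = 0.3         # chance of moving to the next wear level per step
REPAIR_COST = 4.0       # planned maintenance, restores wear level 0
BREAKDOWN_COST = 20.0   # forced replacement once the machine is broken
GAMMA = 0.9             # discount factor

def q_values(s, V):
    """Expected discounted cost of each available action at wear level s."""
    if s == N_STATES - 1:
        # Broken machine: replacement is the only option.
        return {"repair": BREAKDOWN_COST + GAMMA * V[0]}
    wait = GAMMA * (P_DEGRADE * V[s + 1] + (1 - P_DEGRADE) * V[s])
    repair = REPAIR_COST + GAMMA * V[0]
    return {"wait": wait, "repair": repair}

def value_iteration(tol=1e-9):
    """Repeat the Bellman update until the cost-to-go values stop changing."""
    V = [0.0] * N_STATES
    while True:
        new_V = [min(q_values(s, V).values()) for s in range(N_STATES)]
        if max(abs(a - b) for a, b in zip(V, new_V)) < tol:
            return new_V
        V = new_V

def greedy_action(s, V):
    q = q_values(s, V)
    return min(q, key=q.get)

V = value_iteration()
policy = [greedy_action(s, V) for s in range(N_STATES)]
repair_from = policy.index("repair")   # first wear level where repair is chosen
print(policy, repair_from)
```

The resulting policy has a threshold form: wait while wear is low, repair once it reaches a certain level. Where that threshold lies depends on the ratio of repair cost to breakdown cost.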
![Page 49: Introduction to Machine Learning and Data Miningaviation.itu.edu.tr/img/aviation/datafiles/Lecture... · May 16-21, 2016 Slides prepared by Prof. N. Kemal Üre. A Framework for Business](https://reader034.vdocument.in/reader034/viewer/2022050200/5f53f586869cb93b844cc9c1/html5/thumbnails/49.jpg)
RL Application – Active Web Advertising
Source: Silver 2013