cpsc340 - home | computer science at ubcnando/340-2012/lectures/l1.pdf• avatar movie reported to...
TRANSCRIPT
![Page 1: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/1.jpg)
![Page 2: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/2.jpg)
CPSC340
Nando de FreitasSeptember, 2012University of British Columbia
Machine Learning & Data MiningMachine Learning & Data MiningMachine Learning & Data MiningMachine Learning & Data Mining
![Page 3: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/3.jpg)
Outline of the lectureThis lecture provides an introduction to the course. It covers the following four areas:
1. Definitions of machine learning and data mining2. The big data phenomenon3. Drawing inspiration from neural systems3. Drawing inspiration from neural systems4. Machine learning applicationsand impact
The intent of the lecture is not to explain details of building ML systems, or to tell you what to study for the exam. Rather it is an overview of what can be accomplished with ML. If it inspires you, then you’ll have to take the course and learna lot of cool math in the process!
![Page 4: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/4.jpg)
“tufa”
“tufa”
“tufa”
Can you pick out the tufas?
Josh Tenenbaum
![Page 5: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/5.jpg)
Learning``Learning denotes changes in the system that are adaptive in the sense
that they enable the system to do the task or tasks drawn from the same population more efficiently and more effectively the next time.''
Herbert Simon
Environment
ActionPercept
Agent
![Page 6: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/6.jpg)
Machine learning deals with the problem of extracting features from data so as to solve many different predictive tasks:
�Forecasting (e.g. Energy demand prediction, sales)�Imputing missing data (e.g. Netflix recommendations)�Detecting anomalies (e.g. Intruders, virus mutations) �Classifying (e.g. Credit risk assessment, cancer diagnosis)
Machine learning
�Classifying (e.g. Credit risk assessment, cancer diagnosis)�Ranking (e.g. Google search, personalization)�Summarizing (e.g. News zeitgeist, social media sentiment)�Decision making (e.g. AI, robotics, compiler tuning, trading ) …
![Page 7: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/7.jpg)
When to apply machine learning
�Human expertise is absent (e.g. Navigating on Mars)
�Humans are unable to explain their expertise (e.g. Speech recognition, vision, language)
�Solution changes with time (e.g. Tracking, �Solution changes with time (e.g. Tracking, temperature control, preferences)
�Solution needs to be adapted to particular cases (e.g. Biometrics, personalization)
�The problem size is to vast for our limited reasoning capabilities (e.g. Calculating webpage ranks, matching ads to facebookpages)
![Page 8: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/8.jpg)
• Library of Congress text database of ~20 TB
• AT&T 323 TB, 1.9 trillion phone call records.
• World of Warcraft utilizes 1.3 PB of storage to maintain its game.
• Avatar movie reported to have taken over 1 PB of local storage at WetaDigital for the rendering of the
Big Data!
local storage at WetaDigital for the rendering of the 3D CGI effects.
• Google processes ~24 PB of data per day.
• YouTube: 24 hours of video uploaded every minute. More video is uploaded in 60 days than all 3 major US networks created in 60 years. According to cisco, internet video will generate over 18 EB of traffic per month in 2013.
![Page 9: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/9.jpg)
Machine learning in language
Success stories:
“Large” text dataset:
• 1,000,000 words in 1967• 1,000,000,000,000 words in 2006
• Speech recognition• Machine translation
What is the common thing that makes both of these work well?
• Lots of labeled data• Memorization is a good policy
[Halevy, Norvig & Pereira, 2009]
![Page 10: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/10.jpg)
Scene completion: More data is better
[Efros, 2008]
Given an input image with a missing region, Efros uses matching scenes from a large collection of photographs to complete the image
![Page 11: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/11.jpg)
The semantic challenge
� “We’ve already solved the sociological problem of building a network infrastructure that has encouraged hundreds of millions of authors to share a trillion pages of content.
� We’ve solved the technological problem of aggregating and indexing all this content.
� But we’re left with a scientific problem of interpreting the content”
[Halevy, Norvig & Pereira, 2009]
� It’s not only about how big your data is. It is about understanding it and using this understanding to derive reasonable inferences. Think of citation matching.
![Page 12: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/12.jpg)
A source of inspiration
![Page 13: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/13.jpg)
Selectivity and Topographic maps in V1
![Page 14: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/14.jpg)
The x and y coordinates correspond to the spatial location of a rat.The x and y coordinates correspond to the spatial location of a rat.The x and y coordinates correspond to the spatial location of a rat.The x and y coordinates correspond to the spatial location of a rat.
The red dots indicate the place where a particular neuron fires.The red dots indicate the place where a particular neuron fires.The red dots indicate the place where a particular neuron fires.The red dots indicate the place where a particular neuron fires.
[Hafting et al 2005]
![Page 15: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/15.jpg)
Associative memory
[Jain, Mao & Mohiuddin, 1996]
Example 2: Say the alphabet, …. backward
![Page 16: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/16.jpg)
Neural network: A distributed representation
Hidden units
1 10 0 0
Learnedfeatures
Feature vector
4x4 image patch
features
Insight: We’re assuming edges occur often in nature, but dots don’tWe learn the regular structures in the world
![Page 17: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/17.jpg)
Hidden units
1 10 0 0Feature vector …
image patch
![Page 18: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/18.jpg)
Deep learning with autoencoders
[Russ Salakhutdinov, Geoff Hinton, Yann Lecun, Yoshua Bengio, Andrew Ng, …]
![Page 19: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/19.jpg)
Validating Unsupervised Learningn
euro
n resp
on
ses
1st stage 2nd stage 3rd stage
neu
ron
respo
nses
Ranzato
![Page 20: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/20.jpg)
Top Images For Best Face Neuron
Ranzato
![Page 21: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/21.jpg)
Best Input For Face Neuron
Ranzato
![Page 22: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/22.jpg)
Hierarchical spatial-temporal feature learning
Observed gaze sequence
Model predictions
[Bo Chen et al 2010]
![Page 23: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/23.jpg)
Application: Invariant recognition in natural images
23
[Thomas Serre 2012]
![Page 24: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/24.jpg)
![Page 25: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/25.jpg)
Computer vision successes
[Thomas Serre 2012]
![Page 26: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/26.jpg)
Millions of labeled examples are used to build real-world applications, such as pedestrian detection
[Tomas Serre]
![Page 27: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/27.jpg)
Application: Autonomous driving
Mobileye: Already available on Volvo S60 and soon on most car manufacturers
![Page 28: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/28.jpg)
Application: Information Extraction
[Yoshua Bengio, Jason Weston, Richard Socher]
![Page 29: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/29.jpg)
Application: Speech recognition
[George Dahl et al 2011]
![Page 30: CPSC340 - Home | Computer Science at UBCnando/340-2012/lectures/l1.pdf• Avatar movie reported to have taken over 1 PB of local storage at Weta Digital for the rendering of the Big](https://reader033.vdocument.in/reader033/viewer/2022042916/5f5617033a49d113e1074e64/html5/thumbnails/30.jpg)
Next lecture
In the following lecture we will begin to learn the probabilistic tools we need to understand machine learning andinnovate algorithms, models and applications.