Lecture 1.1: Introduction
CSC 84020 - Machine Learning

Andrew Rosenberg

January 29, 2010


Today

Introductions and Class Mechanics.


Background about me

Me: Graduated from Columbia in 2009.

Research: Speech and Natural Language Processing (Computational Linguistics), specifically analyzing the intonation of speech. Written papers on evaluation measures.

All of my research has relied heavily on Machine Learning.


Background about you

You: Why are you taking this class?

What is your background in and comfort with:
Calculus
Linear Algebra
Probability and Statistics

What do you hope to get from this class?


Why does anyone care about Machine Learning?


What IS Machine Learning

Automatically identifying patterns in data.
Automatically making decisions based on data.


Major Tasks of Machine Learning

Major Tasks

Classification
Regression
Clustering


Classification

Identify which of N classes a data point belongs to.

x is a feature vector based on some entity x.

$\mathbf{x} = \begin{pmatrix} f_0(x) \\ f_1(x) \\ \vdots \\ f_{n-1}(x) \end{pmatrix}$

Also, sometimes,

$\mathbf{x} = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}$
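
To make this concrete, here is a small illustrative sketch (not from the slides) of building a feature vector from a few hypothetical feature functions f_0, f_1, f_2 defined over a string-valued entity:

```python
import numpy as np

# Hypothetical feature functions for an entity x (here, a string).
# These are illustrative assumptions, not functions defined in the lecture.
def f0(x):
    return len(x)               # length of the string

def f1(x):
    return x.count("a")         # number of 'a' characters

def f2(x):
    return float(x.islower())   # 1.0 if there is no uppercase letter

feature_functions = [f0, f1, f2]

def feature_vector(x):
    """Map an entity x to its feature vector (f_0(x), ..., f_{n-1}(x))."""
    return np.array([f(x) for f in feature_functions])

print(feature_vector("machine learning"))   # e.g. array([16., 2., 1.])
```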


Target Values

In supervised approaches, in addition to the data point x, we will also have some target value t.

In classification, t represents the class of the data point.

Goal of classification: identify a function y, such that y(x) = t.
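
As an illustration of what a learned y might look like, here is a minimal sketch assuming a nearest-centroid rule (the specific rule is an assumption for this example, not a method named in the lecture):

```python
import numpy as np

def fit_nearest_centroid(X, t):
    """Learn y from labeled data: store the mean feature vector of each class."""
    classes = np.unique(t)
    return {c: X[t == c].mean(axis=0) for c in classes}

def y(x, centroids):
    """Predict the class of x as the label of the closest class centroid."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

# Toy data: two classes in a 2-dimensional feature space.
X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]])
t = np.array([0, 0, 1, 1])

centroids = fit_nearest_centroid(X, t)
print(y(np.array([0.1, 0.0]), centroids))   # expected: 0
print(y(np.array([1.0, 1.0]), centroids))   # expected: 1
```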

Graphical Example of Classification

[Several figure-only slides building up a graphical example of classification; the images are not preserved in this transcript.]


Regression

Regression is another supervised machine learning task.

In classification, t was a discrete variable representing the class of the data point; in regression, t is a continuous variable.

Goal of regression: identify a function y, such that y(x) = t.

If the goals of regression and classification are the same, what is the difference?

Evaluation.
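
For a concrete picture (illustration only, not from the slides), here is a minimal sketch that fits a linear y(x) to continuous targets by least squares:

```python
import numpy as np

# Toy data: t is a noisy linear function of a single feature x.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
t = 2.0 * x + 0.5 + rng.normal(scale=0.1, size=x.shape)

# Fit y(x) = w1 * x + w0 by least squares.
X = np.column_stack([x, np.ones_like(x)])    # design matrix with a bias column
w, *_ = np.linalg.lstsq(X, t, rcond=None)

def y(x_new):
    """The learned regression function: predicts a continuous target."""
    return w[0] * x_new + w[1]

print(w)          # roughly [2.0, 0.5]
print(y(0.25))    # roughly 1.0
```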

Graphical Example of Regression

[Several figure-only slides building up a graphical example of regression; the images are not preserved in this transcript.]


Clustering


Clustering is an unsupervised task.

Therefore we have no “target” information to learn.

Rather, the goal is to identify groups of similar data points that are dissimilar from other groups.

Technically, identify a partition of the data satisfying these two constraints:

1 Points in the same cluster should be similar
2 Points in different clusters should be dissimilar

Now the tricky part: Define “similar”.
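
As one illustrative answer (the lecture does not fix an algorithm or a similarity measure here), a minimal k-means-style sketch that takes “similar” to mean “close in Euclidean distance”:

```python
import numpy as np

def kmeans(X, k, n_iters=10, seed=0):
    """Partition X into k clusters; similarity = small Euclidean distance."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two obvious groups of 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [2.0, 2.0], [2.1, 1.9], [1.9, 2.1]])
labels, centers = kmeans(X, k=2)
print(labels)   # e.g. [0 0 0 1 1 1] (cluster ids may be swapped)
```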

Graphical Example of Clustering

[Several figure-only slides building up a graphical example of clustering; the images are not preserved in this transcript.]

How do we do this?


Mechanisms of Machine Learning.

Feature Extraction
Statistical Estimation

Mathematical Underpinnings


What Math will we use?

Probability and Statistics
Calculus
Linear Algebra

Why do we need such complicated math?


How much math?

A lot. One common function we will use is the Gaussian distribution.

$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right)$

We will be differentiating and integrating over this function.
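
As a quick numerical sanity check (not part of the slides), the density can be evaluated and crudely integrated on a grid:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density N(x | mu, sigma^2)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

xs = np.linspace(-10, 10, 10001)
dx = xs[1] - xs[0]
print(gaussian_pdf(0.0, mu=0.0, sigma2=1.0))        # ~0.3989, the peak of the standard normal
print(gaussian_pdf(xs, 0.0, 1.0).sum() * dx)        # ~1.0, since the density integrates to 1
```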


How much math?

A lot. We also look at higher-dimensional Gaussians.

$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{D/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$

We will be differentiating and integrating over this function, too.
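
A minimal sketch (illustration only) of evaluating this D-dimensional density with NumPy:

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate Gaussian density N(x | mu, Sigma) for a D-dimensional x."""
    D = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (D / 2) * np.sqrt(np.linalg.det(Sigma))
    quad = diff @ np.linalg.inv(Sigma) @ diff    # (x - mu)^T Sigma^{-1} (x - mu)
    return np.exp(-0.5 * quad) / norm

mu = np.zeros(2)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
print(mvn_pdf(np.array([0.5, -1.0]), mu, Sigma))   # a small positive density value
```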

Policies and Structure


Course website: http://eniac.cs.qc.cuny.edu/andrew/gcml/syllabus.html

Data Data Data


All of the work we will do in this class relies on the availability of data to process.

UCI: http://archive.ics.uci.edu/ml/

Netflix Prize: http://archive.ics.uci.edu/ml/datasets/Netflix+Prize
LDC (Linguistic Data Consortium): http://www.ldc.upenn.edu/

Bye


Next: Probability Review!

Frequentists v. Bayesians
Bayes Rule
