Posted on 21-Mar-2020

Classification: Perceptron

Prof. Seungchul Lee

Industrial AI Lab.

Classification

• Where 𝑦 is a discrete value – develop a classification algorithm to determine which class a new input should fall into

• Start with a binary class problem – later look at the multiclass classification problem, although this is just an extension of binary classification

• We could use linear regression – then threshold the classifier output (i.e., anything over some value is 'yes', else 'no')

– linear regression with thresholding seems to work


Classification

• We will learn

– Perceptron

– Support vector machine (SVM)

– Logistic regression

• To find a classification boundary


Perceptron

• For input 𝑥 = [𝑥1, ⋯ , 𝑥𝑑]𝑇, the 'attributes of a customer'

• Weights 𝜔 = [𝜔1, ⋯ , 𝜔𝑑]𝑇


Perceptron

• Introduce an artificial coordinate 𝑥0 = 1 :

• In vector form, the perceptron implements ℎ(𝑥) = sign(𝜔𝑇𝑥)
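As a minimal NumPy sketch of this slide (the weight values here are hypothetical, chosen only for illustration), the perceptron output is the sign of 𝜔𝑇𝑥 with the artificial coordinate 𝑥0 = 1 prepended:

```python
import numpy as np

# Hypothetical weights: omega_0 is the bias, absorbed via x0 = 1
omega = np.array([-1.0, 2.0, 1.0])   # [omega_0, omega_1, omega_2]

def h(x):
    """Perceptron output: sign(omega^T x) with artificial coordinate x0 = 1."""
    x_aug = np.concatenate(([1.0], x))   # prepend x0 = 1
    return np.sign(omega @ x_aug)

print(h(np.array([1.0, 1.0])))   # omega^T x = -1 + 2 + 1 = 2 > 0  ->  1.0
print(h(np.array([0.0, 0.0])))   # omega^T x = -1 < 0              -> -1.0
```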


Perceptron

• Works for linearly separable data

• Hyperplane – separates a 𝐷-dimensional space into two half-spaces

– Defined by an outward pointing normal vector

– 𝜔 is orthogonal to any vector lying on the hyperplane

– Assume the hyperplane passes through origin, 𝜔𝑇𝑥 = 0 with 𝑥0 = 1


Distance from a Line


• If 𝑝⃗ and 𝑞⃗ are on the decision line, then 𝜔𝑇(𝑝⃗ − 𝑞⃗) = 0, so 𝜔 is orthogonal to the line


Signed Distance 𝒅 from the Origin

• If 𝑥 is on the line and 𝑥 = 𝑑 𝜔∕‖𝜔‖, where 𝑑 is the normal distance from the origin to the line


Distance from a Line: 𝒉

• For any vector 𝑥
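The equations on these distance slides were lost in extraction. Under one standard convention, writing the line as 𝑔(𝑥) = 𝜔𝑇𝑥 + 𝜔0 = 0 (bias kept separate here rather than absorbed via 𝑥0 = 1), the three results can be reconstructed as:

```latex
\[
\omega^{T}(\vec{p} - \vec{q}) = g(\vec{p}) - g(\vec{q}) = 0
\qquad \text{($\omega$ is orthogonal to the line)}
\]
\[
x = d\,\frac{\omega}{\lVert\omega\rVert}
\;\Rightarrow\;
d\,\lVert\omega\rVert + \omega_0 = 0
\;\Rightarrow\;
d = -\frac{\omega_0}{\lVert\omega\rVert}
\]
\[
h(x) = \frac{g(x)}{\lVert\omega\rVert}
     = \frac{\omega^{T}x + \omega_0}{\lVert\omega\rVert}
\]
```

The sign of ℎ(𝑥) says which side of the line 𝑥 lies on, which is exactly what the next slide uses.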


Sign

• Sign with respect to a line


How to Find 𝝎

• All data in class 1 (𝑦 = 1): 𝑔(𝑥) > 0

• All data in class 0 (𝑦 = −1): 𝑔(𝑥) < 0


Perceptron Algorithm

• The perceptron implements ℎ(𝑥) = sign(𝜔𝑇𝑥)

• Given the training set {(𝑥1, 𝑦1), ⋯ , (𝑥𝑁, 𝑦𝑁)}:

1) pick a misclassified point: sign(𝜔𝑇𝑥𝑛) ≠ 𝑦𝑛

2) and update the weight vector: 𝜔 ← 𝜔 + 𝑦𝑛𝑥𝑛


Perceptron Algorithm

• Why do perceptron updates work?

• Consider a misclassified positive example (𝑦𝑛 = +1)

– the perceptron (wrongly) thinks 𝜔𝑜𝑙𝑑𝑇𝑥𝑛 < 0

– the update is 𝜔𝑛𝑒𝑤 = 𝜔𝑜𝑙𝑑 + 𝑥𝑛

– thus 𝜔𝑛𝑒𝑤𝑇𝑥𝑛 = 𝜔𝑜𝑙𝑑𝑇𝑥𝑛 + ‖𝑥𝑛‖², which is less negative than 𝜔𝑜𝑙𝑑𝑇𝑥𝑛
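A concrete instance of this argument, with a hypothetical misclassified positive example (all numbers chosen only for illustration):

```python
import numpy as np

# Hypothetical misclassified positive example: y_n = +1 but omega_old^T x_n < 0
omega_old = np.array([-1.0, 0.5])
x_n = np.array([2.0, 1.0])
y_n = 1.0

assert y_n * (omega_old @ x_n) < 0        # misclassified: -2.0 + 0.5 = -1.5 < 0

omega_new = omega_old + y_n * x_n         # PLA update on the misclassified point

# omega_new^T x_n = omega_old^T x_n + ||x_n||^2, so the score moves toward positive
print(omega_old @ x_n)                    # -1.5
print(omega_new @ x_n)                    # -1.5 + (4 + 1) = 3.5
```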


Iterations of Perceptron

1. Randomly assign 𝜔

2. One iteration of the PLA (perceptron learning algorithm): 𝜔 ← 𝜔 + 𝑦𝑥, where (𝑥, 𝑦) is a misclassified training point

3. At iteration 𝑖 = 1, 2, 3, ⋯, pick a misclassified point from the training set

4. And run a PLA iteration on it

5. That's it!


Diagram of Perceptron


Perceptron Loss Function

• Loss = 0 on examples where the perceptron is correct, i.e., 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 > 0

• Loss > 0 on examples the perceptron misclassifies, i.e., 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 < 0

• Note:

– sign(𝜔𝑇𝑥𝑛) ≠ 𝑦𝑛 is equivalent to 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 < 0
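This description matches the standard perceptron criterion 𝐿(𝜔) = Σ𝑛 max(0, −𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛): zero on correctly classified points, positive otherwise. A short NumPy sketch with hypothetical toy numbers:

```python
import numpy as np

# Perceptron loss: sum of max(0, -y_n * omega^T x_n) over the data
omega = np.array([1.0, -1.0])
X = np.array([[2.0, 1.0],     # omega^T x =  1 -> correct for y = +1, loss 0
              [1.0, 2.0],     # omega^T x = -1 -> wrong   for y = +1, loss 1
              [0.0, 3.0]])    # omega^T x = -3 -> correct for y = -1, loss 0
y = np.array([1.0, 1.0, -1.0])

margins = y * (X @ omega)                   # y_n * omega^T x_n
loss = np.sum(np.maximum(0.0, -margins))    # zero wherever the sign is right
print(loss)                                 # 1.0
```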


Perceptron Algorithm in Python

• Unknown parameters 𝜔

• Update 𝜔 ← 𝜔 + 𝑦𝑥, where (𝑥, 𝑦) is a misclassified training point
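The code listings themselves did not survive extraction; what follows is a minimal NumPy sketch of the iterations described above, with hypothetical linearly separable toy data:

```python
import numpy as np

def pla(X, y, max_iter=1000, rng=np.random.default_rng(0)):
    """Perceptron learning algorithm.

    X: (n, d) data, y: labels in {-1, +1}. Returns weights omega, with the
    artificial coordinate x0 = 1 prepended to every sample.
    """
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])   # x0 = 1
    omega = rng.standard_normal(X_aug.shape[1])        # 1. randomly assign omega
    for _ in range(max_iter):
        wrong = np.where(np.sign(X_aug @ omega) != y)[0]
        if wrong.size == 0:                            # data separated: done
            return omega
        n = rng.choice(wrong)                          # pick a misclassified point
        omega = omega + y[n] * X_aug[n]                # run a PLA iteration on it
    return omega

# Hypothetical linearly separable toy data
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
omega = pla(X, y)

X_aug = np.hstack([np.ones((4, 1)), X])
print(np.sign(X_aug @ omega))   # matches y once the loop converges
```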

Scikit-Learn for Perceptron
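The code on this slide was also lost in extraction; a minimal sketch using scikit-learn's `Perceptron` estimator from `sklearn.linear_model`, again with hypothetical toy data:

```python
import numpy as np
from sklearn.linear_model import Perceptron

# Hypothetical linearly separable toy data; scikit-learn fits the bias
# term itself (fit_intercept=True by default), so no x0 = 1 column is needed
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])

clf = Perceptron(max_iter=1000, tol=None, random_state=0)
clf.fit(X, y)

print(clf.coef_, clf.intercept_)   # learned omega and bias
print(clf.score(X, y))             # training accuracy (1.0 on separable data)
print(clf.predict([[2.0, 1.0]]))   # class label for a new point
```

With `tol=None` the solver keeps iterating for the full `max_iter` epochs rather than stopping early, which on separable data lets it reach zero training error.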


The Best Hyperplane Separator?

• Perceptron finds one of the many possible hyperplanes separating the data, if one exists

• Of the many possible choices, which one is the best?

• Utilize distance information

• Intuitively we want the hyperplane having the maximum margin

• Large margin leads to good generalization on the test data

– we will see this formally when we discuss Support Vector Machine (SVM)

• Utilize distance information from all data samples

– We will see this formally when we discuss the logistic regression

• Perceptron will be shown to be a basic unit for neural networks and deep learning later

