playlist?list=plzhqobowtqdpd3mizzm2xvfitgf8he absandra/pdf/class/2019-2/mc886/2019-1… · using...
TRANSCRIPT
![Page 1: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/1.jpg)
![Page 2: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/2.jpg)
https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
![Page 3: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/3.jpg)
PCA AlgorithmBy Singular Value Decomposition
![Page 4: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/4.jpg)
Data Preprocessing
Training set: x(1), x(2), ..., x(m)
Preprocessing (feature scaling/mean normalization):
Replace each with .
If different features on different scales, scale features to have comparable range of values.
Center the data
![Page 5: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/5.jpg)
Data Preprocessing
Credit: http://cs231n.github.io/neural-networks-2/
![Page 6: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/6.jpg)
PCA Algorithm
Reduce data from n-dimensions to k-dimensions
Compute “covariance matrix”:
n ⨉ n matrix
![Page 7: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/7.jpg)
PCA Algorithm
Reduce data from n-dimensions to k-dimensions
Compute “covariance matrix”:
Compute “eigenvectors” of matrix 𝚺:
[U, S, V] = svd(sigma) Singular Value Decomposition
n ⨉ n matrix
![Page 8: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/8.jpg)
PCA Algorithm
Reduce data from n-dimensions to k-dimensions
Compute “covariance matrix”:
Compute “eigenvectors” of matrix 𝚺:
[U, S, V] = svd(sigma) Singular Value Decomposition
n ⨉ n matrix
eigenvalues
![Page 9: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/9.jpg)
PCA Algorithm
From [U, S, V] = svd(sigma), we get:
∈ ℝnxn
![Page 10: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/10.jpg)
PCA Algorithm
From [U, S, V] = svd(sigma), we get:
∈ ℝnxn
k
x ∈ ℝn → z ∈ ℝk
![Page 11: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/11.jpg)
PCA Algorithm
From [U, S, V] = svd(sigma), we get:
∈ ℝnxn
k
z = x
T
x ∈ ℝn → z ∈ ℝk
k ⨉ n n⨉1
![Page 12: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/12.jpg)
PCA Algorithm
After mean normalization and optionally feature scaling:
[U, S, V] = svd(sigma)
z = (Ureduce)T x x
![Page 13: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/13.jpg)
PCA AlgorithmBy Eigen Decomposition
![Page 14: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/14.jpg)
PCA in a Nutshell (Eigen Decomposition)
1. Center the data (and normalize)
2. Compute covariance matrix 𝚺3. Find eigenvectors u and eigenvalues 𝜆4. Sort eigenvalues and pick first k eigenvectors
5. Project data to k eigenvectors
![Page 15: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/15.jpg)
Using PCA (Iris Dataset)
150 iris flowers from three different species.
The three classes in the Iris dataset:1. Iris-setosa (n=50)2. Iris-versicolor (n=50)3. Iris-virginica (n=50)
The four features of the Iris dataset:1. sepal length in cm2. sepal width in cm3. petal length in cm4. petal width in cm
![Page 16: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/16.jpg)
Linear Discriminant AnalysisMachine Learning
MC886, October 2, 2019
Prof. Sandra AvilaInstitute of Computing (IC/Unicamp)
REC Dreasoning for complex data
![Page 17: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/17.jpg)
Today’s Agenda
● Linear Discriminant Analysis
○ PCA vs LDA
○ LDA: Simple Example
○ LDA Algorithm
○ LDA Step by Step
![Page 18: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/18.jpg)
Linear Discriminant Analysis
![Page 19: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/19.jpg)
Linear Discriminant Analysis (LDA)
● LDA pick a new dimension that gives:
○ Maximum separation between means of projected classes
○ Minimum variance within each projected class
● Solution: eigenvectors based on between-class and within-class covariance matrix
![Page 20: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/20.jpg)
LDA: Simple Example
Reducing 2D to 1D
PCA
![Page 21: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/21.jpg)
LDA: Simple Example
Reducing 2D to 1D
LDA
![Page 22: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/22.jpg)
LDA: Simple Example
Reducing 2D to 1D
LDA
![Page 23: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/23.jpg)
LDA: Simple Example
Reducing 2D to 1D
LDA
![Page 24: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/24.jpg)
LDA: Simple Example
Reducing 2D to 1D
LDA
![Page 25: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/25.jpg)
How LDA create a new axis?
![Page 26: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/26.jpg)
How LDA create a new axis?
Reducing 2D to 1D
LDA
![Page 27: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/27.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
![Page 28: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/28.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
![Page 29: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/29.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 30: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/30.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
s2 This is the scatter around the pink dots.
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 31: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/31.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
s2s2 This is the scatter around the blue dots.
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 32: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/32.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
s2s2
s2 + s2(μ ₋ μ)2
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 33: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/33.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
s2s2
s2 + s2(μ ₋ μ)2 Ideally large
Ideally small
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 34: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/34.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
s2s2
s2 + s2(μ ₋ μ)2 Ideally large
Ideally small
Let’s call (μ ₋ μ) d for distance.
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 35: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/35.jpg)
How LDA create a new axis?
The new axis is created according two criteria:
1. Maximize the distance between the means:
μμ
s2s2
s2 + s2d2 Ideally large
Ideally small
Let’s call (μ ₋ μ) d for distance.
2. Minimize the variation (which LDA calls scatter) within each class.
![Page 36: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/36.jpg)
Why both distance and scatter are important?
If we only maximize the distance between means ...
![Page 37: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/37.jpg)
Why both distance and scatter are important?
If we only maximize the distance between means ...
If we optimize the distance between means and scatter
![Page 38: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/38.jpg)
What if we have 3 classes?
![Page 39: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/39.jpg)
What if we have 3 classes?
How we measure the distance among the means?
![Page 40: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/40.jpg)
What if we have 3 classes?Find the point that is central to data.
![Page 41: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/41.jpg)
What if we have 3 classes?
Then measure the distance between a point that is central in each class and the main central point.
d2
d2
d2
![Page 42: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/42.jpg)
What if we have 3 classes?
Now maximize the distance between each class and the central point while minimize the scatter for each class.
d2
d2
d2
![Page 43: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/43.jpg)
What if we have 3 classes?
s2 + s2 + s2d2 + d2 + d2
d2
d2
d2
![Page 44: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/44.jpg)
Today’s Agenda
● Linear Discriminant Analysis
○ PCA vs LDA
○ LDA: Simple Example
○ LDA Algorithm
○ LDA Step by Step
![Page 45: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/45.jpg)
LDA in a Nutshell (Eigen Decomposition)
1. Compute the d-dimensional mean vectors for the different classes.2. Compute the scatter matrices (between-class SB and within-class SW).3. Compute the eigenvectors (u1,u2,...,ud) and eigenvalues (λ1,λ2,...,λd) for
the scatter matrices SW-1SB.
4. Sort the eigenvectors by decreasing eigenvalues and choose k eigenvectors.
5. Use this d×k eigenvector matrix to transform the samples onto the new subspace.
![Page 46: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/46.jpg)
LDA Step by Step
150 iris flowers from three different species.
The three classes in the Iris dataset:1. Iris-setosa (n=50)2. Iris-versicolor (n=50)3. Iris-virginica (n=50)
The four features of the Iris dataset:1. sepal length in cm2. sepal width in cm3. petal length in cm4. petal width in cmhttp://sebastianraschka.com/Articles/2014_python_lda.html
![Page 47: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/47.jpg)
LDA Step by Step
![Page 48: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/48.jpg)
LDA Step by Step
1. Compute the d-dimensional mean vectors for the different classes.
![Page 49: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/49.jpg)
LDA Step by Step
1. Compute the d-dimensional mean vectors for the different classes.
𝜇1 : [ 5.01 3.42 1.46 0.24 ]
𝜇2 : [ 5.94 2.77 4.26 1.33 ]
𝜇3 : [ 6.59 2.97 5.55 2.03 ]
![Page 50: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/50.jpg)
LDA Step by Step
2. Compute the scatter matrices (between-class SB and within-class SW)
Within-class scatter matrix SW:
, where
![Page 51: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/51.jpg)
LDA Step by Step
2. Compute the scatter matrices (between-class SB and within-class SW)
Within-class scatter matrix SW:
38.96 13.68 24.61 5.66 13.68 17.04 08.12 4.91 24.61 08.12 27.22 6.25 05.66 04.91 06.25 6.18
![Page 52: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/52.jpg)
LDA Step by Step
2. Compute the scatter matrices (between-class SB and within-class SW)
Between-class scatter matrix SB:
where 𝜇 is the overall mean, and 𝜇i and Ni are the sample mean and sizes of the respective classes.
![Page 53: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/53.jpg)
LDA Step by Step
2. Compute the scatter matrices (between-class SB and within-class SW)
Between-class scatter matrix SB:
063.21 -19.53 165.16 71.36 -19.53 10.98 -56.05 -22.49 165.16 -56.05 436.64 186.91 071.36 -22.49 186.91 80.60
![Page 54: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/54.jpg)
LDA Step by Step
3. Compute the eigenvectors (u1,u2,...,ud) and eigenvalues (λ1,λ2,...,λd) for the scatter matrices SW
-1SB.
u1: -0.205 -0.387 -0.546 -0.714
λ1: 3.23e+01
u2: -0.009 -0.589 -0.254 -0.767
λ2: 2.78e-01
u3: -0.179 -0.318 -0.366 -0.601
λ3: -4.02e-17
u4: -0.179 -0.318 -0.366 -0.601
λ4: -4.02e-17
![Page 55: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/55.jpg)
LDA Step by Step
4. Sort the eigenvectors by decreasing eigenvalues and choose k eigenvectors.
Eigenvalues in decreasing order:32.270.275.71e-155.71e-15
![Page 56: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/56.jpg)
LDA Step by Step
4. Sort the eigenvectors by decreasing eigenvalues and choose k eigenvectors.
Eigenvalues in decreasing order:32.270.275.71e-155.71e-15
Variance explained:λ1: 99.15%λ2: 0.85%λ3: 0.00%λ4: 0.00%
![Page 57: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/57.jpg)
LDA Step by Step
4. Sort the eigenvectors by decreasing eigenvalues and choose k eigenvectors.
-0.205 -0.009 -0.387 -0.589 -0.546 -0.254 -0.714 -0.767
u1: -0.205 -0.387 -0.546 -0.714
λ1: 3.23e+01
u2: -0.009 -0.589 -0.254 -0.767
λ2: 2.78e-01
![Page 58: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/58.jpg)
LDA Step by Step
5. Use this d×k eigenvector matrix to transform the samples onto the new subspace.
![Page 59: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/59.jpg)
LDA Step by Step
5. Use this d×k eigenvector matrix to transform the samples onto the new subspace.
http://sebastianraschka.com/Articles/2014_python_lda.html
![Page 60: playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE absandra/pdf/class/2019-2/mc886/2019-1… · Using PCA (Iris Dataset) 150 iris flowers from three different species. The three classes](https://reader034.vdocument.in/reader034/viewer/2022050112/5f494ddbc3dbe2083a1bf5fa/html5/thumbnails/60.jpg)
References
Machine Learning Books
● Hands-On Machine Learning with Scikit-Learn and TensorFlow, Chap. 8
“Dimensionality Reduction”
● Pattern Recognition and Machine Learning, Chap. 12 “Continuous Latent Variables”
● Pattern Classification, Chap. 10 “Unsupervised Learning and Clustering”