ece 471/571 - lecture 19 review 11/12/15. a roadmap 2 pattern classification statistical...

ECE 471/571 - Lecture 19Review

11/12/15

A Roadmap

2

Pattern Classification

Statistical Approach Non-Statistical Approach

Supervised UnsupervisedBasic concepts: Distance Agglomerative method

Basic concepts: Baysian decision rule (MPP, LR, Discri.)

Parameter estimate (ML, BL)

Non-Parametric learning (kNN)

LDF (Perceptron)

k-means

Winner-take-all

Kohonen maps

Dimensionality Reduction FLD, PCA

Performance Evaluation ROC curve (TP, TN, FN, FP) cross validation

Classifier Fusion majority voting NB, BKS

Stochastic Methods local opt (GD) global opt (SA, GA)

Decision-tree

Syntactic approach

NN (BP, Hopfield, DL)

Support Vector Machine

3

Q&AWhat is the common feature of statistical pattern classification?What is supervised pattern classification? How is it different from unsupervised classification?When performing classification, what is the general rule that Baysian decision methods follow? (MPP, Likelihood ratio, Discriminant function)What is the difference between parametric classification and non-parametric classification?What is maximum likelihood estimation?Can you verbally describe kNN? And possible improvements?Both FLD and PCA reduce the dimension of a dataset. However, different projection directions are generated from the two approaches. Comment on the fundamental differences.Learn to plot and use and explain an ROC curveBe familiar with the relationship between TP, FP, TN, FN, precision, recall, specificity, sensitivity, and accuracy

4

Q&A - Cont’dWhat’s the difference between feed-forward and feed-backward NN? Examples of each?What is LDF? How to extend from 2-class case to multi-class separation? What’s the problem with c 2-class separation and c(c-1)/2 2-class separation?Performance wise, what is the difference between perceptron and bp?

Pros and cons How do they work? (need to know the detailed procedure)

(no proof needed) Know of different ways to improve the performance of bp

E.g. what is sigmoid? Why is it better? What is optimal learning rate? Why?

The biological structure of a neuron

Q&A – Cont’dStochastic method

What is gradient descent? Know the detailed algorithm. What’s the problem?

How do SA and GA achieve global optimum as compared to the principle of GD?

GA (natural selection with binary tournament mating, exploration with recombination and mutation)

Clustering What is unsupervised learning? What is the difference between dmin, dmax, and dmean? What is agglomerative clustering? Understand k-means, winner-take-all, and Kohonen maps

Performance evaluation approaches What is confusion matrix?

Fusion Majority voting Naïve Bayes

5

6

Forming Objective Functions

Maximum a-posteriori probabilityMaximum likelihood estimateFisher’s linear discriminantPrincipal component analysiskNNPerceptron Back propagation neural networkHopfield networkROC curveVarious clustering algorithmNB

7

Distance MetricsMinkowski distance Manhattan distance (city-block distance) Euclidean distanceMahalanobis distanceDistance between clusters (dmin, dmax, vs. dmean) Classify the following samples into two

clusters using dmin, dmax, and dmean. The samples are (0,1), (0,2), (0,3), (0,4), (0,5), (0,6), (0,7), (0,8)

Course evaluation (5 pts) counted in Test 2.

8

ece 471/571 - lecture 19 review 11/12/15. a roadmap 2 pattern classification statistical...

Documents