A.I. Algorithms Cogs 188

Post on 16-Jan-2022

Page 1: A.I. Algorithms Cogs 188

A.I. Algorithms Cogs 188

Page 2: A.I. Algorithms Cogs 188

Grades

• 1 Midterm: 20%

• 1 Final Exam: 30%

• Assignment 0: 5%

• Assignment 1: 15%

• Assignment 2: 15%

• Assignment 3: 15%

Assignments are to be done individually (not in groups). Late assignments incur a 33% penalty for each day, or part of a day, that they are late. If you submit an assignment 1 minute late, you lose 33% of the points; if you submit it 24 hours and 1 minute late, you lose 66% of the points.

Page 3: A.I. Algorithms Cogs 188

Tentative Schedule

Date | Day | Topics Covered | Assignments
October 1st | Thursday | Machine Learning overview | Assignment 0 Assigned
October 6th | Tuesday | K-NN |
October 8th | Thursday | Linear Regression - Objective Function |
October 13th | Tuesday | Gradient Descent | Assignment 1 Assigned
October 15th | Thursday | Perceptron |
October 20th | Tuesday | Perceptron Revision |
October 22nd | Thursday | Statistics & Probability - Distributions |
October 27th | Tuesday | K-Means | Assignment 1 Due
October 29th | Thursday | Midterm |
November 3rd | Tuesday | Review and Hierarchical Clustering | Assignment 2 Assigned
November 5th | Thursday | EM-Algorithm |
November 10th | Tuesday | EM-Algorithm Cont. |
November 12th | Thursday | EM-Algorithm Revision |
November 17th | Tuesday | Genetic Algorithms | Assignment 2 Due
November 19th | Thursday | Genetic Algorithms - Cont. | Assignment 3 Assigned
November 24th | Tuesday | Genetic Algorithms - Examples |
November 26th | Thursday | No class, Happy Thanksgiving! |
December 1st | Tuesday | Bayes Theorem |
December 3rd | Thursday | Naïve Bayes Classification |
December 8th | Tuesday | A.I. In Healthcare |
December 10th | Thursday | Review | Assignment 3 Due
December 16th | Wednesday | Final Exam |

Page 4: A.I. Algorithms Cogs 188

Teaching Staff

• Instructor: Dr. Anjum Gupta, [email protected]

• TA: Qiyuan, [email protected]

• If you email us, please send the email to both addresses. However, posting your questions on Canvas is recommended whenever possible.

Page 5: A.I. Algorithms Cogs 188

Syllabus

1. Probability and Statistics
2. Python / Jupyter Notebook (TA sections)
3. Nearest Neighbor
4. Linear Regression, Logistic Regression
5. Perceptron
6. Bayes Theorem
7. K-means, Hierarchical Clustering
8. Genetic Algorithm
9. EM Algorithm

Page 6: A.I. Algorithms Cogs 188

Learning.

You are learning if your performance improves with experience.

Page 7: A.I. Algorithms Cogs 188

Big Picture

Input Data

Statistics

Algorithms

Graph Theory

Information Theory

Probability Theory

Game Theory

Linear Algebra

Analytical Geometry

Output

Machine Learning Tools

Computer Science

Domain Expertise

Page 8: A.I. Algorithms Cogs 188

Things you can do with Machine Learning

• Given voice stream, identify the speaker/language.

• Recognize handwritten numbers.

• Evaluate the “lifetime value” of a customer (or sales lead).

• Face or object recognition in a video stream

• Given symptoms, diagnose a disease.

• Adjust a stock portfolio based on sentiment and clustering.

• Distinguish between a weed and a plant sapling.

• Hand gesture analysis, a glove that sends text messages.

• Too many to count individually. That is what makes machine learning so useful.

Page 9: A.I. Algorithms Cogs 188

Machine Learning in Agriculture

Blue River Technology: distinguishing weeds from plant saplings.

Root AI: Identifying ripe tomatoes to pick.

Page 10: A.I. Algorithms Cogs 188

Let’s start with our two canonical broad categories!

• Supervised – Discriminative Models

Classifying Data

For tasks that we humans can technically do ourselves, but that would be nice to automate.

• Unsupervised – Generative Models

Understanding Data

For tasks that we humans cannot do ourselves, e.g. generating new insights and extracting hidden information that the data contains.

Page 11: A.I. Algorithms Cogs 188

Generally Speaking…

• Discriminative Models – Classifying Data

– Spam filter (Spam, Not Spam)

– Identify language from a voice stream

– Facial expression recognition

– Classify species according to some physical features

• Generative Models – Understanding Data

– Detect anomalies

– Find the probability of a scenario

– Predict future outcomes

– Complete missing data

Page 12: A.I. Algorithms Cogs 188

Optimization Algorithms

• We will also learn two specific optimization algorithms.

– Gradient Descent

– Genetic Algorithms

Page 13: A.I. Algorithms Cogs 188

Handwritten Digits example

Database of 20,000 images of handwritten digits, each labeled by a human (Supervised Learning)

Use these to learn a classifier which will label digit-images automatically…

Page 14: A.I. Algorithms Cogs 188

Classification

Image What is the number?

?

?

Page 15: A.I. Algorithms Cogs 188

Handwritten Digits example

Database of 20,000 images of handwritten digits, each labeled by a human (Supervised Learning)

Use these 20,000 images to understand something about the digits and handwriting.

Page 16: A.I. Algorithms Cogs 188

Understanding Data: Being able to generate the numbers!

These results are from one of the projects I worked on with a fellow graduate student, Eric Wiewiora.

Regenerated using model of digit 2

Regenerated using model of digit 5

Page 17: A.I. Algorithms Cogs 188

Naïve Bayes Example: Fishing data

Day | Outlook | Water Temperature | Pollutants in Water | Wind | Fish Present
Day1 | Sunny | Hot | High | Weak | No
Day2 | Sunny | Hot | High | Strong | No
Day3 | Overcast | Hot | High | Weak | Yes
Day4 | Rain | Mild | High | Weak | Yes
Day5 | Rain | Cool | Normal | Weak | Yes
Day6 | Rain | Cool | Normal | Strong | No
Day7 | Overcast | Cool | Normal | Strong | Yes
Day8 | Sunny | Mild | High | Weak | No
Day9 | Sunny | Cool | Normal | Weak | Yes
Day10 | Rain | Mild | Normal | Weak | Yes
Day11 | Sunny | Mild | Normal | Strong | Yes
Day12 | Overcast | Mild | High | Strong | Yes
Day13 | Overcast | Hot | Normal | Weak | Yes
Day14 | Rain | Mild | High | Strong | No
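As a rough illustration of how Naïve Bayes could use the fishing table, the sketch below estimates the class prior and the per-feature conditional probabilities by simple counting (no smoothing) and multiplies them for a new day. The query day (Sunny, Cool, High, Strong) and the function names `train` and `score` are assumptions made for this example; they do not come from the slides.

```python
from collections import Counter, defaultdict

# Rows of the fishing table:
# (outlook, water temperature, pollutants in water, wind, fish present)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def train(rows):
    """Count class frequencies and, within each class, feature-value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for row in rows:
        label = row[-1]
        for i, value in enumerate(row[:-1]):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts


def score(query, class_counts, feature_counts):
    """Return the unnormalized P(class) * prod_i P(feature_i | class) per class."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        p = n / total  # class prior
        for i, value in enumerate(query):
            p *= feature_counts[(i, label)][value] / n  # naive independence
        scores[label] = p
    return scores


class_counts, feature_counts = train(DATA)
query = ("Sunny", "Cool", "High", "Strong")  # hypothetical new day
scores = score(query, class_counts, feature_counts)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

For this hypothetical day the score for "No" (about 0.0206) is larger than the score for "Yes" (about 0.0053), so the sketch predicts that no fish are present.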

Page 18: A.I. Algorithms Cogs 188

Bayesian Networks are for Bayesians, although frequentists are also welcome.

You are given various variables. For example: imagine you are going fishing.

Depth

Temperature

Light

Corals

Food Source

Fish Present

You can map out some relationship among them through “expert knowledge,” then refine it and learn the exact parameters.

Now you can ask: what is the probability of fish being present in shallow, cold water with corals?

Page 19: A.I. Algorithms Cogs 188

A complete Bayes net consists of the graph together with its probability tables.

Page 20: A.I. Algorithms Cogs 188

Liver Disorder Diagnostic Bayesian Network

Page 21: A.I. Algorithms Cogs 188

Bayesian vs. Frequentist

Sherlock Holmes was apparently a frequentist:

“I have no data yet. It is a capital mistake to theorize before one has data.”

(A Scandal in Bohemia)

This sounds like a Bayesian:

“Intuition becomes increasingly valuable in the new information society precisely because there is so much data.”

John Naisbitt (Author)

Page 22: A.I. Algorithms Cogs 188

Complex Bayesian Networks have given way to “deep learning”

• New algorithms came along and replaced the idea of Bayesian networks.

• They still influence many unsupervised learning algorithms.

• Other algorithms, such as the EM algorithm, also help us model the “true” nature of the data.

Page 23: A.I. Algorithms Cogs 188

• We will visit some of the generative algorithms later in the course

• Let’s start with the classification algorithms first.

Page 24: A.I. Algorithms Cogs 188

Classification

• Can a computer learn to recognize objects?

• Shown 10,000 flowers, can a computer “understand” flowers? Can it say if the new photograph shown is a flower?

Iris Setosa Iris Versicolor Iris Virginica

Page 25: A.I. Algorithms Cogs 188

Let’s try our brain’s algorithm!

Iris Setosa Iris Versicolor Iris Virginica

???

Page 26: A.I. Algorithms Cogs 188

What is Similarity?

“The quality or state of being similar; likeness; resemblance; as, a similarity of features.” Webster's Dictionary

For example, someone who is writing software for the healthcare industry may have to deal with the question of “how similar are two patients?”

It depends on what you are comparing the two objects for.

A whole lot of research, including Ph.D. theses, is devoted just to the concept of similarity.

Page 27: A.I. Algorithms Cogs 188

1. Patient Similarity Networks for Precision Medicine

2. Patient Similarity: Emerging Concepts in Systems and Precision Medicine

3. Machine learning of patient similarity: A case study on predicting survival in cancer patient after locoregional chemotherapy

Page 28: A.I. Algorithms Cogs 188

Fish Sorting: For Packaging

salmon

sea bass

sorting chamber

classifier

Page 29: A.I. Algorithms Cogs 188

Pattern Classification, Chapter 1

An Example

• “Sorting incoming Fish on a conveyor according to species using optical sensing”

Sea bass

Species

Salmon

Page 30: A.I. Algorithms Cogs 188

• Problem Analysis

– Set up a camera and take some sample images to extract features

• Length

• Lightness

• Width

• Number and shape of fins

• Position of the mouth, etc…

• This is the set of all suggested features to explore for use in our classifier!

Page 31: A.I. Algorithms Cogs 188

• Classification

– Select the length of the fish as a possible feature for discrimination

Page 32: A.I. Algorithms Cogs 188

Page 33: A.I. Algorithms Cogs 188

Length alone is a poor feature!

Select the lightness as a possible feature.

Page 34: A.I. Algorithms Cogs 188

Page 35: A.I. Algorithms Cogs 188

• Adopt the lightness and add the width of the fish

Fish: $x^T = [x_1, x_2]$, where $x_1$ is the lightness and $x_2$ is the width.

Page 36: A.I. Algorithms Cogs 188

• Plotting salmon and sea bass based on the two-dimensional feature vector.

Page 37: A.I. Algorithms Cogs 188

Feature extraction

Task: to extract features which are good for classification.

Good features:

• Objects from the same class have similar feature values.

• Objects from different classes have different values.

“Good” features “Bad” features

Page 38: A.I. Algorithms Cogs 188

Basic concepts

Feature vector $x \in X$

- A vector of observations (measurements): $x = [x_1, x_2, \ldots, x_n]^T$.

- $x$ is a point in feature space $X$.

Hidden state $y \in Y$

- Cannot be directly measured.

- Patterns with equal hidden state belong to the same class.

Task

- To design a classifier (decision rule) $q: X \to Y$ which decides about a hidden state based on an observation.

Pattern: a feature vector $x$ together with its hidden state $y$.
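To make the idea of a decision rule $q: X \to Y$ concrete, here is a minimal sketch of a rule that thresholds a single feature, in the spirit of the fish-sorting slides. The helper `make_threshold_rule`, the choice of lightness as feature 0, the threshold 5.0, and the sample feature vectors are all hypothetical values chosen for illustration.

```python
from typing import Callable, Sequence

FeatureVector = Sequence[float]                  # x, a point in feature space X
Label = str                                      # y, a hidden state in Y
DecisionRule = Callable[[FeatureVector], Label]  # q: X -> Y


def make_threshold_rule(feature_index: int, threshold: float,
                        below: Label, above: Label) -> DecisionRule:
    """Build a rule q that compares one feature against a threshold."""
    def q(x: FeatureVector) -> Label:
        return below if x[feature_index] < threshold else above
    return q


# Hypothetical rule: feature 0 is lightness, and the threshold 5.0 is made up.
q = make_threshold_rule(0, 5.0, below="salmon", above="sea bass")

print(q([3.2, 14.0]))  # lightness 3.2 is below the threshold
print(q([7.8, 16.5]))  # lightness 7.8 is above the threshold
```

A single-feature threshold is the simplest possible rule; adding a second feature, as the slides do with width, turns the threshold into a boundary in a two-dimensional feature space.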

Page 39: A.I. Algorithms Cogs 188

Text Classification

• Representing text as a vector.

• Words are stemmed, so that “computer,” “computes,” etc. are all counted under “compute.”

• Each number in the vector is then divided by the number of documents that the word appears in (“Inverse Document Frequency”).
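The weighting described above can be sketched in a few lines. This example follows the slide literally, dividing each word count by the number of documents containing that word; many common TF-IDF variants use a logarithm of the document frequency instead. Stemming is assumed to have been done already, and the three toy documents and the function name `weighted_vectors` are made up for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (stemming assumed done beforehand)."""
    return text.lower().split()


def weighted_vectors(docs):
    """Return the vocabulary and one count / document-frequency vector per document."""
    token_lists = [tokenize(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[t] / doc_freq[t] for t in vocab])
    return vocab, vectors


# Hypothetical, already-stemmed documents.
docs = ["compute fast compute", "fast fish", "fish swim"]
vocab, vectors = weighted_vectors(docs)
print(vocab)
for v in vectors:
    print(v)
```

Words that occur in many documents, such as "fast" and "fish" here, are weighted down relative to words that are specific to one document.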

Page 40: A.I. Algorithms Cogs 188

Let’s go back to agriculture!

Grasshoppers

Katydids

Given a collection of annotated data (in this case, five instances of Katydids and five of Grasshoppers), decide what type of insect the unlabeled example is.

Katydid or Grasshopper?

Page 41: A.I. Algorithms Cogs 188

For any domain of interest, we can measure features:

• Thorax Length
• Abdomen Length
• Antennae Length
• Mandible Size
• Spiracle Diameter
• Leg Length
• Color {Green, Brown, Gray, Other}
• Has Wings?

Page 42: A.I. Algorithms Cogs 188

We can store features in a database.

My_Collection

Insect ID | Abdomen Length | Antennae Length | Insect Class
1 | 2.7 | 5.5 | Grasshopper
2 | 8.0 | 9.1 | Katydid
3 | 0.9 | 4.7 | Grasshopper
4 | 1.1 | 3.1 | Grasshopper
5 | 5.4 | 8.5 | Katydid
6 | 2.9 | 1.9 | Grasshopper
7 | 6.1 | 6.6 | Katydid
8 | 0.5 | 1.0 | Grasshopper
9 | 8.3 | 6.6 | Katydid
10 | 8.1 | 4.7 | Katydid

The classification problem can now be expressed as:

• Given a training database (My_Collection), predict the class label of a previously unseen instance.

Previously unseen instance:

Insect ID | Abdomen Length | Antennae Length | Insect Class
11 | 5.1 | 7.0 | ???????

Page 43: A.I. Algorithms Cogs 188

[Figure: scatter plot of Grasshoppers and Katydids, with Antenna Length on the y-axis and Abdomen Length on the x-axis, both from 1 to 10]

Page 44: A.I. Algorithms Cogs 188

Katydid or Grasshopper?

[Figure: plot with Antenna Length on the y-axis and Abdomen Length on the x-axis, both from 1 to 10]

Page 45: A.I. Algorithms Cogs 188

[Figure: scatter plot of Grasshoppers and Katydids, with Antenna Length on the y-axis and Abdomen Length on the x-axis, both from 1 to 10]

We will also use this larger dataset as a motivating example…

These data objects are called…

• exemplars
• (training) examples
• instances
• tuples

Page 46: A.I. Algorithms Cogs 188

[Figure: the scatter plot of the larger Grasshoppers and Katydids dataset, with an unlabeled instance marked “????”]

Page 47: A.I. Algorithms Cogs 188

Nearest Neighbor Classifier

If the nearest instance to the previously unseen instance is a Katydid
    class is Katydid
else
    class is Grasshopper

[Figure: scatter plot of Katydids and Grasshoppers, with Antenna Length on the y-axis and Abdomen Length on the x-axis, both from 1 to 10]
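The nearest neighbor rule on this slide can be sketched directly on the ten labeled insects from My_Collection. Euclidean distance is an assumption here, since the slide does not name a distance measure, and the function name `nearest_neighbor` is chosen for this example only.

```python
import math

# Training data from My_Collection: (abdomen length, antennae length, class)
MY_COLLECTION = [
    (2.7, 5.5, "Grasshopper"),
    (8.0, 9.1, "Katydid"),
    (0.9, 4.7, "Grasshopper"),
    (1.1, 3.1, "Grasshopper"),
    (5.4, 8.5, "Katydid"),
    (2.9, 1.9, "Grasshopper"),
    (6.1, 6.6, "Katydid"),
    (0.5, 1.0, "Grasshopper"),
    (8.3, 6.6, "Katydid"),
    (8.1, 4.7, "Katydid"),
]


def nearest_neighbor(query, training):
    """Return the class of the training instance closest to `query`.

    Distance is Euclidean (an assumption; the slide leaves it open).
    """
    closest = min(training, key=lambda row: math.dist(query, row[:2]))
    return closest[2]


# Insect 11 from the table: abdomen length 5.1, antennae length 7.0.
print(nearest_neighbor((5.1, 7.0), MY_COLLECTION))
```

Under this distance the closest labeled insect to insect 11 is insect 7 (6.1, 6.6), so the sketch labels it a Katydid.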

Page 48: A.I. Algorithms Cogs 188

Handwritten Digits example

Database of 20,000 images of handwritten digits, each labeled by a human (Supervised Learning)

[28 x 28 greyscale; pixel values 0-255; labels 0-9]

Use these to learn a classifier which will label digit-images automatically…

Page 49: A.I. Algorithms Cogs 188

Nearest neighbor

Image to label Nearest neighbor

Page 50: A.I. Algorithms Cogs 188

Nearest neighbor

Image to label Nearest neighbor

Overall: error rate = 6% (on the test set)

Page 51: A.I. Algorithms Cogs 188

Classifying Insects

[Figure: scatter plot of Grasshoppers and Katydids, with Antenna Length on the y-axis and Abdomen Length on the x-axis, both from 1 to 10]

Page 52: A.I. Algorithms Cogs 188

[Figure: the scatter plot of the larger Grasshoppers and Katydids dataset again, with the unlabeled instance marked “????”]

Page 53: A.I. Algorithms Cogs 188

What else do we want?

• K-NN (K-Nearest Neighbors) is great!

• What is one obvious way we can improve our grasp of the classification problem?

Page 54: A.I. Algorithms Cogs 188

Let’s try to study the classification problem with some examples.

I am going to show you some classification problems which were shown to pigeons!

Let us see if you are as smart as a pigeon!

Page 55: A.I. Algorithms Cogs 188

Pigeon Problem 1

Examples of class A (left bar, right bar): (3, 4), (1.5, 5), (6, 8), (2.5, 5)

Examples of class B (left bar, right bar): (5, 2.5), (5, 2), (8, 3), (4.5, 3)

What class is this object? (8, 1.5)

What about this one, A or B? (4.5, 7)

Page 56: A.I. Algorithms Cogs 188

Pigeon Problem 1

Examples of class A (left bar, right bar): (3, 4), (1.5, 5), (6, 8), (2.5, 5)

Examples of class B (left bar, right bar): (5, 2.5), (5, 2), (8, 3), (4.5, 3)

Object to classify: (8, 1.5). This is a B!

Here is the rule: if the left bar is smaller than the right bar, it is an A; otherwise it is a B.

Page 57: A.I. Algorithms Cogs 188

Pigeon Problem 2

Examples of class A (left bar, right bar): (4, 4), (5, 5), (6, 6), (3, 3)

Examples of class B (left bar, right bar): (5, 2.5), (2, 5), (5, 3), (2.5, 3)

Object to classify: (7, 7). So this one is an A.

The rule is as follows: if the two bars are equal sizes, it is an A. Otherwise it is a B.

Page 58: A.I. Algorithms Cogs 188

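Pigeon Problem 3

Examples of class A (left bar, right bar): (4, 4), (1, 5), (6, 3), (3, 7)

Examples of class B (left bar, right bar): (5, 6), (7, 5), (4, 8), (7, 7)

Object to classify: (6, 6). This one is really hard! What is this, A or B?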

Page 59: A.I. Algorithms Cogs 188

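Pigeon Problem 3

Examples of class A (left bar, right bar): (4, 4), (1, 5), (6, 3), (3, 7)

Examples of class B (left bar, right bar): (5, 6), (7, 5), (4, 8), (7, 7)

Object to classify: (6, 6). It is a B!

The rule is as follows: if the square of the sum of the two bars is less than or equal to 100 (that is, the sum is at most 10), it is an A. Otherwise it is a B.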
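The three pigeon rules stated so far can be written as small functions and checked against the example bars from the slides. This is only a sketch; the function names and the `EXAMPLES` dictionary are introduced here for illustration.

```python
def problem_1(left, right):
    """A if the left bar is smaller than the right bar, otherwise B."""
    return "A" if left < right else "B"


def problem_2(left, right):
    """A if the two bars are equal, otherwise B."""
    return "A" if left == right else "B"


def problem_3(left, right):
    """A if the square of the sum of the bars is at most 100, otherwise B."""
    return "A" if (left + right) ** 2 <= 100 else "B"


# Labeled examples from the slides, as (left bar, right bar) pairs.
EXAMPLES = {
    problem_1: {"A": [(3, 4), (1.5, 5), (6, 8), (2.5, 5)],
                "B": [(5, 2.5), (5, 2), (8, 3), (4.5, 3)]},
    problem_2: {"A": [(4, 4), (5, 5), (6, 6), (3, 3)],
                "B": [(5, 2.5), (2, 5), (5, 3), (2.5, 3)]},
    problem_3: {"A": [(4, 4), (1, 5), (6, 3), (3, 7)],
                "B": [(5, 6), (7, 5), (4, 8), (7, 7)]},
}

# Every rule should reproduce the label of every example on its slide.
for rule, examples in EXAMPLES.items():
    for label, bars in examples.items():
        assert all(rule(left, right) == label for left, right in bars)

# The three objects the slides ask about.
print(problem_1(8, 1.5), problem_2(7, 7), problem_3(6, 6))  # B A B
```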

Page 60: A.I. Algorithms Cogs 188

Why did we spend so much time with this game?

Because we wanted to show that almost all classification problems have a geometric interpretation, check out the next 4 slides…

Page 61: A.I. Algorithms Cogs 188

Pigeon Problem 1

Examples of class A (left bar, right bar): (3, 4), (1.5, 5), (6, 8), (2.5, 5)

Examples of class B (left bar, right bar): (5, 2.5), (5, 2), (8, 3), (4.5, 3)

Here is the rule again: if the left bar is smaller than the right bar, it is an A; otherwise it is a B.

[Figure: the examples plotted with Left Bar on the y-axis and Right Bar on the x-axis, both from 1 to 10]

Page 62: A.I. Algorithms Cogs 188

Pigeon Problem 2

Examples of class A (left bar, right bar): (4, 4), (5, 5), (6, 6), (3, 3)

Examples of class B (left bar, right bar): (5, 2.5), (2, 5), (5, 3), (2.5, 3)

Let me look it up… here it is: the rule is, if the two bars are equal sizes, it is an A. Otherwise it is a B.

[Figure: the examples plotted with Left Bar on the y-axis and Right Bar on the x-axis, both from 1 to 10]

Page 63: A.I. Algorithms Cogs 188

Pigeon Problem 3

Examples of class A (left bar, right bar): (4, 4), (1, 5), (6, 3), (3, 7)

Examples of class B (left bar, right bar): (5, 6), (7, 5), (4, 8), (7, 7)

The rule again: if the square of the sum of the two bars is less than or equal to 100, it is an A. Otherwise it is a B.

[Figure: the examples plotted with Left Bar on the y-axis and Right Bar on the x-axis, both from 10 to 100]

Page 64: A.I. Algorithms Cogs 188

Pigeon Problem 4

Examples of class A (left bar, right bar): (2, 2), (1, 7), (7, 3), (3, 8)

Examples of class B (left bar, right bar): (8, 6), (7, 6), (7, 5)

The rule again: if both squares are bigger than 6, it is a B. Otherwise it is an A.

[Figure: the examples plotted with Left Bar on the y-axis and Right Bar on the x-axis, both from 10 to 100]

Page 65: A.I. Algorithms Cogs 188

Which of the “Pigeon Problems” can be solved by the Simple Linear Classifier?

1) Perfect

2) Useless

3) Perfect

4) Not so good

[Figure: the four Pigeon Problem plots, each with Left Bar on the y-axis and Right Bar on the x-axis]
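One way to see why Problems 1 and 3 are rated “Perfect” is that each rule is a test on which side of a straight line a point falls. The sketch below writes both rules as a linear score w · x + b over x = (left bar, right bar). The weights are hand-picked from the stated rules rather than learned, and the function names are assumptions made for this example.

```python
def linear_classify(x, w, b, if_nonnegative, if_negative):
    """Classify x by the sign of the linear score w . x + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return if_nonnegative if score >= 0 else if_negative


def problem_1(bars):
    # Score is left - right: B when it is >= 0, A when the left bar is smaller.
    return linear_classify(bars, w=(1, -1), b=0,
                           if_nonnegative="B", if_negative="A")


def problem_3(bars):
    # Score is 10 - (left + right): A when the sum is at most 10, else B.
    return linear_classify(bars, w=(-1, -1), b=10,
                           if_nonnegative="A", if_negative="B")


# Examples from the slides, as (left bar, right bar) pairs.
P1 = {"A": [(3, 4), (1.5, 5), (6, 8), (2.5, 5)],
      "B": [(5, 2.5), (5, 2), (8, 3), (4.5, 3)]}
P3 = {"A": [(4, 4), (1, 5), (6, 3), (3, 7)],
      "B": [(5, 6), (7, 5), (4, 8), (7, 7)]}

p1_correct = all(problem_1(x) == label for label, xs in P1.items() for x in xs)
p3_correct = all(problem_3(x) == label for label, xs in P3.items() for x in xs)
print(p1_correct, p3_correct)  # True True
```

Problem 2 is left out on purpose: its class A examples lie exactly on the line left = right, with class B examples on both sides of that line, which is consistent with the “Useless” rating for a single linear boundary.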

Page 66: A.I. Algorithms Cogs 188

Nearest neighbor: pros and cons

Pros

• Simple
• No assumptions about the distribution or shape of different classes.
• Excellent performance on a wide range of tasks
• Effective with a large training set

Cons

• Time consuming: with n training points in $\mathbb{R}^d$, the time to label a new point is $O(nd)$.
• No insight into the domain.
• Would prefer a compact classifier.
• No good way to determine the parameter “k.”
• Highly dependent on the distance measure used.

Page 67: A.I. Algorithms Cogs 188

Some Variants

• K-nearest Neighbors

– Pick the K nearest neighbors and take the majority vote.

• Parzen Window

– Pick an area around a point and look at the majority of points in that window.

• Many other variants. Nearest neighbor search is elementary but deserves proper attention. The best accuracy on the digits data comes from a variant of nearest neighbor.
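The two named variants differ only in how the voters are chosen: the K closest points, or every point inside a fixed window. A minimal sketch of both follows, reusing the insect measurements from My_Collection. Euclidean distance, a circular window, and the values k = 3 and radius = 3.0 are assumptions made for illustration.

```python
import math
from collections import Counter

# ((abdomen length, antennae length), class) from My_Collection.
TRAINING = [
    ((2.7, 5.5), "Grasshopper"), ((8.0, 9.1), "Katydid"),
    ((0.9, 4.7), "Grasshopper"), ((1.1, 3.1), "Grasshopper"),
    ((5.4, 8.5), "Katydid"), ((2.9, 1.9), "Grasshopper"),
    ((6.1, 6.6), "Katydid"), ((0.5, 1.0), "Grasshopper"),
    ((8.3, 6.6), "Katydid"), ((8.1, 4.7), "Katydid"),
]


def knn(query, training, k):
    """Majority vote among the k training points closest to `query`."""
    neighbors = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def parzen_window(query, training, radius):
    """Majority vote among all training points within `radius` of `query`.

    Returns None when the window contains no training points.
    """
    votes = Counter(label for point, label in training
                    if math.dist(query, point) <= radius)
    return votes.most_common(1)[0][0] if votes else None


query = (5.1, 7.0)  # insect 11 from the table
print(knn(query, TRAINING, k=3))
print(parzen_window(query, TRAINING, radius=3.0))
```

With these settings both variants see two Katydids and one Grasshopper near insect 11 and vote Katydid. A window that is too small can contain no points at all, which is why the sketch returns `None` in that case.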

Page 68: A.I. Algorithms Cogs 188

Distance Measures

How many clusters does this have? Which two points are the neighbors?
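Which point counts as the nearest neighbor can change with the distance measure. The sketch below compares three common measures on two hypothetical candidate points, P and Q; the points and the choice of measures are made up for illustration and do not come from the slide's figure.

```python
def euclidean(p, q):
    """Straight-line distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def chebyshev(p, q):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))


query = (0, 0)
candidates = {"P": (3, 3), "Q": (0, 5)}  # hypothetical points

nearest_by_measure = {}
for measure in (euclidean, manhattan, chebyshev):
    nearest = min(candidates, key=lambda name: measure(query, candidates[name]))
    nearest_by_measure[measure.__name__] = nearest

print(nearest_by_measure)
```

Here P is nearest under the Euclidean and Chebyshev measures, while Q is nearest under the Manhattan measure, so a nearest neighbor classifier could give different answers depending on the measure chosen.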

Page 69: A.I. Algorithms Cogs 188

A Famous Problem

R. A. Fisher’s Iris Dataset.

• 3 classes

• 50 of each class

The task is to classify Iris plants into one of 3 varieties using the Petal Length and Petal Width.

Iris Setosa Iris Versicolor Iris Virginica

Setosa

Versicolor

Virginica