Chapter 9: Matching and Ranking Cases


Upload: adonai

Post on 05-Feb-2016


Page 1: Chapter 9: Matching and Ranking Cases

CS 682, AI:Case-Based Reasoning, Prof. Cindy Marling 1

Chapter 9: Matching and Ranking Cases

• Matching is the process of comparing two cases to each other and determining their degree of similarity

• Ranking is the process of ordering partially matching cases according to the goodness of match, or usefulness

• To compute the degree of match between cases, you need to:

• Determine which features of two cases correspond to each other

• Compute the degree of match between each pair of corresponding features

• Determine how important each feature is in assigning an overall degree of match

Page 2: Chapter 9: Matching and Ranking Cases

Types of Matching Schemes

• Dimensional matching is the ability to compare two individual features

• Aggregate matching is the ability to compare two whole cases

• Aggregate matching involves dimensional matching

• Dimensional matching can be used alone, as in traversing a hierarchical memory structure and comparing the dimensions stored at each node

• In static matching schemes, the matching criteria are established in advance and hard coded

• In dynamic matching schemes, the criteria may change according to the present purpose

• Sometimes, you can hardcode different schemes and choose among them dynamically

• Sometimes, important features are determined on the fly, during situation assessment

• Some flexibility can be achieved by determining the important features in advance, but weighting them differently each time

Page 3: Chapter 9: Matching and Ranking Cases

Types of Matching Schemes (continued)

• In absolute matching, you compute a score for how well each case matches the new one, independently of all the other cases

• In relative matching, you arrange the cases in order from best to worst, without quantifying the goodness of each one

• This requires dynamically comparing and contrasting cases to each other, and so is more difficult than absolute matching

• If you use absolute matching, ranking becomes trivial

• Any sort routine can arrange cases from best to worst based on their absolute scores
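As a minimal sketch of why ranking is trivial under absolute matching, the snippet below sorts recalled cases by score. The case names and scores are hypothetical values chosen only for illustration.

```python
def rank_cases(scores):
    """Return case names ordered from best to worst absolute score."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical absolute match scores in [0, 1] for three recalled cases.
scores = {"case_a": 0.40, "case_b": 0.85, "case_c": 0.65}
ranking = rank_cases(scores)
print(ranking)
```

Because each score is computed independently of the other cases, no case-to-case comparison is needed beyond what the sort routine does with the numbers.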

Page 4: Chapter 9: Matching and Ranking Cases

Input to Matching and Ranking Functions

• The inputs to matching and ranking functions are:

1) The new case, analyzed in terms of its important features, or indexes

2) The purpose you have in using the new case -- you can skip this if your system always performs the same task

3) The recalled cases -- this may be a subset of the case base or all cases

4) The indexes of the recalled cases

5) Reasonable criteria for determining the goodness of match

• You may want the best case, all relevant cases, or any case that could be adapted to your purpose

Page 5: Chapter 9: Matching and Ranking Cases

Feature Correspondence

• To match and rank cases, you need to know which features correspond to each other

• In some domains, this is very easy

• Example: To help a buyer select a new car, desired price corresponds to actual price, desired make and model correspond to actual make and model, and so on

• In some domains, this can be hard

• CASEY had to compute correspondences, because its problem description was just a list of patient symptoms -- symptoms that were not identical could still correspond, due to the nature of heart disease

Page 6: Chapter 9: Matching and Ranking Cases

Computing Similarity Among Corresponding Features

• Next, you need to determine how similar the values are for corresponding features

• You are usually looking for some measure of distance on a qualitative or quantitative scale

• Most systems hard code this on a feature by feature basis

• Example: A user might be asked if a desired restaurant should be Inexpensive, Moderate, Expensive, or Very Expensive

• This might be translated to < $15, $15 - $30, $30 - $50, and > $50, with each restaurant classified as belonging to one category

• If a user asks for a category, say Moderate, and a restaurant is in that category, we have an exact match

• If the restaurant is one category away, say Inexpensive or Expensive, we have a partial match

• If the restaurant is more than one category away, we have no match

• You can use four or five categories if it makes sense for your domain
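The restaurant example can be sketched as a hard-coded, per-feature comparison. The function names, the 0.5 score for a partial match, and the handling of the boundary prices ($15 and $30 fall into the higher category, $50 into Expensive) are assumptions made for this sketch; the slide does not specify them.

```python
CATEGORIES = ["Inexpensive", "Moderate", "Expensive", "Very Expensive"]

def price_category(price):
    """Map a dollar amount to one of the four categories (boundaries assumed)."""
    if price < 15:
        return "Inexpensive"
    if price < 30:
        return "Moderate"
    if price <= 50:
        return "Expensive"
    return "Very Expensive"

def category_match(desired, actual):
    """Score 1.0 for the same category, 0.5 for a neighbor, 0.0 otherwise."""
    distance = abs(CATEGORIES.index(desired) - CATEGORIES.index(actual))
    if distance == 0:
        return 1.0  # exact match
    if distance == 1:
        return 0.5  # partial match (assumed value)
    return 0.0      # no match

print(category_match("Moderate", price_category(12)))
```

Keeping the categories in an ordered list means the "distance" between two categories is just the difference of their positions.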

Page 7: Chapter 9: Matching and Ranking Cases

Numeric Features

• When values for features are naturally represented as numbers, you may still need special routines for comparing the numbers

• Pitfalls to avoid are absolute comparisons and fixed ranges

• Example: Say your feature is age and two patients are ten years apart

• If one is 60 and the other 70, you have at least a partial match and maybe a pretty good match

• If one is 1 and the other 11, you may have no match at all

• Example: Say you try to set up ranges, like young < 30, old > 50, and middle-aged everything in between

• Then, 31-year-olds will match 49-year-olds better than they match 29-year-olds

• We deal with this using normalization and/or point ranges

• We could say ages within 5 years are close matches for adults and ages within 1 year are close matches for children
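One way to express the age rule above is a tolerance that depends on whether the patients are adults or children. This is a sketch only: the adult threshold of 18, the 0.5 score for a gap within twice the tolerance, and the choice to apply the child tolerance when either patient is a child are all assumptions.

```python
def age_match(age_a, age_b, adult_age=18):
    """Score age similarity with a 5-year tolerance for adults, 1 year for children."""
    # Use the stricter child tolerance if either patient is a child (assumption).
    tolerance = 5 if min(age_a, age_b) >= adult_age else 1
    gap = abs(age_a - age_b)
    if gap <= tolerance:
        return 1.0  # close match
    if gap <= 2 * tolerance:
        return 0.5  # partial match (assumed band)
    return 0.0      # no match

print(age_match(60, 70), age_match(1, 11), age_match(31, 29), age_match(31, 49))
```

With these assumed settings, 60 and 70 give a partial match, 1 and 11 give no match, and a 31-year-old matches a 29-year-old better than a 49-year-old, which avoids the fixed-range problem described above.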

Page 8: Chapter 9: Matching and Ranking Cases

Abstraction Hierarchies

• When qualitative or quantitative measures don’t suit your domain, you may need to organize values hierarchically

• How can a system tell if spinach is closer to broccoli or to hamburger?

• In general, the higher up you have to go in a hierarchy to find a node in common, the worse match you have

• Exactly how you traverse the hierarchy to find degree of match will depend on your domain

Page 9: Chapter 9: Matching and Ranking Cases

Example: Abstraction Hierarchy

Food

• Meat: Lamb, Beef, Pork, Veal, with the cuts Steak, Hamburger, Chop, and Roast one level below

• Vegetable: Green (Spinach, Broccoli, Peas) and Yellow (Squash)

• Fruit: Citrus (Orange) and Berry (Strawberry)
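A rough sketch of hierarchy-based matching counts how far up the tree you must climb from one value before reaching a node that is also an ancestor of the other. Only part of the hierarchy is encoded here, and placing Hamburger under Beef is an assumption, since the transcript does not preserve the figure's edges. The names `PARENT`, `path_to_root`, and `steps_to_common_ancestor` are hypothetical.

```python
# Child -> parent links for part of the food hierarchy.
PARENT = {
    "Meat": "Food", "Vegetable": "Food", "Fruit": "Food",
    "Beef": "Meat", "Green": "Vegetable", "Yellow": "Vegetable",
    "Hamburger": "Beef",  # assumed parent; not recoverable from the transcript
    "Spinach": "Green", "Broccoli": "Green", "Peas": "Green",
    "Squash": "Yellow",
}

def path_to_root(node):
    """Return the list of nodes from `node` up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def steps_to_common_ancestor(a, b):
    """Count the levels climbed from `a` to reach a node shared with `b`."""
    ancestors_b = set(path_to_root(b))
    for steps, node in enumerate(path_to_root(a)):
        if node in ancestors_b:
            return steps
    raise ValueError("no common ancestor")

print(steps_to_common_ancestor("Spinach", "Broccoli"))
print(steps_to_common_ancestor("Spinach", "Hamburger"))
```

A smaller count means a better match, so under these assumptions spinach is closer to broccoli than to hamburger. How the count is turned into a degree of match between 0 and 1 is left open, since that depends on the domain.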

Page 10: Chapter 9: Matching and Ranking Cases

Importance of Features

• Besides considering how well features match, we also need to consider how important each feature is

• In an e-commerce application, you could ask the user how important each feature is to them

• In some systems, the same features keep the same importance

• In other systems, features change in importance depending on the task at hand

• Kolodner uses the example of determining a salary for a professional baseball player

• If you want to hire a fielder, then how well he bats is important

• If you want to hire a pitcher, then batting is unimportant, but the speed of his fastball becomes very important
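Task-dependent importance can be sketched as a lookup from the current purpose to a set of feature weights. The feature names and the specific weight values are assumptions for illustration; in particular, the source does not say how important fastball speed is for a fielder, so 0.0 is a guess.

```python
# Hypothetical importance weights (0 = unimportant, 1 = utmost importance).
IMPORTANCE_BY_TASK = {
    "fielder": {"batting": 1.0, "fastball_speed": 0.0},  # fastball weight assumed
    "pitcher": {"batting": 0.0, "fastball_speed": 1.0},
}

def importance_for(task):
    """Return the feature weights to use for the given hiring task."""
    return IMPORTANCE_BY_TASK[task]

print(importance_for("pitcher"))
```

The same features are stored for every case; only the weights applied to them change with the task at hand.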

Page 11: Chapter 9: Matching and Ranking Cases

Matching and Importance

• We need to get a handle on two things at once

• How closely features match

• How important it is for them to match

• Numeric conventions are used to indicate degree of match and degree of importance

• 0 means no match, 1 means exact match, and numbers in between indicate degree of partial match

• 0 means unimportant, 1 means of utmost importance, and numbers in between indicate degree of importance

• Note: This is not something that can be carried out to many decimal places. We often use rough estimates like 0.25, 0.5, and 0.75.

• The nearest neighbor algorithm is often used in practice to combine feature similarity and importance
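One common way to combine the two conventions is a weighted average: multiply each feature's degree of match by its importance, sum the products, and divide by the total importance. The sketch below assumes that formulation, which this transcript does not spell out; the feature names and values are hypothetical.

```python
def aggregate_match(similarities, weights):
    """Weighted average of per-feature similarities, each in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one feature must have nonzero importance")
    weighted_sum = sum(weights[f] * similarities[f] for f in weights)
    return weighted_sum / total_weight

# Hypothetical degrees of match and importance for one old case.
similarities = {"price": 1.0, "cuisine": 0.5, "distance": 0.0}
weights = {"price": 0.5, "cuisine": 1.0, "distance": 0.5}
score = aggregate_match(similarities, weights)
print(score)
```

Dividing by the total importance keeps the result between 0 and 1, so scores for different old cases can be compared directly and then sorted to rank them.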

Page 12: Chapter 9: Matching and Ranking Cases

Page 13: Chapter 9: Matching and Ranking Cases

Page 14: Chapter 9: Matching and Ranking Cases

Page 15: Chapter 9: Matching and Ranking Cases

Pitfalls of Applying Nearest Neighbor

• The example was purposely chosen to point out some pitfalls of the nearest neighbor algorithm

• The obvious problem is that the three old cases are equally similar to the new case. When this happens:

• It’s possible that the cases really are very similar, and our case base is just too small to contain distinguishable players

• It’s also possible that we’re comparing the wrong features, computing the wrong comparison values, or using the wrong importance weights

• In our example, we’re using the wrong features for comparison

• We are comparing RBIs and strikeouts without considering how many games a player has been in or how many at bats he’s had

• We need to consider the ratio of successful attempts to opportunities

• Moral of Story: It’s easy to crunch numbers, but it’s not easy to know which numbers to crunch
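The ratio of successes to opportunities can be sketched as follows. The two players and their statistics are hypothetical numbers invented for this illustration, not data from the example discussed above.

```python
def rate(successes, opportunities):
    """Return successes per opportunity, or 0.0 when there were no opportunities."""
    if opportunities == 0:
        return 0.0
    return successes / opportunities

# Hypothetical season totals for two players.
players = {
    "veteran": {"rbis": 60, "at_bats": 600},
    "rookie": {"rbis": 30, "at_bats": 150},
}
rbi_rates = {name: rate(s["rbis"], s["at_bats"]) for name, s in players.items()}
print(rbi_rates)
```

On raw RBIs the veteran looks twice as productive, but per at bat the rookie's rate is higher, which is the kind of difference that comparing raw counts hides.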