Association Rules (market basket analysis)

Upload: hammett-douglas

Post on 31-Dec-2015

TRANSCRIPT

Page 1: Association Rules (market basket analysis)

Association Rules (market basket analysis)

Retail shops are often interested in associations between different items that people buy.

• Someone who buys bread is quite likely also to buy milk.

• A person who bought the book Database System Concepts is quite likely also to buy the book Operating System Concepts.

Association information can be used in several ways.

• E.g. when a customer buys a particular book, an online shop may suggest associated books.

Association rules:

• bread ⇒ milk

• DB-Concepts, OS-Concepts ⇒ Networks

• Left hand side: antecedent, right hand side: consequent

• An association rule must have an associated population; the population consists of a set of instances

• E.g. each transaction (sale) at a shop is an instance, and the set of all transactions is the population

Page 2: Association Rules (market basket analysis)

Association Rule Definitions

Set of items: I={I1,I2,…,Im}

Transactions: D={t1,t2, …, tn}, tj ⊆ I

Itemset: {Ii1,Ii2, …, Iik} ⊆ I

Support of an itemset: Percentage of transactions which contain that itemset.

Large (Frequent) itemset: Itemset whose number of occurrences is above a threshold.
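
The definitions above can be sketched in a few lines of Python. This is only an illustration: the helper names `support` and `is_large`, the sample transactions, and the choice to treat "at or above the threshold" as large are all assumptions, not part of the slides.

```python
from fractions import Fraction

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def is_large(itemset, transactions, min_support):
    """True when the itemset's support reaches the user-chosen threshold."""
    return support(itemset, transactions) >= min_support

# Hypothetical transactions, for illustration only.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Jelly"},
    {"Beer", "Milk"},
    {"Bread", "Milk", "Jelly"},
]

bread_milk = support({"Bread", "Milk"}, transactions)  # 2 of 4 transactions
print(bread_milk, is_large({"Bread"}, transactions, Fraction(1, 2)))
```

Each transaction is modelled as a set, so "the transaction contains the itemset" becomes a plain subset test.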

Page 3: Association Rules (market basket analysis)

Association Rules Example

I = { Beer, Bread, Jelly, Milk, PeanutButter}

Page 4: Association Rules (market basket analysis)

Association Rule Definitions

Association Rule (AR): implication X ⇒ Y where X,Y ⊆ I and X ∩ Y = ∅ (the null set);

Support of AR (s) X ⇒ Y: Percentage of transactions that contain X ∪ Y

Confidence of AR (α) X ⇒ Y: Ratio of number of transactions that contain X ∪ Y to the number that contain X
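
Written as formulas, the two measures look as follows. The set-builder notation and the use of |·| for a count of transactions are assumptions added here; the slide states the definitions in words, with s for support and α for confidence.

```latex
s(X \Rightarrow Y) = \frac{\left|\{\, t \in D : X \cup Y \subseteq t \,\}\right|}{|D|},
\qquad
\alpha(X \Rightarrow Y) = \frac{\left|\{\, t \in D : X \cup Y \subseteq t \,\}\right|}{\left|\{\, t \in D : X \subseteq t \,\}\right|}
```

Both share the same numerator; only the denominator changes, from all transactions to those that contain X.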

Page 5: Association Rules (market basket analysis)

Association Rules Ex (cont’d)

Page 6: Association Rules (market basket analysis)

Association Rules Ex (cont’d)

Support of Bread ⇒ PeanutButter: of the 5 transactions, 3 involve both Bread and PeanutButter, so 3/5 = 60%.

Confidence of Bread ⇒ PeanutButter: of the 4 transactions that involve Bread, 3 also involve PeanutButter, so 3/4 = 75%.
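
The arithmetic can be checked with a short script. The transaction table from the slide did not survive in this transcript, so the five transactions below are an assumption, chosen only to match the stated counts (5 transactions, 4 with Bread, 3 with both Bread and PeanutButter). The function names are likewise illustrative.

```python
from fractions import Fraction

# Illustrative transactions (an assumption, see the note above).
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rule_support(x, y, transactions):
    """Support of X => Y: share of all transactions containing X union Y."""
    return Fraction(count(set(x) | set(y), transactions), len(transactions))

def rule_confidence(x, y, transactions):
    """Confidence of X => Y: transactions with X union Y over those with X."""
    return Fraction(count(set(x) | set(y), transactions), count(x, transactions))

s = rule_support({"Bread"}, {"PeanutButter"}, transactions)
c = rule_confidence({"Bread"}, {"PeanutButter"}, transactions)
print(f"support = {float(s):.0%}, confidence = {float(c):.0%}")
```

Note that `rule_confidence` raises `ZeroDivisionError` if no transaction contains X; a fuller version would handle that case explicitly.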

Page 7: Association Rules (market basket analysis)

Association Rule Problem

Given a set of items I={I1,I2,…,Im} and a database of transactions D={t1,t2, …, tn} where ti={Ii1,Ii2, …, Iik} and Iij ∈ I, the Association Rule Problem is to identify all association rules X ⇒ Y with a minimum support and confidence (supplied by user).

NOTE: Support of X ⇒ Y is the same as support of X ∪ Y.

Page 8: Association Rules (market basket analysis)

Association Rule Algorithm (Basic Idea)

1. Find Large Itemsets.

2. Generate rules from frequent itemsets.

This is the simple naïve algorithm; better algorithms exist.

Page 9: Association Rules (market basket analysis)

Association Rule Algorithm

We are generally only interested in association rules with reasonably high support (e.g. support of 2% or greater)

Naïve algorithm

1. Consider all possible sets of relevant items.

2. For each set find its support (i.e. count how many transactions contain all items in the set).

• Large itemsets: sets with sufficiently high support

• Use large itemsets to generate association rules.

• From itemset A generate the rule A - {b} ⇒ b for each b ∈ A.

• Support of rule = support(A).

• Confidence of rule = support(A) / support(A - {b})
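
A minimal sketch of the naïve search described above: enumerate every non-empty set of items and keep those with enough support. The function name `naive_large_itemsets`, the sample transactions, and the 50% threshold are assumptions for illustration.

```python
from itertools import combinations

def naive_large_itemsets(transactions, min_support):
    """Enumerate every non-empty itemset and keep those with enough support.

    The number of candidates is exponential in the number of distinct
    items, so this is only usable on toy data.
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)
    large = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            hits = sum(1 for t in transactions if cand <= t)
            if hits / n >= min_support:
                large[cand] = hits / n
    return large

# Hypothetical transactions, for illustration only.
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Butter", "Milk"},
    {"Bread", "Butter", "Milk"},
]

large = naive_large_itemsets(transactions, min_support=0.5)
for itemset, sup in sorted(large.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

With m distinct items there are 2^m - 1 candidate sets, which is why better algorithms are needed in practice.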

Page 10: Association Rules (market basket analysis)

• From itemset A generate the rule A - {b} ⇒ b for each b ∈ A.

• Support of rule = support(A).

• Confidence of rule = support(A) / support(A - {b})

Let's say itemset A = {Bread, Butter, Milk}

Then A - {b} ⇒ b for each b ∈ A includes 3 possibilities:

{Bread, Butter} ⇒ Milk

{Bread, Milk} ⇒ Butter

{Butter, Milk} ⇒ Bread
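
The same enumeration can be sketched in Python. The function name `rules_from_itemset`, the `support_of` lookup table, and the support values in it are all hypothetical; they are not taken from the slides.

```python
from fractions import Fraction

def rules_from_itemset(itemset, support_of):
    """Yield (antecedent, consequent, support, confidence) for each b in A.

    `support_of` is a hypothetical lookup from frozenset to support.
    """
    a = frozenset(itemset)
    for b in sorted(a):
        antecedent = a - {b}
        if not antecedent:
            continue  # a rule needs a non-empty left hand side
        confidence = support_of[a] / support_of[antecedent]
        yield antecedent, b, support_of[a], confidence

# Made-up support values, for illustration only.
support_of = {
    frozenset({"Bread", "Butter", "Milk"}): Fraction(2, 5),
    frozenset({"Bread", "Butter"}): Fraction(3, 5),
    frozenset({"Bread", "Milk"}): Fraction(1, 2),
    frozenset({"Butter", "Milk"}): Fraction(4, 5),
}

rules = list(rules_from_itemset({"Bread", "Butter", "Milk"}, support_of))
for antecedent, consequent, sup, conf in rules:
    print(sorted(antecedent), "=>", consequent, sup, conf)
```

Every rule from the same itemset has the same support; only the confidence differs, because the denominator depends on the antecedent.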

Page 11: Association Rules (market basket analysis)

Apriori

Large Itemset Property:

Any subset of a large itemset is large.

Contrapositive:

If an itemset is not large,

none of its supersets are large.
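
One way the large itemset property can be used is a level-wise search that only counts candidates whose subsets are all large. The sketch below is a simplified illustration rather than a faithful reproduction of any published Apriori implementation. The function name, the sample transactions (chosen to match the counts in the earlier Bread/PeanutButter example), and the 40% threshold are assumptions.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise search for large itemsets using the large itemset property."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: large single items.
    items = set().union(*transactions)
    current = {frozenset([i]) for i in items if sup(frozenset([i])) >= min_support}
    large = {}
    k = 1
    while current:
        for s in current:
            large[s] = sup(s)
        k += 1
        # Join: unions of two large (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop any candidate with a (k-1)-subset that is not large.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count support only for the surviving candidates.
        current = {c for c in candidates if sup(c) >= min_support}
    return large

# Illustrative transactions (an assumption).
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

result = apriori(transactions, min_support=0.4)
```

Here Jelly is not large, so no itemset containing Jelly is ever counted, which is exactly the contrapositive stated above.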

Page 12: Association Rules (market basket analysis)

Large Itemset Property

Page 13: Association Rules (market basket analysis)

Large Itemset Property

If B is not frequent, then none of the supersets of B can be frequent.

If {ACD} is frequent, then all two-item subsets of {ACD} ({AC}, {AD}, {CD}) must be frequent.

If {ACD} is frequent, then all single-item subsets of {ACD} ({A}, {C}, {D}) must be frequent.

Page 14: Association Rules (market basket analysis)

My Personal View of Association Rules

A vastly overstudied problem, of dubious utility.

Page 15: Association Rules (market basket analysis)

Student Presentations

Starting next week, students will be giving presentations.

A presentation can be on:

• The student project

• A paper chosen by the student (subject to my approval)

The presentation should last 8 to 15 minutes. You need to tell me in advance how long the talk will be.

You must email me the slides by midnight, before the talk

There will be a signup sheet (topic and date) on my door tomorrow.

Page 16: Association Rules (market basket analysis)

Tips for Giving a Good Talk

Winter 2003

Dr Eamonn Keogh

Computer Science & Engineering Department

University of California - Riverside

Riverside, CA [email protected]

Modified from the notes of Edward R. Tufte, Craig S. Kaplan, Eamonn Keogh and others

Page 17: Association Rules (market basket analysis)

Outline

Advice on giving talks

• General advice

• Organization

• Making clear overheads

• Avoiding common pitfalls

Conclusion

Page 18: Association Rules (market basket analysis)

General Advice I

• Show up early. You may have a chance to head off some technical or ergonomic problem.

• Have a backup plan. If your lecture is based on a PowerPoint presentation, have overhead backups of each page.

• Check out the room ahead of time. Before your talk, make sure it has everything you need.

Page 19: Association Rules (market basket analysis)

General Advice II

• Never apologize. Most people wouldn’t have noticed the issues for which you’re apologizing, and it just sounds lame.

• Invest in a laser pointer. They are inexpensive and extremely useful.

• Rehearse timing. Poor timing is the most common sin!

Page 20: Association Rules (market basket analysis)

Overheads I

• Use large fonts. Use the biggest fonts realistically possible. Small fonts are hard to read.

• Use highly contrasting colors.

• Avoid busy backgrounds. Too much in the background makes the text hard to read.

Page 21: Association Rules (market basket analysis)

Overheads II

• Avoid using red text. Red text is often hard to read.

• AVOID ALL CAPS! All caps look like you're shouting.

• Include a good combination of words, pictures, and graphics. Variety keeps the presentation interesting.

Page 22: Association Rules (market basket analysis)

Overheads III

• Be terse

  • Wordy: The sales forecasts show an increase on the horizon.

  • Terse: Sales are up.

• Use bullets or numbered items appropriately

Goals

• Ease of use

• Reusability

• Reliability

Outline of our method

1. Design

2. Implementation

3. Testing

Page 23: Association Rules (market basket analysis)

Overheads IV

• Begin with an introduction slide (who you are, why you are giving a talk, the title of the talk)

• Next, give an outline (“roadmap”). For a short talk, you might want to combine this with the introduction slide

• State your point (one simple slide)

• Demonstrate your point (a few slides)

• Review your point (one simple slide)

Page 24: Association Rules (market basket analysis)

Overheads V

• End with a slide that reviews the entire talk…

  • We introduced the TSP problem

  • We explained why it is an important problem

  • We explained why it is a hard problem

  • We introduced a new heuristic to solve TSP

  • We empirically demonstrated the utility of our approach

• End “cleanly”, don’t fade away.

Page 25: Association Rules (market basket analysis)

Overheads VI

• Avoid using “standard” clipart, backgrounds, etc.

I have seen this at least 20 times in conference presentations.

Page 26: Association Rules (market basket analysis)

Overheads VII

• Be careful with acronyms…

[Figure: example slide crowded with unexplained labels such as C_max, C_min, Range_i, Diameter_i, R1, D1, R2, D2 and “Neighboring Unlabeled Token”]

Page 27: Association Rules (market basket analysis)

Annoying Personal Habits I (This means you)

• Playing with jewelry

• Licking and/or biting your lips

• Constantly adjusting your glasses

• Popping the top of a pen

• Playing with facial hair (men)

• Playing with/twirling your hair (women)

Page 28: Association Rules (market basket analysis)

Annoying Personal Habits II (This means you)

• Jingling change in your pocket

• Leaning against anything for support

• Fillers: “ah”, “um”, and “and”

• Starting every sentence with the same word

• Sticky floor syndrome

• Avoiding eye contact

• Lack of enthusiasm

“Basically” and “essentially” seem to be the current favorites.

Page 29: Association Rules (market basket analysis)

Conclusion

• We have motivated the need for a high-quality talk

• We have seen various tips on creating high-quality overheads

• We have seen various hints on avoiding common pitfalls