Apriori.rb - LA Ruby presentation

Post on 22-Apr-2015

DESCRIPTION

Nate Murray's talk on Apriori.rb - A ruby gem/wrapper around Christian Borgelt’s apriori.c software for finding frequently purchased itemsets

TRANSCRIPT

Apriori: A Ruby wrapper for Christian Borgelt’s implementation of Agrawal et al.’s algorithm

what does it do?

picture of a grocery store

overview

• Find regularities in shopping behavior

• Market Basket Analysis

• Sets of products

suggest items to a customer

association rules

“A customer who buys apples buys cheese with 30% certainty”

Confidence

why would we want to do this?

picture of “buy this too”

Example

[Table: Customers and the Items each one purchased, marked with ✓]

problem:

too many possible rules

solution:

Don’t look at all the rules (which is how Apriori works)
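To put a number on "too many possible rules", here is a rough Ruby sketch that counts every rule that could be written over n items. It assumes a rule is any pair of non-empty itemsets with no item in common (antecedent -> consequent); the helper name count_rules is made up for this illustration.

```ruby
# Count every possible rule X -> Y over n distinct items, where X and Y are
# non-empty and share no items. Illustration only.
def count_rules(n)
  items = (1..n).to_a
  total = 0
  (1..n).each do |size|
    items.combination(size).each do |antecedent|
      rest = items - antecedent
      # Every non-empty subset of the remaining items is a possible consequent.
      total += (2**rest.size) - 1
    end
  end
  total
end

(2..6).each { |n| puts "#{n} items: #{count_rules(n)} possible rules" }
```

Three items already allow 12 rules and six items allow 602, so checking every rule for a whole store is not realistic.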

term:

Itemset: a combination of one or more items

examples of itemsets
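The product pictures from these slides did not survive, so as a stand-in, this small Ruby sketch lists every itemset for three items. The item names are placeholders (apples and cheese come from the rule quoted earlier, bread is invented), and all_itemsets is a hypothetical helper.

```ruby
# Placeholder items standing in for the product pictures on the slides.
items = %w[apples cheese bread]

# Every itemset: each non-empty combination of the items, smallest first.
def all_itemsets(items)
  (1..items.size).flat_map { |size| items.combination(size).to_a }
end

all_itemsets(items).each { |itemset| puts "{#{itemset.join(', ')}}" }
```

For three items this prints seven itemsets: three single items, three pairs, and the one set of all three.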

Step 1) Build a prefix tree

[Figure: prefix tree, built up over several slides]
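A prefix tree for Step 1 could look roughly like the Ruby sketch below. This is an assumption-laden illustration and not the layout used inside apriori.c: PrefixNode, build_prefix_tree and count_for are invented names, and the transactions are made-up sample data.

```ruby
# A node in a prefix tree (trie) of itemsets. The path from the root to a node
# spells out an itemset in sorted order; count is the number of transactions
# that contain that itemset.
class PrefixNode
  attr_reader :children
  attr_accessor :count

  def initialize
    @children = {}
    @count = 0
  end
end

# Build the tree by counting every itemset of up to max_size items in every
# transaction. Items are sorted so {a, b} and {b, a} share one path.
def build_prefix_tree(transactions, max_size)
  root = PrefixNode.new
  transactions.each do |transaction|
    items = transaction.uniq.sort
    (1..[max_size, items.size].min).each do |size|
      items.combination(size).each do |itemset|
        node = itemset.reduce(root) { |n, item| n.children[item] ||= PrefixNode.new }
        node.count += 1
      end
    end
  end
  root
end

# Look up the count stored for an itemset (0 if it never occurred).
def count_for(root, itemset)
  node = itemset.sort.reduce(root) { |n, item| n && n.children[item] }
  node ? node.count : 0
end

# Hypothetical sample data: one array of purchased items per customer.
transactions = [
  %w[apples cheese bread],
  %w[apples cheese],
  %w[apples],
  %w[bread milk],
  %w[bread]
]
tree = build_prefix_tree(transactions, 2)
puts count_for(tree, %w[cheese apples])
```

Itemsets that start with the same items share the same branch, which is what makes the counts cheap to store and look up.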

Step 2) Prune statistically insignificant rules

statistically significant

term:

Support: the percentage of transactions that a rule/itemset can be applied to

[Example table: an itemset that applies to 3 of the 5 transactions]

Support: 3/5 = 60%

[Example table: another itemset that applies to 2 of the 5 transactions]

Support: 2/5 = 40%
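Support is simple to compute by hand, and a minimal Ruby sketch makes the definition concrete. The transactions here are invented sample data, chosen only so that the results come out as 3/5 and 2/5; they are not the table from the slides.

```ruby
require 'set'

# Support: the fraction of transactions that contain every item of the itemset.
def support(transactions, itemset)
  wanted = itemset.to_set
  hits = transactions.count { |t| wanted.subset?(t.to_set) }
  hits.to_f / transactions.size
end

# Hypothetical sample data: one array of purchased items per customer.
transactions = [
  %w[apples cheese bread],
  %w[apples cheese],
  %w[apples],
  %w[bread milk],
  %w[bread]
]

puts support(transactions, %w[apples])        # contained in 3 of 5 transactions
puts support(transactions, %w[apples cheese]) # contained in 2 of 5 transactions
```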

The key optimization:

support of: [itemset] = 40%

support of: [itemset] + [another item] <= 40%
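Because adding an item can never raise support, an itemset that is already too rare never needs to be extended. The Ruby sketch below shows that level-by-level search under stated assumptions: it is a plain-Ruby illustration and not the gem's code, frequent_itemsets is an invented name, and the transactions are made-up sample data.

```ruby
require 'set'

# Fraction of transactions that contain every item of the itemset.
def support(transactions, itemset)
  transactions.count { |t| (itemset - t).empty? } / transactions.size.to_f
end

# Level-wise search: only itemsets that reached min_support are extended by
# one more item, since adding an item can never raise support.
def frequent_itemsets(transactions, min_support)
  items = transactions.flatten.uniq.sort
  level = items.map { |i| [i] }.select { |s| support(transactions, s) >= min_support }
  result = []
  until level.empty?
    result.concat(level)
    frequent = level.to_set
    # Extend each itemset with items that sort after its last item.
    candidates = level.flat_map do |itemset|
      items.select { |i| i > itemset.last }.map { |i| itemset + [i] }
    end
    # Prune: every subset that is one item smaller must itself be frequent.
    candidates.select! do |c|
      c.combination(c.size - 1).all? { |sub| frequent.include?(sub) }
    end
    level = candidates.select { |c| support(transactions, c) >= min_support }
  end
  result
end

# Hypothetical sample data: one array of purchased items per customer.
transactions = [
  %w[apples cheese bread],
  %w[apples cheese],
  %w[apples],
  %w[bread milk],
  %w[bread]
]
frequent_itemsets(transactions, 0.3).each { |s| puts s.join(', ') }
```

In this sample milk appears in only one transaction, so no itemset containing milk is ever counted beyond the first level.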

Step 3) Find “good” rules

need a way to calculate “goodness”

term:

Confidence: number of cases in which the rule is correct relative to the number of cases in which it is applicable
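In code, that definition comes down to two counts. The following is a minimal Ruby sketch with invented sample data, picked so the two rules come out as 2/2 and 1/3; confidence is a hypothetical helper, not part of the gem.

```ruby
# Confidence of the rule antecedent -> consequent: among the transactions that
# contain the antecedent (where the rule is applicable), the fraction that
# also contain the consequent (where the rule is correct).
def confidence(transactions, antecedent, consequent)
  applicable = transactions.select { |t| (antecedent - t).empty? }
  return 0.0 if applicable.empty?

  correct = applicable.count { |t| (consequent - t).empty? }
  correct.to_f / applicable.size
end

# Hypothetical sample data: one array of purchased items per customer.
transactions = [
  %w[apples cheese bread],
  %w[apples cheese],
  %w[apples],
  %w[bread milk],
  %w[bread]
]

puts confidence(transactions, %w[cheese], %w[apples]) # correct in 2 of 2 cases
puts confidence(transactions, %w[apples], %w[bread])  # correct in 1 of 3 cases
```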

[Example table: a rule (->) that is correct in 2 of the 2 transactions it applies to]

Confidence: 2/2 = 100%

[Example table: a rule (->) that is correct in 1 of the 3 transactions it applies to]

Confidence: 1/3 = 33%
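Step 3 can then be sketched as: take an itemset, split it into every antecedent -> consequent pair, and keep the rules whose confidence is high enough. The Ruby below is an illustration under assumptions, not the gem's implementation; good_rules, count_containing and the sample data are all invented.

```ruby
# Number of transactions that contain every item of the itemset.
def count_containing(transactions, itemset)
  transactions.count { |t| (itemset - t).empty? }
end

# Split one itemset into every antecedent -> consequent pair and keep the
# rules whose confidence reaches min_confidence.
# Returns an array of [antecedent, consequent, confidence] triples.
def good_rules(transactions, itemset, min_confidence)
  whole = count_containing(transactions, itemset)
  rules = []
  (1...itemset.size).each do |size|
    itemset.combination(size).each do |antecedent|
      applicable = count_containing(transactions, antecedent)
      next if applicable.zero?

      conf = whole.to_f / applicable
      rules << [antecedent, itemset - antecedent, conf] if conf >= min_confidence
    end
  end
  rules
end

# Hypothetical sample data: one array of purchased items per customer.
transactions = [
  %w[apples cheese bread],
  %w[apples cheese],
  %w[apples],
  %w[bread milk],
  %w[bread]
]

good_rules(transactions, %w[apples cheese], 0.8).each do |antecedent, consequent, conf|
  puts "#{antecedent.join(', ')} -> #{consequent.join(', ')} (#{(conf * 100).round}%)"
end
```

With this sample data and an 80% threshold, only cheese -> apples survives; apples -> cheese is correct in 2 of 3 cases and is dropped.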

intermission

Apriori (in Ruby)

code example

available today

gem install apriori

requires: rubygems >= 1.2.0

gem update --system

Apriori: A Ruby wrapper for Christian Borgelt’s implementation of Agrawal et al.’s algorithm
