Market Basket Analysis
Introduction to Market Basket Analysis and a SAS Implementation of the Apriori Algorithm
Bill Qualls
DePaul University, Spring 2013
ECT584 – Web Data Mining for Business Intelligence
Professor Jonathan Gemmell
Outline
• Introduction
• Sample market basket
• Support
• Pairs
• Iterations
• Association Rules
• Confidence
• Lift
• SAS macro
Introduction
"An apocryphal early illustrative example for this was when one super market chain discovered in its analysis that customers that bought diapers often bought beer as well, have put the diapers close to beer coolers, and their sales increased dramatically. Although this urban legend is only an example that professors use to illustrate the concept to students, the explanation of this imaginary phenomenon might be that fathers that are sent out to buy diapers often buy a beer as well, as a reward." (Retrieved May 5, 2013 from http://en.wikipedia.org/wiki/Market_basket)
Items Sold
bananas, bologna, bread, buns, butter, cereal, cheese, chips,
eggs, hotdogs, mayo, milk, mustard, oranges, pickles, soda
Sales Transactions
#1 bread, butter, eggs, milk
#2 bologna, bread, cheese, chips, mayo, soda
#3 bananas, bread, butter, cheese, oranges
#4 buns, chips, hotdogs, mustard, soda
#5 buns, chips, hotdogs, mustard, pickles, soda
#6 bread, butter, cereal, eggs, milk
#7 bananas, cereal, eggs, milk, oranges
#8 bologna, bread, buns, cheese, chips, hotdogs, mayo, mustard, soda
#9 bananas, bologna, bread, cheese, milk, oranges, soda
#10 bread, butter, cereal, eggs, milk
#11 bananas, chips, soda
#12 bread, butter, eggs, milk, oranges
#13 bananas, bologna, bread, cheese, mayo, mustard
#14 bread, cereal, eggs, milk
#15 bologna, bread, cheese, chips, mayo, mustard, soda
#16 bread, butter, eggs, milk, oranges
#17 buns, chips, hotdogs, soda
#18 buns, cheese, chips, hotdogs, mustard, soda
#19 chips, pickles, soda
#20 bologna, bread, cheese, chips, mayo, mustard, soda
Our customers are trained to shop alphabetically. ☺
Creating the Sales dataset
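/* One observation per (transaction id, item) pair. */
data work.sales;
  input tid item $;  /* tid: numeric transaction id; item: character */
  datalines;
1 bread
1 butter
1 eggs
1 milk
2 bologna
2 bread
…
20 mustard
20 soda
;
run;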
Support
• The support of an item is the number of transactions containing that item.
• Minimum support is one of the parameters to the MBA macro.
• Items not meeting the minimum support criterion are excluded from further analysis.
• Support can be expressed as a count, or as a percentage of all transactions.
• For our purposes, we will assume a minimum support requirement of four (4 of our 20 transactions, or 20%).
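The counting step described above can be sketched in a few lines. This is an illustrative Python sketch, not the SAS macro presented in these slides; the names `TRANSACTIONS`, `MIN_SUPPORT`, and `item_support` are hypothetical, and the data is the 20 sample transactions from the Sales Transactions slide.

```python
from collections import Counter

# The 20 sample baskets from the Sales Transactions slide, one per line.
TRANSACTIONS = [set(line.split()) for line in """\
bread butter eggs milk
bologna bread cheese chips mayo soda
bananas bread butter cheese oranges
buns chips hotdogs mustard soda
buns chips hotdogs mustard pickles soda
bread butter cereal eggs milk
bananas cereal eggs milk oranges
bologna bread buns cheese chips hotdogs mayo mustard soda
bananas bologna bread cheese milk oranges soda
bread butter cereal eggs milk
bananas chips soda
bread butter eggs milk oranges
bananas bologna bread cheese mayo mustard
bread cereal eggs milk
bologna bread cheese chips mayo mustard soda
bread butter eggs milk oranges
buns chips hotdogs soda
buns cheese chips hotdogs mustard soda
chips pickles soda
bologna bread cheese chips mayo mustard soda
""".splitlines()]

MIN_SUPPORT = 4  # minimum support as a count (4 of 20 = 20%)


def item_support(transactions):
    """Count, for each item, the number of transactions containing it."""
    counts = Counter()
    for basket in transactions:
        counts.update(basket)  # each basket is a set, so an item counts once
    return counts


support = item_support(TRANSACTIONS)
survivors = {item: n for item, n in support.items() if n >= MIN_SUPPORT}
dropped = sorted(set(support) - set(survivors))

print(dropped)  # ['pickles']
print(support["pickles"] / len(TRANSACTIONS))  # support as a fraction: 0.1
```

Dividing a count by the number of transactions gives the percentage form of support, so a count threshold of 4 and a fractional threshold of 0.2 select the same items for this dataset.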
Support
pickles do not meet our minimum support requirement (4): they appear in only two transactions (#5 and #19).
Support
• Items meeting the minimum support requirement are included in subsequent processing.
Pairs
• We then create all possible pairings of the surviving items and check whether each pair meets the minimum support requirement.
• Limiting ourselves to the surviving items is the key point of the Apriori algorithm: a pair cannot meet the minimum support unless both of its items do.
• The support of a pair of items is the number of transactions containing both items.
• Pairs of items not meeting the minimum support criterion are excluded from further analysis.
• Consider the following pairings…
Pairs
bananas → support = 5
oranges → support = 5
bananas, oranges → support = 3 (insufficient support)
Pairs
bologna → support = 6
chips → support = 10
bologna, chips → support = 4 (sufficient support)
Pairs
• Pairs of items meeting the minimum support requirement are included in subsequent processing.
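The pair counts quoted for bananas/oranges and bologna/chips can be checked by intersecting the sets of transaction ids in which each item appears. The Python sketch below is illustrative only and is not how the SAS macro is known to work; `TIDS` and `pair_support` are hypothetical names, and the id sets are read off the sample transactions for just these four items.

```python
from itertools import combinations

# Transaction ids containing each item, read off the sample transactions.
# Only the four items used in the pairing examples are listed.
TIDS = {
    "bananas": {3, 7, 9, 11, 13},
    "oranges": {3, 7, 9, 12, 16},
    "bologna": {2, 8, 9, 13, 15, 20},
    "chips": {2, 4, 5, 8, 11, 15, 17, 18, 19, 20},
}
MIN_SUPPORT = 4


def pair_support(tids):
    """Support of each pair = number of transaction ids the two items share."""
    return {
        (a, b): len(tids[a] & tids[b])
        for a, b in combinations(sorted(tids), 2)
    }


pairs = pair_support(TIDS)
frequent_pairs = {p: n for p, n in pairs.items() if n >= MIN_SUPPORT}

print(pairs[("bananas", "oranges")])  # 3, below the minimum
print(frequent_pairs)                 # {('bologna', 'chips'): 4}
```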
Iterate
• We then repeat the process, iterating with itemsets of size 3, itemsets of size 4, etc., until:
◦ we are unable to find any itemsets with sufficient support, or
◦ we reach the indicated maximum number of iterations (this is one of the parameters to the MBA macro).
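The loop with its two stopping conditions can be sketched as follows. This is a minimal Python illustration of level-wise Apriori, not the SAS macro itself; the function name `apriori` and its parameters are hypothetical, and treating the maximum number of iterations as a maximum itemset size (`max_size`) is an assumption about how the macro counts iterations.

```python
from itertools import combinations

# The 20 sample baskets from the Sales Transactions slide, one per line.
TRANSACTIONS = [frozenset(line.split()) for line in """\
bread butter eggs milk
bologna bread cheese chips mayo soda
bananas bread butter cheese oranges
buns chips hotdogs mustard soda
buns chips hotdogs mustard pickles soda
bread butter cereal eggs milk
bananas cereal eggs milk oranges
bologna bread buns cheese chips hotdogs mayo mustard soda
bananas bologna bread cheese milk oranges soda
bread butter cereal eggs milk
bananas chips soda
bread butter eggs milk oranges
bananas bologna bread cheese mayo mustard
bread cereal eggs milk
bologna bread cheese chips mayo mustard soda
bread butter eggs milk oranges
buns chips hotdogs soda
buns cheese chips hotdogs mustard soda
chips pickles soda
bologna bread cheese chips mayo mustard soda
""".splitlines()]


def apriori(transactions, min_support, max_size):
    """Return {itemset: support count} for itemsets meeting min_support.

    Stops when a level yields no surviving itemsets or when itemsets of
    max_size have been processed (assumed to mirror "maximum iterations").
    """
    items = sorted({item for basket in transactions for item in basket})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates and size <= max_size:
        # Count the transactions that contain each candidate itemset.
        level = {}
        for cand in candidates:
            n = sum(1 for basket in transactions if cand <= basket)
            if n >= min_support:
                level[cand] = n
        if not level:
            break
        frequent.update(level)
        # Build candidates one item larger from the survivors only, keeping
        # those whose every smaller subset also survived.
        size += 1
        unions = {a | b for a in level for b in level if len(a | b) == size}
        candidates = [
            c for c in unions
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


frequent = apriori(TRANSACTIONS, min_support=4, max_size=5)
print(frequent[frozenset({"bologna", "chips"})])  # 4
print(frozenset({"bananas", "oranges"}) in frequent)  # False
```

The values `min_support=4` and `max_size=5` correspond to the 20% minimum support and the maximum of 5 iterations used in the macro call later in the deck, under the assumption stated above.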
Association rules
• Our final results are expressed using association rules:
{LHS} → {RHS} [support, confidence]
• LHS stands for Left Hand Side
• RHS stands for Right Hand Side
• Example:
◦ {bologna} → {chips} [0.2, (discussed next)]
◦ {chips} → {bologna} [0.2, (discussed next)]
(support: 0.2 = 4 / 20)
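One way to turn a surviving itemset into rules is to try every split of it into a non-empty LHS and a non-empty RHS. The Python sketch below illustrates that idea for the bologna/chips pair; `rules_from_itemset` is a hypothetical helper, and whether the SAS macro enumerates rules this way (for example, whether it allows more than one item on the RHS) is not stated in the slides.

```python
from itertools import combinations


def rules_from_itemset(itemset, support_count, n_transactions):
    """Yield (lhs, rhs, support) for every split of itemset into LHS -> RHS."""
    items = sorted(itemset)
    support = support_count / n_transactions  # support as a fraction
    for k in range(1, len(items)):
        for lhs in combinations(items, k):
            rhs = tuple(i for i in items if i not in lhs)
            yield lhs, rhs, support


# The pair {bologna, chips} appears in 4 of the 20 sample transactions.
rules = list(rules_from_itemset({"bologna", "chips"}, 4, 20))
for lhs, rhs, support in rules:
    print("{%s} -> {%s} [%.1f]" % (", ".join(lhs), ", ".join(rhs), support))
```

Both rules carry the same support because support depends only on the itemset as a whole, not on which side each item is placed.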
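Confidence
• Confidence is defined as the conditional probability that a transaction containing the LHS will also contain the RHS.
$$\text{confidence}(\text{LHS} \rightarrow \text{RHS}) = \frac{\text{support}(\text{LHS} \cup \text{RHS})}{\text{support}(\text{LHS})}$$
• Confidence for {bologna} → {chips}
$$\frac{\text{support}(\text{bologna}, \text{chips})}{\text{support}(\text{bologna})} = \frac{4/20}{6/20} = 0.67$$
• Confidence for {chips} → {bologna}
$$\frac{\text{support}(\text{chips}, \text{bologna})}{\text{support}(\text{chips})} = \frac{4/20}{10/20} = 0.40$$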
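The two confidence values can be reproduced directly from the support figures. This is a small Python sketch for checking the arithmetic; the function name `confidence` and its parameter names are hypothetical.

```python
def confidence(support_both, support_lhs):
    """Confidence of LHS -> RHS: support(LHS and RHS together) / support(LHS)."""
    return support_both / support_lhs


N = 20  # number of sample transactions

# {bologna} -> {chips}: the pair has support 4, bologna has support 6.
print(round(confidence(4 / N, 6 / N), 2))   # 0.67
# {chips} -> {bologna}: the pair has support 4, chips has support 10.
print(round(confidence(4 / N, 10 / N), 2))  # 0.4
```

Because the number of transactions cancels in the ratio, the same values result from using raw counts (4 / 6 and 4 / 10).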
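Lift
• Lift measures how much more often the RHS occurs in transactions containing the LHS than it occurs in transactions overall. We'd like to see a lift value greater than one.
$$\text{lift}(\text{LHS} \rightarrow \text{RHS}) = \frac{\text{confidence}(\text{LHS} \rightarrow \text{RHS})}{\text{support}(\text{RHS})}$$
• Lift for {bologna} → {chips}
$$\frac{\text{confidence}(\text{bologna} \rightarrow \text{chips})}{\text{support}(\text{chips})} = \frac{0.67}{10/20} = 1.33$$
• Lift for {chips} → {bologna}
$$\frac{\text{confidence}(\text{chips} \rightarrow \text{bologna})}{\text{support}(\text{bologna})} = \frac{0.40}{6/20} = 1.33$$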
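The lift values can be checked the same way. This Python sketch is for verifying the arithmetic only; the function name `lift` and its parameter names are hypothetical, and supports are passed as fractions of the 20 transactions.

```python
def lift(support_both, support_lhs, support_rhs):
    """Lift of LHS -> RHS: confidence(LHS -> RHS) / support(RHS)."""
    confidence = support_both / support_lhs
    return confidence / support_rhs


N = 20  # number of sample transactions

# {bologna} -> {chips}
print(round(lift(4 / N, 6 / N, 10 / N), 2))  # 1.33
# {chips} -> {bologna}
print(round(lift(4 / N, 10 / N, 6 / N), 2))  # 1.33
```

Both directions give the same lift because the expression reduces to support(LHS ∪ RHS) / (support(LHS) × support(RHS)), which does not change when LHS and RHS are swapped.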
Running the SAS MBA macro
%mba(work.sales,          /* SAS transactions dataset */
     "Y",                 /* is item id a string? "Y" or "N" */
     Work.Results,        /* name of SAS results dataset */
     5,                   /* maximum iterations */
     0.2,                 /* minimum support (0.2 = 20%) */
     "C:\temp\mba.html"); /* web page output */
run;
[Slides 19–21 repeat the title "Running the SAS MBA macro" over images that are not captured in this transcript; the only surviving caption text is "RHS: bologna (continued)".]
Questions?