1.9 Association Mining
TRANSCRIPT
Association Rule Mining
Association Rules
Finds interesting associations and correlation relationships among large sets of data items.
Business decision-making example – market basket analysis: knowing which items are likely to be purchased together guides advertising strategy, catalog design, and store layout.
Association Rules
Forming association rules: the universe is the set of all items, and each basket is represented as a Boolean vector recording which items it contains.
Example: Computer ⇒ Accounting_Software
[support = 5%, confidence = 60%] – rules are retained only if they meet minimum support and confidence thresholds.
Basic Concepts
I = {i1, i2, …, im} – set of items
D – set of database transactions
T – a transaction, containing a set of items with T ⊆ I
Association rule – A ⇒ B, where A ⊂ I, B ⊂ I, and A ∩ B = ∅
Support – percentage of transactions in D containing both A and B: P(A ∪ B)
Confidence – percentage of transactions in D containing A that also contain B: P(B|A)
confidence(A ⇒ B) = support(A ∪ B) / support(A)
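As a sketch (not part of the slides), support and confidence can be computed directly from these definitions; the item names and three-basket database below are made up for illustration:

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset: P(A ∪ B)."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(A, B, transactions):
    """confidence(A ⇒ B) = support(A ∪ B) / support(A) = P(B|A)."""
    return support(set(A) | set(B), transactions) / support(A, transactions)

# Hypothetical mini-database of three baskets
transactions = [{"computer", "software"}, {"computer"}, {"printer"}]
print(round(support({"computer", "software"}, transactions), 2))   # 0.33
print(confidence({"computer"}, {"software"}, transactions))        # 0.5
```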
Basic Concepts
Itemset – a set of items; a k-itemset contains k items
Occurrence frequency of an itemset – also called its frequency, support_count (absolute support), or count
An itemset satisfies minimum support when its count ≥ min_sup × (number of transactions in D); this product is the minimum support count
Frequent itemset – an itemset that satisfies minimum support
Association Rule Mining Process
Find all frequent itemsets
Generate strong association rules from the frequent itemsets
Strong rules satisfy both minimum support and minimum confidence
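A minimal sketch of the second step – generating strong rules from frequent itemsets – assuming the support counts have already been found (the counts below are hypothetical):

```python
from itertools import combinations

def generate_rules(freq, min_conf):
    """freq: dict mapping frozenset -> support count of a frequent itemset.
    For every frequent itemset l and nonempty proper subset s, emit the rule
    s ⇒ (l - s) when confidence = count(l) / count(s) >= min_conf."""
    rules = []
    for l, count in freq.items():
        for r in range(1, len(l)):
            for subset in combinations(l, r):
                s = frozenset(subset)
                conf = count / freq[s]
                if conf >= min_conf:
                    rules.append((s, l - s, conf))
    return rules

# Hypothetical support counts over 9 transactions
freq = {frozenset({"I1"}): 6, frozenset({"I5"}): 2, frozenset({"I1", "I5"}): 2}
rules = generate_rules(freq, 0.7)
for s, c, conf in rules:
    print(set(s), "=>", set(c), conf)   # {'I5'} => {'I1'} 1.0
```

Here {I1} ⇒ {I5} is rejected (confidence 2/6 ≈ 0.33), while {I5} ⇒ {I1} passes (confidence 2/2 = 1.0).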
Itemsets
Complete itemsets
Closed frequent itemset – X is closed in a data set S if there exists no proper super-itemset Y such that Y has the same support count as X in S, and X is frequent
Maximal frequent itemset – X is frequent and there exists no super-itemset Y such that X ⊂ Y and Y is frequent in S
Example: T = {{a1, a2, …, a100}, {a1, a2, …, a50}}, min_sup = 1
Closed frequent itemsets: both {a1, a2, …, a100} : 1 and {a1, a2, …, a50} : 2
Maximal frequent itemset: {a1, a2, …, a100}
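The slide's 100-item example is too large to enumerate, so this sketch checks the same definitions on a scaled-down analogue: two transactions {a, b, c} and {a, b}, min_sup = 1.

```python
from itertools import combinations

T = [frozenset("abc"), frozenset("ab")]   # scaled-down transaction set
freq = {}                                 # all frequent itemsets, min count 1
for r in range(1, 4):
    for items in combinations("abc", r):
        X = frozenset(items)
        count = sum(X <= t for t in T)
        if count >= 1:
            freq[X] = count

# Closed: no proper super-itemset has the same support count
closed = [X for X in freq if all(freq[Y] != freq[X] for Y in freq if X < Y)]
# Maximal: no proper super-itemset is frequent at all
maximal = [X for X in freq if not any(X < Y for Y in freq)]

print(sorted("".join(sorted(X)) for X in closed))   # ['ab', 'abc']
print(["".join(sorted(X)) for X in maximal])        # ['abc']
```

As with the slide's example, the closed itemsets ({a,b} : 2 and {a,b,c} : 1) record every support count, while only the largest frequent itemset is maximal.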
Types of Association Rules
Types of values – Boolean or quantitative association rules
Dimensions of data – single-dimensional or multi-dimensional
Level of abstraction – multilevel association rules
Kinds of rules – association rules, correlation rules, strong gradient relationships
Completeness of patterns – complete, closed, maximal, top-k, constrained, approximate, …
Mining Single Dimensional Boolean Association Rules
Apriori Algorithm – finding frequent itemsets using candidate generation
Uses prior knowledge of frequent-itemset properties
Level-wise search: frequent k-itemsets are used to explore (k+1)-itemsets
The set of frequent 1-itemsets, L1, is found first; L1 is then used to find L2, and so on
Apriori Property
Reduces the search space
All non-empty subsets of a frequent itemset must also be frequent
Equivalently, adding an item to an infrequent itemset cannot make it frequent: if P(I) < min_sup then P(I ∪ A) < min_sup
Anti-monotone property – if a set cannot pass a test, all of its supersets will fail the test as well
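The property can be verified empirically on a toy database (the transactions below are made up): an itemset's support never exceeds the support of any of its subsets, so an infrequent itemset can never have a frequent superset.

```python
from itertools import combinations

D = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]   # toy transactions

def count(X):
    """Support count of itemset X in D."""
    return sum(set(X) <= t for t in D)

# Every (r-1)-subset of an r-itemset is at least as frequent as the itemset.
for r in (2, 3):
    for X in combinations("abc", r):
        for s in combinations(X, r - 1):
            assert count(s) >= count(X)
print("anti-monotone property holds on D")
```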
Apriori property application
Join step: to find Lk, Lk−1 is joined with itself, producing the candidate set Ck
li[j] denotes the jth item in itemset li
Members of Lk−1 are joinable if their first (k−2) items are common: members l1 and l2 of Lk−1 are joinable if (l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ … ∧ (l1[k−2] = l2[k−2]) ∧ (l1[k−1] < l2[k−1])
The resulting candidate itemset is l1[1], l1[2], …, l1[k−1], l2[k−1]
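A sketch of the join step in Python, keeping itemsets as sorted tuples so that l1[:-1] corresponds to the first k−2 items on the slide:

```python
def apriori_join(L_prev):
    """Join L(k-1) with itself: two (k-1)-itemsets are joinable when their
    first k-2 items agree and the last item of l1 sorts before that of l2."""
    Ck = []
    for l1 in L_prev:
        for l2 in L_prev:
            if l1[:-1] == l2[:-1] and l1[-1] < l2[-1]:
                Ck.append(l1 + (l2[-1],))   # l1[1..k-1] followed by l2[k-1]
    return Ck

L2 = [("I1", "I2"), ("I1", "I3"), ("I2", "I3")]
print(apriori_join(L2))   # [('I1', 'I2', 'I3')]
```

The `l1[-1] < l2[-1]` condition ensures each candidate is generated exactly once.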
Apriori property application
Prune step: Ck is a superset of Lk, so the support count of each candidate in Ck must be determined
To reduce the size of Ck first, the Apriori property is applied: if any (k−1)-subset of a candidate is not in Lk−1, the candidate cannot be frequent and is removed from Ck
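The prune step can be sketched the same way: any candidate with an infrequent (k−1)-subset is discarded before counting.

```python
from itertools import combinations

def apriori_prune(Ck, L_prev):
    """Keep only candidates whose (k-1)-subsets are all in L(k-1)."""
    L_prev = set(L_prev)
    return [c for c in Ck
            if all(s in L_prev for s in combinations(c, len(c) - 1))]

L2 = [("I1", "I2"), ("I1", "I3"), ("I2", "I3"), ("I2", "I4")]
C3 = [("I1", "I2", "I3"), ("I2", "I3", "I4")]
print(apriori_prune(C3, L2))   # [('I1', 'I2', 'I3')]
```

The second candidate is dropped because its subset ("I3", "I4") is not in L2.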
The Apriori Algorithm
Pseudo-code:
Ck: candidate itemset of size k
Lk: frequent itemset of size k

L1 = {frequent items};
for (k = 2; Lk−1 ≠ ∅; k++) do begin
    Ck = candidates generated from Lk−1;
    for each transaction t in database do
        increment the count of all candidates in Ck that are contained in t;
    Lk = candidates in Ck with min_support;
end
return ∪k Lk;
The Apriori Algorithm—An Example
Database TDB:

Tid    Items
T100   I1, I2, I5
T200   I2, I4
T300   I2, I3
T400   I1, I2, I4
T500   I1, I3
T600   I2, I3
T700   I1, I3
T800   I1, I2, I3, I5
T900   I1, I2, I3

Minimum support count = 2 (2/9 ≈ 22%)
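The first pass over this database can be sketched directly: count each item and keep those with support count ≥ 2 (here every item survives, so L1 has five members):

```python
from collections import Counter

# The slides' transaction database TDB (transaction ids omitted)
TDB = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"}, {"I1", "I2", "I4"},
    {"I1", "I3"}, {"I2", "I3"}, {"I1", "I3"}, {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]

counts = Counter(item for t in TDB for item in t)
L1 = {item: c for item, c in counts.items() if c >= 2}   # min support count = 2
print(sorted(L1.items()))
# [('I1', 6), ('I2', 7), ('I3', 6), ('I4', 2), ('I5', 2)]
```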
Apriori Algorithm
Input: D, a database of transactions; min_sup, the minimum support count threshold
Output: L, the frequent itemsets in D

L1 = find_frequent_1-itemsets(D);
for (k = 2; Lk−1 ≠ ∅; k++) {
    Ck = apriori_gen(Lk−1, min_sup);
    for each transaction t ∈ D {
        Ct = subset(Ck, t);    // candidates in Ck contained in t
        for each candidate c ∈ Ct
            c.count++;
    }
    Lk = {c ∈ Ck | c.count ≥ min_sup};
}
return L = ∪k Lk;
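A compact Python rendering of this pseudocode (a sketch: apriori_gen is inlined as join plus prune, itemsets are sorted tuples, and min_sup is an absolute count), run on the slides' TDB:

```python
from itertools import combinations

def apriori(D, min_count):
    """Level-wise Apriori search mirroring the pseudocode above."""
    items = sorted({i for t in D for i in t})
    L = {}                                    # all frequent itemsets -> count
    Lk = []
    for i in items:                           # L1: frequent 1-itemsets
        c = sum(i in t for t in D)
        if c >= min_count:
            Lk.append((i,))
            L[(i,)] = c
    while Lk:
        prev = set(Lk)
        Ck = []
        for l1 in Lk:                         # apriori_gen: join ...
            for l2 in Lk:
                if l1[:-1] == l2[:-1] and l1[-1] < l2[-1]:
                    cand = l1 + (l2[-1],)
                    if all(s in prev
                           for s in combinations(cand, len(cand) - 1)):
                        Ck.append(cand)       # ... and prune
        Lk = []
        for cand in Ck:                       # count surviving candidates
            c = sum(set(cand) <= t for t in D)
            if c >= min_count:
                Lk.append(cand)
                L[cand] = c
    return L

TDB = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"}, {"I1", "I2", "I4"},
    {"I1", "I3"}, {"I2", "I3"}, {"I1", "I3"}, {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]
L = apriori(TDB, 2)
print([k for k in L if len(k) == 3])
# [('I1', 'I2', 'I3'), ('I1', 'I2', 'I5')]
```

On this database the search yields five frequent 1-itemsets, six 2-itemsets, and two 3-itemsets before the candidate set becomes empty.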
Apriori Algorithm
procedure apriori_gen(Lk−1, min_sup)
for each itemset l1 ∈ Lk−1
    for each itemset l2 ∈ Lk−1
        if (l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ … ∧ (l1[k−2] = l2[k−2]) ∧ (l1[k−1] < l2[k−1]) {
            c = l1 join l2;                    // Join step
            if has_infrequent_subset(c, Lk−1) then
                delete c;                      // Prune step
            else add c to Ck;
        }
return Ck;

procedure has_infrequent_subset(c, Lk−1)
for each (k−1)-subset s of c
    if s ∉ Lk−1 then return TRUE;
return FALSE;