
Post on 02-Feb-2021


  • Chapter 05

    Association Rule Mining

    Dr. Steffen Herbold

    herbold@cs.uni-goettingen.de

    Introduction to Data Science

    https://sherbold.github.io/intro-to-data-science

  • Outline

    • Overview
    • The Apriori Algorithm
    • Summary


  • Example of Association Rules


    [Figure: items already in the basket, the available items, and the item likely to be added]

  • The General Problem


    Set of Transactions

    β€’ item1,item2,item3

    β€’ item2,item4

    β€’ item1,item5

    β€’ item6,item7

    β€’ item2,item3,item4,item7

    β€’ item2,item3,item4,item8

    β€’ item2,item4,item5

    β€’ item2,item3,item4

    β€’ item4,item5

    β€’ …

    Association Rule Mining

    Association Rules

    • item2 → item3
    • item2 → item4
    • item2,item3 → item4
    • …

    Rules describe "interesting relationships"

  • The Formal Problem

    • Items: I = {i1, i2, …, im}

    • Transactions: T = {t1, …, tn} with ti ⊆ I

    • Rules: X ⇒ Y such that X, Y ⊆ I and X ∩ Y = ∅

      • X is also called the antecedent or left-hand side
      • Y is also called the consequent or right-hand side


    How do you get good rules?

  • Defining β€žInteresting Relationshipsβ€œ


    Set of Transactions (as above)

    item2, item3, item4 occur often together

    Interesting == often together

  • Outline

    • Overview
    • The Apriori Algorithm
    • Summary


  • Support and Frequent Item Sets

    • Support: the percentage of transactions in which an itemset IS occurs

      • support(IS) = |{t ∈ T : IS ⊆ t}| / |T|

    • Frequent itemset: an itemset that appears "often enough"

      • Defined using a threshold: IS ⊆ I is frequent if support(IS) ≥ minsupp

    • Rules can be generated by splitting itemsets: IS = X ∪ Y

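The support definition above can be sketched in a few lines of Python. This is a minimal sketch; the transactions are the running example from the slides, and the function and variable names are my own:

```python
# The ten transactions of the running example.
transactions = [set(t.split(",")) for t in [
    "item1,item2,item3", "item2,item4", "item1,item5", "item6,item7",
    "item2,item3,item4,item7", "item2,item3,item4,item8",
    "item2,item4,item5", "item2,item3,item4", "item4,item5", "item6,item7",
]]

def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)

minsupp = 0.3
print(support({"item2", "item3", "item4"}))             # 0.3
print(support({"item2", "item3", "item4"}) >= minsupp)  # True -> frequent
```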

  • Example for Generating Rules

    • support({item2, item3, item4}) = 3/10 = 0.3 ≥ 0.3

    • Eight possible rules:

      • ∅ ⇒ {item2, item3, item4}
      • {item2} ⇒ {item3, item4}
      • {item3} ⇒ {item2, item4}
      • {item4} ⇒ {item2, item3}
      • {item2, item3} ⇒ {item4}
      • {item2, item4} ⇒ {item3}
      • {item3, item4} ⇒ {item2}
      • {item2, item3, item4} ⇒ ∅


    Set of Transactions (as above)

    minsupp = 0.3

    Are all rules interesting?

  • Confidence, Lift, and Leverage

    • Confidence: the percentage of transactions containing the antecedent that also contain the consequent

      • confidence(X ⇒ Y) = support(X ∪ Y) / support(X) = |{t ∈ T : X ∪ Y ⊆ t}| / |{t ∈ T : X ⊆ t}|

    • Lift: the ratio of the probability of X and Y together to the product of their individual probabilities

      • lift(X ⇒ Y) = support(X ∪ Y) / (support(X) ⋅ support(Y))

    • Leverage: the difference between the probability of X and Y together and the product of their individual probabilities

      • leverage(X ⇒ Y) = support(X ∪ Y) − support(X) ⋅ support(Y)


    Usually, lift favors itemsets with lower support, leverage favors itemsets with higher support
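The three measures translate directly into code. A sketch on the running example (the `support` helper and all names are my own):

```python
transactions = [set(t.split(",")) for t in [
    "item1,item2,item3", "item2,item4", "item1,item5", "item6,item7",
    "item2,item3,item4,item7", "item2,item3,item4,item8",
    "item2,item4,item5", "item2,item3,item4", "item4,item5", "item6,item7",
]]

def support(itemset):
    return sum(set(itemset) <= t for t in transactions) / len(transactions)

def confidence(X, Y):
    # Share of transactions with X that also contain Y.
    return support(X | Y) / support(X)

def lift(X, Y):
    # > 1: X and Y occur together more often than if they were independent.
    return support(X | Y) / (support(X) * support(Y))

def leverage(X, Y):
    # Difference between observed and expected co-occurrence probability.
    return support(X | Y) - support(X) * support(Y)

X, Y = {"item3", "item4"}, {"item2"}
print(confidence(X, Y))                               # 1.0
print(round(lift(X, Y), 2), round(leverage(X, Y), 2)) # 1.67 0.12
```

The rule {item3, item4} ⇒ {item2} has perfect confidence and a lift well above 1, matching the worked example on the slides.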

  • Confidence for the Example Rules

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ βˆ… β‡’ π‘–π‘‘π‘’π‘š2, π‘–π‘‘π‘’π‘š3, π‘–π‘‘π‘’π‘š4 =0.3

    1= 0.3

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š2 β‡’ π‘–π‘‘π‘’π‘š3, π‘–π‘‘π‘’π‘š4 =0.3

    0.6= 0.5

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š3 β‡’ π‘–π‘‘π‘’π‘š2, π‘–π‘‘π‘’π‘š4 =0.3

    0.4= 0.75

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š4 β‡’ π‘–π‘‘π‘’π‘š2, π‘–π‘‘π‘’π‘š3 =0.3

    0.6= 0.5

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š2, π‘–π‘‘π‘’π‘š3 β‡’ π‘–π‘‘π‘’π‘š4 =0.3

    0.4= 0.75

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š2, π‘–π‘‘π‘’π‘š4 β‡’ π‘–π‘‘π‘’π‘š3 =0.3

    0.5= 0.6

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š3, π‘–π‘‘π‘’π‘š4 β‡’ π‘–π‘‘π‘’π‘š2 =0.3

    0.3= 1

    β€’ π‘π‘œπ‘›π‘“π‘–π‘‘π‘’π‘›π‘π‘’ π‘–π‘‘π‘’π‘š2, π‘–π‘‘π‘’π‘š3, π‘–π‘‘π‘’π‘š4 β‡’ βˆ… =0.3

    0.3= 1


    Set of Transactions (as above)

  • Lift for the Example Rules


    • lift(∅ ⇒ {item2, item3, item4}) = 0.3 / (1 ⋅ 0.3) = 1
    • lift({item2} ⇒ {item3, item4}) = 0.3 / (0.6 ⋅ 0.3) = 1.66
    • lift({item3} ⇒ {item2, item4}) = 0.3 / (0.4 ⋅ 0.5) = 1.5
    • lift({item4} ⇒ {item2, item3}) = 0.3 / (0.6 ⋅ 0.4) = 1.25
    • lift({item2, item3} ⇒ {item4}) = 0.3 / (0.4 ⋅ 0.6) = 1.25
    • lift({item2, item4} ⇒ {item3}) = 0.3 / (0.5 ⋅ 0.4) = 1.5
    • lift({item3, item4} ⇒ {item2}) = 0.3 / (0.3 ⋅ 0.6) = 1.66
    • lift({item2, item3, item4} ⇒ ∅) = 0.3 / (0.3 ⋅ 1) = 1

    Set of Transactions (as above)

  • Overview of Scores for Example

    Rule | Confidence | Lift | Leverage
    ∅ ⇒ {item2, item3, item4} | 0.30 | 1.00 | 0.00
    {item2} ⇒ {item3, item4} | 0.50 | 1.66 | 0.12
    {item3} ⇒ {item2, item4} | 0.75 | 1.50 | 0.10
    {item4} ⇒ {item2, item3} | 0.50 | 1.25 | 0.06
    {item2, item3} ⇒ {item4} | 0.75 | 1.25 | 0.06
    {item2, item4} ⇒ {item3} | 0.60 | 1.50 | 0.10
    {item3, item4} ⇒ {item2} | 1.00 | 1.66 | 0.12
    {item2, item3, item4} ⇒ ∅ | 1.00 | 1.00 | 0.00


    {item2, item3, item4} ⇒ ∅: perfect confidence, but no gain over randomness

    {item3, item4} ⇒ {item2}: perfect confidence, and 1.66 times more likely than randomness

  • Itemsets and Rules = Exponential

    • The number of itemsets is exponential: all possible itemsets form the power set 𝒫 of I

      • |𝒫(I)| = 2^|I|

    • Still exponential if we restrict the size:

      • |I| itemsets with k = 1 items
      • |I| ⋅ (|I| − 1) / 2 itemsets with k = 2 items
      • (|I| choose k) = |I|! / ((|I| − k)! ⋅ k!) itemsets with k items

    • The number of rules per itemset is exponential: the possible antecedents of an itemset i form the power set 𝒫 of i

      • |𝒫(i)| = 2^|i|

    • Example: |I| = 100, k = 3

      • 161,700 possible itemsets
      • 1,293,600 possible rules


    How do we restrict the search space?
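The counts in the example can be checked with a quick computation (a sketch; `math.comb` gives the binomial coefficient):

```python
import math

n, k = 100, 3                  # |I| = 100 items, itemsets of size k = 3
itemsets = math.comb(n, k)     # binomial coefficient: 100! / (97! * 3!)
rules = itemsets * 2 ** k      # each k-itemset has 2^k possible antecedents
print(itemsets)                # 161700
print(rules)                   # 1293600
```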

  • Pruning the Search Space

    • Apriori property: all subsets of a frequent itemset are also frequent

      • support(i′) ≥ support(i) for all i′ ⊆ i ⊆ I

    • "Grow" itemsets and prune the search space by applying the Apriori property

      • Start with itemsets of size k = 1
      • Drop all itemsets that do not have minimal support
      • Build all combinations of size k + 1
      • Repeat until no itemsets with minimal support are found, or a threshold for k is reached

    • Still has exponential complexity, but better bounded

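The growing-and-pruning loop above can be sketched as a small pure-Python Apriori. This is a simplified sketch of the idea, not the full candidate generation of the original algorithm, and the names are my own:

```python
transactions = [set(t.split(",")) for t in [
    "item1,item2,item3", "item2,item4", "item1,item5", "item6,item7",
    "item2,item3,item4,item7", "item2,item3,item4,item8",
    "item2,item4,item5", "item2,item3,item4", "item4,item5", "item6,item7",
]]

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(minsupp=0.3, max_k=10):
    # k = 1: keep all single items with minimal support.
    items = {i for t in transactions for i in t}
    frequent = [frozenset({i}) for i in items if support({i}) >= minsupp]
    result = list(frequent)
    k = 1
    while frequent and k < max_k:
        # Grow: extend each frequent itemset by one item that is still frequent.
        # By the Apriori property, no frequent (k+1)-itemset is missed this way.
        singles = {i for fs in frequent for i in fs}
        candidates = {fs | {i} for fs in frequent for i in singles if i not in fs}
        # Prune: keep only candidates with minimal support.
        frequent = [c for c in candidates if support(c) >= minsupp]
        result.extend(frequent)
        k += 1
    return result

for fs in sorted(apriori(), key=lambda s: (len(s), sorted(s))):
    print(sorted(fs))
```

On the running example this yields five frequent single items, the pairs {item2, item3}, {item2, item4}, {item3, item4}, and the triple {item2, item3, item4}, matching the worked example on the next slides.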

  • Example for Growing Itemsets (k=1)


    Set of Transactions (as above)

    minsupp = 0.3

    Itemset | Support
    {item1} | 0.2 → drop
    {item2} | 0.6
    {item3} | 0.4
    {item4} | 0.6
    {item5} | 0.3
    {item6} | 0.2 → drop
    {item7} | 0.3
    {item8} | 0.1 → drop

  • Example for Growing Itemsets (k=2)


    Set of Transactions (as above)

    minsupp = 0.3

    Itemset | Support
    {item2, item3} | 0.4
    {item2, item4} | 0.5
    {item2, item5} | 0.1 → drop
    {item2, item7} | 0.1 → drop
    {item3, item4} | 0.3
    {item3, item5} | 0.0 → drop
    {item3, item7} | 0.1 → drop
    … | … (the remaining pairs are also below minsupp and dropped)

  • Example for Growing Itemsets (k=3)

    • Found the following frequent itemsets with at least two items:

      • {item2, item3}, {item2, item4}, {item3, item4}
      • {item2, item3, item4}


    Set of Transactions (as above)

    minsupp = 0.3

    Itemset | Support
    {item2, item3, item4} | 0.3

    Only itemset remaining; growing terminates.

  • Candidates for Rules

    • Usually, not all possible rules are considered

    • Two common restrictions:

      • No empty antecedents and consequents: X ≠ ∅ and Y ≠ ∅
      • Only one item as the consequent: |Y| = 1

    • Example:


    ∅ ⇒ {item2, item3, item4}
    {item2} ⇒ {item3, item4}
    {item3} ⇒ {item2, item4}
    {item4} ⇒ {item2, item3}
    {item2, item3} ⇒ {item4}
    {item2, item4} ⇒ {item3}
    {item3, item4} ⇒ {item2}
    {item2, item3, item4} ⇒ ∅

    With both restrictions, only {item2, item3} ⇒ {item4}, {item2, item4} ⇒ {item3}, and {item3, item4} ⇒ {item2} remain.
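Under both restrictions, candidate generation from a frequent itemset reduces to choosing each item once as the consequent. A sketch (names are my own):

```python
def candidate_rules(itemset):
    """All rules X => Y with a non-empty antecedent and a single-item consequent."""
    itemset = frozenset(itemset)
    rules = []
    for y in sorted(itemset):
        X = itemset - {y}
        if X:                              # skip the empty antecedent
            rules.append((X, frozenset({y})))
    return rules

for X, y in candidate_rules({"item2", "item3", "item4"}):
    print(sorted(X), "=>", sorted(y))
# ['item3', 'item4'] => ['item2']
# ['item2', 'item4'] => ['item3']
# ['item2', 'item3'] => ['item4']
```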

  • Evaluating Association Rules

    • Use different criteria, not just support and confidence

      • Lift and leverage can tell you whether rules are coincidental

    • Validate whether the rules hold on test data

      • Check if the rules would also be found on the test data

    • For basket prediction, use incomplete itemsets on the test data

      • Example: remove item4 from all itemsets and check whether the rules correctly predict where it belongs

    • Manually inspect the rules: do they make sense?

      • Ask domain experts

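The basket-prediction check can be sketched as follows: hide item4 in every transaction that contains it and count how often the rules put it back. The rule list is the three rules surviving the restrictions above; the evaluation scheme is an illustrative assumption, and a real validation would use held-out test transactions rather than the training data reused here:

```python
transactions = [set(t.split(",")) for t in [
    "item1,item2,item3", "item2,item4", "item1,item5", "item6,item7",
    "item2,item3,item4,item7", "item2,item3,item4,item8",
    "item2,item4,item5", "item2,item3,item4", "item4,item5", "item6,item7",
]]

# The three rules that survive the restrictions from the previous slide.
rules = [
    ({"item2", "item3"}, "item4"),
    ({"item2", "item4"}, "item3"),
    ({"item3", "item4"}, "item2"),
]

hidden = "item4"
hits = total = 0
for t in transactions:
    if hidden in t:
        total += 1
        basket = t - {hidden}                        # pretend item4 was not bought yet
        predicted = {y for X, y in rules if X <= basket}
        hits += hidden in predicted
print(hits, "of", total)                             # 3 of 6
```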

  • Outline

    • Overview
    • The Apriori Algorithm
    • Summary


  • Summary

    • Associations are interesting relationships between items

    • Interesting usually means appearing together often and not coincidentally

    • The number of possible itemsets/rules is exponential

      • Apriori property for bounding the itemsets
      • Restrictions on rule structures

    • Test data and manual inspection for validation

