Page 1: Randomly Sampling Maximal Itemsets - Visualizationpoloclub.gatech.edu/idea2013/papers/IDEA_RMIS.pdf · DEMO TIME . Author: Sandy Moens Created Date: 8/11/2013 11:44:08 AM

Randomly Sampling Maximal Itemsets Sandy Moens and Bart Goethals

Frequent Itemset Mining

•  Finding interesting patterns by e.g. support

•  Problems:
   -  Much redundancy
   -  Many, many patterns
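
As a rough illustration of both problems, the sketch below counts support by brute force on a toy database. The transactions, the threshold, and the function names are made up for this example and are not taken from the slides.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"c", "d"},
]

def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def frequent_itemsets(db, min_support):
    """Brute-force enumeration of all non-empty frequent itemsets."""
    items = sorted(set().union(*db))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = support(frozenset(combo), db)
            if count >= min_support:
                result[frozenset(combo)] = count
    return result

frequent = frequent_itemsets(transactions, min_support=2)
for itemset in sorted(frequent, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), frequent[itemset])
```

With a threshold of 2, every non-empty subset of {a, b, c} is reported, although the single set {a, b, c} already implies that the others are frequent. The brute-force loop also visits every subset of the item universe, which stops being feasible as the number of items grows.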

Pattern Set Mining

•  Less redundancy
•  Fewer patterns
•  But: large enumeration space!

Step 1: Enumerate
Step 2: Filter
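
The two steps can be sketched as follows, taking maximal frequent itemsets as the target pattern set. This is an assumption made for illustration, since the slide does not fix a particular pattern set; the data and function names are hypothetical as well.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"c", "d"},
    {"c", "d"},
    {"a", "d"},
]

def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def enumerate_frequent(db, min_support):
    """Step 1: enumerate every non-empty frequent itemset (exponential)."""
    items = sorted(set().union(*db))
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
        if support(frozenset(combo), db) >= min_support
    ]

def filter_maximal(itemsets):
    """Step 2: keep only itemsets with no frequent proper superset."""
    return [s for s in itemsets if not any(s < other for other in itemsets)]

candidates = enumerate_frequent(transactions, min_support=2)
maximal = filter_maximal(candidates)
print(len(candidates), [sorted(s) for s in maximal])
```

Only two of the nine enumerated itemsets survive the filter, so most of the enumeration work is thrown away.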

Output Space Sampling

•  No explicit enumeration

Random Maximal Itemset Sampling

•  Long patterns with low support
   -  E.g. microarray data, recommendation
•  Simple random walk over extensions
   -  Quality measure q
   -  Approximation measure p
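
A minimal sketch of such a random walk, under assumptions the slide does not state: the quality measure q is taken to be support with a minimum threshold, and the approximation measure p is taken to be single-item frequency, used as the sampling weight for the next extension. All names and the toy data are hypothetical.

```python
import random

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"c", "d"},
    {"c", "d"},
    {"a", "d"},
]

def support(itemset, db):
    """Assumed quality measure q: number of transactions containing `itemset`."""
    return sum(1 for t in db if itemset <= t)

def sample_maximal_itemset(db, min_support, rng):
    """Grow an itemset one random item at a time until no extension is frequent."""
    items = sorted(set().union(*db))
    # Assumed approximation measure p: frequency of the single item.
    p = {i: support(frozenset([i]), db) for i in items}
    current = frozenset()
    while True:
        # Extensions that keep the itemset at or above the quality threshold.
        extensions = [
            i for i in items
            if i not in current and support(current | {i}, db) >= min_support
        ]
        if not extensions:
            return current  # no frequent extension left, so `current` is maximal
        weights = [p[i] for i in extensions]
        chosen = rng.choices(extensions, weights=weights, k=1)[0]
        current = current | {chosen}

rng = random.Random(0)
samples = [sample_maximal_itemset(transactions, 2, rng) for _ in range(5)]
print([sorted(s) for s in samples])
```

Each walk touches only the itemsets along a single path, so nothing is enumerated explicitly. Because support is anti-monotone, an itemset with no frequent one-item extension has no frequent superset at all, which is why the walk ends in a maximal itemset.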

Random Walk

Spreading the Search

•  Uniform Metropolis-Hastings
   -  E.g. Hasan and Zaki, Musk: Uniform sampling of k-maximal patterns (SDM’09)
•  Weight approximation score
   -  Additive
   -  Multiplicative
   -  Adaptive
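
The slide names three weighting schemes but does not define them. The sketch below shows one possible reading, in which an item's approximation score is lowered the more often the item has appeared in earlier samples, so that later walks drift towards other parts of the search space. The update rules, function names, and the `factor` parameter are illustrative assumptions, not the authors' definitions.

```python
from collections import Counter

def additive(base, count):
    """Assumed rule: each earlier occurrence adds one to the divisor."""
    return base / (1 + count)

def multiplicative(base, count, factor=0.5):
    """Assumed rule: each earlier occurrence multiplies the score by `factor`."""
    return base * factor ** count

def adaptive(base, count, n_samples):
    """Assumed rule: penalise by the share of earlier samples containing the item."""
    return base * (1 - count / (n_samples + 1))

# Hypothetical earlier samples and base scores, for illustration only.
earlier_samples = [frozenset("abc"), frozenset("cd")]
occurrences = Counter(item for sample in earlier_samples for item in sample)
base_scores = {"a": 3, "b": 2, "c": 4, "d": 3}

reweighted = {
    item: additive(score, occurrences[item])
    for item, score in base_scores.items()
}
print(reweighted)
```

All three rules keep the weights strictly positive, so no item is ever excluded from the walk outright.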

DEMO TIME