![Page 1: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/1.jpg)
FREERIDE: System Support for High Performance Data Mining
Ruoming Jin Leo Glimcher Xuan Zhang
Ge Yang Gagan Agrawal
Department of Computer and Information SciencesOhio State University
![Page 2: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/2.jpg)
Motivation: Data Mining Problem Datasets available for mining
are often large Our understanding of what
algorithms and parameters will give desired insights is limited
Time required for creating scalable implementations of different algorithms and running them with different parameters on large datasets slows down the data mining process
![Page 3: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/3.jpg)
Project Overview
FREERIDE (Framework for Rapid Implementation of datamining engines) as the base system
Demonstrated for a variety of standard mining algorithms
![Page 4: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/4.jpg)
FREERIDE offers: The ability to rapidly prototype a high-performance mining
implementation Distributed memory parallelization Shared memory parallelization Ability to process large and disk-resident datasets Only modest modifications to a sequential implementation for
the above three
![Page 5: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/5.jpg)
Key Observation from Mining Algorithms
Popular algorithms have a common canonical loop
Can be used as the basis for supporting a common middleware
While( ) {
forall( data instances d) {
I = process(d)
R(I) = R(I) op d
}
…….
}
![Page 6: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/6.jpg)
Shared Memory Parallelization Techniques
Full Replication: create a copy of the reduction object for each thread
Full Locking: associate a lock with each element
Optimized Full Locking: put the element and corresponding lock on the same cache block
Fixed Locking: use a fixed number of locks Cache Sensitive Locking: one lock for all
elements in a cache block
![Page 7: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/7.jpg)
Memory Layout for Various Locking Schemes
Full Locking Fixed Locking
Optimized Full Locking Cache-Sensitive Locking
Lock Reduction Element
![Page 8: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/8.jpg)
Trade-offs between Techniques
Memory requirements: high memory requirements can cause memory thrashing
Contention: if the number of reduction elements is small, contention for locks can be a significant factor
Coherence cache misses and false sharing: more likely with a small number of reduction elements
![Page 9: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/9.jpg)
Combining Shared Memory and Distributed Memory Parallelization
Distributed memory parallelization by replication of reduction object
Naturally combines with full replication on shared memory
For locking with non-trivial memory layouts, two options
Communicate locks also Copy reduction elements to a separate buffer
![Page 10: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/10.jpg)
Apriori Association Mining
Relative Performance of Full Replication, Optimized Full Locking and Cache-Sensitive Locking
0
20000
4000060000
80000
100000
0.10% 0.05% 0.03% 0.02%
Support Levels
Tim
e(s)
fr
ofl
csl
500MB dataset, N2000,L20, 4 threads
![Page 11: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/11.jpg)
K-means Shared Memory Parallelization
0
200
400
600
800
1000
1200
1400
1600
1thread
4threads
16threads
full repl
opt full locks
cache sens.Locks
![Page 12: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/12.jpg)
Performance on Cluster of SMPs
0
10000
20000
30000
40000
50000
60000
70000
1 node 2nodes
4nodes
8nodes
1 thread 2 threads 3 threads
Apriori Association Mining
![Page 13: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/13.jpg)
Results from EM Clustering Algorithm
EM is a popular data mining algorithm
Can we parallelize it using the same support that worked for other clustering algo (k-means) and algo for other mining tasks
0
500
1000
1500
2000
2500
3000
1 2 4 8nodes
1 thread 2 threads 3 threads 4 threads
![Page 14: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/14.jpg)
Results from FP-tree
0
10
20
30
40
50
60
1 2 4 8
1 thread 2 threads 3 threads
FPtree: 800 MB dataset 20 frequent itemsets
![Page 15: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/15.jpg)
A Case Study: Decision Tree Construction
• Question: can we parallelize decision tree construction using the same framework ?
• Most existing parallel algorithms have a fairly different structure (sorting, writing back …)
• Being able to support decision tree construction will significantly add to the usefulness of the framework
![Page 16: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/16.jpg)
Approach
Implemented RainForest framework (Gehrke) Currently focus on RF-read Overview of the algorithm
While the stop condition not satisfied read the data build the AVC-group for nodes choose the splitting attributes to split nodes select a new set of node to process as long as the main
memory could hold it
![Page 17: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/17.jpg)
Parallelization Strategies
Pure approach: only apply one of full replication, optimized full locking and cache-sensitive locking
Vertical approach: use replication at top levels, locking at lower
Horizontal: use replication for attributes with a small number of distinct values, locking otherwise
Mixed approach: combine the above two
![Page 18: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/18.jpg)
Results
0
500
1000
1500
2000
2500
3000
3500
1 2 4 8
No. of Nodes
Tim
e(s
) fr
ofl
csl
Performance of pure versions, 1.3GB dataset with 32 million records in the training set, function 7, the depth of decision tree = 16.
![Page 19: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/19.jpg)
Results
Combining full replication and full locking
0
500
1000
1500
2000
2500
3000
1 2 4 8
No. of Nodes
Tim
e(s
) horizontal
vertical
mixed
![Page 20: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/20.jpg)
Results
Combining full replication and cache-sensitive locking
0
500
1000
1500
2000
2500
3000
1 2 4 8
No. of Nodes
Tim
e(s
) horizontal
vertical
mixed
![Page 21: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/21.jpg)
Combining Distributed Memory and Shared Memory Parallelization for Decision Tree
The key problem: large size of AVC groups means very high communication volume
Results in very limited speedups Can we modify the algorithm to reduce
communication volume ?
![Page 22: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/22.jpg)
SPIES On (a) FREERIDE Developed a new
communication efficient decision tree construction algorithm – Statistical Pruning of Intervals for Enhanced Scalability (SPIES)
Combines RainForest with statistical pruning of intervals of numerical attributes to reduce memory requirements and communication volume
Does not require sorting of data, or partitioning and writing-back of records
Paper in SDM regular program
0
1000
2000
3000
4000
5000
6000
7000
1node
8nodes
1thread 2threads 3threads
![Page 23: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/23.jpg)
Applying FREERIDE for Scientific Data Mining
Joint work with Machiraju and Parthasarathy
Focusing on feature extraction, tracking, and mining approach developed by Machiraju et al.
A feature is a region of interest in a dataset
A suite of algorithms for extracting and tracking features
![Page 24: FREERIDE: System Support for High Performance Data Mining Ruoming Jin Leo Glimcher Xuan Zhang Ge Yang Gagan Agrawal Department of Computer and Information](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f1f5503460f94c37d42/html5/thumbnails/24.jpg)
Summary
Demonstrated a common framework for parallelization of a wide range of mining algos
Association mining – apriori and fp-tree Clustering – k-means and EM Decision tree construction Nearest neighbor search
Both shared memory and distributed memory parallelism
A number of advantages Ease parallelization Support higher-level interfaces