A performance analysis of multicore computer architectures
Michel Schelske
Parallel Algorithms for Multicore Benchmarking, 10 Apr. 2008
Structure
1. Observations & Theory
2. Problems
3. Solution
Observation I
[Figure: clock rate, number of cores, and performance over the years 2003, 2005, and 2007. Labels: frequency (3-4 GHz), cores (1, 2, 4, multi), performance.]
Observation II
[Figure: a program partitioned into threads. Labels: program, thread, partitioning, granularity.]
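The partitioning idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the talk: the workload (summing a list), the names `partition` and `parallel_sum`, and the parameter `n_tasks` are all hypothetical. A small `n_tasks` corresponds to coarse-grain partitioning, a large `n_tasks` to fine-grain partitioning.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_tasks):
    """Split range(n_items) into n_tasks contiguous (start, end) slices.

    Slice sizes differ by at most one item. The name and signature are
    hypothetical and chosen for this sketch only.
    """
    base, extra = divmod(n_items, n_tasks)
    bounds = []
    start = 0
    for i in range(n_tasks):
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def parallel_sum(data, n_tasks):
    """Sum `data` by handing one slice to each task in a thread pool.

    n_tasks is the granularity knob: few large slices (coarse-grain)
    or many small slices (fine-grain).
    """
    bounds = partition(len(data), n_tasks)
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        partial_sums = pool.map(lambda b: sum(data[b[0]:b[1]]), bounds)
        return sum(partial_sums)
```

For example, `partition(10, 3)` yields the slices `(0, 4)`, `(4, 7)`, `(7, 10)`. Note that in CPython the global interpreter lock keeps pure-Python CPU work from running in parallel across threads, so the sketch shows the structure of partitioning rather than a real speedup.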
Theory
The optimum depends on the hardware and the problem to be solved.
[Figure: performance versus granularity, from coarse-grain to fine-grain, with the optimum marked.]
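Why an optimum can appear between coarse-grain and fine-grain partitioning may be easier to see with a toy cost model. The model below is an assumption made for this sketch and does not come from the talk: a fixed amount of work is split into `k` equal tasks, the tasks run in rounds on a fixed number of cores, and every task adds a constant serial overhead. The function names and all numbers are hypothetical.

```python
import math


def model_time(work, k, cores, overhead):
    """Modeled run time for `work` split into k equal tasks (toy model).

    Assumptions of this sketch: tasks run in ceil(k / cores) rounds,
    each round lasts work / k, and each task costs `overhead` of
    serial time for creation and scheduling.
    """
    rounds = math.ceil(k / cores)
    return rounds * (work / k) + k * overhead


def best_granularity(work, cores, overhead, candidates):
    """Return the candidate task count with the lowest modeled time."""
    return min(candidates, key=lambda k: model_time(work, k, cores, overhead))


# Hypothetical numbers: 1 s of work, 4 cores, 1 ms overhead per task.
candidates = [2 ** i for i in range(11)]  # 1, 2, 4, ..., 1024
best = best_granularity(1.0, 4, 0.001, candidates)
```

In this model a single task leaves three cores idle, while 1024 tasks spend more time on overhead than on work, so the minimum lies in between. Changing `cores` or `overhead` moves the minimum, which matches the statement that the optimum depends on the hardware and the problem.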
Example
Observation III
The optimum depends on the hardware and the problem to be solved.
[Figure: performance versus granularity, from coarse-grain to fine-grain, with the optimum marked.]
Observation IV
The optimum depends on the hardware and the problem to be solved.
[Figure: performance versus granularity, from coarse-grain to fine-grain, with the optimum marked.]
The problems
• Granularity is only one performance parameter.
• Find the optimal parallelization parameters with respect to
  – the algorithm
  – the computer architecture
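One simple way to look for such parameters is to measure them on the target machine. The sketch below is an assumption about how this could be done, not the approach presented in the talk: it times a caller-supplied workload once per candidate value and keeps the fastest. The name `tune`, the `run` callback, and the `repeats` parameter are hypothetical.

```python
import time


def tune(run, candidates, repeats=3):
    """Time run(param) for each candidate and return the fastest one.

    `run` is any callable that executes the parallel workload with the
    given parameter value (for example a task count or chunk size).
    The best of `repeats` measurements is kept for each candidate to
    reduce the influence of timing noise.
    Returns (best_param, {param: best_seconds}).
    """
    timings = {}
    for param in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(param)
            best = min(best, time.perf_counter() - start)
        timings[param] = best
    best_param = min(timings, key=timings.get)
    return best_param, timings
```

Because the measurement runs the real algorithm on the real machine, the result reflects both the algorithm and the computer architecture. An exhaustive sweep like this only scales to a small number of parameters and candidate values.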
Our Solution
[Figure: system stack with the components hardware (five cores), operating system, Application, Profiler, and Benchmark.]
Thank you for your attention