![Page 1: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/1.jpg)
1
PhD Defense Presentation
Managing Shared Resources in Chip Multiprocessor Memory Systems
12. October 2010
Magnus Jahre
![Page 2: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/2.jpg)
2
Outline
• Chip Multiprocessors (CMPs)
• CMP Resource Management
• Miss Bandwidth Management
  – Greedy Miss Bandwidth Management
  – Interference Measurement
  – Model-Based Miss Bandwidth Management
• Off-Chip Bandwidth Management
• Conclusion
![Page 3: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/3.jpg)
3
CHIP MULTIPROCESSORS
![Page 4: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/4.jpg)
4
Historical Processor Performance
[Figure: processor performance and components per chip (Moore's Law); normalized value on a log scale from 1 to 1,000,000 versus year, 1978 to 2010]
Technology scaling is used to increase clock frequency
Technology scaling is used to add processor cores
Moore's Law: 50% annual increase in the number of components per chip
52% Annual Performance Increase
20% Annual Performance Increase
Aggregate performance still follows Moore’s Law
![Page 5: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/5.jpg)
5
Power Dissipation Limits Practical Clock Frequency
[Figure: Thermal Design Power (W), 0 to 120, by Intel processor family: Pentium, Pentium MMX, Pentium II, Pentium III, Pentium 4, Pentium D, Core 2, Core i3, Core i5, Core i7]
[Figure: Clock Frequency (GHz), 0 to 3.5, for the same Intel processor families]
Source: Wikipedia, List of CPU Power Dissipation, Retrieved 02.06.10
Technology scaling increases clock frequency
Technology scaling adds processor cores
Power Wall
![Page 6: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/6.jpg)
6
Chip Multiprocessors (CMPs)
• CMPs utilize chip resources with a constant power budget
• How does technology scaling impact CMPs?
Intel Nehalem
![Page 7: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/7.jpg)
7
Projected Number of Cores
[Figure: Number of Cores (0 to 90) versus ITRS Year of Production, 2007 to 2016]
ITRS expects 40% annual increase
Observation 1: Multiprogramming can provide near-term throughput improvement
Observation 2: Software parallelism is needed in the long term
![Page 8: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/8.jpg)
8
Processor Memory Gap
[Figure: Main Memory Latency and Processor Performance; relative performance on a log scale from 1 to 100,000 versus year, 1978 to 2010]
7% Annual Memory Latency Improvement
Memory Wall
Observation 3: Latency hiding techniques are necessary
![Page 9: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/9.jpg)
9
Performance vs. Bandwidth
[Figure: Processor Performance and Off-Chip Bandwidth; relative performance/bandwidth (0 to 25) versus ITRS Year of Production, 2007 to 2015]
Observation 4: Bandwidth must be used efficiently
![Page 10: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/10.jpg)
10
Application Trends: Software parallelism, Multiprogramming
Hardware Trends: Latency hiding, Bandwidth efficiency
Concurrent applications share hardware
Complex Memory Systems
Shared Resource Management
![Page 11: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/11.jpg)
11
CMP RESOURCE MANAGEMENT
![Page 12: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/12.jpg)
12
Why Manage Shared Resources?
Provide predictable performance
Support OS scheduler assumptions
Cloud: Fulfill Service Level Agreement
![Page 13: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/13.jpg)
13
Performance Variability Metrics
• Fairness
  – The performance reduction due to interference between processes is distributed across all processes in proportion to their priorities
  – Equal priorities: Performance reduction from sharing affects all processes equally
• Quality of Service
  – The performance of a process never drops below a certain limit regardless of the behavior of co-scheduled processes
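The two metrics above can be made concrete with a small sketch. It assumes equal priorities, performance measured as IPC, and the reduction expressed as shared-mode IPC divided by private-mode (running alone) IPC. The min/max ratio used for fairness is one common formulation and is an assumption here; the slides do not state the formula. All function names are hypothetical.

```python
def relative_performance(shared_ipc, private_ipc):
    """Per-process performance in shared mode relative to running alone."""
    return [s / p for s, p in zip(shared_ipc, private_ipc)]


def fairness(shared_ipc, private_ipc):
    """Assumed metric: min/max ratio of relative performance.

    1.0 means sharing slows all (equal-priority) processes equally;
    values near 0 mean one process absorbs most of the slowdown.
    """
    rel = relative_performance(shared_ipc, private_ipc)
    return min(rel) / max(rel)


def meets_qos(shared_ipc, private_ipc, limit):
    """QoS holds if no process drops below `limit` of its private performance."""
    return all(r >= limit for r in relative_performance(shared_ipc, private_ipc))
```

For example, two processes that both run at half their private-mode IPC give a fairness of 1.0, while relative performances of 0.5 and 0.25 give 0.5 and violate a QoS limit of 0.4.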
![Page 14: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/14.jpg)
14
Performance Variability (Fairness)
[Figure: Lowest Fairness Value (0 to 1.2) versus Number of Workloads (1 to 40) for crossbar-based and ring-based configurations with 1, 2, and 4 channels]
Paper B.I
![Page 15: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/15.jpg)
15
Resource Management Tasks
Measurement
Allocation (Policy)
Enforcement (Mechanism)
![Page 16: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/16.jpg)
16
Contributions
Miss Bandwidth Management:
• Off-line Interference Measurement
• On-line Interference Measurement
• Dynamic Miss Handling Architecture
• Greedy Miss Bandwidth Allocation
• Performance Model Based Miss Bandwidth Allocation
Prefetch Scheduling:
• Low-Cost Open Page Prefetching
• Opportunistic Prefetch Scheduling
![Page 17: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/17.jpg)
17
GREEDY MISS BANDWIDTH MANAGEMENT
Miss Bandwidth Management
![Page 18: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/18.jpg)
18
Conventional Resource Allocation Implementation
[Figure: 4-core CMP. CPU 1 to CPU 4, each with a private D-Cache and I-Cache (Private Memory System), connect through a Crossbar to the Shared Cache, Memory Controller, Memory Bus, and Main Memory. The diagram is annotated with Measurement, Allocation, and Enforcement]
![Page 19: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/19.jpg)
19
Alternative Resource Allocation Implementation
[Figure: the same 4-core CMP (CPU 1 to CPU 4 with private D-Cache and I-Cache, Crossbar, Shared Cache, Memory Controller, Memory Bus, Main Memory), annotated with Measurement, Allocation, and Enforcement and with a Dynamic Miss Handling Architecture]
![Page 20: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/20.jpg)
20
Dynamic Miss Handling Architecture
[Figure: accesses A to E arriving at the cache over time. Misses occupy entries in the Miss Handling Architecture (MHA), whose entries hold Address, Target Info., and U fields, and the timeline marks when the cache is blocked]
A DMHA controls the number of concurrent shared memory system requests that are allowed for each processor
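A minimal sketch of that behavior, assuming an MSHR file whose number of usable entries can be changed at run time. The class and method names are hypothetical, and details such as the per-entry target limit are left out.

```python
class DynamicMHA:
    """Hypothetical model of an MSHR file whose usable size can be changed."""

    def __init__(self, total_mshrs, usable_mshrs=None):
        self.total = total_mshrs
        self.usable = total_mshrs if usable_mshrs is None else usable_mshrs
        self.outstanding = set()  # block addresses with a miss in flight

    def set_usable(self, n):
        """Allocation decision: limit concurrent misses to n (1..total)."""
        self.usable = max(1, min(n, self.total))

    def is_blocked(self):
        """The cache blocks when every usable MSHR holds a pending miss."""
        return len(self.outstanding) >= self.usable

    def try_miss(self, addr):
        """Register a miss; return False if the cache must block instead."""
        if addr in self.outstanding:
            return True  # secondary miss merges into the existing entry
        if self.is_blocked():
            return False
        self.outstanding.add(addr)
        return True

    def fill(self, addr):
        """Data returned from the shared memory system; free the entry."""
        self.outstanding.discard(addr)
```

With two usable entries, misses to A and B are accepted and a miss to C blocks the cache until one of them is filled.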
![Page 21: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/21.jpg)
21
Greedy Miss Bandwidth Management
• Idea: Reduce the number of MSHRs if a metric exceeds a certain threshold
• Metrics:
  – Paper A.II: Memory bus utilization
  – Paper A.III: Simple interference counters (Interference Points)
• Performance feedback avoids excessive performance degradations
Paper A.II and A.III
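The greedy idea can be sketched as a per-interval decision rule. This is an illustration under stated assumptions, not the algorithm from Papers A.II or A.III: halving and doubling the MSHR count, the 16-entry maximum, and the 10% tolerated loss are all hypothetical choices.

```python
def greedy_step(mshrs, metric, threshold, perf_now, perf_before,
                max_mshrs=16, max_loss=0.10):
    """Return the MSHR count for one processor in the next interval.

    metric/threshold: e.g. memory bus utilization or an interference counter.
    perf_now/perf_before: the processor's measured performance (e.g. IPC)
    with the current allocation and before the last reduction.
    """
    # Performance feedback: undo a reduction that cost too much.
    if perf_before > 0 and perf_now < (1.0 - max_loss) * perf_before:
        return min(max_mshrs, mshrs * 2)
    # Greedy rule: throttle when the metric exceeds the threshold.
    if metric > threshold:
        return max(1, mshrs // 2)
    return mshrs
```

Calling `greedy_step(16, 0.9, 0.8, 1.0, 1.0)` reduces the allocation to 8, and a later call that observes a 50% performance drop restores it to 16.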
![Page 22: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/22.jpg)
22
INTERFERENCE MEASUREMENT
Miss Bandwidth Management
![Page 23: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/23.jpg)
23
Resource Allocation Baselines
Baseline = Interference-free configuration
Quantify performance impact from interference
Private Mode and Shared Mode
![Page 24: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/24.jpg)
24
Interference Definition
[Figure: diagram relating Shared Mode Latency, Private Mode Latency (measurement and estimate), Interference, and Estimate Error]
![Page 25: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/25.jpg)
25
Offline Interference Measurement
Interference Penalty Frequency (IPF) counts the number of requests that experienced an interference latency of i cycles
Interference Impact Factor (IIF) is the interference latency times the probability of it arising, i.e. IIF(i) = i ∙ P(i)
Paper B.I
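The two definitions can be sketched as follows, assuming the per-request interference latencies have already been collected and that P(i) is estimated as IPF(i) divided by the total number of requests. The function names are hypothetical.

```python
from collections import Counter


def interference_penalty_frequency(interference_latencies):
    """IPF: number of requests that saw an interference latency of i cycles."""
    return Counter(interference_latencies)


def interference_impact_factor(interference_latencies):
    """IIF(i) = i * P(i), with P(i) estimated as IPF(i) / total requests."""
    ipf = interference_penalty_frequency(interference_latencies)
    total = sum(ipf.values())
    if total == 0:
        return {}
    return {i: i * count / total for i, count in ipf.items()}
```

For the latencies `[0, 0, 10, 10, 40]`, IPF is 2, 2, and 1 for 0, 10, and 40 cycles, and IIF is 0.0, 4.0, and 8.0. Summing IIF over all i gives the average interference latency (12 cycles here).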
![Page 26: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/26.jpg)
26
Aggregate Interference Impact
Paper B.I
[Figure: aggregate interference impact (0 to 300) split into Memory Bus, Cache, and Interconnect for CB 1, CB 2, CB 4, Ring 1, Ring 2, and Ring 4 on a 4-core CMP, an 8-core CMP, and a 16-core CMP]
![Page 27: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/27.jpg)
27
Resource Management Baselines
[Figure: Multiprogrammed Baseline (MPB) and Single Program Baseline (SPB), each drawn with Processor A, Processor B, the Shared Cache, the Interconnect, and the Memory Bus]
![Page 28: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/28.jpg)
28
Baseline Weaknesses
• Multiprogrammed Baseline
  – Only accounts for interference in partitioned resources
  – Static and equal division of DRAM bandwidth does not give equal latency
  – Complex relationship between resource allocation and performance
• Single Program Baseline
  – Does not exist in shared mode
Online Interference Measurement: Dynamic Interference Estimation Framework (DIEF)
Paper B.II
![Page 29: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/29.jpg)
29
Online Interference Measurement
• Dynamic Interference Estimation Framework (DIEF)
• Estimates private mode average memory latency
• General, component-based framework
Paper B.II
![Page 30: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/30.jpg)
30
Shared Cache Interference: Auxiliary Tag Directories
[Figure: cache accesses from CPU 0 and CPU 1 to the Shared Cache, with each access marked as a hit or a miss and compared against the auxiliary tag directory contents]
Eviction may not be interference
Eviction is interference
Interference latency cost = miss penalty
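A minimal sketch of the auxiliary tag directory idea, assuming a single fully associative LRU set, no data sharing between CPUs, and one shadow tag directory per CPU that sees only that CPU's accesses. A shared-cache miss counts as interference only if the shadow directory would have hit. The names and the LRU policy are assumptions.

```python
from collections import OrderedDict


class LRUSet:
    """One fully associative LRU set holding block tags."""

    def __init__(self, ways):
        self.ways = ways
        self.tags = OrderedDict()

    def access(self, tag):
        """Touch tag; return True on a hit, False on a miss (with fill)."""
        hit = tag in self.tags
        if hit:
            self.tags.move_to_end(tag)
        else:
            if len(self.tags) >= self.ways:
                self.tags.popitem(last=False)  # evict least recently used
            self.tags[tag] = True
        return hit


def count_interference_misses(accesses, ways, num_cpus):
    """accesses: list of (cpu, tag). Returns interference misses per CPU."""
    shared = LRUSet(ways)
    atds = [LRUSet(ways) for _ in range(num_cpus)]  # private-mode shadow tags
    interference = [0] * num_cpus
    for cpu, tag in accesses:
        shared_hit = shared.access((cpu, tag))
        private_hit = atds[cpu].access(tag)
        if private_hit and not shared_hit:
            interference[cpu] += 1  # would have hit without the other CPUs
    return interference
```

With two ways, the sequence (CPU 0, A), (CPU 1, M), (CPU 1, N), (CPU 0, A) yields one interference miss for CPU 0, because CPU 1 evicted A. If CPU 0 alone accesses A, B, C, A, the final miss on A is not interference, since CPU 0 would have evicted A by itself.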
![Page 31: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/31.jpg)
31
Bus Interference Requirements
• Out-of-order memory bus scheduling
• Shared mode only cache misses and cache hits
• Shared cache writebacks
Computing private latency based on shared mode queue contents is difficult
Emulate private scheduling in the shared mode
![Page 32: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/32.jpg)
32
Memory Latency Estimation Buffer
[Figure: Shared Bus Queue with requests in Arrival Order and a Head Pointer, the emulated Execution Order, a Latency Lookup Table, and Open Page Emulation Registers for Bank 0, Bank 1, ...]
Bank/Page Mapping: A → (0,15), B → (0,19), C → (0,15), D → (1,32)
Estimated Queue Latency = 120 + 40 + 40 = 200
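The emulation can be sketched as below. This is an illustration only: the slide shows the latencies 40, 120, and 200, but their mapping to row hit, closed bank, and row conflict is a guess, and the first-ready, first-come-first-served ordering is an assumed scheduling policy. The numbers are not meant to reproduce the slide's 120 + 40 + 40 = 200 example, whose exact setup is not recoverable from the transcript.

```python
# Assumed latencies in cycles; the mapping of 40/120/200 to these cases
# is a guess, not something the slide states.
LATENCY = {"hit": 40, "closed": 120, "conflict": 200}


def access_latency(open_pages, bank, page):
    """Look up the latency of one request and update the emulated open page."""
    current = open_pages.get(bank)
    if current is None:
        kind = "closed"
    elif current == page:
        kind = "hit"
    else:
        kind = "conflict"
    open_pages[bank] = page
    return LATENCY[kind]


def emulate_private_queue(requests):
    """requests: list of (name, bank, page) tuples in arrival order.

    Emulates a first-ready, first-come-first-served scheduler: the oldest
    request that hits an open page goes first, otherwise the oldest request.
    Returns a list of (name, latency) in emulated execution order.
    """
    pending = list(requests)
    open_pages = {}  # open page emulation registers, one per bank
    executed = []
    while pending:
        ready = [r for r in pending if open_pages.get(r[1]) == r[2]]
        req = (ready or pending)[0]
        pending.remove(req)
        name, bank, page = req
        executed.append((name, access_latency(open_pages, bank, page)))
    return executed
```

With the bank/page mapping from the slide, the sketch executes C ahead of B because C hits the page that A opened in bank 0.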
![Page 33: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/33.jpg)
33
MODEL-BASED MISS BANDWIDTH MANAGEMENT
Miss Bandwidth Management
![Page 34: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/34.jpg)
34
Model-Based Miss Bandwidth Allocation
DIEF provides accurate estimates of the average private mode memory latency
Can we use the estimates provided by DIEF to choose miss bandwidth allocations?
We need a model that relates average memory latency to performance
Paper A.IV
![Page 35: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/35.jpg)
35
Performance Model
Paper A.IV
Observation: The memory latency performance impact depends on the parallelism of memory requests
Very similar in private and shared mode
Shared mode measurements can provide private mode performance estimates
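A minimal sketch of how such an estimate could be formed. It is not the model from Paper A.IV: it assumes stall time equals the number of requests times the average latency divided by the memory request parallelism, and it reads the slide as saying that this parallelism is similar in private and shared mode. All names are hypothetical.

```python
def stall_cycles(num_requests, avg_latency, parallelism):
    """Assumed model: overlapping requests share their latency."""
    return num_requests * avg_latency / parallelism


def estimate_private_ipc(instructions, shared_cycles, num_requests,
                         shared_latency, private_latency, parallelism):
    """Estimate private mode IPC from shared mode measurements.

    Assumes parallelism is similar in both modes, so only the average
    memory latency is swapped when recomputing the stall time.
    """
    shared_stall = stall_cycles(num_requests, shared_latency, parallelism)
    compute_cycles = shared_cycles - shared_stall
    private_stall = stall_cycles(num_requests, private_latency, parallelism)
    return instructions / (compute_cycles + private_stall)
```

For 1000 instructions in 2000 shared-mode cycles, 10 requests, a parallelism of 2, and average latencies of 300 cycles (shared) and 150 cycles (private), the sketch removes 750 stall cycles and estimates a private-mode IPC of 0.8.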
![Page 36: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/36.jpg)
36
Bandwidth Management Flow
Paper A.IV
Measurement: Shared Mode Memory Latency, Private Mode Memory Latency, Committed Instructions, Number of Memory Requests, CPU Stall Time
Modeling: Per-CPU Models, Perf. Metric Model
Allocation: Find MSHR allocation that maximizes the chosen performance metric, then set number of MSHRs for all last-level private caches
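The allocation step can be sketched as a search over candidate MSHR counts. Exhaustive search is used here only for illustration (its cost grows exponentially with the number of CPUs); the per-CPU model callables and the default metric (sum of predicted performance) are hypothetical stand-ins for the models in the flow above.

```python
from itertools import product


def best_allocation(per_cpu_models, mshr_options, metric=sum):
    """Exhaustively pick one MSHR count per CPU.

    per_cpu_models: one callable per CPU mapping (own MSHR count, full
        allocation tuple) to that CPU's predicted performance.
    metric: combines per-CPU predictions into the value to maximize.
    """
    best, best_score = None, float("-inf")
    for alloc in product(mshr_options, repeat=len(per_cpu_models)):
        predicted = [m(n, alloc) for m, n in zip(per_cpu_models, alloc)]
        score = metric(predicted)
        if score > best_score:
            best, best_score = alloc, score
    return best, best_score
```

With a toy model in which CPU 0 gains from up to two MSHRs but is slowed by CPU 1's misses, the search gives CPU 0 two MSHRs and throttles CPU 1 to one.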
![Page 37: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/37.jpg)
37
OFF-CHIP BANDWIDTH MANAGEMENT
![Page 38: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/38.jpg)
38
Modern DRAM Interfaces
• Maximize bandwidth with 3D organization
• Repeated requests to the row buffer are very efficient
[Figure: DRAM organized into Banks, Rows, and Columns; the row address selects a row into the Row Buffer and the column address selects data within it]
![Page 39: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/39.jpg)
39
Low-Cost Open Page Prefetching
• Idea: Piggyback prefetches to open DRAM pages on demand reads
• Performance win if prefetcher accuracy is above ~40%
Paper C.I
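A minimal sketch of the piggybacking idea, assuming a sequential next-block prefetcher, 64-byte blocks, and 1024-byte DRAM pages. Gating on a measured accuracy is also an assumption: the slide only reports that the scheme wins above roughly 40% accuracy. This is not the mechanism from Paper C.I.

```python
BLOCK_SIZE = 64      # bytes per cache block (assumed)
PAGE_SIZE = 1024     # bytes per DRAM page / row (assumed)
MIN_ACCURACY = 0.40  # from the slide: win above roughly 40% accuracy


def piggyback_prefetches(demand_addr, accuracy, degree=1):
    """Return prefetch addresses to issue together with a demand read.

    Only blocks in the same DRAM page as the demand read are candidates,
    so each prefetch is served from the already open row.
    """
    if accuracy < MIN_ACCURACY:
        return []
    page = demand_addr // PAGE_SIZE
    block = demand_addr - demand_addr % BLOCK_SIZE
    candidates = [block + i * BLOCK_SIZE for i in range(1, degree + 1)]
    return [a for a in candidates if a // PAGE_SIZE == page]
```

A demand read at address 0 piggybacks a prefetch for address 64, while a read at address 960 piggybacks nothing because the next block lies in a different page.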
![Page 40: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/40.jpg)
40
Opportunistic Prefetch Scheduling
[Figure: Page Vector Table (PVT) with entries 99 to 102, marking each block as Demand Access, Prefetch Request, or Issued Prefetch]
Idea: Issue prefetches when a page is closed
Increased efficiency: 8 transfers for 3 activations
Paper C.II
![Page 41: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/41.jpg)
41
CONCLUSION
![Page 42: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/42.jpg)
42
Conclusion
• Managing bandwidth allocations can improve CMP system performance
• Miss bandwidth management
  – Greedy allocations
  – Management guided by accurate measurements and performance models
• Off-chip bandwidth management with prefetching
![Page 43: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/43.jpg)
43
Thank You
Visit our website: http://research.idi.ntnu.no/multicore/
![Page 44: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/44.jpg)
44
EXTRA SLIDES
![Page 45: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/45.jpg)
45
Future Work
• Performance-directed management of shared caches and the memory bus
• Improving OS and system software with dynamic measurements
• Combining dynamic MHAs with prefetching to improve system performance
• Managing workloads of single-threaded and multi-threaded benchmarks
![Page 46: 1 PhD Defense Presentation Managing Shared Resources in Chip Multiprocessor Memory Systems 12. October 2010 Magnus Jahre](https://reader037.vdocument.in/reader037/viewer/2022110304/551940cf5503467e738b45a0/html5/thumbnails/46.jpg)
46
Example Chip Multiprocessor
[Figure: Private Memory System: CPU 1 to CPU 4, each with a D-Cache and an I-Cache. Shared Memory System: Interconnect, Shared Cache, Memory Controller, Memory Bus, and Main Memory]