Chapter 6 Multiprocessor System


Page 1:

Chapter 6 Multiprocessor System

Page 2:

Introduction

Each processor in a multiprocessor system can be executing a different instruction at any time.

The major advantages of an MIMD system
– Reliability
– High performance

The overheads involved with MIMD
– Communication between processors
– Synchronization of the work
– Wasted processor time if any processor runs out of work to do
– Processor scheduling

Page 3:

Introduction (continued)

Task
– An entity to which a processor is assigned
– A program, a function, or a procedure in execution

Process
– Another word for a task

Processor (or processing element)
– The hardware resource on which tasks are executed

Page 4:

Introduction (continued)

Thread
– The sequence of tasks performed in succession by a given processor
– The path of execution of a processor through a number of tasks
– Multiprocessors provide for the simultaneous presence of a number of threads of execution in an application
– Refer to Example 6.1 (degree of parallelism = 3)

Page 5:

R-to-C ratio

A measure of how much overhead is produced per unit of computation.
– R: the length of the run time of the task (the computation time)
– C: the communication overhead

This ratio signifies task granularity. A high R-to-C ratio implies that communication overhead is insignificant compared to computation time.
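As a hedged numerical illustration (the numbers are mine, not the text's): a task that computes for R = 10 ms and then spends C = 0.1 ms communicating has an R-to-C ratio of 100, so it is coarse grained and overhead is negligible; if the same work is split so finely that each task computes for only R = 0.2 ms with the same C = 0.1 ms, the ratio drops to 2 and communication consumes a third of each task's total time.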

Page 6:

Task granularity

– Coarse grain parallelism: high R-to-C ratio
– Fine grain parallelism: low R-to-C ratio
– The general tendency for maximizing performance is to resort to the finest possible granularity, providing for the highest degree of parallelism.
– Maximum parallelism does not, however, lead to maximum performance, because finer granularity also increases the communication overhead; a trade-off is required to reach an optimum level.

Page 7:

6.1 MIMD Organization (Figure 6.2)

Two popular MIMD organizations
– Shared memory (or tightly coupled) architecture
– Message passing (or loosely coupled) architecture

Shared memory architecture
– UMA (uniform memory access)
– Rapid memory access
– Memory contention

Page 8:

6.1 MIMD Organization (continued)

Message-passing architecture (see the sketch below)
– Distributed memory MIMD system
– NUMA (nonuniform memory access)
– Heavy communication overhead for remote memory access
– No memory contention problem

Other models
– A mix of the two
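As a loose illustration of the loosely coupled model (my own sketch, not from the text), the C program below lets two Unix processes with separate address spaces exchange a value through a pipe. In a shared-memory (tightly coupled) machine the same exchange would be an ordinary store and load; here it requires an explicit send and receive, which is where the communication overhead of message-passing systems comes from.

    /* Message passing between two processes with separate address spaces. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        int fd[2];                       /* fd[0]: read end, fd[1]: write end */
        if (pipe(fd) == -1) { perror("pipe"); return 1; }

        pid_t pid = fork();
        if (pid == 0) {                  /* child: the "remote" processor */
            int value;
            read(fd[0], &value, sizeof value);    /* receive the message */
            printf("child received %d\n", value);
            _exit(0);
        } else {                         /* parent: the sending processor */
            int value = 42;
            write(fd[1], &value, sizeof value);   /* send the message */
            wait(NULL);
        }
        return 0;
    }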

Page 9:

6.2 Memory Organization

Two parameters of interest in MIMD memory system design
– Bandwidth
– Latency

Memory latency is reduced by increasing the memory bandwidth
– By building the memory system with multiple independent memory modules (banked and interleaved memory architectures; see the address-mapping sketch below)
– By reducing the memory access and cycle times
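A minimal sketch (my own illustration, not from the text) of low-order interleaving across independent modules: with a power-of-two number of modules, the low-order address bits select the module, so consecutive word addresses fall in different banks and can be accessed concurrently.

    /* Low-order interleaved memory: map a word address to (module, offset). */
    #include <stdio.h>

    #define NUM_MODULES 4                   /* assumed: a power of two */

    static void map(unsigned addr, unsigned *module, unsigned *offset) {
        *module = addr % NUM_MODULES;       /* low-order bits pick the bank  */
        *offset = addr / NUM_MODULES;       /* remaining bits index into it  */
    }

    int main(void) {
        for (unsigned addr = 0; addr < 8; addr++) {
            unsigned m, off;
            map(addr, &m, &off);
            printf("address %u -> module %u, offset %u\n", addr, m, off);
        }
        return 0;
    }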

Page 10:

Multi-port memories (Figure 6.3 (b))
– Each memory module is a three-port memory device.
– All three ports can be active simultaneously.
– The only restriction is that only one port can write into a given memory location at a time.

Page 11:

Cache incoherence

The problem wherein the value of a data item is not consistent throughout the memory system.
– Write-through (compared with write-back in the sketch below)
  A processor updates the cache and also the corresponding entry in the main memory.
  – Updating protocol
  – Invalidating protocol
– Write-back
  An updated cache block is written back to the main memory just before that block is replaced in the cache.
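A rough sketch (my own, with invented variable names) of how the two policies differ in main-memory traffic: write-through issues a memory write on every STORE, while write-back only marks the cached block dirty and writes memory once, when the block is eventually replaced.

    /* Count main-memory writes for ten stores that hit the same cache block. */
    #include <stdio.h>
    #include <stdbool.h>

    int main(void) {
        const int stores = 10;
        int wt_mem_writes = 0;      /* write-through traffic */
        int wb_mem_writes = 0;      /* write-back traffic    */
        bool dirty = false;         /* write-back dirty bit  */

        for (int i = 0; i < stores; i++) {
            wt_mem_writes++;        /* write-through: memory is updated on every store */
            dirty = true;           /* write-back: only the cached copy is updated     */
        }
        if (dirty) {                /* block is replaced: flush the dirty copy once    */
            wb_mem_writes++;
            dirty = false;
        }
        printf("write-through: %d memory writes, write-back: %d\n",
               wt_mem_writes, wb_mem_writes);
        return 0;
    }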

Page 12:

6.2 Memory Organization (continued)

Cache coherence schemes
– Do not use private caches (Figure 6.4)
– Use a private cache architecture, but cache only non-sharable data items.
– Cache flushing
  Shared data are allowed to be cached only when it is known that only one processor will be accessing the data.

Page 13:

6.2 Memory Organization (continued)

Cache coherence schemes (continued)
– Bus watching (or bus snooping) (Figure 6.5)
  Bus watching schemes incorporate hardware into each processor's cache controller that monitors the shared bus for data LOAD and STORE operations. (A small snooping sketch follows below.)
– Write-once
  The first STORE causes a write-through to the main memory.
– Ownership protocol
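A toy sketch (my own simplification, not the book's protocol) of the write-invalidate style of bus watching: every cache controller observes every STORE on the shared bus, and when the writer is some other processor it invalidates its own copy of the affected block.

    /* Two snooping caches: a remote STORE invalidates the local copy of a block. */
    #include <stdio.h>

    enum state { INVALID, VALID };          /* simplified per-block cache state */

    struct cache { enum state block_state; };

    /* Called for every STORE seen on the bus, in every cache controller. */
    static void snoop_store(struct cache *c, int writer_is_self) {
        if (!writer_is_self && c->block_state == VALID)
            c->block_state = INVALID;       /* another processor wrote: drop our copy */
    }

    int main(void) {
        struct cache c0 = {VALID}, c1 = {VALID};   /* both caches hold the block */
        /* Processor 0 performs a STORE to the block; both controllers snoop it. */
        snoop_store(&c0, 1);
        snoop_store(&c1, 0);
        printf("cache0: %s, cache1: %s\n",
               c0.block_state == VALID ? "VALID" : "INVALID",
               c1.block_state == VALID ? "VALID" : "INVALID");
        return 0;
    }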

Page 14:

6.3 Interconnection Network

Bus (Figure 6.6)
– Bus window (Figure 6.7 (a))
– Fat tree (Figure 6.7 (b))

Loop or ring
– Token ring standard

Mesh

Page 15:

6.3 Interconnection Network (continued)

Hypercube
– Routing is straightforward (see the routing sketch below).
– The number of nodes must be increased by powers of two.

Crossbar
– It offers multiple simultaneous communications, but at a high hardware complexity.

Multistage switching networks
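A brief sketch (my own illustration) of why hypercube routing is straightforward: XOR-ing the source and destination node numbers marks exactly the dimensions in which they differ, and the message is forwarded across one differing dimension at a time until no differences remain.

    /* Dimension-order (e-cube style) routing in a d-dimensional hypercube. */
    #include <stdio.h>

    static void route(unsigned src, unsigned dst, unsigned dims) {
        unsigned node = src;
        unsigned diff = src ^ dst;          /* bits that still differ */
        printf("route %u -> %u:", src, dst);
        for (unsigned d = 0; d < dims; d++) {
            if (diff & (1u << d)) {         /* correct one differing dimension */
                node ^= (1u << d);          /* hop to the neighbor in dim d    */
                printf(" %u", node);
            }
        }
        printf("\n");
    }

    int main(void) {
        route(0u, 5u, 3u);                  /* 3-cube: 000 -> 001 -> 101, two hops */
        return 0;
    }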

Page 16:

6.4 Operating System Considerations

The major functions of the multiprocessor operating system
– Keeping track of the status of all the resources at all times
– Assigning tasks to processors in a justifiable manner
– Spawning and creating new processes such that they can be executed in parallel or independently of each other
– Collecting their individual results when all the spawned processes are completed and passing them to other processors as required

Page 17:

6.4 Operating System Considerations (continued)

Synchronization mechanisms
– Processes in an MIMD system operate in a cooperative manner, and a sequence control mechanism is needed to ensure the ordering of operations.
– Processes also compete with each other to gain access to shared data items.
– An access control mechanism is needed to maintain orderly access.

Page 18:

6.4 Operating System Considerations (continued)

Synchronization mechanisms
– The most primitive synchronization techniques
  – Test & set (a spinlock sketch follows below)
  – Semaphores
  – Barrier synchronization
  – Fetch & add

Heavy-weight processes and light-weight processes

Scheduling
– Static
– Dynamic: load balancing
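A minimal sketch (my own, using C11 atomics; nothing here is taken from the text) of the test & set idea: the flag is read and set in a single atomic step, and a processor spins until it observes the flag clear, which gives it exclusive entry to the critical section.

    /* Test-and-set spinlock built on the C11 atomic_flag type. */
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_flag lock = ATOMIC_FLAG_INIT;
    static int shared_counter = 0;

    static void acquire(void) {
        /* test_and_set returns the previous value: spin while it was already set */
        while (atomic_flag_test_and_set(&lock))
            ;                                   /* busy-wait */
    }

    static void release(void) {
        atomic_flag_clear(&lock);
    }

    int main(void) {
        acquire();
        shared_counter++;                       /* critical section */
        release();
        printf("counter = %d\n", shared_counter);
        return 0;
    }

In a real multiprocessor each processor would call acquire() and release() around its own accesses to the shared data; the single-threaded main() above only demonstrates the mechanism.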

Page 19:

6.5 Programming (continued)

Four main structures of parallel programming
– Parbegin / parend
– Fork / join (a small fork/join sketch follows below)
– Doall
– Processes, tasks, procedures, and so on can be declared for parallel execution.
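A small sketch (my own, using POSIX threads; the helper names are invented) of the fork/join structure: the main program forks worker threads, each thread processes its own share of the loop iterations in doall fashion, and the main program joins the workers before combining their results.

    /* Fork/join with POSIX threads: sum an array in two halves. */
    #include <pthread.h>
    #include <stdio.h>

    #define N 8
    static int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};

    struct range { int lo, hi; long sum; };

    static void *worker(void *arg) {            /* each thread sums its own slice */
        struct range *r = arg;
        for (int i = r->lo; i < r->hi; i++)
            r->sum += data[i];
        return NULL;
    }

    int main(void) {
        pthread_t t[2];
        struct range r[2] = { {0, N / 2, 0}, {N / 2, N, 0} };

        for (int i = 0; i < 2; i++)             /* fork: spawn the parallel tasks */
            pthread_create(&t[i], NULL, worker, &r[i]);
        for (int i = 0; i < 2; i++)             /* join: wait for both to finish  */
            pthread_join(t[i], NULL);

        printf("sum = %ld\n", r[0].sum + r[1].sum);   /* prints sum = 36 */
        return 0;
    }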

Page 20:

6.6 Performance Evaluation and Scalability

Performance evaluation
– Speed-up: S = Ts / Tp
  To = p·Tp − Ts, so Tp = (To + Ts)/p and S = Ts·p / (To + Ts)
– Efficiency: E = S/p = Ts / (Ts + To) = 1 / (1 + To/Ts)
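As a quick check of these formulas (the numbers are mine, chosen for illustration): with a serial run time Ts = 100 time units and a parallel run time Tp = 30 on p = 4 processors, the total overhead is To = 4·30 − 100 = 20, the speed-up is S = 100/30 ≈ 3.3 (equivalently 100·4 / (20 + 100) ≈ 3.3), and the efficiency is E = S/4 ≈ 0.83 = 1 / (1 + 20/100).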

Page 21:

Scalability

Scalability: the ability to increase speedup as the number of processors increases.

A parallel system is scalable if its efficiency can be maintained at a fixed value by increasing the number of processors as the problem size increases.
– Time-constrained scaling
– Memory-constrained scaling

Page 22:

Isoefficiency function

E = 1 / (1 + To/Ts), so To/Ts = (1 − E)/E and hence Ts = E·To / (1 − E).

For a given value of E, E/(1 − E) is a constant, K. Then Ts = K·To (the isoefficiency function).

A small isoefficiency function indicates that small increments in problem size are sufficient to maintain efficiency when p is increased.
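A worked illustration (my numbers): to hold efficiency at E = 0.8, K = E/(1 − E) = 4, so the serial work Ts must grow as four times the total overhead To. If adding processors raises To from 20 to 50 time units, the problem size must grow so that Ts rises from 80 to 200, keeping E = 1/(1 + To/Ts) = 0.8 in both cases.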

Page 23:

6.6 Performance Evaluation and Scalability (continued)

Performance models
– The basic model
  – Each task is equal and takes R time units to be executed on a processor.
  – If two tasks on different processors wish to communicate with each other, they do so at a cost of C time units.
– Model with linear communication overhead
– Model with overlapped communication
– Stochastic model
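As a rough illustration of the basic model (the task graph and numbers are my own, not an example from the text): suppose 8 equal tasks of R time units each are spread over 4 processors, and each processor must exchange one message of cost C with a neighbor between its two tasks. The parallel time is then about 2R + C versus 8R serially, so the speed-up is roughly 8R / (2R + C) = 8 / (2 + C/R); with a high R-to-C ratio this approaches the ideal value of 4, which is how the model ties task granularity to achievable speed-up.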

Page 24:

Examples

Alliant FX series
– Figure 6.17
– Parallelism
  – Instruction level
  – Loop level
  – Task level