1 pipeline and vector processing chapter # 9. 2 contents parallel processing pipelining arithmetic...

28
1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9

Upload: emmeline-deeble

Post on 14-Dec-2015

343 views

Category:

Documents


9 download

TRANSCRIPT

Page 1: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

1

PIPELINE AND VECTOR PROCESSING

CHAPTER # 9

Page 2: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

2

CONTENTS

Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector Processing Array Processors

Page 3: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

3

Figure 9-1

Processor with multiple functional units

Integer multiply

Adder-sub tractor

Floating-pointmultiply

Floating-pointdivide

Floating-pointAdd-subtract

Incrementer

Logic unit

Shift unit

Processorregister

To memory

Page 4: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

4

Instruction and stream.

Single instruction stream, single data stream (SISD).

Single instruction stream, multiple data stream (SIMD).

Multiple instruction stream, single data stream (MISD).

Multiple instruction stream, multiple data stream (MIMD).

Page 5: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

5

Figure 9-2

Example of Pipelining.

Ai Bi Ci

R1 Ai , R2 Bi

Input Ai and Bi

R3 R1 * R2, R4 Ci Multiply and input Ci

R5 R3 + R4 Add Ci to product

R1 R2

Multiplier

R3 R4

Adder

R5

Page 6: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

6

ClockPulsenumber

Segment1R1 R2

Segment2R3 R4

Segment3 R5

1 A1 B1 ---- ---- ----

2 A2 B2 A1*B1 C1 ---- 3 A3 B3 A2*B2 C2 A1*B1+C1

4 A4 B4 A3*B3 C3 A2*B2+C2

5 A5 B5 A4*B4 C4 A3*B3+C3

6 A6 B6 A5*B5 C5 A4*B4+C4

7 A7 B7 A6*B6 C6 A5*B5+C5

8 ---- ---- A7*B7 C7 A6*B6+C6

9 ---- ---- ---- ---- A7*B7+C7

Table 9-1

Content of registers in pipeline example.

Page 7: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

7

Figure 9-3

Four segment pipeline.

Clock

Input R4S1 R3R2 S4S3S2R1

Page 8: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

8

Figure 9-4

Space-time diagram for pipeline.

1 2 3 4 5 6 7 8 9

T1 T2 T3 T4 T5 T6

T1 T2 T3 T4 T5 T6

T1 T2 T3 T4 T5 T6

T1 T2 T3 T4 T5 T6

Clock cycleSegment

:1

2

3

4

Page 9: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

9

Figure 9-5

Multiple functional units in parallel.

P4

Ii+3

P3

Ii+2

P2

Ii+1

P1

Ii

Page 10: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

10

Arithmetic Pipeline

Compare the exponents. Align the mantissas. Add or subtract the mantissas. Normalize the result.

Page 11: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

11

Mantissas Exponents

a b A B

Segment 1

Segment 2

Segment 3

Segment 4

R

CompareExponentBy subtraction

R R

Choose exponent

R

Adjust

Exponent

R

R

Align mantissas

R

R

Add or subtract

mantissas

Normalize

result

R

Difference

Figure 9-6

Pipeline for floating-point and subtraction.

Page 12: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

12

Instruction Pipeline

Fetch the instruction from memory. Decode the instruction. Calculate the effective address. Fetch the operands from memory. Execute the instruction. Store the result in the proper place.

Page 13: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

13

Figure 9-7

Four-segment CPU pipeline.

Segment 1

Segment 2

Segment 3

Segment 4

Decode instructionAnd calculateEffective address

Fetch instruction from memory

Branch?

Fetch operandFrom memory

Execute instruction

Interrupt?Interrupthandling

Update PC

Empty pipe

yes

no

yes

no

Page 14: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

14

FI is the segment that fetches an instruction.

DA is the segment that decodes the instruction and calculate the effective address.

FO is the segment that fetches the operand.

EX is the segment that executes the instruction.

Segments and their purpose.

Page 15: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

15

1 2 3 4 5 6 7 8 9 10 11 12 13

1

2

3

4

5

6

7

Step:

Instruction:

(Branch)

FI DA FO EX

FI DA FO

FO FI DA

EX

EX

EX

EX

EX

EX

FO

FO

FO

FO

DA

DA

DA

DA

FI

FI

FI

FI

FI -- --

-- -- --

Figure 9-8

Timing of instruction pipeline.

Page 16: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

16

Pipeline Conflicts

Resource conflicts Data dependency conflicts Branch difficulties conflicts

Page 17: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

17

Three-segment instruction pipeline

I: Instruction fetch A: ALU operation E: Execute instruction

Page 18: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

18

Delayed Load

LOAD R1 M[address 1]

LOAD R2 M[address 2]

ADD R3 R1+R2

STORE M[address 3] R3

Page 19: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

19

E

654321

I

Clock cycles

I

I

I

A

A

A

A

E

E

E

E

1. Load R1

2. Load R2

3. Add R1+R2

4. Store R3

Pipeline timing with data conflict

7654321 Clock cycles

1. Load R1

2. Load R2

3. No-operation

4. Add R1+R2

5. Store R3

I

I

I

I

I

A

A

A

A

A

E

E

E

E

Pipeline timing with delayed load

Figure 9-9

Three segment pipeline timing.

Page 20: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

20

Figure 9-10

Examples of delayed branch.

I

Clock cycles

I

I

I

A

A

A

A

E

E

E

E

1. Load

2. Increment

3. Add

4. Subtract

10987654321

5. Branch to X

6. NO-operation

7. NO-operation

8. Instruction in X

I A

I A E

E

I A E

I A E

Using no-operation instructions

Page 21: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

21

I

Clock cycles

I

I

A

A

A

E

E

E

1. Load

2. Increment

4. Add

5. Subtract

3. Branch to X

6. Instruction in X

I A

I A E

E

I A E

1 2 3 4 5 6 7 8

Figure 9-10

Examples of delayed branch.

Rearranging instruction

Page 22: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

22

Application of Vector Processing

Long range weather forecasting. Petroleum explorations. Seismic data analysis. Medical diagnosis. Aerodynamics and space flight simulations.

Page 23: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

23

Figure 9-11

Instruction format for vector processor

Operationcode

Base addressSource 1

Base addressSource 2

Base addressdestination

Vectorlength

Page 24: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

24

Figure 9-12

Pipeline for calculating an inner product

SourceA

SourceB

Multiplier pipeline

Adder pipeline

Page 25: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

25

Figure 9-13

Multiple module memory organization

AR AR AR AR

DRDRDRDR

Memoryarray

Memoryarray

Memoryarray

Memoryarray

Address bus

Data bus

Page 26: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

26

Types of Array Processors

Attached Array Processor SIMD Array Processor

Page 27: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

27

Figure 9-14

Attached Array Processor with host computer

General-Purposecomputer

input-outputinterface

Attached arrayprocessor

Local memoryMain memoryHigh-speed memory to

Memory bus

Page 28: 1 PIPELINE AND VECTOR PROCESSING CHAPTER # 9. 2 CONTENTS Parallel Processing Pipelining Arithmetic Pipeline Instruction Pipeline RISC Pipeline Vector

28

Figure 9-15

SIMD array processor organization

Master controlunit

Main memory

PE1

PE2

PE3

PEn

M1

M2

M3

Mn