Understanding the TigerSHARC ALU pipeline Determining the speed of one stage of IIR filter

Post on 20-Dec-2015


Speed IIR -- stage 1, M. Smith, ECE, University of Calgary, Canada

2 / 30    04/18/23

Understanding the TigerSHARC ALU pipeline

TigerSHARC has many pipelines

If these pipelines stall, then the processor speed goes down

Need to understand how the ALU pipeline works

Learn to use the pipeline viewer

The answer may be different for floating-point and integer operations


Register File and COMPUTE Units


Simple Example: IIR -- Biquad

For (Stages = 0 to 3) Do
    S0 = Xin * H5 + S2 * H3 + S1 * H4
    Yout = S0 * H0 + S1 * H1 + S2 * H2
    S2 = S1
    S1 = S0

[Figure labels: S0, S1, S2]
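The biquad update above can be sketched in C++ as a reference to test the assembly version against. This is a minimal sketch, not the author's code: the `BiquadStage` struct and `biquad_step` function are hypothetical names, and the sketch covers one stage only, because the slide does not say how the four stages are chained.

```cpp
#include <array>

// Hypothetical container for one biquad stage: the six coefficients H0..H5
// and the two delayed state values S1 and S2 named on the slide.
struct BiquadStage {
    std::array<float, 6> h{};  // h[0]..h[5] stand for H0..H5
    float s1 = 0.0f;
    float s2 = 0.0f;
};

// One update of a single stage, following the slide's four assignments in order.
float biquad_step(BiquadStage& st, float xin) {
    const float s0   = xin * st.h[5] + st.s2 * st.h[3] + st.s1 * st.h[4];
    const float yout = s0 * st.h[0] + st.s1 * st.h[1] + st.s2 * st.h[2];
    st.s2 = st.s1;  // S2 = S1
    st.s1 = s0;     // S1 = S0
    return yout;
}
```

Calling `biquad_step` once per input sample keeps S1 and S2 between calls, which is what the delay elements in the figure do.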


Set up the tests. We want to make sure the answer stays correct as the code changes


Step 1 – Stub plus return value

Build an assembly language stub for float iirASM(void);

Make it return a floating-point value of 40.5 to show that a floating-point value can be returned

J8 is an INTEGER register, so how can we return 40.5?

ANSWER – WE DON’T. We return the “bit pattern” for 40.5, which is an “INTEGER”
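The "bit pattern" idea can be checked on a PC with a few lines of C++. This is a sketch that assumes `float` is a 32-bit IEEE 754 single-precision value; the helper names `float_bits` and `bits_to_float` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Return the 32-bit pattern that represents a float, as an unsigned integer.
// Assumes float is a 32-bit IEEE 754 single-precision value.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the bytes, no numeric conversion
    return bits;
}

// Reinterpret a 32-bit integer pattern as a float (the reverse direction).
float bits_to_float(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Under that assumption `float_bits(40.5f)` is `0x42220000`: sign 0, biased exponent 132, mantissa bits 0100010 followed by zeros. An integer register holding this value carries 40.5 as far as the caller is concerned.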


Code does not work when passing back floats with J8 register


Code does work when using XR8 register – NOTE NOT XFR8


Step 2 – Using C++ code as comments set up the coefficients

XFR0 = 0.0;;    Does not exist

XR0 = 0.0;;     DOES EXIST

Bit patterns require integer registers

Leave what you wanted to do behind as comments


Modify C++ code so that it can be translated into assembly code

Can only have 1 instruction per line

Code must execute sequentially so remember the ;;


Start with S0 = Xin instruction

Can’t use

XFR8 = XFR6

to copy a register


Since XFR8 = XFR6 is not allowed, try XR8 = R6;

SIMD: Single Instruction, Multiple Data

R6 means move XR6 and YR6 (Multiple data move described in 1 instruction)

Try XR8 = XR6
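The difference between an X-prefixed register and an unprefixed R register can be pictured with a small C++ model. This is an illustrative sketch only: `ComputeBlocks`, `Target` and `move_reg` are hypothetical names, and the model covers nothing more than "X only" versus "X and Y together".

```cpp
#include <array>
#include <cstdint>

// Hypothetical model: the X and Y compute blocks each hold 32 32-bit registers.
struct ComputeBlocks {
    std::array<std::uint32_t, 32> x{};
    std::array<std::uint32_t, 32> y{};
};

// Which compute block(s) a register name refers to.
enum class Target { X, Y, Both };

// Copy register src to register dst in the selected block(s).
// Target::X models XR8 = XR6; Target::Both models the SIMD form with plain R names.
void move_reg(ComputeBlocks& cb, Target t, int dst, int src) {
    if (t == Target::X || t == Target::Both) cb.x[dst] = cb.x[src];
    if (t == Target::Y || t == Target::Both) cb.y[dst] = cb.y[src];
}
```

With `Target::Both` the Y block is written as well, so whatever happens to be in YR6 lands in YR8 even if only the X result was wanted.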


Some operations are FLOAT operations and must have XFR on the left side of the equation BUT only R on the right

Some operations are SISD operations and must have XR on both sides of the equation (or just R on both sides of the equation, making them SIMD X and Y with garbage happening on Y)

Personally, I think all these problems are “assembler” issues and could be made consistent


Disconnect from target and go to simulator


Activate Simulator


Rebuild the project and set breakpoints at start and end of ASM code


Activate the pipeline viewer


Adjust the pipeline window so you can see all the instruction pipeline stages


PIPELINE STAGES
See page 8-34 of Processor manual

Instruction fetch -- F1, F2, F3 and F4

Fetch Unit Pipe – memory driven

128 bits fetched – may make up 1, 2, 3, or 4 instructions (or parts of a couple of instructions)

Instructions go into the IAB, instruction alignment buffer

Integer ALU pipe – PD, D, I and A


PIPELINE STAGES
See page 8-34 of Processor manual

10 pipeline stages, but may be completely desynchronized (happen semi-independently)

Instruction fetch -- F1, F2, F3 and F4

Integer ALU – PreDecode, Decode, Integer, Access

Compute Block – EX1 and EX2
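As a quick cross-check of the count, the ten stages named above can be written out as a table. This is a sketch; `Stage`, `kPipelineStages` and `stages_in_pipe` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cstddef>
#include <string>

// One pipeline stage and the pipe it belongs to, as listed on the slide.
struct Stage {
    const char* pipe;
    const char* name;
};

const std::array<Stage, 10> kPipelineStages = {{
    {"Instruction fetch", "F1"},
    {"Instruction fetch", "F2"},
    {"Instruction fetch", "F3"},
    {"Instruction fetch", "F4"},
    {"Integer ALU", "PreDecode"},
    {"Integer ALU", "Decode"},
    {"Integer ALU", "Integer"},
    {"Integer ALU", "Access"},
    {"Compute block", "EX1"},
    {"Compute block", "EX2"},
}};

// Count how many of the listed stages belong to the named pipe.
std::size_t stages_in_pipe(const std::string& pipe) {
    std::size_t count = 0;
    for (const Stage& s : kPipelineStages) {
        if (pipe == s.pipe) ++count;
    }
    return count;
}
```

Four fetch stages, four integer ALU stages and two compute block stages give the ten stages the slide mentions.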


PIPELINE STAGES
See page 8-34 of Processor manual

Instruction fetch -- F1, F2, F3 and F4

Fetch Unit Pipe

Memory driven, not instruction driven

128 bits fetched – may make up 1, 2, 3, or 4 instruction lines (or parts of a couple of instruction lines)

Instruction fetched into IAB, instruction alignment buffer


PIPELINE STAGES
See page 8-34 of Processor manual

Integer ALU pipe – PD, D, I and A

PreDecode – the next COMPLETE instruction line (1, 2, 3 or 4) fetched from IAB

Decode – different instructions dispatched to different execution units (J-IALU, K-IALU, Compute Blocks)

Data memory access starts in the Integer stage

A stands for Access stage

Results are not available until the EX2 stage, but (by register forwarding) can sometimes be accessed earlier


PIPELINE STAGES
See page 8-34 of Processor manual

Compute Block – EX1 and EX2

Result is always written to the target register on the rising edge of CCLK after stage EX2

The following is guaranteed:

R2 = R0 + R1; R6 = R2 * R3;;

R2 = R0 + R1 writes R2 at the end of the instruction line; R6 = R2 * R3 uses the R2 value from the beginning of the instruction line
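The guarantee for R2 = R0 + R1; R6 = R2 * R3;; can be illustrated with a small C++ model that compares it with the same two instructions placed on separate lines. This is a sketch of the register semantics only, with hypothetical names (`Regs`, `run_parallel_line`, `run_sequential_lines`); it uses integer arithmetic and says nothing about timing or stalls.

```cpp
#include <array>

// Hypothetical model of one compute block's register file.
using Regs = std::array<int, 32>;

// Model of the single instruction line  R2 = R0 + R1; R6 = R2 * R3;;
// Every source register is read from the values held at the start of the line.
Regs run_parallel_line(const Regs& before) {
    Regs after = before;
    after[2] = before[0] + before[1];
    after[6] = before[2] * before[3];  // uses the OLD value of R2
    return after;
}

// Model of two separate lines  R2 = R0 + R1;;  R6 = R2 * R3;;
// The second line sees the R2 written by the first.
Regs run_sequential_lines(const Regs& before) {
    Regs after = before;
    after[2] = after[0] + after[1];
    after[6] = after[2] * after[3];    // uses the NEW value of R2
    return after;
}
```

Starting from R0 = 1, R1 = 2, R2 = 10, R3 = 4, the parallel line leaves R6 = 40 while the two sequential lines leave R6 = 12; R2 ends up as 3 in both cases.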


Only interested in later stages of the pipeline. Adjust properties


Run the code until the first ASM breakpoint. Note the cycle number: 39830

Then run again until you reach the second ASM breakpoint

Calculate execution time
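The calculation is a subtraction of the two cycle counts. A minimal sketch follows; the function names are hypothetical, and the clock frequency is a parameter because the slides do not state one.

```cpp
#include <cassert>
#include <cstdint>

// Cycles spent between two breakpoints, from the simulator's cycle counter.
std::uint64_t elapsed_cycles(std::uint64_t at_first_breakpoint,
                             std::uint64_t at_second_breakpoint) {
    assert(at_second_breakpoint >= at_first_breakpoint);
    return at_second_breakpoint - at_first_breakpoint;
}

// Convert a cycle count to seconds for a given core clock.
// clock_hz is an assumption supplied by the caller; no value comes from the slides.
double elapsed_seconds(std::uint64_t cycles, double clock_hz) {
    return static_cast<double>(cycles) / clock_hz;
}
```

Here `elapsed_cycles(39830, n)` gives the cycle count for the ASM routine, where `n` is the cycle number read at the second breakpoint.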


Pipeline viewer says 26 cycles, but what do we expect?

8 cycles


Pipeline viewer says 26 cycles, but what do we expect -- 21

13 cycles expected

Where are the extra cycles coming from, and how easy is it to code in such a way that the extra cycles can be removed?

ANSWER: Fairly straightforward in idea, can be difficult in practice


Understanding the TigerSHARC ALU pipeline

TigerSHARC has many pipelines

If these pipelines stall, then the processor speed goes down

Need to understand how the ALU pipeline works

Learn to use the pipeline viewer

The answer may be different for floating-point and integer operations