Post on 20-Dec-2015
Understanding the TigerSHARC ALU pipeline
Determining the speed of one stage of IIR filter
Speed IIR -- stage 1, M. Smith, ECE, University of Calgary, Canada
04/18/23
Understanding the TigerSHARC ALU pipeline
TigerSHARC has many pipelines
If these pipelines stall, then the processor speed goes down
Need to understand how the ALU pipeline works
Learn to use the pipeline viewer
The answer may be different for floating point and integer operations
Register File and COMPUTE Units
Simple Example: IIR -- Biquad
For (Stages = 0 to 3) Do
    S0 = Xin * H5 + S2 * H3 + S1 * H4
    Yout = S0 * H0 + S1 * H1 + S2 * H2
    S2 = S1
    S1 = S0
[Figure labels: S0, S1, S2]
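The loop body above can be sketched in C++ as a reference version to test the assembly code against. This is a minimal sketch, not the code from the slides: the function name `biquadStage`, the coefficient array `h` (holding H0 to H5) and the use of references for the S1 and S2 state values are all assumptions made for illustration.

```cpp
// Hypothetical reference model of ONE biquad stage, following the slide's
// equations term by term. h[0]..h[5] stand for H0..H5; s1 and s2 are the
// delayed state values S1 and S2.
float biquadStage(float xin, const float h[6], float& s1, float& s2) {
    // S0 = Xin * H5 + S2 * H3 + S1 * H4
    const float s0 = xin * h[5] + s2 * h[3] + s1 * h[4];
    // Yout = S0 * H0 + S1 * H1 + S2 * H2
    const float yout = s0 * h[0] + s1 * h[1] + s2 * h[2];
    // Shift the delay line: S2 = S1, then S1 = S0
    s2 = s1;
    s1 = s0;
    return yout;
}
```

Calling this with known inputs gives values that the assembly version should reproduce as the code is changed.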
Set up the tests. We want to make sure we still get the correct answer as the code changes
Step 1 – Stub plus return value
Build an assembly language stub for float iirASM(void);
Make it return a floating point value of 40.5 to show that we can return a value
J8 is an INTEGER register, so how can we return 40.5?
ANSWER – WE DON’T. We return the “bit pattern” for 40.5, which is an “INTEGER”
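To see what "returning the bit pattern" means, the sketch below reinterprets a float as a 32-bit integer on the host. It assumes a 32-bit IEEE 754 `float`; the helper names `floatBits` and `bitsToFloat` are hypothetical and are not part of the slides.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Copy the raw bits of a float into an integer (no numeric conversion).
std::uint32_t floatBits(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Copy an integer bit pattern back into a float (no numeric conversion).
float bitsToFloat(std::uint32_t bits) {
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Under that assumption `floatBits(40.5f)` is `0x42220000`: sign 0, biased exponent 132, mantissa bits for 1.265625. That integer is what an integer register has to carry for the caller to see 40.5.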
Code does not work when passing back floats with the J8 register
Code does work when using the XR8 register – NOTE: NOT XFR8
Step 2 – Using C++ code as comments, set up the coefficients
XFR0 = 0.0;;    Does not exist
XR0 = 0.0;;     DOES EXIST
Bit patterns require integer registers
Leave what you wanted to do behind as comments
Modify C++ code so that it can be translated into assembly code
Can only have 1 instruction per line
Code must execute sequentially, so remember the ;;
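As an illustration of this rewriting step, the biquad equations can be broken down in C++ so that every line does one operation, which maps more directly onto one assembly instruction per line. This is a sketch under assumptions: the function name `biquadStepwise`, the `temp` variable and the array `h` (holding H0 to H5) are hypothetical and are not taken from the slides.

```cpp
// Hypothetical one-operation-per-line form of the biquad stage.
// Each statement is meant to correspond to one assembly instruction line.
float biquadStepwise(float xin, const float h[6], float& s1, float& s2) {
    float s0;
    float temp;
    float yout;

    // S0 = Xin * H5 + S2 * H3 + S1 * H4
    s0 = xin * h[5];
    temp = s2 * h[3];
    s0 = s0 + temp;
    temp = s1 * h[4];
    s0 = s0 + temp;

    // Yout = S0 * H0 + S1 * H1 + S2 * H2
    yout = s0 * h[0];
    temp = s1 * h[1];
    yout = yout + temp;
    temp = s2 * h[2];
    yout = yout + temp;

    // S2 = S1; S1 = S0
    s2 = s1;
    s1 = s0;
    return yout;
}
```

Written this way, each line uses the result of the line before it. That is the sequential behaviour the `;;` terminator is there to enforce in the assembly version.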
Start with the S0 = Xin instruction
Can’t use XFR8 = XFR6 to copy a register
Since XFR8 = XFR6 is not allowed, try XR8 = R6;
SIMD: Single Instruction Multiple Data
R6 means move XR6 and YR6 (multiple data move described in 1 instruction)
Try XR8 = XR6
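The difference between the two forms can be pictured with a small host-side model of the X and Y compute block register files. This is only a sketch: the struct `ComputeBlocks` and the helpers `copyX` and `copyBoth` are hypothetical names, and the model ignores everything about the hardware except which registers get written.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model: one 32-entry register file per compute block.
struct ComputeBlocks {
    std::array<std::uint32_t, 32> xr{};  // XR0..XR31
    std::array<std::uint32_t, 32> yr{};  // YR0..YR31
};

// Models "XR8 = XR6": only the X compute block is touched (SISD).
void copyX(ComputeBlocks& cb, int dst, int src) {
    cb.xr[dst] = cb.xr[src];
}

// Models "R8 = R6": the same move happens in both X and Y (SIMD).
void copyBoth(ComputeBlocks& cb, int dst, int src) {
    cb.xr[dst] = cb.xr[src];
    cb.yr[dst] = cb.yr[src];
}
```

In this model `copyBoth` also overwrites YR8 with whatever happens to be in YR6. That mirrors the remark that the SIMD form does something on the Y side whether it is wanted or not.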
Some operations are FLOAT operations and must have XFR on the left side of the equation BUT only R on the right
Some operations are SISD operations and must have XR on both sides of the equation (or just R on both sides of the equation, making them SIMD X and Y with garbage happening on Y)
Personally, I think all these problems are “assembler” issues and could be made consistent
Disconnect from target and go to simulator
Activate Simulator
Rebuild the project and set breakpoints at start and end of ASM code
Activate the pipeline viewer
Adjust the pipeline window so you can see all the instruction pipeline stages
PIPELINE STAGES
See page 8-34 of Processor manual
Instruction fetch -- F1, F2, F3 and F4
Fetch Unit Pipe – memory driven
128 bits fetched – may make up 1, 2, 3, or 4 instructions (or parts of a couple of instructions)
Instructions go into the IAB, the instruction alignment buffer
Integer ALU pipe – PD, D, I and A
PIPELINE STAGES
See page 8-34 of Processor manual
10 pipeline stages, but they may be completely desynchronized (happen semi-independently)
Instruction fetch -- F1, F2, F3 and F4
Integer ALU – PreDecode, Decode, Integer, Access
Compute Block – EX1 and EX2
PIPELINE STAGES
See page 8-34 of Processor manual
Instruction fetch -- F1, F2, F3 and F4
Fetch Unit Pipe
Memory driven, not instruction driven
128 bits fetched – may make up 1, 2, 3, or 4 instruction lines (or parts of a couple of instruction lines)
Instructions fetched into the IAB, the instruction alignment buffer
PIPELINE STAGES
See page 8-34 of Processor manual
Integer ALU pipe – PD, D, I and A
PreDecode – the next COMPLETE instruction line (1, 2, 3 or 4 instructions) fetched from IAB
Decode – different instructions dispatched to different execution units (J-IALU, K-IALU, Compute Blocks)
Data memory access starts in the Integer stage
A stands for Access stage
Results are not available until the EX2 stage, but (by register forwarding) can sometimes be accessed earlier
PIPELINE STAGES
See page 8-34 of Processor manual
Compute Block – EX1 and EX2
Result is always written to the target register on the rising edge of CCLK after stage EX2
The following is guaranteed:
R2 = R0 + R1; R6 = R2 * R3;;
R2 = R0 + R1 – R2 is written at the end of the instruction
R6 = R2 * R3 – the R2 value at the beginning of the instruction is used
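The guarantee for `R2 = R0 + R1; R6 = R2 * R3;;` can be mimicked on the host by reading all source registers from a snapshot taken at the start of the instruction line. This is a sketch of that reading of the slide, not a model of the real pipeline: the `Regs` type (an 8-entry integer register file) and the function names are hypothetical.

```cpp
#include <array>

// Hypothetical tiny register file holding R0..R7.
using Regs = std::array<int, 8>;

// Models "R2 = R0 + R1; R6 = R2 * R3;;" with both instructions on ONE line:
// every source is read from the values held at the start of the line.
Regs sameLine(Regs r) {
    const Regs old = r;        // register values at the beginning of the line
    r[2] = old[0] + old[1];    // new R2, written at the end of the line
    r[6] = old[2] * old[3];    // uses the OLD R2, not the sum just computed
    return r;
}

// Models "R2 = R0 + R1;; R6 = R2 * R3;;" as TWO lines:
// the second line sees the R2 written by the first.
Regs twoLines(Regs r) {
    r[2] = r[0] + r[1];
    r[6] = r[2] * r[3];
    return r;
}
```

With R0 = 1, R1 = 2, R2 = 10 and R3 = 3, `sameLine` leaves R6 = 30 (old R2 times R3) while `twoLines` leaves R6 = 9. This is why the placement of `;;` changes the answer as well as the speed.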
Only interested in later stages of the pipeline. Adjust properties
Run the code till the first ASM breakpoint: note cycle number 39830
Then run again till you reach the second ASM breakpoint
Calculate execution time
Pipeline viewer says 26 cycles, but what do we expect?
8 cycles
Pipeline viewer says 26 cycles, but what do we expect -- 21
13 cycles expected
Where are the extra cycles coming from,
and how easy is it to code in such a way that the extra cycles can be removed?
ANSWER: Fairly straightforward in idea, can be difficult in practice
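The bookkeeping behind these numbers is simple subtraction, sketched below. The helper names are hypothetical. The 39830, 26 and 21 figures come from the slides, while the second breakpoint value 39856 is an assumption, implied by 39830 + 26 and not read from the slides.

```cpp
#include <cstdint>

// Cycles spent between two breakpoints, from the cycle counter values
// noted at each one.
std::uint64_t elapsedCycles(std::uint64_t startCycle, std::uint64_t endCycle) {
    return endCycle - startCycle;
}

// Extra (stall) cycles: what was measured minus what was expected.
std::int64_t extraCycles(std::uint64_t measured, std::uint64_t expected) {
    return static_cast<std::int64_t>(measured) -
           static_cast<std::int64_t>(expected);
}
```

With the assumed end value, `elapsedCycles(39830, 39856)` gives the 26 cycles the pipeline viewer reports, and `extraCycles(26, 21)` gives 5 cycles to account for.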
Understanding the TigerSHARC ALU pipeline
TigerSHARC has many pipelines
If these pipelines stall, then the processor speed goes down
Need to understand how the ALU pipeline works
Learn to use the pipeline viewer
The answer may be different for floating point and integer operations