StaticILP.12/12/02
Static ILP Static (Compiler Based) Scheduling
• Notes from UW-Madison
• Read Chapter 4 of the textbook, and
• the paper on Itanium on the course web page
Today’s Theme and Contents
• Let the compiler uncover the ILP
– Objective: more ILP / simpler hardware / faster clock / less power
• How:
– Static scheduling
– Loop unrolling
– Software pipelining
– Static multiple issue: VLIW
» local and global scheduling
» static branch prediction
» software speculation: trace scheduling, superblocks
» NOPs, lockstep
» conditional moves, predication
» speculative loads
• IA-64 and Itanium
Basic Idea
• The compiler moves dependent instructions apart to avoid hazards
• This means:
– there are independent instructions to place in between (if not, employ transformations to expose them)
– the compiler knows implementation details
» latency AND superscalarity (issue width)
• What happens if the implementation changes?
• Static ILP is applicable to both statically and dynamically scheduled processors
• Statically scheduled processors: the compiler dictates which instructions can execute together (scheduling done in software)
(Local Scheduling)
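As a rough illustration of local scheduling, the sketch below schedules one basic block for a single-issue machine in two ways: strictly in program order, and with a greedy list scheduler that fills stall slots with independent instructions. The instruction names, the latencies, and both helper functions are made up for this example; a real compiler would also model issue width and functional units.

```python
def in_order(instrs, deps, latency):
    """Issue in program order, inserting a nop for every stall cycle."""
    ready_at, slots = {}, []
    for name in instrs:
        # Wait until every producer's result is available.
        while any(ready_at[p] > len(slots) for p in deps.get(name, [])):
            slots.append("nop")
        ready_at[name] = len(slots) + latency[name]
        slots.append(name)
    return slots


def schedule(instrs, deps, latency):
    """Greedy list scheduling: each cycle, issue the first instruction
    (in program order) whose producers have all delivered their results."""
    ready_at = {}              # name -> first cycle its result can be used
    remaining = list(instrs)
    slots = []
    cycle = 0
    while remaining:
        issued = None
        for name in remaining:
            preds = deps.get(name, [])
            if all(p in ready_at and ready_at[p] <= cycle for p in preds):
                issued = name
                break
        if issued is None:
            slots.append("nop")            # nothing is ready: stall
        else:
            slots.append(issued)
            ready_at[issued] = cycle + latency[issued]
            remaining.remove(issued)
        cycle += 1
    return slots


# Hypothetical block: two loads, each feeding one add.
instrs = ["ld1", "add1", "ld2", "add2"]
deps = {"add1": ["ld1"], "add2": ["ld2"]}
latency = {"ld1": 2, "ld2": 2, "add1": 1, "add2": 1}

print(in_order(instrs, deps, latency))   # 6 slots, 2 of them nops
print(schedule(instrs, deps, latency))   # 4 slots, no nops
```

The scheduler only works because it is told the load latency, which is the "compiler knows implementation details" requirement from the Basic Idea slide.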
(useful for large iteration counts)
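Loop unrolling, listed in the contents, pays off mainly when the trip count is large compared with the unroll factor. A minimal sketch, assuming an unroll factor of 4 and illustrative function names; the cleanup loop handles the leftover iterations:

```python
def add_scalar(x, s):
    """Original loop: one element and one loop test per iteration."""
    out = list(x)
    for i in range(len(out)):
        out[i] = out[i] + s
    return out


def add_scalar_unrolled(x, s):
    """Same loop unrolled by 4 (assumed factor), plus a cleanup loop."""
    out = list(x)
    n = len(out)
    main = n - n % 4                  # largest multiple of 4 not above n
    for i in range(0, main, 4):
        # Four independent adds that a scheduler can interleave;
        # only one loop test and index update per four elements.
        out[i] = out[i] + s
        out[i + 1] = out[i + 1] + s
        out[i + 2] = out[i + 2] + s
        out[i + 3] = out[i + 3] + s
    for i in range(main, n):          # at most 3 leftover iterations
        out[i] = out[i] + s
    return out
```

Both versions return the same list for any input length; the unrolled body simply gives the scheduler more independent work per branch.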
Software speculation/Global Scheduling
HOW??
Static prediction: profile, frequency, path
Which is better: the above or dynamic prediction?
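To make the profile-based option concrete, here is a minimal sketch that fixes one prediction per branch from a profiling run and then measures how often that fixed prediction is right on a different run. The branch names and outcome traces are invented for illustration.

```python
from collections import Counter


def train(profile):
    """profile: list of (branch_id, taken) pairs from a profiling run.
    Returns dict branch_id -> predicted direction (True = taken)."""
    taken = Counter()
    total = Counter()
    for branch, was_taken in profile:
        total[branch] += 1
        taken[branch] += 1 if was_taken else 0
    # Predict the majority direction seen in the profile.
    return {b: taken[b] * 2 >= total[b] for b in total}


def accuracy(prediction, trace, default=False):
    """Fraction of branches in trace predicted correctly.
    Branches missing from the profile fall back to `default`."""
    hits = sum(prediction.get(b, default) == t for b, t in trace)
    return hits / len(trace)


# Hypothetical profiling run: a loop branch, mostly taken, and an error check, never taken.
profile = [("loop", True)] * 9 + [("loop", False)] + [("err", False)] * 5
prediction = train(profile)

# A later run whose behavior differs slightly from the profile.
trace = [("loop", True)] * 3 + [("loop", False)] + [("err", False)] * 2 + [("err", True)]
print(prediction, accuracy(prediction, trace))
```

The prediction is frozen at compile time, so it cannot adapt when a later run behaves differently from the profile, which is the comparison with dynamic prediction the slide asks about.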
Register pressure
Superblocking: overcomes some of the complexities of trace scheduling
Single entry (superblock) vs. multiple entries (trace)
Does not have
Pentium 4 at 3+ GHz vs. Itanium at 1 GHz
Lockstep: any hazard stalls all units / NOPs fill the slots when there is not enough parallelism
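A small sketch of where the NOPs come from: operations are packed in program order into fixed-width instruction words, and a word is closed as soon as the next operation depends on something already in it. The width of 3, the operation format, and the register names are assumptions for the example, and operation latencies are ignored.

```python
def pack(ops, width=3):
    """ops: list of (name, dest, sources) in program order.
    Returns a list of words, each holding exactly `width` names,
    padded with "nop" where no independent operation was available."""
    words = []
    current, written = [], set()
    for name, dest, sources in ops:
        # Conservative rule: an op may join the current word only if it
        # neither reads nor rewrites a register written in that word.
        conflict = any(s in written for s in sources) or dest in written
        if conflict or len(current) == width:
            words.append(current + ["nop"] * (width - len(current)))
            current, written = [], set()
        current.append(name)
        written.add(dest)
    if current:
        words.append(current + ["nop"] * (width - len(current)))
    return words


# Hypothetical straight-line code with two short dependence chains.
ops = [
    ("ld", "r1", ["r9"]),
    ("add", "r2", ["r1"]),   # needs the ld result
    ("mul", "r3", ["r8"]),
    ("sub", "r4", ["r3"]),   # needs the mul result
]
for word in pack(ops):
    print(word)
```

In this example 5 of the 9 slots end up as NOPs, which is the code-size cost of not finding enough parallelism.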
Predicated Execution & Conditional Moves
Convert control dependences to data dependences
if (a == 0) s = t;    (a in R1, s in R2, t in R3)
With a branch:
bnez R1, L
addu R2, R3, R0
L:
With a conditional move:
cmovz R2, R3, R1
Applying the above to all instruction types is called predication…
+/-?
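The branch and conditional-move sequences can be mimicked in a few lines. This is only an emulation of the semantics: registers are plain integers, and `cmovz` is a hypothetical helper named after the instruction. It is meant to show that the branch disappears and the move becomes a data dependence on R1.

```python
def with_branch(r1, r2, r3):
    """Branch version: skip the move when R1 is non-zero. Returns the new R2."""
    if r1 != 0:          # bnez R1, L  (control dependence)
        return r2        # branch taken: R2 unchanged
    return r3 + 0        # addu: R2 <- R3 + 0


def cmovz(dest, src, cond):
    """Conditional move: src if cond == 0, otherwise the old dest value."""
    is_zero = int(cond == 0)                      # 1 or 0
    return is_zero * src + (1 - is_zero) * dest   # no branch on the way


def with_cmov(r1, r2, r3):
    """Conditional-move version: cmovz R2, R3, R1. Returns the new R2."""
    return cmovz(r2, r3, r1)


for r1 in (0, 1, -5):
    print(r1, with_branch(r1, 7, 42), with_cmov(r1, 7, 42))
```

On the cost side, the conditional move always occupies an issue slot and must wait for R1, even when the move ends up not changing R2.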
Speculative Loads
Move loads above earlier stores speculatively; run repair code in case of misspeculation
Use an address buffer:
1. Lookup table: updated with the address of each speculative load
2. Updated with the addresses of intervening stores
3. A check instruction verifies that no store conflicted and releases the entry
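The three steps can be sketched with a toy address table. The class and method names are hypothetical, and dropping an entry on a matching store is one assumed way to record the conflict; real hardware (for example the ALAT on IA-64) is organized differently. The point is only the protocol: record on the speculative load, note conflicting stores, check before the value is used.

```python
class AddressBuffer:
    def __init__(self):
        self.entries = {}          # destination register -> load address

    def speculative_load(self, memory, reg, addr):
        """Step 1: load early and remember which address was read."""
        self.entries[reg] = addr
        return memory[addr]

    def store(self, memory, addr, value):
        """Step 2: a store drops every entry with a matching address."""
        memory[addr] = value
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check(self, reg):
        """Step 3: True if no store conflicted; the entry is released either way."""
        return self.entries.pop(reg, None) is not None


def run(store_addr):
    """Hoist a load of address 100 above a store to `store_addr`."""
    memory = {100: 1, 200: 2}
    buf = AddressBuffer()
    r1 = buf.speculative_load(memory, "r1", 100)   # moved above the store
    buf.store(memory, store_addr, 99)              # intervening store
    if not buf.check("r1"):
        r1 = memory[100]                           # repair code: redo the load
    return r1


print(run(200))   # store does not alias: the speculative value 1 is kept
print(run(100))   # store aliases: repair code reloads and gets 99
```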
Let the compiler do the work
• All
• Most of it
• As long as it improves performance
• …
by Harsh Sharangpani and Ken Arora
see web page
Idea
The compiler has a larger instruction window than the hardware.
Communicate to the hardware more of the information gleaned at compile time.
Six instructions wide and ten stages deep
Tries to minimize the latency of the most frequent operations
Hardware support for compile-time indeterminacies
Software-initiated prefetch (requests filtered by the instruction cache)
Prefetch must be issued 12 cycles before the branch to hide latency
L2 -> streaming buffer -> instruction cache
Four-level branch predictor hierarchy to prevent a 9-cycle pipeline stall
Decoupling buffer holds up to 8 bundles of code (bundle?)
Conclusion/Future
• The compiler can do a lot of the work but needs hardware assistance
• Currently in pursuit of the best of both worlds
• Future:
– How long will IA-32 last, and will IA-64 take over the IA-32 market?
– Will IA-64 be the only ISA in the world?