blueeyes presentation
-
8/7/2019 BLUEEYES presentation
1/30
VLIW ARCHITECTURE
-
Increasing Processor Performance
Semiconductor Technology
Parallel Processing
Multiprocessors, Multicomputers
Parallelism within the Processor
Pipelining
ILP
-
ILP (Instruction Level Parallelism)
Parallel execution of instructions.
Overlapping of instructions
ILP processors
Superscalar processors
VLIW processors.
-
Execution in a Scalar Processor
Fetch
Decode
Execute
Write Back
-
Superscalar processors
Hardware decides which operations to issue.
More than one instruction is issued at a time.
Dynamic scheduling.
-
Basic Superscalar Approach
[Block diagram: instruction cache; instruction buffers, decoders, dispatcher; execution units #1 to #4; reorder buffer; register file; data cache]
-
Execution in Superscalar
Fetch
Decode
Execute
Write Back
With degree 4
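The benefit of issuing four instructions per cycle can be estimated with a small idealized model. This is only a sketch: it assumes the four stages listed above (fetch, decode, execute, write back), one cycle per stage, and no hazards or stalls. The function names are hypothetical.

```python
import math

def scalar_cycles(n_instr: int, stages: int = 4) -> int:
    """Cycles for an ideal scalar pipeline: fill it once, then finish one instruction per cycle."""
    if n_instr <= 0:
        return 0
    return stages + n_instr - 1

def superscalar_cycles(n_instr: int, degree: int, stages: int = 4) -> int:
    """Cycles when up to `degree` independent instructions enter the pipeline each cycle."""
    if n_instr <= 0:
        return 0
    groups = math.ceil(n_instr / degree)  # number of issue groups
    return stages + groups - 1
```

Under these assumptions, 12 independent instructions take 15 cycles on the scalar pipeline and 6 cycles with degree 4. Real code rarely reaches this bound because of dependences between instructions.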
-
Disadvantages of Superscalar
Complexity of hardware.
Window size is constrained, which limits the capacity to detect independent instructions.
More power consumption.
-
VLIW
Very Long Instruction Word.
Instructions are hundreds of bits in length.
Uses a long instruction called a MultiOp.
Multiple functional units are concurrently used
Functional units share a common register file.
Code compaction by compiler.
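A MultiOp whose operations run concurrently against one shared register file can be modelled as follows. This is an illustrative sketch, not taken from the slides: the tuple layout `(opcode, dest, src1, src2)`, the `OPS` table and `execute_multiop` are hypothetical, and the rule that every operation reads its sources before any result is written is a modelling assumption.

```python
import operator

# Hypothetical operation set for the sketch.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def execute_multiop(regs, multiop):
    """Execute every operation of one MultiOp against a shared register file.

    All source registers are read before any destination is written,
    so the operations behave as if they ran at the same time.
    """
    results = []
    for opcode, dest, src1, src2 in multiop:
        if opcode == "nop":
            continue  # an empty slot does nothing
        results.append((dest, OPS[opcode](regs[src1], regs[src2])))
    new_regs = list(regs)
    for dest, value in results:
        new_regs[dest] = value
    return new_regs
```

With `regs = [0, 5, 7, 0]`, the MultiOp `[("add", 1, 2, 0), ("add", 2, 1, 0)]` swaps registers 1 and 2, which a sequential execution of the same two operations would not do.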
-
A Brief History
Joseph Fisher, trace scheduling, 1979
He coined the acronym VLIW.
In 1984, two companies were started
Multiflow, started by Joseph Fisher
Cydrome, founded by Bob Rau.
-
In 1987, Cydrome delivered the first machine, the 256-bit Cydra 5.
Multiflow delivered:
Trace/200 - 1987
Trace/300 - 1988
Trace/500 - 1990
-
Cydrome closed in 1988.
Multiflow closed in 1990.
Since then, VLIW machines have seen a revival and some degree of success.
-
Basic VLIW Approach
[Block diagram: instruction cache; instruction register; execution units #1 to #4; register file; data cache]
-
Case Studies
Defoe.
Intel Itanium Processor.
Transmeta Crusoe Processor.
-
Defoe Architecture
[Block diagram: score board & fetch; dispersal unit; functional units: 2 simple integer, 1 complex integer, 2 load/store, 1 branch/cmp; 64-entry register file; 16x pred; D-cache; paths to and from L2 cache]
-
VLIW Abhilash.P.K. 19
Instruction Encoding
64-bit compressed VLIW architecture.
Uses variable-length MultiOps.
Individual operations are encoded as 32-bit words.
A special stop bit marks the end of an instruction word.
Operation format (32 bits):
Field       Width
Stop bit    1 bit
Predicate   4 bits
OPCODE      9 bits
RDEST       6 bits
RSRC 1      6 bits
RSRC 2      6 bits
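The field widths above add up to 32 bits (1 + 4 + 9 + 6 + 6 + 6), so one operation fits in one word. The sketch below packs and unpacks such a word and splits a stream of words into MultiOps at the stop bit. It assumes the fields are laid out from the most significant bit in the order listed, which the slide does not state; the names `FIELDS`, `encode`, `decode` and `split_multiops` are hypothetical.

```python
# (name, width in bits), assumed to run from most to least significant bit.
FIELDS = [("stop", 1), ("pred", 4), ("opcode", 9),
          ("rdest", 6), ("rsrc1", 6), ("rsrc2", 6)]

def encode(**values) -> int:
    """Pack the six fields into one 32-bit operation word."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode(word: int) -> dict:
    """Unpack a 32-bit operation word into its fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

def split_multiops(words):
    """Group operation words into MultiOps; a set stop bit ends the current MultiOp."""
    multiops, current = [], []
    for word in words:
        current.append(word)
        if decode(word)["stop"]:
            multiops.append(current)
            current = []
    if current:  # trailing operations with no stop bit are kept as a last group
        multiops.append(current)
    return multiops
```

Because a MultiOp ends wherever the stop bit is set, a MultiOp with only one or two useful operations occupies only one or two words rather than a full-width instruction.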
-
Intel Itanium Processor
Intel's first implementation of IA-64.
IA-64 is an ISA for the EPIC (Explicitly Parallel Instruction Computing) style of VLIW, developed jointly by Intel and HP.
-
64-bit processor, with
4 integer units
4 multimedia units
2 load/store units
2 extended-precision floating point units
2 single-precision floating point units
-
Transmeta Crusoe Processor
Designed to reduce power consumption.
Dynamic scheduling consumes more power.
VLIW replaces the complex ways of gaining ILP
with simpler and more power efficient ways.
-
Instruction Format
Instructions are either 64 or 128 bits long.
Instructions are called molecules; each molecule packs multiple operations, called atoms.
64 GPRs.
-
Compiler Support
Instruction scheduling algorithms are critical.
Three important scheduling algorithms
Trace scheduling
Trace scheduling-2
Superblock scheduling
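To show what an instruction scheduler does for a VLIW machine, here is a minimal greedy list scheduler for a single basic block. It is a simplified stand-in, not trace or superblock scheduling, both of which also move operations across branches. It assumes every operation takes one cycle and any slot can run any operation; `list_schedule` and its arguments are hypothetical.

```python
def list_schedule(ops, deps, width):
    """Pack operations into MultiOps of `width` slots, respecting dependences.

    ops:   operation names in program order
    deps:  maps an operation to the set of operations it depends on
    width: number of issue slots per MultiOp
    """
    scheduled = set()
    remaining = list(ops)
    schedule = []
    while remaining:
        # An operation is ready once everything it depends on ran in an earlier cycle.
        ready = [op for op in remaining
                 if all(d in scheduled for d in deps.get(op, ()))]
        if not ready:
            raise ValueError("unsatisfiable or cyclic dependences")
        issued = ready[:width]
        for op in issued:
            remaining.remove(op)
        scheduled.update(issued)
        # Slots the scheduler cannot fill are padded with explicit NOPs.
        schedule.append(issued + ["nop"] * (width - len(issued)))
    return schedule
```

For operations `a, b, c, d, e` where `c` needs `a` and `b`, `d` needs `c`, and `e` needs `a`, a 2-wide machine gets the schedule `[a, b]`, `[c, e]`, `[d, nop]`: five operations in three cycles, with one slot wasted.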
-
Advantages
Less hardware complexity.
Static scheduling.
Much more hardware can be devoted to useful computation.
Software has a larger window to look at.
Can find more ILP.
-
Shortcomings
Wasteful encoding with NOPs.
Hard to maintain code compatibility between generations.
Increased program size.
Compiler has to explicitly add NOPs.
New versions of the architecture can force major rewriting of the compiler.
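The cost of NOP padding can be measured directly on a schedule. The sketch below assumes a schedule is a list of MultiOps, each a list of slot names with `"nop"` marking an empty slot, and that every slot costs one word. It compares a fixed-width encoding with a stop-bit style encoding like the one on the Instruction Encoding slide, where NOPs are not stored. Both function names are hypothetical.

```python
def nop_fraction(schedule):
    """Fraction of issue slots that hold a NOP."""
    slots = sum(len(multiop) for multiop in schedule)
    nops = sum(op == "nop" for multiop in schedule for op in multiop)
    return nops / slots if slots else 0.0

def encoded_words(schedule):
    """Return (fixed_width_words, compressed_words) for a schedule.

    Fixed width stores every slot, NOPs included. The compressed form stores
    only real operations, but at least one word per MultiOp so that an
    all-NOP MultiOp still occupies a cycle.
    """
    fixed = sum(len(multiop) for multiop in schedule)
    compressed = sum(max(1, sum(op != "nop" for op in multiop))
                     for multiop in schedule)
    return fixed, compressed
```

For a 4-wide schedule such as `[["add", "nop", "ld", "nop"], ["mul", "nop", "nop", "nop"], ["st", "sub", "nop", "br"]]`, half of the slots are NOPs, and the program shrinks from 12 words to 6 when NOPs are not stored.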
-
Future of VLIW
Newer processors are mainly used for
Stream and image processing, e.g. Philips TriMedia.
Digital signal processing, e.g. TMS320C62x from Texas Instruments.
Mobile computing, e.g. Transmeta Crusoe.
High-end server applications, e.g. Intel Itanium.
-
Stream and media processing lend themselves to the VLIW style, with large amounts of ILP.
Superscalars will be forced to use simpler
structures and seek help from software.
-