hpps 2008 - panzica motta bonanno
ISA customization
Andrea Tommaso Bonanno, Valerio Panzica La Manna, Alfredo Motta
Prof: Donatella Sciuto
HPPS Project, 19-6-2008
Context
Customized processor: a basic processor is defined and then extended with units specialized for each particular application (i.e. differently customized versions of the processor for each product)
Instruction set extension: grouping dataflow independent operations in the application software as potential new complex instructions. The resulting set of instructions is actually used for code generation and execution
Customizable processors design
Very common practice in system-on-chip design
Growing demand for application-specific embedded processors
Increased performance in specific domains
Decreased costs (relative to advanced RISC processors)
Decreased complexity (relative to entirely customized instruction sets)
Issues
Goal: automatically find the best-performing ISEs for an application, independently of the technology adopted
Problem: there is no feasible exact solution if we want to generate more than one instruction, but several algorithms exist to find the single best-performing ISE in a graph (ILP, pruned subgraph exploration)
Graph Exploration
Dependence graph: represents the data flow of a basic block of the program assembly code
Nodes: instructions of the basic block whose registers are involved in dependences, plus input and output nodes
Edges: show dependences between instructions (RAW dependences)
Graph Exploration (example)
  addik r1, r1, -48
  swi r15, r1, 0
  swi r19, r1, 40
  swi r22, r1, 44
  addk r19, r1, r0
  addik r3, r0, 1
  swi r3, r19, 28
  lwi r3, r19, 28
  addik r18, r0, 9
  cmp r18, r3, r18
  blti r18, 272
ISE Identification Algorithms
Critical point of most algorithms: the problem space is exponential in the size of the graph
New practical algorithm (Bonzini, Pozzi, USI): focuses on convex subgraphs and multiple-vertex dominators; its complexity is polynomial in the size of the graph
Main approaches: based on the dominance relation; enumeration of valid subgraphs
Our Work
Simulated code generation by means of the synthesized MicroBlaze processor
Xilinx EDK
Parsing of the assembly code to extract all the dependence graphs and represent them in a structured way
C++ (Boost libraries)
Dot notation
Our Work
Implementation of the incremental version of the novel algorithm for the identification of the best performing ISE
Python
Graphviz
Tool 1: Goal
Building a graph of all the true data dependencies of a basic block.
Flow: C code → assembly generation → MicroBlaze assembly → PARSER
C CODE
The example code implements the following functions:
Quarter-Wave Sine Transform
Discrete Sine Transform
Discrete Cosine Transform
Assembly Generator
C code → Xilinx XPS / Xilinx EDK (simulation of the MicroBlaze processor) → MicroBlaze assembly
MicroBlaze Assembly
In order to find the data dependencies in the assembly code, the MicroBlaze ISA is divided into three subclasses:
1. Write&Use instructions: ISTR_NAME RD, RA, RB (e.g. addik, lwi)
2. OnlyUse instructions: ISTR_NAME RA, RB (offset) (e.g. swi, put)
3. Control instructions: ISTR_NAME RA, RB // target (e.g. bnei, blti)
Note that control instructions mark the end of a basic block
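As a rough illustration of the three subclasses, the sketch below maps a mnemonic to its subclass. It is not the project's C++ parser: the tables only contain the mnemonics that appear in these slides, and placing addk and cmp in Write&Use is an assumption based on their RD, RA, RB operand form. The names Subclass and classify are hypothetical.

```python
from enum import Enum


class Subclass(Enum):
    WRITE_USE = "Write&Use"
    ONLY_USE = "OnlyUse"
    CONTROL = "Control"


# Illustrative, incomplete tables: only mnemonics seen in the slides.
# addk and cmp are assumed to be Write&Use (they take RD, RA, RB).
WRITE_USE = {"addik", "addk", "lwi", "cmp"}
ONLY_USE = {"swi", "put"}
CONTROL = {"bnei", "blti"}


def classify(instruction: str) -> Subclass:
    """Return the subclass of one assembly line such as 'addik r3, r0, 1'."""
    mnemonic = instruction.split()[0].lower()
    if mnemonic in CONTROL:
        return Subclass.CONTROL
    if mnemonic in ONLY_USE:
        return Subclass.ONLY_USE
    if mnemonic in WRITE_USE:
        return Subclass.WRITE_USE
    raise ValueError(f"unknown mnemonic: {mnemonic}")


for line in ("addik r3, r0, 1", "swi r3, r19, 28", "blti r18, 676"):
    print(line, "->", classify(line).value)
```

Raising on an unknown mnemonic is a deliberate choice for a sketch: a silent default would hide gaps in the tables.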
Assembly code example
  2c: f8930028 swi r4, r19, 40
  30: f8b3002c swi r5, r19, 44
  34: 30600001 addik r3, r0, 1
  38: f873001c swi r3, r19, 28
  3c: e873001c lwi r3, r19, 28
  40: 32400064 addik r18, r0, 100
  44: 16439001 cmp r18, r3, r18
  48: bc5202a4 blti r18, 676 // 104
Legend: Write&Use Instr, OnlyUse Instr, Control Instr
PARSER: The Algorithm
3 main steps:
1. Clean the input file of unnecessary information
2. For each instruction i:
   Determine the instruction subclass S(i)
   If S(i) is Write&Use, find all the data dependences up to the end of the basic block (i.e. until S(i) == Control)
3. For each basic block b:
   Create a graph G(b) whose nodes represent the instructions in b and whose arcs represent the dependencies
   Convert G(b) into a dot file
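The steps above can be sketched in Python for a single, already cleaned basic block. This is only an illustration, not the project's C++/Boost implementation: it uses a single pass with a "last writer" table instead of the forward scan described in step 2, it tracks register RAW dependences only (no memory dependences and no input/output nodes), and the helper names raw_edges and to_dot are hypothetical.

```python
import re

# Mnemonics that only read registers (taken from the slides' examples).
ONLY_USE = {"swi", "put"}
CONTROL = {"bnei", "blti"}


def registers(operands):
    """Extract register names (r0, r1, ...) from an operand string."""
    return re.findall(r"\br\d+\b", operands)


def raw_edges(block):
    """Return RAW edges as (producer_index, consumer_index, register)."""
    last_writer = {}  # register -> index of the latest instruction writing it
    edges = []
    for i, line in enumerate(block):
        mnemonic, _, operands = line.strip().partition(" ")
        regs = registers(operands)
        if mnemonic in ONLY_USE or mnemonic in CONTROL:
            written, read = None, regs
        else:  # assumed Write&Use: the first register is the destination RD
            written, read = regs[0], regs[1:]
        for reg in dict.fromkeys(read):  # de-duplicate, keep operand order
            if reg in last_writer:
                edges.append((last_writer[reg], i, reg))
        if written is not None:
            last_writer[written] = i
    return edges


def to_dot(block, edges):
    """Render the block and its RAW edges as DOT text."""
    lines = ["digraph G {"]
    for i, instr in enumerate(block):
        lines.append(f'  n{i} [label="{instr}"];')
    for src, dst, reg in edges:
        lines.append(f'  n{src} -> n{dst} [label="{reg}"];')
    lines.append("}")
    return "\n".join(lines)


block = [
    "addik r3, r0, 1",
    "swi r3, r19, 28",
    "lwi r3, r19, 28",
    "addik r18, r0, 100",
    "cmp r18, r3, r18",
    "blti r18, 676",
]
edges = raw_edges(block)
print(to_dot(block, edges))
```

The printed text can be saved to a file and rendered with Graphviz. Registers that are read before any write in the block (such as r0 and r19 here) produce no edge in this sketch.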
PARSER: Implementation Info
The tool is written entirely in C++:
Good performance (assembly code with more than 50 basic blocks is processed in less than 260 ms)
Possibility to exploit the Boost Graph Library
Quadratic complexity of the algorithm
Future developments: create a multi-architecture parser that supports different ISAs
RESULT
Tool 2: Problem Statement
Input: graph in DOT format representing the dependencies between instructions
Algorithm key idea: search for instructions that can be grouped together to create custom instructions → convex cuts
Output: all the convex cuts of the input graph
Script: find_convex_cut.py
Definitions - Cut
Definition 1 (Cut): A cut S is a subgraph of a graph G.
Definition 2 (Convex cut): A cut S is convex if there is no path from a vertex u ∈ S to another vertex v ∈ S which contains a vertex w ∈ NOT(S).
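Definition 2 can be checked directly on a small graph. The sketch below assumes the graph is a dict mapping each node to a list of its successors and the cut is a set of nodes; the function name is_convex is hypothetical and is not taken from find_convex_cut.py.

```python
def is_convex(graph, cut):
    """True if no path leaves `cut` and later re-enters it."""
    cut = set(cut)
    # Nodes outside the cut that are direct successors of a cut node.
    frontier = [w for u in cut for w in graph.get(u, ()) if w not in cut]
    seen = set(frontier)
    while frontier:
        w = frontier.pop()
        for nxt in graph.get(w, ()):
            if nxt in cut:
                return False  # a path u -> ... -> w -> nxt re-enters the cut
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True


# Diamond-shaped graph: a -> b -> d and a -> c -> d.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(is_convex(graph, {"a", "b", "d"}))  # the path a -> c -> d passes through c
print(is_convex(graph, {"a", "b", "c"}))
```

The search only walks through nodes outside the cut, which is enough: any violating path must leave the cut at some edge and re-enter it at a later one.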
Definitions - Input&Output
Our goal is to find customizable instructions, so we must take into account the number of possible inputs and outputs.
Definition 3 (Input Set): Given a set S, we call inputs of S the set I(S) of predecessor vertices of those edges which enter the cut S from the rest of the graph G.
Definition 4 (Output Set): Given a set S, we call outputs of S the set O(S) of vertices which are part of S but have at least one successor v ∈ NOT(S).
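Definitions 3 and 4 translate almost literally into set comprehensions. The sketch below assumes the graph is a dict mapping each node to its list of successors, with every node present as a key; the function names inputs and outputs are hypothetical. Following Definition 4 as written, a cut node without any successor outside the cut is not counted as an output.

```python
def inputs(graph, cut):
    """I(S): nodes outside the cut with at least one edge entering the cut."""
    cut = set(cut)
    return {u for u, succs in graph.items()
            if u not in cut and any(v in cut for v in succs)}


def outputs(graph, cut):
    """O(S): nodes of the cut with at least one successor outside the cut."""
    cut = set(cut)
    return {u for u in cut if any(v not in cut for v in graph.get(u, ()))}


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
cut = {"b", "c", "d"}
print(sorted(inputs(graph, cut)), sorted(outputs(graph, cut)))
```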
User Parameters
Nin: the maximum number of register-file read ports that a custom instruction can use → the algorithm searches for all convex cuts that have at most Nin inputs.
Nout: the maximum number of register-file write ports that a custom instruction can use → the algorithm searches for all convex cuts that have at most Nout outputs.
Forbidden nodes: related to architectural constraints. They represent instructions that must not be included in any convex cut. Example: if the custom functional unit cannot have any memory port, loads and stores must be included in the forbidden nodes set.
Key Idea
Summary: Given a graph G, the problem is to find all convex cuts S ⊆ G under the constraints |I(S)| ≤ Nin, |O(S)| ≤ Nout, and S ∩ F = ∅, where F is the set of forbidden nodes.
Complexity problem: common algorithms are exponential in the size of the graph.
Solution: the numbers of inputs (Nin) and outputs (Nout) of a custom instruction are limited. Why not look for a relation between them and convex cuts?
Theorem
Any convex cut is uniquely identified by its sets of input and output vertices, respectively I(S) and O(S). In other words, two convex cuts of the same graph are equal iff they have the same inputs and outputs.
-> It is possible to enumerate convex cuts by coupling every possible set of outputs with all the possible sets of inputs. Moreover, if we put a constraint on the number of inputs and outputs, the number of valid convex cuts is clearly polynomial; more precisely, it is O(n^(Nin+Nout)), where n is the number of vertices of the graph.
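The bound can be made concrete by counting the candidate (input set, output set) pairs. Since each convex cut corresponds to exactly one such pair, the number of pairs with |I| ≤ Nin and |O| ≤ Nout is an upper bound on the number of valid cuts, and each of the two sums below grows as O(n^Nin) and O(n^Nout) respectively. The function name candidate_pairs is hypothetical, and the values Nin = 2, Nout = 1 are only an example.

```python
from math import comb


def candidate_pairs(n, n_in, n_out):
    """Number of (I, O) pairs with |I| <= n_in and |O| <= n_out among n vertices."""
    input_sets = sum(comb(n, i) for i in range(n_in + 1))
    output_sets = sum(comb(n, j) for j in range(n_out + 1))
    return input_sets * output_sets


# Compare the polynomial bound with the 2**n subsets of a naive enumeration.
for n in (10, 20, 40):
    print(n, candidate_pairs(n, 2, 1), 2 ** n)
```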
Compute the set
The algorithm searches for input and output sets. Obviously, not all sets are admissible: two properties are required.
1- Every element of the output set is not postdominated by any element
2- The input set is a common multiple-vertex dominator of the output set
So first we pick an admissible output value, then we pick an input value such that the input set is a common multiple-vertex dominator of the output set. Doing this recursively, we construct the input and output sets.
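The second property can be illustrated with a simple reachability test. The sketch below assumes the following working definition, which is not spelled out in the slides: a set D dominates a vertex v if every path from a source (a vertex without predecessors) to v passes through at least one vertex of D. Minimality of D, which stricter definitions of multiple-vertex dominators also require, is not checked. The graph is a dict mapping every node to its list of successors, and the function names are hypothetical.

```python
def dominates(graph, dom_set, target):
    """True if every path from a source to `target` crosses `dom_set`."""
    dom_set = set(dom_set)
    if target in dom_set:
        return True
    has_pred = {v for succs in graph.values() for v in succs}
    # Start from the sources that are not themselves in the dominator set.
    stack = [u for u in graph if u not in has_pred and u not in dom_set]
    seen = set(stack)
    while stack:
        u = stack.pop()
        if u == target:
            return False  # reached the target without crossing dom_set
        for v in graph.get(u, ()):
            if v not in dom_set and v not in seen:
                seen.add(v)
                stack.append(v)
    return True


def is_common_dominator(graph, dom_set, targets):
    """True if `dom_set` dominates every vertex in `targets`."""
    return all(dominates(graph, dom_set, t) for t in targets)


graph = {"a": ["c"], "b": ["c"], "c": ["d", "e"], "d": [], "e": []}
print(is_common_dominator(graph, {"a", "b"}, {"d", "e"}))
print(is_common_dominator(graph, {"a"}, {"d", "e"}))
```

This brute-force check costs one graph traversal per target; it is meant to make the property tangible, not to reproduce how the incremental algorithm computes dominators.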
Function Summary
The algorithm works recursively. The input and output sets start empty and then grow recursively. For every pair of input and output sets, we check whether the generated convex cut is valid (forbidden nodes).
Functions:
Poli-Enum-Incr(): Pick-Output(∅, ∅, ∅, Nin, Nout)
Pick-Output(I, O, S, Nin, Nout)
Pick-Inputs(I, o, O, S, Nin, Nout)
Check-Cut(I, O, S, Nin, Nout)
Final Result
Example of a convex cut for a particular block