soc system design approach

Introduction to the Systems Approach Mr. A. B. Shinde Assistant Professor, Electronics Engineering, PVPIT, Budhgaon, Sangli [email protected]

Upload: ambadas-shinde

Post on 05-Apr-2017


TRANSCRIPT

Page 1: SOC System Design Approach

Introduction to the Systems Approach

Mr. A. B. Shinde

Assistant Professor,

Electronics Engineering,

PVPIT, Budhgaon, Sangli

[email protected]

Page 2: SOC System Design Approach

System Architecture

• The term architecture denotes:

– The operational structure and

– The user’s view of the system.

• Over time, it has evolved to include

– Functional specification and

– Hardware implementation.

• The system architecture defines the system-level building blocks, such as processors and memories, and the interconnections between them.

Page 3: SOC System Design Approach

System Architecture: An Overview

Figure: The increasing transistor density on a silicon die

Figure: The decrease of transistor cost over the years

Page 4: SOC System Design Approach

Components of the System

• Processors (core, multimedia & coprocessors)

• Memories (storing elements)

• Interconnects


A basic SOC system model

Page 5: SOC System Design Approach

Hardware and Software

• A crucial decision in system design is:

1. Which components are to be implemented in hardware, and

2. Which components are to be implemented in software.

• A software implementation usually runs on a general-purpose processor (GPP), which interprets instructions at run time.

• This architecture offers flexibility and adaptability, and it provides a way of sharing resources among different applications.

• However, running a function as software on a hardware implementation of an ISA (instruction set architecture) is generally slower and consumes more power than implementing that function directly in hardware, because every instruction must be fetched and decoded.

Page 6: SOC System Design Approach

Hardware and Software

Most software developers use high-level languages and tools that enhance productivity, such as program development environments, optimizing compilers, and performance profilers.

Direct implementation of applications in hardware (ASICs) provides high performance at the expense of programmability, and hence of flexibility, productivity, and cost.

Implementation | Benefits | Drawbacks

Hardware | Fast, low power consumption | Inflexible, unadaptable, complex to build and test

Software | Flexible, adaptable, simple to build and test | Slow, high power consumption

Page 7: SOC System Design Approach

Hardware and Software

Figure: Programmability versus performance (CGRA: coarse-grained reconfigurable architecture; ASIP: application-specific instruction processor)

Page 8: SOC System Design Approach

System Architecture: An Overview

• Custom ASIC hardware and software on GPPs can be seen as two extremes in the technology spectrum, with different trade-offs.

• An ASIP is a processor with an instruction set customized for a specific application or domain.

• An FPGA typically contains an array of computation units, memories, and their interconnections, all of which are programmable in the field by the user.

• Because of the growing demand for reduced time to market and the increasing cost of chip fabrication, FPGAs are becoming more popular for implementing digital designs.

• Coarse-Grained Reconfigurable Architecture (CGRA): It contains logic blocks that process byte-wide or multiple-byte-wide data, which can form the building blocks of datapaths.

• Structured ASIC: It allows application builders to customize the resources before fabrication, and it offers performance close to that of an ASIC.

• Digital Signal Processors (DSPs): The organization and instruction set of these devices are optimized for digital signal processing applications.

Page 9: SOC System Design Approach

Processor Architectures

• Processors are characterized either by their application or by their architecture (or structure).

Processors Identified by Function

Processors Identified by Architecture

Page 10: SOC System Design Approach

Processor Architectures

• Sequential processors execute one instruction at a time.

• Many processors can execute several instructions concurrently by means of pipelining, multiple execution units, and multiple cores.

• Pipelining is a powerful technique that overlaps the processing phases of successive instructions.

• Instruction-level parallelism (ILP) means that multiple operations can be executed in parallel within a program.

• ILP may be achieved with hardware, compiler, or operating system techniques, provided that the operations involved have no data dependency on one another.

• In general, a computer architecture consists of one or more interconnected processor elements (PEs) that operate concurrently, solving a single overall problem.

Page 11: SOC System Design Approach

Instruction - level parallelism (ILP)

• Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously.

• For example, consider the following program:

1. e = a + b

2. f = c + d

3. g = e * f

Operation 3 depends on the results of operations 1 and 2, so it cannot be calculated until both of them are completed. Operations 1 and 2 do not depend on any other operation, so they can be calculated simultaneously.

If each operation completes in one unit of time, then the three instructions can be completed in two units of time, giving an ILP of 3/2.
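The 3/2 figure can be reproduced with a small sketch. It assumes unlimited execution units and one time unit per operation, so the total time is the length of the longest dependency chain. The function name `ilp` and the dictionary encoding of the program are illustrative choices, not part of the slides.

```python
from typing import Dict, List


def ilp(deps: Dict[str, List[str]]) -> float:
    """Return (number of operations) / (length of the longest dependency chain).

    deps maps each operation's result name to the values it reads. Names that
    never appear as keys are program inputs, available at time 0. Every
    operation is assumed to take one unit of time.
    """
    finish: Dict[str, int] = {}

    def finish_time(name: str) -> int:
        if name not in deps:            # program input, ready at time 0
            return 0
        if name not in finish:
            finish[name] = 1 + max((finish_time(d) for d in deps[name]), default=0)
        return finish[name]

    total_time = max(finish_time(op) for op in deps)
    return len(deps) / total_time


program = {
    "e": ["a", "b"],   # 1. e = a + b
    "f": ["c", "d"],   # 2. f = c + d
    "g": ["e", "f"],   # 3. g = e * f
}
print(ilp(program))  # 1.5
```

A fully serial chain (each operation reading the previous result) gives an ILP of 1 under the same assumptions.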


Page 12: SOC System Design Approach

Processor: A Functional View

• The table shows different SOC designs and the processor used in each design.

• The processors in these examples can be characterized as general purpose or special purpose.

Processor Models for Different SOC Examples

Page 13: SOC System Design Approach

Processor: An Architectural View

• A simple sequential processor directly implements the sequential execution model.

• The next instruction is not processed until all execution for the current instruction is complete and its results have been committed.

Instruction execution sequence:

1. IF - fetching the instruction into the instruction register,

2. ID - decoding the opcode of the instruction,

3. AG - generating the address in memory of any data item residing there,

4. DF - fetching data operands into executable registers,

5. EX - executing the specified operation, and

6. WB - writing back the result to the register file.

Page 14: SOC System Design Approach

System Architecture: An Overview

• Sequential processor model

During execution, a sequential processor executes one or more operations per clock cycle from the instruction stream.

An instruction is a container that represents the smallest execution packet managed explicitly by the processor.

Scalar and superscalar processors consume one or more instructions per cycle, where each instruction contains a single operation.

Page 15: SOC System Design Approach

System Architecture: An Overview

• Pipelined Processor: Pipelining is an approach to exploiting parallelism that is based on concurrently performing different phases (instruction fetch, decode, execution, etc.) of processing an instruction.

• Pipelining assumes that these phases are independent between different operations; when this condition does not hold, the processor stalls the downstream phases to enforce the dependency.

• Thus, multiple operations can be processed simultaneously, with each operation at a different phase of its processing.

Instruction timing in a pipelined processor.
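The benefit of overlapping phases can be estimated with a small idealized sketch. It assumes one cycle per phase, the six phases IF, ID, AG, DF, EX, and WB, and no stalls; the function names are illustrative. Real pipelines lose cycles to dependencies and other hazards, so the pipelined figure is a lower bound.

```python
STAGES = ["IF", "ID", "AG", "DF", "EX", "WB"]


def sequential_cycles(n_instructions: int, n_stages: int = len(STAGES)) -> int:
    """Cycles when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int = len(STAGES)) -> int:
    """Cycles when a new instruction enters the pipeline every cycle (no stalls)."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def timing_diagram(n_instructions: int) -> list:
    """One text row per instruction; each row is shifted right by one cycle."""
    return ["   " * i + " ".join(STAGES) for i in range(n_instructions)]


for row in timing_diagram(3):
    print(row)
print(sequential_cycles(3), pipelined_cycles(3))  # 18 8
```

For long instruction streams the idealized pipelined cost approaches one cycle per instruction, which is where the speedup comes from.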

Page 16: SOC System Design Approach

System Architecture: An Overview

• Pipelined Processor

The figure shows the general form of a pipelined processor.

A static pipeline requires the processor to go through all stages or phases of the pipeline.

A dynamic pipeline allows the bypassing of one or more pipeline stages, depending on the requirements of the instruction.

The more complex dynamic pipelines allow instructions to complete out of (sequential) order, or even to initiate out of order.

Page 17: SOC System Design Approach

System Architecture: An Overview

• Although pipelining does not necessarily execute multiple instructions at exactly the same time, there are other techniques that do.

• Because most instructions consist of only a single operation, this kind of parallelism has been named ILP (instruction-level parallelism).

• Two architectures that exploit ILP are superscalar and VLIW (very long instruction word) processors.

• A superscalar processor dynamically examines the instruction stream to determine which operations are independent and can be executed in parallel.

• A VLIW processor relies on the compiler to analyze the available operations (OP) and to schedule independent operations into wide instruction words.

Page 18: SOC System Design Approach

System Architecture: An Overview

• The figure shows the instruction timing of a pipelined superscalar or VLIW processor executing two instructions per cycle.

• In this case, all the instructions are independent, so they can be executed in parallel.

Page 19: SOC System Design Approach

System Architecture: An Overview

• Superscalar Processors:

• Dynamic pipelined processors remain limited to executing a single operation per cycle.

• This limitation can be avoided with the addition of multiple functional units and a dynamic scheduler to process more than one instruction per cycle.

• Superscalar processors can achieve execution rates of several instructions per cycle (usually limited to two, but more is possible depending on the application).

Page 20: SOC System Design Approach

System Architecture: An Overview


Superscalar processor model

Page 21: SOC System Design Approach

System Architecture: An Overview

• Superscalar Processors:

• A superscalar processor adds a scheduling instruction window that analyzes multiple instructions from the instruction stream in each cycle.

• Although processed in parallel, these instructions are treated in the same manner as in a pipelined processor.

• Because of the complexity of the dynamic scheduling logic, high-performance superscalar processors are limited to processing four to six instructions per cycle.

• In Flynn's taxonomy, a single-core superscalar processor is classified as an SISD processor, whereas a multi-core superscalar processor is classified as an MIMD processor.

Page 22: SOC System Design Approach

System Architecture: An Overview

• VLIW (Very Long Instruction Word) Processors

In contrast to dynamic analysis in hardware to determine which operations can be executed in parallel, VLIW processors rely on static analysis in the compiler.

VLIW processor model

Page 23: SOC System Design Approach

System Architecture: An Overview

• VLIW processors are less complex than superscalar processors and have the potential for higher performance.

• A VLIW processor executes operations from statically scheduled instructions.

• Because VLIW processors rely on the static analysis performed by the compiler, they are unable to take advantage of any dynamic execution characteristics.

• Because scheduling is done by the compiler rather than by hardware, even a simple VLIW implementation can achieve high performance.

Page 24: SOC System Design Approach

System Architecture: An Overview

• VLIW processors:

• A VLIW instruction encodes multiple operations, at least one operation for each execution unit of a device.

• For example, if a VLIW device has four execution units, then a VLIW instruction for the device has four operation fields, each field specifying what operation should be done on the corresponding execution unit.

• To accommodate these operation fields, VLIW instructions are usually at least 64 bits wide, and far wider on some architectures.

• For example, the following instruction for the SHARC DSP does a floating-point multiply, a floating-point add, and two auto-increment loads in one cycle. All of this fits in one 48-bit instruction:

f12 = f0 * f4, f8 = f8 + f12, f0 = dm(i0, m3), f4 = pm(i8, m9);
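How a compiler might fill the operation fields can be illustrated with a greedy packing sketch. This is not the algorithm of any particular VLIW compiler. It assumes every operation takes one cycle, any operation can use any field, and each destination is written only once, so only true data dependencies matter. The names `pack_vliw`, `Op`, and `width` are hypothetical.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, List[str]]  # (destination, sources)


def pack_vliw(ops: List[Op], width: int) -> List[List[str]]:
    """Greedily pack operations, in program order, into words of `width` fields.

    An operation is placed in the earliest word that comes after the words
    producing all of its sources and that still has a free field.
    """
    words: List[List[str]] = []
    word_of: Dict[str, int] = {}   # destination -> index of the word producing it
    for dest, srcs in ops:
        # Sources never produced by an earlier op are inputs, ready in word 0.
        earliest = max((word_of[s] + 1 for s in srcs if s in word_of), default=0)
        w = earliest
        while w < len(words) and len(words[w]) >= width:
            w += 1                 # skip words whose fields are all taken
        while len(words) <= w:
            words.append([])
        words[w].append(dest)
        word_of[dest] = w
    return words


ops = [("e", ["a", "b"]), ("f", ["c", "d"]), ("g", ["e", "f"])]
print(pack_vliw(ops, width=2))  # [['e', 'f'], ['g']]
```

With `width=1` the same program needs three words, which matches sequential execution.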


Page 25: SOC System Design Approach

System Architecture: An Overview

• SIMD Architectures:


Page 26: SOC System Design Approach

System Architecture: An Overview

• SIMD Architectures:

Figure: SIMD processable patterns and SIMD unprocessable patterns

Example: Brightness computation by SIMD operations

Page 27: SOC System Design Approach

System Architecture: An Overview

• SIMD Architectures: Array and Vector Processors:

• The SIMD class of processor architecture includes both array and vector processors.

• From the view of an assembly-level programmer, programming a SIMD architecture appears to be similar to programming a simple processor.

Page 28: SOC System Design Approach

System Architecture: An Overview

• SIMD Architectures: Array and Vector Processors:

• The two popular types of SIMD processor are the array processor and the vector processor.

• They differ both in their implementations and in their data organizations.

• An array processor consists of many interconnected processor elements, each with its own local memory space.

• A vector processor consists of a single processor that references a global memory space and has special function units that operate on vectors.

Page 29: SOC System Design Approach

System Architecture: An Overview

• Array processor

Array processor model.

Array processor: A set of parallel processor elements connected via one or more networks, possibly including local and global inter-element communications and control communications.

Each processor element (PE) has its own private memory.

Data is distributed across the elements in a regular fashion that depends on both the actual structure of the data and the computations to be performed on the data.

Page 30: SOC System Design Approach

System Architecture: An Overview

• Array processor:

• Direct access to global memory or to another processor element’s local memory is expensive, so intermediate values are propagated through the array via local inter-processor connections.

• Since instructions are broadcast, a processor element cannot alter the flow of the instruction stream; however, individual processor elements can conditionally disable instructions based on local status information, in which case those processor elements remain idle.
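The broadcast-with-disable behaviour can be sketched in a few lines. This is a toy model, not a real array processor: each processor element holds one local value and an enable flag, and the Python loop stands in for lockstep execution. The names `ProcessorElement`, `set_mask`, and `broadcast` are hypothetical.

```python
from typing import Callable, List


class ProcessorElement:
    def __init__(self, value: int) -> None:
        self.local = value      # private local memory (a single word here)
        self.enabled = True     # local status flag


def set_mask(pes: List[ProcessorElement], predicate: Callable[[int], bool]) -> None:
    """Each PE decides from its own local data whether it stays active."""
    for pe in pes:
        pe.enabled = predicate(pe.local)


def broadcast(pes: List[ProcessorElement], op: Callable[[int], int]) -> None:
    """The control processor broadcasts one operation; disabled PEs stay idle."""
    for pe in pes:
        if pe.enabled:
            pe.local = op(pe.local)


pes = [ProcessorElement(v) for v in [3, -1, 4, -5]]
set_mask(pes, lambda x: x < 0)     # only PEs holding negative values stay active
broadcast(pes, lambda x: -x)       # every active PE negates its value
print([pe.local for pe in pes])    # [3, 1, 4, 5]
```

Every element receives the same instruction stream, yet the result is an element-wise absolute value, because the elements holding non-negative data sat out the negate step.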


Page 31: SOC System Design Approach

System Architecture: An Overview

• Array processor:

• The actual instruction stream consists of more than a fixed stream of operations.

• An array processor is typically coupled to a general-purpose control processor that provides both scalar operations and array operations that are broadcast to all processor elements in the array.

• The control processor performs the scalar sections of the application, interfaces with the outside world, and controls the flow of execution.

• The array processor performs the array sections of the application as directed by the control processor.

Page 32: SOC System Design Approach

System Architecture: An Overview

• Array processor:


Page 33: SOC System Design Approach

System Architecture: An Overview

• Vector Processors:

• A vector processor is a single processor that resembles a traditional single-stream processor, except that some of the functional units (and registers) operate on vectors (sequences of data values).

• These functional units are deeply pipelined and have high clock rates.

• Although deep vector pipelines have long latencies, the rapid delivery of the input vector data elements, together with the high clock rates, results in significant throughput.

Page 34: SOC System Design Approach

System Architecture: An Overview


Vector processor model.

Page 35: SOC System Design Approach

System Architecture: An Overview

• A typical vector processor configuration consists of:

– a vector register file,

– one vector addition unit,

– one vector multiplication unit, and

– one vector reciprocal unit (used along with the vector multiplication unit to perform division).

Vector processor model.

Page 36: SOC System Design Approach

System Architecture: An Overview

• In modern vector processors, vectors are loaded from memory into special vector registers, and results are stored back into memory from those registers.

• Vector processors have several features that enable them to achieve high performance.

• One of these features is the ability to concurrently load and store values between the vector register file and the main memory while performing computations on values in the vector register file.

• Because the length of a vector register is limited, vectors longer than the register length must be processed in segments.
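Processing a long vector in register-sized segments (often called strip mining) can be sketched as follows. The register length of 4 and the names `VECTOR_LENGTH` and `vector_add` are assumptions for illustration. Each loop iteration stands in for one vector load, one vector add, and one vector store.

```python
from typing import List

VECTOR_LENGTH = 4  # hypothetical number of elements in one vector register


def vector_add(a: List[int], b: List[int], vl: int = VECTOR_LENGTH) -> List[int]:
    """Add two equal-length vectors, at most `vl` elements per segment."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    result: List[int] = []
    for start in range(0, len(a), vl):
        reg_a = a[start:start + vl]                      # vector load
        reg_b = b[start:start + vl]                      # vector load
        reg_c = [x + y for x, y in zip(reg_a, reg_b)]    # one vector add
        result.extend(reg_c)                             # vector store
    return result


print(vector_add(list(range(10)), [1] * 10))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

A 10-element vector with a 4-element register takes three segments; the last one is only partly filled.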


Page 37: SOC System Design Approach

System Architecture: An Overview

• Most vector processors support result bypassing (also called chaining), which allows a follow-on computation to commence as soon as the first value is available from the preceding computation.

Thus, instead of waiting for the entire vector to be processed, the follow-on computation can be significantly overlapped with the preceding computation that it depends on.
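A rough timing model shows how much overlap bypassing can buy. It assumes each functional unit delivers its first result after `depth` cycles and one further result per cycle after that. The pipeline depths of 6 and 7 cycles are made-up values, and the function names are illustrative.

```python
def unchained_cycles(n: int, depth1: int, depth2: int) -> int:
    """The second operation starts only after the first has produced all n results."""
    return (depth1 + n - 1) + (depth2 + n - 1)


def chained_cycles(n: int, depth1: int, depth2: int) -> int:
    """The second operation starts as soon as the first result is available."""
    return depth1 + depth2 + n - 1


# 64-element vectors, hypothetical 6-cycle adder feeding a 7-cycle multiplier
print(unchained_cycles(64, 6, 7), chained_cycles(64, 6, 7))  # 139 76
```

Under this model the saving is n - 1 cycles, so it grows with the vector length.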


Page 38: SOC System Design Approach

Memory and Addressing

• Memory requirements in SOC applications vary significantly.

• In one case, the program can reside entirely in an on-chip ROM, with the data in on-chip RAM.

• In another case, the memory system might support an elaborate operating system requiring a large off-chip memory (system on a board), with a memory management unit and cache.

Page 39: SOC System Design Approach

Memory and Addressing

• Why not simply include memory with the processor on the die?

• This has many attractions (benefits of embedding memory and processor on the same die):

1. It improves the accessibility of memory, improving both memory access time and bandwidth.

2. It reduces the need for a large cache.

3. It improves performance for memory-intensive applications.

Page 40: SOC System Design Approach

Memory and Addressing

• Drawbacks of embedding memory and processor on the same die:

• The first problem is that DRAM process technology differs from standard microprocessor process technology, so combining them would cause some sacrifice in achievable bit density.

• The second problem is more serious: if memory were restricted to the processor die, its size would be correspondingly limited.

• Thus, the conventional processor die model has evolved to implement multiple robust homogeneous processors sharing the higher levels of a two- or three-level cache structure, with the main memory off-die on its own multi-die module.

Page 41: SOC System Design Approach

Memory and Addressing

• From a design complexity point of view, this has the advantage of being a "universal" solution: one implementation fits all applications, although not necessarily equally well.

• So, the production quantities can be large enough to justify the costs.

Page 42: SOC System Design Approach

Memory and Addressing

• An alternative to the previous approach is to integrate the memory on the system die.

• For specific applications whose memory size can be bounded, we can implement an integrated-memory SOC. This concept is illustrated in the figure.

Page 43: SOC System Design Approach

Memory and Addressing

• SOC Memory Examples

• It is important for SOC designers to consider whether to put RAM and ROM on-die or off-die.

• The table shows various examples of SOC embedded memory macro cells.

SOC Memory Considerations

Page 44: SOC System Design Approach

Memory and Addressing

• Addressing: The Architecture of Memory

• Addressing facilities are available to the programmer.

• Some facilities are available to the application programmer and some to the operating system programmer.

• Virtual memory allows programs to run even when they require more storage than the physical memory provides.

• When virtual addressing facilities are properly implemented and programmed, memory can be efficiently and securely accessed.

Page 45: SOC System Design Approach

Memory and Addressing

• Virtual memory is often supported by a memory management unit.

• The physical memory address is determined by a sequence of (at

least) three steps:

1. The application produces a process address. This, together with the process or user ID, defines the virtual address:

virtual address = offset + (program) base + index,

where the offset is specified in the instruction, while the base and index values are held in specified registers.

Page 46: SOC System Design Approach

Memory and Addressing


2. Since multiple processes must cooperate in the same memory space, the process addresses must be coordinated and relocated.

This is typically done by a segment table.

The upper bits of the virtual address are used to address the segment table, which holds base and bound values for the process:

system address = virtual address + (process) base,

where the system address must be less than the bound.
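The first two steps can be sketched as follows, applying the two formulas literally. The 16-bit split between segment number and the rest of the address, the table contents, and the names `SEGMENT_SHIFT`, `AddressFault`, `virtual_address`, and `system_address` are assumptions for illustration only. A real memory management unit performs these steps in hardware.

```python
from typing import Dict, Tuple

SEGMENT_SHIFT = 16  # hypothetical: bits 16 and above select the segment table entry


class AddressFault(Exception):
    """Raised when an address has no segment or exceeds its bound."""


def virtual_address(offset: int, base: int, index: int) -> int:
    # Step 1: virtual address = offset + (program) base + index
    return offset + base + index


def system_address(vaddr: int, segment_table: Dict[int, Tuple[int, int]]) -> int:
    # Step 2: the upper bits pick a (process base, bound) pair from the segment table
    segment = vaddr >> SEGMENT_SHIFT
    if segment not in segment_table:
        raise AddressFault(f"no segment {segment}")
    process_base, bound = segment_table[segment]
    saddr = vaddr + process_base
    if saddr >= bound:                 # the system address must be less than the bound
        raise AddressFault(f"address {saddr:#x} is not below bound {bound:#x}")
    return saddr


segment_table = {0: (0x10000, 0x18000)}   # made-up base and bound values
vaddr = virtual_address(offset=0x10, base=0x100, index=0x4)
print(hex(system_address(vaddr, segment_table)))  # 0x10114
```

The bound check is what keeps one process from reaching into another process's region.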


Page 47: SOC System Design Approach

Memory and Addressing


3. Virtual versus real. For many SOC applications, the memory space exceeds the available (real) implemented memory.

In that case, the memory space is implemented on disk, and only the recently used regions (pages) are brought into real memory.
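Keeping only recently used pages in real memory can be sketched with a small demand-paging model. The 4096-byte page size, the least-recently-used replacement policy, and the class name `PagedMemory` are assumptions; the slides do not specify a replacement policy. No data is actually moved here, only the page-to-frame mapping is tracked.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # hypothetical page size in bytes


class PagedMemory:
    def __init__(self, n_frames: int) -> None:
        self.n_frames = n_frames
        self.resident = OrderedDict()  # page number -> frame number, least recent first
        self.page_faults = 0

    def translate(self, address: int) -> int:
        """Map an address to a real address, bringing its page in if needed."""
        page, offset = divmod(address, PAGE_SIZE)
        if page in self.resident:
            self.resident.move_to_end(page)          # mark as most recently used
        else:
            self.page_faults += 1                    # page must be fetched from disk
            if len(self.resident) >= self.n_frames:
                _, frame = self.resident.popitem(last=False)  # evict least recent page
            else:
                frame = len(self.resident)           # next unused frame
            self.resident[page] = frame
        return self.resident[page] * PAGE_SIZE + offset


mem = PagedMemory(n_frames=2)
real = [mem.translate(a) for a in [0, 4101, 8192, 4096, 0]]
print(real, mem.page_faults)
```

With two frames, the third distinct page evicts the least recently used one, so a later return to the evicted page faults again.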


Page 48: SOC System Design Approach

System - Level Interconnection

• SOC technology typically relies on the interconnection of predesigned circuit modules (known as intellectual property (IP) blocks) to form a complete system.

• In this way, the design task is raised from the circuit level to the system level.

• Central to the system-level performance and the reliability of the finished product is the method of interconnection used.

• A well-designed interconnection scheme should have efficient communication protocols.

Page 49: SOC System Design Approach

System - Level Interconnection

• The interconnection scheme should provide efficient communication between different modules, maximizing the degree of parallelism.

• SOC interconnect methods can be classified into two main approaches: buses and network-on-chip.

Page 50: SOC System Design Approach

System - Level Interconnection

• Bus - Based Approach

With the bus-based approach, IP blocks are designed to conform to published bus standards (such as AMBA or CoreConnect).

Communication between modules is achieved through the sharing of the physical connections of address, data, and control bus signals.

Usually, two or more buses are employed in a system, organized in a hierarchical fashion.

To optimize system-level performance and cost, the bus closest to the CPU has the highest bandwidth, and the bus farthest from the CPU has the lowest bandwidth.

Page 51: SOC System Design Approach

System - Level Interconnection

• Network - on - Chip Approach

A network-on-chip system consists of an array of switches, either dynamically switched as in a crossbar or statically switched as in a mesh.

The crossbar approach uses asynchronous channels to connect synchronous modules that can operate at different clock frequencies.

This approach has the advantage of higher throughput than a bus-based system while making integration of a system with multiple clock domains easier.

Page 52: SOC System Design Approach

System - Level Interconnection

• Network - on - Chip Approach

This interconnect scheme is based on a two-dimensional mesh topology.

All communications between switches are conducted through data packets, routed through the router interface circuit within each node.

Since the interconnections between switches have a fixed distance, interconnect-related problems such as wire delay and crosstalk noise are much reduced.
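Packet routing on a two-dimensional mesh can be illustrated with dimension-ordered (XY) routing. This is one common, simple policy chosen here as an assumption; the slides do not say which routing algorithm the routers use. Nodes are identified by hypothetical (x, y) coordinates, and the function name `xy_route` is illustrative.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) position of a switch in the mesh


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Return the switches a packet visits: first along X, then along Y."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1   # one hop to the neighbouring column
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1   # one hop to the neighbouring row
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Every hop goes to an adjacent switch, so each link has the same fixed length and the hop count equals the Manhattan distance between source and destination.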

Page 53: SOC System Design Approach

System - Level Interconnection


Interconnect Models for Different SOC Examples

Page 54: SOC System Design Approach

An Approach For Soc Design

• Two important ideas in a design process are:

1. Figuring out the requirements and specifications, and

2. Iterating through different stages of design toward an efficient and

effective completion.

• Requirements and Specifications

• Design Iteration

– Initial Design

– Optimized Design


Page 55: SOC System Design Approach

An Approach For Soc Design

• Requirements and Specifications :

• The requirements and specifications must be thoroughly understood before the design process starts.

• They are useful at the beginning and at the end of the design process:

– At the beginning, to clarify what needs to be achieved; and

– At the end, as a reference against which the completed design can be evaluated.

Page 56: SOC System Design Approach

An Approach For Soc Design

• Requirements and Specifications :

• The system requirements are the externally generated criteria for the system.

• They may come from competition, from sales insights, from customer requests, from product profitability analysis, or from a combination of these.

• Requirements rarely specify the system itself in any definite way. Indeed, requirements can frequently be unrealistic:

“I want it fast, I want it cheap, and I want it now.”

Page 57: SOC System Design Approach

An Approach For Soc Design

• Requirements and Specifications :

• It is important for the designer to analyze carefully how the requirements are expressed, and to spend sufficient time understanding the market situation, in order to determine all the factors expressed in the requirements.

Page 58: SOC System Design Approach

An Approach For Soc Design

• Requirements and Specifications :

• Some of the factors the designer considers in determining requirements include:

• Compatibility with previous designs or published standards,

• Reuse of previous designs,

• Customer requests/complaints,

• Sales reports,

• Cost analysis,

• Competitive equipment analysis, and

• Trouble reports (reliability) of previous products and competitive products.

Page 59: SOC System Design Approach

An Approach For Soc Design

• Design Iteration:

• Design is always an iterative process.

• So, the obvious question is how to get the very first, initial design?

• This is the design that we can then iterate through and optimize

according to the design criteria.


Page 60: SOC System Design Approach

An Approach For Soc Design

• Design Iteration:

• Initial Design:

• This is the first design that meets the key requirements, while other performance and cost criteria are not considered.

• For instance, the processor, memory, or input/output (I/O) should be sized to meet all high-priority, real-time constraints.

• Promising components and their parameters are selected for better performance.

• The analysis uses a simplified model of the expected computational capacity or data bandwidth capability.

The SOC initial design process

Page 61: SOC System Design Approach

An Approach For Soc Design

• Design Iteration:

• Optimized Design:

• Once the base performance (or area) requirements are met and the base functionality is ensured, the goal is to minimize:

– the cost (area), and/or

– the power consumption, or

– the design effort required to complete the design.

• This is the iterative step of the process.

• The first steps of this process use higher-fidelity tools (simulations, trial layouts, etc.) to ensure that the initial design actually does satisfy the design specifications and requirements.

• The later steps refine, complete, and improve the design according to the design criteria.

Page 62: SOC System Design Approach

System Architecture and Complexity

• The basic difference between processor architecture and system architecture is that the system adds another layer of complexity, and the complexity of these systems limits the cost savings.

• Historically, the computer is a single processor plus a memory.

• As long as this is fixed (within broad tolerances), implementing that processor on one or more silicon dies does not change the design complexity.

• Once die densities enable a scalar processor to fit on a chip, the complexity issue changes.

Page 63: SOC System Design Approach

System Architecture and Complexity

• As transistor densities significantly improve after this point, there are obvious processor extension strategies to improve performance:

1. Additional Cache: Here we add cache storage and, as large caches have slower access times, a second-level cache.

2. A More Advanced Processor: We implement a superscalar or a VLIW processor that executes more than one instruction each cycle. Additionally, we can speed up the execution units that affect the critical path delay, especially the floating-point execution times.

3. Multiple Processors: Now we implement multiple (superscalar) processors and their associated multilevel caches.

Page 64: SOC System Design Approach

System Architecture and Complexity

• Instead of the 100,000-transistor processors, our advanced processor has millions of transistors.

• The way to manage this complexity is to reuse designs.

• So, reusing several simpler processor designs implemented on a die is preferable to a new, more advanced, single processor.

• For this to work, we need a good interconnection mechanism to access the various processors and memory.

Complexity of design

Page 65: SOC System Design Approach

System Architecture and Complexity

• So, when an application is well specified, the system-on-a-chip approach includes:

1. Multiple (usually) heterogeneous processors, each specialized for specific parts of the application;

2. The main memory with (often) ROM for partial program storage;

3. A relatively simple, small (single-level) cache structure or buffering schemes associated with each processor; and

4. A bus or switching mechanism for communications.

Page 66: SOC System Design Approach

Product Economics And Implications For SoC

• Factors Affecting Product Costs

• The basic cost and profitability of a product depend on many factors:

• its technical appeal,

• its cost,

• the market size, and

• the effect the product has on future products.

• The issue of cost goes well beyond the product’s manufacturing cost.

Project cost components

Page 67: SOC System Design Approach

Product Economics And Implications For SoC

• Factors Affecting Engineering Costs


Page 68: SOC System Design Approach

Product Economics And Implications For SoC

• Modelling Product Economics and Technology Complexity

• To put all this into perspective, consider a general model of a product’s average unit cost:

unit cost = (project cost) / (number of units).

• The project cost is simply the sum of all the fixed and variable costs.

• We represent the fixed cost as a constant, Kf.

• The variable costs are of the form Kv × n, where n is the number of units.
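Combining these definitions gives unit cost = (Kf + Kv × n) / n = Kf / n + Kv. The short sketch below evaluates this for two production volumes. The cost figures are made-up numbers for illustration, and the function name `unit_cost` is hypothetical.

```python
def unit_cost(kf: float, kv: float, n: int) -> float:
    """Average unit cost for fixed cost kf, per-unit variable cost kv, and n units."""
    if n <= 0:
        raise ValueError("number of units must be positive")
    return (kf + kv * n) / n


# Hypothetical project: 1,000,000 fixed cost, 5 variable cost per unit
print(unit_cost(1_000_000, 5, 1_000))       # 1005.0
print(unit_cost(1_000_000, 5, 1_000_000))   # 6.0
```

At low volume the fixed cost dominates the unit cost; at high volume the unit cost approaches Kv.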


Page 69: SOC System Design Approach

Thank You…

This presentation is published only for Educational Purpose
