architectures - 1

Upload: binu-velambil

Post on 02-Apr-2018


  • 7/27/2019 Architectures - 1

    1/27

    31/07/2013 1

    ARCHITECTURES - 1

    Mariagiovanna Sami


    Architecture: which definition?

    Abstract architecture: the functional specification of a computer.

    Concrete architecture: an implementation of an abstract architecture.

    An abstract architecture is a black-box specification of a machine; it can be seen:


    Architecture definition (2)

    From the programmer's point of view, we deal with a programming model, equivalent to a description of the machine language;

    From the designer's point of view, we deal with a hardware model (a black-box description for the designer: it must include additional information, e.g., interface protocols).


    Architecture definition (3)

    Usually, "architecture" denotes the abstract architecture. The concrete architecture is often called the microarchitecture (a term originally created for microprogrammed CPUs, later extended more generally to the structural description in terms of functional units and interconnections).


    Where do we start from?

    Background: the Von Neumann paradigm (and the Harvard alternative)

    Extension to a reactive paradigm: still V.N.!


    An Architectural Paradigm:

    Composition of hardware and program execution mode;

    Does not include software, but implies the execution mode of object code!


    The classical V.N. abstract architecture:

    [Block diagram: Memory, I/O, and the CPU, which comprises the Control Unit and the ALU]


    Programming style: imperative, control-flow dominated

    One address space in memory: information is identified by its address;

    Machine instructions are stored sequentially: the natural order of fetching and execution is by increasing address values (execution in the same sequential order);

    Variables are identified by names, translated as addresses.


    The Control Flow:

    The C.U. determines the address of the next instruction to be executed, as contained in the Program Counter (PC), and fetches it from memory;

    The C.U. decodes the instruction and controls its execution by proper commands to the ALU and memory;

    Simultaneously, the address of the next instruction is computed: as a rule, the next instruction is immediately sequential to the one being executed (address computed by incrementing the PC) unless otherwise explicitly stated by a control instruction.
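The control flow just described can be sketched as a toy interpreter. This is only an illustration: the single accumulator, the opcode names (`LOAD`, `ADD`, `STORE`, `JUMP`, `JZ`, `HALT`) and the tuple encoding of instructions are assumptions of the sketch, not part of the lecture.

```python
def run(memory, max_steps=1000):
    """Run a toy machine with one address space until HALT.

    `memory` is a list holding both instructions (tuples) and data (ints).
    """
    pc = 0    # Program Counter: address of the next instruction
    acc = 0   # single accumulator register (an assumption of this sketch)
    for _ in range(max_steps):
        opcode, operand = memory[pc]   # fetch from the address held in the PC
        next_pc = pc + 1               # default: the immediately sequential instruction
        # decode and execute
        if opcode == "LOAD":
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "JUMP":         # control instruction: overrides sequencing
            next_pc = operand
        elif opcode == "JZ":           # conditional branch: taken if accumulator is zero
            if acc == 0:
                next_pc = operand
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc = next_pc
    raise RuntimeError("step limit reached")


# Program and data share one address space: code at 0-3, data at 4-6.
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", 0),
    2,   # address 4
    3,   # address 5
    0,   # address 6: result
]
result = run(program)
print(result[6])
```

Only `JUMP` and `JZ` ever change `next_pc`; every other instruction leaves the incremented PC in place, which is the "as a rule" case above.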


    Control-dominated execution:

    Control is implicitly determined by the ordering of instructions in the program, or explicitly modified by jump/branch instructions;

    Execution is inherently sequential and serial.


    The basic approach: C.U. the only active unit

    All transfers to/from memory are controlled by the C.U.;

    I/O is initiated by instructions in the program (program-controlled I/O, polling): the C.U. activates transfer channels;

    All actions are de facto synchronized by execution of the program.
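A minimal sketch of program-controlled I/O, assuming a hypothetical `Device` class with a ready flag and a data register (neither name comes from the lecture). The point it illustrates is that the device never acts on its own: every status check and every transfer is issued by the program.

```python
class Device:
    """Hypothetical I/O device with a ready flag and a data register."""

    def __init__(self, pending):
        self._pending = list(pending)

    def ready(self):
        return bool(self._pending)

    def read(self):
        return self._pending.pop(0)


def polled_read(device, n_items, max_polls=1000):
    """Program-controlled I/O: the program itself tests the device status."""
    received = []
    polls = 0
    while len(received) < n_items:
        polls += 1
        if polls > max_polls:
            raise TimeoutError("device never became ready")
        if device.ready():                   # status check issued by the program
            received.append(device.read())   # transfer initiated by the program
    return received


data = polled_read(Device([10, 20, 30]), 3)
print(data)
```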


    The Harvard variant...

    Basically, separates program and data memory:

    [Block diagram: Program Memory, Data Memory, Control Unit, ALU, I/O]


    Performance Evaluation...

    Made with reference to a set of benchmark programs (often synthetic);

    For every instruction in the machine's Instruction Set (IS), the total time required (fetch + execute) is known;

    Profiling (execution of the program with suitable sets of data) gives the dynamic sequence of instructions executed.


    Performance Evaluation (2)

    Total time required by execution of the program = sum of the times required by all instructions in the dynamic sequence of execution (instructions may have different latencies depending on the specific operations, the need to access memory to read/write data, even the length of the instruction itself).

    Performance optimization: through the choice of the best algorithm plus less time-consuming instructions.
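The sum over the dynamic sequence can be written out directly. The latency table and the trace below are made-up values for illustration only; in practice the latencies would come from the machine's documentation and the trace from profiling.

```python
from collections import Counter

# Hypothetical per-instruction times (fetch + execute), in clock cycles.
LATENCY = {"LOAD": 4, "STORE": 4, "ADD": 1, "MUL": 3, "BRANCH": 2}


def total_time(dynamic_sequence, latency=LATENCY):
    """Sum the latency of every instruction in the executed (dynamic) sequence."""
    return sum(latency[op] for op in dynamic_sequence)


def time_by_opcode(dynamic_sequence, latency=LATENCY):
    """Break the total down per opcode, to see where the time goes."""
    counts = Counter(dynamic_sequence)
    return {op: n * latency[op] for op, n in counts.items()}


# A made-up trace: a five-instruction loop body executed three times.
trace = ["LOAD", "MUL", "ADD", "STORE", "BRANCH"] * 3
print(total_time(trace))
print(time_by_opcode(trace))
```

The per-opcode breakdown is what makes the "less time-consuming instructions" optimization actionable: it shows which instruction classes dominate the total.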


    (Some of) the bottlenecks

    Memory is slower than logic: larger (and less costly) memory = wider gap, but an ever larger addressable memory space is requested!

    Execution is totally serial: an instruction must be completed before its successor is fetched from memory; technology dominates instruction latency and overall performance.


    Bottlenecks (2)

    If a reactive system is designed (typically, an application-specific or embedded system), an external event created by an I/O device is serviced only when the device is polled by the program: real-time behaviour is only as good as the programmer can make it!
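A small sketch of why polling bounds responsiveness, assuming (for illustration only) that the program polls the device at a fixed period. The function name, its parameters and the period of 100 time units are all hypothetical.

```python
import math


def service_delay(event_time, poll_period, first_poll=0):
    """Time an event waits until the next poll of its device.

    The device is polled at first_poll, first_poll + poll_period, ...
    All times are in the same (arbitrary) unit.
    """
    if event_time <= first_poll:
        return first_poll - event_time
    polls_elapsed = math.ceil((event_time - first_poll) / poll_period)
    next_poll = first_poll + polls_elapsed * poll_period
    return next_poll - event_time


delays = [service_delay(t, poll_period=100) for t in (1, 50, 99, 100)]
print(delays)
```

An event arriving just after a poll waits almost a full period, so the worst-case delay is set by how the programmer structured the polling loop rather than by the hardware.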


    So, how to achieve better performance?

    Modify the memory structure so that the programmer will see a very large addressable space but the CPU will see a fast equivalent memory;

    Achieve better efficiency for execution of the instruction sequence;

    Allow servicing external events when events arise in an asynchronous way with respect to program execution.


    Starting from the bottom...

    Servicing external events? Solution born with the first minicomputers (early '60s): the interrupt (an external unit may initiate an action; execution of the servicing routine is then controlled by the C.U.).
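A simplified model of the interrupt idea, under the assumption that a request is honoured at the boundary between two instructions. The function and parameter names are hypothetical, and saving and restoring the PC is left implicit: the loop simply resumes where it left off.

```python
def run_with_interrupts(program, handler, interrupt_at):
    """Execute `program` (a list of callables) serially.

    `interrupt_at` is the set of step indices during which a hypothetical
    external unit raises a request. The request is serviced only after the
    instruction in progress has completed.
    """
    log = []
    pending = False
    for step, instruction in enumerate(program):
        if step in interrupt_at:
            pending = True        # the external unit initiates the action
        instruction(log)          # the current instruction still completes
        if pending:
            handler(log)          # servicing routine, run under C.U. control
            pending = False
    return log


program = [lambda log, i=i: log.append(f"instr {i}") for i in range(4)]
log = run_with_interrupts(program, lambda log: log.append("service routine"), {1})
print(log)
```

Unlike polling, the program contains no status checks: the request arrives asynchronously and the main instruction sequence continues afterwards.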


    Getting better efficiency for instruction execution?

    A first approach: create instructions capable of executing complex operations (object code more compact; one instruction fetched from memory executes actions previously performed by a sequence of instructions).

    Drawbacks: more complex C.U. (longer clock period); identification of useful complex instructions for general-purpose CPUs is difficult.
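The trade-off can be made concrete with the usual product form of execution time (instruction count × average cycles per instruction × clock period). That formula is a standard textbook relation rather than something stated on the slide, and every number below is hypothetical.

```python
def execution_time_ns(instruction_count, cycles_per_instruction, clock_period_ns):
    """Execution time as instruction count x average cycles x clock period."""
    return instruction_count * cycles_per_instruction * clock_period_ns


# Hypothetical machine with simple instructions only.
simple = execution_time_ns(1_000_000, 2, 10)

# Hypothetical machine with complex instructions: fewer instructions executed,
# but more cycles each and a longer clock period (more complex C.U.).
complex_slow = execution_time_ns(700_000, 3, 12)
complex_fast = execution_time_ns(500_000, 3, 12)

print(simple, complex_slow, complex_fast)
```

With these made-up figures a 30% reduction in instruction count is not enough to pay for the slower clock, while a 50% reduction is; whether complex instructions help depends on how often the application can actually use them.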


    Complex instructions

    Still...:

    The solution has been widely adopted: CISC machines were a winning approach for a long time;

    May be very useful when specialized tasks are widely used (e.g., DSP or image processing) or for application-specific CPUs.


    Getting better efficiency for instruction execution: the alternative

    Modify the structure of the CPU and the execution paradigm to introduce parallelism: overcome the serial execution bottleneck.

    But...

    Which kind of parallelism? Parallelism has to be detected within the application: at which level?


    What about the memory problem?

    Introduce a hierarchy of memories: large, slow (and cheap) ones at the bottom; fast, small (and costly) ones at the top (nearest the CPU);

    Allow a wider memory bandwidth: more than one unit of information at a time is transferred from memory to CPU (or between memories).
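How a hierarchy can look like a "fast equivalent memory" is often summarized by the average access time of a two-level hierarchy: hit time plus miss rate times miss penalty. This relation is standard textbook material rather than part of the slide, and the figures used here are hypothetical.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time of a two-level hierarchy (same time unit throughout).

    hit_time:     access time of the fast, small memory
    miss_rate:    fraction of accesses not found in the fast memory (0..1)
    miss_penalty: extra time to fetch from the large, slow memory
    """
    return hit_time + miss_rate * miss_penalty


# Hypothetical figures: 1-cycle fast memory, 5% misses, 100-cycle penalty.
t = average_access_time(hit_time=1, miss_rate=0.05, miss_penalty=100)
print(t)
```

With a low enough miss rate the average stays close to the fast memory's access time, even though almost all of the capacity sits in the slow level.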


    Memory (2)

    In fact:

    Hierarchy: does not imply any assumption on mode of execution other than serial; requires extensions to the hw structure controlling memory access;

    Larger bandwidth: meaningful only if some form of parallelism is adopted.


    What these lectures will be about:

    Memory hierarchy: it is assumed that the basic points are already known (e.g., virtual memory and its hw supports; the scope of cache memory...). Attention will be given to cache organization and performance: technological aspects are not discussed here (other courses...).


    What these lectures will be about (2):

    Parallelism: from within the CPU to system level, taking into account characteristics of application-specific systems:

    Pipelining

    Instruction-Level Parallelism (ILP)

    Multi-threading

    Multi-processor systems.


    Course organization

    Lectures

    Exercises

    Use of tools for architecture evaluation and design:

    Analysis of an application's behaviour given a fixed architecture

    Design of a specific architecture for a given application.


    Texts:

    Slides are available in the Master's repository;

    Suggested readings: a list will be circulated (books available in the Library, papers accessible via Internet or provided in hard copy);

    Manuals of software tools: available in the repository.