embedded slides

Upload: suresh-patel

Post on 06-Apr-2018


  • 8/3/2019 Embedded Slides

    1/28

    ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION

    Lesson-17: Memory organisation, and types of memory

    1. Memory Organisation

    Random access model

    In the random access model, a data byte, word, double word, or quad word may be accessed at any addressable location. The same process is used to access every location, and the access time for a read or a write is equal and independent of the memory address. This model differs from another model called the serial access model.

    Addresses

    Memory (both RAM and ROM) is divided into a set of storage locations, each of which can hold 1 byte (8 bits) of data.

    The storage locations are numbered, and the number of a storage location (called its address) is

    used to tell the memory system which location the processor wants to reference.

    An important characteristic of a computer system is the width of the addresses it uses, which limits the amount of memory that the processor can address. Most current computers use either 32-bit or 64-bit addresses, allowing them to access either 2^32 or 2^64 bytes of memory.
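The relation between address width and addressable memory can be checked with a few lines of Python. This is only an illustrative sketch; the function name `addressable_bytes` is made up for this example and assumes byte-addressable memory, as described above.

```python
def addressable_bytes(address_width_bits: int) -> int:
    """Number of distinct byte locations reachable with the given address width."""
    return 2 ** address_width_bits


for width in (32, 64):
    print(f"{width}-bit addresses: {addressable_bytes(width)} bytes")
```

With 32-bit addresses this gives 4 GiB of addressable memory.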

    RANDOM ACCESS MODEL OF MEMORY

    Simple model for RAM and ROM

    Both have a random-access model of memory

    All memory operations take the same amount of time, independent of the address of the byte or word in the memory

    Example

    Assume that the memory system supports two operations: load (read operation into the processor from memory) and store (write operation from the processor into memory).

    A load from one set of addresses (2 or 4) takes the same time as a store to another set of addresses (2 or 4).

    ROM

    Contents of the read-only memory cannot be modified by the computer but may be read.

    A system has ROM unit(s) for bootstrap program(s), basic input-output system (BIOS) program(s), and vector addresses for the interrupts.


    Used to hold bootstrap program that is executed automatically by the system every time it is

    turned on or reset. Instructs the system to load its operating system off

    ROM image

    ROM image holds the programs, operating system, and data required by the system

    Random-access memory (RAM)

    Can be both read and written.

    Holds the programs, operating system, and data required by the system.

    Generally volatile, meaning that it does not retain the data stored in it when the system's power is turned off.

    Data that needs to be stored while the system is off must be written to a permanent storage

    device, such as a flash memory or hard disk.

    An example is as follows: A mobile phone has 128 kB or 256 kB of RAM to hold the stack and

    temporary variables of the programs, operating system, and data.

    ALIGNMENT OF MULTIBYTE STORE AND LOAD IN A MEMORY ORGANISATION

    Some memory organisations require loads and stores to be "aligned". For example, a 4-byte word is aligned at address 0x000C or 0x1000, each of which is a multiple of 4. This simplifies the organisation of the memory system.
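The alignment rule above amounts to a divisibility test. The following sketch is illustrative only; `is_aligned` and `align_up` are hypothetical helper names, not part of any library.

```python
def is_aligned(address: int, size: int) -> bool:
    """True if the address is a multiple of the access size in bytes."""
    return address % size == 0


def align_up(address: int, size: int) -> int:
    """Round an address up to the next multiple of size."""
    return (address + size - 1) // size * size


for addr in (0x000C, 0x1000, 0x1002):
    print(hex(addr), "aligned for 4-byte word:", is_aligned(addr, 4))
```

Both addresses from the slide (0x000C and 0x1000) pass the test, while 0x1002 does not and would be rounded up to 0x1004.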

    LITTLE ENDIAN AND BIG ENDIAN IN A MEMORY ORGANISATION

    Some processor and memory organisations require little-endian and others big-endian ordering of multiple bytes when there is a store into the memory or a load into the processor from memory.

    The ARM processor permits programming at the start and enables a programmer to define one of the byte orders, little endian or big endian, at the beginning.
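As a small illustration of the two byte orders, the sketch below stores the same 32-bit value both ways using Python's built-in `int.to_bytes`. The value 0x12345678 is an arbitrary example.

```python
value = 0x12345678

# Little endian: least significant byte at the lowest address.
little = value.to_bytes(4, byteorder="little")
# Big endian: most significant byte at the lowest address.
big = value.to_bytes(4, byteorder="big")

print("little endian:", little.hex(" "))
print("big endian:   ", big.hex(" "))

# A load must use the same byte order as the store to recover the value.
restored = int.from_bytes(little, byteorder="little")
```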

    Princeton Architecture

    80x86 processors and ARM7 have Princeton architecture for main memory (8051-family microcontrollers have Harvard architecture). Vectors and pointers, variables, program segments, and memory blocks for data and stacks have different addresses in the program in Princeton memory architecture.

    Harvard architecture

    The address spaces for the data and for the program are distinct.


    Suited to handling streams of data that must be accessed by single-instruction multiple-data (SIMD) type instructions and DSP instructions.

    Separate data buses ensure simultaneous accesses for instructions and data.

    Harvard and Princeton Memory Organizations


    2. Types of Memory

    Most systems have two types of memory: read-only memory (ROM) and random-access memory (RAM).

    A computer system has ROM unit(s) for bootstrap program(s), basic input-output system (BIOS) program(s), and vector addresses for the interrupts.

    An embedded system has ROM unit(s) for storing the ROM image and flash to save non-volatile data and results.

    ROM Uses

    Language-specific bits for the fonts corresponding to each character sent to a printer or display unit.

    Image bits for a display.

    Pictogram bytes for the full bit-image corresponding to the pixels of a pictogram. Sequential changes at the inputs of the display unit repeatedly generate the full pictogram.

    In a CISC processor, as a control ROM in a micro-programmed unit for implementing instructions.

    1) Masked ROM: Used for large-scale manufacturing; a mask is prepared for the foundry from a finalised ROM image of the system program and data, pictograms, image pixels, pixels for the fonts of a language, or combination circuits implementing a truth table.

    2) EPROM: Used in place of masked ROM during the development phase; UV erasable and electrically programmable by a device programmer.

    3) E2PROM (EEPROM): Used during the program run to save non-volatile data and results (for example, date and time of a transaction, present port status, port driving history, system malfunction history); electrically erasable by writing a byte or a set of bytes with all 1s, and electrically programmable during a program run, one byte at each write instance.

    4) Flash: A flash memory functions as the ROM. A sector of 16 kB to 256 kB is electrically erasable at an instance, and the memory is electrically programmable one byte at each instance during a program run.
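The erase-then-program behaviour of flash can be modelled in a few lines. This is a simplified sketch: the class name `FlashSector` and the 16-byte sector size are made up for the demonstration (real sectors are 16 kB to 256 kB as noted above), and the rule that programming can only clear bits from 1 to 0 is an assumption typical of flash devices rather than something stated in the slides.

```python
class FlashSector:
    """Toy model of one flash sector (hypothetical, for illustration only)."""

    def __init__(self, size: int):
        # A fresh or erased sector reads back as all 1s (0xFF in every byte).
        self.cells = bytearray([0xFF] * size)

    def erase(self) -> None:
        """Erase the whole sector at one instance: every bit returns to 1."""
        for i in range(len(self.cells)):
            self.cells[i] = 0xFF

    def program(self, offset: int, value: int) -> None:
        """Program one byte; assumed to only clear bits, never set them."""
        self.cells[offset] &= value


sector = FlashSector(16)
sector.program(0, 0x5A)      # erased cell accepts any value
sector.program(0, 0xF0)      # second write can only clear more bits
print(hex(sector.cells[0]))  # 0x5A AND 0xF0
sector.erase()               # the only way to get the 1s back
```

Under this model, overwriting a byte without erasing first yields the bitwise AND of the old and new values, which is why a sector erase normally precedes rewriting.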

    RAM

    The RAM can be both read and written, and is used to hold the programs, operating system, and data required by a computer system. In embedded systems, it holds the stack and temporary variables of the programs, operating system, and data.

    RAM Characteristics

    RAM is generally volatile: it does not retain the data stored in it when the system's power is turned off.

    Any data that needs to be stored while the system is off must be written to a permanent storage

    device, such as a flash memory or hard disk.

    Example: A mobile phone has 128 kB or 256 kB of RAM to hold the stack and temporary variables of the programs, operating system, and data.

    RAM Types

    1) SRAM (static RAM) and DRAM (dynamic RAM): Used for saving the variables, stacks, process control blocks, input buffer, output buffer, and the decompressed format of the program and data in the ROM image.

    2) EDO (Extended Data Out) RAM: Used up to 100 MHz clock rate; zero wait state between two fetches; single-cycle read or write.

    3) SDRAM (Synchronous DRAM): Synchronised read operation; keeps the next word ready while the previous one is being fetched; used up to 1 GHz clock rate.

    4) RDRAM (Rambus* DRAM): Burst accesses of four successive words in a single fetch; used for 1 GHz+ performance of the system.

    * A developer company name

    5) Parameterised Distributed RAM: Used when slow bus accesses exist; RAM is distributed for the specific tasks of the system and devices, for example, for fast IO buffers and fast stacks.

    6) Parameterised Block RAM: A specific block dedicated to a specific use, for example, for DCT operations.

    Summary

    We learnt

    Random access memory model, ROM, RAM

    Addresses

    Data alignment

    Little and big endian

    Flash

    Princeton and Harvard architectures


    ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION

    Lesson-16: Processor organisation and Performance Metrics

    1. Processor, Memory and Buses

    Processor Organisation

    Processor

    ALU.

    The processor circuit performs sequential operations, and a clock guides these.

    Program counter and stack pointer, which point to the instruction to be fetched and to the top of the data pushed onto the stack, respectively.

    Certain processors have an on-chip memory management unit (MMU).

    Registers

    General-purpose registers.

    Registers are organised onto a common internal bus of the processor. A register is 32, 16 or 8 bits wide, depending on whether the ALU performs a 32-, 16- or 8-bit operation at an instance.


    CISC

    A processor may have a CISC (Complex Instruction Set Computer) or RISC (Reduced Instruction Set Computer) architecture, and the choice may affect the system design.

    CISC has the ability to process complex instructions and complex data sets with fewer registers, as it provides a large number of addressing modes.

    RISC

    Simpler instructions, all executing in a single cycle per instruction.

    New RISC processors, such as ARM7 and ARM9, also provide a few of the most useful CISC instructions.

    CISC converges to a RISC implementation because most instructions are hardwired and execute in a single clock cycle.

    Interrupts

    Processor provides for the inputs for external interrupts so that the external circuits can send

    the interrupt signals

    May possess an internal interrupt controller (handler) to program the service routine priorities and to allocate vector addresses.

    DMA (Direct Memory Access) Controller

    External devices can directly write to and read from blocks of RAM using the DMA controller when the buses are not in use by the processor.

    Multiple DMA channels on chip.

    When there are a number of I/O devices and an I/O device needs fast access to a multi-byte data set in the system memory, an on-chip DMA controller helps greatly.

    INSTRUCTION LEVEL PARALLELISM

    Execute several instructions in parallel. Two or more instructions execute in parallel as well as in pipeline.

    With two parallel pipelines in a processor, two instructions I(n) and I(n+1) execute in parallel at separate execution units.


    3. Processor Performance Metrics

    Metrics

    1) MIPS: Million Instructions Per Second

    2) MFLOPS: Million Floating Point Operations Per Second

    3) Dhrystone/s: Number of times a benchmark program called the Dhrystone program can run per second. [1 MIPS = 1757 Dhrystone/s]
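The conversion factor quoted above makes it easy to turn a Dhrystone score into a MIPS figure. The sketch below is illustrative; the function name and the sample score of 3514 runs per second are invented for the example.

```python
DHRYSTONES_PER_MIPS = 1757  # conversion factor quoted in the slide


def dhrystones_to_mips(dhrystones_per_second: float) -> float:
    """Convert a Dhrystone/s score into the equivalent MIPS rating."""
    return dhrystones_per_second / DHRYSTONES_PER_MIPS


# Hypothetical processor that completes 3514 Dhrystone runs per second.
print(dhrystones_to_mips(3514), "MIPS")
```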

    Embedded Microprocessor Benchmark Consortium (EEMBC) five-benchmark program suites

    Telecommunications

    Consumer Electronics

    Automotive and Industrial Electronics

    Networking

    Office Automation.


    Summary

    We learnt

    Processor, address, data and control buses and Memory

    CISC and RISC

    Instruction Level Parallelism

    Performance Metrics

    ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION

    Lesson-18: Memory Allocations and Memory Map

    1. Memory Allocation To Program Segments and Blocks

    Functions, Processes, Data and Stacks at the Various Segments of Memory

    Segment-wise memory allocation in four segments: Code, Data, Stack and Extra (for example, image, string)

    Segments and Paging at the Memory


    4) Table

    5) Look-up Table: The first column of a look-up-table row points to another memory block holding the data of a data structure

    6) List: In a list element, the data structure of an item also points to the next item

    7) Process Control Block [Refer Chapter 7, Lesson 1]
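The list structure described in item 6, where each item also points to the next one, can be sketched as follows. The class name `Item` and the helper `traverse` are hypothetical names chosen for this illustration.

```python
class Item:
    """One list element: holds its data and a reference to the next item."""

    def __init__(self, data, next_item=None):
        self.data = data
        self.next_item = next_item  # None marks the end of the list


def traverse(head):
    """Follow the next-item references from the head and collect the data."""
    collected = []
    current = head
    while current is not None:
        collected.append(current.data)
        current = current.next_item
    return collected


head = Item("a", Item("b", Item("c")))
print(traverse(head))
```

Because each item carries its own reference to the next, the items need not occupy adjacent memory blocks.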

    Memory Map

    A map that shows the allocation of addresses for the program and data to ROM, RAM, EEPROM or flash in the system.
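A memory map can be represented as a table of address ranges. The sketch below is purely illustrative: the region boundaries are invented for a 2 kB address space and are not taken from the smart-card figures in the slides; `MEMORY_MAP` and `region_of` are hypothetical names.

```python
# (start address, end address inclusive, memory type); hypothetical layout
MEMORY_MAP = [
    (0x0000, 0x03FF, "ROM"),     # program code and constants
    (0x0400, 0x05FF, "RAM"),     # stack and temporary variables
    (0x0600, 0x07FF, "EEPROM"),  # non-volatile data and results
]


def region_of(address: int):
    """Return the memory type that an address maps to, or None if unmapped."""
    for start, end, name in MEMORY_MAP:
        if start <= address <= end:
            return name
    return None


for addr in (0x0000, 0x0450, 0x07FF, 0x0800):
    print(hex(addr), "->", region_of(addr))
```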


    Memory map for an exemplary embedded system, smart card needing 2 kB memory

    Memory map for an exemplary Java embedded card with software for encrypting and deciphering

    the transactions


    Memory map sections in a smart card


    Memory map sections in another smart card

    Summary

    We learnt

    Allocations to various Segments and data structures and the memory map of Exemplary cases


    Arbitration: Priority arbiter

    Consider the situation where multiple peripherals request service from single resource (e.g.,

    microprocessor, DMA controller) simultaneously - which gets serviced first?

    Priority arbiter

    Single-purpose processor

    Peripherals make requests to arbiter, arbiter makes requests to resource

    Arbiter connected to system bus for configuration only
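The decision a fixed-priority arbiter makes can be sketched in a few lines. This is a behavioural model only, not a description of any particular arbiter chip; the function name and the convention that index 0 has the highest priority are assumptions made for the example.

```python
def priority_arbiter(requests):
    """Pick the requesting peripheral with the highest fixed priority.

    requests: list of booleans, one per peripheral; index 0 is assumed
    to have the highest priority. Returns the granted index, or None
    when no peripheral is requesting.
    """
    for index, requesting in enumerate(requests):
        if requesting:
            return index
    return None


# Peripherals 1 and 2 request at the same time; peripheral 1 is served first.
print(priority_arbiter([False, True, True]))
```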

    Arbitration: Daisy-chain arbitration

    Arbitration done by peripherals

    Built into peripheral or external logic added

    req input and ack output added to each peripheral

    Peripherals connected to each other in daisy-chain manner

    One peripheral connected to resource, all others connected upstream

    A peripheral's req flows downstream to the resource; the resource's ack flows upstream to the requesting peripheral

    Closest peripheral has highest priority


    Pros/cons

    Easy to add/remove peripheral - no system redesign needed

    Does not support rotating priority

    One broken peripheral can cause loss of access to other peripherals
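The daisy-chain behaviour, including the effect of a broken peripheral, can be modelled as the ack signal travelling upstream from the resource. This is a simplified sketch; the function name, the `broken` parameter, and the convention that index 0 is the peripheral closest to the resource are assumptions for illustration.

```python
def daisy_chain_grant(requests, broken=frozenset()):
    """Return the index of the peripheral that captures the ack, or None.

    requests: list of booleans; index 0 is the peripheral closest to the
    resource. broken: indices of peripherals that fail to pass ack on.
    """
    for position, requesting in enumerate(requests):
        if position in broken:
            return None      # ack stops here; peripherals upstream starve
        if requesting:
            return position  # first requester absorbs the ack
        # otherwise this peripheral forwards ack to the next one upstream
    return None


print(daisy_chain_grant([False, True, True]))               # closest requester wins
print(daisy_chain_grant([False, False, True], broken={1}))  # ack never arrives
```

The second call shows the drawback listed above: a single broken peripheral cuts off every peripheral farther from the resource.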

    Network-oriented arbitration

    When multiple microprocessors share a bus (sometimes called a network)

    Arbitration typically built into bus protocol

    Separate processors may try to write simultaneously causing collisions

    Data must be resent

    Don't want to start sending again at same time

    statistical methods can be used to reduce chances

    Typically used for connecting multiple distant chips

    Trend: use to connect multiple on-chip processors
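The slides do not name a specific statistical method. One well-known choice is binary exponential backoff, as used in Ethernet, where each sender waits a random number of time slots and the range doubles after every collision. The sketch below illustrates that technique; the function name and the `max_exponent` cap are assumptions for this example.

```python
import random


def backoff_slots(attempt: int, rng: random.Random, max_exponent: int = 10) -> int:
    """Random wait, in time slots, before retransmission attempt number `attempt`.

    The window doubles with each collision, so two colliding senders
    become increasingly unlikely to pick the same slot again.
    """
    window = 2 ** min(attempt, max_exponent)
    return rng.randrange(window)


rng = random.Random(0)  # seeded only to make the demonstration repeatable
for attempt in range(1, 5):
    print("attempt", attempt, "-> wait", backoff_slots(attempt, rng), "slots")
```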

    Example: Vectored interrupt using an interrupt table

    Fixed priority: i.e., Peripheral1 has highest priority

    Keyword _at_ followed by memory address forces compiler to place variables in specific memory locations

    e.g., memory-mapped registers in arbiter, peripherals

    A peripheral's index into the interrupt table is sent to a memory-mapped register in the arbiter

    Peripherals receive external data and raise interrupt
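The table lookup at the heart of a vectored interrupt can be sketched as follows. Python has no `_at_` keyword or memory-mapped registers, so the register is modelled here as a plain integer argument; the service routine names and return strings are invented for the illustration.

```python
def isr_peripheral1():
    """Hypothetical service routine for Peripheral1 (highest priority)."""
    return "peripheral1 serviced"


def isr_peripheral2():
    """Hypothetical service routine for Peripheral2."""
    return "peripheral2 serviced"


# The interrupt table: entry i holds the routine for the peripheral with index i.
INTERRUPT_TABLE = [isr_peripheral1, isr_peripheral2]


def handle_interrupt(index_register: int):
    """Dispatch to the routine selected by the index held in the arbiter's register."""
    return INTERRUPT_TABLE[index_register]()


print(handle_interrupt(1))
```

The processor never has to poll the peripherals: the index alone selects the routine.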

    Multilevel bus architectures

    Don't want one bus for all communication

    Peripherals would need high-speed, processor-specific bus interface

    excess gates, power consumption, and cost; less portable

    Too many peripherals slow down bus

    Processor-local bus

    High speed, wide, most frequent communication

    Connects microprocessor, cache, memory controllers, etc.

    Peripheral bus

    Lower speed, narrower, less frequent communication

    Typically industry standard bus (ISA, PCI) for portability


    Bridge

    Single-purpose processor converts communication between busses

    Advanced communication principles

    Layering

    Break complexity of communication protocol into pieces easier to design and

    understand

    Lower levels provide services to higher level

    Lower level might work with bits while higher level might work with packets of data

    Physical layer

    Lowest level in hierarchy

    Medium to carry data from one actor (device or node) to another

    Parallel communication

    Physical layer capable of transporting multiple bits of data

    Serial communication

    Physical layer transports one bit of data at a time

    Wireless communication

    No physical connection needed for transport at physical layer


    Parallel communication

    Multiple data, control, and possibly power wires

    One bit per wire

    High data throughput with short distances

    Typically used when connecting devices on same IC or same circuit board

    Bus must be kept short

    long parallel wires result in high capacitance values, which require more time to charge/discharge

    Data misalignment between wires increases as length increases

    Higher cost, bulky

    Serial communication

    Single data wire, possibly also control and power wires

    Words transmitted one bit at a time

    Higher data throughput with long distances

    Less average capacitance, so more bits per unit of time

    Cheaper, less bulky

    More complex interfacing logic and communication protocol

    Sender needs to decompose word into bits

    Receiver needs to recompose bits into word

    Control signals often sent on same wire as data increasing protocol complexity
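The decompose and recompose steps that serial communication requires can be sketched as below. The bit order (least significant bit first) is an assumption for this example; actual protocols define their own order.

```python
def decompose(word: int, width: int):
    """Sender side: split a word into bits, least significant bit first."""
    return [(word >> position) & 1 for position in range(width)]


def recompose(bits):
    """Receiver side: rebuild the word from bits received LSB first."""
    word = 0
    for position, bit in enumerate(bits):
        word |= bit << position
    return word


bits = decompose(0xA5, 8)
print(bits)                  # one bit per time step on the single data wire
print(hex(recompose(bits)))  # the receiver recovers the original word
```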


    Wireless communication

    Infrared (IR)

    Electromagnetic wave frequencies just below visible light spectrum

    Diode emits infrared light to generate signal

    Infrared transistor detects signal, conducts when exposed to infrared light

    Cheap to build

    Need line of sight, limited range

    Radio frequency (RF)

    Electromagnetic wave frequencies in radio spectrum

    Analog circuitry and antenna needed on both sides of transmission

    Line of sight not needed, transmitter power determines range

    Error detection and correction

    Often part of bus protocol

    Error detection: ability of receiver to detect errors during transmission

    Error correction: ability of receiver and transmitter to cooperate to correct problem

    Typically done by acknowledgement/retransmission protocol

    Bit error: single bit is inverted

    Burst of bit error: consecutive bits received incorrectly

    Parity: extra bit sent with word used for error detection

    Odd parity: data word plus parity bit contains odd number of 1s

    Even parity: data word plus parity bit contains even number of 1s

    Always detects single bit errors, but not all burst bit errors

    Checksum: extra word sent with data packet of multiple words

    e.g., extra word contains XOR sum of all data words in packet
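Parity and an XOR checksum, as described above, can be sketched in a few lines. The function names are made up for this illustration, and the sample data words are arbitrary.

```python
def parity_bit(word: int, odd: bool = False) -> int:
    """Parity bit that makes the total number of 1s even (or odd if odd=True)."""
    even_bit = bin(word).count("1") % 2
    return even_bit ^ 1 if odd else even_bit


def parity_ok(word: int, bit: int, odd: bool = False) -> bool:
    """Receiver check: does data word plus parity bit have the expected parity?"""
    total_ones = bin(word).count("1") + bit
    return total_ones % 2 == (1 if odd else 0)


def xor_checksum(words) -> int:
    """Extra word formed as the XOR of all data words in the packet."""
    checksum = 0
    for word in words:
        checksum ^= word
    return checksum


sent = 0b1011
p = parity_bit(sent)                 # even parity
print(parity_ok(sent, p))            # intact word passes
print(parity_ok(sent ^ 0b0001, p))   # single-bit error is detected
print(parity_ok(sent ^ 0b0011, p))   # two-bit burst slips through
print(hex(xor_checksum([0x12, 0x34, 0x56])))
```

The third check illustrates the limitation noted above: parity always catches a single-bit error, but a burst that flips an even number of bits goes undetected.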


    Serial protocols: I2C

    I2C (Inter-IC)

    Two-wire serial bus protocol developed by Philips Semiconductors in the early 1980s

    Enables peripheral ICs to communicate using simple communication hardware

    Data transfer rates up to 100 kbits/s and 7-bit addressing possible in normal mode

    3.4 Mbits/s and 10-bit addressing in high-speed mode

    Common devices capable of interfacing to I2C bus:

    EPROMS, Flash, and some RAM memory, real-time clocks, watchdog timers,

    and microcontrollers

    I2C bus structure
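For 7-bit addressing, the first byte a master sends carries the device address together with a read/write bit. The slides mention only the address width; the byte layout used below (address in the upper seven bits, then the R/W bit with 1 meaning read) comes from the I2C specification. The function name and the device address 0x50 are chosen only for illustration.

```python
def i2c_address_byte(address_7bit: int, read: bool) -> int:
    """Build the first byte of an I2C transfer for a 7-bit device address."""
    if not 0 <= address_7bit <= 0x7F:
        raise ValueError("address must fit in 7 bits")
    return (address_7bit << 1) | (1 if read else 0)


# Hypothetical device at address 0x50.
print(hex(i2c_address_byte(0x50, read=False)))  # write transfer
print(hex(i2c_address_byte(0x50, read=True)))   # read transfer
```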


    Serial protocols: CAN

    CAN (Controller area network)

    Protocol for real-time applications

    Developed by Robert Bosch GmbH

    Originally for communication among components of cars

    Applications now using CAN include:

    elevator controllers, copiers, telescopes, production-line control systems, and

    medical instruments

    Data transfer rates up to 1 Mbit/s and 11-bit addressing

    Common devices interfacing with CAN:

    8051-compatible 8592 processor and standalone CAN controllers

    Actual physical design of CAN bus not specified in protocol

    Requires devices to transmit/detect dominant and recessive signals to/from

    bus

    e.g., 0 = dominant, 1 = recessive if single data wire used

    Bus guarantees dominant signal prevails over recessive signal if asserted

    simultaneously
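The rule that a dominant signal prevails over a recessive one can be modelled as follows. This sketch assumes the standard CAN convention in which logical 0 is dominant and logical 1 is recessive; the constant and function names are made up for the example.

```python
DOMINANT, RECESSIVE = 0, 1  # assumed convention: logical 0 is dominant


def bus_level(transmitted_bits):
    """Resulting bus level when several nodes drive the bus at once.

    A single dominant bit from any node overrides all recessive bits,
    which behaves like a wired-AND of the transmitted values.
    """
    return DOMINANT if DOMINANT in transmitted_bits else RECESSIVE


print(bus_level([RECESSIVE, DOMINANT, RECESSIVE]))  # dominant prevails
print(bus_level([RECESSIVE, RECESSIVE]))            # bus stays recessive
```

A node that sends recessive but reads back dominant knows another node is transmitting at the same time.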

    Serial protocols: FireWire

    FireWire (a.k.a. I-Link, Lynx, IEEE 1394)

    High-performance serial bus developed by Apple Computer Inc.

    Designed for interfacing independent electronic components

    e.g., Desktop, scanner

    Data transfer rates from 12.5 to 400 Mbits/s, 64-bit addressing

    Plug-and-play capabilities

    Packet-based layered design structure

    Applications using FireWire include:


    Parallel protocols: PCI Bus

    PCI Bus (Peripheral Component Interconnect)

    High performance bus originated at Intel in the early 1990s

    Standard adopted by industry and administered by PCI SIG (PCI Special Interest Group)

    Interconnects chips, expansion boards, processor memory subsystems

    Data transfer rates of 127.2 to 508.6 Mbits/s and 32-bit addressing

    Later extended to 64-bit while maintaining compatibility with 32-bit schemes

    Synchronous bus architecture

    Multiplexed data/address lines

    Parallel protocols: ARM Bus

    ARM Bus

    Designed and used internally by ARM Corporation

    Interfaces with ARM line of processors

    Many IC design companies have own bus protocol

    Data transfer rate is a function of clock speed

    If clock speed of bus is X, transfer rate = 16 x X bits/s

    32-bit addressing

    Wireless protocols: IrDA

    IrDA

    Protocol suite that supports short-range point-to-point infrared data transmission

    Created and promoted by the Infrared Data Association (IrDA)

    Data transfer rates between 9.6 kbps and 4 Mbps

    IrDA hardware deployed in notebook computers, printers, PDAs, digital cameras,

    public phones, cell phones


    Lack of suitable drivers has slowed use by applications

    Windows 2000/98 now include support

    Becoming available on popular embedded OSs

    Wireless protocols: Bluetooth

    Bluetooth

    New, global standard for wireless connectivity

    Based on low-cost, short-range radio link

    Connection established when within 10 meters of each other

    No line-of-sight required

    e.g., Connect to printer in another room

    Wireless protocols: IEEE 802.11

    IEEE 802.11

    Proposed standard for wireless LANs

    Specifies parameters for PHY and MAC layers of network

    PHY layer

    physical layer

    handles transmission of data between nodes

    provisions for data transfer rates of 1 or 2 Mbps

    operates in 2.4 to 2.4835 GHz frequency band (RF)

    or 300 to 428,000 GHz (IR)

    MAC layer

    medium access control layer

    protocol responsible for maintaining order in shared medium

    collision avoidance/detection


    Chapter Summary

    Basic protocol concepts

    Actors, direction, time multiplexing, control methods

    General-purpose processors

    Port-based or bus-based I/O

    I/O addressing: Memory mapped I/O or Standard I/O

    Interrupt handling: fixed or vectored

    Direct memory access

    Arbitration

    Priority arbiter (fixed/rotating) or daisy chain

    Bus hierarchy

    Advanced communication

    Parallel vs. serial, wires vs. wireless, error detection/correction, layering

    Serial protocols: I2C, CAN, FireWire, and USB; Parallel: PCI and ARM.

    Serial wireless protocols: IrDA, Bluetooth, and IEEE 802.11.

    Intel 8259 programmable priority controller

    Signal              Description

    D[7..0]             These wires are connected to the system bus and are used by the microprocessor to write or read the internal registers of the 8259.

    A[0..0]             This pin acts in conjunction with the WR/RD signals. It is used by the 8259 to decipher the various command words the microprocessor writes and the status the microprocessor wishes to read.

    WR                  When this write signal is asserted, the 8259 accepts the command on the data lines, i.e., the microprocessor writes to the 8259 by placing a command on the data lines and asserting this signal.

    RD                  When this read signal is asserted, the 8259 provides its status on the data lines, i.e., the microprocessor reads the status of the 8259 by asserting this signal and reading the data lines.

    INT                 This signal is asserted whenever a valid interrupt request is received by the 8259, i.e., it is used to interrupt the microprocessor.

    INTA                This signal is used to enable 8259 interrupt-vector data onto the data bus by a sequence of interrupt acknowledge pulses issued by the microprocessor.

    IR 0,1,2,3,4,5,6,7  An interrupt request is executed by a peripheral device when one of these signals is asserted.

    CAS[2..0]           These are cascade signals to enable multiple 8259 chips to be chained together.

    SP/EN               This function is used in conjunction with the CAS signals for cascading purposes.


    Intel 8237 DMA controller

    Signal       Description

    D[7..0]      These wires are connected to the system bus (ISA) and are used by the microprocessor to write to the internal registers of the 8237.

    A[19..0]     These wires are connected to the system bus (ISA) and are used by the DMA to issue the memory location where the transferred data is to be written to. The 8237 is

    ALE*         This is the address latch enable signal. The 8237 uses this signal when driving the system bus (ISA).

    MEMR*        This is the memory read signal issued by the 8237 when driving the system bus (ISA).

    MEMW*        This is the memory write signal issued by the 8237 when driving the system bus (ISA).

    IOR*         This is the I/O device read signal issued by the 8237 when driving the system bus (ISA) in order to read a byte from an I/O device.

    IOW*         This is the I/O device write signal issued by the 8237 when driving the system bus (ISA) in order to write a byte to an I/O device.

    HLDA         This signal (hold acknowledge) is asserted by the microprocessor to signal that it has relinquished the system bus (ISA).

    HRQ          This signal (hold request) is asserted by the 8237 to signal to the microprocessor a request to relinquish the system bus (ISA).

    REQ 0,1,2,3  A device attached to one of these channels asserts this signal to request a DMA transfer.

    ACK 0,1,2,3  The 8237 asserts this signal to grant a DMA transfer to a device attached to one of these channels.

    *See the ISA bus description in this chapter for complete details.