embedded slides
TRANSCRIPT
-
8/3/2019 Embedded Slides
1/28
ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION
Lesson-17: Memory organisation, and types of memory
1. Memory Organisation
Random access model
In memory, a data byte, a word, a double word, or a quad word may be accessed at any addressable
location. The same process is used to access every location, and the access time for a read or
for a write is equal and independent of the memory address. This model differs from another
model called the serial access model.
Addresses
Memory (both RAM and ROM) is divided into a set of storage locations, each of which can hold 1
byte (8 bits) of data.
The storage locations are numbered, and the number of a storage location (called its address) is
used to tell the memory system which location the processor wants to reference.
An important characteristic of a computer system is the width of the addresses it uses, which
limits the amount of memory that the processor can address. Most current computers use
either 32-bit or 64-bit addresses, allowing them to access either 2^32 or 2^64 bytes of memory.
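A minimal sketch of the relation between address width and addressable memory, assuming byte-addressable memory as described above. The function name addressable_bytes is illustrative only.

```python
def addressable_bytes(address_width_bits: int) -> int:
    """Number of distinct byte locations reachable with the given address width."""
    return 2 ** address_width_bits


# Print the limit for a few common address widths.
for width in (16, 32, 64):
    print(f"{width}-bit addresses -> {addressable_bytes(width)} bytes")
```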
RANDOM ACCESS MODEL OF MEMORY
Simple model for RAM and ROM
Both follow the random-access model of memory
All memory operations take the same amount of time, independent of the address of the byte or
word in memory
Example
Assume that the memory system supports two operations: load (read operation into the
processor from memory) and store (write operation from the processor into memory).
A load from one set of addresses (2 or 4) takes the same time as a store to another set of
addresses (2 or 4).
ROM
Contents of the read-only memory cannot be modified by the computer but may be read.
A system has ROM unit(s) for bootstrap program(s), basic input-output system (BIOS)
program(s) and for vector addresses for the interrupts.
Used to hold bootstrap program that is executed automatically by the system every time it is
turned on or reset. Instructs the system to load its operating system off
ROM image
ROM image holds the programs, operating system, and data required by the system
Random-access memory (RAM)
Can be both read and written.
Holds the programs, operating system, and data required by the system.
Generally volatile, meaning that it does not retain the data stored in it when the system's
power is turned off.
Data that needs to be stored while the system is off must be written to a permanent storage
device, such as a flash memory or hard disk.
An example is as follows: A mobile phone has 128 kB or 256 kB of RAM to hold the stack and
temporary variables of the programs, operating system, and data.
ALIGNMENT OF MULTIBYTE STORE AND LOAD IN A MEMORY ORGANISATION
Some memory organisations require loads and stores to be "aligned". A 4-byte word is aligned when its
address, such as 0x000C or 0x1000, is a multiple of 4. This simplifies the organisation of the memory
system.
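The alignment rule can be sketched as a simple check, assuming an access is aligned when its address is a multiple of its size in bytes. The function name is_aligned is illustrative only.

```python
def is_aligned(address: int, size_bytes: int) -> bool:
    """True when the address is a multiple of the access size."""
    return address % size_bytes == 0


# The two addresses from the text are 4-byte aligned; 0x1002 is not.
for addr in (0x000C, 0x1000, 0x1002):
    print(hex(addr), is_aligned(addr, 4))
```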
LITTLE ENDIAN AND BIG ENDIAN IN A MEMORY ORGANISATION
Some processor and memory organisations require little-endian, and others big-endian, ordering of
multiple bytes when there is a store into the memory or a load into the processor from memory.
The ARM processor permits programming at the start and enables a programmer to select either
little-endian or big-endian byte ordering at the beginning.
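A short illustration of the two byte orders, using Python's standard struct module to lay out one 32-bit value both ways. The value 0x12345678 is an arbitrary example.

```python
import struct

value = 0x12345678

# "<I" packs an unsigned 32-bit integer little-endian (least significant byte first).
little = struct.pack("<I", value)
# ">I" packs the same integer big-endian (most significant byte first).
big = struct.pack(">I", value)

print("little-endian bytes:", little.hex())
print("big-endian bytes:   ", big.hex())
```

Reading the bytes back with the wrong order yields a different number, which is why both sides of a store/load must agree on the ordering.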
Princeton Architecture
80x86 processors and ARM7 have Princeton architecture for main memory (8051-family
microcontrollers have Harvard architecture). Vectors and pointers, variables, program segments
and memory blocks for data and stacks have different addresses in the program in Princeton
memory architecture.
Harvard architecture
When the address spaces for the data and for program are distinct
Suited to handling streams of data that must be accessed by single-instruction multiple-data
(SIMD) instructions and DSP instructions.
Separate data buses ensure simultaneous accesses for instructions and data.
Harvard and Princeton Memory Organizations
2. Types of Memory
Most systems have two types of memory: read-only memory (ROM) and random-access memory
(RAM).
A computer system has ROM unit(s) for bootstrap program(s), basic input-output system (BIOS)
program(s) and for vector addresses for the interrupts.
An embedded system has ROM unit(s) for storing the ROM image and flash to save non-volatile data
and results.
ROM Uses
Language-specific bits for the fonts corresponding to each character sent to a printer or display unit.
Image bits for a display.
Pictogram bytes for the full bit-image corresponding to the pixels of a pictogram. Sequential changes at the inputs of the display unit repeatedly generate the full pictogram.
In a CISC processor, as a control ROM at a micro-programmed unit for implementing instructions.
1) Masked ROM: Used for large-scale manufacturing; a mask is prepared for the foundry from
a finalised ROM image of the system program and data, pictograms, image pixels, pixels for the fonts of
a language, or combination circuits implementing a truth table.
2) EPROM: Used in place of masked ROM during the development phase; UV erasable and electrically
programmable by a device programmer.
3) EEPROM (E2PROM): Used during the program run to save non-volatile data and results (for example, date
and time of a transaction, present port status, port driving history, system malfunction
history); electrically erasable by writing a byte or a set of bytes with all 1s, and electrically
programmable during a program run, one byte per write instance.
4) Flash: A flash memory functions as the ROM. Electrically erasable one sector of 16 kB to 256 kB at
an instance, and electrically programmable one byte at each instance during a program run.
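The erase-then-program behaviour of flash can be sketched with a toy model. It assumes the usual property that an erase sets every bit of a sector to 1 and that programming can only change bits from 1 to 0. The class FlashSector and its method names are hypothetical, not a real driver API.

```python
class FlashSector:
    """Toy model of one flash sector (hypothetical, for illustration only)."""

    def __init__(self, size_bytes: int):
        # An erased sector reads back as all 1s (0xFF in every byte).
        self.cells = bytearray([0xFF] * size_bytes)

    def erase(self) -> None:
        """Erase the whole sector at once: every byte returns to 0xFF."""
        for i in range(len(self.cells)):
            self.cells[i] = 0xFF

    def program_byte(self, offset: int, value: int) -> None:
        """Program one byte: bits can only be cleared, modelled as a bitwise AND."""
        self.cells[offset] &= value


sector = FlashSector(16)
sector.program_byte(0, 0xA5)
print(hex(sector.cells[0]))
```

In this model, rewriting a byte with new data without an intervening sector erase generally leaves a wrong value, which is why flash is erased a sector at a time before being reprogrammed.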
RAM
The RAM can be both read and written, and is used to hold the programs, operating system,
and data required by a computer system. In embedded systems, it holds the stack and
temporary variables of the programs, operating system, and data.
RAM Characteristics
RAM is generally volatile: it does not retain the data stored in it when the system's power is turned off.
Any data that needs to be stored while the system is off must be written to a permanent storage
device, such as a flash memory or hard disk.
Example: A mobile phone has 128 kB or 256 kB of RAM to hold the stack and temporary variables of the
programs, operating system, and data.
RAM Types
1) SRAM (static RAM) and DRAM (dynamic RAM): Used for saving the variables, stacks, process
control blocks, input buffer, output buffer, and the decompressed format of the program and data
held in the ROM image.
2) EDO (Extended Data Out) RAM: Used up to 100 MHz clock rate; zero wait state between two
fetches; single-cycle read or write.
3) SDRAM (Synchronous DRAM): Synchronised read operation; keeps the next word ready while the
previous one is being fetched; used up to 1 GHz clock rate.
4) RDRAM (Rambus* DRAM): Burst accesses of four successive words in a single fetch; used for
1 GHz+ performance of the system.
* A developer company name
5) Parameterised Distributed RAM: When slow bus accesses exist, RAM is distributed for the specific
tasks of the system and devices, for example for fast I/O buffers and fast stacks.
6) Parameterised Block RAM: A specific block dedicated to a specific use, for example for DCT operations.
Summary
We learnt
Random access memory model, ROM, RAM
Addresses
Data alignment
Little and big endian
Flash
Princeton and Harvard architectures
ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION
Lesson-16: Processor organisation and Performance Metrics
1. Processor, Memory and buses
Processor Organisation
Processor
ALU.
The processor circuit performs sequential operations, and a clock guides these.
Program counter and stack pointer, which point to the instruction to be fetched and to the top of the
data pushed onto the stack, respectively.
Certain processors have an on-chip memory management unit (MMU).
Registers
General-purpose registers.
Registers are organised onto a common internal bus of the processor. A register is 32, 16 or 8 bits
wide, depending on whether the ALU performs a 32-, 16- or 8-bit operation at an instance.
CISC
A processor may have a CISC (Complex Instruction Set Computer) or RISC (Reduced Instruction Set
Computer) architecture, and the choice may affect the system design.
A CISC has the ability to process complex instructions and complex data sets with fewer registers, as it
provides a large number of addressing modes.
RISC
Simpler instructions, all executing in a single cycle per instruction.
New RISC processors, such as ARM7 and ARM9, also provide a few of the most useful CISC
instructions.
A CISC converges to a RISC implementation because most instructions are hardwired and
execute in a single clock cycle.
Interrupts
Processor provides for the inputs for external interrupts so that the external circuits can send
the interrupt signals
May possess an internal interrupt controller (handler) to program the service routine priorities
and to allocate vector addresses.
DMA (Direct Memory Access) Controller
External devices can directly read from and write into blocks of RAM using the DMA controller
when the buses are not in use by the processor.
Multiple DMA channels on chip.
When there are a number of I/O devices and an I/O device needs fast access to a multi-byte data set
in system memory, an on-chip DMA controller helps greatly.
INSTRUCTION LEVEL PARALLELISM
Execute several instructions in parallel. Two or more instructions execute in parallel as well as in
a pipeline.
Example: two parallel pipelines in a processor, with two instructions In and In+1
executing in parallel at separate execution units.
3. Processor Performance Metrics
Metrics
1) MIPS Million Instructions Per Second
2) MFLOPS Million Floating Point Operations Per Second
3) Dhrystone/s Number of times a benchmark program called the Dhrystone program can run per
second. [1 MIPS = 1757 Dhrystone/s]
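The conversion factor quoted above can be applied as follows. This is a minimal sketch; the function name dhrystone_mips and the sample figure are illustrative.

```python
DHRYSTONES_PER_MIPS = 1757  # conversion factor quoted in the text


def dhrystone_mips(dhrystones_per_second: float) -> float:
    """Convert a Dhrystone/s score into MIPS using 1 MIPS = 1757 Dhrystone/s."""
    return dhrystones_per_second / DHRYSTONES_PER_MIPS


# A processor that runs the benchmark 175,700 times per second:
print(dhrystone_mips(175_700))
```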
Embedded Microprocessor Benchmark Consortium (EEMBC) five benchmark program suites
Telecommunications
Consumer Electronics
Automotive and Industrial Electronics
Networking
Office Automation.
Summary
We learnt
Processor, address, data and control buses and Memory
CISC and RISC
Instruction Level Parallelism
Performance Metrics
ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION
Lesson-18: Memory Allocations and Memory Map
1. Memory Allocation To Program Segments and Blocks
Functions, Processes, Data and Stacks at the Various Segments of Memory
Segment-wise memory allocation in four segments: Code, Data, Stack and Extra (for example, image,
string)
Segments and Paging at the Memory
4) Table
5) Look-up Table: The first column of a look-up-table row points to another memory block holding
a data structure
6) List: In a list element, the data structure of an item also points to the next item
7) Process Control Block [Refer Chapter 7, Lesson 1]
Memory Map
Map to show the allocation of program and data addresses to ROM, RAM, EEPROM or Flash in
the system.
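A memory map can be thought of as a table of address ranges. The sketch below uses made-up ranges purely for illustration; they are assumptions and are not taken from any of the smart-card maps in these slides.

```python
# Hypothetical memory map: (region name, first address, last address).
MEMORY_MAP = [
    ("ROM", 0x0000, 0x3FFF),
    ("RAM", 0x4000, 0x47FF),
    ("EEPROM", 0x8000, 0x87FF),
]


def region_of(address: int):
    """Return the name of the region containing the address, or None if unmapped."""
    for name, first, last in MEMORY_MAP:
        if first <= address <= last:
            return name
    return None


print(region_of(0x4010))
```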
Memory map for an exemplary embedded system, smart card needing 2 kB memory
Memory map for an exemplary Java embedded card with software for encrypting and deciphering
the transactions
Memory map sections in a smart card
Memory map sections in another smart card
Summary
We learnt
Allocations to various Segments and data structures and the memory map of Exemplary cases
Arbitration: Priority arbiter
Consider the situation where multiple peripherals request service from single resource (e.g.,
microprocessor, DMA controller) simultaneously - which gets serviced first?
Priority arbiter
Single-purpose processor
Peripherals make requests to arbiter, arbiter makes requests to resource
Arbiter connected to system bus for configuration only
Arbitration: Daisy-chain arbitration
Arbitration done by peripherals
Built into peripheral or external logic added
req input and ack output added to each peripheral
Peripherals connected to each other in daisy-chain manner
One peripheral connected to resource, all others connected upstream
A peripheral's req flows downstream to the resource; the resource's ack flows upstream to the
requesting peripheral
Closest peripheral has highest priority
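The priority behaviour of a daisy chain can be sketched as below, assuming the ack signal travels upstream from the resource and is kept by the first requesting peripheral it reaches. The function name daisy_chain_grant is illustrative.

```python
def daisy_chain_grant(requests):
    """Return the chain position that receives the ack, or None if nobody requests.

    requests[0] is the peripheral connected directly to the resource;
    higher indices are further upstream.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            # This peripheral keeps the ack and does not pass it upstream.
            return position
    return None


# Peripherals at positions 1 and 2 both request; the closer one wins.
print(daisy_chain_grant([False, True, True]))
```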
Pros/cons
Easy to add/remove peripheral - no system redesign needed
Does not support rotating priority
One broken peripheral can cause loss of access to other peripherals
Network-oriented arbitration
When multiple microprocessors share a bus (sometimes called a network)
Arbitration typically built into bus protocol
Separate processors may try to write simultaneously causing collisions
Data must be resent
Don't want to start sending again at same time
statistical methods can be used to reduce chances
Typically used for connecting multiple distant chips
Trend: use to connect multiple on-chip processors
Example: Vectored interrupt using an interrupt table
Fixed priority: i.e., Peripheral1 has highest priority
Keyword _at_ followed by memory address forces compiler to place variables in specific
memory locations
e.g., memory-mapped registers in arbiter, peripherals
A peripheral's index into interrupt table is sent to memory-mapped register in arbiter
Peripherals receive external data and raise interrupt
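The table-based dispatch with fixed priority can be simulated in a few lines. This is only a sketch: the handler names, the log list and the function service_interrupt are hypothetical, and real hardware would place the table and registers at fixed memory addresses.

```python
log = []


def peripheral1_isr():
    log.append("peripheral1 serviced")


def peripheral2_isr():
    log.append("peripheral2 serviced")


# Interrupt table: index 0 belongs to Peripheral1, which has the highest priority.
INTERRUPT_TABLE = [peripheral1_isr, peripheral2_isr]


def service_interrupt(pending):
    """Service the highest-priority pending interrupt.

    pending is a set of table indices whose peripherals have raised an interrupt.
    Returns the index that was serviced, or None when nothing is pending.
    """
    if not pending:
        return None
    index = min(pending)       # fixed priority: lowest index wins
    INTERRUPT_TABLE[index]()   # jump through the table to the service routine
    return index
```

Calling service_interrupt({0, 1}) runs the Peripheral1 routine first; Peripheral2 is serviced on a later call once index 0 is no longer pending.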
Multilevel bus architectures
Don't want one bus for all communication
Peripherals would need high-speed, processor-specific bus interface
excess gates, power consumption, and cost; less portable
Too many peripherals slow down bus
Processor-local bus
High speed, wide, most frequent communication
Connects microprocessor, cache, memory controllers, etc.
Peripheral bus
Lower speed, narrower, less frequent communication
Typically industry standard bus (ISA, PCI) for portability
Bridge
Single-purpose processor converts communication between busses
Advanced communication principles
Layering
Break complexity of communication protocol into pieces easier to design and
understand
Lower levels provide services to higher level
Lower level might work with bits while higher level might work with packets of data
Physical layer
Lowest level in hierarchy
Medium to carry data from one actor (device or node) to another
Parallel communication
Physical layer capable of transporting multiple bits of data
Serial communication
Physical layer transports one bit of data at a time
Wireless communication
No physical connection needed for transport at physical layer
Parallel communication
Multiple data, control, and possibly power wires
One bit per wire
High data throughput with short distances
Typically used when connecting devices on same IC or same circuit board
Bus must be kept short
long parallel wires result in high capacitance values, which require more time
to charge/discharge
Data misalignment between wires increases as length increases
Higher cost, bulky
Serial communication
Single data wire, possibly also control and power wires
Words transmitted one bit at a time
Higher data throughput with long distances
Less average capacitance, so more bits per unit of time
Cheaper, less bulky
More complex interfacing logic and communication protocol
Sender needs to decompose word into bits
Receiver needs to recompose bits into word
Control signals often sent on same wire as data increasing protocol complexity
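The decompose/recompose steps on the sender and receiver side can be sketched as follows, assuming an 8-bit word sent least significant bit first. The bit order and the function names are assumptions for illustration; real protocols define their own order.

```python
def decompose(word: int, width: int = 8):
    """Sender side: split a word into a list of bits, least significant bit first."""
    return [(word >> i) & 1 for i in range(width)]


def recompose(bits):
    """Receiver side: rebuild the word from bits received least significant bit first."""
    word = 0
    for i, bit in enumerate(bits):
        word |= bit << i
    return word


bits = decompose(0xA7)
print(bits, hex(recompose(bits)))
```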
Wireless communication
Infrared (IR)
Electromagnetic wave frequencies just below visible light spectrum
Diode emits infrared light to generate signal
Infrared transistor detects signal, conducts when exposed to infrared light
Cheap to build
Need line of sight, limited range
Radio frequency (RF)
Electromagnetic wave frequencies in radio spectrum
Analog circuitry and antenna needed on both sides of transmission
Line of sight not needed, transmitter power determines range
Error detection and correction
Often part of bus protocol
Error detection: ability of receiver to detect errors during transmission
Error correction: ability of receiver and transmitter to cooperate to correct problem
Typically done by acknowledgement/retransmission protocol
Bit error: single bit is inverted
Burst bit error: consecutive bits received incorrectly
Parity: extra bit sent with word used for error detection
Odd parity: data word plus parity bit contains odd number of 1s
Even parity: data word plus parity bit contains even number of 1s
Always detects single bit errors, but not all burst bit errors
Checksum: extra word sent with data packet of multiple words
e.g., extra word contains XOR sum of all data words in packet
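Parity and the XOR checksum described above can be sketched as below. The function names are illustrative, and the example also shows why a two-bit burst can slip past a parity check.

```python
def parity_bit(word: int, odd: bool = False) -> int:
    """Parity bit that makes the total number of 1s even (or odd when odd=True)."""
    ones = bin(word).count("1")
    even_bit = ones % 2
    return 1 - even_bit if odd else even_bit


def parity_ok(word: int, bit: int, odd: bool = False) -> bool:
    """Receiver check: does data word plus parity bit have the expected number of 1s?"""
    total = bin(word).count("1") + bit
    return total % 2 == (1 if odd else 0)


def xor_checksum(words) -> int:
    """Extra word for a packet: XOR of all data words in the packet."""
    checksum = 0
    for w in words:
        checksum ^= w
    return checksum


sent = 0b1011
p = parity_bit(sent)                 # even parity
print(parity_ok(0b1010, p))          # one bit flipped: error detected
print(parity_ok(0b1000, p))          # two bits flipped: error missed
print(hex(xor_checksum([0x12, 0x34, 0x56])))
```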
Serial protocols: I2C
I2C (Inter-IC)
Two-wire serial bus protocol developed by Philips Semiconductors nearly 20 years ago
Enables peripheral ICs to communicate using simple communication hardware
Data transfer rates up to 100 kbits/s and 7-bit addressing possible in normal mode
3.4 Mbits/s and 10-bit addressing in high-speed mode
Common devices capable of interfacing to I2C bus:
EPROMS, Flash, and some RAM memory, real-time clocks, watchdog timers,
and microcontrollers
I2C bus structure
Serial protocols: CAN
CAN (Controller area network)
Protocol for real-time applications
Developed by Robert Bosch GmbH
Originally for communication among components of cars
Applications now using CAN include:
elevator controllers, copiers, telescopes, production-line control systems, and
medical instruments
Data transfer rates up to 1 Mbit/s and 11-bit addressing
Common devices interfacing with CAN:
8051-compatible 8592 processor and standalone CAN controllers
Actual physical design of CAN bus not specified in protocol
Requires devices to transmit/detect dominant and recessive signals to/from
bus
e.g., 1 = dominant, 0 = recessive if single data wire used
Bus guarantees dominant signal prevails over recessive signal if asserted
simultaneously
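The dominant-over-recessive rule can be modelled as a one-line bus resolution. The sketch follows the slide's example encoding (1 = dominant) by default; the dominant level is a parameter because which logic level counts as dominant depends on the physical design, and the function name bus_level is illustrative.

```python
def bus_level(driven_levels, dominant: int = 1) -> int:
    """Resolve the bus when several nodes drive it at the same time.

    The bus carries the dominant level if any node asserts it,
    otherwise it stays at the recessive level.
    """
    recessive = 1 - dominant
    return dominant if dominant in driven_levels else recessive


# Three nodes transmit at once; one of them sends the dominant level.
print(bus_level([0, 1, 0]))
```

A node that sends a recessive bit but reads back a dominant one knows another node is transmitting at the same time.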
Serial protocols: FireWire
FireWire (a.k.a. I-Link, Lynx, IEEE1394)
High-performance serial bus developed by Apple Computer Inc.
Designed for interfacing independent electronic components
e.g., Desktop, scanner
Data transfer rates from 12.5 to 400 Mbits/s, 64-bit addressing
Plug-and-play capabilities
Packet-based layered design structure
Applications using FireWire include:
Parallel protocols: PCI Bus
PCI Bus (Peripheral Component Interconnect)
High performance bus originated at Intel in the early 1990s
Standard adopted by industry and administered by PCISIG (PCI Special Interest Group)
Interconnects chips, expansion boards, processor memory subsystems
Data transfer rates of 127.2 to 508.6 Mbytes/s and 32-bit addressing
Later extended to 64-bit while maintaining compatibility with 32-bit schemes
Synchronous bus architecture
Multiplexed data/address lines
Parallel protocols: ARM Bus
ARM Bus
Designed and used internally by ARM Corporation
Interfaces with ARM line of processors
Many IC design companies have own bus protocol
Data transfer rate is a function of clock speed
If clock speed of bus is X, transfer rate = 16 x X bits/s
32-bit addressing
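The transfer-rate rule above is a direct multiplication. A minimal sketch, where the function name and the 50 MHz sample clock are assumptions for illustration:

```python
def arm_bus_transfer_rate(clock_hz: int) -> int:
    """Transfer rate in bits/s for a bus clock of clock_hz, using rate = 16 x X."""
    return 16 * clock_hz


# Example with an assumed 50 MHz bus clock.
print(arm_bus_transfer_rate(50_000_000), "bits/s")
```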
Wireless protocols: IrDA
IrDA
Protocol suite that supports short-range point-to-point infrared data transmission
Created and promoted by the Infrared Data Association (IrDA)
Data transfer rates between 9.6 kbps and 4 Mbps
IrDA hardware deployed in notebook computers, printers, PDAs, digital cameras,
public phones, cell phones
Lack of suitable drivers has slowed use by applications
Windows 2000/98 now include support
Becoming available on popular embedded OSs
Wireless protocols: Bluetooth
Bluetooth
New, global standard for wireless connectivity
Based on low-cost, short-range radio link
Connection established when within 10 meters of each other
No line-of-sight required
e.g., Connect to printer in another room
Wireless Protocols: IEEE 802.11
IEEE 802.11
Proposed standard for wireless LANs
Specifies parameters for PHY and MAC layers of network
PHY layer
physical layer
handles transmission of data between nodes
provisions for data transfer rates of 1 or 2 Mbps
operates in 2.4 to 2.4835 GHz frequency band (RF)
or 300 to 428,000 GHz (IR)
MAC layer
medium access control layer
protocol responsible for maintaining order in shared medium
collision avoidance/detection
Chapter Summary
Basic protocol concepts
Actors, direction, time multiplexing, control methods
General-purpose processors
Port-based or bus-based I/O
I/O addressing: Memory mapped I/O or Standard I/O
Interrupt handling: fixed or vectored
Direct memory access
Arbitration
Priority arbiter (fixed/rotating) or daisy chain
Bus hierarchy
Advanced communication
Parallel vs. serial, wires vs. wireless, error detection/correction, layering
Serial protocols: I2C, CAN, FireWire, and USB; Parallel: PCI and ARM.
Serial wireless protocols: IrDA, Bluetooth, and IEEE 802.11.
Intel 8259 programmable priority controller
Signal Description
D[7..0] These wires are connected to the system bus and are used by the microprocessor to write or read the internal registers of the 8259.
A[0..0] This pin acts in conjunction with the WR/RD signals. It is used by the 8259 to decipher various command words the microprocessor writes and status the microprocessor wishes to read.
WR When this write signal is asserted, the 8259 accepts the command on the data lines, i.e., the microprocessor writes to the 8259 by placing a command on the data lines and asserting this signal.
RD When this read signal is asserted, the 8259 provides its status on the data lines, i.e., the microprocessor reads the status of the 8259 by asserting this signal and reading the data lines.
INT This signal is asserted whenever a valid interrupt request is received by the 8259, i.e., it is used to interrupt the microprocessor.
INTA This signal is used to enable 8259 interrupt-vector data onto the data bus by a sequence of interrupt acknowledge pulses issued by the microprocessor.
IR 0,1,2,3,4,5,6,7 An interrupt request is executed by a peripheral device when one of these signals is asserted.
CAS[2..0] These are cascade signals to enable multiple 8259 chips to be chained together.
SP/EN This function is used in conjunction with the CAS signals for cascading purposes.
Intel 8237 DMA controller
Signal Description
D[7..0] These wires are connected to the system bus (ISA) and are used by the microprocessor to write to the internal registers of the 8237.
A[19..0] These wires are connected to the system bus (ISA) and are used by the DMA to issue the memory location where the transferred data is to be written to. The 8237 is
ALE* This is the address latch enable signal. The 8237 uses this signal when driving the system bus (ISA).
MEMR* This is the memory read signal issued by the 8237 when driving the system bus (ISA).
MEMW* This is the memory write signal issued by the 8237 when driving the system bus (ISA).
IOR* This is the I/O device read signal issued by the 8237 when driving the system bus (ISA) in order to read a byte from an I/O device.
IOW* This is the I/O device write signal issued by the 8237 when driving the system bus (ISA) in order to write a byte to an I/O device.
HLDA This signal (hold acknowledge) is asserted by the microprocessor to signal that it has relinquished the system bus (ISA).
HRQ This signal (hold request) is asserted by the 8237 to signal to the microprocessor a request to relinquish the system bus (ISA).
REQ 0,1,2,3 A device attached to one of these channels asserts this signal to request a DMA transfer.
ACK 0,1,2,3 The 8237 asserts this signal to grant a DMA transfer to a device attached to one of these channels.
*See the ISA bus description in this chapter for complete details.