Bus-Based Computer Systems
Busses.
Memory devices.
I/O devices.
Designing with microprocessors.
Development and debugging.
System-level performance analysis.
04/19/23
The CPU bus
Bus allows CPU, memory, and devices to communicate.
Shared communication medium.
A bus is:
A set of wires.
A communications protocol.
Protocols are specified by state machines.
May contain asynchronous logic behavior.
Four-cycle handshake protocol
[Figure: timing diagram of the enq and ack signals between device 1 and device 2 over time, with handshake steps 1 through 4 marked.]
Four-cycle handshake, cont’d.
1. Device 1 raises enq.
2. Device 2 responds with ack.
3. Device 2 lowers ack once it has finished.
4. Device 1 lowers enq.
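The four steps can be sketched as a small trace generator. This is only a toy model: the signal names enq and ack come from the slides, while the function name and the tuple layout are assumptions made for illustration.

```python
# Toy model of the four-cycle handshake between two devices.
# enq and ack are the slide's signal names; the rest is illustrative.

def four_cycle_handshake():
    """Return the (step, enq, ack) trace of one complete handshake."""
    enq, ack = 0, 0
    trace = []

    enq = 1                      # 1. Device 1 raises enq.
    trace.append((1, enq, ack))

    ack = 1                      # 2. Device 2 responds with ack.
    trace.append((2, enq, ack))

    ack = 0                      # 3. Device 2 lowers ack once it has finished.
    trace.append((3, enq, ack))

    enq = 0                      # 4. Device 1 lowers enq.
    trace.append((4, enq, ack))
    return trace


if __name__ == "__main__":
    for step, enq, ack in four_cycle_handshake():
        print(f"step {step}: enq={enq} ack={ack}")
```

Both signals end low, so the next handshake starts from the same idle state.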
Microprocessor busses
Bus read and write
State diagrams for bus read
[Figure: CPU and device state machines for a bus read. State and transition labels include Get data, Done, Adrs, Wait, See ack, Send data, Release ack, Ack, and start.]
Bus wait state
Bus burst read
Disconnected transfers
The request and response are separate.
A first operation requests the transfer.
The bus can then be used for other operations.
When the data are ready, the transfer is completed later.
Small data bundles
Small data bundles reduce the cost of the chip.
How is the full data or address formed? It is assembled inside the CPU’s bus logic before being presented to the CPU proper.
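A minimal sketch of that assembly step, assuming an 8-bit bus that delivers the least significant bundle first. The function name, the bundle width, and the byte order are all assumptions for illustration; a real bus defines its own ordering.

```python
def assemble_word(bundles, bundle_bits=8):
    """Combine narrow bus transfers (least significant first) into one wide value."""
    word = 0
    for i, bundle in enumerate(bundles):
        if not 0 <= bundle < (1 << bundle_bits):
            raise ValueError("bundle does not fit the assumed bus width")
        word |= bundle << (i * bundle_bits)   # place each bundle in its slot
    return word


if __name__ == "__main__":
    # Four 8-bit transfers form one 32-bit value.
    print(hex(assemble_word([0x78, 0x56, 0x34, 0x12])))
```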
Bus multiplexing
[Figure: CPU and device connected by shared lines that carry either adrs or data, selected by the data enable and adrs enable signals.]
DMA (direct memory access)
Direct memory access (DMA) performs data transfers without executing instructions.
CPU sets up transfer.
DMA engine fetches, writes.
DMA controller is a separate unit.
DMA (direct memory access)
Bus mastership
By default, CPU is bus master and initiates transfers.
DMA controller must become bus master to perform its work.
CPU can’t use bus while DMA operates.
Bus mastership protocol (a four-cycle handshake):
Bus request.
Bus grant.
DMA operation
CPU sets DMA registers for start address, length.
DMA status register controls the unit.
Once DMA is bus master, it transfers automatically.
May run continuously until complete.
May use every nth bus cycle.
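The operation above can be sketched as a toy simulation. The register names (src, dst, length, enabled) and the memory-to-memory copy are assumptions for illustration, not the register map of any real DMA controller; bus arbitration is reduced to "the DMA gets every nth cycle".

```python
class DMAController:
    """Toy DMA engine; register names are hypothetical."""

    def __init__(self, memory):
        self.memory = memory
        self.src = 0          # start address of the source block
        self.dst = 0          # start address of the destination block
        self.length = 0       # number of words left to transfer
        self.enabled = False  # status/control bit

    def start(self, src, dst, length):
        """CPU side: program the registers and enable the unit."""
        self.src, self.dst, self.length = src, dst, length
        self.enabled = length > 0

    def bus_cycle(self):
        """DMA side: move one word while the unit is enabled and owns the bus."""
        if not self.enabled:
            return False
        self.memory[self.dst] = self.memory[self.src]
        self.src += 1
        self.dst += 1
        self.length -= 1
        if self.length == 0:
            self.enabled = False   # transfer complete
        return True


def run(dma, every_nth=1, max_cycles=1000):
    """Give the DMA every nth bus cycle; return the cycle on which it finished."""
    for cycle in range(1, max_cycles + 1):
        if cycle % every_nth == 0:
            dma.bus_cycle()
        if not dma.enabled:
            return cycle
    return max_cycles
```

With every_nth=1 the transfer runs continuously until complete; larger values leave the remaining cycles to the CPU at the cost of a longer transfer.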
System bus configurations
Multiple busses allow parallelism:
Slow devices on one bus.
Fast devices on separate bus.
A bridge connects two busses.
[Figure: CPU, memory, and a high-speed device on one bus; a bridge connects it to a second bus that carries the slow devices.]
Bridge state diagram
ARM AMBA bus
Two varieties:
AHB is high-performance.
APB is lower-speed, lower cost.
AHB supports pipelining, burst transfers, split transactions, multiple bus masters.
All devices are slaves on APB.
ARM AMBA bus
Memory components
Each type of memory comes in varying:
Capacities.
Widths.
For a memory of a given size, there are several versions. For example, a 256 Mb memory is available as 64M x 4 or 32M x 8.
I/O devices
Keyboard.
LED.
Display.
7-segment LCD.
Touchscreen.
Timer and counter.
A/D, D/A.
Keyboard: switch debouncing
A switch must be debounced to eliminate the multiple contacts caused by mechanical bouncing.
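Debouncing can be done in hardware or in software. The sketch below shows one common software approach, offered here as an assumption rather than the method the slide's figure showed: the debounced output changes only after the raw input has held the new value for stable_count consecutive samples. The function name and the default threshold are illustrative.

```python
def debounce(samples, stable_count=3):
    """Return the debounced state after each raw 0/1 sample.

    The output flips only after the raw input has differed from it
    for stable_count consecutive samples.
    """
    state = samples[0] if samples else 0
    run = 0                      # consecutive samples that disagree with state
    out = []
    for s in samples:
        run = run + 1 if s != state else 0
        if run >= stable_count:
            state, run = s, 0    # the new value has been stable long enough
        out.append(state)
    return out


if __name__ == "__main__":
    raw = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]   # bounces at the press and once after
    print(debounce(raw))
```

The short glitches in the raw input never reach the output, so one key press is reported once.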
Encoded keyboard
An array of switches is read by an encoder.
N-key rollover remembers multiple key depressions.
LED (light-emitting diode)
Must use resistor to limit current:
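The resistor value follows from Ohm's law applied to the voltage left over after the LED's forward drop: R = (V_supply - V_forward) / I. A small sketch, where the function name and the example values (5 V supply, 2 V forward drop, 10 mA) are assumptions chosen for illustration, not figures from the slides:

```python
def led_resistor(v_supply, v_forward, i_led):
    """Series resistance in ohms from R = (V_supply - V_forward) / I."""
    if i_led <= 0:
        raise ValueError("LED current must be positive")
    if v_supply <= v_forward:
        raise ValueError("supply must exceed the LED forward voltage")
    return (v_supply - v_forward) / i_led


if __name__ == "__main__":
    # Assumed example: 5 V supply, 2 V forward drop, 10 mA target current.
    print(round(led_resistor(5.0, 2.0, 0.010)), "ohms")
```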
Display
Directly driven display: a single-digit display consists of seven segments.
Large display: uses a frame buffer.
7-segment LCD display
May use parallel or multiplexed input.
Types of high-resolution display
Liquid crystal display (LCD) is dominant form.
Plasma, organic light-emitting diode (OLED), etc.
Frame buffer holds current display contents.
Written by processor.
Read by video.
Touchscreen
Includes input and output device.
Input device is a two-dimensional voltmeter.
Touchscreen position sensing
[Figure: the voltage at the touched position is read through an ADC.]
Timers and counters
Very similar:
a timer is incremented by a periodic signal;
a counter is incremented by an asynchronous, occasional signal.
Rollover causes interrupt.
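A minimal sketch of the rollover behavior, assuming an 8-bit register. The class and attribute names are hypothetical, and the "interrupt" is modeled as a simple count. The same model serves as a timer or a counter; only the source of the increment signal differs.

```python
class TimerCounter:
    """Toy n-bit timer/counter that flags an interrupt on rollover."""

    def __init__(self, bits=8):
        self.max_value = (1 << bits) - 1
        self.count = 0
        self.interrupts = 0       # number of rollover interrupts raised

    def increment(self):
        """One pulse of the input signal (periodic for a timer, occasional for a counter)."""
        if self.count == self.max_value:
            self.count = 0        # roll over to zero
            self.interrupts += 1  # rollover causes an interrupt
        else:
            self.count += 1


if __name__ == "__main__":
    t = TimerCounter(bits=8)
    for _ in range(300):
        t.increment()
    print(t.count, t.interrupts)
```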
Watchdog timer
Watchdog timer is periodically reset by system timer.
If watchdog is not reset, it generates an interrupt to reset the host.
[Figure: host CPU and watchdog timer connected by the reset and interrupt signals.]
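The watchdog behavior can be sketched as a small simulation. The class name, the kick() method, and the tick-based timeout are assumptions for illustration; real watchdog peripherals differ in how they are serviced.

```python
class Watchdog:
    """Toy watchdog: fires if it is not reset within `timeout` ticks."""

    def __init__(self, timeout):
        self.timeout = timeout   # ticks allowed between resets
        self.count = 0

    def kick(self):
        """Periodic reset issued by a healthy system."""
        self.count = 0

    def tick(self):
        """Advance one tick; return True when the watchdog fires."""
        self.count += 1
        return self.count >= self.timeout


def simulate(timeout, kick_ticks, total_ticks):
    """Return the tick at which the watchdog fires, or None if it never does."""
    wd = Watchdog(timeout)
    for t in range(1, total_ticks + 1):
        if t in kick_ticks:
            wd.kick()
        if wd.tick():
            return t             # interrupt: the host would be reset here
    return None
```

A system that keeps resetting the watchdog never sees the interrupt; once the resets stop (for example because the software has hung), the watchdog fires a fixed number of ticks later.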
Digital-to-analog conversion
Interface: only the data value.
The input value is continuously converted to analog form.
A/D conversion
Types of A/D converter circuits:
Some take a constant amount of time to convert.
Variable-time converters provide a done signal so that the microprocessor knows when the value is ready.
Interface:
Analog inputs.
Two major digital inputs:
A data port allows A/D registers to be read and written.
A clock input.
System architectures
Architecture: a set of elements and the relationships between them that together form a single unit.
Architectures and components: software; hardware.
Some software is very hardware-dependent.
Hardware platform architecture
Contains several elements:
CPU;
bus;
memory;
I/O devices: networking, sensors, actuators, etc.
How big/fast must each one be?
Software architecture
Functional description must be broken into pieces:
division among people;
conceptual organization;
performance;
testability;
maintenance.
Hardware and software architectures
Hardware and software are intimately related:
software doesn’t run without hardware;
how much hardware you need is determined by the software requirements:
speed;
memory.
Evaluation boards
Designed by CPU manufacturer or others.
Includes CPU, memory, some I/O devices.
May include a serial link for downloading programs.
CPU manufacturer often gives out the evaluation board netlist and board layout, which can be used as a starting point for your custom board design.
Adding logic to a board
Programmable logic devices (PLDs) provide low/medium density logic.
Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic.
Application-specific integrated circuits (ASICs) are manufactured for a single purpose.
The PC as a platform
Advantages:
A wide variety of I/O devices;
rich and familiar software environment.
Disadvantages:
Larger, more power-hungry;
requires a lot of hardware resources;
not well-adapted to real-time.
Typical PC hardware platform
[Figure: CPU, memory, DMA controller, and timers on the CPU bus; bus interfaces link the CPU bus to a high-speed bus (with a high-speed device) and a low-speed bus (with a device).]
Typical busses
PCI: standard for high-speed interfacing; 33 or 66 MHz; a wide parallel bus with many data and address bits along with multiple control bits.
PCI Express: the successor to PCI; uses point-to-point serial lanes rather than a wide parallel bus.
USB (Universal Serial Bus), FireWire (IEEE 1394): relatively low-cost serial interfaces with high speed.
Software elements
IBM PC uses BIOS (Basic I/O System) to implement low-level functions: boot-up; minimal device drivers.
BIOS has become a generic term for the lowest-level system software.
Debugging embedded systems
Challenges:
target system may be hard to observe;
target may be hard to control;
may be hard to generate realistic inputs;
setup sequence may be complex.
Host/target design
Use a host system to prepare software for target system:
[Figure: host system connected to the target system by a serial line.]
Host-based tools
Cross compiler:
compiles code on host for target system.
Cross debugger:
displays target state, allows target system to be controlled.
Software debuggers
Breakpoint.
LED.
ICE (in-circuit emulator).
Logic analyzer.
Breakpoints
A breakpoint allows the user to stop execution, examine system state, and change state.
Replace the breakpointed instruction with a subroutine call to the monitor program.
ARM breakpoints
uninstrumented code:
0x400 MUL r4,r6,r6
0x404 ADD r2,r2,r4
0x408 ADD r0,r0,#1
0x40c B loop

code with breakpoint:
0x400 MUL r4,r6,r6
0x404 ADD r2,r2,r4
0x408 ADD r0,r0,#1
0x40c BL bkpoint
Breakpoint handler actions
Save registers.
Allow user to examine machine.
Before returning, restore system state.
Safest way to execute the instruction is to replace it and execute in place.
Put another breakpoint after the replaced breakpoint to allow restoring the original breakpoint.
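The replace-and-restore idea can be sketched on a toy program memory. Instructions are stored as text keyed by address; the function names and the saved-instruction table are assumptions for illustration. A real monitor patches machine code and must also save and restore registers.

```python
BKPT = "BL bkpoint"  # branch-and-link into the monitor, as in the ARM example


def set_breakpoint(program, addr, saved):
    """Save the instruction at addr and overwrite it with a call to the monitor."""
    if addr in saved:
        return                       # already breakpointed; keep the original copy
    saved[addr] = program[addr]
    program[addr] = BKPT


def clear_breakpoint(program, addr, saved):
    """Put the original instruction back so it can execute in place."""
    program[addr] = saved.pop(addr)


if __name__ == "__main__":
    program = {
        0x400: "MUL r4,r6,r6",
        0x404: "ADD r2,r2,r4",
        0x408: "ADD r0,r0,#1",
        0x40C: "B loop",
    }
    saved = {}
    set_breakpoint(program, 0x40C, saved)
    print(program[0x40C])
    clear_breakpoint(program, 0x40C, saved)
    print(program[0x40C])
```

To keep a breakpoint armed after it is hit, a handler would clear it, set a temporary breakpoint on the following instruction, and re-arm the original when that temporary one is reached.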
In-circuit emulators (ICE)
A microprocessor in-circuit emulator is a specially-instrumented microprocessor.
Allows you to stop execution, examine CPU state, modify registers.
JTAG
Logic analyzers
A logic analyzer is an array of low-grade oscilloscopes:
How to exercise code
Run on host system.
Run on target system.
Run in instruction-level simulator.
Run on cycle-accurate simulator.
Run in hardware/software co-simulation environment.
Debugging real-time code
Bugs in drivers can cause non-deterministic behavior in the foreground program.
Bugs may be timing-dependent.
System-level performance analysis
Performance depends on all the elements of the system: CPU. Cache. Bus. Main memory. I/O device.
[Figure: CPU connected to memory through a cache.]
Bandwidth as performance
Bandwidth applies to several components: Memory. Bus. CPU fetches.
Different parts of the system run at different clock rates.
Different components may have different widths (bus, memory).
Bandwidth and data transfers
Video frame: 320 x 240 x 3 = 230,400 bytes.
Transfer in 1/30 sec = 0.033 sec for a frame.
Transfer at 1 byte/µs: 0.23 sec per frame. Too slow.
Increase bandwidth:
Increase bus width.
Increase bus clock rate.
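The arithmetic can be checked with a short sketch. It assumes a bus that moves one byte per microsecond (1,000,000 bytes/sec), which is the rate that yields the 0.23 sec figure; the function names are illustrative.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one video frame in bytes."""
    return width * height * bytes_per_pixel


def transfer_time(num_bytes, bytes_per_second):
    """Seconds needed to move num_bytes at the given bandwidth."""
    return num_bytes / bytes_per_second


if __name__ == "__main__":
    frame = frame_bytes(320, 240, 3)         # 230,400 bytes
    deadline = 1 / 30                        # one frame period in seconds
    t = transfer_time(frame, 1_000_000)      # assumed 1 byte per microsecond
    print(f"{t:.4f} s per frame, deadline {deadline:.4f} s")
    print("too slow" if t > deadline else "fast enough")
    print("required bandwidth:", frame * 30, "bytes/sec")
```

Meeting the 1/30 sec deadline needs roughly seven times this bandwidth, which is why the slide turns to a wider bus or a faster bus clock.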
Bus bandwidth
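T: # bus cycles.
P: time/bus cycle.
Total time for transfer: t = TP.
N: number of bytes to transfer.
W: width of one transfer in bytes.
D: data payload length (bus cycles).
Overhead: O = O1 + O2.
[Figure: one bus transfer of width W, made up of overhead O1, data payload D, and overhead O2.]
Tbasic(N) = (D + O)N/W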
Bus burst transfer bandwidth
T: # bus cycles.
P: time/bus cycle.
Total time for transfer: t = TP.
B: a burst performs B transfers of W bytes.
D: data payload length.
O = O1 + O2.
[Figure: a burst of transfers 1, 2, …, B, each W bytes wide, sharing one overhead O.]
Tburst(N) = (BD + O)N/(BW)
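The basic and burst formulas can be compared directly. In this sketch the function names are illustrative and the example numbers (N = 1024 bytes, D = 1, O = 3, W = 2, B = 4) are assumed values, not taken from the slides.

```python
def t_basic(n, d, o, w):
    """Bus cycles to move n bytes, one W-byte transfer per transaction."""
    return (d + o) * n / w


def t_burst(n, d, o, w, b):
    """Bus cycles to move n bytes when each transaction bursts B transfers."""
    return (b * d + o) * n / (b * w)


if __name__ == "__main__":
    n, d, o, w = 1024, 1, 3, 2               # assumed example values
    print("basic:", t_basic(n, d, o, w), "cycles")
    print("burst:", t_burst(n, d, o, w, b=4), "cycles")
```

The burst pays the overhead O once per B transfers instead of once per transfer, so the saving grows with O and with B. Setting B = 1 reduces the burst formula to the basic one.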
Bandwidth problems of memory: memory aspect ratios
[Figure: memory aspect ratios: 64 M x 1, 16 M x 4, 8 M x 8.]
Memory access times
Memory component access times come from the chip data sheet.
Page modes allow faster access for successive transfers on same page.
If data doesn’t fit naturally into physical words:
A = [E*w/W] + 1
Bus performance bottlenecks
Transfer 320 x 240 video frames (3 bytes per pixel) at 30 frames/sec = 6,912,000 bytes/sec.
Is performance bottleneck bus or memory?
[Figure: memory connected to the CPU by the bus.]
Bus performance bottlenecks, cont’d.
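Bus: assume 1 MHz bus, D=1, O=3, W=2:
Tbasic = (1+3) x 6,912,000/2 = 13,824,000 cycles = 13.82 sec.
Memory: try burst mode with B=4, width W=0.5, D=1, O=4, 100 MHz clock:
Tmem = (4*1+4) x 6,912,000/(4*0.5) = 27,648,000 cycles = 0.2765 sec.
The bus is the bottleneck: it needs 13.82 sec to move one second of video, while the memory needs only 0.2765 sec.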
Performance spreadsheet
Parameter            Bus           Memory
Clock period (sec)   1.00E-6       1.00E-8
W                    2             0.5
D                    1             1
O                    3             4
B                    n/a           4
N                    6,912,000     6,912,000
T (cycles)           13,824,000    27,648,000
t (sec)              13.82         0.2765
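The spreadsheet can be reproduced in a few lines. The dictionary layout and function names are assumptions for illustration; the parameter values are the ones listed above, with B = 1 standing in for the non-burst bus.

```python
def cycles(n, d, o, w, b=1):
    """Cycles to move n bytes; b=1 gives Tbasic, b>1 gives the burst formula."""
    return (b * d + o) * n / (b * w)


def seconds(cfg, n):
    """Total time t = T * P for one component."""
    return cycles(n, cfg["D"], cfg["O"], cfg["W"], cfg["B"]) * cfg["clock"]


def bottleneck(components, n):
    """Name of the component that takes longest to move n bytes."""
    return max(components, key=lambda name: seconds(components[name], n))


N = 320 * 240 * 3 * 30   # bytes in one second of video
COMPONENTS = {
    "bus":    {"clock": 1.0e-6, "W": 2,   "D": 1, "O": 3, "B": 1},
    "memory": {"clock": 1.0e-8, "W": 0.5, "D": 1, "O": 4, "B": 4},
}

if __name__ == "__main__":
    for name, cfg in COMPONENTS.items():
        print(f"{name}: {seconds(cfg, N):.4f} sec")
    print("bottleneck:", bottleneck(COMPONENTS, N))
```

Changing one entry, such as the bus clock or width, shows quickly whether the bottleneck moves to the other component.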
Parallelism
Speed things up by running several units at once.
DMA provides parallelism if CPU doesn’t need the bus: the DMA controller and bus handle the transfer while the CPU computes.
Summary
Bus.
I/O devices.
Development and debug.
homework
1. 4-2,
2. 4-6,
3.