Post on 20-Dec-2015

Avishai Wool, Lecture 8

Introduction to Systems Programming Lecture 8

Paging Design

Input-Output

Steps in Handling a Page Fault

Virtual → Physical mapping

• CPU accesses virtual address 100000

• MMU looks in the page table to find the physical address
  – The page table is in memory too

• Unreasonable overhead: every memory reference would need an extra memory access just to read the page table entry
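
As a rough illustration of the lookup described above, the sketch below splits a virtual address into a page number and an offset and consults an in-memory page table. The 4 KB page size, the `translate` helper, and the sample mapping of page 24 to frame 7 are assumptions made for the example, not values from the lecture.

```python
PAGE_SIZE = 4096  # assumed page size (4 KB); the lecture does not specify one


def translate(vaddr, page_table):
    """Translate a virtual address with a page table that maps
    virtual page number -> physical frame number."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[vpn]  # stands in for the extra memory access per reference
    return frame * PAGE_SIZE + offset


page_table = {24: 7}  # hypothetical mapping: virtual page 24 -> frame 7
paddr = translate(100000, page_table)
```

With these assumed values, virtual address 100000 falls in page 24 at offset 1696, so it maps to physical address 30368. The dictionary lookup plays the role of the page-table read that a TLB is meant to avoid.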

TLB: Translation Lookaside Buffer

• Idea: Keep the most frequently used parts of the page table in a cache, inside the MMU chip.

• TLB holds a small number of page table entries: Usually 8 – 64

• TLB hit rate very high because, e.g., instructions fetched sequentially.

A TLB to speed up paging

Example:

• Code loops through pages 19,20,21

• Uses data array in pages 129,130,140

• Stack variables in pages 860,861

Valid TLB Entries

• TLB miss:
  – Do regular page lookup
  – Evict a TLB entry and store the new TLB entry
  – Miniature paging system, done in hardware

• When OS does context switch to a new process, all TLB entries become invalid:
  – Early instructions of new process will cause TLB misses.

TLB placement/eviction

• Done by hardware

• Placement rule:
  – TLBIndex = VirtualAddr modulo TLBSize (VirtualAddr here is the virtual page number, i.e., the address without its page offset)
  – TLBSize is always 2^k, so TLBIndex = k least-significant bits
  – Keep “tag” (rest of bits) to fully identify virtual addr

• Virtual address can be only in one TLB index

• No explicit “eviction”: simply overwrite what is in TLB[TLBIndex]
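
The placement rule can be sketched as a direct-mapped table. This is an illustrative model, assuming the index is taken from the low k bits of the virtual page number; the class and method names are hypothetical.

```python
class DirectMappedTLB:
    """Toy direct-mapped TLB with 2**k slots (names are illustrative)."""

    def __init__(self, k):
        self.k = k
        self.size = 1 << k                # TLBSize = 2^k
        self.slots = [None] * self.size   # each slot holds (tag, frame) or None

    def _split(self, vpn):
        index = vpn & (self.size - 1)     # k least-significant bits
        tag = vpn >> self.k               # remaining bits identify the page
        return index, tag

    def lookup(self, vpn):
        index, tag = self._split(vpn)
        entry = self.slots[index]
        if entry is not None and entry[0] == tag:
            return entry[1]               # TLB hit
        return None                       # TLB miss

    def insert(self, vpn, frame):
        index, tag = self._split(vpn)
        self.slots[index] = (tag, frame)  # overwrite: no explicit eviction


tlb = DirectMappedTLB(k=3)                # 8 entries
tlb.insert(19, 5)
tlb.insert(27, 9)                         # 27 and 19 share index 3, so page 19 is overwritten
```

Because each page can live in only one slot, two pages whose numbers agree in the low k bits keep displacing each other, which is the price of a lookup that needs no search.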

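TLB + Page table lookup

Flowchart: virtual address → physical address

• In TLB? Yes: use the TLB entry to form the physical address.

• Not in TLB, but in page table? Yes: update the TLB, then form the physical address.

• Not in page table: page fault; copy the page from disk to memory.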
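
The flowchart can be read as the following sketch. It models the TLB and the page table as dictionaries from virtual page number to frame number; `load_page` is a hypothetical stand-in for the OS page-fault handler that brings the page in from disk and returns the frame it was placed in.

```python
def lookup(vpn, tlb, page_table, load_page):
    """Return (frame, outcome) for a virtual page number."""
    if vpn in tlb:
        return tlb[vpn], "tlb hit"
    if vpn in page_table:
        outcome = "tlb miss"
    else:
        # Page fault: the page is not in memory, so fetch it from disk.
        page_table[vpn] = load_page(vpn)
        outcome = "page fault"
    tlb[vpn] = page_table[vpn]  # update the TLB on every miss
    return tlb[vpn], outcome


tlb, page_table = {}, {1: 10}
first = lookup(1, tlb, page_table, load_page=lambda vpn: 99)
second = lookup(1, tlb, page_table, load_page=lambda vpn: 99)
```

In this run the first access to page 1 misses the TLB but finds the page table entry, and the second access hits the TLB.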

TLB – cont.

• If address is in TLB ⇒ page is in physical memory
  – OS invalidates TLB entry when evicting a page
  – So page fault not possible if we have a TLB hit

• “page fault rate” is computed only on TLB misses

Example: Average memory access time

• TLB lookup: 4ns
• Phys mem access: 10ns
• Disk access: 10ms

• TLB miss rate: 1%
• Page fault rate: 0.1%

• Assume page table is in memory.

• TLB hit: p=0.99, time=4ns+10ns

• TLB miss, page hit: p=0.01*0.999, time=4ns+10ns+10ns

• TLB miss, page fault: p=0.01*0.001, time=4ns+10ns+10ms+10ns

Average memory access: 114.1ns (1.141*10^-7 s)
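
The average can be checked by weighting each case by its probability. The sketch below uses the figures from the slide, expressed in nanoseconds; the variable names are illustrative.

```python
TLB_NS = 4             # TLB lookup
MEM_NS = 10            # physical memory access
DISK_NS = 10_000_000   # disk access: 10 ms in nanoseconds

TLB_MISS = 0.01        # TLB miss rate
PAGE_FAULT = 0.001     # page fault rate, counted only on TLB misses

t_tlb_hit = TLB_NS + MEM_NS
t_page_hit = TLB_NS + MEM_NS + MEM_NS            # extra access reads the page table
t_page_fault = TLB_NS + MEM_NS + DISK_NS + MEM_NS

average_ns = ((1 - TLB_MISS) * t_tlb_hit
              + TLB_MISS * (1 - PAGE_FAULT) * t_page_hit
              + TLB_MISS * PAGE_FAULT * t_page_fault)
```

The three terms contribute about 13.86 ns, 0.24 ns, and 100 ns respectively, so the rare page faults dominate the average even at a rate of one in 100,000 accesses.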

Design issues in Paging

Local versus Global Allocation Policies: Physical Memory

a) Original configuration – ‘A’ causes page fault
b) Local page replacement
c) Global page replacement

Local or Global?

• Local ⇒ number of frames per process is fixed
  – If working set grows ⇒ thrashing
  – If working set shrinks ⇒ waste

• Global usually better

• Some algorithms can only be local (working set, WSClock).

How many frames to give a process?

• Fixed number

• Proportional to its size (before load)

• Zero, let it issue page faults for all its pages.
  – This is called pure demand paging.

• Monitor page-fault-frequency (PFF), give the process more frames if its PFF is high.
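
A page-fault-frequency policy could look roughly like the sketch below. The thresholds and the one-frame step are assumed values chosen for illustration, and taking a frame away when the rate is low is a common companion rule rather than something stated on the slide.

```python
def adjust_frames(frames, faults, references, high=0.05, low=0.005):
    """Return a new frame allocation for one process.

    faults / references is the measured page-fault frequency (PFF)
    over the last sampling interval. Thresholds are hypothetical.
    """
    pff = faults / references
    if pff > high:
        return frames + 1          # faulting too often: give it another frame
    if pff < low and frames > 1:
        return frames - 1          # faulting rarely: a frame can be reclaimed
    return frames                  # within the acceptable band: no change


new_allocation = adjust_frames(frames=10, faults=10, references=100)
```

Here a process that faulted on 10 of its last 100 references is above the assumed upper threshold, so its allocation grows from 10 to 11 frames.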

Page fault rate as a function of the number of page frames assigned

Load Control

• Despite good designs, system may still thrash

• When PFF algorithm indicates
  – some processes need more memory
  – but no processes need less

• Solution: Reduce number of processes competing for memory
  – swap one or more to disk, divide up frames they held
  – reconsider degree of multiprogramming

Cleaning Policy

• Need for a background process, paging daemon
  – periodically inspects state of memory

• When too few frames are free
  – selects pages to evict using a replacement algorithm

• It can use the same circular list (clock) as the regular page replacement algorithm, but with a different pointer

Windows XP Page Replacement

• Processes are assigned working set minimum and working set maximum

• Working set minimum is the minimum number of page frames the process is guaranteed to have in memory

• A process may be assigned page frames up to its working set maximum

• When the amount of free memory in the system falls below a threshold, automatic working set trimming is performed to restore the amount of free memory

• Working set trimming removes frames from processes that have more than their working set minimum
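
The trimming rule can be illustrated with a small model. This is a simplified sketch and not the actual Windows XP algorithm: the function name, the dictionaries, and the order in which processes are visited are all assumptions.

```python
def trim_working_sets(frames_held, ws_min, free, threshold):
    """Take frames from processes above their working set minimum
    until the number of free frames reaches the threshold.

    frames_held, ws_min: dicts mapping process id -> frame count.
    Returns the new allocation and the new free-frame count.
    """
    held = dict(frames_held)               # do not modify the caller's dict
    for pid in held:
        if free >= threshold:
            break
        surplus = held[pid] - ws_min[pid]  # frames above the guaranteed minimum
        if surplus > 0:
            take = min(surplus, threshold - free)
            held[pid] -= take
            free += take
    return held, free


held, free = trim_working_sets(
    frames_held={"a": 10, "b": 5},
    ws_min={"a": 4, "b": 5},
    free=2,
    threshold=6,
)
```

In this run process "a" gives up 4 of its 6 surplus frames, while "b" is already at its minimum and is left alone.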

Devices, Controllers, and I/O Architectures

I/O Device Types

• Block Devices
  – block size of 512-32768 bytes
  – block can be read/written individually
  – typical: disks / floppy / CD

• Character Devices
  – delivers / accepts a sequential stream of characters
  – non-addressable
  – typical: keyboard, mouse, printer, network

• Other: Monitor, Clock

Typical Data Rates

Device Controllers

• I/O devices have components:
  – mechanical component
  – electronic component

• The electronic component is the device controller
  – may be able to handle multiple devices

• Controller's tasks
  – convert serial bit stream to block of bytes
  – perform error correction as necessary
  – make available to main memory

Communicating with Controllers

• Controllers have registers to deliver data, accept data, etc.

• Option 1: special I/O commands that address I/O ports, e.g.

      in r0, 4

• “4” is not memory address 4, it is I/O port 4

• Option 2: I/O registers mapped to memory addresses

Memory-Mapped Registers

• Controller connected to the bus

• Has a physical “memory address” like B0000000

• When this address appears on the bus, the controller responds (read/write to its I/O register)

• RAM configured to ignore controller’s address

Possible I/O Register Mappings

• Separate I/O and memory space (IBM 360)

• Memory-mapped I/O (PDP-11)

• Hybrid (Pentium, 640K-1M are for I/O)

Advantages of Memory Mapped I/O

• No special instructions, can be written in C.

• Protection by not putting I/O memory in user virtual address space.

• All machine instructions can access I/O:

      LOOP:  test *b0000004   // check if port_4 is 0
             beq READY        // it is 0: stop polling
             branch LOOP      // otherwise test again
      READY: ...

Disadvantages of Memory Mapped I/O

• Memory and I/O controllers have to be on the same bus:
  – modern architectures have separate memory bus!
  – Pentium has 3 buses: memory, PCI, ISA

Bus Architectures

(a) A single-bus architecture
(b) A dual-bus memory architecture

Memory Mapped with Separate Bus

• I/O Controllers do not see memory bus.

• Option 1: send all addresses to the memory bus first; if there is no response, try the I/O bus

• Option 2: Snooping device between buses
  – speed difference is a problem

• Option 3 (Pentium): filter addresses in PCI bridge
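
Option 3 can be pictured as a range check performed by the bridge. The sketch below is a simplified model and not Pentium-accurate: the `route` function is hypothetical, and the single 640K-1M range is borrowed from the hybrid mapping mentioned earlier in the lecture.

```python
KB = 1024

# Address ranges that the bridge forwards to the I/O side instead of memory.
# Assumed to be loaded once at boot time; here a single illustrative range.
IO_RANGES = [(640 * KB, 1024 * KB)]


def route(addr, io_ranges=IO_RANGES):
    """Decide which bus should see a physical address."""
    for low, high in io_ranges:
        if low <= addr < high:
            return "io bus"
    return "memory bus"


destination = route(700 * KB)
```

Because the check is a simple comparison against preloaded ranges, the bridge can decide before the address goes out, avoiding both the retry of Option 1 and the speed mismatch of Option 2.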

Structure of a large Pentium system

Principles of I/O Software

Goals of I/O Software

• Device independence
  – programs can access any I/O device
  – without specifying device in advance (floppy, hard drive, or CD-ROM)

• Uniform naming
  – name of a file or device is a string or an integer
  – not depending on which machine

• Error handling
  – handle as close to the hardware as possible

Goals of I/O Software (2)

• Synchronous vs. asynchronous transfers
  – blocked transfers vs. interrupt-driven

• Buffering
  – data coming off a device often cannot be stored directly in its final destination

• Sharable vs. dedicated devices
  – disks are sharable
  – tape drives would not be

How is I/O Programmed

• Programmed I/O

• Interrupt-driven I/O

• DMA (Direct Memory Access)

Programmed I/O

Steps in printing a string

Polling

Busy-waiting until device can accept another character

Example assumes memory-mapped registers
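
A minimal sketch of polling, assuming a hypothetical printer whose status and data registers are modeled as methods on a Python object. `FakePrinter`, its `busy_polls` parameter, and `print_string` are invented for the illustration; real code would read and write memory-mapped registers instead.

```python
class FakePrinter:
    """Simulated device: stays busy for a few status reads after each character."""

    READY, BUSY = 0, 1

    def __init__(self, busy_polls=2):
        self.busy_polls = busy_polls
        self.countdown = 0
        self.output = []

    def read_status(self):           # stands in for reading the status register
        if self.countdown > 0:
            self.countdown -= 1
            return self.BUSY
        return self.READY

    def write_data(self, ch):        # stands in for writing the data register
        self.output.append(ch)
        self.countdown = self.busy_polls


def print_string(printer, text):
    """Programmed I/O: busy-wait before every character. Returns wasted polls."""
    wasted = 0
    for ch in text:
        while printer.read_status() != FakePrinter.READY:
            wasted += 1              # CPU spins here doing no useful work
        printer.write_data(ch)
    return wasted


printer = FakePrinter(busy_polls=2)
wasted_polls = print_string(printer, "ABC")
```

The count of wasted polls grows with the device's busy time, which is why this approach ties up the CPU when the device is slow.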

Properties of Programmed I/O

• Simple to program

• Ties up CPU, especially if device is slow

Interrupts Revisited

Interrupt-Driven I/O

Code executed when print system call is made

Interrupt service procedure
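
The two pieces named above can be sketched as two methods on one object. This is an illustrative simulation, assuming a hypothetical printer that raises an interrupt whenever it is ready for the next character; the class and attribute names are not from the lecture, and a real kernel would block and unblock the calling process through the scheduler.

```python
class InterruptDrivenPrinter:
    def __init__(self):
        self.output = []             # stands in for the printer's data register
        self.buffer = ""
        self.pos = 0
        self.caller_blocked = False

    def _send_next(self):
        self.output.append(self.buffer[self.pos])
        self.pos += 1

    def print_syscall(self, text):
        """Code executed when the print system call is made."""
        self.buffer, self.pos = text, 0
        if not text:
            return
        self.caller_blocked = True   # caller sleeps; the CPU can run other work
        self._send_next()            # start the device with the first character

    def interrupt_handler(self):
        """Interrupt service procedure: the printer is ready for more."""
        if self.pos == len(self.buffer):
            self.caller_blocked = False   # whole string printed: wake the caller
        else:
            self._send_next()


dev = InterruptDrivenPrinter()
dev.print_syscall("AB")
dev.interrupt_handler()              # printer finished 'A', send 'B'
dev.interrupt_handler()              # printer finished 'B', unblock the caller
```

Unlike polling, the CPU is free between interrupts, at the cost of one interrupt per character.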

Properties of Interrupt-Driven I/O

• Interrupt every character or word.

• Interrupt handling takes time.

• Makes sense for slow devices (keyboard, mouse)

• For fast devices: use a dedicated DMA controller
  – usually for disk and network.

Direct Memory Access (DMA)

• DMA controller has access to bus.

• Registers:
  – memory address to write/read from
  – byte count
  – I/O port or mapped-memory address to use
  – direction (read from / write to device)
  – transfer unit (byte or word)

Operation of a DMA transfer

I/O Using DMA

code executed when the print system call is made

interrupt service procedure
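
A rough sketch of the same print operation using DMA. The `DMAController` class is a hypothetical simulation: its `address` and `count` attributes mirror the memory-address and byte-count registers, the device is modeled as a plain list, and only the memory-to-device direction with byte-sized transfers is shown.

```python
class DMAController:
    def __init__(self, memory, on_complete):
        self.memory = memory
        self.on_complete = on_complete   # called once, like the completion interrupt
        self.address = 0
        self.count = 0
        self.device = None

    def program(self, address, count, device):
        """What the print system call does: load the registers and return."""
        self.address, self.count, self.device = address, count, device

    def run(self):
        """The transfer itself, carried out without CPU involvement."""
        while self.count > 0:
            self.device.append(self.memory[self.address])
            self.address += 1
            self.count -= 1
        self.on_complete()               # a single interrupt for the whole buffer


memory = bytearray(b"..HELLO..")
device, interrupts = [], []
dma = DMAController(memory, on_complete=lambda: interrupts.append("done"))
dma.program(address=2, count=5, device=device)
dma.run()
```

The CPU is interrupted once per buffer rather than once per character, which is the main saving over interrupt-driven I/O.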

DMA with Virtual Memory

• Most DMA controllers use physical addresses

• What if memory of buffer is paged out during DMA transfer?

• Force the page to not page out (“pinning”)

Burst or Cycle-stealing

• When the DMA controller grabs the bus for one word at a time, it competes with the CPU for bus access. This is called “cycle-stealing”.

• In “burst” mode the DMA controller acquires the bus (exclusively), issues several transfers, and releases it.
  – More efficient
  – May block CPU and other devices

Concepts for review

• TLB

• Local/Global page replacement

• Demand paging

• Page-fault-frequency monitor

• I/O device controller

• in/out commands

• Memory-mapped registers

• PCI Bridge

• Programmed I/O (Polling)

• Interrupt-driven I/O

• I/O using DMA

• Page pinning

• DMA cycle-stealing

• DMA burst mode