Avishai Wool, lecture 6 - 1
Introduction to Systems Programming Lecture 6
Memory Management
Memory Management
• Ideally, programmers want memory that is
– large
– fast
– non-volatile (does not get erased when power goes off)
• Memory hierarchy – small amount of fast, expensive memory – cache
– some medium-speed, medium-priced main memory
– gigabytes of slow, cheap disk storage
• Memory manager handles the memory hierarchy
The Memory Hierarchy
Level                 | Access Time | Capacity
Registers             | 1 nsec      | < 1 KB
On-chip Cache         | 2 nsec      | 4 MB
Main Memory           | 10 nsec     | 512 MB - 2 GB
Magnetic (Hard) Disk  | 10 msec     | 200 GB - 1000 GB
Magnetic Tape         | 100 sec     | multi-TB
Other types of memory: ROM, EEPROM, Flash RAM
Basic Memory Management
An operating system with one user process
[Figure: example memory layouts – OS in RAM at the bottom of memory; OS in ROM at the top (e.g., some Palm computers); or OS in RAM with device drivers in a ROM BIOS at the top (MS-DOS).]
Why is multi-programming good?
• Running several processes in parallel seems to let users "get more done"
• Can we show a model that can quantify this?
• From the systems’ perspective:
Multi-programming improves utilization
Modeling Multiprogramming
• A process waits for I/O a fraction p of its time
– (1-p) of the time is spent in CPU bursts
• Degree of multiprogramming: the number n of processes in memory
• Pr(CPU is busy running some process) = utilization
Utilization = 1 - p^n
• For an interactive process, p = 80% is realistic
[Figure: CPU utilization 1 - p^n as a function of the number of processes in memory (degree of multiprogramming), for different I/O wait fractions p.]
Using the simple model
• Assume 32MB of memory
• OS uses 16MB, user processes use 4MB
• 4-way multi-programming possible
• Model predicts utilization = 1 - 0.8^4 ≈ 60%
• If we add another 16MB, 8-way multi-programming becomes possible, and utilization = 1 - 0.8^8 ≈ 83%
Real-Memory Partitioning
Multiprogramming with Fixed Partitions
(a) Separate input queues for each partition (used in IBM OS/360)
(b) Single input queue
Problems with Fixed Partitions
• Separate queues: memory not used efficiently if many processes are in one class and few in another
• Single queue: small processes can use up a big partition, again memory not used efficiently
Basic Issues in Multi-programming
• Programmer and compiler cannot be sure where the process will be loaded in memory
– address locations of variables and code routines cannot be absolute
• Relocation: the mechanism for fixing memory references in memory
• Protection: one process should not be able to access another process's memory partition
Relocation in Software: Compiler + OS
• Compiler assumes the program is loaded at address 0.
• Compiler/linker inserts a relocation table into the binary file:
– positions in the code containing memory addresses
• At load time (part of process creation):
– OS computes offset = lowest memory address of the process
– OS modifies the code: adds the offset to all positions listed in the relocation table
Relocation example

Relocation table: 6, 12, …

Compile time (code assumed loaded at address 0):
  0:
  4:  mov bx, *100   (address operand stored at offset 6)
  10: mov ax, *200   (address operand stored at offset 12)
  16:

At CreateProcess, the code is loaded at address 1024. The OS relocates by adding 1024 to the value stored at each position listed in the relocation table:
  1024:
  1028: mov bx, *1124
  1034: mov ax, *1224
  1040:

(Each mov reg, Address instruction carries a 4-byte address operand.)
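The load-time patching step can be sketched in C; this is illustrative loader-like code under the slide's assumptions (4-byte address operands, a table of byte offsets), not a real loader:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of load-time software relocation: the OS adds the process's
 * load offset to the 4-byte address stored at each position listed in
 * the binary's relocation table. */
void relocate(uint8_t *image, const size_t *table, size_t entries,
              uint32_t offset)
{
    for (size_t i = 0; i < entries; i++) {
        uint32_t addr;
        memcpy(&addr, image + table[i], sizeof addr); /* read stored address */
        addr += offset;                               /* add load offset */
        memcpy(image + table[i], &addr, sizeof addr); /* patch the code */
    }
}
```

Applied to the example (values 100 and 200 at offsets 6 and 12, offset 1024), the stored operands become 1124 and 1224.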
Protection – Hardware Support
• Memory partitions have ID (protection code)
• PSW has a “protection code” field (e.g. 4 bits)
• Saved in PCB as part of process state
• CPU checks each memory access: if the protection code of the address != the protection code of the process → error
Alternative Hardware Support: Base and Limit Registers
• Special CPU registers: "base" and "limit"
• Address locations are added to the base value to map to a physical address
– replaces software relocation
– OS sets the base & limit registers during CreateProcess
• Access to an address location over the limit value is a CPU exception error – solves protection too
• Intel 8088 used a weak version of this: a base register but no limit
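The check-then-add done by the hardware on every access can be sketched as follows (names are mine; real hardware does this inside the CPU, not in software):

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of base-and-limit translation: the CPU compares every
 * virtual address against the limit register, then adds the base. */
typedef struct {
    uint32_t base;  /* start of the process's partition */
    uint32_t limit; /* size of the partition */
} mmu_regs;

bool translate_base_limit(const mmu_regs *r, uint32_t vaddr, uint32_t *paddr)
{
    if (vaddr >= r->limit)    /* over the limit: CPU raises an exception */
        return false;
    *paddr = r->base + vaddr; /* relocation done in hardware */
    return true;
}
```

For a process loaded at 1024 with a 4096-byte partition, address 5 maps to 1029, while address 5000 faults.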
Swapping
Swapping
• Fixed partitions are too inflexible and waste memory
• Next step up in complexity: dynamic partitions
• Allocate as much memory as needed by each process
• Swap processes out to disk to allow more multi-programming
Swapping - example
Memory allocation changes as
– processes come into memory
– processes leave memory
Shaded regions are unused memory
How much memory to allocate?
(a) Allocating space for a growing data segment
(b) Allocating space for a growing stack & data segment
Issues in Swapping
• When a process terminates, compact memory?
– Move all processes above the hole down in memory.
• Can be very slow: with 256MB of memory, copying 4 bytes in 40ns → compacting memory takes about 2.7 sec
• Almost never used
• Result: the OS needs to keep track of holes.
• Problem to avoid: memory fragmentation.
Swapping Data Structure: Bit Maps
• Part of memory with 5 processes and 3 holes
– tick marks show allocation units
– shaded regions are free
• Corresponding bit map
Properties of Bit-Map Swapping
• Memory of M bytes with an allocation unit of k bytes → the bitmap uses M/k bits = M/8k bytes.
Could be quite large.
• E.g., with an allocation unit of 4 bytes, the bitmap uses 1/32 of memory
• Searching the bit-map for a hole is slow
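Why the search is slow is easiest to see in code; a minimal sketch (1 bit per allocation unit, 1 = used, 0 = free; the function name is mine):

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first run of `need` consecutive free
 * allocation units, or -1 if none exists. The scan is linear in the
 * total number of units, which is why bitmap search is slow. */
int find_hole(const uint8_t *bitmap, size_t units, size_t need)
{
    size_t run = 0;
    for (size_t i = 0; i < units; i++) {
        int used = (bitmap[i / 8] >> (i % 8)) & 1;
        run = used ? 0 : run + 1;   /* count consecutive free units */
        if (run == need)
            return (int)(i - need + 1);
    }
    return -1;
}
```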
Swapping Data Structure: Linked Lists
• Variant #1: keep a list of blocks (process=P, hole=H)
What Happens When a Process Terminates?
Merge neighboring holes to create a bigger hole
Variant #2
• Keep separate lists for processes and for holes
• E.g., Process information can be in PCB
• Maintain hole list inside the holes
[Figure: Process A followed by Hole 1 and Hole 2; each hole stores its own list node (size, prev, next) inside the free memory of the hole itself.]
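A sketch of this variant in C, including the hole-merging step from the previous slide (struct layout and function name are mine; a real allocator would also keep a prev pointer and handle the predecessor side):

```c
#include <stddef.h>
#include <stdint.h>

/* Variant #2: the node describing a hole lives inside the hole, so
 * tracking free memory costs no extra space. The list is kept sorted
 * by address so memory-adjacent holes are list-adjacent. */
struct hole {
    size_t size;       /* hole size in bytes, including this header */
    struct hole *next; /* next hole, in address order */
};

/* Merge h with its successor if the two holes touch in memory. */
void try_merge(struct hole *h)
{
    if (h->next && (uint8_t *)h + h->size == (uint8_t *)h->next) {
        h->size += h->next->size; /* one bigger hole */
        h->next = h->next->next;
    }
}
```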
Hole Selection Strategy
• We have a list of holes of sizes 10, 20, 10, 50, 5. A process needs size 4. Which hole should we use?
• First fit: pick the 1st hole that's big enough (use the hole of size 10)
• Break up the hole into a used piece of size 4 and a hole of size 10 - 4 = 6
• Simple and fast
Best Fit
• For a process of size s, use the smallest hole with size(hole) >= s.
• In the example, use the last hole, of size 5.
• Problems:
– Slower (needs to search the whole list)
– Creates many tiny holes that fragment memory
• Can be made as fast as first fit if blocks are sorted by size (but then termination processing is slower)
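The two strategies on the example list can be sketched side by side (holes given as an array of sizes; function names are mine):

```c
#include <stddef.h>

/* First fit: stop at the first hole big enough. Returns the chosen
 * hole's index, or -1 if none fits. */
int first_fit(const int *holes, int n, int need)
{
    for (int i = 0; i < n; i++)
        if (holes[i] >= need)
            return i;
    return -1;
}

/* Best fit: scan the whole list for the smallest hole that fits. */
int best_fit(const int *holes, int n, int need)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (holes[i] >= need && (best < 0 || holes[i] < holes[best]))
            best = i;
    return best;
}
```

On holes {10, 20, 10, 50, 5} with need 4, first fit picks the first hole of size 10 and best fit picks the last hole, of size 5, as on the slides.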
Other Options
• Worst fit: find the biggest hole that fits.
– Simulations show that this is not very good
• Quick fit: maintain separate lists for common block sizes.
– Improved performance of the "find-hole" operation
– More complicated termination processing
Related Problems
• The hole-list system is used in other places:
• C language dynamic memory runtime system
– malloc() / calloc(), or the C++ "new" operator
– free()
• File systems can use this type of system to maintain free and used blocks on the disk.
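From the programmer's side, this hole-list machinery sits behind the standard allocator; a minimal usage sketch (the helper name is mine, for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* malloc() carves a block out of the allocator's internal hole list;
 * free() returns the block, and the allocator may merge it with
 * neighboring holes. */
int *make_zeroed_array(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a != NULL)
        memset(a, 0, n * sizeof *a);
    return a; /* caller must free() */
}
```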
Virtual Memory
Main Idea
• Processes use virtual address space (e.g., 00000000-FFFFFFFF for 32-bit addresses).
• Every process has its own address space
• The address space of each process can be larger than physical memory.
Memory Mapping
• Only part of the virtual address space is mapped to physical memory at any time.
• Parts of a process's memory contents are on disk.
• Hardware & OS collaborate to move memory contents to and from disk.
Advantages of Virtual Memory
• No need for software relocation: process code uses virtual addresses.
• Solves the protection requirement: it is impossible for a process to refer to another process's memory.
• For virtual memory protection to work:
– per-process memory mapping (page table)
– only the OS can modify the mapping
Hardware Support: the MMU (Memory Management Unit)
Example
• 16-bit memory addresses
• Virtual address space size: 64 KB
• Physical memory: 32 KB (15-bit physical addresses)
• Virtual address space split into 4KB pages.– 16 pages
• Physical memory is split into 4KB page frames.– 8 frames
Paging
The relation between virtual addresses and physical memory addresses is given by the page table.
• OS maintains the table
• One page table per process
• MMU uses the table
Example (cont)
• CPU executes the command
mov rx, *5
• MMU gets the address "5".
• Virtual address 5 is in page 0 (addresses 0-4095)
• Page 0 is mapped to frame 2 (physical addresses 8192-12287).
• MMU puts the address 8197 (= 8192 + 5) on the bus.
Page Faults
• What if the CPU issues
mov rx, *32780
• That page (page 8) is un-mapped (not in any frame)
• MMU causes a page fault (interrupt to the CPU)
• OS handles the page fault:
– Evict some page from a frame
– Copy the requested page from disk into the frame
– Re-execute the instruction
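Both cases of the example can be reproduced with a small sketch of the MMU's lookup (16-bit virtual addresses, 4KB pages, only page 0 → frame 2 mapped; the function name and the -1 fault convention are mine):

```c
#include <stdint.h>

enum { PAGE_BITS = 12, PAGE_SIZE = 1 << PAGE_BITS };

/* 16-entry page table for the example; -1 marks an unmapped page. */
static const int page_table[16] = {
    2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1
};

/* Returns the physical address, or -1 to signal a page fault. */
int32_t mmu_translate(uint16_t vaddr)
{
    unsigned page   = vaddr >> PAGE_BITS;      /* top 4 bits */
    unsigned offset = vaddr & (PAGE_SIZE - 1); /* low 12 bits */
    if (page_table[page] < 0)
        return -1;                             /* page fault */
    return page_table[page] * PAGE_SIZE + offset;
}
```

Address 5 translates to 8197 (frame 2), while 32780 falls in page 8 and faults.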
How the MMU Works
• Splits the 32-bit virtual address into
– a k-bit page number: the top k MSBs
– a (32-k)-bit offset
• Uses the page number as an index into the page table, and adds the offset.
• The page table has 2^k entries.
• Each page is of size 2^(32-k) bytes.
[Figure: internal operation of the MMU – a 4-bit page number indexes the 16-entry page table; the 12-bit offset is copied through unchanged.]
Issues with Virtual Memory
• Page table can be very large:
– 32-bit addresses with 4KB pages (12-bit offsets) → over 1 million pages
– each process needs its own page table
• Page lookup has to be very fast:
– an instruction executes in 4ns → the page-table lookup should take around 1ns
• Page fault rate has to be very low.
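The "over 1 million pages" figure is just a power of two; a quick check (the function name is mine):

```c
#include <stdint.h>

/* 32-bit addresses with a 12-bit offset leave a 20-bit page number,
 * so there are 2^(32-12) = 2^20 = 1,048,576 page-table entries. */
uint64_t num_pages(unsigned addr_bits, unsigned offset_bits)
{
    return 1ull << (addr_bits - offset_bits);
}
```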
Concepts for Review
• Degree of multi-programming
• Processor utilization
• Fixed partitions
• Code relocation
• Memory protection
• Dynamic partitions – Swapping
• Memory fragmentation
• Data structures: bitmaps; list of holes
• First-fit/worst-fit/best-fit
• Virtual memory
• Address space
• MMU
• Pages and Frames
• Page table
• Page fault
• Page lookup