
Page 1: Virtual Memory

Virtual Memory

Rabi Mahapatra & Hank Walker

Adapted from lecture notes of Dr. Patterson and Dr. Kubiatowicz of UC Berkeley

Page 2: Virtual Memory

View of Memory Hierarchies

Regs -> Cache -> L2 Cache -> Memory -> Disk -> Tape

The units moved between adjacent levels are, from the top down: instructions/operands, blocks, blocks, pages, and files. The upper levels are faster; the lower levels are larger.

Thus far: registers through main memory. Next: Virtual Memory.

Page 3: Virtual Memory

Memory Hierarchy: Some Facts

Level           Capacity     Access Time    Cost
CPU Registers   100s Bytes   <10s ns        --
Cache           K Bytes      10-100 ns      $.01-.001/bit
Main Memory     M Bytes      100 ns-1 us    $.01-.001
Disk            G Bytes      ms             10^-3 - 10^-4 cents
Tape            infinite     sec-min        10^-6

Staging / transfer units between levels: Registers <-> Cache moves instructions and operands (1-8 bytes, by the program/compiler); Cache <-> Memory moves blocks (8-128 bytes, by the cache controller); Memory <-> Disk moves pages (512-4K bytes, by the OS); Disk <-> Tape moves files (Mbytes, by the user/operator). Upper levels are faster; lower levels are larger.

Page 4: Virtual Memory

Virtual Memory: Motivation

• If the Principle of Locality allows caches to offer (usually) the speed of cache memory with the size of DRAM memory, then why not apply it recursively at the next level to get the speed of DRAM memory with the size of disk?

• Treat Memory as “cache” for Disk !!!

Page 5: Virtual Memory

Advantages of Virtual Memory

• Translation:
– Program can be given a consistent view of memory, even though physical memory is scrambled
– Makes multithreading reasonable (now used a lot!)
– Only the most important part of the program (the "Working Set") must be in physical memory
– Contiguous structures (like stacks) use only as much physical memory as necessary, yet can still grow later
• Protection:
– Different threads (or processes) are protected from each other
– Different pages can be given special behavior (Read Only, Invisible to user programs, etc.)
– Kernel data protected from user programs
– Very important for protection from malicious programs => far more "viruses" under Microsoft Windows
• Sharing:
– Can map the same physical page to multiple users ("shared memory")

Page 6: Virtual Memory

Virtual Memory Mapping

A virtual address is split into a page number and an offset. The page number, combined with the Page Table Base Register, indexes into the Page Table, which is itself located in physical memory. Each Page Table entry holds a Valid bit, Access Rights, and a Physical Page Address. The physical memory address is then formed from the Physical Page Address and the offset (actually a concatenation, not an addition).
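The single-level lookup described above can be sketched in a few lines of Python. This is an illustrative model only: the page size, the entry layout `(valid, access_rights, physical_page_number)`, and the example mapping are all assumptions, not part of any real hardware interface.

```python
PAGE_SIZE = 4096  # assume 4 KB pages for illustration

def translate(vaddr, page_table):
    """Single-level translation; each entry is (valid, access_rights, ppn)."""
    vpn = vaddr // PAGE_SIZE     # page number indexes the page table
    offset = vaddr % PAGE_SIZE   # offset passes through untranslated
    if vpn not in page_table or not page_table[vpn][0]:
        raise RuntimeError("page fault: virtual page %d not resident" % vpn)
    valid, rights, ppn = page_table[vpn]
    # Physical address = physical page address concatenated with the offset
    return ppn * PAGE_SIZE + offset

# Hypothetical mapping: virtual page 0 -> frame 3, virtual page 1 -> frame 7
page_table = {0: (True, "rw", 3), 1: (True, "r", 7)}
print(translate(100, page_table))  # 3 * 4096 + 100 = 12388
```

An invalid or missing entry models the Valid bit being clear, which in a real system would raise a page fault to the OS.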

Page 7: Virtual Memory

Issues in VM Design

• What is the size of the information blocks transferred from secondary to main storage (M)? -> page size (contrast with the physical block size on disk, i.e., the sector size)
• Which region of M is to hold the new block? -> placement policy
• How do we find a page when we look for it? -> block identification
• A block is brought into M, but M is full, so some region of M must be released to make room for the new block -> replacement policy
• What do we do on a write? -> write policy
• A missing item is fetched from secondary memory only on the occurrence of a fault -> demand load policy

[Diagram: registers, cache, memory, disk; the unit transferred between memory and disk is the page, held in a memory frame.]

Page 8: Virtual Memory

Virtual to Physical Address Translation

• Each program operates in its own virtual address space, as if it were the only program running

• Each is protected from the others

• OS can decide where each goes in memory

• Hardware (HW) provides the virtual -> physical mapping

[Diagram: the program issues virtual addresses (instruction fetch, load, store) in its virtual address space; the HW mapping translates them to the physical addresses used by physical memory (including caches).]

Page 9: Virtual Memory

Mapping Virtual Memory to Physical Memory

[Diagram: a virtual address space containing Code, Static, Heap, and Stack regions, mapped into a 64 MB physical memory, both starting at address 0.]

• Divide into equal sized chunks (about 4 KB)

• Any chunk of Virtual Memory can be assigned to any chunk of Physical Memory (a "page")

Page 10: Virtual Memory

Paging Organization (e.g., 1 KB Pages)

• The page is the unit of mapping

• The page is also the unit of transfer from disk to physical memory

Virtual memory holds 32 pages of 1 KB each: page 0 at virtual address 0, page 1 at 1024, page 2 at 2048, ..., page 31 at 31744. Physical memory holds 8 frames of 1 KB each: frame 0 at physical address 0, frame 1 at 1024, ..., frame 7 at 7168. The address translation map takes each virtual page to some physical frame.
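With 1 KB pages the low 10 bits of an address are the offset and the remaining bits select the page, so the split can be done with shifts and masks. A short sketch, where the example address and the page-to-frame assignment are hypothetical:

```python
PAGE_BITS = 10                            # 1 KB page -> 10 offset bits
vaddr = 2100                              # hypothetical virtual address
vpn = vaddr >> PAGE_BITS                  # 2100 // 1024 = 2: virtual page 2
offset = vaddr & ((1 << PAGE_BITS) - 1)   # 2100 % 1024 = 52
frame = 5                                 # suppose page 2 maps to frame 5
paddr = (frame << PAGE_BITS) | offset     # concatenate frame number and offset
print(vpn, offset, paddr)                 # 2 52 5172
```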

Page 11: Virtual Memory

Virtual Memory Problem # 1

• Mapping every address requires 1 extra memory access for every memory access

• Observation: since locality in pages of data, must be locality in virtual addresses of those pages

• Why not use a cache of virtual to physical address translations to make translation fast? (small is fast)

• For historical reasons, cache is called a Translation Lookaside Buffer, or TLB

Page 12: Virtual Memory

Virtual Memory Problem # 2

[Diagram: the Processor issues a VA; translation produces a PA, which goes to the Cache and, on a miss, to Main Memory; data returns on a hit.]

• Cache typically operates on physical addresses

• Page Table access is another memory access for each program memory access!

• Need to fix this!

Page 13: Virtual Memory

Virtual Memory Problem # 2

• Mapping every address adds 1 extra cache (TLB) access for every memory access

• Why not just have cache use virtual addresses?

• Problem: Different processes use same virtual addresses

• Could invalidate cache on context switch, but results in lots of compulsory misses

• Prefix addresses with process ID

• Other issues with virtual caches…

Page 14: Virtual Memory

Virtual Memory Problem #2

• Note: only the high order bits of the address are translated; the low order bits are not

• Feed the low order index bits to the cache while also doing the translation

• Get the translated tag and match it against the cache tag

• The TLB is usually smaller and faster, so it has the translation ready before the cache tag is ready

• This is why it is called a translation LOOKASIDE buffer

Page 15: Virtual Memory

Lookaside

• Works if the page offset bits >= index + offset bits
• Example:
– 4 KB page => 12-bit page index
– 256 lines of 16-byte blocks => 8 index bits, 4 offset bits
– But that allows only a 256 * 16 = 4 KB cache!
• Solution: use a larger set size
– A 16-way cache would be 64 KB
• Solution: use a larger page size
– But this causes internal fragmentation (unused part of a page)

[Diagram: the virtual page number goes to the TLB while the page index bits go straight to the cache index; the TLB's physical page number supplies the tag, which is compared against the cache tag to signal a hit and deliver data to the CPU.]
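The size constraint in the example above is easy to check numerically. A small sketch of the arithmetic (the parameters are the slide's own: 4 KB pages and a 256-line, 16-byte-block cache):

```python
import math

page_size = 4096                              # 4 KB page -> 12 untranslated low bits
lines, block = 256, 16                        # 256 lines of 16-byte blocks
index_bits = int(math.log2(lines))            # 8
offset_bits = int(math.log2(block))           # 4
page_offset_bits = int(math.log2(page_size))  # 12

# Cache indexing can overlap the TLB lookup only if the index + offset
# bits fit within the untranslated page offset:
print(index_bits + offset_bits <= page_offset_bits)  # True: 12 <= 12
print(lines * block)       # 4096: a direct-mapped cache is stuck at 4 KB
print(16 * lines * block)  # 65536: a 16-way cache of the same geometry reaches 64 KB
```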

Page 16: Virtual Memory

Typical TLB Format

A TLB entry holds: Virtual Address | Physical Address | Dirty | Ref | Valid | Access Rights

• The TLB is just a cache on the page table mappings

• TLB access time is comparable to the cache's (much less than main memory access time)

• Ref: used to help calculate LRU on replacement

• Dirty: since we use write back, we need to know whether or not to write the page to disk when it is replaced

Page 17: Virtual Memory

What if not in TLB?

• Option 1: Hardware checks page table and loads new Page Table Entry into TLB

• Option 2: Hardware traps to OS, up to OS to decide what to do

• MIPS follows Option 2: Hardware knows nothing about page table format

Page 18: Virtual Memory

TLB Miss

• If the address is not in the TLB, MIPS traps to the operating system

• The operating system knows which program caused the TLB fault or page fault, and knows which virtual address was requested

[Diagram: a TLB entry with its valid bit set, mapping virtual page 2 to physical page 91.]

Page 19: Virtual Memory

TLB Miss: If data is in Memory

• We simply add the entry to the TLB, evicting an old entry from the TLB

[Diagram: the TLB after insertion, now holding a valid entry mapping virtual page 7 to physical page 32 alongside the existing 2 -> 91 entry.]
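The insert-with-eviction behavior can be sketched as a toy software TLB. Everything here is illustrative: a real TLB is a hardware structure, and the LRU policy is only one choice (the Ref bit on the earlier format slide is there to support LRU). The 2 -> 91 mapping mirrors the slide's example entry.

```python
from collections import OrderedDict

class TLB:
    """Toy fully associative TLB with LRU replacement (illustrative only)."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()  # virtual page number -> physical page number

    def lookup(self, vpn):
        if vpn in self.entries:            # TLB hit
            self.entries.move_to_end(vpn)  # refresh LRU order
            return self.entries[vpn]
        return None                        # TLB miss: trap to OS / walk page table

    def insert(self, vpn, ppn):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[vpn] = ppn

tlb = TLB(capacity=2)
tlb.insert(2, 91)        # the slide's example entry
print(tlb.lookup(2))     # 91 (hit)
print(tlb.lookup(7))     # None (miss, until the OS inserts 7's translation)
```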

Page 20: Virtual Memory

What if the data is on disk?

• We load the page off the disk into a free block of memory, using a DMA transfer
– Meanwhile we switch to some other process waiting to be run

• When the DMA is complete, we get an interrupt and update the process's page table
– So when we switch back to the task, the desired data will be in memory

Page 21: Virtual Memory

What if the memory is full?

• First a resident page must be chosen for eviction (the replacement policy); if its Dirty bit is set, it is written back to disk before its frame is reused

• Then, as before, we load the page off the disk into the freed frame using a DMA transfer, switch to some other process while the transfer runs, and update the page table when the DMA-complete interrupt arrives

Page 22: Virtual Memory

Memory Organization with TLB

• TLBs are usually small, typically 128 - 256 entries

• Like any other cache, the TLB can be fully associative, set associative, or direct mapped

[Diagram: the Processor sends a VA to the TLB lookup; on a hit the PA goes straight to the Cache, on a miss the full translation is performed first; the Cache then returns data, going to Main Memory on a cache miss.]

Page 23: Virtual Memory

Virtual Memory Problem # 3

• Page Table too big!
– 4 GB virtual memory ÷ 4 KB pages ~ 1 million Page Table Entries
– At 4 bytes per entry, that is 4 MB just for the Page Table of 1 process; with 25 processes, 100 MB for Page Tables!

• Variety of solutions that trade off the memory size of the mapping function against slower handling when the TLB misses
– Make the TLB large enough and highly associative so that we rarely miss on address translation
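The sizing arithmetic can be checked with a short script. The 4-byte page table entry size is an assumption, chosen because it is consistent with the slide's 4 MB-per-process figure:

```python
virtual_space = 4 * 2**30             # 4 GB virtual address space
page_size = 4 * 2**10                 # 4 KB pages
entries = virtual_space // page_size  # page table entries per process
print(entries)                        # 1048576 (~1 million)

pte_size = 4                          # assumed 4-byte page table entry
per_process = entries * pte_size
print(per_process // 2**20)           # 4   (MB of page table per process)
print(25 * per_process // 2**20)      # 100 (MB of page tables for 25 processes)
```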

Page 24: Virtual Memory

Two Level Page Tables

[Diagram: virtual memory (Code, Static, Heap, and Stack, with large unused gaps) mapped into a 64 MB physical memory through a Super Page Table whose entries point to 2nd Level Page Tables; 2nd level tables are needed only for the regions of the virtual address space actually in use, which is where the memory savings come from.]
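A two-level lookup can be sketched by splitting the virtual address into two indices plus an offset. The layout below (4 KB pages, 10-bit indices at each level of a 32-bit address) is an assumption for illustration, and the example mapping is hypothetical:

```python
def split_two_level(vaddr):
    """Split a 32-bit virtual address: 10 + 10 index bits, 12 offset bits."""
    offset = vaddr & 0xFFF            # low 12 bits: offset within a 4 KB page
    l2_index = (vaddr >> 12) & 0x3FF  # next 10 bits: 2nd-level table index
    l1_index = (vaddr >> 22) & 0x3FF  # top 10 bits: super page table index
    return l1_index, l2_index, offset

def lookup(vaddr, super_table):
    # 2nd-level tables for unused regions simply do not exist in the dict,
    # mirroring how a sparse two-level table saves memory.
    l1, l2, offset = split_two_level(vaddr)
    second_level = super_table.get(l1)
    if second_level is None or l2 not in second_level:
        raise RuntimeError("page fault")
    return second_level[l2] * 4096 + offset  # frame number -> physical address

super_table = {0: {0: 5}}          # hypothetical: virtual page 0 -> frame 5
print(lookup(100, super_table))    # 5 * 4096 + 100 = 20580
```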

Page 25: Virtual Memory

Summary

• Apply the Principle of Locality recursively

• Reduce miss penalty? Add a (L2) cache

• Manage memory to disk? Treat it as a cache
– Included protection as a bonus, now critical
– Use a Page Table of mappings vs. tag/data in a cache

• Virtual Memory to Physical Memory translation too slow?
– Add a cache of Virtual to Physical Address Translations, called a TLB

Page 26: Virtual Memory

Summary

• Virtual Memory allows protected sharing of memory between processes, with less swapping to disk and less fragmentation than always-swap or base/bound schemes

• Spatial locality means the Working Set of pages is all that must be in memory for a process to run fairly well

• TLB reduces the performance cost of VM

• Need a more compact representation to reduce the memory size cost of a simple 1-level page table (especially with 32- to 64-bit addresses)