Adapted from UC Berkeley CS252 S01
Lecture 19: Virtual Memory
Virtual Memory concept, Virtual-physical translation, page table, TLB, Alpha 21264 memory hierarchy
Virtual Memory
Virtual memory (VM) gives programs the illusion of a very large memory that is not limited by the physical memory size.
  Main memory (DRAM) acts like a cache for secondary storage (magnetic disk).
  Otherwise, application programmers would have to move data into and out of main memory themselves.
  Removing that burden is why virtual memory was first proposed.
Virtual memory also provides the following functions:
  Allowing multiple processes to share physical memory in a multiprogramming environment
  Providing protection for processes (compare the Intel 8086: without VM, applications can overwrite the OS kernel)
  Facilitating program relocation in the physical memory space
VM Example
Virtual Memory and Cache
VM address translation provides a mapping from the virtual address used by the processor to a physical address in main memory or a location in secondary storage.
Cache terms vs. VM terms
  Cache block => page
  Cache miss => page fault
Tasks of hardware and OS
  The TLB performs fast address translations
  The OS handles less frequent events:
    Page fault
    TLB miss (when the software approach is used)
Virtual Memory and Cache
Parameter           L1 Cache                                     Main Memory
Block (page) size   16-128 bytes                                 4 KB - 64 KB
Hit time            1-3 cycles                                   50-150 cycles
Miss penalty        8-300 cycles                                 1M to 10M cycles
Miss rate           0.1-10%                                      0.00001-0.001%
Address mapping     25-45 bits (physical) => 13-21 bits (cache)  32-64 bits (virtual) => 25-45 bits (physical)
4 Qs for Virtual Memory
Q1: Where can a block be placed in the upper level?
  The miss penalty for virtual memory is very high => full associativity is desirable (blocks may be placed anywhere in main memory)
  Software determines the location while the disk is being accessed (10M cycles is enough time for sophisticated replacement)
Q2: How is a block found if it is in the upper level?
  The address is divided into a page number and a page offset
  A page table and a translation buffer are used for address translation
  Q: Why does full associativity not affect hit time?
4 Qs for Virtual Memory (continued)
Q3: Which block should be replaced on a miss?
  We want to reduce the miss rate, and replacement can be handled in software
  Least Recently Used (LRU) is typically used
  A typical approximation of LRU:
    Hardware sets reference bits
    The OS records the reference bits and clears them periodically
    The OS selects a page among the least recently referenced for replacement
Q4: What happens on a write?
  Writing to disk is very expensive
  Use a write-back strategy
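The reference-bit approximation of LRU can be sketched in Python. This is only an illustration using the "aging" variant, which is one common way to record and clear reference bits; the slide does not say which variant is meant. The names `Page`, `touch`, `os_tick`, `choose_victim`, and the 8-bit history width are assumptions made for the example.

```python
HISTORY_BITS = 8  # assumed width of the per-page history kept by the OS


class Page:
    def __init__(self, name):
        self.name = name
        self.referenced = False  # reference bit, set by "hardware" on access
        self.history = 0         # OS record of past reference bits


def touch(page):
    """Model the hardware setting the reference bit on an access."""
    page.referenced = True


def os_tick(pages):
    """Periodic OS work: record each reference bit, then clear it."""
    for p in pages:
        p.history >>= 1                           # older intervals count less
        if p.referenced:
            p.history |= 1 << (HISTORY_BITS - 1)  # newest interval is the top bit
        p.referenced = False


def choose_victim(pages):
    """Pick a page whose recorded history shows the least recent use."""
    return min(pages, key=lambda p: p.history)


pages = [Page("A"), Page("B"), Page("C")]
for interval in (["A", "B", "C"], ["A", "C"], ["A"]):
    for p in pages:
        if p.name in interval:
            touch(p)
    os_tick(pages)

print(choose_victim(pages).name)  # B: it has gone unreferenced the longest
```

The history value acts as a coarse timestamp: a page referenced in the latest interval always outranks one that was not, which is the property LRU needs.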
Virtual and Physical Addresses
A virtual address consists of a virtual page number and a page offset.
The virtual page number is translated to a physical page number.
The page offset is not changed.
  Virtual address:  virtual page number (36 bits) + page offset (12 bits)
  Translation maps the virtual page number to the physical page number
  Physical address: physical page number (33 bits) + page offset (12 bits)
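The address split can be sketched in a few lines of Python. This is a minimal illustration assuming the 12-bit page offset listed above; the function names and the example mapping are hypothetical.

```python
PAGE_OFFSET_BITS = 12                      # 12-bit offset, i.e. 4 KB pages
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1


def split_virtual_address(vaddr):
    """Return (virtual page number, page offset)."""
    return vaddr >> PAGE_OFFSET_BITS, vaddr & OFFSET_MASK


def translate(vaddr, vpn_to_ppn):
    """Replace the virtual page number with a physical one; keep the offset."""
    vpn, offset = split_virtual_address(vaddr)
    ppn = vpn_to_ppn[vpn]                  # made-up mapping; KeyError if unmapped
    return (ppn << PAGE_OFFSET_BITS) | offset


mapping = {0x12345: 0xABC}                 # hypothetical VPN -> PPN entry
print(hex(translate(0x12345678, mapping)))  # 0xabc678: offset 0x678 is unchanged
```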
Address Translation Via Page Table
Assume the access hits in main memory
Address Translation with Page Tables
A page table translates a virtual page number into a physical page number.
A page table register indicates the start of the page table.
The virtual page number is used as an index into the page table, whose entries contain:
  The physical page number
  A valid bit that indicates whether the page is present in main memory
  A dirty bit that indicates whether the page has been written
  Protection information about the page (read only, read/write, etc.)
Since the page table contains a mapping for every virtual page, no tags are required (how does this compare with a cache?)
Page table access is slow; we will see the solution.
Page Table Diagram
Accessing Main Memory or Disk
A valid bit of zero means the page is not in main memory.
A page fault then occurs, and the missing page is read in from disk.
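A flat page table lookup with valid, dirty, and protection bits might look like the following Python sketch. The 13-bit offset, the `PTE` field layout, and the exception names are assumptions for illustration, not the format of any particular machine.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 13  # assumed 8 KB pages


class PageFault(Exception):
    """Valid bit is zero: the page is not in main memory."""


class ProtectionError(Exception):
    """Write attempted to a read-only page."""


@dataclass
class PTE:
    ppn: int = 0
    valid: bool = False
    dirty: bool = False
    writable: bool = False


def access(page_table, vaddr, is_write=False):
    """Translate vaddr using a flat page table indexed by the VPN."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    pte = page_table[vpn]        # no tag compare: one entry per virtual page
    if not pte.valid:
        raise PageFault(vpn)     # the OS would now read the page from disk
    if is_write:
        if not pte.writable:
            raise ProtectionError(vpn)
        pte.dirty = True         # page must be written back before eviction
    return (pte.ppn << PAGE_OFFSET_BITS) | offset


table = [PTE() for _ in range(8)]                 # tiny 8-page address space
table[2] = PTE(ppn=5, valid=True, writable=True)
print(hex(access(table, (2 << 13) | 0x10, is_write=True)))  # 0xa010
```

Accessing any other page in this toy table raises `PageFault`, which corresponds to the valid bit being zero.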
How Large Is Page Table?
Suppose
  48-bit virtual address
  41-bit physical address
  8 KB pages => 13-bit page offset
  Each page table entry is 8 bytes
How large is the page table?
  Virtual page number = 48 - 13 = 35 bits
  Number of entries = number of pages = 2^35 = 32G
  Total size = number of entries x bytes/entry = 32G x 8 B = 256 GB
  Each process needs its own page table
Page tables can be very large, so they must be stored in main memory or even paged themselves, which makes access slow.
We need techniques to reduce page table size.
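The arithmetic for a flat, single-level page table can be checked with a short Python sketch. The function name is hypothetical; the parameters are the ones supposed on this slide.

```python
def flat_page_table_size(virtual_address_bits, page_offset_bits, pte_bytes):
    """Return (VPN bits, number of entries, total bytes) for a flat page table."""
    vpn_bits = virtual_address_bits - page_offset_bits
    entries = 1 << vpn_bits          # one entry per virtual page
    return vpn_bits, entries, entries * pte_bytes


vpn_bits, entries, total_bytes = flat_page_table_size(48, 13, 8)
print(vpn_bits, "VPN bits")
print(entries // 2**30, "G entries")
print(total_bytes // 2**30, "GB per process")
```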
TLB: Improving Page Table Access
We cannot afford to access the page table on every memory access, including cache hits (otherwise the cache itself would make no sense).
Again, use a cache to speed up accesses to the page table! (A cache for a cache?)
The TLB (translation lookaside buffer) stores frequently accessed page table entries.
A TLB entry is like a cache entry
  The tag holds portions of the virtual address
  The data portion holds the physical page number, protection field, valid bit, use bit, and dirty bit (as in a page table entry)
  Usually fully associative or highly set associative
  Usually 64 or 128 entries
Access the page table only on TLB misses
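A toy model of a fully associative TLB in front of a page table is sketched below in Python. LRU replacement is an assumption of this sketch (the slide does not name a policy), and the class name, the dictionary page table, and the access sequence are made up for illustration.

```python
from collections import OrderedDict


class TLB:
    """Fully associative TLB; LRU replacement is assumed for this sketch."""

    def __init__(self, entries=64):
        self.entries = entries
        self.map = OrderedDict()  # VPN (the tag) -> PPN (the data)
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.map:
            self.hits += 1
            self.map.move_to_end(vpn)     # mark as most recently used
            return self.map[vpn]
        self.misses += 1
        ppn = page_table[vpn]             # slow path: go to the page table
        if len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the least recently used entry
        self.map[vpn] = ppn
        return ppn


page_table = {vpn: vpn + 100 for vpn in range(16)}  # made-up mapping
tlb = TLB(entries=4)
for vpn in [0, 1, 2, 0, 1, 3, 4, 0]:
    tlb.lookup(vpn, page_table)
print(tlb.hits, tlb.misses)  # 3 hits, 5 misses
```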
TLB Characteristics
The following are characteristics of TLBs
  TLB size: 32 to 4,096 entries
  Block size: 1 or 2 page table entries (4 or 8 bytes each)
  Hit time: 0.5 to 1 clock cycle
  Miss penalty: 10 to 30 clock cycles (go to page table)
  Miss rate: 0.01% to 0.1%
  Associativity: fully associative or set associative
  Write policy: write back (entries are replaced infrequently)
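These ranges can be combined into an average translation cost using the usual hit time + miss rate x miss penalty formula. The Python sketch below applies that standard formula to the endpoints of the ranges above; the function name is hypothetical.

```python
def average_translation_cycles(hit_time, miss_rate, miss_penalty):
    """Average cycles per translation: hit time plus expected miss cost."""
    return hit_time + miss_rate * miss_penalty


# Endpoints of the ranges listed on the slide (miss rates as fractions).
slow_end = average_translation_cycles(hit_time=1.0, miss_rate=0.001, miss_penalty=30)
fast_end = average_translation_cycles(hit_time=0.5, miss_rate=0.0001, miss_penalty=10)
print(round(slow_end, 3), round(fast_end, 3))
```

Even at the slow end the miss term adds only a few hundredths of a cycle, which is why a small TLB is enough to hide most page table accesses.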
Alpha 21264 Data TLB
128 entries, fully associative
ASN (address space number, like a PID) to avoid flushing the TLB
Also checks protection
Determine Page Size
Factor                 Comments
Favoring a larger page size:
  Page table size      Inversely proportional to page size
  Fast L1 cache hit    L1 cache can be larger
  I/O utilization      Longer burst transfers
  TLB hit rate         Increased TLB coverage
Favoring a smaller page size:
  Storage efficiency   Reduced fragmentation
  I/O efficiency       Avoids unnecessary data transfer
  Process start-up     Small processes are popular
Most commonly used size: 4 KB or 8 KB
Hardware may support a range of page sizes
The OS selects the best one(s) for its purpose
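Two of these trade-offs are easy to quantify. The Python sketch below computes TLB coverage (entries x page size) and the number of entries in a flat page table for a few page sizes; the 128-entry TLB and 32-bit virtual address are assumptions chosen only to keep the numbers small.

```python
def tlb_coverage_bytes(tlb_entries, page_bytes):
    """Memory reachable without a TLB miss."""
    return tlb_entries * page_bytes


def flat_page_table_entries(virtual_address_bits, page_bytes):
    """Entries in a flat page table; page_bytes must be a power of two."""
    offset_bits = page_bytes.bit_length() - 1
    return 1 << (virtual_address_bits - offset_bits)


for page_kb in (4, 8, 64):
    page_bytes = page_kb * 1024
    coverage_kb = tlb_coverage_bytes(128, page_bytes) // 1024
    entries = flat_page_table_entries(32, page_bytes)
    print(f"{page_kb:>2} KB pages: coverage {coverage_kb} KB, {entries} PTEs")
```

Doubling the page size doubles the coverage and halves the number of page table entries, which is the inverse proportionality noted in the table.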
Alpha 21264 TLB Access
Virtually indexed, physically tagged
Physically indexed, physically tagged
Alpha 21264 Virtual Memory
Combining segmentation and paging
  Segmentation: a variable-size range of the memory space, usually defined by a base register and a limit field
  Segmentation assigns meanings to address spaces and reduces the address space that needs paging (reducing page table size)
  Paging is used on the address space of each segment
Three segments in Alpha
  kseg: reserved for the OS kernel, not managed by VM
  seg0: virtual addresses accessible to user processes
  seg1: virtual addresses accessible to the OS kernel
Two Viewpoints of Virtual Memory
Application programs
  See a large, flat memory space
  Assume fast access to every location
  Hardware/OS hide the complexity
OS kernel
  Manages multiple process address spaces
  Reserves direct access to some portions of physical memory
  May access physical memory, its own virtual memory, and the virtual memory of the current process
Hardware facilitates fast VM accesses, and the OS manages slow, less frequent events
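Alpha 21264 Page Table
(Figure labels: 10-bit page table index; 1024 8-byte PTEs per page table page; 13-bit page offset; physical address = 28-bit physical page number + 13-bit page offset)
Page table access on a TLB miss is managed by software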
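A multi-level walk using 10-bit indices and a 13-bit page offset can be sketched in Python as follows. The sketch assumes three levels, as in the Alpha design described in the course textbook, and models each page table page as a dictionary; the function names and the example mapping are hypothetical.

```python
LEVEL_BITS = 10    # each index selects one of 1024 PTEs
OFFSET_BITS = 13   # 8 KB pages
LEVELS = 3         # assumed number of page table levels


def split(vaddr):
    """Return ([level-1, level-2, level-3 indices], page offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    vpn = vaddr >> OFFSET_BITS
    indices = []
    for _ in range(LEVELS):
        indices.append(vpn & ((1 << LEVEL_BITS) - 1))
        vpn >>= LEVEL_BITS
    indices.reverse()             # highest-order index is used first
    return indices, offset


def walk(root, vaddr):
    """Follow one PTE per level; the last level yields the physical page number."""
    indices, offset = split(vaddr)
    node = root
    for idx in indices:
        node = node.get(idx)
        if node is None:
            raise LookupError("page fault: no mapping at this level")
    return (node << OFFSET_BITS) | offset


# Only the page table pages that are actually used need to exist.
root = {1: {2: {3: 0x155}}}       # made-up mapping
vaddr = (1 << 33) | (2 << 23) | (3 << 13) | 0x1ABC
print(hex(walk(root, vaddr)))
```

Because unused subtrees are simply absent, a sparse address space needs far fewer page table pages than a flat table would.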
Memory Protection
Memory protection: preventing unauthorized accesses to process and kernel memory
Memory protection implementation:
  User programs can access memory only through virtual memory
  Each PTE contains protection bits to allow shared but protected accesses
Protection fields in Alpha
  Valid, user read enable, kernel read enable, user write enable, and kernel write enable
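A protection check over these five fields could be sketched in Python as below. Only the field names come from the slide; the bit positions, the `Prot` class, and `access_allowed` are hypothetical.

```python
from enum import IntFlag


class Prot(IntFlag):
    # Bit positions are hypothetical; only the field names come from the slide.
    VALID = 1
    USER_READ = 2
    KERNEL_READ = 4
    USER_WRITE = 8
    KERNEL_WRITE = 16


REQUIRED = {
    ("user", "read"): Prot.USER_READ,
    ("user", "write"): Prot.USER_WRITE,
    ("kernel", "read"): Prot.KERNEL_READ,
    ("kernel", "write"): Prot.KERNEL_WRITE,
}


def access_allowed(pte_flags, mode, kind):
    """Return True if this kind of access is permitted in this mode."""
    if not (pte_flags & Prot.VALID):
        return False
    return bool(pte_flags & REQUIRED[(mode, kind)])


# A page that user code may read but only the kernel may write.
shared = Prot.VALID | Prot.USER_READ | Prot.KERNEL_READ | Prot.KERNEL_WRITE
print(access_allowed(shared, "user", "read"))   # True
print(access_allowed(shared, "user", "write"))  # False
```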
Memory Hierarchy Example: Alpha 21264 in AlphaServer ES40
L1 instruction cache: 2-way, 64 KB, 64-byte block, virtually indexed and tagged
  Uses way prediction and line prediction to allow instruction fetching
Instruction prefetcher: stores four prefetched instructions, accessed before the L2 cache
L1 data cache: 2-way, 64 KB, 64-byte block, virtually indexed, physically tagged, write-through
Victim buffer: 8-entry, checked before L2 access
L2 unified cache: 1-way (direct-mapped), 1 MB to 16 MB, off-chip, write-back
  Allows critical-word transfer to the L1 cache; transfers 16 B per 2.25 ns
TLB: 128-entry, fully associative, one each for instructions and data
ES40: L1 miss penalty 22 ns, L2 miss penalty 130 ns; up to 32 GB memory; 256-bit memory buses (64-bit into the processor)
Read 5.13 for more details