vm and io topics in linux

Post on 29-Jun-2015


DESCRIPTION

Brief introduction to Linux memory management, focus on page reclamation. Swap and IO architecture are also mentioned.

TRANSCRIPT

VM and I/O Topics in Linux

Page Replacement, Swap and I/O

Jiannan Ouyang

Ph.D. Student

Computer Science Department

University of Pittsburgh

05/05/2011

Outline

• Overview of Linux Memory Management

• Page Reclamation

• Swap & I/O

Jiannan Ouyang, CS PhD@PITT 2

Describing Physical Memory

• Node: NUMA memory region

• Zone: memory type

• struct page: page frame

Physical Page Allocation

Binary Buddy Allocator:

• If a block of the desired size is not available, a larger block is split in half, and the two halves are buddies of each other. One half is used for the allocation and the other stays free. Blocks are halved repeatedly until a block of the desired size is available.

• When a block is later freed, its buddy is examined, and the two are coalesced if the buddy is also free.
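The splitting and coalescing rules above come down to simple index arithmetic. The following is a minimal sketch, not the kernel's code: the function names are hypothetical, blocks are identified by the index of their first page, and a block of order k is assumed to span 2^k pages aligned on a 2^k boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order k such that a block of (1 << k) pages covers the request. */
unsigned int order_for_pages(size_t pages)
{
    unsigned int order = 0;

    assert(pages > 0);
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Index of the buddy of the block that starts at page_idx and spans
 * (1 << order) pages. Two buddies differ only in bit `order`, so the
 * buddy is found by flipping that bit. */
size_t buddy_index(size_t page_idx, unsigned int order)
{
    /* The block must be aligned to its own size. */
    assert((page_idx & (((size_t)1 << order) - 1)) == 0);
    return page_idx ^ ((size_t)1 << order);
}

/* Number of halvings needed to carve a block of order `want`
 * out of a free block of order `have`. */
unsigned int splits_needed(unsigned int have, unsigned int want)
{
    assert(have >= want);
    return have - want;
}
```

For example, a request for 3 pages rounds up to order 2 (4 pages), and the order-2 block starting at page 12 has its buddy at page 8; when both are free they coalesce into the order-3 block starting at page 8.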

Page Table Management

• Three Level Mapping
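To make the three-level mapping concrete, the sketch below splits a 32-bit linear address into three table indices and a page offset. The field widths (2 + 9 + 9 + 12 bits) are an assumption chosen for illustration; the slide does not state them, and real widths depend on the architecture. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths, for illustration only: 2 + 9 + 9 + 12 = 32 bits. */
enum { OFFSET_BITS = 12, PTE_BITS = 9, PMD_BITS = 9, PGD_BITS = 2 };

struct va_parts {
    uint32_t pgd;    /* index into the page global directory */
    uint32_t pmd;    /* index into the page middle directory */
    uint32_t pte;    /* index into the page table */
    uint32_t offset; /* byte offset inside the page frame */
};

/* Split a linear address into the indices used at each level of the walk. */
struct va_parts split_linear_address(uint32_t addr)
{
    struct va_parts p;

    assert(OFFSET_BITS + PTE_BITS + PMD_BITS + PGD_BITS == 32);
    p.offset = addr & ((1u << OFFSET_BITS) - 1);
    p.pte = (addr >> OFFSET_BITS) & ((1u << PTE_BITS) - 1);
    p.pmd = (addr >> (OFFSET_BITS + PTE_BITS)) & ((1u << PMD_BITS) - 1);
    p.pgd = (addr >> (OFFSET_BITS + PTE_BITS + PMD_BITS)) & ((1u << PGD_BITS) - 1);
    return p;
}
```

Under these assumed widths, the address 0xC0001234 selects PGD entry 3, PMD entry 0, page table entry 1, and byte 0x234 inside the page.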


Kernel Memory Mapping

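[Figure: kernel memory mapping. A 4-GB virtual address space starting at 0x00000000, with 1 GB of kernel space starting at 0xC0000000, of which 896 MB maps onto the first 896 MB of physical memory (physical range 0x00000000 to 0x3FFFFFFF, 1 GB). Display memory and device memory are also shown.]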
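The labels in the figure suggest the usual 32-bit direct mapping, in which kernel virtual addresses from 0xC0000000 map linearly onto physical addresses from 0. The sketch below assumes that reading; the macro and function names are hypothetical and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET  0xC0000000u            /* start of kernel space, from the slide */
#define LOWMEM_BYTES (896u * 1024u * 1024u) /* size of the direct map, from the slide */

/* Kernel virtual address to physical address; valid only inside the direct map. */
uint32_t virt_to_phys_sketch(uint32_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET && vaddr - PAGE_OFFSET < LOWMEM_BYTES);
    return vaddr - PAGE_OFFSET;
}

/* Physical address to kernel virtual address; valid only for the first 896 MB. */
uint32_t phys_to_virt_sketch(uint32_t paddr)
{
    assert(paddr < LOWMEM_BYTES);
    return paddr + PAGE_OFFSET;
}
```

Because the mapping is a fixed offset, translation in this region needs no page table lookup in software; physical memory above 896 MB falls outside it and has to be mapped another way.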

User Memory Mapping

[Figure: user memory mapping. One process's virtual memory, with kernel space above a 3-GB user space holding text, data, and stack, and the mappings from those regions to physical memory.]

User Memory Mapping

[Figure: user memory mapping for two processes. Each virtual memory has its own user space (text, data, stack) and a kernel space; both are mapped onto the same physical memory.]

Outline

• Overview of Linux Memory Management

• Page Reclamation

• Swap & I/O


Memory Customers

[Figure: memory customers (Kernel Code & Data, User Code & Data, Slab Cache, Page Cache, Icache & Dcache) request pages from the Buddy System, which reclaims pages from them.]

• All memory except “User Code & Data” is used by the kernel

• “User Code & Data” is managed in user space (e.g. with malloc/free); the kernel can only swap out user pages

Slab Cache

• A cache of commonly used objects, kept in an initialized state and available for use by the kernel.

• Saves the time spent repeatedly allocating, initializing, and freeing the same kind of object.
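The idea of keeping freed objects in an initialized state can be shown with a small user-space object cache. This is only an illustration of the principle, not the kernel slab allocator: the types and functions are hypothetical, and malloc stands in for the page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct obj {
    int initialized;       /* set once by the constructor */
    int payload;           /* per-use data */
    struct obj *next_free; /* link used while the object sits in the cache */
};

struct obj_cache {
    struct obj *free_list; /* constructed objects ready for reuse */
    size_t constructed;    /* how many times the constructor actually ran */
};

static void obj_ctor(struct obj *o)
{
    o->initialized = 1;
    o->payload = 0;
    o->next_free = NULL;
}

/* Reuse a cached object if there is one; construct a new one only on a miss. */
struct obj *cache_alloc(struct obj_cache *c)
{
    struct obj *o = c->free_list;

    if (o) {
        c->free_list = o->next_free;
        o->next_free = NULL;
        return o;
    }
    o = malloc(sizeof *o);
    if (!o)
        return NULL;
    obj_ctor(o);
    c->constructed++;
    return o;
}

/* Return the object to its constructed state and keep it for the next caller. */
void cache_free(struct obj_cache *c, struct obj *o)
{
    assert(o && o->initialized);
    o->payload = 0;
    o->next_free = c->free_list;
    c->free_list = o;
}
```

Allocating, freeing, and allocating again returns the same object and runs the constructor only once, which is the saving the slide describes.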

Disk related caches

• Dcache (metadata): dentry objects representing filesystem pathnames.

• Icache (metadata): inode objects representing disk inodes.

• Page Cache (data): data pages from disk; the main disk cache.

Memory Customers Review

[Figure: memory customers (Kernel Code & Data, User Code & Data, Slab Cache, Page Cache, Icache & Dcache) and the Buddy System, with Request and Reclaim arrows.]

Next we look at when the kernel starts reclaiming pages, which pages it reclaims, and the replacement policy.

Reclamation: When?

Zone Watermarks

• Pages Low: when free pages fall to this level, the buddy allocator wakes kswapd to start freeing pages. The default value is twice Pages Min.

• Pages Min: when free pages fall to this level, the allocator does kswapd's work synchronously; this is sometimes called the direct-reclaim path.

• Pages High: once this many pages are free, kswapd goes back to sleep. The default value is three times Pages Min.
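The three watermarks can be read as a small decision function. The sketch below is an assumption-laden illustration: the names are hypothetical, and the exact boundary comparisons (whether "at" a watermark counts as below it) are a choice made here, not something the slide specifies.

```c
#include <assert.h>

enum reclaim_action { RECLAIM_NONE, RECLAIM_WAKE_KSWAPD, RECLAIM_DIRECT };

struct zone_marks {
    unsigned long pages_min;
    unsigned long pages_low;
    unsigned long pages_high;
};

/* Defaults as described on the slide: low = 2 * min, high = 3 * min. */
struct zone_marks marks_from_min(unsigned long pages_min)
{
    struct zone_marks m = { pages_min, 2 * pages_min, 3 * pages_min };
    return m;
}

/* What the allocator does when it sees `free_pages` free pages in the zone. */
enum reclaim_action on_allocation(const struct zone_marks *m, unsigned long free_pages)
{
    assert(m->pages_min <= m->pages_low && m->pages_low <= m->pages_high);
    if (free_pages <= m->pages_min)
        return RECLAIM_DIRECT;      /* allocator reclaims synchronously */
    if (free_pages <= m->pages_low)
        return RECLAIM_WAKE_KSWAPD; /* background reclaim starts */
    return RECLAIM_NONE;
}

/* kswapd keeps freeing pages until the zone is back at pages_high. */
int kswapd_may_sleep(const struct zone_marks *m, unsigned long free_pages)
{
    return free_pages >= m->pages_high;
}
```

With Pages Min set to 100, the zone wakes kswapd at 200 free pages, falls back to direct reclaim at 100, and lets kswapd sleep again at 300.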


Reclamation: Which?

Reclamation: Which? (Cont.)

• Mapped & Anonymous Pages

– Mapped: backed by a file

– Anonymous: belongs to an anonymous memory region of a process

• Shared & Non-shared Pages

– A shared page must be unmapped from all page table entries at once; reverse mapping makes this possible and was an important improvement in the Linux 2.6 kernel

Reclamation: Which? (Cont.)

shrink_caches runs until the target number of pages has been freed, trying in order:

1. slab cache (kmem_cache_reap)

2. User pages & page cache (refill & shrink_cache)

3. dcache and icache

Replacement Policy

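[Figure: pages move between the active list and the inactive list. Each page carries the pair (active, ref) = {11, 10, 01, 00}. An access advances the pair; a scan that finds ref=1 clears it; a page with ref=0 moves from active=1 to active=0 and is then reclaimed.]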

Moving pages across the list

mark_page_accessed():

    on each access, increment the two-bit (active, ref) counter;

    if active becomes 1, move the page from the inactive list to the active list;

refill_inactive_zone():

    if (ref == 1) { ref = 0; move the page to the head of the active list; }

    else { move the page from the active list to the inactive list; }
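The two routines above can be modeled as transitions on the two-bit (active, ref) pair. This is a sketch of the pseudocode on the slide, not the kernel source: the `_sketch` names are hypothetical, and treating the pair as a counter that saturates at 11 is an assumption taken from the slide's wording.

```c
#include <assert.h>

struct page_state {
    unsigned active : 1; /* 1 if the page is on the active list */
    unsigned ref : 1;    /* 1 if the page was referenced since the last scan */
};

/* On each access, step the (active, ref) pair: 00 -> 01 -> 10 -> 11,
 * staying at 11. Returns 1 if this access moved the page from the
 * inactive list to the active list. */
int mark_page_accessed_sketch(struct page_state *p)
{
    unsigned was_active = p->active;
    unsigned v = ((unsigned)p->active << 1) | p->ref;

    if (v < 3)
        v++;
    p->active = (v >> 1) & 1u;
    p->ref = v & 1u;
    return !was_active && p->active;
}

/* Scan one page on the active list. A referenced page gets a second
 * chance (ref cleared, page stays active); an unreferenced page is
 * demoted. Returns 1 if the page moved to the inactive list. */
int refill_inactive_sketch(struct page_state *p)
{
    assert(p->active);
    if (p->ref) {
        p->ref = 0;
        return 0;
    }
    p->active = 0;
    return 1;
}
```

A page therefore needs two accesses to be promoted and two scans without an access to be demoted, which keeps pages touched only once from crowding out the working set.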

Outline

• Overview of Linux Memory Management

• Page Reclamation

• Swap & I/O


Swap

• Swapping lets the kernel reclaim any page frame obtained by a process, not only those that have an image on disk:

– anonymous pages (User stack or heap)

– Dirty pages that belong to a private memory mapping of a process

– IPC shared pages

Swap (Cont.)

• Set up “swap areas” on disk.

• Allocate and free “page slots” in swap areas.

• Provide functions both to “swap out” pages from RAM into a swap area and to “swap in” pages from a swap area into RAM.

• Mark page table entries to keep track of the positions of data in the swap areas.
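The last point, recording a swap position in a page table entry, can be sketched as packing a swap area number and a page slot index into an entry whose present bit is clear. The bit layout below (bit 0 present, 7 bits of area number, 24 bits of slot index) is an assumption for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u /* bit 0: page is in RAM */

/* Pack a swap area number (7 bits) and a page slot index (24 bits)
 * into a non-present entry. Assumed layout, for illustration only. */
uint32_t swp_entry_pack(uint32_t area, uint32_t slot)
{
    assert(area < (1u << 7));
    assert(slot < (1u << 24));
    return (slot << 8) | (area << 1);
}

/* A swapped-out page has a non-zero entry with the present bit clear. */
int pte_is_swapped(uint32_t pte)
{
    return pte != 0 && !(pte & PTE_PRESENT);
}

uint32_t swp_entry_area(uint32_t pte)
{
    return (pte >> 1) & 0x7Fu;
}

uint32_t swp_entry_slot(uint32_t pte)
{
    return pte >> 8;
}
```

On a page fault, an entry that is non-zero but not present tells the fault handler which swap area and slot to read the page back from.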


Example

$ free -m
             total   used   free   shared   buffers   cached
Mem:          2013   1811    201        0       157      872
-/+ buffers/cache:    782   1231
Swap:          397      0    397

while (1) {
    p = malloc(N);
    memset(p, 0, N);  // touching the pages triggers demand paging
}

$ free -m
             total      used     free   shared   buffers   cached
Mem:          2013   1956(+)    56(-)        0      4(-)    109(-)
-/+ buffers/cache:   1842(+)   170(-)
Swap:          397         8      389

Linux I/O Architecture

• How can a layer be bypassed?

• The default file I/O API, e.g. fwrite(), is buffered

• File system: maps (dir, name, offset) -> LBA

• Device file: not a regular file

I/O Bypassing

• Disk Cache

– O_DIRECT

• File System

– Device file

• I/O Scheduler

– To be solved
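Bypassing the disk cache with O_DIRECT could look like the sketch below. It assumes Linux with glibc, where O_DIRECT is exposed by defining _GNU_SOURCE, and it assumes a 4096-byte alignment for the buffer and length; the real alignment requirement depends on the device and filesystem. The helper names are hypothetical.

```c
#define _GNU_SOURCE /* exposes O_DIRECT in <fcntl.h> on Linux with glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0 /* fallback so the sketch compiles elsewhere; I/O is then buffered */
#endif

#define DIO_ALIGN 4096 /* assumed alignment, not a guaranteed requirement */

/* Allocate a zeroed buffer suitably aligned for direct I/O; NULL on failure. */
void *dio_alloc(size_t len)
{
    void *buf = NULL;

    assert(len % DIO_ALIGN == 0);
    if (posix_memalign(&buf, DIO_ALIGN, len) != 0)
        return NULL;
    memset(buf, 0, len);
    return buf;
}

/* Write len bytes at offset 0 of path without going through the page cache.
 * Returns the number of bytes written, or -1 on error (for example when the
 * filesystem does not support O_DIRECT). */
ssize_t dio_write(const char *path, const void *buf, size_t len)
{
    ssize_t n;
    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);

    if (fd < 0)
        return -1;
    n = write(fd, buf, len);
    close(fd);
    return n;
}
```

Opening a device file such as a raw disk in place of a regular file path would bypass the file system layer in the same call, since offsets then address the device directly.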


Thanks Q&A


Reference

• Understanding the Linux Kernel, 3rd Edition

• Understanding the Linux Virtual Memory Manager

BACKUP SLIDES

Page Table Management

• Three Level Mapping

Page Table Management (Cont.)

[Figure: the MMU translates a linear address into a physical address, starting from the PGD address.]