Allocating Memory Ted Baker Andy Wang CIS 4930 / COP 5641

Upload: paul-weaver

Post on 14-Dec-2015



Allocating Memory

Ted Baker Andy Wang

CIS 4930 / COP 5641


Topics

kmalloc and friends
get_free_page and “friends”
vmalloc and “friends”
Memory usage pitfalls


Linux Memory Manager (1)

Page allocator maintains individual pages

[Figure: page allocator]


Linux Memory Manager (2)

Zone allocator allocates memory in power-of-two sizes

[Figure: page allocator, zone allocator]


Linux Memory Manager (3)

Slab allocator groups allocations by sizes to reduce internal memory fragmentation

[Figure: page allocator, zone allocator, slab allocator]


kmalloc

Does not clear the memory
Allocates consecutive virtual/physical memory pages
  Offset by PAGE_OFFSET
  No changes to page tables
Tries its best to fulfill allocation requests
  Large memory allocations can degrade the system performance significantly


The Flags Argument

kmalloc prototype

    #include <linux/slab.h>
    void *kmalloc(size_t size, int flags);

GFP_KERNEL is the most commonly used flag
  Eventually calls __get_free_pages
    The origin of the GFP prefix
  Can put the current process to sleep while waiting for a page in low-memory situations
  Cannot be used in atomic context


The Flags Argument

To obtain more memory, a GFP_KERNEL allocation can
  Flush dirty buffers to disk
  Swap out memory from user processes
GFP_ATOMIC is used in atomic context
  Interrupt handlers, tasklets, and kernel timers
  Does not sleep
  If the memory is used up, the allocation fails
    No flushing and swapping


The Flags Argument

Other flags are available
  Defined in <linux/gfp.h>
GFP_USER allocates user pages; it may sleep
GFP_HIGHUSER allocates high memory user pages
GFP_NOIO disallows I/Os
GFP_NOFS does not allow making FS calls
  Used in file system and virtual memory code
  Prevents kmalloc from making recursive calls


Priority Flags

Used in combination with GFP flags (via ORs)
__GFP_DMA requests allocation to happen in the DMA-capable memory zone
__GFP_HIGHMEM indicates that the memory may be allocated in high memory
__GFP_COLD requests a page not used for some time (to avoid DMA contention)
__GFP_NOWARN disables printk warnings when an allocation cannot be satisfied


Priority Flags

__GFP_HIGH marks a high priority request
  Not for kmalloc
__GFP_REPEAT
  Try harder
__GFP_NOFAIL
  Failure is not an option (strongly discouraged)
__GFP_NORETRY
  Give up immediately if the requested memory is not available


Memory Zones

DMA-capable memory
  Platform dependent
  First 16MB of RAM on the x86 for ISA devices
  PCI devices have no such limit
Normal memory
High memory
  Platform dependent
  > 32-bit addressable range


Memory Zones

If __GFP_DMA is specified
  Allocation will search only the DMA zone
If nothing is specified
  Allocation will search both normal and DMA zones
If __GFP_HIGHMEM is specified
  Allocation will search all three zones
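
The three zone rules above can be written down as a small userspace model. This is only a sketch of the slide's rules, not kernel code: the MODEL_* flag values, the ZONE_BIT_* names, and zones_searched are made up for illustration and are not the kernel's __GFP_* definitions.

```c
#include <assert.h>

/* Hypothetical flag bits for this model only (not the kernel's values). */
enum model_flag {
    MODEL_GFP_DMA     = 1 << 0,
    MODEL_GFP_HIGHMEM = 1 << 1
};

/* Hypothetical bitmask naming the three zones from the slides. */
enum model_zone {
    ZONE_BIT_DMA    = 1 << 0,
    ZONE_BIT_NORMAL = 1 << 1,
    ZONE_BIT_HIGH   = 1 << 2
};

/* Return the set of zones an allocation may search for the given flags. */
unsigned int zones_searched(unsigned int flags)
{
    /* The model treats the two zone flags as mutually exclusive. */
    assert(!((flags & MODEL_GFP_DMA) && (flags & MODEL_GFP_HIGHMEM)));

    if (flags & MODEL_GFP_DMA)
        return ZONE_BIT_DMA;                       /* DMA zone only */
    if (flags & MODEL_GFP_HIGHMEM)
        return ZONE_BIT_DMA | ZONE_BIT_NORMAL | ZONE_BIT_HIGH;
    return ZONE_BIT_DMA | ZONE_BIT_NORMAL;         /* default case */
}
```

The point of the model is that the zone flags widen or narrow the search set: a request without zone flags can still be satisfied from the DMA zone, which is one reason DMA-capable memory can run short.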


The Size Argument

Kernel manages physical memory in pages
  Needs special management to allocate small memory chunks
Linux creates pools of memory objects in predefined fixed sizes (32-byte, 64-byte, 128-byte memory objects)

Smallest allocation unit for kmalloc is 32 or 64 bytes

Largest portable allocation unit is 128KB
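
To see what fixed-size pools mean for a caller, here is a rough userspace sketch. It assumes the pools are powers of two starting at a minimum size, as the 32/64/128-byte examples suggest; the actual set of pool sizes depends on the kernel version and configuration. The function names pool_size_for and wasted_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the pool that would serve it, assuming pools are
 * powers of two starting at min_size. Very large requests could overflow
 * size_t; the model does not guard against that. */
size_t pool_size_for(size_t request, size_t min_size)
{
    size_t pool = min_size;

    assert(min_size > 0);
    while (pool < request)
        pool *= 2;
    return pool;
}

/* Bytes left unused inside the chosen object. */
size_t wasted_bytes(size_t request, size_t min_size)
{
    return pool_size_for(request, min_size) - request;
}
```

Under these assumptions a 100-byte request is served from the 128-byte pool and leaves 28 bytes unused, so asking for slightly more than a power of two is the most wasteful case.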


Lookaside Caches (Slab Allocator)

Nothing to do with TLB or hardware caching
Useful for USB and SCSI drivers
  Improved performance
To create a cache for a tailored size

    #include <linux/slab.h>
    kmem_cache_t *kmem_cache_create(const char *name, size_t size,
                                    size_t offset, unsigned long flags,
                                    void (*constructor) (void *, kmem_cache_t *,
                                                         unsigned long flags));


Lookaside Caches (Slab Allocator)

name: memory cache identifier
  Allocated string without blanks
size: allocation unit
offset: starting offset in a page to align memory
  Most likely 0


Lookaside Caches (Slab Allocator)

flags: control how the allocation is done
  SLAB_HWCACHE_ALIGN
    Requires each data object to be aligned to a cache line
    Good option for frequently accessed objects on SMP machines
    Potential fragmentation problems
  SLAB_CACHE_DMA
    Requires each object to be allocated in the DMA zone
  See linux/slab.h for other flags


Lookaside Caches (Slab Allocator)

constructor: initialize newly allocated objects

Constructor may not sleep due to atomic context


Lookaside Caches (Slab Allocator)

To allocate a memory object from the memory cache, call

    void *kmem_cache_alloc(kmem_cache_t *cache, int flags);

  cache: the cache created previously
  flags: same flags as for kmalloc
  Failure rate is rather high
    Must check the return value
To free a memory object, call

    void kmem_cache_free(kmem_cache_t *cache, const void *obj);


Lookaside Caches (Slab Allocator)

To free a memory cache, call

    int kmem_cache_destroy(kmem_cache_t *cache);

  Need to check the return value
    Failure indicates memory leak

Slab statistics are kept in /proc/slabinfo


A scull Based on the Slab Caches: scullc

Declare slab cache

    kmem_cache_t *scullc_cache;

Create a slab cache in the init function

    /* no constructor */
    scullc_cache = kmem_cache_create("scullc", scullc_quantum, 0,
                                     SLAB_HWCACHE_ALIGN, NULL);
    if (!scullc_cache) {
        scullc_cleanup();
        return -ENOMEM;
    }


A scull Based on the Slab Caches: scullc

To allocate memory quanta

    if (!dptr->data[s_pos]) {
        dptr->data[s_pos] = kmem_cache_alloc(scullc_cache, GFP_KERNEL);
        if (!dptr->data[s_pos])
            goto nomem;
        memset(dptr->data[s_pos], 0, scullc_quantum);
    }

To release memory

    for (i = 0; i < qset; i++) {
        if (dptr->data[i]) {
            kmem_cache_free(scullc_cache, dptr->data[i]);
        }
    }


A scull Based on the Slab Caches: scullc

To destroy the memory cache at module unload time

    /* scullc_cleanup: release the cache of our quanta */
    if (scullc_cache) {
        kmem_cache_destroy(scullc_cache);
    }


Memory Pools

Similar to memory cache
Reserve a pool of memory to guarantee the success of memory allocations
  Can be wasteful
To create a memory pool, call

    #include <linux/mempool.h>
    mempool_t *mempool_create(int min_nr,
                              mempool_alloc_t *alloc_fn,
                              mempool_free_t *free_fn,
                              void *pool_data);


Memory Pools

min_nr is the minimum number of allocation objects

alloc_fn and free_fn are the allocation and freeing functions

    typedef void *(mempool_alloc_t)(int gfp_mask, void *pool_data);
    typedef void (mempool_free_t)(void *element, void *pool_data);

pool_data is passed to the allocation and freeing functions


Memory Pools

To allow the slab allocator to handle allocation and deallocation, use predefined functions

    cache = kmem_cache_create(...);
    pool = mempool_create(MY_POOL_MINIMUM, mempool_alloc_slab,
                          mempool_free_slab, cache);

To allocate and deallocate a memory pool object, call

void *mempool_alloc(mempool_t *pool, int gfp_mask);

void mempool_free(void *element, mempool_t *pool);


Memory Pools

To resize the memory pool, call

    int mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask);

To deallocate the memory pool, call

    void mempool_destroy(mempool_t *pool);
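
The reserve idea behind memory pools can be illustrated with a userspace model. This is a sketch under stated assumptions, not the kernel's mempool implementation: all model_* names are hypothetical, malloc stands in for alloc_fn, and the model returns NULL when the reserve is empty, whereas the real code has locking and may wait instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Set to 1 to simulate memory pressure: the backing allocator then fails. */
int model_backend_fail = 0;

struct model_pool {
    int min_nr;        /* number of elements kept in reserve */
    int curr_nr;       /* elements currently sitting in the reserve */
    size_t elem_size;
    void **reserve;
};

static void *backend_alloc(size_t size)
{
    return model_backend_fail ? NULL : malloc(size);
}

void model_pool_destroy(struct model_pool *pool)
{
    while (pool->curr_nr > 0)
        free(pool->reserve[--pool->curr_nr]);
    free(pool->reserve);
    free(pool);
}

struct model_pool *model_pool_create(int min_nr, size_t elem_size)
{
    struct model_pool *pool;

    assert(min_nr > 0 && elem_size > 0);
    pool = malloc(sizeof(*pool));
    if (!pool)
        return NULL;
    pool->min_nr = min_nr;
    pool->curr_nr = 0;
    pool->elem_size = elem_size;
    pool->reserve = malloc((size_t)min_nr * sizeof(void *));
    if (!pool->reserve) {
        free(pool);
        return NULL;
    }
    /* Preallocate the reserve up front; this is where the waste comes from. */
    while (pool->curr_nr < min_nr) {
        void *elem = backend_alloc(elem_size);
        if (!elem) {
            model_pool_destroy(pool);
            return NULL;
        }
        pool->reserve[pool->curr_nr++] = elem;
    }
    return pool;
}

void *model_pool_alloc(struct model_pool *pool)
{
    void *elem = backend_alloc(pool->elem_size);

    if (elem)
        return elem;                       /* normal path */
    if (pool->curr_nr > 0)
        return pool->reserve[--pool->curr_nr];  /* fall back to the reserve */
    return NULL;                           /* the model fails; it does not sleep */
}

void model_pool_free(struct model_pool *pool, void *elem)
{
    if (pool->curr_nr < pool->min_nr)
        pool->reserve[pool->curr_nr++] = elem;  /* refill the reserve first */
    else
        free(elem);
}
```

In this model the reserve is touched only when the backing allocator fails, and freed elements refill it before anything is returned to the system. That is the trade-off noted on the slides: min_nr elements stay allocated even when nobody needs them.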


get_free_page and Friends

For allocating big chunks of memory, it is more efficient to use a page-oriented allocator

To allocate pages, call

    /* returns a pointer to a zeroed page */
    get_zeroed_page(unsigned int flags);

    /* does not clear the page */
    __get_free_page(unsigned int flags);

    /* allocates multiple physically contiguous pages */
    __get_free_pages(unsigned int flags, unsigned int order);


get_free_page and Friends

flags
  Same as flags for kmalloc
order
  Allocate 2^order pages
    order = 0 for 1 page
    order = 3 for 8 pages
  Can use get_order(size) to find out order
  Maximum allowed value is about 10 or 11
  See /proc/buddyinfo statistics
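
The relation between a byte size and its order can be checked with a small userspace sketch. It is not the kernel's get_order; order_for_size is a hypothetical helper that takes the page size as a parameter, and it returns the smallest order whose 2^order pages cover the request.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= size. */
unsigned int order_for_size(size_t size, size_t page_size)
{
    unsigned int order = 0;

    assert(size > 0 && page_size > 0);
    while ((page_size << order) < size)
        order++;
    return order;
}
```

With 4096-byte pages, one page maps to order 0, eight pages to order 3, and a request one byte over a page already needs order 1, that is, two full pages.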


get_free_page and Friends

Subject to the same rules as kmalloc
To free pages, call

    void free_page(unsigned long addr);
    void free_pages(unsigned long addr, unsigned long order);

Make sure to free the same number of pages
  Or the memory map becomes corrupted


A scull Using Whole Pages: scullp

Memory allocation

    if (!dptr->data[s_pos]) {
        dptr->data[s_pos] =
            (void *) __get_free_pages(GFP_KERNEL, dptr->order);
        if (!dptr->data[s_pos])
            goto nomem;
        memset(dptr->data[s_pos], 0, PAGE_SIZE << dptr->order);
    }

Memory deallocation

    for (i = 0; i < qset; i++) {
        if (dptr->data[i]) {
            free_pages((unsigned long) (dptr->data[i]), dptr->order);
        }
    }


The alloc_pages Interface

Core Linux page allocator function

    struct page *alloc_pages_node(int nid, unsigned int flags,
                                  unsigned int order);

  nid: NUMA node ID
Two higher level macros

    struct page *alloc_pages(unsigned int flags, unsigned int order);
    struct page *alloc_page(unsigned int flags);

  Allocate memory on the current NUMA node


The alloc_pages Interface

To release pages, call

void __free_page(struct page *page);

void __free_pages(struct page *page, unsigned int order);

/* optimized calls for cache-resident or non-cache-resident pages */

void free_hot_page(struct page *page);

void free_cold_page(struct page *page);


vmalloc and Friends

Allocates a virtually contiguous memory region
  Not consecutive pages in physical memory
    Each page retrieved with a separate alloc_page call
  Less efficient
Can sleep (cannot be used in atomic context)
Returns NULL (0) on error, or a pointer to the allocated memory
Its use is discouraged


vmalloc and Friends

vmalloc-related prototypes

    #include <linux/vmalloc.h>
    void *vmalloc(unsigned long size);
    void vfree(void *addr);

    #include <asm/io.h>
    void *ioremap(unsigned long offset, unsigned long size);
    void iounmap(void *addr);


vmalloc and Friends

Each allocation via vmalloc involves setting up and modifying page tables
  Returned addresses lie between VMALLOC_START and VMALLOC_END (defined in <asm/pgtable_32_types.h>)

Used for allocating memory for a large sequential buffer


vmalloc and Friends

ioremap builds page tables
  Does not allocate memory
  Takes a physical address (offset) and returns a virtual address
    Useful to map the address of a PCI buffer to kernel space
  Should use readb and other functions to access remapped memory


A scull Using Virtual Addresses: scullv

This module allocates 16 pages at a time
To obtain new memory

    if (!dptr->data[s_pos]) {
        dptr->data[s_pos] = (void *) vmalloc(PAGE_SIZE << dptr->order);
        if (!dptr->data[s_pos]) {
            goto nomem;
        }
        memset(dptr->data[s_pos], 0, PAGE_SIZE << dptr->order);
    }


A scull Using Virtual Addresses: scullv

To release memory

    for (i = 0; i < qset; i++) {
        if (dptr->data[i]) {
            vfree(dptr->data[i]);
        }
    }


Per-CPU Variables

Each CPU gets its own copy of a variable
  Almost no locking for each CPU to work with its own copy
  Better performance for frequent updates
Example: networking subsystem
  Each CPU counts the number of processed packets by type
  When user space requests to see the value, just add up each CPU’s version and return the total
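
The packet-counting example can be modeled in userspace as one counter row per CPU, summed on demand. This is a single-threaded sketch of the bookkeeping only: the MODEL_* constants and both function names are hypothetical, and the caller passes the CPU number explicitly instead of the kernel determining it.

```c
#include <assert.h>

#define MODEL_NR_CPUS  4   /* assumed CPU count for the model */
#define MODEL_NR_TYPES 3   /* assumed number of packet types */

/* One row of counters per CPU; each CPU only ever updates its own row. */
static unsigned long packet_count[MODEL_NR_CPUS][MODEL_NR_TYPES];

/* Fast path: bump the local CPU's counter; no shared counter is touched. */
void count_packet(int cpu, int type)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    assert(type >= 0 && type < MODEL_NR_TYPES);
    packet_count[cpu][type]++;
}

/* Slow path: add up every CPU's copy when a total is requested. */
unsigned long total_packets(int type)
{
    unsigned long total = 0;
    int cpu;

    assert(type >= 0 && type < MODEL_NR_TYPES);
    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        total += packet_count[cpu][type];
    return total;
}
```

The design puts the cost on the rare reader: updates stay local to one CPU, and only the occasional request for a total has to walk every copy.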


Per-CPU Variables

To create a per-CPU variable

    #include <linux/percpu.h>
    DEFINE_PER_CPU(type, name);

  If name is an array, include the dimension in the type

    DEFINE_PER_CPU(int[3], my_percpu_array);

  Declares a per-CPU array of three integers
To access a per-CPU variable
  Need to prevent process migration

    get_cpu_var(name); /* disables preemption */
    put_cpu_var(name); /* enables preemption */


Per-CPU Variables

To access another CPU’s copy of the variable, call

    per_cpu(name, int cpu_id);

To dynamically allocate and release per-CPU variables, call

    void *alloc_percpu(type);
    void *__alloc_percpu(size_t size);
    void free_percpu(const void *data);


Per-CPU Variables

To access dynamically allocated per-CPU variables, call

    per_cpu_ptr(void *per_cpu_var, int cpu_id);

To ensure that a process cannot be moved off its processor, call get_cpu (returns the CPU ID) to block preemption

    int cpu;

    cpu = get_cpu();
    ptr = per_cpu_ptr(per_cpu_var, cpu);
    /* work with ptr */
    put_cpu();


Per-CPU Variables

To export per-CPU variables, call

    EXPORT_PER_CPU_SYMBOL(per_cpu_var);
    EXPORT_PER_CPU_SYMBOL_GPL(per_cpu_var);

To access an exported variable, call

    /* instead of DEFINE_PER_CPU() */
    DECLARE_PER_CPU(type, name);

More examples in <linux/percpu_counter.h>


Obtaining Large Buffers

First, consider the alternatives
  Optimize the data representation
  Export the feature to the user space
  Use scatter-gather mappings
Allocate at boot time


Acquiring a Dedicated Buffer at Boot Time

Advantages
  Least prone to failure
  Bypass all memory management policies
Disadvantages
  Inelegant and inflexible
  Not a feasible option for the average user
    Available only for code linked to the kernel
    Need to rebuild and reboot the computer to install or replace a device driver


Acquiring a Dedicated Buffer at Boot Time

To allocate, call one of these functions

    #include <linux/bootmem.h>
    void *alloc_bootmem(unsigned long size);

    /* need low memory for DMA */
    void *alloc_bootmem_low(unsigned long size);

    /* allocated in whole pages */
    void *alloc_bootmem_pages(unsigned long size);
    void *alloc_bootmem_low_pages(unsigned long size);


Acquiring a Dedicated Buffer at Boot Time

To free, call

    void free_bootmem(unsigned long addr, unsigned long size);

Need to link your driver into the kernel
  See Documentation/kbuild


Memory Usage Pitfalls

Failure to handle failed memory allocation
  Checking is needed for every allocation
Allocating too much memory
  No built-in limit on memory usage