cmpsci 377 memory management 3877

Upload: chandrakottapalli

Post on 06-Apr-2018

  • 8/3/2019 Cmpsci 377 Memory Management 3877

    UNIVERSITY OF MASSACHUSETTS AMHERST Department of Computer Science

    Emery Berger

    University of Massachusetts Amherst

    Operating Systems

    CMPSCI 377

    Dynamic Memory Management

    Dynamic Memory Management

    How the heap manager is implemented
      malloc, free
      new, delete

    Memory Management

    Ideal memory manager:
      Fast
        Raw time, asymptotic runtime, locality
      Memory efficient
        Low fragmentation
    With multicore & multiprocessors:
      Scalable to multiple processors
    New issues:
      Secure from attack
      Reliable in the face of errors

    Memory Manager Functions

    Not just malloc/free
    realloc
      Change size of object, copying old contents
      ptr = realloc (ptr, 10);
      But: realloc (ptr, 0) = ?
      How about: realloc (NULL, 16) ?
    Other fun
      calloc
      memalign
        Needs ability to locate size & object start
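
    The realloc and calloc behavior listed above can be checked with a small sketch. The helper names below are hypothetical and exist only for illustration. realloc(NULL, n) is specified to act like malloc(n); realloc(ptr, 0) is left out on purpose, because its result has differed between implementations and standard revisions.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Grow a 4-byte buffer to 64 bytes and check that the old contents were copied.
bool realloc_keeps_contents() {
    char* p = static_cast<char*>(std::malloc(4));
    if (p == nullptr) return false;
    std::memcpy(p, "abc", 4);                        // three characters plus '\0'
    char* q = static_cast<char*>(std::realloc(p, 64));
    if (q == nullptr) {                              // on failure the old block is still valid
        std::free(p);
        return false;
    }
    bool same = std::strcmp(q, "abc") == 0;
    std::free(q);
    return same;
}

// realloc(NULL, n) is specified to behave like malloc(n).
bool realloc_null_allocates() {
    void* p = std::realloc(nullptr, 16);
    bool ok = p != nullptr;
    std::free(p);
    return ok;
}

// calloc returns zero-filled storage for count * size bytes.
bool calloc_is_zeroed() {
    unsigned char* p = static_cast<unsigned char*>(std::calloc(8, sizeof(unsigned char)));
    if (p == nullptr) return false;
    bool zero = true;
    for (int i = 0; i < 8; ++i) zero = zero && p[i] == 0;
    std::free(p);
    return zero;
}
```

    Note that the result of realloc is stored in a second pointer, so the original block can still be freed if the call fails.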

    Fragmentation

    Intuitively, fragmentation stems from breaking up the heap into unusable spaces
      More fragmentation = worse utilization of memory
    External fragmentation
      Wasted space outside allocated objects
    Internal fragmentation
      Wasted space inside an object

    Classical Algorithms

    First-fit: find the first chunk large enough for the desired size

    Best-fit: find the chunk that fits best
      Minimizes wasted space

    Worst-fit: find the chunk that fits worst, then split the object
    Reclaim space: coalesce adjacent free objects into one big object
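
    The three placement policies differ only in which sufficiently large free chunk they pick. A minimal sketch, assuming the free chunks are given as a plain vector of sizes (a simplification for illustration; real allocators walk a free list and then split the chosen chunk):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each policy returns the index of the chosen free chunk, or -1 if none is big enough.

// First-fit: stop at the first chunk that can hold the request.
int first_fit(const std::vector<std::size_t>& free_sizes, std::size_t request) {
    for (std::size_t i = 0; i < free_sizes.size(); ++i)
        if (free_sizes[i] >= request) return static_cast<int>(i);
    return -1;
}

// Best-fit: smallest chunk that can hold the request (least leftover space).
int best_fit(const std::vector<std::size_t>& free_sizes, std::size_t request) {
    int best = -1;
    for (std::size_t i = 0; i < free_sizes.size(); ++i)
        if (free_sizes[i] >= request &&
            (best < 0 || free_sizes[i] < free_sizes[static_cast<std::size_t>(best)]))
            best = static_cast<int>(i);
    return best;
}

// Worst-fit: largest chunk that can hold the request (most leftover space).
int worst_fit(const std::vector<std::size_t>& free_sizes, std::size_t request) {
    int worst = -1;
    for (std::size_t i = 0; i < free_sizes.size(); ++i)
        if (free_sizes[i] >= request &&
            (worst < 0 || free_sizes[i] > free_sizes[static_cast<std::size_t>(worst)]))
            worst = static_cast<int>(i);
    return worst;
}
```

    For free chunks of 32, 8, 64 and 16 bytes and a 10-byte request, first-fit picks the 32-byte chunk, best-fit the 16-byte chunk, and worst-fit the 64-byte chunk.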

    Implementation Techniques

    Freelists
      Linked lists of objects in same size class
        Range of object sizes
      First-fit, best-fit in this context

    Segregated size classes
      Use free lists, but never coalesce or split
      Choice of size classes
        Exact
        Powers-of-two
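
    A minimal sketch of segregated power-of-two size classes, assuming eight classes from 16 to 2048 bytes and a caller that passes the size back on free (both are assumptions made to keep the example short). Freed objects go onto their class's free list and are never split or coalesced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

constexpr int kNumClasses = 8;            // hypothetical: 16, 32, ..., 2048 bytes
constexpr std::size_t kMinSize = 16;      // hypothetical smallest class

struct FreeObject { FreeObject* next; };
static FreeObject* free_lists[kNumClasses] = {};

// Size in bytes of every object in class c.
std::size_t class_bytes(int c) { return kMinSize << c; }

// Smallest class whose objects can hold `size` bytes, or -1 if too large.
int class_for(std::size_t size) {
    for (int c = 0; c < kNumClasses; ++c)
        if (size <= class_bytes(c)) return c;
    return -1;
}

void* seg_alloc(std::size_t size) {
    int c = class_for(size);
    if (c < 0) return nullptr;             // large objects are not handled in this sketch
    if (free_lists[c] != nullptr) {        // reuse a freed object of the same class
        FreeObject* object = free_lists[c];
        free_lists[c] = object->next;
        return object;
    }
    return std::malloc(class_bytes(c));    // otherwise get fresh memory of the class size
}

void seg_free(void* p, std::size_t size) {
    int c = class_for(size);
    if (p == nullptr || c < 0) return;
    // Push onto the class's free list; no splitting, no coalescing.
    free_lists[c] = ::new (p) FreeObject{free_lists[c]};
}
```

    A 17-byte request lands in the 32-byte class, so 15 bytes are lost to internal fragmentation; that is the price of powers-of-two classes compared with exact ones.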

    Big Bag of Pages (BiBOP)
      Page or pages (multiples of 4K)
        Usually segregated size classes
      Header contains metadata
        Locate with bitmasking
      Limits external fragmentation
      Can be very fast
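
    "Locate with bitmasking" can be sketched as follows: if every page starts at a 4K-aligned address and stores its metadata at the start, clearing the low 12 bits of any object address yields the header. The PageHeader layout, the 16-byte header reservation, and the helper names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

constexpr std::size_t kPageSize = 4096;     // 4K pages, as on the slide
constexpr std::size_t kHeaderBytes = 16;    // hypothetical space reserved for the header

struct PageHeader {
    std::size_t object_size;                // every object on this page has this size
};

// One page-aligned page of backing store for the example.
alignas(kPageSize) static unsigned char demo_page[kPageSize];

// Stamp the metadata at the start of the page.
PageHeader* init_page(unsigned char* page, std::size_t object_size) {
    return ::new (page) PageHeader{object_size};
}

// Address of the n-th object on the page.
void* nth_object(unsigned char* page, std::size_t n) {
    const PageHeader* header = reinterpret_cast<const PageHeader*>(page);
    return page + kHeaderBytes + n * header->object_size;
}

// Mask off the low bits of the address to find the start of the page.
PageHeader* header_of(void* object) {
    auto addr = reinterpret_cast<std::uintptr_t>(object);
    auto page_start = addr & ~(static_cast<std::uintptr_t>(kPageSize) - 1);
    return reinterpret_cast<PageHeader*>(page_start);
}

// Size lookup needs no per-object header, only the page header.
std::size_t object_size_of(void* object) {
    return header_of(object)->object_size;
}
```

    Because the size lives in the page header, objects carry no individual headers, which is one reason this layout can be both compact and fast.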

    Runtime Analysis

    Key components
      Cost of malloc (best, worst, average)
      Cost of free
      Cost of size lookup (for realloc & free)
    Examine for first-fit, best-fit, segregated (with BiBOP)

    Space Bounds

    Fragmentation
      Worst-case for optimal: O(log M/m)
        M = largest object size
        m = smallest object size
      Best-fit = O(M * m) !
    Goal: perform well for typical programs
    Considerations:
      Internal fragmentation
      External fragmentation
      Headers (metadata)

    Performance Issues

    We'll talk about scalability later
      Reliability, too
    But: general-purpose allocators are often seen as too slow

    Custom Memory Allocation

    Programmers replace new/delete, bypassing the system allocator
      Reduce runtime: often
      Expand functionality: sometimes
      Reduce space: rarely
    Very common practice
      Apache, gcc, lcc, STL, database servers
      Language-level support in C++
    Widely recommended: "Use custom allocators"

    Drawbacks of Custom Allocators

    Avoiding system allocator:
      More code to maintain & debug
      Can't use memory debuggers
      Not modular or robust:
        Mixing memory from custom and general-purpose allocators leads to a crash!
    Increased burden on programmers

    Are custom allocators really a win?

    (I) Per-Class Allocators

    Recycle freed objects from a free list

    (diagram: Class1 free list holding a, b, c)

    a = new Class1;
    b = new Class1;
    c = new Class1;
    delete a;
    delete b;
    delete c;
    a = new Class1;
    b = new Class1;
    c = new Class1;

    + Fast
      + Linked list operations
    + Simple
      + Identical semantics
      + C++ language support
    - Possibly space-inefficient
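
    A minimal sketch of a per-class allocator, using the C++ language support mentioned above (class-level operator new and operator delete). The payload member and the recycles_freed_objects helper are hypothetical; the sketch is single-threaded and never returns memory to the system, which is where the space inefficiency can come from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

class Class1 {
private:
    struct FreeNode { FreeNode* next; };
    static FreeNode* free_list_;            // freed Class1 objects, most recent first

public:
    int payload[4] = {};                    // hypothetical object contents

    // Pop a recycled object if one is available, else fall back to the system allocator.
    static void* operator new(std::size_t size) {
        if (size == sizeof(Class1) && free_list_ != nullptr) {
            FreeNode* node = free_list_;
            free_list_ = node->next;
            return node;
        }
        return ::operator new(size);
    }

    // Push the freed object onto the free list instead of releasing it.
    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        free_list_ = ::new (p) FreeNode{free_list_};
    }
};

Class1::FreeNode* Class1::free_list_ = nullptr;

// Usage: objects deleted earlier are handed back by later `new` calls.
bool recycles_freed_objects() {
    Class1* a = new Class1;
    Class1* b = new Class1;
    auto addr_a = reinterpret_cast<std::uintptr_t>(a);
    auto addr_b = reinterpret_cast<std::uintptr_t>(b);
    delete a;
    delete b;                               // free list is now b -> a
    Class1* c = new Class1;                 // reuses b's storage
    Class1* d = new Class1;                 // reuses a's storage
    bool reused = reinterpret_cast<std::uintptr_t>(c) == addr_b &&
                  reinterpret_cast<std::uintptr_t>(d) == addr_a;
    delete c;
    delete d;
    return reused;
}
```

    Callers keep writing plain new and delete, which is what "identical semantics" refers to; only the storage source changes.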

    (II) Custom Patterns

    Tailor-made to fit allocation patterns
      Example: 197.parser (natural language parser)

    char[MEMORY_LIMIT]

    a = xalloc(8);
    b = xalloc(16);
    c = xalloc(8);
    xfree(b);
    xfree(c);
    d = xalloc(8);

    (diagram: a, b, c, d laid out in the array, with end_of_array marking the current end after each step)

    + Fast
      + Pointer-bumping allocation
    - Brittle
      - Fixed memory size
      - Requires stack-like lifetimes
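
    The xalloc/xfree sequence above can be reproduced with a pointer-bumping sketch. This is an assumption about how such an allocator might work, not the actual 197.parser code: each block gets a small hypothetical header, xfree only marks a block as freed, and end_of_array moves back once the blocks at the top of the array are all free.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

constexpr std::size_t MEMORY_LIMIT = 1024;                  // hypothetical fixed arena size
constexpr std::size_t kAlign = alignof(std::max_align_t);

constexpr std::size_t align_up(std::size_t n) {
    return (n + kAlign - 1) / kAlign * kAlign;
}

struct BlockHeader {
    BlockHeader* prev;      // block allocated just before this one
    bool freed;             // set by xfree
};
constexpr std::size_t kHeaderBytes = align_up(sizeof(BlockHeader));

alignas(std::max_align_t) static unsigned char memory[MEMORY_LIMIT];
static std::size_t end_of_array = 0;        // offset of the first unused byte
static BlockHeader* top_block = nullptr;    // most recently allocated live block

// Pointer-bumping allocation: place the block at end_of_array and advance it.
void* xalloc(std::size_t size) {
    if (size > MEMORY_LIMIT) return nullptr;
    std::size_t need = kHeaderBytes + align_up(size);
    if (need > MEMORY_LIMIT - end_of_array) return nullptr;  // fixed memory size
    top_block = ::new (memory + end_of_array) BlockHeader{top_block, false};
    end_of_array += need;
    return reinterpret_cast<unsigned char*>(top_block) + kHeaderBytes;
}

// Mark the block freed, then roll end_of_array back over freed blocks at the top.
void xfree(void* p) {
    if (p == nullptr) return;
    auto* header = reinterpret_cast<BlockHeader*>(
        static_cast<unsigned char*>(p) - kHeaderBytes);
    header->freed = true;
    while (top_block != nullptr && top_block->freed) {
        end_of_array = static_cast<std::size_t>(
            reinterpret_cast<unsigned char*>(top_block) - memory);
        top_block = top_block->prev;
    }
}
```

    In the slide's sequence, freeing b alone reclaims nothing because c still sits above it; freeing c then rolls back past both, so d reuses b's address. Memory is only recovered when lifetimes are stack-like, which is the brittleness noted above.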

    (III) Regions

    Separate areas, deletion only en masse

    regioncreate(r)
    regionmalloc(r, sz)
    regiondelete(r)

    + Fast
      + Pointer-bumping allocation
      + Deletion of chunks
    + Convenient
      + One call frees all memory
    - Risky
      - Dangling references
      - Too much space

    Increasingly popular custom allocator
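
    A minimal region sketch using the three calls named above. The Region and Chunk layouts, the 4096-byte chunk size, and the request cap are assumptions for illustration; only the function names come from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

constexpr std::size_t kAlign = alignof(std::max_align_t);
constexpr std::size_t kChunkBytes = 4096;                    // hypothetical chunk payload
constexpr std::size_t kMaxRequest = std::size_t{1} << 30;    // hypothetical request cap

constexpr std::size_t align_up(std::size_t n) {
    return (n + kAlign - 1) / kAlign * kAlign;
}

struct Chunk {
    Chunk* next;            // previously filled chunk
    std::size_t used;       // bytes handed out from this chunk
    std::size_t capacity;   // payload bytes in this chunk
};

struct Region {
    Chunk* chunks;          // newest chunk first
};

void regioncreate(Region& r) { r.chunks = nullptr; }

// Pointer-bumping allocation inside the newest chunk; grab a new chunk when full.
void* regionmalloc(Region& r, std::size_t sz) {
    if (sz > kMaxRequest) return nullptr;
    sz = align_up(sz);
    if (r.chunks == nullptr || r.chunks->capacity - r.chunks->used < sz) {
        std::size_t capacity = sz > kChunkBytes ? sz : kChunkBytes;
        void* raw = std::malloc(align_up(sizeof(Chunk)) + capacity);
        if (raw == nullptr) return nullptr;
        r.chunks = ::new (raw) Chunk{r.chunks, 0, capacity};
    }
    unsigned char* payload =
        reinterpret_cast<unsigned char*>(r.chunks) + align_up(sizeof(Chunk));
    void* p = payload + r.chunks->used;
    r.chunks->used += sz;
    return p;
}

// One call frees all memory: release every chunk, never individual objects.
void regiondelete(Region& r) {
    while (r.chunks != nullptr) {
        Chunk* next = r.chunks->next;
        std::free(r.chunks);
        r.chunks = next;
    }
}
```

    There is no per-object free, so any pointer into the region dangles after regiondelete, and one long-lived object keeps the whole region's space alive.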

    Custom Allocators Are Faster

    [Figure: Runtime - Custom Allocator Benchmarks. Bar chart of normalized runtime (axis 0 to 1.75), Custom vs. Win32, for 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle, grouped as non-regions and regions.]

    As good as and sometimes much faster than Win32

    Not So Fast

    [Figure: Runtime - Custom Allocator Benchmarks. Bar chart of normalized runtime (axis 0 to 1.75), Custom vs. Win32 vs. DLmalloc, for 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle, grouped as non-regions and regions.]

    DLmalloc: as fast or faster for most benchmarks

    The Lea Allocator (DLmalloc 2.7.0)

    Mature public-domain general-purpose allocator
    Optimized for common allocation patterns
      Per-size quicklists ≈ per-class allocation
    Deferred coalescing (combining adjacent free objects)
      Highly-optimized fastpath
    Space-efficient

    Space Consumption: Mixed Results

    [Figure: Space - Custom Allocator Benchmarks. Bar chart of normalized space (axis 0 to 1.75), Custom vs. DLmalloc, for 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle, grouped as non-regions and regions.]

    Custom Allocators?

    Generally not worth the trouble: use a good general-purpose allocator

    Avoids risky software engineering errors

    Problems with Unsafe Languages

    C, C++: pervasive apps, but the languages are memory-unsafe
    Numerous opportunities for security vulnerabilities, errors
      Double free
      Invalid free
      Uninitialized reads
      Dangling pointers
      Buffer overflows (stack & heap)

    Soundness for Erroneous Programs

    Normally: memory errors ⇒ undefined behavior
    Consider infinite-heap allocator:
      All news fresh; ignore delete
        No dangling pointers, invalid frees, double frees
      Every object infinitely large
        No buffer overflows, data overwrites
    Transparent to correct program
    Erroneous programs sound

    Probabilistic Memory Safety

    Approximate with M-heaps (e.g., M=2)
    DieHard: fully-randomized M-heap
      Increases odds of benign errors
      Probabilistic memory safety
        i.e., P(no error) ≥ n
      Errors independent across heaps
        E(users with no error) ≥ n * |users|
    Efficient implementation?

    Implementation Choices

    Conventional, freelist-based heaps
      Hard to randomize, protect from errors
        Double frees, heap corruption
    What about bitmaps? [Wilson90]
      Catastrophic fragmentation
        Each small object likely to occupy one page

    (diagram: objects scattered one per page)

    Randomized Heap Layout

    Bitmap-based, segregated size classes
      Bit represents one object of given size
        i.e., one bit = 2^(i+3) bytes, etc.
      Prevents fragmentation

    (diagram: metadata bitmaps 00000001 | 1010 | 10 above heap regions with object size = 2^(i+3), 2^(i+4), 2^(i+5))

    Randomized Allocation

    malloc(8):
      compute size class = ceil(log2 sz) - 3
      randomly probe bitmap for zero-bit (free)
    Fast: runtime O(1)
      M=2 ⇒ E[# of probes] ≤ 2

    (diagram: metadata bitmaps 00000001 | 1010 | 10 above heap regions with object size = 2^(i+3), 2^(i+4), 2^(i+5))
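
    The two allocation steps can be sketched as below. This is an illustrative sketch, not DieHard's code: the number of classes, the 64 slots per class, and the names are assumptions. Capping occupancy at half the slots (M = 2) means each random probe finds a free slot with probability at least 1/2, so the expected number of probes is at most 2.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <random>

constexpr int kNumClasses = 12;           // hypothetical: 8 bytes up to 16 KiB
constexpr std::size_t kSlots = 64;        // hypothetical number of objects per class

// Class i holds objects of 2^(i+3) bytes, i.e. class = ceil(log2(sz)) - 3,
// clamped to 0 for requests below 8 bytes. Returns -1 if sz is too large.
int size_class(std::size_t sz) {
    std::size_t object_size = 8;
    for (int cls = 0; cls < kNumClasses; ++cls, object_size <<= 1)
        if (sz <= object_size) return cls;
    return -1;
}

struct ClassBitmap {
    std::bitset<kSlots> allocated;        // one bit per object slot
    std::size_t in_use = 0;
};

// Randomly probe for a zero bit. Returns the slot index, or -1 once the class
// has reached its 1/M occupancy limit (M = 2).
int random_alloc(ClassBitmap& bitmap, std::mt19937& rng) {
    if (bitmap.in_use >= kSlots / 2) return -1;
    std::uniform_int_distribution<std::size_t> pick(0, kSlots - 1);
    for (;;) {
        std::size_t slot = pick(rng);
        if (!bitmap.allocated[slot]) {
            bitmap.allocated[slot] = true;
            ++bitmap.in_use;
            return static_cast<int>(slot);
        }
    }
}
```

    The object address would then be the class's heap base plus slot times object size; that mapping is omitted here.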

    (next animation frame: one randomly probed zero bit has been set, so the first bitmap reads 00010001)

    Randomized Deallocation

    free(ptr):
      Ensure object valid: aligned to right address
      Ensure allocated: bit set
      Resets bit
    Prevents invalid frees, double frees

    (diagram: metadata bitmaps 00010001 | 1010 | 10 above heap regions with object size = 2^(i+3), 2^(i+4), 2^(i+5))
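
    The checks listed for free(ptr) can be sketched for a single size class. The SizeClassHeap layout, the 16-byte object size, and the checked_free name are assumptions for illustration; a rejected free is simply ignored.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kObjectSize = 16;   // hypothetical size class
constexpr std::size_t kSlots = 64;        // hypothetical number of objects

struct SizeClassHeap {
    alignas(16) unsigned char storage[kObjectSize * kSlots];
    std::bitset<kSlots> allocated;        // one bit per object slot
};

// Returns true if the free was accepted, false if it was ignored.
bool checked_free(SizeClassHeap& heap, void* ptr) {
    auto addr = reinterpret_cast<std::uintptr_t>(ptr);
    auto base = reinterpret_cast<std::uintptr_t>(heap.storage);
    if (addr < base || addr >= base + sizeof(heap.storage))
        return false;                     // not in this heap: invalid free
    std::size_t offset = static_cast<std::size_t>(addr - base);
    if (offset % kObjectSize != 0)
        return false;                     // not at an object boundary: invalid free
    std::size_t slot = offset / kObjectSize;
    if (!heap.allocated[slot])
        return false;                     // bit already clear: double free
    heap.allocated[slot] = false;         // reset the bit
    return true;
}
```

    Since the metadata lives in a separate bitmap rather than in headers next to the objects, a bad free cannot corrupt a free list; it just fails one of the checks.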

    (next animation frame: after free(ptr) the bit is reset and the first bitmap reads 00000001 again)

    Randomized Heaps & Reliability

    Objects randomly spread across heap
    Different run = different heap
      Errors across heaps independent

    (diagram: two different random layouts of the same numbered objects in the 2^(i+3) and 2^(i+4) size classes. My Mozilla: malignant overflow. Your Mozilla: benign overflow.)

    DieHard software architecture

    Replication-based fault-tolerance
      Requires randomization: errors independent

    (diagram: input is broadcast to replica1 (seed1), replica2 (seed2), replica3 (seed3), which execute as separate processes; a vote over the replicas produces the output)

    DieHard Results

    Analytical results (pictures!)
      Buffer overflows
      Uninitialized reads
      Dangling pointer errors (the best)
    Empirical results
      Runtime overhead
      Error avoidance
        Injected faults & actual applications

    Analytical Results: Buffer Overflows

    Model overflow as random write of live data
    Heap half full (max occupancy)

    Replicas: increase odds of avoiding overflow in at least one replica

    P(Overflow in all replicas) = (1/2)^3 = 1/8

    P(No overflow in ≥ 1 replica) = 1 - (1/2)^3 = 7/8
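
    The arithmetic above generalizes to any per-replica hit probability and replica count, assuming the replicas fail independently (which is what randomization is meant to provide). The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// p_hit: probability that the overflow lands on live data in one replica
//        (1/2 when the heap is half full).
// replicas: number of independently randomized replicas.
// Returns the probability that at least one replica escapes the overflow.
double p_some_replica_unharmed(double p_hit, int replicas) {
    return 1.0 - std::pow(p_hit, replicas);
}
```

    With p_hit = 0.5 and three replicas this gives 1 - 1/8 = 7/8, matching the slide; with a single replica it is just 1/2.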

    Analytical Results: Buffer Overflows

    F = free space
    H = heap size
    N = # objects' worth of overflow
    k = replicas

    Overflow one object

    Empirical Results: Runtime

    Empirical Results: Error Avoidance

    Injected faults:
      Dangling pointers (@50%, 10 allocations)
        glibc: crashes; DieHard: 9/10 correct
      Overflows (@1%, 4 bytes over)
        glibc: crashes 9/10, inf loop; DieHard: 10/10 correct
    Real faults:
      Avoids Squid web cache overflow
        Crashes BDW & glibc
      Avoids dangling pointer error in Mozilla
        DoS in glibc & Windows

    The End
