Mark and Split
Kostis Sagonas (Uppsala Univ., Sweden; NTUA, Greece)
Jesper Wilhelmsson (Uppsala Univ., Sweden)

Post on 22-Dec-2015



Copying vs. Mark-Sweep

Copying Collection

+ GC time proportional to the size of the live data set

- requires non-negligible additional space

• moves objects
• compacts the heap

Mark-Sweep Collection

- GC time proportional to the size of the collected heap

+ requires relatively little additional space

• non-moving collector
• may require compaction

Mark-Sweep Collection Algorithm

proc mark_sweep_gc()
  foreach root ∈ rootset do
    mark(*root)
  sweep()

proc mark(object)
  if marked(object) = false
    marked(object) := true
    foreach pointer in object do
      mark(*pointer)
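A minimal Python sketch of the pseudocode above, under assumptions that are not in the slides: the heap is modeled as a plain list of objects, the names `Obj`, `heap`, and `roots` are hypothetical, and the recursive mark is written with an explicit stack.

```python
class Obj:
    """Hypothetical heap object: a name, outgoing pointers, and a mark bit."""
    def __init__(self, name):
        self.name = name
        self.pointers = []
        self.marked = False

def mark(obj):
    """Mark everything reachable from obj (iterative form of the recursive mark)."""
    stack = [obj]
    while stack:
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.pointers)

def sweep(heap):
    """Visit every object in the heap, live or dead: this is the O(H) part."""
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False  # reset the mark bit for the next collection
    return live

def mark_sweep_gc(roots, heap):
    for root in roots:
        mark(root)
    return sweep(heap)

a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.pointers.append(b)
b.pointers.append(a)  # cycle between a and b
c.pointers.append(d)  # c and d are unreachable from the root
survivors = mark_sweep_gc(roots=[a], heap=[a, b, c, d])
print([o.name for o in survivors])
```

The sweep loop touches all four objects even though only two are live, which is the cost the rest of the talk tries to avoid.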


Variants of Mark-Sweep

• Lazy sweeping [Hughes 1982; Boehm 2000]
  – Defer the sweep phase until allocation time and then perform it in a demand-driven (“pay-as-you-go”) way
  – Improves paging and/or cache behavior

• Selective sweeping [Chung, Moon, Ebcioğlu, Sahlin]
  – During marking, record the addresses of all marked objects in an array (outside the heap)
  – Once marking is finished, sort these addresses
  – Perform the sweep phase selectively, guided by the sorted addresses
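The three selective-sweeping steps can be sketched in Python as follows. This is an illustration under assumptions, not the cited authors' code: addresses are plain integers, and `selective_sweep`, `marked`, and `size_of` are hypothetical names.

```python
def selective_sweep(heap_start, heap_end, marked, size_of):
    """Return the free (start, end) gaps, visiting only the marked objects.

    marked  -- addresses recorded during marking, in arbitrary order
    size_of -- maps an object's address to its size
    """
    free = []
    cursor = heap_start
    for addr in sorted(marked):          # the sorting step
        if addr > cursor:
            free.append((cursor, addr))  # gap in front of this live object
        cursor = addr + size_of[addr]
    if cursor < heap_end:
        free.append((cursor, heap_end))  # free tail of the heap
    return free

# Four objects in a 128-byte heap; the one at address 16 is dead.
sizes = {0: 16, 16: 32, 64: 16, 96: 32}
free = selective_sweep(0, 128, marked=[96, 0, 64], size_of=sizes)
print(free)
```

The dead object at address 16 is never inspected; its space is recovered as part of the gap between two live neighbors.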


Mark-Split Collection: Idea

Rather than (lazily/selectively) sweeping the heap after marking to locate free areas, maintain information about them during marking.

More specifically, optimistically assume that the entire heap will be free after collection and let the mark phase “repair” the free list by “rescuing” the memory of live objects.

Mark-Split Collection: Illustration

Heap to be collected

One free interval

Marking splits a free interval

Two free intervals

Marking splits another free interval

Three free intervals

Marking does not always increase the number of free intervals!

Three free intervals

Marking can actually decrease the number of free intervals!

Two free intervals

Mark-Split Collection: Algorithm (1)

Mark-sweep, for comparison:

proc mark_sweep_gc()
  foreach root ∈ rootset do
    mark(*root)
  sweep()

proc mark(object)
  if marked(object) = false
    marked(object) := true
    foreach pointer in object do
      mark(*pointer)

Mark-split:

proc mark_split_gc()
  insert_interval(heap_start, heap_end)
  foreach root ∈ rootset do
    mark(*root)

proc mark(object)
  if marked(object) = false
    marked(object) := true
    split(find_interval(&object), object)
    foreach pointer in object do
      mark(*pointer)


Mark-Split Collection: Algorithm (2)

proc split(interval, object)
  objectEnd := &object + size(object)
  keepLeft := keep_interval(&object - interval.start)
  keepRight := keep_interval(interval.end - objectEnd)
  if keepLeft ∧ keepRight
    insert_interval(objectEnd, interval.end)   // Case 1
    interval.end := &object
  else if keepLeft
    interval.end := &object                    // Case 2
  else if keepRight
    interval.start := objectEnd                // Case 3
  else
    remove_interval(interval.end)              // Case 4

funct keep_interval(size)
  return size ≥ T   // T is a threshold
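A Python sketch of the four cases, under assumptions that are not in the slides: free intervals are half-open `[start, end)` pairs kept in a sorted list, the lookup is a linear scan chosen only for readability, and the threshold value `T = 8` is hypothetical.

```python
T = 8  # hypothetical threshold: free pieces smaller than T are not kept

def keep_interval(size):
    return size >= T

def find_interval(intervals, addr):
    """Linear scan for clarity only; a real collector needs a fast lookup."""
    for iv in intervals:
        if iv[0] <= addr < iv[1]:
            return iv
    raise LookupError(f"address {addr} is not inside a free interval")

def split(intervals, iv, obj_start, obj_size):
    obj_end = obj_start + obj_size
    keep_left = keep_interval(obj_start - iv[0])
    keep_right = keep_interval(iv[1] - obj_end)
    if keep_left and keep_right:   # Case 1: object in the middle
        intervals.insert(intervals.index(iv) + 1, [obj_end, iv[1]])
        iv[1] = obj_start
    elif keep_left:                # Case 2: only the left piece survives
        iv[1] = obj_start
    elif keep_right:               # Case 3: only the right piece survives
        iv[0] = obj_end
    else:                          # Case 4: nothing worth keeping
        intervals.remove(iv)

intervals = [[0, 128]]             # optimistic start: the whole heap is free
counts = []
for start, size in [(32, 16), (80, 16), (48, 16), (64, 16)]:
    split(intervals, find_interval(intervals, start), start, size)
    counts.append(len(intervals))
print(intervals, counts)
```

In this run the number of free intervals goes 2, 3, 3, 2: the first two objects hit Case 1, the third hits Case 3 and leaves the count unchanged, and the fourth hits Case 4 and removes an interval.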


Mark-Split Collection: Data Structure

For storing the free intervals we need a data structure that allows for:
  – Fast location of an interval (find_interval)
  – Fast insertion of new intervals (insert_interval)

Data structures with these properties are:
  – Balanced search trees
  – Splay trees
  – Skip lists
  – …

In our implementation we used the AA tree [Andersson 1993]
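Python's standard library has no balanced search tree, so the sketch below stands in with a sorted list and `bisect`. It is not an AA tree: lookup takes O(log I) comparisons as required, but insertion shifts list elements, which a tree would avoid. The class name `IntervalStore` is hypothetical.

```python
import bisect

class IntervalStore:
    """Free intervals [start, end) in two parallel lists sorted by start."""

    def __init__(self):
        self.starts = []
        self.ends = []

    def insert_interval(self, start, end):
        i = bisect.bisect_left(self.starts, start)
        self.starts.insert(i, start)  # O(I) shift in a list; a tree avoids this
        self.ends.insert(i, end)

    def find_interval(self, addr):
        """Binary search for the interval containing addr, or None."""
        i = bisect.bisect_right(self.starts, addr) - 1
        if i >= 0 and addr < self.ends[i]:
            return self.starts[i], self.ends[i]
        return None

store = IntervalStore()
for start, end in [(96, 128), (0, 32), (48, 80)]:
    store.insert_interval(start, end)
print(store.find_interval(50), store.find_interval(40))
```

Address 50 falls in the interval starting at 48; address 40 lies in occupied memory between two free intervals, so the lookup returns `None`.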


Mark-Split Collection: Best Cases

When nothing is live

When marking is consecutive

When live data set is a small percentage of the heap


Mark-Split Collection: Worst Case

Note:
- the number of free intervals is at most #L + 1
- this number will start decreasing once L ≥ H/2
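The note can be checked on a toy heap. The sketch below assumes unit-size objects, a threshold small enough that every free slot is kept, and a hypothetical heap of 9 slots in which every other slot is marked first.

```python
def free_intervals(marked):
    """Count maximal runs of unmarked slots in a mark bitmap."""
    runs, in_run = 0, False
    for m in marked:
        if not m and not in_run:
            runs += 1
        in_run = not m
    return runs

H = 9                                # hypothetical heap of 9 unit-size slots
marked = [False] * H
order = [1, 3, 5, 7, 0, 2, 4, 6, 8]  # every other slot first, then the gaps
history = []
for slot in order:
    marked[slot] = True
    history.append(free_intervals(marked))
print(history)
```

After four marked objects (L < H/2) there are five free intervals, which is #L + 1; every further mark fills a gap and the count only goes down.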


Time Complexity

Copying               O(L)
Mark-sweep            O(L) + O(H)
Selective sweeping    O(L) + O(L log L) + O(L)
Mark-split            O(L log I)

where:
L = size of live data set
H = size of heap
I = number of free intervals

Note:
1. I ≤ L ≤ H
2. I is bounded by
   • #L + 1 if L < H/2
   • H/(2o) if L ≥ H/2
   where o = size of smallest object
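As a worked example, the bound on I can be evaluated directly. The function name, its parameters, and the numbers below are hypothetical; the function simply applies the two cases as stated.

```python
def max_free_intervals(num_live, live_size, heap_size, smallest):
    """Upper bound on I: #L + 1 below half occupancy, H/(2o) otherwise."""
    if live_size < heap_size / 2:
        return num_live + 1
    return heap_size // (2 * smallest)

# Hypothetical heap: H = 1024 bytes, smallest object o = 16 bytes.
sparse = max_free_intervals(num_live=10, live_size=160, heap_size=1024, smallest=16)
dense = max_free_intervals(num_live=40, live_size=640, heap_size=1024, smallest=16)
print(sparse, dense)
```

With 10 live objects covering 160 bytes the bound is 11 intervals; once the live data exceeds half the heap, the bound is capped at 1024 / (2 × 16) = 32 regardless of the object count.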


Space Requirements

                      Best      Worst
Copying               L         H
Mark-sweep            M         M
Selective sweeping    M + #L    M + #H
Mark-split            M + k     M + k(H/2o)

where:
L = size of live data set
H = size of heap
M = size of mark bit area
o = size of smallest object
k = size of interval node


Mark-Split vs. Selective Sweeping

• Mark-coalesce (the dual of mark-split)
  – Maintains information about occupied intervals
  – Can be seen as a variant of selective sweeping that eagerly merges neighboring marked intervals
  – Requires an extra pass at the end of collection to construct the free intervals list

Assume marking is consecutive

• Mark-split requires significantly less auxiliary space than selective sweeping
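A small Python sketch of the mark-coalesce idea, under assumptions that are not in the slides: occupied intervals are half-open `(start, end)` tuples in a sorted list, and the function names are hypothetical.

```python
import bisect

def mark_coalesce(occupied, start, size):
    """Record live object [start, start+size), merging with touching neighbors."""
    end = start + size
    i = bisect.bisect_left(occupied, (start, end))
    if i > 0 and occupied[i - 1][1] == start:        # touches the left neighbor
        i -= 1
        start = occupied.pop(i)[0]
    if i < len(occupied) and occupied[i][0] == end:  # touches the right neighbor
        end = occupied.pop(i)[1]
    occupied.insert(i, (start, end))

def free_list(occupied, heap_start, heap_end):
    """The extra pass: the free intervals are the gaps between occupied ones."""
    free, cursor = [], heap_start
    for start, end in occupied:
        if start > cursor:
            free.append((cursor, start))
        cursor = end
    if cursor < heap_end:
        free.append((cursor, heap_end))
    return free

occupied = []
for start, size in [(32, 16), (80, 16), (48, 16), (64, 16)]:
    mark_coalesce(occupied, start, size)
print(occupied, free_list(occupied, 0, 128))
```

Four adjacent live objects collapse into a single occupied interval, so the final pass over the occupied list is short when live data is clustered.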


Mark-Split vs. Lazy Sweeping

• Lazy sweeping does not affect the complexity of collection

• But often improves the cache performance of applications run with GC because
  – It avoids (some) negative caching effects
    • Sweep phase disturbs the cache
  – Compared with “plain” mark-sweep, it has positive caching effects
    • Memory to allocate to is typically in the cache during object initialization
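The pay-as-you-go behavior can be sketched as an allocator that resumes sweeping where the previous allocation stopped. This is a toy model under assumptions: fixed-size slots, and a hypothetical `LazyHeap` class with a `swept` counter added only to show how much of the heap has been examined.

```python
class LazyHeap:
    """Fixed-size slots; sweeping happens inside allocate(), one slot at a time."""

    def __init__(self, n_slots):
        self.marked = [False] * n_slots
        self.cursor = n_slots  # nothing left to sweep yet
        self.swept = 0         # slots examined so far (for illustration)

    def finish_marking(self, live_slots):
        """End of the mark phase; note that no sweep runs here."""
        self.marked = [False] * len(self.marked)
        for s in live_slots:
            self.marked[s] = True
        self.cursor = 0

    def allocate(self):
        """Sweep forward until a free slot is found; return it, or None."""
        while self.cursor < len(self.marked):
            slot = self.cursor
            self.cursor += 1
            self.swept += 1
            if not self.marked[slot]:
                return slot
        return None            # heap exhausted: time for another collection

heap = LazyHeap(8)
heap.finish_marking(live_slots=[0, 1, 4])
first, second = heap.allocate(), heap.allocate()
print(first, second, heap.swept)
```

After two allocations only four of the eight slots have been examined, and each returned slot was touched immediately before being handed to the mutator.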


Adaptive Schemes

• Basic idea is simple:
  – Optimistically start with mark-split
  – If it is detected that the cost will be too high, revert to mark-sweep

• Criteria for switching:
  – Auxiliary space is exhausted
  – Number of tree nodes visited is too big
  – Keep a record of prior history (last N collections)
  – …

• Note that no single mark-split collection that reverts to mark-sweep can be faster than a mark-sweep only collection, but a sequence of adaptive collections can!
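One of the switching criteria, the number of tree nodes visited, can be sketched as a simple budget check. Everything here is hypothetical scaffolding rather than the JRockit implementation: `split_cost` stands in for the real tree operation and `node_budget` for a tuning parameter.

```python
def adaptive_collect(objects_to_mark, split_cost, node_budget):
    """Return the strategy that finished the collection.

    objects_to_mark -- live objects in marking order
    split_cost      -- hypothetical callback: tree nodes visited for one split
    node_budget     -- hypothetical limit on the total number of nodes visited
    """
    visited = 0
    splitting = True
    for obj in objects_to_mark:
        # Setting the mark bit happens in both modes and is omitted here.
        if splitting:
            visited += split_cost(obj)
            if visited > node_budget:
                splitting = False  # abandon the tree; sweep after marking
    return "mark-split" if splitting else "mark-sweep"

cheap = adaptive_collect(range(10), split_cost=lambda o: 3, node_budget=100)
costly = adaptive_collect(range(10), split_cost=lambda o: 3, node_budget=20)
print(cheap, costly)
```

In the second call the work already spent on splitting is wasted, which is why a single reverted collection cannot beat plain mark-sweep.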


Implementation

• Done in BEA’s JRockit
  – Mark-sweep collector has existed for quite a long time
  – Sweeps the heap by examining whole words of the bitmap array

• Mark-split’s code is about 600 lines of C
  – The threshold T is set at 2KB (because of TLA)

Benchmarking environment:
  – 4 processor Intel Xeon 2GHz with hyper-threading
  – 512KB of cache, 8GB of RAM running Linux
  – SPECjvm98 benchmarks run for 50 iterations

Performance Evaluation on SPECjvm98

[Figures: total GC time (s), mark-sweep vs. mark-split, for compress, jess, db, javac, mtrt, and jack at heap sizes 64M, 128M, 256M, 512M, 1G, and 2G]

Performance Evaluation on SPECjvm98

[Figures: average GC time (ms), mark-sweep vs. mark-split, for compress, jess, db, javac, mtrt, and jack at heap sizes 64M, 128M, 256M, 512M, 1G, and 2G]

SPECjvm98 – GC times on a 128MB heap

[Figure: GC times in percent (axis from 0% to 160%), mark-sweep vs. mark-split, for compress, jess, db, javac, mtrt, and jack]

SPECjvm98 – GC times on a 512MB heap

[Figure: GC times in percent (axis from 0% to 120%) for compress, jess, db, javac, mtrt, and jack]

SPECjvm98 – GC times on a 2GB heap

[Figure: GC times in percent (axis from 0% to 100%) for compress, jess, db, javac, mtrt, and jack]

Other Measurements (on SPECjvm98)

            Nodes            Max Tree      Comparisons
Benchmark   Max     Final    %live set     Total    Avg
compress    267     205      0.08%         56k      7.1
jess        2731    2472     0.84%         270k     10.2
db          207     186      0.04%         45k      6.3
javac       7802    7561     0.32%         456k     12.0
mtrt        509     275      0.05%         1320k    9.1
jack        1953    1928     0.25%         199k     9.6

Performance Evaluation on SPECjbb

[Figure: results in percent (axis from 0% to 180%) for mark-sweep, mark-split, and the adaptive scheme on SPECjbb2000 and SPECjbb2005]

Concluding Remarks on Mark-Split

New non-moving garbage collection algorithm:
  – Based on a simple idea:
    • maintaining free intervals during marking, rather than sweeping the heap to find them
  – Makes GC cost proportional to the size of the live data set, not the size of the heap that is collected
  – Requires very small additional space
  – Exploits the fact that in most programs live data tends to form (large) neighborhoods