Flash research report (Da Zhou, 2009-7-4)


Page 1: Flash research report

Flash research report

Da Zhou

2009-7-4

Page 2: Flash research report

Outline

• Query Processing Techniques for Solid State Drives (Research Paper)

• Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)

• Evaluating and Repairing Write Performance on Flash Devices (DaMoN)

• Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Page 3: Flash research report

Outline

• Query Processing Techniques for Solid State Drives (Research Paper)

• Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)

• Evaluating and Repairing Write Performance on Flash Devices (DaMoN)

• Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Page 4: Flash research report

Query Processing Techniques for Solid State Drives

• Dimitris Tsirogiannis – University of Toronto, Toronto, ON, Canada

• Stavros Harizopoulos, Mehul A. Shah, Janet L. Wiener, Goetz Graefe – HP Labs, Palo Alto, CA, USA

Page 5: Flash research report

Motivation

• Although SSDs may immediately benefit applications that stress random reads, they may not improve database applications, especially those running long data-analysis queries.

• Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks and their algorithms currently emphasize sequential accesses for disk-resident data.

Page 6: Flash research report

Contributions

• Column-based layout: PAX

• FlashScan

• FlashJoin

Page 7: Flash research report

PAX

traditional row-based (NSM) and column-based (PAX) layouts
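The row-based vs. column-based distinction can be made concrete with a small sketch (the function names and in-memory lists are hypothetical; real pages are fixed-size byte arrays):

```python
# Hypothetical in-memory sketch of NSM vs. PAX layouts on one page.
def nsm_page(records):
    """Row store (NSM): whole tuples are stored one after another."""
    return [tuple(r) for r in records]

def pax_page(records, num_attrs):
    """PAX: the same tuples, but each attribute's values are grouped
    into a contiguous 'minipage' within the page."""
    return [[r[a] for r in records] for a in range(num_attrs)]

rows = [(1, "alice", 30), (2, "bob", 25)]
print(nsm_page(rows))     # [(1, 'alice', 30), (2, 'bob', 25)]
print(pax_page(rows, 3))  # [[1, 2], ['alice', 'bob'], [30, 25]]
```

A scan that needs only the third attribute touches just the last minipage under PAX, while under NSM it must read through every whole tuple.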

Page 8: Flash research report

FlashScan

• FlashScan takes advantage of the small transfer unit of SSDs to read only the minipages of the attributes that it needs.

Page 9: Flash research report

FlashScan(Opt)

• FlashScan can improve performance even further by reading only the minipages that contribute to the final result.
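A minimal sketch of this idea over the hypothetical PAX page representation above (a list of per-attribute minipages; the names are illustrative, not the paper's code):

```python
def flash_scan_opt(minipages, pred_attr, pred, out_attrs):
    """Read the predicate attribute's minipage first; then fetch the
    other requested minipages only at qualifying tuple positions."""
    matches = [i for i, v in enumerate(minipages[pred_attr]) if pred(v)]
    return [tuple(minipages[a][i] for a in out_attrs) for i in matches]

# One PAX page with three minipages: id, name, age.
page = [[1, 2, 3], ["alice", "bob", "carol"], [30, 25, 41]]
print(flash_scan_opt(page, 2, lambda age: age >= 30, (0, 1)))
# [(1, 'alice'), (3, 'carol')]
```

If no tuple on a page satisfies the predicate, none of that page's other minipages need to be read at all, which is where the extra savings come from.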

Page 10: Flash research report

FlashScan

Page 11: Flash research report

FlashScan

When applying the predicate on a sorted attribute, however, FlashScanOpt outperforms plain FlashScan for all selectivities below 100%: only a few pages contain the contiguous matching tuples, and all other minipages can be skipped.

Page 12: Flash research report

FlashJoin

The join kernel computes the join and outputs a join index. Each join index tuple consists of the join attributes as well as the row-ids (RIDs) of the participating rows from base relations.

The fetch kernel retrieves the needed attributes using the RIDs specified in the join index.
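The two-kernel structure can be sketched as follows (an illustrative in-memory model, not the paper's implementation; the join kernel here uses a simple hash join):

```python
def join_kernel(r, s, r_key, s_key):
    """Compute the join index: (join value, RID in R, RID in S)."""
    ht = {}
    for rid, tup in enumerate(r):
        ht.setdefault(tup[r_key], []).append(rid)
    return [(s_tup[s_key], r_rid, s_rid)
            for s_rid, s_tup in enumerate(s)
            for r_rid in ht.get(s_tup[s_key], [])]

def fetch_kernel(join_index, r, s, r_attrs, s_attrs):
    """Retrieve only the attributes the query needs, via the RIDs."""
    return [tuple(r[r_rid][a] for a in r_attrs) +
            tuple(s[s_rid][a] for a in s_attrs)
            for _, r_rid, s_rid in join_index]

R = [(1, "alice"), (2, "bob")]          # (id, name)
S = [(10, 2, "book"), (11, 1, "pen")]   # (oid, cust_id, item)
ji = join_kernel(R, S, 0, 1)
print(fetch_kernel(ji, R, S, (1,), (2,)))
# [('bob', 'book'), ('alice', 'pen')]
```

Separating the two steps means the fetch kernel's RID lookups are random reads, which are cheap on an SSD, while the join itself only ever materializes the narrow join index.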

Page 13: Flash research report

Outline

• Query Processing Techniques for Solid State Drives (Research Paper)

• Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)

• Evaluating and Repairing Write Performance on Flash Devices (DaMoN)

• Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Page 14: Flash research report

Join Processing for Flash SSDs: Remembering Past Lessons

• Jaeyoung Do, Jignesh M. Patel – Univ. of Wisconsin-Madison

• His current research interests include energy-efficient data processing, multi-core query processing, methods for searching and mining large graph and sequence/string data sets, and spatial data management.

• Towards Eco-friendly Database Management Systems, Willis Lang, Jignesh M. Patel, CIDR 2009

• Data Morphing: An Adaptive, Cache-Conscious Storage Technique, R. A. Hankins and J. M. Patel, VLDB 2003.

• Effect of Node Size on the Performance of Cache-Conscious B+-trees, R. A. Hankins and J. M. Patel, SIGMETRICS 2003.

Page 15: Flash research report

Motivation

• We must carefully consider the lessons learnt from over three decades of designing and tuning algorithms for magnetic HDD-based systems, so that we can continue to reuse the techniques that worked for magnetic HDDs and also work well with flash SSDs.

Page 16: Flash research report

Four classic ad hoc join algorithms

• Block Nested Loops Join – Block nested loops join first logically splits the smaller relation R into same-size chunks. For each chunk of R that is read, a hash table is built to efficiently find matching pairs of tuples. Then all of S is scanned, and the hash table is probed with its tuples.

• Sort-Merge Join – Sort-merge join starts by producing sorted runs of each of R and S. After R and S are sorted into runs on disk, sort-merge join reads the runs of both relations and merges/joins them.
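The block nested loops join described above can be sketched like this (in-memory lists stand in for disk-resident relations; the names are illustrative):

```python
def block_nested_loops_join(r, s, r_key, s_key, chunk_size):
    """Split the smaller relation R into same-size chunks; for each
    chunk, build a hash table and probe it with every tuple of S."""
    out = []
    for start in range(0, len(r), chunk_size):
        ht = {}
        for r_tup in r[start:start + chunk_size]:
            ht.setdefault(r_tup[r_key], []).append(r_tup)
        for s_tup in s:                 # one full scan of S per chunk
            for r_tup in ht.get(s_tup[s_key], []):
                out.append(r_tup + s_tup)
    return out

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (4, "z")]
print(block_nested_loops_join(R, S, 0, 0, 2))
# [(2, 'b', 2, 'x'), (3, 'c', 3, 'y')]
```

Note the dominant I/O pattern: sequential reads of R and repeated sequential scans of S, with no intermediate writes, which is why this algorithm benefits most from the SSD's fast sequential reads.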

Page 17: Flash research report

Four classic ad hoc join algorithms

• Grace Hash Join – Grace hash join has two phases.

– In the first phase, it hashes the tuples of both relations into buckets.

– In the second phase, the first bucket of R is loaded into the buffer pool, and a hash table is built on it. Then the corresponding bucket of S is read and used to probe the hash table.

• Hybrid Hash Join

– A portion of the buffer pool is reserved for an in-memory hash bucket of R.

– Furthermore, as S is read and hashed, tuples of S matching the in-memory R bucket can be joined immediately, and need not be written to disk.
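The two phases of Grace hash join can be sketched as follows (buckets are in-memory lists here; in the real algorithm phase 1 writes partitions to disk and phase 2 reads them back):

```python
def grace_hash_join(r, s, r_key, s_key, num_buckets):
    """Phase 1: partition both relations into corresponding buckets.
    Phase 2: per bucket pair, build a hash table on R's bucket and
    probe it with S's bucket."""
    r_parts = [[] for _ in range(num_buckets)]
    s_parts = [[] for _ in range(num_buckets)]
    for tup in r:
        r_parts[hash(tup[r_key]) % num_buckets].append(tup)
    for tup in s:
        s_parts[hash(tup[s_key]) % num_buckets].append(tup)
    out = []
    for r_bucket, s_bucket in zip(r_parts, s_parts):
        ht = {}
        for r_tup in r_bucket:
            ht.setdefault(r_tup[r_key], []).append(r_tup)
        for s_tup in s_bucket:
            for r_tup in ht.get(s_tup[s_key], []):
                out.append(r_tup + s_tup)
    return out

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (4, "z")]
print(sorted(grace_hash_join(R, S, 0, 0, 4)))
```

Hybrid hash join is the same structure, except that one bucket of R is pinned in memory during phase 1 so its matches never touch disk.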

Page 18: Flash research report

Experimental Setup

• DB: SQLite3. Our experiments were performed on a dual-core 3.2 GHz Intel Pentium machine with 1 GB of RAM running Red Hat Enterprise Linux 5. For the comparison, we used a 5400 RPM Toshiba 320 GB external HDD and an OCZ Core Series 60 GB SATA II 2.5-inch flash SSD.

• As our test query, we used a primary/foreign-key join between the TPC-H customer and orders tables, generated with a scale factor of 30. The customer table contains 4,500,000 tuples (730 MB), and the orders table 45,000,000 tuples (5 GB).

Page 19: Flash research report

Effect of Varying the Buffer Pool Size

The block nested loops join, whose I/O pattern is sequential reads, shows the biggest performance improvement, with speedup factors between 1.59X and 1.73X.

Other join algorithms also performed better on the flash SSD than on the magnetic HDD, though with smaller speedups than the block nested loops join. This is because the write transfer rate is slower than the read transfer rate on the flash SSD, and unexpected erase operations can degrade write performance further.

Page 20: Flash research report

Effect of Varying the Buffer Pool Size

While the I/O speedup of the second phase was between 2.63X and 3.0X due to faster random reads, the I/O speedup in the first phase (whose dominant I/O pattern is sequential writes) was only between 1.52X and 2.0X.

Note that the dominant I/O pattern of Grace hash join is random writes in the first phase, followed by sequential reads in the second phase.

Page 21: Flash research report

Summary

1. Joins on flash SSDs have a greater tendency to become CPU-bound (rather than I/O-bound), so ways to improve CPU performance, such as better cache utilization, are of greater importance with flash SSDs.

2. Trading random reads for random writes is likely a good design choice for flash SSDs.

3. Compared to sequential writes, random writes produce more I/O variations with flash SSDs, which makes the join performance less predictable.

Page 22: Flash research report

Effect of Varying the Page Size

As can be seen from Figure 2, when blocked I/O is used, the page size has a small impact on the join performance in both the magnetic HDD and the flash SSD cases.

Page 23: Flash research report

Effect of Varying the Page Size

When the I/O size is less than the flash page size (4 KB), every write operation is likely to generate an erase operation, which severely degrades performance.

Page 24: Flash research report

Summary

1. Using blocked I/O significantly improves the join performance on flash SSDs over magnetic HDDs.

2. The I/O size should be a multiple of the flash page size.
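The second point amounts to rounding every I/O request up to flash-page granularity; a trivial helper (the 4 KB page size is an assumption and varies by device):

```python
FLASH_PAGE = 4096  # assumed flash page size; device-specific in practice

def align_io_size(size, page=FLASH_PAGE):
    """Round an I/O size up to the next multiple of the flash page,
    so no write covers only a partial flash page (which can trigger
    an extra erase and severely degrade performance)."""
    return ((size + page - 1) // page) * page

print(align_io_size(5000))  # 8192
print(align_io_size(4096))  # 4096
```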

Page 25: Flash research report

Outline

• Query Processing Techniques for Solid State Drives (Research Paper)

• Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)

• Evaluating and Repairing Write Performance on Flash Devices (DaMoN)

• Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Page 26: Flash research report

Evaluating and Repairing Write Performance on Flash Devices

Anastasia Ailamaki

• EPFL, VD, Switzerland

• CMU, PA, USA

• In 2001, she joined the Computer Science Department at Carnegie Mellon University, where she is currently an Associate Professor. In February 2007, she joined EPFL as a visiting professor.

• S. Harizopoulos and A. Ailamaki. Improving instruction cache performance in OLTP. ACM Transactions on Database Systems, 31(3):887-920, 2006.

Page 27: Flash research report

An Append and Pack Data Layout

• The layer always writes dirty pages, flushed by the buffer manager of the overlying DBMS, sequentially and in multiples of the erase block size.

• From a conceptual point of view, the physical database representation is an append-only structure.

• As a result, our writing mechanism benefits from optimal flash memory performance as long as enough space is available.

Page 28: Flash research report

An Append and Pack Data Layout

• The proposed layer consolidates the least recently updated logical pages, starting from the head of the append structure, packs them together, then writes them back sequentially to the tail.

• We append them to the write-cold dataset because pages which reach the beginning of the hot dataset have gone the longest without being updated and are therefore likely to be write-cold.

• We read data from the head of the cold log structure and write them to its tail.
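A toy model of the append-and-pack idea (all names hypothetical; a real layer works in erase-block-sized units on flash, not Python lists):

```python
class AppendPackLayer:
    """Toy append-and-pack layer. Dirty logical pages are always
    appended sequentially; pack() consolidates the oldest slots,
    re-appending still-live pages and dropping stale versions."""

    def __init__(self):
        self.log = []      # append-only sequence of (page_id, data)
        self.latest = {}   # page_id -> index of its newest copy

    def write(self, page_id, data):
        self.log.append((page_id, data))
        self.latest[page_id] = len(self.log) - 1

    def read(self, page_id):
        return self.log[self.latest[page_id]][1]

    def pack(self, n):
        """Drop the n oldest slots; live pages among them are
        re-appended at the tail (they are likely write-cold)."""
        head, self.log = self.log[:n], self.log[n:]
        self.latest = {pid: i for i, (pid, _) in enumerate(self.log)}
        newest_in_head = {pid: data for pid, data in head}  # last wins
        for pid, data in newest_in_head.items():
            if pid not in self.latest:  # no newer copy survived
                self.write(pid, data)

layer = AppendPackLayer()
layer.write(1, "v1")
layer.write(2, "x")
layer.write(1, "v2")   # page 1 rewritten; the old copy is now stale
layer.pack(2)          # consolidate the two oldest slots
print(layer.read(1), layer.read(2))  # v2 x
```

Because every physical write is an append, the flash device only ever sees sequential writes; the cost of packing is paid with sequential reads from the head and sequential writes to the tail.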

Page 29: Flash research report

Outline

• Query Processing Techniques for Solid State Drives (Research Paper)

• Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)

• Evaluating and Repairing Write Performance on Flash Devices (DaMoN)

• Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Page 30: Flash research report

Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices

• Yanlei Diao

• Department of Computer Science

• University of Massachusetts Amherst

Page 31: Flash research report

Motivation

• Flash devices present significant challenges in designing tree indexes due to their fundamentally different read and write characteristics in comparison to magnetic disks.

Page 32: Flash research report

Key Features

• Cascaded Buffers

• Adaptive Buffering

Page 33: Flash research report
Page 34: Flash research report

The scan cost of lookup L1 is s1 = 75, while that of lookup L2 is 90.

Each of the three lookups after L1 saves s1. Hence the benefit of emptying at lookup L1, denoted by payoff p1, is given by p1 = 3 · s1 = 225.

Page 35: Flash research report

Raw Flash Memory

Page 36: Flash research report

SSD

Page 37: Flash research report

Outline

• Query Processing Techniques for Solid State Drives (Research Paper)

• Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)

• Evaluating and Repairing Write Performance on Flash Devices (DaMoN)

• Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Page 38: Flash research report

Thank You