swissbox redesigning systems from the ground up€¦ · query workload • up to 4000 queries /...

48
SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP Gustavo Alonso Systems Group Dept. of Computer Science ETH Zürich, Switzerland SwissBox – IBM– March-2013

Upload: others

Post on 16-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP

Gustavo Alonso Systems Group

Dept. of Computer Science ETH Zürich, Switzerland

SwissBox – IBM– March-2013

Page 2: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Systems Group = www.systems.ethz.ch Enterprise Computing Center = www.ecc.ethz.ch

Page 3: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SWISSBOX

Gustavo Alonso, Donald Kossmann, Timothy Roscoe: SWissBox: An Architecture for Data Processing Appliances. CIDR 2011: 32-37

Page 4: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

The SwissBox project

Build an open source data appliance

• Hardware

• Software

What is a DB appliance?

• Database in a box

Funny database

Funny box

Page 5: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

• Intelligent storage manager • Massive caching • RAC based architecture • Fast network interconnect

ORACLE EXADATA

Page 6: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

NETEZZA (IBM) TWINFIN

• No storage manager • Distributed disks (per node) • FPGA processing • No indexing

Page 7: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SAP HANA

• Main memory database • Column store • No indexing (automatic)

Page 8: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SwissBox themes

System co-design

• OS/DB co-design

• HW/SW co-design

Data processing on modern hardware

New system architectures

• Databases (Crescando, SharedDB)

• Operating systems (Barrelfish)

• Intelligent storage engines (Ibex)

Page 9: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Swissbox mantras

Everything is a distributed system

• Multicore = cluster

Everything is heterogeneous:

• Computing nodes

• Memory

• Links/networks

Hardware can be tailored

Performance must be predictable

Page 10: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

How it all started: The Amadeus use case

Page 11: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Amadeus Workload

Passenger-Booking Database

• ~ 600 GB of raw data (two years of bookings)

• single table, denormalized

• ~ 50 attributes: flight-no, name, date, ..., many flags

Query Workload

• up to 4000 queries / second

• latency guarantees: 2 seconds

• today: only pre-canned queries allowed

Update Workload • avg. 600 updates per second

(1 update per GB per sec) • peak of 12000 updates per

second • data freshness guarantee: 2

seconds

Problems with State-of-the Art • Simple queries work only

because of mat. views multi-month project to

implement new query / process

• Complex queries do not work at all

Page 12: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Better the devil you know …

Performance depends on workload parameters

• changes in load (updates, columns accessed) -> huge variance

• Unpredictable performance, impossible to tune correctly

0

5'000

10'000

15'000

20'000

0 20 40 60 80 100

Qu

ery

Lat

en

cy in

mse

c

Update Load in Updates/sec

MySQL Query 50th

MySQL Query 90th

MySQL Query 99th

0

1'000

2'000

3'000

4'000

5'000

6'000

7'000

8'000

9'000

1.251.51.752

Qu

ery

Lat

en

cy in

mse

c

Synthetic Workload Parameter s

Page 13: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Hardware killed the software star

Page 14: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Hardware dominates the game

Hardware evolving faster than software

Performance gains no longer for free

Machines becoming far more complex

Design assumptions no longer hold

• Multicore heterogeneity

• Hardware acceleration

• Effect of networking

Page 15: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Hardware is going crazy

Gustavo Alonso - Systems Group - ETH Zürich 15

P1

P0

P2

P3

P4

P6

P5

P7

Each die has: • 6 cores • 4HT ports • 2 memory channels

Each package has: • 12 cores • 4HT ports • 4 memory channels

Page 16: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Multicore challenge

Min Cores Partition Size [GB]

Intel Nehalem 2 4

AMD Barcelona 5 1.6

AMD Shanghai 3 2.6

AMD MagnyCours 2 2

Experiment setup • 8GB datastore size • SLA latency requirement 8s • 4 different machines

Example: deployment on multicores

Page 17: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Adding resources does not help

Gustavo Alonso - Systems Group - ETH Zürich 17

0

100

200

300

400

500

600

700

800

0 50 100 150 200 250 300 350

Th

oru

gh

pu

t(T

PS

)

Clients

MySQL TPC-WB 20 GB DB

MYSQL-48 MYSQL-24 MYSQL-12

8 cores

48 cores

24 cores

Page 18: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Load interaction (multicore)

Gustavo Alonso - Systems Group - ETH Zürich 18

System X

Page 19: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Load interaction (virtualized)

0

200

400

600

800

1000

1200

0 1000 2000 3000 4000

Th

rou

gh

pu

t [r

eq

/se

con

d]

Increasing load [#requests]

System Performance -Throughput

48 cores - isolated

48 cores - noisy

Experiment setup

• AMD MagnyCours • 4 x 2.2GHz AMD Opteron 6174 processors • total Datastore size 53GB • Noise: another CPU-intensive task running on core 0

What we actually get

What we expect to get

Page 20: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SWISSBOX: storage engine Software part

Philipp Unterbrunner, Georgios Giannikis, Gustavo Alonso, Dietmar Fauser, Donald Kossmann: Predictable Performance for Unpredictable Workloads. PVLDB 2(1): 706-717 (2009)

Page 21: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Crescando: the Amadeus use case

Remove load interaction

Remove unpredictability

Simplify design for scalability and modeling

Treat a multicore machine as a collection of individual nodes (not as a parallel machine)

Run only on main memory

One thread per core

Highly tune the code at each core

Page 22: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Scan on a core

READ CURSOR

WRITE CURSOR DATA IN

CIRCULAR BUFFER

(WIDE TABLE)

BUILD QUERY INDEX FOR NEXT SCAN QUERIES

UPDATES

Page 23: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Crescando on 1 Machine (N Cores)

...

Split

Scan Thread

Scan Thread

Scan Thread

Scan Thread

Scan Thread

Merge

Input Queue

(Operations)

Input Queue

(Operations)

Output Queue

(Result Tuples)

Output Queue

(Result Tuples)

Page 24: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Crescando in a Data Center (N Machines)

...

Aggregation

Layers

Replication

Groups

...

...

External Clients

Crescando

...

Page 25: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Why is this interesting (industry)?

Fully predictable performance

• Response time determined by design regardless of load

Only two parameters:

• Size of the scan

• Number of queries per scan

Scalable to arbitrary numbers of nodes

Page 26: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Why is this interesting (research)?

Storage engine with a different interface

Can be used as intelligent (active) storage engine

Modular component

No multi-threading, no fancy parallelism, no synchronization, no shared data structures, etc.

Suitable for hardware acceleration

Page 27: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SWISSBOX: data processing engine

Georgios Giannikis, Gustavo Alonso, Donald Kossmann: SharedDB: Killing One Thousand Queries With One Stone. PVLDB 5(6): 526-537 (2012)

Page 28: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SharedDB

SharedDB does not run queries individually (each one in one thread). Instead, it runs operators that process queries in batches thousands of queries at a time

Page 29: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Shared DB can run TPC-W!

Page 30: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Predictability at scale

SharedDB can run complex joins (and shorts) in predictable time with large update loads

Linear scalability with number of processing units (cores)

Page 31: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Raw performance

Page 32: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Predictability, robustness

Page 33: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Why is this interesting (industry)?

Optimize whole loads rather than individual queries/services

Fully predictable performance

Better use of resources and parallelism

Eliminates complex database administration problems

Page 34: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Why is this interesting (research)?

One plan to

• Optimize

• Deploy NEW

• Schedule NEW

DB/OS codesign:

• The database does not know about the hardware or its state …

Page 35: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Why is this interesting?

SharedDB runs on a heterogeneous storage engine:

Crescando

Key value store

Same idea can be generalized to different representations and/or hardware architectures -> see next

Page 36: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SWISSBOX: hardware acceleration

Parallel Computation of Skyline Queries Louis Woods, Jens Teubner and Gustavo Alonso. IEEE FCCM, March, 2013

Page 37: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

ORACLE EXADATA

Page 38: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Intelligent storage engine

Page 39: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Inserting the FPGA in the data path

Page 40: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Engine design

Page 41: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

So far so good

Page 42: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Points of interest

Page 43: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

SWISSBOX: Challenges ahead

Page 44: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Execution platform very complex

Optimization not trivial

• Multicore

• Load interaction

• Virtualization

• Lack of precise cost models

• Not stable/standard platforms

Page 45: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Beyond plan optimization

Other aspects to optimization

• Deployment on heterogeneous architecture

• Scheduling

Parallel threads

Across queries

• Load interaction

Page 46: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

Hardware dictates everything

Many options enabled by hardware

• Custom hardware

• Custom configurations

• Multiplicity computing

Reasonable development cost

Can beat almost any software design by tuning the hardware

Page 47: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

CONCLUSIONS

Page 48: SWISSBOX REDESIGNING SYSTEMS FROM THE GROUND UP€¦ · Query Workload • up to 4000 queries / second • latency guarantees: 2 seconds • today: only pre-canned queries allowed

May you live in interesting times

Many radical changes in IT infrastructure

• Cloud computing

• Hardware & architecture relevant again

• Specialization / tailoring

• Large scale clusters

• Geographic distribution

Great opportunity for research