designing hybrid data processing systems for heterogeneous …ey204/teaching/acs/r244_2017... ·...

40
Designing Hybrid Data Processing Systems for Heterogeneous Servers Peter Pietzuch Large-Scale Distributed Systems (LSDS) Group Imperial College London http://lsds.doc.ic.ac.uk <[email protected]> University of Cambridge – Cambridge, United Kingdom – November 2017 1

Upload: others

Post on 12-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Designing Hybrid Data ProcessingSystems for Heterogeneous Servers

Peter PietzuchLarge-Scale Distributed Systems (LSDS) Group

Imperial College Londonhttp://lsds.doc.ic.ac.uk<[email protected]>

University of Cambridge – Cambridge, United Kingdom – November 2017 1

Page 2: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Data is the New Oil

• Many new sources of data become available– Most data is produced continuously

• Data powers plethora of new and personalised services…

2Peter Pietzuch - Imperial College London

Mobiledevices Scientific

instruments

CamerasSocial feeds IoTdevices

Internet services,web sites

RFIDtags

Datarepositories

Page 3: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Data-Intensive Systems

• Data analytics over web click streams– How to maximise user experience with

relevant content?– How to analyse “click paths” to trace most

common user routes?

• Machine learning models foronline prediction– E.g. serving adverts on search engines

Peter Pietzuch - Imperial College London 3

Hits

Page Views

Visits

Unique Visitors

Uniquely Identified Visitors

Volume of Available Data

f1

fn

y E {−1,1}

predict

update

• Solution: AdPredictor– Bayesian learning algorithm

ranks adverts according toclick probabilities

Page 4: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Throughout and Result Freshness Matter

Facebook Insights: Aggregates 9 GB/s < 10 sec latencyFeedzai: 40K credit card transactions/s < 25 ms latencyGoogle Zeitgeist: 40K user queries/s < 1 ms latencyNovaSparks: 150M trade options/s < 1 ms latency

Peter Pietzuch - Imperial College London 4

High-throughputprocessing

Low-latencyresults

Data-intensivesystem

Page 5: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Design Space for Data-Intensive Systems

• Tension between performanceand algorithmic complexity

Peter Pietzuch - Imperial College London 5

Easy formostalgorithms

Hard formachinelearningalgorithms

Hard forall algorithms

Result latency

Dat

a am

ount

MBs

GBs

TBs

10s 1s 100ms 10ms 1ms

Page 6: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Algorithmic Complexity Increases

…Share state

Aggregate

Iterate…

Pre-process

Parallelize…

Online machinelearning, data

mining

Topic-basedfiltering

Content-basedfiltering

Complexpattern

matchingStreamqueries

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

T1

T2

T3

T1(a, b, c)

T2(c, d, e)

T3(g, i, h)

Publish/Subscribe Complex EventProcessing (CEP)

Streamprocessing

Peter Pietzuch - Imperial College London 6

Page 7: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Scale Out Model in Data Centres

Peter Pietzuch - Imperial College London 7

Page 8: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Task Parallelism vs. Data Parallelism

Peter Pietzuch - Imperial College London 8

Input data ...

Servers indata centre

Results

select highway, segment, direction, AVG(speed) from Vehicles[range 5 seconds slide 1 second]group by highway, segment, directionhaving avg < 40

Task parallelism:Multiple data processing jobs

Data parallelism:Single data processing job

select distinct W.cidFrom Payments [range 300 seconds] as W,

Payments [partition-by 1 row] as Lwhere W.cid = L.cid and W.region != L.region

select distinct W.cidFrom Payments [range 300 seconds] as W,

Payments [partition-by 1 row] as Lwhere W.cid = L.cid and W.region != L.region

select distinct W.cidFrom Payments [range 300 seconds] as W,

Payments [partition-by 1 row] as Lwhere W.cid = L.cid and W.region != L.region

select highway, segment, direction, AVG(speed) from Vehicles[range 5 seconds slide 1 second]group by highway, segment, directionhaving avg < 40

Page 9: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Distributed Dataflow Systems

• Idea:Execute data-parallel taskson cluster nodes

• Tasks organised as dataflow graph

• Almost all big data systems do this:• Apache Hadoop, Apache Spark,

Apache Storm, Apache Flink,Google TensorFlow, ...

•Peter Pietzuch - Imperial College London 9

parallelismdegree 3

parallelismdegree 2

Page 10: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Nobody Ever Got Fired For Using Hadoop/Spark

• 2012 study of MapReduce workloads– Microsoft: median job size < 14 GB– Yahoo: median job size < 12.5 GB– Facebook: 90% of jobs < 100 GB

• Many data-intensive jobs easily fit into memory

• One server cheaper/more efficient than compute cluster

Peter Pietzuch - Imperial College London 10

(A. Rowstron, D. Narayanan, A. Donnely, G. O’Shea, A. Douglas, HotCDP’12)

Page 11: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Parallelism of Heterogeneous Servers

Servers have many parallel CPU coresHeterogeneous servers with GPUs common

New types of compute accelerators: Xeon Phi, Google's TPUs, FPGAs, ...Peter Pietzuch - Imperial College London 11

L3

C1

C2

C3

C4

C5

C6

C7

C8

L3

C1

C2

C3

C4

C5

C6

C7

C8

L2 Cache

DRAM DRAM

SMX1 ... SMXN

Sock

et 1

Sock

et 2

Command QueuePCIe Bus

DMA

1000s ofGPU cores

10s ofCPU cores

Page 12: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Slide courtesy of Torsten Hoefler (Systems Group, ETH Zürich)

Servers Are Becoming Increasingly Heterogeneous

Peter Pietzuch - Imperial College London 12

E How can Data-Intensive Systems Exploit Heterogeneous Hardware?

Page 13: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Roadmap

• SABER: Hybrid stream processing engine for heterogeneous servers• [SIGMOD’16]

• (1) How to parallelise computation on modern hardware?

• (2) How to utilise heterogeneous servers?

• (3) Experimental performance results

Peter Pietzuch - Imperial College London 13

Page 14: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Analytics with Window-based Stream Queries

• Real-time analytics over data streams• Windows define finite data amount for processing

Peter Pietzuch - Imperial College London 14

window

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

highwaysegmentdirectionspeed

now

Time-based window with size τ at current time t[t - τ : t] Vehicles[Range τ seconds]

Count-based window with size n:last n tuples Vehicles[Rows n]

Page 15: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Defining Stream Query Semantics

• Windows convert data streams to dynamic relations (database table)

Peter Pietzuch - Imperial College London 15

Streams Relations

Window specification

Stream operators: Istream, Dstream, Rstream

Any relational query

(select, project,join, group by, etc)

Page 16: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

SQL Stream Queries

SQL provides well-defined declarative semantics for queries– Based on relational algebra (select, project, join, …)

• Example: Identify slow moving traffic on highway– Input stream: Vehicles(highway, segment, direction, speed)– Find highway segments with average speed below 40 km/h

Peter Pietzuch - Imperial College London 16

select highway, segment, direction, AVG(speed) as avg

from Vehicles[range 5 sec slide 1 sec]group by highway, segment, directionhaving avg < 40

Input data

Output

Operators

Page 17: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

(1) How to Parallelise Computation?

• Perform query evaluation across sliding windows in parallel– Exploit data parallelism across stream

Peter Pietzuch - Imperial College London 17

123456

w1

w2

w3

w4

size: 4 secslide: 1 sec

Page 18: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

How to use GPUs with Stream Queries?

• Naive strategy parallelises computation along window boundaries

Peter Pietzuch - Imperial College London 18

123456 size: 4 secslide: 1 sec

Task T1

Task T2

Combine partial results

E Window-based parallelism results in redundant computation

Page 19: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

How to use GPUs with Stream Queries?

• Parallel processing of non-overlapping window data?

Peter Pietzuch - Imperial College London 19

w1

w2

w3

w4

123456 size: 4 secslide: 1 sec

T1

T2

T3

T4

T5

Combine partial results

E Slide-based parallelism limits degree of parallelism

Page 20: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Apache Spark: Small Slides à Low Throughput

• Spark relates window slide to micro-batch size used for parallelisation

Peter Pietzuch - Imperial College London 20

00.20.40.60.8

11.21.41.61.8

2

0 1 2 3 4 5 6 7 8 9

Thro

ughp

ut

(106

tupl

es/s

)

Window slide (106 tuples)

select AVG(S.1) from S [rows 1024 slide x]

E Avoid coupling system parameters with query definition

Page 21: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

SABER: Parallel Window Processing

• Idea: Parallelise using task size that is best for hardware

• Task contains one or more window fragments

Peter Pietzuch - Imperial College London 21

10 9 8 7 6 5 4 3 2 115 14 13 12 11 10 9 8 7 615 14 13 12 11

T1T2T3

w1w2

w3w4

w5

w1w2

w3w4

w5

size: 7 tuplesslide: 2 tuples

5 tuples/task

Page 22: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

SABER: Window Fragment Processing

• Process window fragments in parallel• Reassemble partial results to obtain overall result

Peter Pietzuch - Imperial College London 22

Worker B: T2

w1w2w3

w4w5

Worker A: T1w1

w2w3

w1result

w2result

Result stageSlot 2 Slot 1

EmptyEmpty

Output result circular buffer

Partial result reassembly must also be done in parallel

Page 23: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

• Fragment function ff– Processes window fragments

• Assembly function fa– Merges partial window results

• Batch function fb– Composes fragment functions within task– Allows incremental processing

10 9 8 7 6 5 4 3 2 1

T1T2

w1w2

size: 7 tuplesslide: 2 tuples

5 tuples/task

fa fa

ff ff

w2 results

ff ff

w1 results

output

fb

API for Operator Implementation

Peter Pietzuch - Imperial College London 23

Page 24: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

SABER: Performance of Window-based Queries

Peter Pietzuch - Imperial College London 24

0

0.05

0.1

0.15

0.2

0

2

4

6

8

64 256 1024 4096 16384

Late

ncy

(sec

)

Thro

ughp

ut (G

B/s)

Window slide (tuples)

select AVG(S.1) from S [rows 1024 slide x]

E Performance of window-based queries remains predictable

Page 25: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

0

2

4

6

8

0.0625 0.25 1 4 16 64 256 1024 4096

Thro

ughp

ut (G

B/s)

Task Size (KB)

CPU-only processing

GPU-only processing

How to Pick the Task Size?

• Problem: Small data transfers over PCIe bus costly– Example: select * from S where p1 [rows 1 slide 1]

Limited by dispatcher and thread contention

Limited by data movement

Peter Pietzuch - Imperial College London 25

Page 26: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Roadmap

• SABER: Hybrid stream processing engine for heterogeneous servers• [SIGMOD’16]

• (1) How to parallelise computation on modern hardware?

• (2) How to utilise heterogeneous servers?

Peter Pietzuch - Imperial College London 26

E Avoid coupling system parameters with processing semantics

Page 27: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

(2) How to Utilise Heterogeneous Servers?

• Hard to decide acceleration potential of heterogeneous processors– Depends on operator semantics, window definition,

data distribution, …

Peter Pietzuch - Imperial College London 27

E Don’t leave decision about heterogeneous processors to users

0

2

4

6

Thro

ughp

ut (G

B/s)

0

0.1

0.2

0.3 CPU execution

GPU execution

Aggregation Group-by θ-join

GPU is faster CPU is faster

Page 28: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

SABER: Hybrid Execution Model

• Idea: Execute tasks on all heterogeneous processors (CPUs, GPUs, ...)

• Fully utilise all hardware parallelism available in dedicated servers

Peter Pietzuch - Imperial College London 28

T2 T1T3T4T5T6T7T8T9

QBQAQBQBQBQBQAQBQA

T10

QA

Task queue CPU worker

GPU worker

0 3 6 9 12

CPUGPU T1 T2

T3

T2T1 T4 T5 T6

T7

T8 T9

T10

T4 T5 T6

Page 29: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Static Task Scheduling using Cost Model?

Peter Pietzuch - Imperial College London 29

CPU GPUQA 3 ms 2 msQB 3 ms 1 ms

T2 T1T3T4T5T6T7T8T9

QBQAQBQBQBQBQAQBQA

T10

QA

Task queue

T2

T1

CPUGPU

Static: QA on CPU and QB on GPU

T7 T9 T10

T3 T4 T5 T6 T8

0 3 6 9 12

CPU workers

GPU worker

E Static scheduling under-utilises processors

• Profile tasks to obtain cost model• Assign tasks to processor with shortest execution time

Page 30: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

First-Come First-Serve Task Scheduling?

Peter Pietzuch - Imperial College London 30

E FCFS ignores effectiveness of processors for given task

CPU GPUQA 3 ms 2 msQB 3 ms 1 ms

T2 T1T3T4T5T6T7T8T9

QBQAQBQBQBQBQAQBQA

T10

QA

Task queue CPU workers

GPU worker

0 3 6 9 12

CPUGPU

First-Come First-Served

T1 T4 T8

T2 T3 T5 T6 T7 T9

T10

• Assign tasks to processors first-come, first-serve– CPU/GPU execute both QA and QB tasks

Page 31: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Heterogeneous Lookahead Scheduling (HLS)

Idea: Scheduler assigns tasks to idle processors dynamically– Skips tasks that could be executed faster by another processor

Peter Pietzuch - Imperial College London 31

CPU GPUQA 3 ms 2 msQB 3 ms 1 ms

T2 T1T3T4T5T6T7T8T9

QBQAQBQBQBQBQAQBQA

T10

QA

Task queue CPU workers

GPU worker

0 3 6 9 12

CPUGPU

HLS

T1 is slower on CPU: Skip

T1

T2 is slower on CPU: Skip

T2

T3 is slower on CPU but already have 3 ms of work for GPU

T3

T2T1

Skip T4, T5 and T6 and select T7

T4 T5 T6

T7

Finally skip T8 and T9 and select T10

T8 T9

T10

T4 T5 T6

E HLS achieves aggregate throughput of all heterogeneous processors

Page 32: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

SABER Hybrid Stream Processing Engine

Peter Pietzuch - Imperial College London 32

T1

T2

T2 T1

op

ααop

CPU

GPU

T1 T2

Scheduling & execution stageDequeue tasks based on HLS

Dispatching stageDispatch fixed-size tasks

Merge & forward partial window results

Result stage

Java15K LOC

C & OpenCL4K LOC

Page 33: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Roadmap

• SABER: Hybrid stream processing engine for heterogeneous servers• [SIGMOD’16]

• (1) How to parallelise computation on modern hardware?

• (2) How to utilise heterogeneous servers?

• (3) Experimental performance results

Peter Pietzuch - Imperial College London 33

E Hybrid execution utilises all heterogeneous processors

E Avoid coupling system parameters with processing semantics

Page 34: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Experimental Evaluation

Peter Pietzuch - Imperial College London 34

Ubuntu Linux 14.04 NVIDIA driver 346.47

Intel Xeon 2.6 GHz

NVIDIA Quadro K5200

PCIe 3.0 x16

10 Gbps NIC

16 cores

64 GB RAM

2,304cores

8 GB RAM

Google Cluster Data 144M jobs events from Google infrastructure

SmartGridMeasurements974Mplugmeasurementsfromhouses

LinearRoadBenchmark11Mcarpositionsandspeedonhighway

Page 35: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

What is SABER’s Performance?

Peter Pietzuch - Imperial College London 35

0

10

20

30

40

50

CM2 SG1 SG2 LRB3 LRB4Thro

ughp

ut (1

06tu

ples

/s)

SABER (CPU contrib.)

SABER (GPU contrib.)

Cluster Mgmt. Smart Grid LRB

aggravg group-byavg select

group-byavg group-bycnt

group-bycntgroup-byavg

select

Intel Xeon 2.6 GHz

NVIDIA Quadro K5200

16 cores

2,304 cores

E SABER exploits both CPUs and GPUs effectively for different queries

Page 36: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Is Hybrid Throughput Additive?

Peter Pietzuch - Imperial College London 36

0

2

4

6

Thro

ughp

ut (G

B/s)

0

0.1

0.2

0.3 SABER (CPU only)SABER (GPU only)SABER

Aggregation Group-by θ-join

GPU is faster CPU is faster Not additive due to queue contention

E Aggregate throughput of CPU + GPU always highest

Page 37: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

What is the Trade-Off between CPUs and GPUs?

• Hybrid processing model benefits from GPU's ability to process complex predicates fast

0

2

4

6

8

1 4 16 64

Thro

ughp

ut (G

B/s)

# selection predicates

SABER (CPU only) SABER (GPU only) SABER

Dispatch bound

0

0.1

0.2

0.3

0.4

1 4 16 64# join predicates

[rows 1024, slide 1024] [rows 1024, slide 1024]

Peter Pietzuch - Imperial College London 37

Page 38: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Does SABER Adapt to Workload Changes?

38

0

0.1

0.2

0 10 20 30 40 50 60

Selec

tivity

0

2

4

6

8

0 10 20 30 40 50 60

Thro

ughp

ut (G

B/s)

Time (seconds)

Series1 Series2SABER SABER (GPU contribution)

HLS periodically uses idle, non-preferred processor to run tasks to update query task throughput matrix

E Higher selectivity à more predicates evaluated à GPU preferredPeter Pietzuch - Imperial College London

Page 39: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Summary

• Heterogeneous servers have huge impact on data-intensive systems– Shift from scale out to scale up model– Need new general-purpose system designs for heterogeneous servers

☛SABER: Hybrid Stream Processing Engine for CPUs & GPUs

• (1) Parallelise computation to fit hardware capabilitiesE Decouple hardware/system parameters from processing semantics

• (2) Fully utilise all heterogeneous processors independently of workloadE Hybrid processing model to achieve aggregate CPU/GPU throughput

39Peter Pietzuch - Imperial College London

Page 40: Designing Hybrid Data Processing Systems for Heterogeneous …ey204/teaching/ACS/R244_2017... · 2017-11-09 · Defining Stream Query Semantics • Windows convert data streams to

Acknowledgement:LSDS Group at Imperial College London

Peter Pietzuch - Imperial College London 40

Peter Pietzuchhttp://lsds.doc.ic.ac.uk<[email protected]>Thank you!

Any Questions?

We're Hiring!Post-docs, PhDs