3 Ideas for Big Data Exploration

Stratos Idreos

CWI, INS-1, Amsterdam

adaptive indexing

adaptive loading

dbTouch

data is everywhere

years

daily data

Eric Schmidt: every two days we create as much data as we did from the dawn of humanity to 2003

2012

3-4 exabytes

data exploration

not always sure what we are looking for (until we find it)

database systems are great: declarative processing, back-end to numerous apps

but databases are too heavy and blind!

timeline: load, then index, then query

expert users - idle time - workload knowledge

systems tailored for data exploration

quick response times

no installation steps

no workload knowledge

minimize data-to-query time

3 ideas

adaptive indexing: 7 years, 7 papers

adaptive loading: 3 years, 3 papers

dbTouch: 1 year, 1 paper, VENI 2012

ACM SIGMOD Jim Gray Diss. Award - Cor Baayen Award

In use in Hadapt and Tableau

Martin Kersten, Stefan Manegold, Felix Halim, Panagiotis Karras, Roland Yap, Goetz Graefe, Harumi Kuno, Eleni Petraki, Anastasia Ailamaki, Ioannis Alagiannis, Renata Borovica, Miguel Branco, Ryan Johnson, Erietta Liarou

indexing

tune = create proper indexes offline

performance 10-100X

but it depends on workload!

which indices to build?

on which data parts?

and when to build them?

load, then index, then query

timeline

sample workload, then analyze, then create indices, then query

complex and time-consuming process

load, then index, then query

human administrators + auto-tuning tools

what can go wrong?

not enough idle time to finish proper tuning

by the time we finish tuning, the workload changes

database cracking: incremental, adaptive, partial indexing

every query is treated as advice on how data should be stored

remove all tuning steps and need for human input

indexing

initialization query

gain knowledge on how data is organized

cracking example (Database Cracking, CIDR 2007)

Q1: select * from R where R.A > 10 and R.A < 14

Q2: select * from R where R.A > 7 and R.A <= 16

Column A: 13 16 4 9 2 12 7 1 19 3 14 11 8 6

Cracker column of A after Q1 (copy): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14

Piece 1: A <= 10

Piece 2: 10 < A < 14

Piece 3: 14 <= A

Cracker column of A after Q2 (in-place): 4 2 1 3 6 7 | 9 8 | 13 12 11 | 14 16 | 19

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 3: 10 < A < 14

Piece 4: 14 <= A <= 16

Piece 5: 16 < A

Result tuples: Piece 2 for Q1, Pieces 2 to 4 for Q2

Dynamically/on-the-fly within the select-operator

every query is treated as advice on how data should be stored

the more we crack, the more we learn
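
The cracking example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MonetDB implementation: the class name `CrackerColumn`, its methods, and the `touched` counter are hypothetical, and the partition step uses temporary lists (a stable partition) where the paper's algorithm swaps values in place, so the order inside a piece may differ from the slide.

```python
from bisect import bisect_left


class CrackerColumn:
    """Toy cracker column: a reorganizable copy of a column plus a crack index."""

    def __init__(self, values):
        self.data = list(values)  # copy, so the base column stays untouched
        self.keys = []            # sorted crack keys: (pivot, inclusive)
        self.pos = []             # pos[i]: first position to the right of crack keys[i]
        self.touched = 0          # total number of values passed through a partition

    def _crack(self, pivot, inclusive):
        """Put values < pivot (<= pivot if inclusive) first; return the split position."""
        key = (pivot, inclusive)
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.pos[i]  # this crack already exists, nothing to reorganize
        lo = self.pos[i - 1] if i > 0 else 0
        hi = self.pos[i] if i < len(self.keys) else len(self.data)
        piece = self.data[lo:hi]  # only the piece that contains the pivot is touched
        if inclusive:
            left = [v for v in piece if v <= pivot]
            right = [v for v in piece if v > pivot]
        else:
            left = [v for v in piece if v < pivot]
            right = [v for v in piece if v >= pivot]
        self.data[lo:hi] = left + right
        self.touched += hi - lo
        self.keys.insert(i, key)
        self.pos.insert(i, lo + len(left))
        return lo + len(left)

    def select_range(self, low, high, low_inclusive=False, high_inclusive=False):
        """Answer low < A < high (bounds optionally inclusive), cracking as a side effect."""
        start = self._crack(low, not low_inclusive)  # A > low: values <= low go left
        end = self._crack(high, high_inclusive)      # A <= high: values <= high go left
        return self.data[start:end]


col = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
q1 = col.select_range(10, 14)                      # Q1: 10 < A < 14
q2 = col.select_range(7, 16, high_inclusive=True)  # Q2: 7 < A <= 16
print(q1, sorted(q2), col.touched)
```

In this sketch Q1 passes 20 values through a partition (the whole column, then the piece above 10), while Q2 passes only 11, because it only has to split two pieces that Q1 already isolated. Repeating Q1 afterwards touches nothing.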

TPC-H

Sideways Cracking, SIGMOD 09

70

330

100

150

200

250

300

0 5 10 15 20 25 30Query sequence

TPC-H Query 15764420

1000

10000

Sel. Crack

Sid. Crack

MonetDB

PresortedMySQL

Presorted

Preparation cost3-14 minutes

Resp

onse

tim

e (m

illi s

ecs)

Presorted MonetDB

MonetDB with sideways cracking

Normal MonetDB

selection cracking


SkyServer experiments

cracking answers 160,000 queries while full indexing is still halfway through creating the index

Stochastic Cracking, PVLDB 12

cracking databases

updates (SIGMOD07)

multiple columns (SIGMOD09)

concurrency control (PVLDB12)

storage restrictions (SIGMOD09)

robustness (PVLDB12)

benchmarking (TPCTC10)

algorithms (PVLDB11)

disk based

joins

multi-cores

aggregations

row-stores

vectorwised

optimization

multi-dimensional

holistic indexing

load index query

loading

copy data inside the database

database now has full control

slow process... not all data might be needed all the time

database vs. unix tools

1 file, 4 attributes, 1 billion tuples

[Figure: single query cost (secs) for DB and Awk, on an axis from 0 to 2,200]

break down db cost: 93% loading, 7% query processing

loading is a major bottleneck

... but writing/maintaining scripts is hard too


adaptive loading

load/touch only what is needed and only when it is needed

but raw data access is expensive

tokenizing - parsing - no indexing - no statistics

challenge: fast raw data access

query plan

[Diagram, built up over several slides: a query plan whose scan operator reads from the db; then scan operators that read from files and a cache]

access raw data adaptively on-the-fly

zero loading cost

selective parsing - file indexing - file splitting - online statistics
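A scan that reads the raw file directly, parses only the attributes a query asks for, and keeps them in a cache can be sketched as below. This is a minimal illustration under stated assumptions, not the NoDB or PostgresRaw code: the names `RawFileScan` and `sum_where`, the comma-separated integer file layout, and the `fields_parsed` counter are all hypothetical. File indexing, file splitting and online statistics are left out.

```python
class RawFileScan:
    """Hypothetical scan operator over a raw delimited file (no load step)."""

    def __init__(self, path, delimiter=","):
        self.path = path
        self.delimiter = delimiter
        self.cache = {}         # column number -> list of parsed values
        self.fields_parsed = 0  # fields converted so far, for illustration only

    def column(self, col):
        """Return column `col` as ints, touching the file only on first use."""
        if col not in self.cache:
            values = []
            with open(self.path, "r", encoding="utf-8") as f:
                for line in f:
                    # Selective tokenizing: stop splitting after the needed field.
                    fields = line.rstrip("\n").split(self.delimiter, col + 1)
                    # Selective parsing: convert only the requested attribute.
                    values.append(int(fields[col]))
                    self.fields_parsed += 1
            self.cache[col] = values
        return self.cache[col]


def sum_where(scan, agg_col, filter_col, low, high):
    """SELECT sum(agg_col) WHERE low <= filter_col < high, straight off the file."""
    keys = scan.column(filter_col)
    vals = scan.column(agg_col)
    return sum(v for k, v in zip(keys, vals) if low <= k < high)
```

The first query pays for tokenizing and parsing the two columns it needs; a second query over the same columns is served from the cache, and columns no query asks for are never parsed.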

[Figure: Execution time (sec), broken down into Load and queries Q1 to Q20, for MySQL CSV Engine, MySQL, DBMS X, DBMS X w/ external files, PostgreSQL, and PostgresRaw PM + C. The axis runs from 0 to 1400; two off-scale bars are labeled ~7500 and ~3500.]

reducing data-to-query time

NoDB SIGMOD 2012

load index query

querying

declarative SQL interface

correct and complete answers

complex and slow - not fit for exploration

dbTouch CIDR 2013

data

query

no expert knowledge - no set-up costs

interactive mode fit for exploration

dbTouch CIDR 2013

challenge

design database systems tailored for touch exploration

user drives query processing actions
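One way to picture a user driving query processing is a column object that only processes the tuples the user actually touches and refines a running answer with every touch. The sketch below is an assumption-laden illustration, not the dbTouch system: `TouchColumn`, the mapping from a relative touch height to a tuple position, and the running average are all invented for the example.

```python
class TouchColumn:
    """Hypothetical column object displayed on a touch screen."""

    def __init__(self, values):
        self.values = list(values)
        self.count = 0   # tuples processed so far
        self.total = 0   # running sum of the touched values

    def touch(self, y):
        """Process the tuple under a touch at relative height y in [0, 1].

        Returns the touched value and the running average over all touches.
        """
        y = min(max(y, 0.0), 1.0)  # clamp touches that fall outside the object
        idx = min(int(y * len(self.values)), len(self.values) - 1)
        self.count += 1
        self.total += self.values[idx]
        return self.values[idx], self.total / self.count


col = TouchColumn([10, 20, 30, 40])
for y in (0.0, 0.5, 1.0):          # a slide gesture sampled at three points
    value, running_avg = col.touch(y)
```

The amount of data processed is decided by the gesture, not by the system: a slow or long slide touches more tuples and yields a more refined answer.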

3 Ideas for Big Data Exploration

Stratos Idreos

CWI, INS-1, Amsterdam

adaptive indexing

adaptive loading

dbTouch

systems tailored for exploration

more in INS-1

data vaults

sciql

datacell

holistic indexing

sciborg

charles

hardware-conscious

cyclotron

Thank you!