on random sampling over joins

SURAJIT CHAUDHURI RAJEEV MOTWANI VIVEK NARASAYYA On random sampling over Joins Presented by : Srikantha Nema

Upload: akiko

Post on 23-Feb-2016



TRANSCRIPT

Page 1: On random sampling over Joins

SURAJIT CHAUDHURI, RAJEEV MOTWANI, VIVEK NARASAYYA

On random sampling over Joins

Presented by : Srikantha Nema

Page 2: On random sampling over Joins

Outline

Semantics of Sample

Difficulty of join Sampling

Algorithms for Sampling

Sampling strategies

New strategies for join Sampling

Experimental evaluation

Conclusions

Page 3: On random sampling over Joins

Terminologies

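SAMPLE(R, f) is an SQL operation that returns a uniform random sample of relation R

R is the relation obtained by evaluating a query Q

f is the sampling fraction: the sample is an f-fraction of R, about f * |R| tuples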

Page 4: On random sampling over Joins

Semantics of Sample

Sampling with Replacement (WR)

Sampling without Replacement (WoR)

Independent Coin Flips (CF)
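
The three semantics differ in whether duplicates can occur and whether the sample size is fixed. A minimal Python sketch, assuming a relation is simply a list of tuples; the function names `sample_wr`, `sample_wor`, and `sample_cf` are illustrative and do not come from the slides:

```python
import random

def sample_wr(rel, f, rng):
    """WR: draw round(f * n) tuples independently and uniformly; repeats allowed."""
    r = round(f * len(rel))
    return [rng.choice(rel) for _ in range(r)]

def sample_wor(rel, f, rng):
    """WoR: draw round(f * n) distinct tuples of the relation uniformly."""
    r = round(f * len(rel))
    return rng.sample(rel, r)

def sample_cf(rel, f, rng):
    """CF: keep each tuple independently with probability f; the size is random."""
    return [t for t in rel if rng.random() < f]

rng = random.Random(0)          # fixed seed so the run is repeatable
R = list(range(100))            # stand-in relation with 100 "tuples"
wr = sample_wr(R, 0.1, rng)
wor = sample_wor(R, 0.1, rng)
cf = sample_cf(R, 0.1, rng)
```

WR and WoR always return exactly round(f * n) tuples, while CF returns f * n tuples only in expectation.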

Page 5: On random sampling over Joins

Difficulty of Join Sampling

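$R_1(A, B) = \{(a_1, b_0), (a_2, b_1), (a_2, b_2), \ldots, (a_2, b_k)\}$

$R_2(A, C) = \{(a_2, c_0), (a_1, c_1), (a_1, c_2), \ldots, (a_1, c_k)\}$

$\mathrm{SAMPLE}(R_1 \bowtie R_2, f)$

$\mathrm{SAMPLE}(R_1, f_1) \bowtie \mathrm{SAMPLE}(R_2, f_2)$ ?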
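
The example on this slide can be checked numerically. In the sketch below, the join of the two relations has 2k tuples, yet a sample of R1 joined with a sample of R2 is usually empty, because every joined tuple needs either the single tuple (a1, b0) from R1 or the single tuple (a2, c0) from R2. The helper names, the value k = 1000, and the 1% coin-flip sampling are assumptions chosen for illustration:

```python
import random

def make_relations(k):
    """Build R1(A, B) and R2(A, C) with the skew used in the example."""
    r1 = [("a1", "b0")] + [("a2", f"b{i}") for i in range(1, k + 1)]
    r2 = [("a2", "c0")] + [("a1", f"c{i}") for i in range(1, k + 1)]
    return r1, r2

def join(r1, r2):
    """Equi-join on the first attribute A (simple nested loop)."""
    return [(a, b, c) for (a, b) in r1 for (a2, c) in r2 if a == a2]

def coin_flip_sample(rel, f, rng):
    """Keep each tuple independently with probability f."""
    return [t for t in rel if rng.random() < f]

k = 1000
R1, R2 = make_relations(k)
full = join(R1, R2)             # k tuples with a1 plus k tuples with a2

rng = random.Random(1)
trials = 200
nonempty = 0
for _ in range(trials):
    s1 = coin_flip_sample(R1, 0.01, rng)
    s2 = coin_flip_sample(R2, 0.01, rng)
    j = join(s1, s2)
    # A joined tuple can only exist if one of the two rare tuples was sampled.
    assert not j or ("a1", "b0") in s1 or ("a2", "c0") in s2
    if j:
        nonempty += 1

print(len(full), nonempty, trials)
```

Each rare tuple is kept with probability 0.01, so only a small share of the trials can produce any joined tuple at all.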

Page 6: On random sampling over Joins

Classification of Join Sampling problem

Case A: No information is available for either $R_1$ or $R_2$

Case B: No information is available for $R_1$, but indexes and/or statistics are available for $R_2$

Case C: Indexes/statistics are available for $R_1$ and $R_2$

Page 7: On random sampling over Joins

Algorithms for Sampling

Unweighted Sequential WR Sampling: Black-Box U1, Black-Box U2

Weighted Sequential WR Sampling: Black-Box WR1, Black-Box WR2

Page 8: On random sampling over Joins

Unweighted Sequential WR Sampling

Black-Box U2

Black-Box U1
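
The slide names the two black boxes without giving their steps. The sketch below shows one way to draw a with-replacement sample of size r in a single sequential pass: `black_box_u1` assumes the number of tuples n is known in advance, while `black_box_u2` assumes it is not and keeps r independent one-slot reservoirs. The function names and the simple `binomial` helper are illustrative assumptions, not taken from the slides:

```python
import random

def binomial(n, p, rng):
    """Number of successes in n Bernoulli(p) trials (simple O(n) draw)."""
    return sum(rng.random() < p for _ in range(n))

def black_box_u1(stream, n, r, rng):
    """WR sample of size r from a stream whose length n is known up front."""
    x = r                                   # copies still to be emitted
    for i, t in enumerate(stream):
        if x == 0:
            break
        # Each remaining draw picks this tuple with probability 1 / (n - i).
        copies = binomial(x, 1.0 / (n - i), rng)
        for _ in range(copies):
            yield t
        x -= copies

def black_box_u2(stream, r, rng):
    """WR sample of size r from a stream of unknown length."""
    slots = [None] * r
    for seen, t in enumerate(stream, start=1):
        for j in range(r):
            # Each slot independently switches to the new tuple w.p. 1 / seen.
            if rng.random() < 1.0 / seen:
                slots[j] = t
    return slots

rng = random.Random(42)
u1 = list(black_box_u1(range(50), 50, 8, rng))
u2 = black_box_u2(range(50), 8, rng)
```

The first version can emit sample tuples as the stream goes by, whereas the second can only report its sample once the whole stream has been seen.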

Page 9: On random sampling over Joins

Weighted Sequential Sampling

Black-Box WR1

Black-Box WR2
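
The weighted black boxes can be sketched the same way, with each tuple chosen in proportion to a non-negative weight w(t). Here `black_box_wr1` assumes the total weight is known before the pass and `black_box_wr2` assumes it is not. The names, the `weight` callback, and the integer weights in the usage lines are assumptions made for illustration:

```python
import random

def binomial(n, p, rng):
    """Number of successes in n Bernoulli(p) trials (simple O(n) draw)."""
    return sum(rng.random() < p for _ in range(n))

def black_box_wr1(stream, weight, total_weight, r, rng):
    """Weighted WR sample of size r when the total weight is known up front."""
    x = r                 # copies still to be emitted
    done = 0              # weight of the tuples already processed
    for t in stream:
        remaining = total_weight - done
        if x == 0 or remaining <= 0:
            break
        w = weight(t)
        # Each remaining draw picks this tuple w.p. w / (weight not yet passed).
        copies = binomial(x, min(1.0, w / remaining), rng)
        for _ in range(copies):
            yield t
        x -= copies
        done += w

def black_box_wr2(stream, weight, r, rng):
    """Weighted WR sample of size r when the total weight is unknown."""
    slots = [None] * r
    seen_weight = 0
    for t in stream:
        w = weight(t)
        if w <= 0:
            continue      # zero-weight tuples can never be selected
        seen_weight += w
        for j in range(r):
            if rng.random() < w / seen_weight:
                slots[j] = t
    return slots

rng = random.Random(7)
items = [("x", 1), ("y", 0), ("z", 3)]      # (name, weight) pairs
total = sum(w for _, w in items)
wr1 = list(black_box_wr1(items, lambda t: t[1], total, 5, rng))
wr2 = black_box_wr2(items, lambda t: t[1], 5, rng)
```

With all weights equal, both functions reduce to the unweighted case.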

Page 10: On random sampling over Joins

Sampling Strategies (old)

Strategy Naïve-Sample

Strategy Olken-Sample
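
The slide only names the two older strategies. As a hedged sketch of how they are usually described: the naive approach computes the whole join and then samples from it, while Olken-style sampling picks a random tuple of R1, picks a random matching tuple of R2 through an index, and accepts the pair with probability m2(v) / M, where m2(v) is the number of R2 tuples with join value v and M is the largest such count. The in-memory dictionary standing in for an index, the function names, and the toy relations are all assumptions:

```python
import random
from collections import defaultdict

def naive_sample(r1, r2, r, rng):
    """Compute the full join on the first attribute, then draw r tuples WR."""
    joined = [t1 + t2[1:] for t1 in r1 for t2 in r2 if t1[0] == t2[0]]
    return [rng.choice(joined) for _ in range(r)]

def olken_sample(r1, r2, r, rng):
    """Rejection sampling over the join, using an index on R2's join attribute."""
    index = defaultdict(list)
    for t2 in r2:
        index[t2[0]].append(t2)
    if not any(t1[0] in index for t1 in r1):
        raise ValueError("the join is empty, nothing to sample")
    max_freq = max(len(group) for group in index.values())
    out = []
    while len(out) < r:
        t1 = rng.choice(r1)
        matches = index.get(t1[0], [])
        # Accepting w.p. m2(v) / M makes every joined pair equally likely.
        if rng.random() < len(matches) / max_freq:
            t2 = rng.choice(matches)
            out.append(t1 + t2[1:])
    return out

rng = random.Random(3)
R1 = [("a1", "b0"), ("a2", "b1"), ("a2", "b2"), ("a3", "b3")]
R2 = [("a2", "c0"), ("a1", "c1"), ("a1", "c2")]
full = [t1 + t2[1:] for t1 in R1 for t2 in R2 if t1[0] == t2[0]]
naive = naive_sample(R1, R2, 6, rng)
olken = olken_sample(R1, R2, 6, rng)
```

The naive version pays for the entire join before sampling; the Olken-style version avoids that cost but needs both random access to R1 and an index on R2.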

Page 11: On random sampling over Joins

New strategies for join Sampling

Strategy Stream-Sample

Strategy Group-Sample

Strategy Frequency-Partition-Sample
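
For the first of these strategies, a minimal sketch under stated assumptions: Stream-Sample is taken here to draw a weighted with-replacement sample of R1 in which each tuple's weight is the number of matching tuples in R2, and then to pair every sampled R1 tuple with one random matching R2 tuple. The sketch swaps in Python's `random.choices` for the weighted sequential black box and an in-memory dictionary for the index and statistics on R2; the function name and toy relations are illustrative:

```python
import random
from collections import defaultdict

def stream_sample(r1, r2, r, rng):
    """Uniform WR sample of size r from the join of r1 and r2 on attribute A."""
    index = defaultdict(list)
    for t2 in r2:
        index[t2[0]].append(t2)
    # Weight of an R1 tuple = frequency of its join value in R2.
    weights = [len(index.get(t1[0], ())) for t1 in r1]
    # random.choices raises ValueError if every weight is zero (empty join).
    s1 = rng.choices(r1, weights=weights, k=r)
    # Zero-weight tuples are never chosen, so each lookup has a match.
    return [t1 + rng.choice(index[t1[0]])[1:] for t1 in s1]

rng = random.Random(5)
R1 = [("a1", "b0"), ("a2", "b1"), ("a2", "b2"), ("a3", "b3")]
R2 = [("a2", "c0"), ("a1", "c1"), ("a1", "c2")]
full = [t1 + t2[1:] for t1 in R1 for t2 in R2 if t1[0] == t2[0]]
sample = stream_sample(R1, R2, 6, rng)
```

An R1 tuple with m matches is picked with probability proportional to m and its partner with probability 1/m, so every tuple of the join is equally likely and no sampled tuple is rejected.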

Page 12: On random sampling over Joins

Strategy Frequency-Partition-Sample

Page 13: On random sampling over Joins

Experimental Evaluation 1

Page 14: On random sampling over Joins

Experimental Evaluation 2

Page 15: On random sampling over Joins

Experimental Evaluation 3

Page 16: On random sampling over Joins

Conclusions

Difficulty of join sampling

Classification of the problem into 3 cases

Strategies for join sampling

New schemes for sequential random sampling for uniform and weighted sampling

More efficient strategies can be developed for the case of a single join

More work is needed to understand the problem of sampling the result of join trees

Page 17: On random sampling over Joins

Thank You