Microsoft SQL Server Physical Join Operators

Post on 22-Nov-2014

DESCRIPTION

SQL Server implements three different physical operators to perform joins. In this presentation you'll see how each of these operators works, along with its advantages and challenges. You'll learn:

* The logic behind the optimizer's decisions

* Which operator to use for various joins, using (semi) real-life examples

* How to avoid common join-related pitfalls

Ami Levin is a Microsoft SQL Server MVP and a Mentor with SolidQ. For the past 14 years, he has been consulting, teaching, writing, and speaking about SQL Server worldwide. Levin’s areas of expertise are data modeling, database design, T-SQL, and performance tuning. Before moving to California, he led the Israeli SQL Server user group (ISUG) and moderated the Hebrew MSDN SQL Server support forum. Ami is a regular speaker at Microsoft Tech-Ed Israel, Dev Academy, and other SQL Server conferences. He blogs at SQL Server Tuning Blog.

TRANSCRIPT

Ami Levin, SolidQ

Presented to the Silicon Valley SQL Server User Group, April 2013

Nesting Merged Hash Loops

Ami Levin, CTO, DBSophic

SQL Server Physical Join Operators

Session Goals

SQL Server uses three physical join operators: Nested Loops, Merge, and Hash Match.

In this session we will:

• See how each of these operators works

• Review their advantages and drawbacks

• Understand some of the logic behind the optimizer’s decisions on which operator to use

• Learn to identify common join-related pitfalls

Not This Time

• Outer joins

• Non equi-joins

• Logical processing order

• NULL issues

• Join parallelism

• Partitioned joins

• …


Equi-Inner-Join

SELECT Foo, Bar, ...

FROM T1 INNER JOIN T2

ON T1.C1 = T2.C1

AND T1.C2 = T2.C2

AND ...

WHERE ...


Visual Join Simulator


Nested Loops

[Flowchart: Start; fetch next row from blue input; row exists? (True / False); find matching rows in red input; Quit]
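The nested loops flow can be sketched in a few lines of Python. This is only an illustration of the logic, not SQL Server's implementation: the function name `nested_loops_join`, the key-extractor parameters, and the sample `customers` and `orders` rows are all assumptions made for the example. The outer list plays the role of the blue input and the inner list the role of the red input.

```python
def nested_loops_join(outer, inner, key_outer, key_inner):
    """Join two lists of rows by scanning the inner input once per outer row."""
    result = []
    for outer_row in outer:          # fetch next row from the outer (blue) input
        for inner_row in inner:      # find matching rows in the inner (red) input
            if key_outer(outer_row) == key_inner(inner_row):
                result.append((outer_row, inner_row))
    return result                    # outer input exhausted: quit


# Hypothetical sample data: a small parent input and a larger child input.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [
    {"customer_id": 2, "total": 10},
    {"customer_id": 1, "total": 25},
    {"customer_id": 2, "total": 40},
]

joined = nested_loops_join(
    customers, orders, lambda c: c["id"], lambda o: o["customer_id"]
)
for customer, order in joined:
    print(customer["name"], order["total"])
```

Note that the matches for each outer row come out together, which is the sense in which nested loops output is unordered but typically grouped.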

Nested Loops I

• Outer loop determines number of iterations

  • At least one input should be (relatively) small

• Inner operation is performed for every iteration of the outer loop

  • Index or table scan (naïve)

  • Index seek + lookup

  • Covering index seek

  • Index spool
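To see why the choice of inner operation matters, the sketch below counts how many inner rows are touched when the inner side is scanned on every iteration versus when matching rows are looked up directly. It is a rough model under stated assumptions: a Python dictionary stands in for an index on the join column (a real SQL Server index is a B-tree, not a hash map), and the function names and sample data are hypothetical.

```python
from collections import defaultdict


def join_with_scan(outer, inner):
    """Naive inner operation: scan the whole inner input for every outer row."""
    touched, result = 0, []
    for o_key, o_val in outer:
        for i_key, i_val in inner:
            touched += 1
            if o_key == i_key:
                result.append((o_key, o_val, i_val))
    return result, touched


def join_with_seek(outer, inner_index):
    """Seek-style inner operation: visit only the rows that match the outer key."""
    touched, result = 0, []
    for o_key, o_val in outer:
        for i_val in inner_index.get(o_key, []):
            touched += 1
            result.append((o_key, o_val, i_val))
    return result, touched


outer = [(1, "a"), (2, "b")]                            # small outer input
inner = [(k % 100, f"row{k}") for k in range(1000)]     # larger inner input

inner_index = defaultdict(list)   # stand-in for an index on the join column
for i_key, i_val in inner:
    inner_index[i_key].append(i_val)

scan_result, scan_touched = join_with_scan(outer, inner)
seek_result, seek_touched = join_with_seek(outer, inner_index)
print(scan_touched, seek_touched)
```

Both variants return the same rows; only the amount of inner-side work differs, and that work is repeated once per outer row.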

Nested Loops II

• Data pages may be accessed repeatedly

  • Risky non-sequential page access pattern

• Output of matching row sets is fast

  • Unordered, but typically grouped

• Physical resources

  • CPU: very low

  • Physical IO: low to very high

  • Memory: low

Nested Loops with Foreign Key Joins

• Foreign keys join parent and child

• Most common relationship is one-to-many

• Often parent input is significantly smaller

• Parent must already be indexed

  • Either primary key or unique constraint

• Therefore, indexing foreign keys often enables efficient use of nested loops

Nested Loops

Merge

[Flowchart: Start; fetch next row from blue input; fetch next row from red input; row exists? (True / False); rows match? (True / False); Quit]
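A minimal Python sketch of the merge flow, assuming both inputs are already sorted by the join key and that keys in the first input are unique (the one-to-many case). The function name and the sample rows are hypothetical, and the code illustrates the idea only, not SQL Server's internals.

```python
def merge_join_one_to_many(left, right):
    """Merge two lists of (key, value) rows sorted by key; keys in `left` are unique."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):   # stop when either input is exhausted
        l_key, l_val = left[i]
        r_key, r_val = right[j]
        if l_key < r_key:
            i += 1                            # left row is behind: advance left
        elif l_key > r_key:
            j += 1                            # right row is behind: advance right
        else:
            result.append((l_key, l_val, r_val))
            j += 1                            # keep the left row: more right rows may match
    return result


left = [(1, "Ann"), (2, "Bob"), (4, "Dee")]      # sorted, unique keys
right = [(1, 25), (2, 10), (2, 40), (3, 99)]     # sorted, keys may repeat

merged = merge_join_one_to_many(left, right)
print(merged)
```

Each input is read exactly once and never rewound, and the output comes out in join-key order.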

Merge I

• Inputs must be sorted prior to merge

  • Sorted by (all?) join expression(s)

  • Pre-sorted in plan, but not necessarily in DB

• Preferred when sorting supports additional plan operations

• Merge join types

  • One to many

  • Many to many: requires temporary worktable
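The reason a many-to-many merge needs extra storage can be illustrated as follows. When the same key repeats on both sides, the rows from one side have to be set aside and replayed for each duplicate on the other side. In this sketch a plain Python list named `worktable` plays that role; it is an assumption made for illustration and does not describe how SQL Server manages its worktable.

```python
def merge_join_many_to_many(left, right):
    """Merge two lists of (key, value) rows sorted by key; keys may repeat on both sides."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        l_key = left[i][0]
        r_key = right[j][0]
        if l_key < r_key:
            i += 1
        elif l_key > r_key:
            j += 1
        else:
            # Set aside every right row that carries this key.
            worktable = []
            while j < len(right) and right[j][0] == l_key:
                worktable.append(right[j][1])
                j += 1
            # Replay the saved rows for each left row with the same key.
            while i < len(left) and left[i][0] == l_key:
                for r_val in worktable:
                    result.append((l_key, left[i][1], r_val))
                i += 1
    return result


left = [(1, "a"), (2, "b1"), (2, "b2")]
right = [(2, "x"), (2, "y"), (3, "z")]

many = merge_join_many_to_many(left, right)
print(many)
```

In the one-to-many case the buffer is unnecessary, because each row on the unique side can simply be held in place while the other side advances.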


Merge II

• Residual predicates

• Fast, ordered and grouped output

• Physical resources*

  • CPU: very low

  • Physical IO: very low

  • Memory: very low

* Excluding sorting costs

Merge


Hash Match - Phase I (Build)

[Flowchart: Start; fetch next row from blue input; row exists? (True / False); apply hash function; Phase II]

Hash Match - Phase II (Probe)

[Flowchart: Phase I; fetch next row from red input; row exists? (True / False); apply hash function; Quit]
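The two phases can be sketched in Python as below. This is an illustrative model only: Python's built-in `hash` stands in for whatever hash function SQL Server selects, the bucket structure is a plain dictionary, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def hash_join(build_input, probe_input):
    """Join two lists of (key, value) rows with a build phase and a probe phase."""
    # Phase I (build): hash every row of the build (blue) input into a bucket.
    buckets = defaultdict(list)
    for key, value in build_input:
        buckets[hash(key)].append((key, value))

    # Phase II (probe): hash every row of the probe (red) input and check its bucket.
    result = []
    for key, value in probe_input:
        for b_key, b_value in buckets.get(hash(key), []):
            if b_key == key:   # recheck the key: different keys can share a hash value
                result.append((key, b_value, value))
    return result


build = [(1, "Ann"), (2, "Bob")]
probe = [(2, 10), (3, 99), (1, 25), (2, 40)]

joined = hash_join(build, probe)
print(joined)
```

No output row can be produced until the whole build input has been consumed, and the result follows the arrival order of the probe rows rather than key order.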

Hash Match I

• Hash function selection

  • Extremely complex

  • CPU intensive

• Build and probe costs are hidden

  • Do not constitute logical reads

• Output of matching row sets is slow

  • Unordered and typically ungrouped

Hash Match II

• In-memory hash join

• Grace hash join

• Recursive hash join

• Hash bailout

• Hash warnings event class

• Update Statistics

• Add more RAM

• Role reversal
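The idea behind a grace hash join and role reversal can be sketched as follows: both inputs are first split into partitions by a hash of the join key, and each pair of partitions is then joined separately, with the smaller side of the pair used as the build input. This is a simplified model under loud assumptions: in-memory lists stand in for partitions that would be written to disk, the partition count is arbitrary, and the function names are hypothetical.

```python
from collections import defaultdict


def in_memory_hash_join(build, probe, build_is_left):
    """Hash join of one partition pair; output columns are always (key, left, right)."""
    table = defaultdict(list)
    for key, value in build:
        table[key].append(value)
    out = []
    for key, value in probe:
        for b_value in table.get(key, []):
            out.append((key, b_value, value) if build_is_left else (key, value, b_value))
    return out


def grace_hash_join(left, right, partitions=4):
    """Partition both inputs by hash of the key, then join partition by partition."""
    left_parts = [[] for _ in range(partitions)]
    right_parts = [[] for _ in range(partitions)]
    for row in left:
        left_parts[hash(row[0]) % partitions].append(row)
    for row in right:
        right_parts[hash(row[0]) % partitions].append(row)

    result = []
    for l_part, r_part in zip(left_parts, right_parts):
        # Role reversal: build on whichever side of this pair is smaller.
        if len(l_part) <= len(r_part):
            result.extend(in_memory_hash_join(l_part, r_part, True))
        else:
            result.extend(in_memory_hash_join(r_part, l_part, False))
    return result


left = [(k, f"L{k}") for k in range(8)]
right = [(0, "R0"), (0, "R1"), (0, "R2"), (1, "R3")]

joined = grace_hash_join(left, right)
print(sorted(joined))
```

Rows with equal keys always land in the same partition number on both sides, so joining matching partition pairs finds every match while only one pair needs to be held in memory at a time.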

Hash Match III

• May indicate sub-optimal indexing

• Best for very large, non-covered joins

• Physical resources

  • CPU: very high

  • Physical IO: low to very high

  • Memory: very high

Hash Match


Summary

                Nested Loops                Merge                       Hash

Good when       Small outer input;          Pre-sorted inputs;          Very large inputs;
                inner input indexed         sorting needed              not well indexed

CPU             Low                         Low (excluding sorting)     High

Memory          Low                         Low (excluding sorting)     High

Physical IO     Low / High                  Low                         Low / High

Logical reads   High                        Low                         Low (misleading)

Output          Fast, unordered,            Fast, ordered,              Slow, unordered,
                grouped*                    grouped                     ungrouped*

For More Information

• Books Online

• White papers

• “Inside Microsoft SQL Server” books

• Craig Freedman’s blog

  • http://blogs.msdn.com/craigfr/about.aspx


Thank you

Ami Levin, SolidQ
