n1ql: query optimizer improvements in couchbase 5.0. by, sitaram vemulapalli

Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved. N1QL QUERY OPTIMIZER AND IMPROVEMENTS IN 5.0

Upload: keshav-murthy

Post on 22-Jan-2018


TRANSCRIPT

Page 1: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


N1QL QUERY OPTIMIZER AND IMPROVEMENTS IN 5.0

Page 2: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


AGENDA

01 Optimizer Overview

02 Improvements in 5.0

03 Q&A

Page 3: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


1 OPTIMIZER OVERVIEW

Page 4: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Query Execution Flow

Page 5: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Query Service

Page 6: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Query Execution Phases

Page 7: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Optimizer

• Query Rewrite

  • N1QL does very limited rewrite.

• Access Path Selection

  • KeyScan access

  • IndexScan access

  • PrimaryScan access

• Join Order, Types, and Methods

  • The keyspaces specified in the FROM clause are joined in the exact order given in the query.

  • Nested loop join

  • Lookup join

  • Index join

• Execution Plan

Page 8: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Optimizer

• The optimizer considers the possible ways to execute a query and chooses the best query plan.

• The query plan is generated using rule-based optimization.

• An index that cannot satisfy the query is not chosen.

• If an index scan can be performed, a full / primary scan is not performed.

• Each query block (i.e. SELECT…) has its own query plan.

Page 9: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Index Selection

• Online indexes

  • Only online indexes are considered.

• Preferred indexes

  • If a USE INDEX hint is provided, the indexes in that list are considered.

• Satisfying index condition

  • Partial / filtered indexes are considered when the index condition is a superset of the query predicate.

• Satisfying index keys

  • Indexes whose leading keys satisfy the query predicate are considered.

• Longest satisfying index keys

  • Redundancy is eliminated by keeping the index with the longest satisfying index keys in the same order.

  • An index with satisfying keys (a,b,c) is retained over an index with satisfying keys (a,b).
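The selection rules above can be sketched as a small filter pipeline. This is a toy model in Python, not the query service's actual code: the `Index` class, `satisfying_prefix`, and `select_indexes` are hypothetical names, the USE INDEX list is assumed to restrict the candidates, and the partial-index condition check is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    name: str
    keys: tuple          # index keys in order, e.g. ("a", "b", "c")
    online: bool = True


def satisfying_prefix(index, predicate_fields):
    """Return the leading index keys that the query predicate constrains."""
    prefix = []
    for key in index.keys:
        if key not in predicate_fields:
            break
        prefix.append(key)
    return tuple(prefix)


def select_indexes(indexes, predicate_fields, hint=None):
    # Online indexes only; restricted to the USE INDEX list when one is given
    # (the restriction is an assumption of this sketch).
    pool = [i for i in indexes if i.online and (hint is None or i.name in hint)]
    # Satisfying index keys: the leading key must be constrained by the predicate.
    scored = [(i, satisfying_prefix(i, predicate_fields)) for i in pool]
    scored = [(i, p) for i, p in scored if p]
    # Longest satisfying keys: drop an index whose satisfying keys are a
    # proper prefix of another candidate's satisfying keys.
    keep = []
    for i, p in scored:
        dominated = any(len(q) > len(p) and q[:len(p)] == p for _, q in scored)
        if not dominated:
            keep.append(i.name)
    return keep
```

With candidates on (a,b) and (a,b,c) and predicates on a, b, and c, only the (a,b,c) index survives, which mirrors the last bullet above.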

Page 10: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Access Path Selection

• Key Scan

  • If the query contains a USE KEYS clause, no index scan or primary scan is performed. The input document keys are taken directly from the USE KEYS clause.

• Index Count Scan

• Covering Secondary Scan

• Regular secondary scan: longest satisfying keys, intersect scan

  • To avoid IntersectScan, provide a hint with USE INDEX.

• UNNEST scan

  • Only array indexes with an index key matching the predicates are used for UNNEST scan.

• Regular primary scan

  • If a primary scan is selected, and there is no primary index available, the query errors out.

Page 11: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Scan Methods

• Covering Primary Scan

• A covering primary scan is a primary scan that does not perform a subsequent document fetch. It is used for queries that need a full / primary scan and only reference META().id.

SELECT META(t).id FROM `travel-sample` t;

• Regular Primary Scan

• A regular primary scan also performs a subsequent document fetch. It is used for queries that need a full / primary scan and reference some document data other than META().id.

SELECT META(t).cas FROM `travel-sample` t;

SELECT * FROM `travel-sample` t;

Page 12: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Scan Methods

Covering Secondary Scan

• Each satisfying index with the most index keys is examined for query coverage.

• The shortest covering index is used.

CREATE INDEX ts_name ON `travel-sample`(country, name) WHERE type = "hotel";

SELECT country, name, type, META().id

FROM `travel-sample`

WHERE type = "hotel" AND country = "United States";

Regular Secondary Scan

• Indexes with the most matching index keys are used.

• When more than one index qualifies, IntersectScan is used.

• To avoid IntersectScan, provide a hint with USE INDEX.

SELECT country, name, type, META().id, phone

FROM `travel-sample`

WHERE type = "hotel" AND country = "United States";
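The difference between the two queries above comes down to a coverage check, sketched below in Python. This is an illustration only: `is_covered` is a hypothetical helper, and it assumes (as the first example suggests) that a field fixed by an equality in the index WHERE clause counts as available from the index, along with META().id.

```python
def is_covered(referenced, index_keys, index_condition_fields=()):
    """True if every referenced expression is available from the index alone."""
    # Assumption: index keys, fields pinned by the index condition, and the
    # document key are all available without fetching the document.
    available = set(index_keys) | set(index_condition_fields) | {"META().id"}
    return set(referenced) <= available


# ts_name is defined on (country, name) WHERE type = "hotel".
ts_name_keys = ("country", "name")
ts_name_condition = ("type",)

first_query = ["country", "name", "type", "META().id"]
second_query = ["country", "name", "type", "META().id", "phone"]

covered_first = is_covered(first_query, ts_name_keys, ts_name_condition)
covered_second = is_covered(second_query, ts_name_keys, ts_name_condition)
```

Here `covered_first` is true and `covered_second` is false, because `phone` is not in the index and forces a document fetch.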


Page 13: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Scan Methods

UNNEST Scan

• Only array indexes are considered, and only for queries with UNNEST clauses.

Index Count Scan

• Queries with a single projection of the COUNT aggregate and no JOIN or GROUP BY are considered.

• The chosen index must cover the query with a single, exact range that can be pushed to the indexer, and the argument to COUNT must be a constant or the leading index key.

SELECT COUNT(1)

FROM `travel-sample`

WHERE type = "hotel" AND country = "United States";


Page 14: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


2 IMPROVEMENTS IN 5.0

Page 15: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


UnionScan

• An OR predicate can use multiple indexes.

• Each index performs an IndexScan, and the results are merged using UnionScan.

• Each IndexScan can push a variable number of index keys.

• If all IndexScans under the UnionScan are covered, the UnionScan is covered.

CREATE INDEX ts_cc ON `travel-sample` (country, city) WHERE type = "hotel";

CREATE INDEX ts_n ON `travel-sample` (name) WHERE type = "hotel";

EXPLAIN SELECT name, country, city

FROM `travel-sample`

WHERE type = "hotel" AND

((country = "United States" AND city = "San Francisco")

OR (name = "White Wolf"));

{ "#operator": "UnionScan",

"scans": [{ "index": "ts_cc",

"spans": [ { "range": [

{ "high": "\"United States\"",

"inclusion": 3, "low": "\"United States\"" },

{ "high": "\"San Francisco\"",

"inclusion": 3, "low": "\"San Francisco\"" } ]

} ]

},

{ "index": "ts_n",

"spans": [ { "range": [ { "high": "\"White Wolf\"", "inclusion": 3, "low": "\"White Wolf\"" } ] } ]

} ]

}
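The merge step of a UnionScan can be pictured as a de-duplicating union of the document keys returned by each child scan. The Python sketch below is a toy model under that assumption; `union_scan` and the document keys are hypothetical, and the real operator works on streamed index entries rather than in-memory lists.

```python
def union_scan(*scans):
    """Merge document keys from several index scans, dropping duplicates."""
    seen = set()
    for scan in scans:
        for doc_id in scan:
            # A document that matches both OR branches is emitted once.
            if doc_id not in seen:
                seen.add(doc_id)
                yield doc_id


# Hypothetical document keys returned by the ts_cc and ts_n scans.
from_ts_cc = ["hotel_1", "hotel_2"]
from_ts_n = ["hotel_2", "hotel_3"]
merged = list(union_scan(from_ts_cc, from_ts_n))
```

A hotel in San Francisco that is also named "White Wolf" would be found by both scans, which is why the merge has to remove duplicates.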

Page 16: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


IntersectScan

• IntersectScan is improved by terminating the scans early when one of the scans completes or the limit is reached. Also, only the results of the completed scan are considered as possible candidates.

• If the query has an ORDER BY and a predicate on the ORDER BY keys, it uses OrderedIntersectScan when possible.

EXPLAIN

SELECT name, country, city

FROM `travel-sample`

WHERE type = "hotel" AND

country = "United States" AND

city = "San Francisco" AND

name >= "White Wolf"

ORDER BY name;

{ "#operator": "OrderedIntersectScan",

"scans": [ { "index": "ts_n",

"spans": [ {

"range": [ { "inclusion": 1,

"low": "\"White Wolf\"" } ] } ]

},

{ "index": "ts_cc",

"spans": [ {

"range": [ { "high": "\"United States\"",

"inclusion": 3, "low": "\"United States\"" },

{ "high": "\"San Francisco\"",

"inclusion": 3, "low": "\"San Francisco\"" } ]

} ]

} ]

}
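One way to picture an ordered intersect is to walk the scan that already delivers the ORDER BY order and keep only the keys that the other scans also return, stopping once the limit is met. The Python sketch below is a simplification with hypothetical names: it materializes the other scans into sets, which the real operator does not necessarily do.

```python
from itertools import islice


def ordered_intersect_scan(ordered_scan, *other_scans, limit=None):
    """Keep the order of the first scan; retain keys present in every other scan."""
    others = [set(scan) for scan in other_scans]
    matches = (d for d in ordered_scan if all(d in s for s in others))
    # islice stops pulling from the ordered scan as soon as `limit` rows exist.
    return list(islice(matches, limit))


# Hypothetical document keys: ts_n returns keys in name order, ts_cc in any order.
by_name = ["hotel_a", "hotel_b", "hotel_c", "hotel_d"]
in_san_francisco = ["hotel_d", "hotel_b", "hotel_c"]
first_two = ordered_intersect_scan(by_name, in_san_francisco, limit=2)
```

Because the output follows the order of the first scan, no separate sort is needed for ORDER BY name in this model.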

Page 17: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Implicit Covering Array Index

• N1QL supports a simplified implicit covering array index syntax in certain cases, where the mandatory array index-key requirement is relaxed to create a covering array index.

• This applies when the predicates can be exactly and completely pushed to the indexer during the array index scan.

• No false positives

CREATE INDEX ts_r_simple ON `travel-sample` (DISTINCT ARRAY v.flight FOR v IN schedule END) WHERE type = "route";

EXPLAIN SELECT meta().id

FROM `travel-sample`

WHERE type = "route" AND

ANY v IN schedule SATISFIES v.flight LIKE 'UA%'

END;

Page 18: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Stable Scans

In earlier versions, IndexScan performed a single range scan (i.e. a single span).

If the query had multiple ranges (i.e. OR, IN, NOT clauses), N1QL performed a separate IndexScan for each range.

• The indexer could use a different snapshot for each scan, making the overall scan unstable.

• The number of IndexScans was higher, resulting in more index connections.

In 5.0.0, multiple ranges are passed to the indexer, and the indexer uses the same snapshot for all the ranges.

If EXPLAIN shows the operator IndexScan2, the query uses stable scans.

EXPLAIN SELECT name, country, city

FROM `travel-sample`

WHERE type = "hotel" AND

country IN ["United States" , "France"];

{ "#operator": "IndexScan2",

"index": "ts_cc",

"spans": [

{ "range": [ { "high": "\"France\"",

"inclusion": 3,

"low": "\"France\""

}]

},

{ "range": [ { "high": "\"United States\"",

"inclusion": 3,

"low": "\"United States\""

}]

}

]

}
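The spans in this plan can be read as one equality range per IN value, all carried by a single scan request. The Python sketch below builds that structure; `spans_for_in_list` is a hypothetical helper, the sorted order is an assumption made to match the EXPLAIN output shown, and `inclusion: 3` is taken from that output, where it appears on ranges with both bounds set.

```python
import json


def spans_for_in_list(values):
    """Build one equality span per distinct IN value for a single scan request."""
    return [
        {
            "range": [
                {
                    # Bounds are stored as JSON-encoded expressions, as in EXPLAIN.
                    "high": json.dumps(value),
                    "inclusion": 3,
                    "low": json.dumps(value),
                }
            ]
        }
        for value in sorted(set(values))
    ]


spans = spans_for_in_list(["United States", "France"])
```

Since both spans travel in one request, the indexer can answer them from the same snapshot, which is the stability property described above.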

Page 19: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Efficiently Pushdown Composite Filters

• In earlier versions, the spans pushed to the indexer for a composite index contained a single range for all composite keys together.

• The indexer did not apply the range to each part of the key separately, which could result in many false positives.

• In 5.0.0, with IndexScan2, the range for each index key is pushed separately, and the indexer applies each range to its key separately.

• This results in few or no false positives and helps push more information to the indexer.

EXPLAIN SELECT name, country, city

FROM `travel-sample`

WHERE type = "hotel" AND

country >= "United States" AND

city = "San Francisco";

{ "#operator": "IndexScan2",

"index": "ts_cc",

"spans": [

{ "range": [ {"inclusion": 1,

"low": "\"United States\""

},

{ "high": "\"San Francisco\"",

"inclusion": 3,

"low": "\"San Francisco\""

}

]

}

]

}
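The false-positive problem can be reproduced with plain tuple comparison. The Python sketch below uses hypothetical (country, city) entries and assumes the pre-5.0 span was a single lower bound over the whole composite key; it is an illustration, not the indexer's algorithm.

```python
# Hypothetical (country, city) index entries, in index order.
entries = [
    ("United Kingdom", "London"),
    ("United States", "Boston"),
    ("United States", "San Francisco"),
    ("United States", "Seattle"),
    ("Uruguay", "San Francisco"),
    ("Vietnam", "Hanoi"),
]


def composite_range_scan(entries, low):
    """Pre-5.0 style (assumed): one range compared against the whole key."""
    return [e for e in entries if e >= low]


def per_key_scan(entries, country_low, city):
    """IndexScan2 style: each index key is checked against its own range."""
    return [e for e in entries if e[0] >= country_low and e[1] == city]


old_style = composite_range_scan(entries, ("United States", "San Francisco"))
new_style = per_key_scan(entries, "United States", "San Francisco")
false_positives = [e for e in old_style if e not in new_style]
```

In this data the composite range also returns the Seattle and Hanoi entries, which the query service would then have to filter out itself; the per-key ranges return only the two San Francisco entries.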

Page 20: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Pagination (ORDER, OFFSET, LIMIT)

• Pagination queries can contain any combination of ORDER BY, LIMIT, and OFFSET clauses.

• When the predicates are completely and exactly pushed to the indexer, pushing OFFSET and LIMIT to the indexer can improve query performance significantly. When that happens, the IndexScan2 section of EXPLAIN has limit and offset.

• If the query ORDER BY matches the index key order, the query can exploit the index order and avoid a sort. When that happens, the Order operator is not present in the EXPLAIN.

EXPLAIN SELECT country, city

FROM `travel-sample`

WHERE type = "hotel" AND

country >= "United States"

ORDER BY country, city

OFFSET 1 LIMIT 10;

{ "#operator": "IndexScan2",

"index": "ts_cc",

"limit": "10",

"offset": "1",

"spans": [

{ "range": [ {"inclusion": 1,

"low": "\"United States\""

}

]

}

]

}
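The benefit of pushing OFFSET and LIMIT can be seen in a small model where the scan walks entries in index order and stops after offset + limit matches. The Python sketch below uses hypothetical entries and names, and assumes the entries are already sorted by (country, city), as they would be in the ts_cc index.

```python
from itertools import islice

# Hypothetical (country, city) entries, already in index order.
index_entries = [
    ("United Kingdom", "London"),
    ("United States", "Boston"),
    ("United States", "San Francisco"),
    ("United States", "Seattle"),
    ("Uruguay", "San Francisco"),
    ("Vietnam", "Hanoi"),
]


def scan_with_pushdown(entries, low, offset, limit):
    """Walk entries in index order and stop after offset + limit matches."""
    matches = (e for e in entries if e[0] >= low)
    return list(islice(matches, offset, offset + limit))


def scan_then_sort(entries, low, offset, limit):
    """Without pushdown: collect every match, sort, then slice."""
    matches = sorted(e for e in entries if e[0] >= low)
    return matches[offset:offset + limit]


page = scan_with_pushdown(index_entries, "United States", offset=1, limit=2)
```

Both functions return the same page, but the pushdown version never reads past the rows it needs and has no sort step.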

Page 21: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


DESC Index Collation

• An index can be created with ASC/DESC collation on each index key.

• A query can utilize the index collation.

CREATE INDEX ts_acc ON `travel-sample` (country DESC,

city ASC) WHERE type = "airline";

EXPLAIN SELECT country, city

FROM `travel-sample`

WHERE type = "airline" AND

country >= "United States"

ORDER BY country DESC , city

OFFSET 1 LIMIT 10;

{ "#operator": "IndexScan2",

"index": "ts_acc",

"limit": "10",

"offset": "1",

"spans": [

{ "range": [ {"inclusion": 1,

"low": "\"United States\""

}

]

}

]

}

Page 22: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


MAX pushdown

• If the MAX argument matches the leading index key, the query exploits the index order for MAX.

• MAX can be pushed down only when the leading index key is DESC.

• MIN can be pushed down only when the leading index key is ASC.

• If pushdown happens, "limit": "1" will appear in the IndexScan2 section of the EXPLAIN.

CREATE INDEX ts_acc ON `travel-sample` (country DESC,

city ASC) WHERE type = "airline";

EXPLAIN SELECT MAX(country)

FROM `travel-sample`

WHERE type = "airline" AND

country >= "United States";

{ "#operator": "IndexScan2",

"index": "ts_acc",

"limit": "1",

"spans": [

{ "range": [ {"inclusion": 1,

"low": "\"United States\""

}

]

}

]

}
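The "limit": "1" in this plan follows from the index order: with a DESC leading key, the first entry of the scan is already the maximum. The Python sketch below is a toy model with hypothetical names and values, assuming the only predicate is a lower bound on the leading key.

```python
def max_with_desc_index(desc_values, low):
    """Return MAX over values >= low, reading one entry from a DESC-ordered scan."""
    # In descending order the first entry is the largest value in the index.
    first = next(iter(desc_values), None)
    if first is None or first < low:
        return None  # nothing qualifies
    return first


# Hypothetical leading-key values in DESC index order.
countries_desc = ["Vietnam", "United States", "France"]
largest = max_with_desc_index(countries_desc, "United States")
```

A MIN over an ASC leading key works the same way from the other end of the range.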

Page 23: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


COUNT (DISTINCT expr)

• If the expr matches the leading index key, COUNT DISTINCT can be pushed to the indexer.

• The complete predicate needs to be pushed to the indexer exactly.

• No false positives are possible.

• No GROUP BY or JOIN

• Only a single projection

• When pushdown happens, IndexCountDistinctScan2 will appear in EXPLAIN.

EXPLAIN SELECT COUNT( DISTINCT country)

FROM `travel-sample`

WHERE type = "hotel" AND

country >= "United States";

{

"#operator": "IndexCountDistinctScan2",

"index": "ts_cc",

"spans": [

{ "range": [ {"inclusion": 1,

"low": "\"United States\""

}

]

}

]

}
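Counting distinct values of the leading key is cheap inside an index because equal values sit next to each other. The Python sketch below counts them in one pass; the entries and the function name are hypothetical, and it only illustrates why the leading-key requirement matters.

```python
# Hypothetical (country, city) entries, in index order.
sorted_entries = [
    ("United Kingdom", "London"),
    ("United States", "Boston"),
    ("United States", "San Francisco"),
    ("United States", "Seattle"),
    ("Uruguay", "San Francisco"),
    ("Vietnam", "Hanoi"),
]


def count_distinct_leading_key(entries, low):
    """Count distinct leading-key values >= low in a single ordered pass."""
    count = 0
    previous = None
    for country, _city in entries:
        if country < low:
            continue
        # Equal values are adjacent in index order, so one comparison suffices.
        if country != previous:
            count += 1
            previous = country
    return count


distinct_countries = count_distinct_leading_key(sorted_entries, "United States")
```

No set of seen values has to be kept, so the work can stay in the indexer and only a single number comes back.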

Page 24: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Index Projection

• An index can have many keys, but a query might be interested in only a subset of them.

• Requesting only the required information from the indexer can save a lot of network transfer, memory, CPU, backfill, etc. All this can help performance and cluster scaling.

• The requested information can be found in the "IndexScan2" section of EXPLAIN as "index_projection".

"index_projection": {

"entry_keys": [ xxx,....... ],

"primary_key": true

}

EXPLAIN SELECT country FROM `travel-sample` WHERE type = "hotel" AND country >= "United States";

"index_projection": { "entry_keys": [ 0 ] }

EXPLAIN SELECT country, city FROM `travel-sample` WHERE type = "hotel" AND country >= "United States";

"index_projection": { "entry_keys": [ 0, 1 ] }

EXPLAIN SELECT country, city, META().id FROM `travel-sample` WHERE type = "hotel" AND country >= "United States";

"index_projection": { "entry_keys": [ 0, 1 ], "primary_key": true }

EXPLAIN SELECT country, city, META().id, name FROM `travel-sample` WHERE type = "hotel" AND country >= "United States";

Non-covered query:

"index_projection": { "primary_key": true }
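The effect of "index_projection" can be modeled as picking positions out of each index entry. The Python sketch below is a toy illustration: `project_entry`, the row layout, and the document key are hypothetical, and only the meaning of `entry_keys` (positions of index keys) and `primary_key` (whether the document key is returned) is taken from the examples above.

```python
def project_entry(key_values, doc_id, entry_keys=(), primary_key=False):
    """Return only the requested index key positions, plus the document key if asked."""
    row = {"keys": [key_values[i] for i in entry_keys]}
    if primary_key:
        row["id"] = doc_id
    return row


# One hypothetical ts_cc entry: (country, city) plus its document key.
entry = ("United States", "San Francisco")
doc_id = "hotel_1"

only_country = project_entry(entry, doc_id, entry_keys=[0])
with_id = project_entry(entry, doc_id, entry_keys=[0, 1], primary_key=True)
non_covered = project_entry(entry, doc_id, primary_key=True)
```

In the non-covered case only the document key is requested, since the values come from the fetched document anyway.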

Page 25: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


Index Cas and Expiration

• META().cas and META().expiration can be indexed and used in queries.

• Note: META().expiration works in covered queries. For non-covered queries it returns 0.

CREATE INDEX ts_cas ON `travel-sample` (country,

META().cas, META().expiration) WHERE type = "airport";

EXPLAIN SELECT country, META().cas, META().expiration

FROM `travel-sample`

WHERE type = "airport" AND country = "United States";

{

"#operator": "IndexScan2",

"index": "ts_cas",

"spans": [

{ "range": [ { "high": "\"United States\"",

"inclusion": 3,

"low": "\"United States\""

}

]

}

]

}

Page 26: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


3 Q&A

Page 27: N1QL: Query Optimizer Improvements in Couchbase 5.0. By, Sitaram Vemulapalli


THANK YOU