SQL Server Performance Tuning
Post on 06-Jan-2017
TRANSCRIPT
1
SQL Server Performance Tuning (Tips & Tricks) by Nitin K (niting123@gmail.com)
2
Agenda
‣SQL Server Concepts/Structure
‣Performance Measuring & Troubleshooting Tools
‣Locking
‣Performance Problem: CPU
‣Performance Problem: Memory
‣Performance Problem: I/O
‣Performance Problem: Blocking
‣Query Tuning
‣Indexing
3
Who am I
‣12+ years of experience with Microsoft technologies
‣Data Architect on the SQL Server platform
‣Working as a data warehouse architect for the last 5 years
‣Playing the role of BI DBA
‣Certified MCSE - SQL Server 2012 data track
‣Passionate about performance tuning in both the database and Business Intelligence areas
4
Performance Tuning in SQL Server
‣Why is performance tuning necessary?
5
Why is Performance Tuning Necessary?
‣Allowing your system to scale
•Adding more customers
•Adding more features
‣Improving overall system performance
‣Saving money by not wasting resources
‣The database is typically one of the most expensive resources in a datacenter
6
Performance Focus
‣Response Time: the interval between when a request is submitted and when the first character of the response is received
‣Throughput: the number of transactions processed in a fixed unit of time
‣Scalability: how throughput and response time change as hardware resources are added
7
Performance Monitoring Approaches
‣Best practices:
Optimize for real-world workloads
Monitor/review performance regularly
Focus on specific issues
8
Overview of Performance Monitoring Tools and Methods
9
Performance Tools: No Extra Cost
‣SQL Server Profiler
‣System Monitor (Windows Performance Monitor)
‣Dynamic Management Views (DMVs): SQL 2005+
‣Database Tuning Advisor (DTA)
‣Microsoft free tools (SQLDiag, PSSDiag, SQL Nexus - www.codeplex.com)
‣MDW - Management Data Warehouse
10
11
Major Performance Killers
‣Insufficient indexing
‣Inaccurate statistics
‣Improper query design
‣Poorly generated execution plans
‣Excessive blocking and deadlocks
‣Non-set-based operations, usually T-SQL cursors
‣Inappropriate database design
‣Excessive fragmentation
‣Nonreusable execution plans
‣Frequent recompilation of queries
‣Improper use of cursors
‣Improper configuration of the database transaction log
‣Excessive use or improper configuration of tempdb
12
Monitoring SQL Server
Using various SQL Server features to monitor database activity
13
SQL Server Management Studio Reports
‣Overview of SQL Server usage; can export to Excel or PDF
‣Server-level report examples:
Server Dashboard
Memory Consumption
Activity - All Blocking Transactions
Activity - Top Sessions
Performance - Batch Execution Statistics
Performance - Top Queries by Average CPU
Object Execution Statistics
14
Database-Level Reports
‣Examples:
Disk Usage
All Transactions
All Blocking Transactions
Index Usage Statistics
Top Transactions by Age
Schema Changes History
‣New reports added in Service Packs
‣Ability to use custom reports
15
Monitoring SQL Server Logs
‣Windows Event Logs / Event Viewer
Application and System event logs
‣SQL Server Management Studio
SQL Server logs
Can configure the maximum number of log files
SQL Server Agent error logs
Can configure logging levels (Errors, Warnings, Information)
‣Using the Log File Viewer
Can export/load log information
Can search for specific errors/messages
16
SQL Server Architecture
17
SQL Server Architecture
18
Supported Protocols
19
SQL Server Databases
20
SQL Server Files
21
SQL Server Transaction Log
22
Recovery Models
23
Data File
24
Extents
25
Log Files
26
Transaction Isolation Levels
Level | Definition
Read Uncommitted | Reads without requesting locks; dirty reads are possible
Read Committed (default) | Reads only committed data, otherwise waits; requests an S lock to read; no guarantee that a read is repeatable
Repeatable Read | Guarantees that data read in a transaction will not change for its duration; holds S locks until the end of the transaction
Serializable | Additionally prevents phantom rows
Under all of these levels, an X lock, once acquired, is held for the duration of the transaction.
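The isolation levels above can be observed directly. A minimal sketch, assuming the AdventureWorks sample database (the table name is illustrative):

```sql
-- Raise the isolation level for one transaction, then restore the default.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    SELECT BusinessEntityID, NationalIDNumber
    FROM HumanResources.Employee
    WHERE BusinessEntityID = 1;   -- S lock now held until COMMIT/ROLLBACK
    -- A concurrent UPDATE of this row blocks until this transaction ends.
COMMIT TRANSACTION;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- restore the default
```

Under READ COMMITTED, the S lock would have been released as soon as the SELECT finished, so a second read inside the same transaction could see different data.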
27
Locks
‣Lock types/modes: X (exclusive), S (shared), U (update), IX (intent exclusive), IS (intent shared), etc., imposed at the DB (database), RID (row), PAG (page), or TAB (table) level
‣Lock compatibility (matrix)
‣Lock hints:
SELECT * FROM employee WITH (NOLOCK)
More: ROWLOCK, HOLDLOCK, TABLOCK, ...
‣Lock hierarchy: Database (DB) -> Table (TAB) -> Page (PAG) -> Row (KEY)
Lock modes | Shared (S) | Exclusive (X)
Shared (S) - read locks | OK | NO
Exclusive (X) | NO | NO
28
What happens when you CRUD
‣Two places are affected:
Memory: the data page is looked up in memory; if it isn't there, the page is brought into memory (RAM), and the operation is performed on the page in memory
Transaction log: an entry is written to the transaction log file (the .ldf file)
‣No "instant" changes to the .mdf file (data file)
The latest changes live in memory and in the transaction log file
Back up the transaction log file so it can be used for recovery after a SQL Server failure
‣Checkpoint: writes dirty pages from memory to the data files (the lazy writer also flushes pages when under memory pressure)
29
Performance TuningServer Configuration
30
Server configuration
31
Tools to Identify Performance Metrics
‣Performance counters
‣Activity Monitor
‣Task Manager
‣SQL Server DMV’s
32
Memory Performance Analysis
‣The basics of the Performance Monitor tool
‣Some of the dynamic management objects used to observe system behavior
‣How and why hardware resources can be bottlenecks
‣Methods of observing and measuring memory use within SQL Server and Windows
‣Possible resolutions to memory bottlenecks
33
Performance Monitor Tool
34
Performance Counters
35
DMOs and DMVs
‣Example:
SELECT dopc.cntr_value, dopc.cntr_type
FROM sys.dm_os_performance_counters AS dopc
WHERE dopc.object_name = 'SQLServer:General Statistics'
AND dopc.counter_name = 'Logins/sec';
36
SQL Server Memory Management
‣Max Memory
‣Min Memory
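Both settings are exposed through sp_configure. A sketch of setting them; the values below are illustrative and should be sized for your own server:

```sql
-- Min/max server memory are advanced options.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min server memory (MB)', 4096;
EXEC sp_configure 'max server memory (MB)', 28672;  -- leave headroom for the OS
RECONFIGURE;
```

Leaving max memory at the default lets the buffer pool grow until the OS pushes back; capping it explicitly is the usual practice on shared hosts.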
37
Additional Memory Monitoring Tools
‣DBCC MEMORYSTATUS
‣sys.dm_os_memory_brokers
‣sys.dm_os_memory_clerks
‣sys.dm_os_ring_buffers
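As a sketch of using one of these DMVs, the query below ranks memory clerks by allocation size (the pages_kb column assumes SQL Server 2012 or later; earlier versions split it into single_pages_kb and multi_pages_kb):

```sql
-- Top memory consumers by clerk type (sizes in KB).
SELECT TOP (10)
       type,
       SUM(pages_kb) AS pages_kb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY pages_kb DESC;
```

MEMORYCLERK_SQLBUFFERPOOL dominating is normal; an unexpectedly large CACHESTORE_SQLCP suggests ad hoc plan cache bloat.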
38
Resolution for Common Memory Problems
‣ Optimizing application workload
‣ Allocating more memory to SQL Server
‣ Moving in-memory tables back to standard storage
‣ Increasing system memory
‣ Changing from a 32-bit to a 64-bit processor
‣ Enabling 3GB of process space
‣ Compressing data
‣ Addressing fragmentation
39
Disk Performance Analysis
‣Using system counters to gather disk performance metrics
‣Using other mechanisms of gathering disk behavior
‣Resolving disk performance issues
40
Performance Counters
41
DMVs and DMOs
‣sys.dm_io_virtual_file_stats
‣sys.dm_os_wait_stats
‣Example:
SELECT *
FROM sys.dm_os_wait_stats AS dows
WHERE wait_type LIKE 'PAGEIOLATCH%';
42
Resolution for Common Disk Bottlenecks
‣Optimizing application workload
‣ Using a faster I/O path
‣ Using a RAID array
‣ Using a SAN system
‣ Using Solid State Drives
43
Resolution for Common Disk Bottlenecks (cont.)
‣Aligning disks properly
‣ Using a battery-backed controller cache
‣ Adding system memory
‣ Creating multiple files and filegroups
‣ Moving the log files to a separate physical drive
‣ Using partitioned tables
44
CPU Performance Analysis
‣How to gather metrics on the processor
‣Additional metrics available through T-SQL queries
‣Methods for resolving processor bottlenecks
45
CPU Bottleneck Analysis
46
DMVs
‣sys.dm_os_wait_stats
‣sys.dm_os_workers and sys.dm_os_schedulers
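A minimal sketch of reading CPU pressure from one of these DMVs: a persistently high runnable_tasks_count means tasks are waiting for CPU time rather than for I/O or locks.

```sql
-- Runnable tasks per scheduler; sustained non-zero values suggest CPU pressure.
SELECT scheduler_id,
       current_tasks_count,
       runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;  -- exclude hidden/system schedulers
```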
47
Resolution for Common CPU Bottlenecks
‣Optimizing application workload
‣Eliminating or reducing excessive compiles/recompiles
‣Using more or faster processors
‣Not running unnecessary software
48
Performance TuningQuery Level
49
Costly Queries
‣Identify costly queries using:
- SQL Server DMVs (sys.dm_exec_query_stats)
- Extended Events
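A sketch of the DMV approach: rank cached statements by cumulative CPU, joining to sys.dm_exec_sql_text to recover the statement text.

```sql
-- Top 10 cached statements by total CPU time (worker time is in microseconds).
SELECT TOP (10)
       qs.total_worker_time AS total_cpu_us,
       qs.execution_count,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Note these counters reset when a plan leaves the cache or the instance restarts, so they reflect recent history, not all-time cost.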
50
Query Tuning
‣Detection
Profiler:
‣Look for queries/stored procedures with high reads, CPU, and duration; these are candidates for tuning
‣Look for stored procedures that are recompiling (it's an event)
DMVs:
‣Find queries with missing indexes
‣Find tables that are fragmented
‣Find tempdb bottlenecks
51
Query Tuning (cont.)
‣Troubleshoot: query execution plan
Operator types:
‣Seek (best and preferred)
‣Scan (not preferred)
‣Bookmark lookup (better than a scan; mostly seen with non-clustered indexes)
Join types:
‣Nested loops
‣Merge
‣Hash (avoid)
Graphical execution plan icons:
http://msdn2.microsoft.com/en-us/library/ms175913.aspx
http://www.sql-server-performance.com/articles/per/select_indexes_p1.aspx
52
53
The Query Optimizer
Why do we need an optimizer?
•T-SQL is a "what", not a "how", language.
•We write "logical" requests.
•The SQL optimizer engine converts logical requests into physical plans.
54
The Query Optimizer
What is the goal of the optimizer?
The job of the SQL optimizer is to find "the best plan possible".
55
Query optimization explained simply
1. Query submitted
2. Magic happens
3. Shedload of data returned
56
Query Optimization (in a bit more detail)
Optimizer steps: Parse -> Bind -> Optimize -> Execute
57
Parse
Builds a tree structure based upon the logical operators in the query. For example:
SELECT SSOD.[SalesOrderID], PP.[Name], PP.[Weight], SSOD.[UnitPrice]
FROM [Sales].[SalesOrderDetail] SSOD
INNER JOIN [Production].[Product] PP
ON SSOD.ProductID = PP.ProductID
WHERE PP.Weight > 100
(The logical operations become tree nodes: Project -> Filter -> Join -> SalesOrderDetail, Product)
58
Bind
• Series of validation steps
• Schema validation
• Table validation
• Attribute validation
• Permission validation
SELECT SSOD.[SalesOrderID], PP.[Name], PP.[Weight], SSOD.[UnitPrice]
FROM [Sales].[SalesOrderDetail] SSOD
INNER JOIN [Production].[Product] PP
ON SSOD.ProductID = PP.ProductID
WHERE PP.Weight > 100
59
Optimize
Works through many rules and heuristics.
These Include:
•Commutativity
•Substitution rules
•Exploration rules
•Implementation rules
60
SELECT prod_category, AVG(amount_sold)
FROM o_sales s, o_products p
WHERE p.prod_id = s.prod_id
GROUP BY prod_category;
61
SQL: Set-Based Expressions / Serial Execution
•SQL syntax is based on "set-based" expressions (no processing rules)
•Query execution is serial
- SQL Server "compiles" a query into a series of sequential steps which are executed one after the other
- Individual steps also have internal sequential processing
•(e.g. table scans are processed one page after another, and row by row within each page)
•Example: return CustID, OrderID, and OrderDate for orders after 1st Jan 2005; no processing rules are included in the SQL statement, just the "set" of data to be returned
•Execution plans display these steps
62
Intro to Execution Plans - A Simple Example
•The execution plan shows how SQL Server compiled and executes this query
- Ctrl+L in SSMS shows the "estimated" plan (without running the query)
- Ctrl+M displays the "actual" plan after the query has completed
•Read execution plans from top right to bottom left
- In this case, the plan starts with a Clustered Index Scan of [SalesOrderHeader]
- Then, for every row returned, it performs an index seek into [Customers]
(Plan annotations: each node's cost is shown as a % of the total plan "cost"; table/index access methods are displayed (Scan, Seek, etc.); join physical operators are displayed (Loops, Merge, Hash); the thickness of the arrows denotes the estimated/actual number of rows carried between nodes in the plan)
63
Execution plan node properties
•Mousing over an execution plan node reveals extra properties:
- Search predicate: the WHERE filter in this case, but it can also be a join filter
- Number of rows returned, shown in the actual execution plan
- Name of the schema object accessed to physically process the query: typically an index, but possibly a heap structure
- Ordered/Unordered: displays whether a scan operation follows the page "chain" linked list (next/previous page numbers in the page header) or follows the Index Allocation Map (IAM) pages
64
"Heap" Table Storage
•Query execution example:
SELECT FName, LName, PhNo FROM Customers WHERE LName = 'Smith'
- There is no b-tree with heaps, so no lookup method is available unless other indexes are present; the only option is to scan the heap
- Because the rows are not ordered, the scan cannot stop just because a row is located; it must continue through to the end of the table (heap)
•This storage structure is used when there is no clustered index on the table
- Rarely seen, as clustered indexes are added to primary keys by default
- Oracle uses heap storage by default (even with primary keys)
•No physical ordering of rows
- Stored in order of insertion
- New pages are added to the end of the "heap" as needed
•No B-tree index nodes (no "index")
65
Interpreting Execution plan
66
Understanding the Execution Plans
67
Access Method
69
Commonly used operators
70
Blocking and Non-blocking Operators
•Operators/iterators can be put in two categories:
1. Blocking
2. Non-blocking
•A blocking operator in your plan means operators further down the line sit idle, which reduces the overall performance of your query
•Some examples follow
71
Blocking and Non-Blocking Operators - Blocking Example
•An example using the sort operator: all input rows (rows 1-5 in the diagram) must arrive before Sort Desc can emit its first output row
72
Hints can be placed in SQL to force the optimizer to follow our desired retrieval path rather than the one calculated by the optimizer.
SELECT /*+RULE*/ ... FROM emp, dept WHERE ...
The SELECT statement above instructs the optimizer to use the rule-based optimizer rather than the cost-based optimizer. Similarly:
DELETE /*+RULE*/ ...
UPDATE /*+RULE*/ ...
(Note: /*+RULE*/ is Oracle hint syntax; SQL Server expresses hints through its own table, join, and query hints.)
73
Index Tuning
‣What an index is
‣The benefits and overhead of an index
‣General recommendations for index design
‣Clustered and nonclustered index behavior and comparisons
‣Recommendations for clustered and nonclustered indexes
74
What Is an Index?
‣One of the best ways to reduce disk I/O is to use an index
‣Allows SQL Server to find data in a table without scanning the entire table
‣Example:
SELECT TOP 10 p.ProductID, p.[Name], p.StandardCost, p.[Weight],
ROW_NUMBER() OVER (ORDER BY p.Name DESC) AS RowNumber
FROM Production.Product p
ORDER BY p.Name DESC;
75
Types of Index
‣Clustered index
- Primary key default (but not necessary)
- Data is stored at the leaf level
- Data is ordered by the key
‣Non-clustered index
- Uses the cluster key, or the RID of a heap
- INCLUDE columns are stored at the leaf
‣And the rest: outside the scope of this session
76
Index Rules
‣Clustered index
Choose wisely; only one per table is possible
The primary key is clustered by default; evaluate the default behaviour
‣Non-clustered index
More than one is possible
Foreign keys are always good candidates for non-clustered indexes (because of joins)
‣Evaluate 'included columns' in indexing; every non-clustered index contains the clustered keys
‣Choose the index fill factor wisely
‣Find tables with large row counts but no indexes; maybe they need an index
77
Index design recommendations
Examine the WHERE clause and JOIN criteria columns.
Use narrow indexes.
Examine column uniqueness.
Examine the column data type.
Consider column order.
Consider the type of index (clustered versus nonclustered).
78
Lookups & Joins
‣Key
‣RID
79
Joins optimization
‣Hash joins
‣Merge joins
‣Nested loop joins
80
Join Operators (inter-table operators)
•Nested Loop Join
- The original, and only, join operator until SQL Server 7.0
- A "for each row..." type operator
- Takes output from one plan node and executes another operation "for each" output row from that plan node
•Merge Join
- Scans both sides of the join in parallel
- Ideal for large range scans where the joined columns are indexed
- If the joined columns aren't indexed, it requires an expensive sort operation prior to the merge
•Hash Join
- "Hashes" the values of the join column(s) from one side of the join (usually the smaller side)
- "Probes" with the other side (usually the larger side)
- A hash is conceptually similar to building an index for every execution of a query
- Hash buckets are not shared between executions
- The worst-case join operator
- Useful for large-scale range scans which occur infrequently
81
Hash Join
‣A hash join uses the two join inputs as a build input and a probe input.
‣The build input is shown as the top input in the execution plan, and the probe input is shown as the bottom input.
‣Usually the smaller of the two inputs serves as the build input because it must be held in memory, so the optimizer attempts to minimize the memory used.
‣The hash join performs its operation in two phases: the build phase and the probe phase.
82
Hash Join – Example
‣SELECT p.*
FROM Production.Product p
JOIN Production.ProductCategory pc
ON p.ProductSubcategoryID = pc.ProductCategoryID;
83
Merge Join
‣A merge join requires both join inputs to be sorted on the merge columns, as defined by the join criterion
‣Since each join input is sorted, the merge join gets a row from each input and compares them for equality
‣A matching row is produced if they are equal. This process is repeated until all rows are processed
84
Merge Join - Example
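The example image was not preserved in the transcript. A hedged sketch, assuming the AdventureWorks sample database: both inputs below are ordered on SalesOrderID by their clustered indexes, so a merge join is natural; the hint only forces it for demonstration.

```sql
-- Both inputs arrive sorted on SalesOrderID, so the merge join can
-- compare one row from each side at a time without extra sorting.
SELECT soh.SalesOrderID, sod.SalesOrderDetailID
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
OPTION (MERGE JOIN);
```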
85
Nested Loop Join
‣A nested loop join uses one join input as the outer input table and the other as the inner input table
‣The outer input table is shown as the top input in the execution plan, and the inner input table is shown as the bottom input table
‣The inner loop, executed for each outer row, searches for matching rows in the inner input table
‣Nested loop joins are highly effective if the outer input is quite small and the inner input is larger but indexed
86
Nested Loop Join - Example
87
Quick comparison
88
Statistics, Data Distribution, and Cardinality
‣The role of statistics in query optimization
‣The importance of statistics on columns with indexes
‣The importance of statistics on non-indexed columns used in join and filter criteria
‣Analysis of single-column and multicolumn statistics, including the computation of selectivity of a column for indexing
‣Statistics maintenance
‣Effective evaluation of statistics used in a query execution
89
Statistics: The Query Optimizer
The query optimizer in SQL Server is cost-based. The cost includes:
1. The cost of using different resources (CPU and I/O)
2. Total execution time
It determines the cost by using:
‣Cardinality: the total number of rows processed at each level of a query plan, estimated with the help of histograms, predicates, and constraints
‣The cost model of the algorithm: to perform various operations like sorting, searching, comparisons, etc.
90
Statistics Analysis
‣The query optimizer uses statistics to create query plans that improve query performance
‣Correct statistics lead to high-quality query plans.
‣The query optimizer determines when statistics might be out-of-date by counting the number of data modifications since the last statistics update and comparing the number of modifications to a threshold.
91
Auto Create Statistics
‣The default setting of auto create statistics is ON
‣Statistics are created when:
‣A clustered or non-clustered index is created
‣A SELECT query references a column that has no statistics
‣Auto create and auto update apply strictly to single-column statistics
92
Why Query 2 Performs Better
‣If we perform any of the following operations on a field of a table in a query predicate:
1. Using any system function or user-defined function
2. Scalar operations like addition, multiplication, etc.
3. Type casting
‣...then the SQL Server query optimizer is not able to estimate the correct cardinality using statistics.
93
To Improve Cardinality Estimates
‣If possible, simplify expressions with constants in them
‣If possible, don't perform any operation on a field of a table in the WHERE clause, ON clause, or HAVING clause
‣Don't use local variables in the WHERE clause, ON clause, or HAVING clause
‣If there is a cross-relationship among fields, or a complex expression on a field in a query predicate, it is better to create a computed column and then create a non-clustered index on it
94
Statistics Tools and Commands
‣CREATE STATISTICS
‣sp_updatestats
‣sp_autostats
‣sp_helpstats
‣DBCC SHOW_STATISTICS
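A short sketch of two of these commands, assuming the AdventureWorks sample database (the statistics object name is illustrative and may differ on your build):

```sql
-- Inspect the header, density vector, and histogram of a statistics object.
DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail',
                      'PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID');

-- Refresh statistics for all user tables in the current database.
EXEC sp_updatestats;
```

The histogram rows returned by DBCC SHOW_STATISTICS are what the optimizer consults when it estimates cardinality for a predicate.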
95
Statistics Maintenance
‣Auto Create Statistics(DB level)
‣Auto Update Statistics
‣Manual Maintenance
96
Index Fragmentation
‣The causes of index fragmentation, including an analysis of page splits caused by INSERT and UPDATE statements
‣The overhead costs associated with fragmentation
‣How to analyze the amount of fragmentation
‣Techniques used to resolve fragmentation
‣The significance of the fill factor in helping to control fragmentation
97
Causes of Fragmentation
‣Fragmentation occurs when data is modified in a table
‣Page splits cause database fragmentation
‣A new leaf page is added that contains part of the original page, maintaining the logical order of the rows in the index key
‣Although the new leaf page maintains the logical order of the data rows from the original page, it usually won't be physically adjacent to the original page on disk
‣The logical key order of the index then no longer matches the physical order within the file
98
Identify Fragmentation & Resolution
‣Checking the fragmentation using sys.dm_db_index_physical_stats
‣Only rebuild or reorganize indexes that are fragmented
‣Rebuild heavily fragmented indexes
‣Reorganize moderately fragmented indexes
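A sketch of the fragmentation check for the current database; the 10% filter matches the thresholds discussed on the following slides.

```sql
-- Average fragmentation per index; LIMITED mode scans only the upper
-- b-tree levels, which is cheap enough for routine checks.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 10
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Small indexes (low page_count) can report high fragmentation that is harmless; they are usually not worth maintaining.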
99
Reorganize Index
‣If index fragmentation is less than 10%, no action is required
‣Fragmentation of roughly 10-30% calls for reorganizing the index
‣Use ALTER INDEX ... REORGANIZE:
USE AdventureWorks
ALTER INDEX PK_ProductPhoto_ProductPhotoID
ON Production.ProductPhoto
REORGANIZE
This reorganizes the PK_ProductPhoto_ProductPhotoID index on the Production.ProductPhoto table.
100
Rebuild Index
‣More than 30% fragmentation requires you to rebuild indexes
‣There are two methods:
CREATE INDEX ... WITH DROP_EXISTING
ALTER INDEX ... REBUILD
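A sketch of the ALTER INDEX method, reusing the AdventureWorks index named on the reorganize slide (the fill factor value is illustrative):

```sql
-- Rebuild a single index, setting its fill factor at the same time.
ALTER INDEX PK_ProductPhoto_ProductPhotoID
    ON Production.ProductPhoto
    REBUILD WITH (FILLFACTOR = 90);   -- add ONLINE = ON on Enterprise Edition

-- Or rebuild every index on the table at once:
ALTER INDEX ALL ON Production.ProductPhoto REBUILD;
```

Unlike REORGANIZE, a rebuild creates a fresh copy of the index and also updates its statistics with a full scan.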
101
Significance of the Fill Factor
‣SQL Server lets you control the amount of free space within the leaf pages of an index by using the fill factor
‣If a table receives many INSERT queries, or UPDATE queries on the index key columns, you can pre-allocate free space in the index leaf pages using the fill factor to minimize page splits
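A minimal sketch, assuming the AdventureWorks Product table (the index name is hypothetical):

```sql
-- Leave 20% free space in each leaf page to absorb future inserts
-- between rebuilds; PAD_INDEX applies the same factor to intermediate pages.
CREATE NONCLUSTERED INDEX IX_Product_Name
    ON Production.Product ([Name])
    WITH (FILLFACTOR = 80, PAD_INDEX = ON);
```

The trade-off: a lower fill factor means fewer page splits but more pages to read, so reserve it for indexes that actually suffer split-driven fragmentation.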
102
Query parsing
Parsing
Binding
Query optimization
Execution plan generation, caching, and hash plan generation
Query execution
103
Parsing Flowchart
104
Query Execution
105
Optimization Techniques
‣Syntax-based optimization of the query
‣Trivial plan match to avoid in-depth query optimization for simple queries
‣ Index and join strategies based on current distribution statistics
‣Query optimization in stepped phases to control the cost of optimization
‣Execution plan caching to avoid the regeneration of query plans
106
continued
107
Query – Execution Plan Cache
‣Saves the plans created in a memory space on the server called the plan cache.
‣SELECT * FROM sys.dm_exec_cached_plans;
108
Continued
109
Plan Reusability of an Ad Hoc Workload
‣Optimize for an Ad Hoc Workload
‣Simple Parameterization (default)
‣Forced Parameterization
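The first option is a server-level setting: with it enabled, SQL Server caches only a small plan stub on the first execution of an ad hoc statement and compiles the full plan only if the statement is executed again. A sketch of enabling it:

```sql
-- 'optimize for ad hoc workloads' is an advanced option.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
```

This reduces plan cache bloat from one-off queries without changing how any individual query is optimized.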
110
Plan Reusability of a Prepared Workload
‣Stored Procedures
‣sp_executesql
‣Prepare/Execute Model
111
Stored Procedure – why?
‣A standard technique for improving the effectiveness of plan caching
‣When the stored procedure is compiled, the generated execution plan is cached and reused for future executions
‣Performance benefits:
- Reduced network traffic
- Business logic close to the data
112
sp_executesql
‣sp_executesql is a system stored procedure that provides a mechanism to submit one or more queries as a prepared workload
‣It allows the variable parts of the query to be explicitly parameterized, and can therefore provide execution plan reusability as effective as a stored procedure's
113
Continued
‣DECLARE @query NVARCHAR(MAX), @paramlist NVARCHAR(MAX);
SET @query = N'SELECT soh.SalesOrderNumber
    ,soh.OrderDate
    ,sod.OrderQty
    ,sod.LineTotal
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON soh.SalesOrderID = sod.SalesOrderID
WHERE soh.CustomerID = @CustomerID
    AND sod.ProductID = @ProductID';
SET @paramlist = N'@CustomerID INT, @ProductID INT';
EXEC sp_executesql @query, @paramlist, @CustomerID = 29690, @ProductID = 711;
114
Prepare/Execute Model
‣ODBC and OLEDB provide a prepare/execute model to submit queries as a prepared workload
‣Like sp_executesql, this model allows the variable parts of the queries to be parameterized explicitly
‣The prepare phase allows SQL Server to generate the execution plan for the query and return a handle of the execution plan to the application
‣This execution plan handle is used by the execute phase to execute the query with different parameter values
115
Query stats and Query Hash
‣With SQL Server 2008, new functionality around execution plans and the cache was introduced called the query plan hash and the query hash
‣You can retrieve the query plan hash and the query hash from sys.dm_exec_query_stats
116
Recommendations
‣Explicitly parameterize variable parts of a query
‣Use stored procedures to implement business functionality
‣Use sp_executesql to avoid stored procedure maintenance
‣Use the prepare/execute model to avoid resending a query string
‣Avoid ad hoc queries
‣Use sp_executesql over EXECUTE for dynamic queries
‣Parameterize variable parts of queries with care
‣Avoid modifying environment settings between connections
‣Avoid the implicit resolution of objects in queries
117
Top 10 for Building Efficient Queries
1. Favor set-based logic over procedural or cursor logic
•The most important factor to consider when tuning queries is how to properly express logic in a set-based manner
•Cursors and other procedural constructs limit the query optimizer's ability to generate flexible query plans
•Cursors can therefore reduce the possibility of performance improvements in many situations
118
Top 10 for Building Efficient Queries
2. Test query variations for performance
•The query optimizer can often produce widely different plans for logically equivalent queries.
•Test different techniques, such as joins or subqueries, to find out which perform better in various situations.
119
Top 10 for Building Efficient Queries
3. Avoid query hints.
•You must work with the SQL Server query optimizer, rather than against it, to create efficient queries
•Query hints tell the query optimizer how to behave and therefore override the optimizer's ability to do its job properly
•If you eliminate the optimizer's choices, you might limit yourself to a query plan that is less than ideal
•Use query hints only when you are absolutely certain that the query optimizer is incorrect
120
Top 10 for Building Efficient Queries
‣4. Use correlated subqueries to improve performance.
• Since the query optimizer is able to integrate subqueries into the main query flow in a variety of ways, subqueries might help in various query tuning situations.
• Subqueries can be especially useful in situations in which you create a join to a table only to verify the existence of correlated rows. For better performance, replace these kinds of joins with correlated subqueries that make use of the EXISTS operator
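A sketch of that rewrite, assuming the AdventureWorks sample database:

```sql
-- Join used only to test that correlated rows exist: it multiplies rows
-- for customers with many orders and then needs DISTINCT to collapse them.
SELECT DISTINCT c.CustomerID
FROM Sales.Customer AS c
JOIN Sales.SalesOrderHeader AS soh
    ON soh.CustomerID = c.CustomerID;

-- Rewritten as a correlated EXISTS subquery: SQL Server can stop probing
-- as soon as the first matching order is found.
SELECT c.CustomerID
FROM Sales.Customer AS c
WHERE EXISTS (SELECT 1
              FROM Sales.SalesOrderHeader AS soh
              WHERE soh.CustomerID = c.CustomerID);
```

Both return the same customer list; the EXISTS form makes the intent (existence, not retrieval) explicit to the optimizer.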
121
Top 10 for Building Efficient Queries
‣4. Continued
122
Top 10 for Building Efficient Queries
5. Avoid using a scalar user-defined function in the WHERE clause.
‣Scalar user-defined functions, unlike scalar subqueries, are not optimized into the main query plan
‣Instead, SQL Server calls them row by row using a hidden cursor
‣This is especially troublesome in the WHERE clause because the function is called for every input row
‣Using a scalar function in the SELECT list is much less problematic because the rows have already been filtered by the WHERE clause
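A sketch of the problem and the fix, assuming the AdventureWorks sample database; dbo.fn_OrderTax is a hypothetical scalar UDF introduced only for illustration.

```sql
-- Problematic: the scalar UDF runs once per input row and hides its cost
-- from the optimizer.
SELECT SalesOrderID
FROM Sales.SalesOrderHeader
WHERE dbo.fn_OrderTax(SalesOrderID) > 100;   -- hypothetical UDF

-- Preferred: inline the equivalent expression so the optimizer can fold it
-- into the plan and estimate it with statistics.
SELECT SalesOrderID
FROM Sales.SalesOrderHeader
WHERE TotalDue - SubTotal - Freight > 100;
```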
123
Top 10 for Building Efficient Queries
‣6. Use table-valued user-defined functions as derived tables.
‣In contrast to scalar user-defined functions, table-valued functions are often helpful from a performance point of view when you use them as derived tables
‣The query processor evaluates a derived table only once per query
‣If you embed the logic in a table-valued user-defined function, you can encapsulate and reuse it in other queries
124
Top 10 for Building Efficient Queries
‣6. Continued
125
Top 10 for Building Efficient Queries
‣7. Avoid unnecessary GROUP BY columns
‣Use a subquery instead
•The process of grouping rows becomes more expensive as you add more columns to the GROUP BY list
•If your query has few column aggregations but many non-aggregated grouped columns, you might be able to refactor it by using a correlated scalar subquery
•This results in less grouping work in the query and therefore possibly better overall query performance
126
Top 10 for Building Efficient Queries
‣7 Continued
127
Top 10 for Building Efficient Queries
‣8. Use CASE expressions to include variable logic in a query
‣The CASE expression is one of the most powerful logic tools available to T-SQL programmers
•Using CASE, you can dynamically change column output on a row-by-row basis
•This enables your query to return only the data that is absolutely necessary, reducing the I/O operations and network overhead required to assemble and send large result sets to clients
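A minimal sketch, assuming the AdventureWorks sample database: one pass over the table returns a per-row classification instead of running several separately filtered queries.

```sql
-- CASE evaluates top-to-bottom per row; the thresholds are illustrative.
SELECT SalesOrderID,
       TotalDue,
       CASE
           WHEN TotalDue >= 10000 THEN 'Large'
           WHEN TotalDue >= 1000  THEN 'Medium'
           ELSE 'Small'
       END AS OrderSize
FROM Sales.SalesOrderHeader;
```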
128
Top 10 for Building Efficient Queries
‣9. Divide joins into temporary tables when you query very large tables
‣The query optimizer's main strategy is to find query plans that satisfy queries by using single operations
•Although this strategy works in most cases, it can fail for larger sets of data because huge joins require so much I/O overhead
•In some cases, a better option is to reduce the working set by using temporary tables to materialize key parts of the query, then join the temporary tables to produce the final result
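A sketch of the pattern, assuming the AdventureWorks sample database:

```sql
-- Step 1: materialize the expensive aggregation once into a temp table.
SELECT soh.CustomerID, SUM(sod.LineTotal) AS TotalSpend
INTO #CustomerSpend
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON sod.SalesOrderID = soh.SalesOrderID
GROUP BY soh.CustomerID;

-- Step 2: join the small, already-reduced temp table to the rest of the query.
SELECT c.CustomerID, cs.TotalSpend
FROM #CustomerSpend AS cs
JOIN Sales.Customer AS c
    ON c.CustomerID = cs.CustomerID;

DROP TABLE #CustomerSpend;
```

A side benefit: temp tables carry their own statistics, so the second query is optimized against the actual reduced row count.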
129
Stored Procedure Best Practices
•Avoid using "sp_" as a name prefix
•Avoid stored procedures that accept parameters for table names
•Use the SET NOCOUNT ON option in stored procedures
•Limit the use of temporary tables and table variables in stored procedures
•If a stored procedure performs multiple data modification operations, make sure to enlist them in a transaction
•When working with dynamic T-SQL, use sp_executesql instead of the EXEC statement
130
Views
Best Practices
•Use views to abstract complex data structures
•Use views to encapsulate aggregate queries
•Use views to provide more user-friendly column names
•Think of reusability when designing views
•Avoid using the ORDER BY clause in views that contain a TOP 100 PERCENT clause
•Utilize indexes on views that include aggregate data
131
Top 10 for Building Efficient Queries
‣10. Refactor cursors into queries
‣Rebuild the logic as multiple queries
•Rebuild the logic as a user-defined function
•Rebuild the logic as a complex query with a CASE expression