2012 Storage Developer Conference. © Microsoft Corp. All Rights Reserved.

SQL Server: understanding the data workload

Gunter Zink, Claus Joergensen

Microsoft

Agenda

SQL Server IO workload
  OLTP workload
  Data Warehouse workload
What’s new in SQL Server 2012
Demo: SQL Server over SMB

SQL Server: Overview

To understand SQL Server disk IO patterns, we need to understand what SQL Server does

What’s a database?
  Store and retrieve structured data
SQL databases (as opposed to NoSQL)
  Relational
  ACID (Atomicity, Consistency, Isolation, Durability)
  Using a schema

SQL Server: database page

Data is organized in tables
Tables have one or more columns
Each column has a distinct data type
Tables are internally stored in pages, row by row
SQL Server page size is 8K

SQL Server: data file format

Data file header
GAM page (Global Allocation Map, metadata about the following 4GB of pages in the file)
4GB of data pages
GAM page
4GB of data pages
…
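
The repeating layout above means the GAM interval that covers a given page can be found with integer division. A minimal sketch, assuming the slide's rounded figure of 4GB per interval and 8K pages; the function name is illustrative, and the exact on-disk interval may differ from this rounded figure.

```python
PAGE_SIZE = 8 * 1024                 # 8K pages, as stated on the slides
GAM_INTERVAL_BYTES = 4 * 1024**3     # assumption: the slide's rounded "4GB"
PAGES_PER_GAM_INTERVAL = GAM_INTERVAL_BYTES // PAGE_SIZE


def gam_interval_index(page_number: int) -> int:
    """Return the 0-based GAM interval that a page number falls into."""
    return page_number // PAGES_PER_GAM_INTERVAL
```

With these assumed figures, one interval spans 524,288 pages.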

SQL Server: data file

Space in the data file is allocated in extents
  By default, an extent is 64K (8 pages)
  -E startup parameter to allocate large extents (512K, 64 pages)
When using multiple, same-size files per database, space is allocated round-robin by extent
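
Round-robin allocation by extent can be pictured as cycling through the files, one extent at a time. This is only a toy sketch of that idea for same-size files; the file names and the function are hypothetical and do not reflect SQL Server internals.

```python
from itertools import cycle


def allocate_extents(file_names, extent_count):
    """Assign extents to files in round-robin order.

    Returns a list of (extent_number, file_name) tuples.
    """
    files = cycle(file_names)
    return [(n, next(files)) for n in range(extent_count)]


# Hypothetical example: three same-size data files, five extents.
placement = allocate_extents(["data1.ndf", "data2.ndf", "data3.ndf"], 5)
```

Consecutive extents land on different files, which is why a sequential scan of one table touches all files of the database.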

SQL Server: log file

Zeroed out on creation of the file
Header
Log records (512 bytes to 60K in size), contain LSN (Log Sequence Number)
Checkpoint markers
Start of logical log marker
End of logical log marker
Wrap-around
Truncate after backup

SQL Server: OLTP workload

Online Transaction Processing
Examples: online ordering system, stock transactions
Many active users
Lightweight transactions
Transaction examples:
  Order quantity Y of product X
  Update customer Y zip code to X

SQL Server: OLTP query

Query: Update Customer X Zipcode to 98052
  Locate the table and row that contains the zipcode for customer X (usually using an index)
  If the 8K page containing this row is not in memory, post a disk read request and suspend the thread until the IO is completed
  Once the page is in memory, resume the thread and change the Zipcode to 98052
  Write change record to the SQL Server log buffer
  Create next LSN (Log Sequence Number)
  Issue write of the LSN + log buffer to log file (unbuffered write)
  Mark the page as dirty (changed in memory but not on disk)
  Once the log disk write completes, the transaction is complete
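
The steps above can be sketched as a toy write-ahead-logging model. Everything here (the class, its fields, the page and key names) is hypothetical and heavily simplified: the disk read and the log write are simulated synchronously, and nothing reflects SQL Server's actual implementation.

```python
class MiniEngine:
    """Toy model of the update path described on the slide."""

    def __init__(self, disk_pages):
        self.disk_pages = disk_pages   # page_id -> rows as stored "on disk"
        self.buffer_pool = {}          # page_id -> in-memory copy of the page
        self.dirty = set()             # pages changed in memory but not on disk
        self.log_buffer = []           # change records not yet written
        self.log_file = []             # change records durable in the log
        self.next_lsn = 1

    def _get_page(self, page_id):
        # Page miss: stands in for the disk read that suspends the thread.
        if page_id not in self.buffer_pool:
            self.buffer_pool[page_id] = dict(self.disk_pages[page_id])
        return self.buffer_pool[page_id]

    def update(self, page_id, key, value):
        page = self._get_page(page_id)
        old = page[key]
        page[key] = value                       # change the row in memory
        lsn = self.next_lsn                     # create the next LSN
        self.next_lsn += 1
        self.log_buffer.append((lsn, page_id, key, old, value))
        self.log_file.extend(self.log_buffer)   # simulated unbuffered log write
        self.log_buffer.clear()
        self.dirty.add(page_id)                 # data page is written later
        return lsn                              # log write done: transaction complete


engine = MiniEngine({7: {"customer_x_zip": "10001"}})
lsn = engine.update(7, "customer_x_zip", "98052")
```

After the call, the change is durable in the log while the data page on disk still holds the old value; the lazy writer or a checkpoint would write the dirty page later.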

SQL Server: log buffer

Circular buffer
  One buffer is being written to disk while another buffer is being filled with change records
  When the write operation is complete, the buffer that was being filled with change records is written to disk
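
A minimal sketch of this alternating behaviour, assuming just two buffers and an explicit completion callback. The class and method names are hypothetical, and the real log buffer is more elaborate than this.

```python
class DoubleLogBuffer:
    """One buffer fills with change records while the other is being written."""

    def __init__(self):
        self.filling = []       # buffer currently receiving change records
        self.in_flight = None   # buffer currently being written, if any
        self.on_disk = []       # records whose write has completed

    def append(self, record):
        self.filling.append(record)
        if self.in_flight is None:
            self._start_write()

    def _start_write(self):
        # Swap roles: the filled buffer goes to disk, a fresh one starts filling.
        self.in_flight, self.filling = self.filling, []

    def write_completed(self):
        # Called when the outstanding disk write finishes.
        self.on_disk.extend(self.in_flight)
        self.in_flight = None
        if self.filling:
            self._start_write()


buf = DoubleLogBuffer()
buf.append("r1")        # no write outstanding, so r1 is written immediately
buf.append("r2")        # r1 is still in flight, so r2 accumulates
buf.append("r3")
buf.write_completed()   # r1 is durable; r2 and r3 go out together
```

The sketch shows why log write size depends on write latency: the longer a write takes, the more change records accumulate in the other buffer and the larger the next write becomes.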

SQL Server: data file writes

Lazy writer writes dirty pages to disk (in-place update)
  Write intensity is controlled by memory pressure
Checkpoint
  Writes all pages that have changed since the last checkpoint to disk (in-place)
  Updates checkpoint markers in the log file
  Issues buffered writes
  Writes a whole extent at a time if all pages are dirty

SQL Server: checkpoint modes

3 checkpoint modes:
  1. Automatic (default)
  2. Indirect (new with SQL Server 2012)
  3. Manual
All 3 modes:
  Write rate is reduced when the write latency reaches 20 millisec
  Write rate can be limited using the -k startup parameter

SQL Server: automatic checkpoint

Write intensity (queue depth) is controlled by the recovery interval
  Recovery interval is the time to recover from a crash, in minutes
  Default for recovery interval is 0 (automatic, approximately once a minute)
Target write rate is computed based on the number of changed pages and the recovery interval

SQL Server: indirect checkpoint

New in SQL Server 2012
Used if TARGET_RECOVERY_TIME is > 0
Less “spiky” write behavior than automatic checkpoint by constantly writing dirty pages at a steady rate
Rate is computed by taking into account the random IO necessary for redo after a crash and TARGET_RECOVERY_TIME

SQL Server: manual checkpoint

CHECKPOINT [duration]
Write intensity is computed using the number of dirty pages and the duration specified
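
The slides do not give the exact formula. The simplest reading is that the dirty pages are spread evenly over the requested duration. The sketch below works under that assumption, with the duration taken in seconds and 8K pages; the function name and the sample numbers are illustrative.

```python
PAGE_SIZE_KB = 8  # 8K pages, as stated on the slides


def checkpoint_write_rate(dirty_pages: int, duration_seconds: float):
    """Return (pages/sec, MB/sec) needed to flush dirty_pages within the duration.

    Assumption: writes are spread evenly over the whole duration.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    pages_per_sec = dirty_pages / duration_seconds
    mb_per_sec = pages_per_sec * PAGE_SIZE_KB / 1024
    return pages_per_sec, mb_per_sec


# Hypothetical example: 128,000 dirty pages flushed over 100 seconds.
pages_per_sec, mb_per_sec = checkpoint_write_rate(128_000, 100)
```

Under this reading, a longer duration lowers the write rate the storage has to sustain, at the cost of a longer checkpoint.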

SQL Server: log file IO

Contains change records in sequence
The log is used to:
  Recover from a crash (redo)
  Roll back a transaction
When the write completes, data needs to be durable!

SQL Server: OLTP IO

Data file IO:
  Random 8K reads controlled by number of users and queries
  Buffered random 8K writes controlled by lazy writer and checkpoints
Log file IO:
  Unbuffered small sequential writes controlled by insert/update/delete activity and write latency

SQL Server: OLTP performance

Common problems:
  Not enough HDDs to deliver the IOPS necessary
  20 millisec threshold is too high for SSDs, use the -k parameter to limit the data write rate
  Log needs redundant, power-safe write cache (can be small)

SQL Server: OLTP performance 2

Checkpoint floods the array cache (20 millisec); cache de-staging causes log drive write performance to degrade
Log writes are unbuffered due to potential data loss
Most data writes are buffered because SQL Server can recover from data loss using the log
From a performance point of view, the reverse is better: caching log writes (requires redundant, power-safe cache) and not caching the data writes is usually beneficial for read latency (elevators prefer writes that are in the queue)
Tiered storage: pages that have just been written are not very likely to be read again; hot pages stay in memory and SQL Server only updates the copy on disk after changes.

SQL Server: data warehouse workload

Processing large amounts of data
Often without index, need to scan through data
Example: How did product line X sell in region Y in the last calendar quarter, sort products by sales descending
Heavy, long-running queries with intermediate results
Hourly or daily updates (from OLTP systems)
Simple or bulk-logged recovery model

SQL Server: DW query

How did product line X sell in region Y in the last calendar quarter, sort products by sales descending
  Find list of products in the given product line
  For each product:
    Aggregate sales for the last quarter by scanning sales records from the last quarter (assume table partitioned by quarters)
  Sort the products by sales totals
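
The three steps can be sketched in plain Python over in-memory data. The record layout, the partition key "2012Q2", and the function name are all made up for illustration; a real engine would run this as a scan, aggregate, and sort over partitioned tables.

```python
from collections import defaultdict


def sales_by_product(products, sales, product_line, region, quarter):
    """Aggregate one quarter's sales for a product line, sorted descending."""
    # Step 1: find the products in the given product line.
    wanted = {p["id"] for p in products if p["line"] == product_line}
    # Step 2: scan only the partition for the requested quarter and aggregate.
    totals = defaultdict(float)
    for row in sales[quarter]:
        if row["product_id"] in wanted and row["region"] == region:
            totals[row["product_id"]] += row["amount"]
    # Step 3: sort the products by sales total, highest first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


products = [{"id": 1, "line": "X"}, {"id": 2, "line": "X"}, {"id": 3, "line": "Z"}]
sales = {"2012Q2": [
    {"product_id": 1, "region": "Y", "amount": 10.0},
    {"product_id": 2, "region": "Y", "amount": 25.0},
    {"product_id": 1, "region": "Y", "amount": 5.0},
    {"product_id": 3, "region": "Y", "amount": 99.0},
    {"product_id": 2, "region": "W", "amount": 40.0},
]}
ranking = sales_by_product(products, sales, "X", "Y", "2012Q2")
```

Step 2 reads every row of the partition regardless of how many rows match, which is why this workload is dominated by large sequential scans.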

SQL Server: DW IO: scan

Table scan or range scan
  Read-ahead
  Read request size is extent size (64K, or 512K in case of -E)
  Sequential reads (within a data file)
  Large amount of IO can be queued (10s of MB per file)
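
To get a feel for these request sizes, the arithmetic below counts the read requests needed for a scan. The 1 GB scan size and the 32 MB queue depth are assumed sample values (the slide only says "10s of MB per file"), and the function name is illustrative.

```python
KB, MB, GB = 1024, 1024**2, 1024**3


def read_requests_for_scan(scan_bytes: int, request_bytes: int) -> int:
    """Number of read requests needed to cover scan_bytes, rounded up."""
    return -(-scan_bytes // request_bytes)


default_requests = read_requests_for_scan(1 * GB, 64 * KB)   # default extent size
large_requests = read_requests_for_scan(1 * GB, 512 * KB)    # extent size with -E
queued_at_once = read_requests_for_scan(32 * MB, 64 * KB)    # assumed 32 MB queued
```

With these sample values, the larger extent size cuts the request count for the same scan by a factor of eight.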

SQL Server: DW IO: TempDB

If intermediate results can’t fit in memory, spill to TempDB

Mostly sequential 64K writes to TempDB data files

Mostly random reads from TempDB during sorting/processing of intermediate results

Good fit for solid-state storage (eMLC or SLC)

SQL Server: DW IO: log

DW workloads usually run with the simple or bulk-logged recovery model
  Only the extent allocation is logged, not the actual data

SQL Server: DW Performance

Most common problems:
  Not enough IO bandwidth (a 2P server can ingest 10+ GByte/sec)
  File layout wrong
    Multiple files on the same spindle
  Thin provisioned / pooled LUNs
  Some arrays don’t read from both drives in a mirror for sequential IO
  Not enough spindles for TempDB

SQL Server and SMB

SQL Server fully supports data and log files over SMB starting with SQL Server 2012
  Limited SMB support in SQL Server 2008 R2
Why SQL Server over SMB:
  Networks are getting more reliable
  Networks are getting faster (10GbE, Infiniband)
  Network is cheaper than FC infrastructure
  Share management is simpler than LUN management

SQL Server over SMB

Typical file server workloads | SQL Server
Medium IO rates | High IO rates (OLTP: 1-5 IOPS/GB)
Capacity limited | IOPS limited
Full writes of files | Write in-place
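
The 1-5 IOPS/GB range translates directly into a sizing estimate. In the sketch below, the 500 GB database size and the 180 random IOPS per HDD are assumptions chosen for illustration, not figures from the slides; substitute measured values for a real system.

```python
import math


def required_iops(db_size_gb: float, iops_per_gb: float) -> float:
    """IOPS the storage must deliver for a database of the given size."""
    return db_size_gb * iops_per_gb


def drives_needed(total_iops: float, iops_per_drive: float) -> int:
    """Minimum number of drives needed to deliver total_iops."""
    return math.ceil(total_iops / iops_per_drive)


low = required_iops(500, 1)    # hypothetical 500 GB database at 1 IOPS/GB
high = required_iops(500, 5)   # the same database at 5 IOPS/GB
# Assumption: roughly 180 random IOPS per HDD.
hdds_low = drives_needed(low, 180)
hdds_high = drives_needed(high, 180)
```

The drive count here is set by IOPS rather than capacity, which is what "IOPS limited" means in the comparison above.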

SQL Server: SMB versus FC SAN 1

Upgrade/Migrate to new server

FC SAN:
  1. Zone switch for new server
  2. Present LUNs to new server
  3. Detach SQL Server database from old server
  4. Shut down old server
  5. Discover and configure LUNs on new server (drive letters, MPIO, etc.)
  6. Attach database on new server

SMB:
  1. Verify that new server has permissions on share
  2. Detach database on old server
  3. Attach database on new server (using UNC)

SQL Server: SMB versus FC SAN 2

Connectivity cost comparison
Pricing based on www.hp.com and www.cdw.com, March 2012
Pricing has changed since and new technology has become available (FDR Infiniband, 16G Fibre Channel).

Interconnect | $/NIC port | $/switch port | $ total/port | MB/s per port | $ per MB/s
1GbE NIC | on-board | $25 | $25 | 110 | $0.23
10GbE NIC | $300 | $674 | $974 | 1,100 | $0.89
QDR Infiniband | $650 | $394 | $1,044 | 3,200 | $0.33
8G Fibre Channel | $950 | $933 | $1,883 | 750 | $2.51
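
The last column is the total cost per port divided by the throughput per port. A short sketch that recomputes it from the figures in the table; the variable and function names are illustrative.

```python
# (interconnect, total $ per port, MB/s per port), copied from the table
rows = [
    ("1GbE NIC", 25, 110),
    ("10GbE NIC", 974, 1100),
    ("QDR Infiniband", 1044, 3200),
    ("8G Fibre Channel", 1883, 750),
]


def dollars_per_mb_per_sec(total_per_port: float, mb_per_sec: float) -> float:
    """Connectivity cost per MB/s of throughput, rounded to cents."""
    return round(total_per_port / mb_per_sec, 2)


cost = {name: dollars_per_mb_per_sec(total, rate) for name, total, rate in rows}
cheapest = min(cost, key=cost.get)
```

The recomputed values match the table's last column, with 8G Fibre Channel the most expensive per MB/s by a wide margin.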

SQL Server: SMB versions

Operating system (SMB protocol version): benefits to SQL Server 2012

Windows Server 2008 (SMB 2.0)
  Improved performance over previous SMB versions.
  Durability, which helps recover from temporary network glitches.
Windows Server 2008 R2 (SMB 2.1)
  Support for large MTU, which benefits large data transfers, such as SQL backup and restore.
  Significant performance improvements, specifically for SQL OLTP-style workloads. These performance improvements are delivered through hotfix 2536493.
Windows Server 2012 (SMB 3.0)
  Support for transparent failover of file shares, providing zero downtime with no administrator intervention required for SQL DBA or file server administrator in file server cluster configurations.
  Support for IO using multiple network interfaces simultaneously, as well as tolerance to network interface failure.
  Support for network interfaces with RDMA capabilities.
  Support for active-active file shares with SMB Scale-Out.

More information on SMB 3.0 features: http://technet.microsoft.com/library/hh831795.aspx

Demo

SQL Server over SMB
SMB Direct (SMB over RDMA)
SMB Transparent Failover

Related SDC Presentations

SDC 2011:
  Converting an Enterprise Application to Run on CIFS/SMB/SMB2 File Access Protocols

SDC 2012:
  High Performance File Serving with SMB3 and RDMA via the SMBDirect Protocol
  SMB 3.0 Application End-to-End Performance
