Getting the most from your SAN – File and Filegroup design patterns
Stephen Archbold
About me
• Microsoft Certified Master – SQL Server 2008
• Working with SQL Server for 6+ years
• Former Production DBA for a 24/7 high-volume operation
• Currently SQL Server consultant at Prodata, Ireland
• Specialising in Performance Tuning and Consolidation
• Blog at http://blogs.Prodata.ie and http://simplesql.blogspot.com
• Get me on Twitter @StephenArchbold
• LinkedIn: http://ie.linkedin.com/in/stephenarchbold
Agenda
• Data Filegroup/File Fundamentals
• Storage Design Patterns
– OLTP
– Data Warehousing – Fast Track style
– Data Warehousing on a SAN
• What other go-faster buttons have we got
• Case Study – The unruly fact table
• How do we make the changes
Data Filegroup/File Fundamentals
• General filegroup recommended practices:
– Keep nothing but system tables in PRIMARY
– Separate filegroups for different I/O patterns
– Separate filegroups for different volatility
– Separate filegroups by data age
• If using multiple files in a filegroup:
– Files must be equally sized
– Files must be equally full
– SQL Server does not redistribute existing data when you add more files
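If multiple files are in play, the "equally sized, equally full" rule falls out of how proportional fill works. A minimal sketch of adding such a filegroup – the database name, paths and sizes are illustrative, not from the deck:

```sql
-- Sketch only: database name, paths and sizes are illustrative.
-- Equal SIZE and equal FILEGROWTH keep proportional fill writing
-- round-robin across all four files. Note SQL Server will NOT
-- rebalance existing data if files are added later.
ALTER DATABASE MyDB ADD FILEGROUP DATA;

ALTER DATABASE MyDB ADD FILE
    (NAME = N'MyDB_Data1', FILENAME = N'D:\Data\MyDB_Data1.ndf', SIZE = 4096MB, FILEGROWTH = 1024MB),
    (NAME = N'MyDB_Data2', FILENAME = N'D:\Data\MyDB_Data2.ndf', SIZE = 4096MB, FILEGROWTH = 1024MB),
    (NAME = N'MyDB_Data3', FILENAME = N'D:\Data\MyDB_Data3.ndf', SIZE = 4096MB, FILEGROWTH = 1024MB),
    (NAME = N'MyDB_Data4', FILENAME = N'D:\Data\MyDB_Data4.ndf', SIZE = 4096MB, FILEGROWTH = 1024MB)
TO FILEGROUP DATA;
```

Equal sizes matter because proportional fill targets the file with the most free space; unequal files turn the intended round-robin into a hotspot on the biggest file.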
Pattern 1 – OLTP
• Transactional processing is all about speed
• You want to get the transaction recorded and the user out as quickly as possible
• The metric for throughput becomes less about MB/sec and more about transactions and I/Os per second
Challenges of OLTP
• Solid state disks are becoming more commonplace
• They thrive on random I/O
• Because the databases can be small, file/filegroup layout can suffer from neglect
• Faster disk brings different challenges
Behind the scenes
[Diagram: a single PRIMARY filegroup – MyDB.MDF plus File1.NDF and File2.NDF holding Transactions, Ref.NDF holding Reference data and Volatile.NDF holding Volatile data – with a PAGELATCH! contention hotspot highlighted]
Facts and Figures

Single File:
Wait Type | %
SOS_SCHEDULER_YIELD | 55
PAGEIOLATCH_EX | 17
PAGELATCH_SH | 15
ASYNC_IO_COMPLETION | 5
PAGELATCH_UP | 5

Two Files:
Wait Type | %
SOS_SCHEDULER_YIELD | 66
PAGEIOLATCH_EX | 12
PAGELATCH_SH | 10
ASYNC_IO_COMPLETION | 7
SLEEP_BPOOL_FLUSH | 2

[Chart: Average Batch Completion Time (seconds) – Single file 156, Two Files 141, Four Files 139, Eight Files 132]
[Chart: Transactions Per Second – Single file 256,410; Two Files 283,688; Four Files 287,770; Eight Files 303,030]
What can we take away from this?
• Resolving in-memory contention lies with the file layout
• This is actually nothing new – TempDB has been tuned this way for years!
• Keep in mind, files are written to in a "round robin" fashion
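Wait statistics like those shown earlier can be sampled from the `sys.dm_os_wait_stats` DMV. A minimal sketch (the benign wait types excluded here are only a small sample; production scripts filter many more):

```sql
-- Sketch: top waits as a percentage of total wait time accumulated
-- since the instance started (or since the stats were last cleared).
SELECT TOP (5)
       wait_type,
       CAST(100.0 * wait_time_ms
            / SUM(wait_time_ms) OVER () AS DECIMAL(5, 2)) AS pct_of_waits
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                         N'SQLTRACE_BUFFER_FLUSH', N'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;
```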
Data Warehousing
Pattern 2 – Fast Track Scenario
• Large volume
• Star schema
• Need to optimize for sequential throughput
• Scanning entire tables
• Not shared storage
[Diagram: Fast Track layout for a large partitioned fact table – one filegroup, one file per LUN. Partitions 1–12 map to files MyFact_part1.NDF through MyFact_part12.NDF, spread across three storage enclosures; each enclosure has two controllers and its own HBA, and the files are balanced against the CPU cores]
Fast Track – Pros and Cons
• Pros
– Easy to figure out your needs
– Simple, scalable and fast
– In-depth guidance available from Microsoft
• Cons
– Not recommended for pinpoint queries
– Only really for processing entire data sets
– Needs a VERY understanding infrastructure team
Pattern 3 – Data Warehouse on a SAN
• Large volume
• Star schema
• Cannot optimize for sequential throughput
• Shared storage
• More mixed workload
Goal – Large Request Size
• We need read-ahead
– Enterprise Edition can issue a single read-ahead request of up to 512KB (on Standard you're stuck at 64KB)
– It can keep several of these outstanding at a time, up to 4MB
• But you may not even be close to 512KB…
How close are you to the 512KB Nirvana?
• Run something like a full scan of a large table
• And watch the average read size (e.g. the Avg. Disk Bytes/Read counter in Performance Monitor)
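A sketch of the measurement – the table name is hypothetical, and the PerfMon counter named is the usual choice for observing read size, not necessarily the one on the original slide:

```sql
-- Sketch: force a full scan of a large (hypothetical) fact table while
-- watching "LogicalDisk: Avg. Disk Bytes/Read" in Performance Monitor.
-- That counter approximates the I/O request size SQL Server is achieving;
-- near 512KB means read-ahead is working at full size on Enterprise.
SELECT COUNT_BIG(*) FROM dbo.MyFact;
```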
Fragmentation – Party Foul Champion
• The #1 killer of read-ahead
• Read-ahead size is reduced if the pages being requested aren't in logical order
• Being a diligent type, you rebuild your indexes
• Because SQL is awesome, it does this using parallelism!
• So what's the catch…? A parallel rebuild can leave the new index logically fragmented, as each thread lays down its own range of pages
• If read-ahead is your goal, use MAXDOP 1 to rebuild your indexes!
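A serial rebuild, per the advice above, might look like this (the index and table names are illustrative):

```sql
-- Sketch: rebuild serially so the pages of the new index are laid down
-- in logical order, preserving large read-ahead requests.
-- SORT_IN_TEMPDB moves the sort I/O off the destination filegroup.
ALTER INDEX CIX_MyFact ON dbo.MyFact
REBUILD WITH (MAXDOP = 1, SORT_IN_TEMPDB = ON);
```

The trade-off is rebuild duration: one scheduler does all the work, so this is typically scheduled in a maintenance window.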
[Diagram: everything in one filegroup – Enclosure 1 holding a single PRIMARY filegroup with MyDB.MDF containing partitions 1–8]

[Diagram: the same storage split by workload – Primary (MyDB.MDF), Dimensions (Dimensions.NDF), Volatile (Staging.NDF), Facts (Facts.NDF) and a Large Fact filegroup with one file per partition, Partition1.NDF through Partition8.NDF]
Getting data out of your Data Warehouse for Analysis Services
• How does Analysis Services pull in data?
Do we have any go faster buttons?
• On read-heavy workloads with Enterprise Edition: compression
• If storing multiple tables in a filegroup:
– "-E" – for data warehouses – allocates 64 extents (4MB) per object, per file, rather than the standard 1 (64KB)
• If using multiple files in a filegroup:
– "-T1117" – for all workloads – ensures that if auto-growth occurs on one file, it occurs on all the others, keeping the "round robin" in place
• In general, on dedicated SQL Servers:
– Evaluate "-T834" – requires Lock Pages in Memory to be enabled
– This enables large page allocations for the buffer pool (2MB – 16MB)
– Can cause problems if memory is fragmented by other apps
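-E, -T1117 and -T834 are startup parameters (added via SQL Server Configuration Manager), not runtime settings. You can, however, verify from T-SQL which trace flags are currently active:

```sql
-- Sketch: list all globally enabled trace flags...
DBCC TRACESTATUS (-1);

-- ...or check the status of specific flags.
DBCC TRACESTATUS (1117, 834);
```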
Case Study – The Unruly Fact Table
• 3 terabyte data warehouse
• Table scans were topping out at 300 MB/sec
• The storage was capable of 1.7 GB/sec
• Table partitioning was in place
• All tables were in a single filegroup
• Had to get creative about enhancing throughput
Test Conditions
• 16-core server, hyper-threaded to 32 logical cores
• 128 GB of memory
• SQLIO fully sequential: the storage gives 2.2 GB/sec
• 32 range scans started up to simulate the workload
• Page compression was enabled, as was the T834 trace flag
• MAXDOP of 1 on the server to ensure the number of threads was controlled
Facts and Figures
[Chart: Throughput (MB/sec) – Baseline 575, Post Index Rebuild with MAXDOP 1 615, Single FG for large table(s) 942, Multiple FG one per partition 1,150]
[Chart: I/O Request Size (KB) – Baseline 181, Post Index Rebuild with MAXDOP 1 181, Single FG for large table(s) 511, Multiple FG one per partition 505]
Other metrics
Scenario | Time (secs) | Avg I/O (KB) | Avg MB/sec | Max MB/sec
Baseline | 70 | 181 | 575 | 622
Post Index Rebuild with MAXDOP 1 | 68 | 181 | 615 | 668
Single FG for large table(s) | 56 | 511 | 942 | 1,196
Multiple FG one per partition | 42 | 505 | 1,150 | 1,281
How do we make the changes
• Thankfully easy – index rebuilds!
– For non-partitioned tables, drop and re-create the clustered index on the new filegroup
– For partitioned tables, alter the partition scheme to point to the new filegroup
– For heaps, create a clustered index on the table on the new filegroup, then drop it!
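As a sketch of the three cases above – all object, column and filegroup names here are illustrative:

```sql
-- Non-partitioned table: rebuild the clustered index onto the new
-- filegroup. DROP_EXISTING avoids a separate drop and re-create.
CREATE UNIQUE CLUSTERED INDEX PK_MyDim
    ON dbo.MyDim (DimKey)
    WITH (DROP_EXISTING = ON)
    ON [Dimensions];

-- Partitioned table: direct the partition scheme's next filegroup.
-- Note this affects partitions created by future SPLIT operations;
-- moving existing partitions still requires rebuilding onto a new scheme.
ALTER PARTITION SCHEME ps_MyFact NEXT USED [FactsNew];

-- Heap: create a clustered index on the target filegroup, then drop it.
-- The data stays on the new filegroup after the drop.
CREATE CLUSTERED INDEX IX_Move ON dbo.MyHeap (SomeColumn) ON [Facts];
DROP INDEX IX_Move ON dbo.MyHeap;
```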
Summary
• File and filegroup considerations can yield huge gains system-wide
• Know your workload and optimise for it
• If you have a hybrid workload, then have a hybrid architecture!
• Don't neglect your SQL settings
• Code changes and indexes aren't the only way to save the day!
Useful links
• Paul Randal – Multi file/filegroup testing on Fusion-io
http://www.sqlskills.com/blogs/paul/benchmarking-multiple-data-files-on-ssds-plus-the-latest-fusion-io-driver/
• Fast Track Configuration Guide
http://msdn.microsoft.com/en-us/library/gg605238.aspx
• Resolving latch contention
http://www.microsoft.com/en-us/download/details.aspx?id=26665
• Maximizing table scan speed on SQL 2008 R2
http://henkvandervalk.com/maximizing-sql-server-2008-r2-table-scan-speed-from-dsi-solid-state-storage
• Specifying storage requirements (find that sweet spot!)
http://blogs.prodata.ie/post/How-to-Specify-SQL-Storage-Requirements-to-your-SAN-Dude.aspx
• Fragmentation in data warehouses
http://sqlbits.com/Sessions/Event9/The_Art_of_War-Fast_Track_Data_Warehouse_and_Fragmentation
• Partial database availability and piecemeal restores
http://technet.microsoft.com/en-US/sqlserver/gg545009.aspx
THANK YOU!
• For attending this session and PASS SQLRally Nordic 2013, Stockholm