Databases for Storage Engineers
![Page 1: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/1.jpg)
Thomas Kejser
http://blog.kejser.org
@thomaskejser
Databases
For storage People
![Page 2: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/2.jpg)
Agenda
• The Microsoft Database Stack
• Hard problems the database solves
• File layout and I/O pattern
• Data and Log Files
• Analysis Services Files
• TempDb and other system databases
• Installation of SQL
• Q&A
![Page 3: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/3.jpg)
The SQL Server Stack
![Page 4: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/4.jpg)
Product Portfolio
• SQL Server (aka: Core Engine)
• SQL Server Analysis Services (SSAS)
  • Tabular
  • Multi Dimensional
• SQL Server Service Broker (SSB)
• SQL Server Integration Services (SSIS)
• SQL Server Reporting Services (SSRS)
• SQL Server Data Quality Tools
• SQL Server Master Data Services
• SQL Server Parallel Data Warehouse
• .NET stuff…
• Various Excel plug-ins
• A “full” stack!
![Page 5: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/5.jpg)
What Type of Workload?
[Figure: quadrant chart. Axes: Data Touched (Small to Big) and Data Returned (Small to Big). Workloads plotted: OLTP, BI/DW, Simulation, ETL.]
![Page 6: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/6.jpg)
A Template OLTP System
[Figure: boxes labelled .NET (×4) and Core]
• “App” tier: Web Server, Windows License
• Database Tier: Web/Core Licensing, 2 or 4 sockets
![Page 7: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/7.jpg)
A Template Data Warehouse
[Figure: boxes labelled SSIS (×4), Core (×3), SSAS (×2) and SSRS]
• Integration Tier: Blades. CPU intensive, low IOPS
• “Enterprise” Warehouse Tier: Large machines. VERY CPU greedy, VERY I/O greedy (GB/sec)
• BI / Presentation / Cubes: Medium servers. Can be IOPS greedy
![Page 8: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/8.jpg)
Fast Track Data Warehouses
![Page 9: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/9.jpg)
A Template MPP Warehouse
[Figure: boxes labelled SSIS (×4), SSAS and Core]
• Enterprise Warehouse Tier: Appliance (the “hub”)
• Data Marts (the “spokes”)
![Page 10: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/10.jpg)
Management Tools you Need to Know
| Pre 2012 | 2012 |
| --- | --- |
| Management Studio (AKA: Enterprise Manager) | (Management Studio) |
| Project Data Dude | Data Tools |
| Configuration Manager | Configuration Manager |
| SQL Server Profiler | Xevent Tracing |
| Reporting Services Config Manager | Reporting Services Config Manager |
| Sp_configure | Sp_configure / ALTER SERVER |
![Page 11: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/11.jpg)
Hard problems databases help you solve
![Page 12: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/12.jpg)
Query Plan Generation
Find all parts bought by Thomas Kejser
![Page 13: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/13.jpg)
Express Problem, Auto get solutions
![Page 14: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/14.jpg)
To do this well, we need Statistics
I did it
SQL Did it
THIS is not accurate and it will never be!
![Page 15: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/15.jpg)
… and we Need Indexes
B+ Tree
![Page 16: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/16.jpg)
95% of all database problems* are caused by:
A) Poor indexing
B) Wrong Statistics
C) Badly written queries
D) All of the above
* Low estimate, trying to be nice to humanity
![Page 17: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/17.jpg)
And most of the time, there is nothing you can do about that*
… which is where storage comes into the picture
* AKA: “Craplications”, technical term
![Page 18: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/18.jpg)
Two types of bad Queries
• The CPU Bound
  • Have to help rewrite
  • Better storage does not help
  • But DBAs may still believe it is I/O
• The I/O bound
  • Can throw NAND at it
  • I will show you how to diagnose
• DBA people like to talk about this like…
[Figure: CPU cache hierarchy, with boxes labelled CPU, L3, L2 (×2) and C (×2)]
![Page 19: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/19.jpg)
Response time = Service Time + Wait Time
• Service Time: Algorithms and Data Structures
• Wait Time: “Bottlenecks”
![Page 20: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/20.jpg)
When Speaking about Service Time
• We normally end up talking about bad join plans
• Joins come in three flavours
  • Merge
  • Hash
  • Loop
![Page 21: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/21.jpg)
Merge Join
[Figure: two sorted inputs, an m row result and an n row result, are walked side by side and matching keys are joined]
Complexity: O(m + n)
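The merge join on this slide can be sketched in a few lines of Python. This is a minimal illustration, not SQL Server's implementation: the function name `merge_join` and the choice to join plain lists of keys are assumptions made for the example. Both inputs must already be sorted.

```python
def merge_join(left, right):
    """Join two SORTED lists of keys, returning matching (left, right) pairs.

    Each input is walked once, which gives the O(m + n) behaviour as long as
    duplicate keys do not multiply the output.
    """
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1                      # left key is too small, advance left
        elif left[i] > right[j]:
            j += 1                      # right key is too small, advance right
        else:
            key = left[i]
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end] == key:
                j_end += 1
            # Emit every combination within the two runs.
            for a in left[i:i_end]:
                for b in right[j:j_end]:
                    result.append((a, b))
            i, j = i_end, j_end
    return result


if __name__ == "__main__":
    print(merge_join([1, 1, 2, 3], [1, 2, 3, 4, 4, 43, 43]))
```

If either input is not sorted, it has to be sorted first, and that sort is where the extra cost comes from.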
![Page 22: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/22.jpg)
Hash Join
[Figure: the n row join table is hashed into an n row hash table; each row of the m row result is hashed, e.g. Hash(1), and probed against it]
Complexity: O(m + 2n)
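A hash join can be sketched as a build phase followed by a probe phase. The sketch below is illustrative only: the name `hash_join`, the `(key, payload)` row shape and the sample rows are assumptions, and a Python dictionary stands in for the engine's hash table.

```python
from collections import defaultdict


def hash_join(probe_rows, build_rows):
    """Join two lists of (key, payload) rows on key.

    Build phase: read the n build rows and insert them into a hash table.
    Probe phase: read the m probe rows and look each key up once.
    """
    table = defaultdict(list)
    for key, payload in build_rows:
        table[key].append(payload)

    result = []
    for key, payload in probe_rows:
        # .get() avoids creating empty buckets for keys that do not match.
        for match in table.get(key, []):
            result.append((key, payload, match))
    return result


if __name__ == "__main__":
    build = [(1, "a"), (3, "b"), (7, "c")]
    probe = [(1, "x"), (43, "y"), (13, "z"), (7, "w"), (3, "v")]
    print(hash_join(probe, build))
```

The sketch keeps the whole hash table in memory. When the table does not fit in the memory granted to the query, it has to be written out and read back, which is the expensive case.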
![Page 23: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/23.jpg)
Loop Join
[Figure: each row of the m row result looks up its key in an n row B-tree, at Log(n) reads per lookup]
Complexity: O(m * log(n))
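A loop join does one index lookup per outer row. In the sketch below a sorted Python list searched with `bisect` stands in for the B-tree; that substitution, the name `loop_join` and the sample keys are assumptions made for illustration.

```python
from bisect import bisect_left


def loop_join(outer_keys, sorted_inner_keys):
    """For each of the m outer keys, do an O(log n) lookup in the sorted inner keys."""
    result = []
    for key in outer_keys:
        # Binary search plays the role of walking the B-tree from root to leaf.
        pos = bisect_left(sorted_inner_keys, key)
        # Collect every inner entry that carries this key.
        while pos < len(sorted_inner_keys) and sorted_inner_keys[pos] == key:
            result.append((key, sorted_inner_keys[pos]))
            pos += 1
    return result


if __name__ == "__main__":
    print(loop_join([1, 43, 13, 7, 3], [1, 3, 4, 7, 43, 43]))
```

The outer input does not need to be sorted, but every outer row pays for its own lookup, so the cost grows quickly when m is large.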
![Page 24: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/24.jpg)
When Hash Joins hurt you
[Figure: chart of Runtime (seconds) against Hash Memory (MB), with a region marked “Spill Zone!”]
![Page 25: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/25.jpg)
Join Hints
B probed, lower table in join (second table in join statement)
A probed, upper table in join (first table in join statement)
Just the way it is …
![Page 26: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/26.jpg)
Why is it so hard to get joins right?
[Figure: Time plotted against n and m for Loop Join, Merge Join and Hash Join]
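One way to see why the choice is hard is to plug row counts into the three complexity formulas from the join slides and compare them. The sketch below does only that. Treating the big-O expressions as literal costs, using a base-2 logarithm, and the `inputs_sorted` flag are all simplifying assumptions; a real optimizer's cost model is far more detailed.

```python
import math


def join_costs(m, n, inputs_sorted=True):
    """Return a rough cost per join type, using the formulas from the slides."""
    costs = {
        "hash": m + 2 * n,
        "loop": m * math.log2(n),
    }
    if inputs_sorted:
        # Merge join is only this cheap when both inputs are already sorted.
        costs["merge"] = m + n
    return costs


def cheapest_join(m, n, inputs_sorted=True):
    """Pick the join type with the lowest rough cost."""
    costs = join_costs(m, n, inputs_sorted)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    for m, n, is_sorted in [
        (10, 1_000_000, False),
        (1_000_000, 1_000_000, False),
        (1_000_000, 1_000_000, True),
    ]:
        print(m, n, is_sorted, cheapest_join(m, n, is_sorted))
```

The winner flips as m and n change, and the optimizer only has estimates of m and n to work with. A bad estimate picks the wrong join.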
![Page 27: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/27.jpg)
No-one has been able to get joins consistently right!
P = NP ?
![Page 28: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/28.jpg)
Getting I/O right…
SQL-OS (Schedulers, Buffer Pool, Memory Management, Synchronization Primitives, …)
Query Optimization (Plan Generation, View Matching, Statistics, Costing)
Query Execution (Query Operators, Memory Grants, Parallelism)
Language Processing (Parse/Bind)
Statement/Batch Execution
Plan Cache Management
Storage Engine (Access Methods, Database Page Cache, Locking, Transactions, …)
![Page 29: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/29.jpg)
The Storage Engine makes I/O Transparent!
RAM Storage
Storage Engine
Rest of engine only sees the API
![Page 30: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/30.jpg)
Two Different Philosophies

| Primitive | SQL Server | Analysis Services |
| --- | --- | --- |
| Scheduling | Voluntary Yield, User mode | Kernel mode, Preemptive |
| I/O Engine | Dedicated I/O stack | Windows Buffered I/O |
| Waiting / Spinning | SQLOS Primitives | Windows |
| Memory Management | SQLOS / Storage Engine | Windows Paging |
| Serialisation | TDS special purpose | XML |
| Network | Fully optimizable, async, affinitized engine | Windows primitives, blocking |
![Page 31: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/31.jpg)
SQL Server is different
• Primitives are a different beast than Windows
• Scale issues are generally specific to the core, not Windows
• Exposes own “belly of the beast” profiling
• SQL Team build their own primitives, often better than Windows core
• Highest throughput app on Windows, drives all the scale stuff there
![Page 32: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/32.jpg)
Analysis Services is “just another App”
• Analysis Services relies fully on Windows primitives
• You can profile it by looking at how Windows behaves
• Upgrades to Windows are more likely to help it
• No TPC style benchmarks…
![Page 33: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/33.jpg)
A is for Atomic
[Figure: the tables LINEITEM (ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY) and ORDER (ORDER_KEY, CUSTOMER_KEY), drawn three times]
![Page 34: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/34.jpg)
C is for Consistency
[Figure: three diagrams of the tables LINEITEM (ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY) and ORDER (ORDER_KEY, CUSTOMER_KEY). One shows ORDER_KEY = 42 in one table against ORDER_KEY != 42 in the other; one shows COMMITDATE = 2012-02-30.]
![Page 35: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/35.jpg)
I is for Isolation
-- Session 1
SELECT @LastTransaction_ID = LastTransaction_ID
FROM ATM
WHERE ATM_ID = 13
SET @ID = @LastTransaction_ID + 1
UPDATE ATM
SET LastTransaction_ID = @ID
WHERE ATM_ID = 13
-- Session 2, running at the same time
SELECT @LastTransaction_ID = LastTransaction_ID
FROM ATM
WHERE ATM_ID = 13
SET @ID = @LastTransaction_ID + 1
UPDATE ATM
SET LastTransaction_ID = @ID
WHERE ATM_ID = 13
-- Both sessions read @LastTransaction_ID = 42
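The problem with the two sessions above is a lost update: both read 42, both compute 43, and one transaction ID is handed out twice. The Python sketch below replays that interleaving by hand and compares it with running the sessions one after the other. The dictionary standing in for the ATM table and the function names are assumptions for illustration; no real locking or database is involved.

```python
def run_interleaved():
    """Both sessions read before either writes: the classic lost update."""
    atm = {13: 42}              # ATM_ID -> LastTransaction_ID

    read_1 = atm[13]            # session 1 SELECT
    read_2 = atm[13]            # session 2 SELECT, sees the same value
    id_1 = read_1 + 1
    id_2 = read_2 + 1
    atm[13] = id_1              # session 1 UPDATE
    atm[13] = id_2              # session 2 UPDATE overwrites with the same ID
    return id_1, id_2, atm[13]


def run_serialized():
    """Each session completes its read and write before the next one starts."""
    atm = {13: 42}
    ids = []
    for _ in range(2):
        new_id = atm[13] + 1
        atm[13] = new_id
        ids.append(new_id)
    return ids[0], ids[1], atm[13]


if __name__ == "__main__":
    print("interleaved:", run_interleaved())
    print("serialized: ", run_serialized())
```

Isolation is the guarantee that concurrent sessions behave like the second function, even though they physically run at the same time.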
![Page 36: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/36.jpg)
D is for Durability
Do Transactions
Ack
![Page 37: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/37.jpg)
Summary – Databases Help You
• Do complex operations in optimal time
• …at high parallelism
• Optimise I/O pattern
• Be ACID compliant
• Store stuff safely…
• noSQL/Big Data systems trade off one or more of these to get more of the others
![Page 38: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/38.jpg)
System Databases
• Server won’t start without:
  • master
  • mssqlsystemresource
• System CAN start, but won’t work well, without:
  • model
  • msdb
• System will start under special conditions without:
  • tempdb
![Page 39: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/39.jpg)
Master and mssqlsystemresource
• Together, contain all system information
• Mssqlsystemresource
  • Lives under: MSSQL\Binn
  • Contains all system code
  • Hidden by default
• Master
  • Lives under: MSSQL\DATA
• You should move these to a safe location
![Page 40: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/40.jpg)
Disaster: Master or mssqlsystemresource
• You lost:
  • All passwords and server logins
  • All system wide certificates (You may be unable to decrypt!)
  • All System procedures you created
• You are not 100% screwed, but you are in for a long night
  • Both can be rebuilt (empty) during server start
  • …Or restored from backup
    • if you remembered to take one
• Need /f and /T3608 to get back up
![Page 41: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/41.jpg)
Database: model
• Every newly created database is cloned from this
• Loss is not catastrophic
  • Copy from healthy machine
• Tempdb can’t boot without it
• Lives with master
![Page 42: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/42.jpg)
Database tempdb
• Database “swap file”
• Does not survive restarts
• No Durability guarantees here
• Fast I/O helps
![Page 43: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/43.jpg)
Loss of Tempdb…is…Temporary
• Will rebuild itself after instance restart
• Configuration is stored in master
• Clones from model
• Nearly every installation must change defaults
• If tempdb cannot be created, server will only start from command line
![Page 44: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/44.jpg)
User Databases and Failure
• A database consists of
  • At least one Transaction Log File
  • The PRIMARY filegroup
  • At least one data file in PRIMARY
• If any of these is lost, the database is dead
  • You can in some cases bring a database without a transaction log back alive
  • But typically with data loss…
• Lesson: carefully protect all of the above
![Page 45: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/45.jpg)
What is in the Files?
[Figure: contents of the database files]
• PRIMARY filegroup, Primary File: Metadata (system objects), GAM / SGAM, PFS Map, User Data
• Transaction Log: Headers, followed by a sequence of VLFs
![Page 46: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/46.jpg)
Data Files
• Regular files in NTFS
  • Secured
• Files can Auto Grow as needed
  • Risky
  • File Imbalance
![Page 47: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/47.jpg)
How are Database Files Created?
• ALTER or CREATE DATABASE
• Transaction log file always zeroed out
  • This looks super cool on FusionIo by the way
• Data files MAY be zeroed out
  • Depends on privileges
  • May use instant file init
![Page 48: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/48.jpg)
Filegroups
• Filegroups (one word) are containers of files
• Used to group similar data together
• Oracle people know this concept as tablespaces
• Files inside FG are accessed/allocated round-robin
[Figure: two filegroups, PRIMARY and DATA, holding several User Data files]
Reclaiming/Moving Space in Files
• DBCC SHRINKFILE
• REBUILD data
![Page 50: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/50.jpg)
DBCC SHRINKFILE
[Figure: pages numbered 1 to 8 laid out across LUN 1, LUN 2, LUN 3 and LUN 4]
![Page 51: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/51.jpg)
How to reclaim space the right way…
[Figure: pages numbered 1 to 8 across LUN 1 to LUN 4, with a New Filegroup as the rebuild target]
ALTER INDEX Foo ON <table> REBUILD WITH (SORT_IN_TEMPDB = ON)
[Figure: the same pages, 1 to 8, after the rebuild]
![Page 52: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/52.jpg)
PFS Contention
• Too few PFS maps can lead to latch contention
• Diagnosed in: sys.dm_os_waiting_tasks
  • Look for PAGELATCH_UP
[Figure: a File laid out as PFS Map, User Data (8000 pages), PFS Map, User Data (8000 pages)]
I/O DBA people worry about
• DBAs typically diagnose issues with wait stats
• Issues they look for:
  • WRITELOG/LOGBUFFER waits
  • PAGEIOLATCH_<X> waits
  • BACKUPIO waits
  • IO_COMPLETION/ASYNC_IO_COMPLETION
![Page 54: Databases for Storage Engineers](https://reader034.vdocument.in/reader034/viewer/2022042607/559666bf1a28abd7048b45a4/html5/thumbnails/54.jpg)
Places you need to know about
• Diagnosing resource waits:
  • sys.dm_os_wait_stats
  • Post 2008R2 – can use Xevents (harder)
• More detail in:
  • sys.dm_io_virtual_file_stats(NULL, NULL)
  • Confirm waits here!
• SQL Server errors in log file: