Databases for Storage Engineers
A short introduction to SQL Serv…
Thomas Kejser
http://blog.kejser.org
@thomaskejser
Databases
For storage People
• The Microsoft Database Stack
• Hard problems the database solves
• File layout and I/O pattern
• Data and Log Files
• Analysis Services Files
• TempDb and other system databases
• Installation of SQL
• Q&A
Agenda
The SQL Server Stack
• SQL Server (aka: Core Engine)
• SQL Server Analysis Services (SSAS)
• Tabular
• Multi Dimensional
• SQL Server Service Broker (SSB)
• SQL Server Integration Services (SSIS)
• SQL Server Reporting Services (SSRS)
• SQL Server Data Quality Tools
• SQL Server Master Data Services
• SQL Server Parallel Data Warehouse
• .NET stuff…
• Various Excel plug-ins
• A “full” stack!
Product Portfolio
What Type of Workload?
[Figure: quadrant chart of Data Returned (Small to Big) against Data Touched (Small to Big). Quadrants: OLTP, BI/DW, Simulation, ETL.]
A Template OLTP System
• “App” tier: Web Server (.NET), Windows License
• Database Tier: Core, Web/Core Licensing, 2 or 4 sockets
A Template Data Warehouse
[Diagram: SSIS, Core, SSAS, and SSRS servers arranged in three tiers]
• Integration Tier: Blades. CPU intensive, low IOPS
• “Enterprise” Warehouse Tier: Large machines. VERY CPU greedy, VERY I/O greedy (GB/sec)
• BI / Presentation / Cubes: Medium Servers. Can be IOPS greedy
Fast Track Data Warehouses
A Template MPP Warehouse
[Diagram: SSIS, SSAS, and Core servers around an appliance]
• Enterprise Warehouse Tier: Appliance (the “hub”)
• Data Marts (the “spokes”)
Management Tools you Need to Know
Pre 2012 | 2012
Management Studio (AKA: Enterprise Manager) | Management Studio
Project Data Dude | Data Tools
Configuration Manager | Configuration Manager
SQL Server Profiler | Xevent Tracing
Reporting Services Config Manager | Reporting Services Config Manager
Sp_configure | Sp_configure / ALTER SERVER
Hard problems databases help you solve
Query Plan Generation
Find all parts bought by Thomas Kejser
Express Problem, Auto get solutions
To do this well, we need Statistics
I did it
SQL Did it
THIS is not accurate and it will never be!
… and we Need Indexes
B+ Tree
95% of all database problems* are caused by:
A) Poor indexing
B) Wrong Statistics
C) Badly written queries
D) All of the above
* Low estimate, trying to be nice to humanity
And most of the time, there is nothing you can do about that*
… which is where storage comes into the picture
* AKA: “Craplications”, technical term
• The CPU Bound
• Have to help rewrite
• Better storage does not help
• But DBAs may still believe it is I/O
• The I/O bound
• Can throw NAND at it
• I will show you how to diagnose
• DBA people like to talk about this like…
Two types of bad Queries
[Diagram: CPU with L3, L2, and C boxes]
Response time = Service Time + Wait Time
Algorithms and Data Structures
“Bottlenecks”
• We normally end up talking about bad join plans
• Joins come in three flavours
• Merge
• Hash
• Loop
When Speaking about Service Time
Merge Join
[Diagram: a sorted m row result and a sorted n row result are merged in a single pass]
Complexity: O(m + n)
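A minimal sketch of the merge join idea in Python, assuming both inputs are lists of (key, payload) tuples already sorted by key. The function name and the row shape are illustrative only and are not how SQL Server implements the operator. Each input is walked once, which is where the O(m + n) comes from; duplicate keys on both sides add output work on top of that.

```python
def merge_join(left, right):
    """Join two lists of (key, payload) rows that are both sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key too small, advance left
        elif lk > rk:
            j += 1          # right key too small, advance right
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                for r in right[j:j_end]:
                    out.append((left[i], r))
                i += 1
            j = j_end
    return out


if __name__ == "__main__":
    m_rows = [(1, "a"), (1, "b"), (2, "c"), (3, "d")]
    n_rows = [(1, "x"), (2, "y"), (3, "z"), (4, "w"), (4, "v"), (43, "u")]
    for pair in merge_join(m_rows, n_rows):
        print(pair)
```

If either input is not sorted, the walk above silently misses matches, which is why the optimizer has to pay for a sort (or use an index that delivers sorted rows) before it can pick this join.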
Hash Join
[Diagram: each row of the m row result (1, 43, 13, 7, 3) is hashed, e.g. Hash(1), and probed against an n row hash table built from the n row join table]
Complexity: O(m + 2n)
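A minimal Python sketch of the hash join idea, assuming rows are (key, payload) tuples and that the whole hash table fits in memory. The names are illustrative, not SQL Server internals. One plausible reading of the slide's O(m + 2n) is that the n rows are touched twice (read, then inserted into the hash table) and the m rows once during the probe.

```python
from collections import defaultdict


def hash_join(probe_rows, build_rows):
    """Join (key, payload) rows by building a hash table on build_rows."""
    # Build phase: one pass over the n build rows.
    table = defaultdict(list)
    for row in build_rows:
        table[row[0]].append(row)

    # Probe phase: one hash lookup per row of the m probe rows.
    out = []
    for row in probe_rows:
        for match in table.get(row[0], []):
            out.append((row, match))
    return out


if __name__ == "__main__":
    m_rows = [(1, "a"), (43, "b"), (13, "c"), (7, "d"), (3, "e")]
    n_rows = [(3, "x"), (1, "y"), (99, "z")]
    for pair in hash_join(m_rows, n_rows):
        print(pair)
```

Neither input needs to be sorted, but the hash table has to live somewhere. This sketch keeps it in a Python dict; a database engine that runs out of memory for it has to spill to disk.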
Loop Join
[Diagram: each row of the m row result (1, 43, 13, 7, 3) is looked up in an n row B-tree, costing Log(n) reads per lookup]
Complexity: O(m * log(n))
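A minimal Python sketch of the loop join idea, assuming rows are (key, payload) tuples. A sorted list searched with binary search stands in for the B-tree here; that substitution is an assumption for illustration, but it gives the same log(n) cost per lookup and so the same O(m * log(n)) total.

```python
from bisect import bisect_left


def loop_join(outer_rows, inner_sorted):
    """For each outer row, seek into an index on the inner rows.

    inner_sorted must be sorted by key; it plays the role of the B-tree.
    """
    keys = [row[0] for row in inner_sorted]
    out = []
    for row in outer_rows:                 # m iterations
        pos = bisect_left(keys, row[0])    # log(n) seek
        # Scan forward over any duplicates of the key.
        while pos < len(keys) and keys[pos] == row[0]:
            out.append((row, inner_sorted[pos]))
            pos += 1
    return out


if __name__ == "__main__":
    m_rows = [(1, "a"), (43, "b"), (13, "c")]
    n_rows = [(1, "x"), (7, "y"), (13, "z"), (13, "w")]
    for pair in loop_join(m_rows, n_rows):
        print(pair)
```

With a small m this touches very little of the inner table, which is why it suits OLTP style lookups. With a large m, every one of those seeks can turn into a random read.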
When Hash Joins hurt you
[Chart: Runtime (seconds) against Hash Memory (MB), with a “Spill Zone!” marked]
Join Hints
B probed, lower table in join (second table in join statement)
A probed, upper table in join (first table in join statement)
Just the way it is …
Why is it so hard to get joins right?
[Chart: Time as a function of n and m for Loop Join, Merge Join, and Hash Join]
No-one has been able to get joins consistently right!
P = NP ?
Getting I/O right…
SQL-OS (Schedulers, Buffer Pool, Memory Management, Synchronization Primitives, …)
Query Optimization (Plan Generation, View Matching, Statistics, Costing)
Query Execution (Query Operators, Memory Grants, Parallelism)
Language Processing (Parse/Bind)
Statement/Batch Execution
Plan Cache Management
Storage Engine (Access Methods, Database Page Cache, Locking, Transactions, …)
The Storage Engine makes I/O Transparent!
RAM Storage
Storage Engine
Rest of engine only sees the API
Primitive | SQL Server | Analysis Services
Scheduling | Voluntary Yield, User mode | Kernel mode, Preemptive
I/O Engine | Dedicated I/O stack | Windows Buffered I/O
Waiting / Spinning | SQLOS Primitives | Windows
Memory Management | SQLOS / Storage Engine | Windows Paging
Serialisation | TDS special purpose | XML
Network | Fully optimizable, async, affinitized engine | Windows primitives, blocking
Two Different Philosophies
• Primitives are a different beast than Windows
• Scale issues are generally specific to the core, not Windows
• Exposes own “belly of the beast” profiling
• SQL Team build their own primitives, often better than Windows core
• Highest throughput app on Windows, drives all the scale stuff there
SQL Server is different
• Analysis Services relies fully on Windows primitives
• You can profile it by looking at how Windows behaves
• Upgrades to Windows are more likely to help it
• No TPC style benchmarks…
Analysis Services is “just another App”
A is for Atomic
[Diagram: LINEITEM (ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY) and ORDER (ORDER_KEY, CUSTOMER_KEY) tables, shown in three steps]
C is for Consistency
[Diagram: LINEITEM (ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY) and ORDER (ORDER_KEY, CUSTOMER_KEY) tables. Annotations: ORDER_KEY = 42 against ORDER_KEY != 42, and COMMITDATE = 2012-02-30, which is not a valid date]
I is for Isolation
-- Session 1
SELECT @LastTransaction_ID = LastTransaction_ID
FROM ATM
WHERE ATM_ID = 13
SET @ID = @LastTransaction_ID + 1
UPDATE ATM
SET LastTransaction_ID = @ID
WHERE ATM_ID = 13
-- Session 2, running at the same time
SELECT @LastTransaction_ID = LastTransaction_ID
FROM ATM
WHERE ATM_ID = 13
SET @ID = @LastTransaction_ID + 1
UPDATE ATM
SET LastTransaction_ID = @ID
WHERE ATM_ID = 13
-- Session 1 read: (@LastTransaction_ID = 42)
-- Session 2 read: (@LastTransaction_ID = 42)
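The problem with the two sessions above is a lost update: both read 42, so both write 43, and one transaction number is handed out twice. A small Python simulation of that interleaving, assuming a plain dict stands in for the ATM table and that the starting value is 42 (the function name and the `isolated` flag are hypothetical, for illustration only):

```python
def run_two_sessions(isolated):
    """Simulate two sessions that each increment LastTransaction_ID once."""
    atm = {13: 42}  # ATM_ID 13 starts with LastTransaction_ID = 42

    if isolated:
        # Each session runs its read and write as one unit, one after the other.
        for _ in range(2):
            last = atm[13]
            atm[13] = last + 1
    else:
        # Interleaved: both sessions read before either one writes.
        last_1 = atm[13]      # session 1 reads 42
        last_2 = atm[13]      # session 2 reads 42
        atm[13] = last_1 + 1  # session 1 writes 43
        atm[13] = last_2 + 1  # session 2 also writes 43, update lost

    return atm[13]


if __name__ == "__main__":
    print("without isolation:", run_two_sessions(isolated=False))
    print("with isolation:   ", run_two_sessions(isolated=True))
```

Two increments should move the counter from 42 to 44. Without isolation it only reaches 43.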
D is for Durability
[Diagram: a repeated cycle of “Do Transactions” followed by “Ack”]
• Do complex operations in optimal time
• …at high parallelism
• Optimise I/O pattern
• Be ACID compliant
• Store stuff safely…
• noSQL/Big Data systems trade off >0 of these to get more of the others
Summary – Databases Help You
• Server won’t start without:
• master
• mssqlsystemresource
• System CAN start, but won’t work well
• model
• msdb
• System will start under special conditions
• tempdb
System Databases
• Together, contain all system information
• mssqlsystemresource
• Lives under: MSSQL\Binn
• Contains all system code
• Hidden by default
• Master
• Lives under: MSSQL\DATA
• You should move these to a safe location
Master and mssqlsystemresource
• You lost:
• All passwords and server logins
• All system wide certificates (You may be unable to decrypt!)
• All System procedures you created
• You are not 100% screwed, but you are in for a long night
• Both can be rebuilt (empty) during server start
• …Or restored from backup
• if you remembered to take one
• Need /f and /T3608 to get back up
Disaster: Master or mssqlsystemresource
• Every newly created database is cloned from this
• Loss is not catastrophic
• Copy from healthy machine
• Tempdb can’t boot without it
• Lives with master
Database: model
• Database “swap file”
• Does not survive restarts
• No Durability guarantees here
• Fast I/O helps
Database tempdb
• Will rebuild itself after instance restart
• Configuration is stored in master
• Clones from model
• Nearly every installation must change defaults
• If tempdb cannot be created, server will only start from command line
Loss of Tempdb…is…Temporary
• A database consists of
• At least one Transaction Log File
• The PRIMARY filegroup
• At least one data file in PRIMARY
• If any of these are lost, the database is dead
• You can in some cases bring a database without a transaction log back alive
• But typically with data loss…
• Lesson: carefully protect all of the above
User Databases and Failure
What is in the Files?
• PRIMARY, Primary File: Metadata (system objects), GAM / SGAM, PFS Map, User Data
• Transaction Log: Headers, VLF, VLF, VLF
• Regular files in NTFS
• Secured
• Files can Auto Grow as needed
• Risky
• File Imbalance
Data Files
• ALTER or CREATE DATABASE
• Transaction log file always zeroed out
• This looks super cool on FusionIo by the way
• Data files MAY be zeroed out
• Depends on privileges
• May use instant file init
How are Database Files Created?
• Filegroups (one word) are containers of files
• Used to group similar data together
• Oracle people know this concept as tablespaces
• Files inside FG are accessed/allocated round-robin
Filegroups
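A toy Python sketch of round-robin allocation across the files in a filegroup, assuming equally sized files and hypothetical file names. This models only the round-robin idea from the slide; the real engine also weighs each file by its free space (proportional fill), so files of unequal size do not fill evenly.

```python
from itertools import cycle


def allocate_round_robin(files, extent_count):
    """Hand out extent_count allocations to files in strict rotation."""
    placement = {name: 0 for name in files}
    next_file = cycle(files)
    for _ in range(extent_count):
        placement[next(next_file)] += 1
    return placement


if __name__ == "__main__":
    # Hypothetical file names, one per LUN.
    files = ["data1.ndf", "data2.ndf", "data3.ndf", "data4.ndf"]
    print(allocate_round_robin(files, 10))
```

The point for storage people: with one file per LUN, a table in the filegroup ends up striped across all of those LUNs, so the I/O load is spread as well.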
[Diagram: PRIMARY and DATA filegroups containing User Data files]
• DBCC SHRINKFILE
• REBUILD data
Reclaiming/Moving Space in Files
DBCC SHRINKFILE
[Diagram: numbered blocks 1 through 8 spread across LUN 1, LUN 2, LUN 3, and LUN 4, with 8 and 7 out of order]
How to reclaim space the right way…
[Diagram: numbered blocks 1 through 8, LUN 1 through LUN 4, and a New Filegroup]
ALTER INDEX Foo ON <table_name> REBUILD WITH (SORT_IN_TEMPDB = ON)
• Too few PFS maps can lead to latch contention
• Diagnosed in:
sys.dm_os_waiting_tasks
• Look for PAGELATCH_UP
PFS Contention
[Diagram: a File laid out as PFS Map, User Data (8000 pages), PFS Map, User Data (8000 pages)]
• DBAs typically diagnose issues with wait stats
• Issues they look for:
• WRITELOG/LOGBUFFER waits
• PAGEIOLATCH_<X> waits
• BACKUPIO waits
• IO_COMPLETION/ASYNC_IO_COMPLETION
I/O DBA people worry about
• Diagnosing resource waits:
• sys.dm_os_wait_stats
• Post 2008R2 – can use Xevents (harder)
• More detail in:
• sys.dm_io_virtual_file_stats(NULL, NULL)
• Confirm waits here!
• SQL Server errors in log file:
Places you need to know about