Code Camp NZ 2011
#CCNZ
www.mscommunities.co.nz
Minimal Logging & Data Manoeuvring On very Large Tables
What is going on here???
Steven Wang
• Senior DBA/Senior BI Specialist at BNZ
• MCITP/MCTS: BI Developer, Database Developer and Database Administrator
• Owner of TellYes Data Intelligence
• Blogs: www.msbicoe.com
• Email: [email protected]
Agenda
• What is minimal logging and why does it matter?
• What can be minimally logged?
• Metadata-only operations
• Things under the hood (Demo)
• Data manoeuvring techniques on very large tables
• Real world example
• Q & A
Small Quiz
A. A minimally logged operation will always reduce the transaction log backup size?
True False
B. A minimally logged operation will always reduce the transaction log size?
True False
C. A minimally logged operation is always faster than a fully logged operation?
True False
Answer
A. A minimally logged operation will always reduce the transaction log backup size?
True False
B. A minimally logged operation will always reduce the transaction log size?
True False
C. A minimally logged operation is always faster than a fully logged operation?
True False
What is Minimal Logging and why does it matter?
• SQL Server uses ‘write-ahead logging’: everything has to be written to the log file first
• A minimally logged operation is one that does not log every individual row; it logs only the extent allocations
• Under the full recovery model the transaction log fills rapidly, and it is likely to grow during a large data operation
• “The transaction log is ALWAYS zero initialized when first created, manually grown, or auto-grown.” --Paul Randal
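A quick way to see the difference for yourself is to count the log records an operation generates. This is a minimal sketch only: sys.fn_dblog is an undocumented function and the database and table names here are hypothetical, so treat it as a test-system experiment.

```sql
-- Test system only: sys.fn_dblog is undocumented.
ALTER DATABASE MyTestDB SET RECOVERY BULK_LOGGED;
GO
SELECT COUNT(*) AS LogRecordsBefore FROM sys.fn_dblog(NULL, NULL);
GO
SELECT * INTO dbo.BigTable_Copy FROM dbo.BigTable;  -- candidate for minimal logging
GO
SELECT COUNT(*) AS LogRecordsAfter FROM sys.fn_dblog(NULL, NULL);
-- A minimally logged SELECT INTO adds far fewer log records than a fully logged one.
```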
What can be minimally logged?
• SELECT INTO … FROM table
• Bulk import operations: BCP, BULK INSERT, INSERT INTO … SELECT FROM OPENROWSET(BULK …)
• Create/rebuild index
• Using the .WRITE clause in an UPDATE for new data
• Creating a new heap by dropping a clustered index
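To illustrate the .WRITE case, here is a sketch with a hypothetical table and column (the column must be varchar(max), nvarchar(max) or varbinary(max)): appending new data with .WRITE can be minimally logged, whereas rewriting the whole value is fully logged.

```sql
-- Hypothetical schema: dbo.Documents(DocID int, DocBody varchar(max))
DECLARE @chunk varchar(max) = REPLICATE('x', 8000);

-- .WRITE with a NULL offset appends @chunk to the end of the existing value
UPDATE dbo.Documents
SET DocBody.WRITE(@chunk, NULL, 0)
WHERE DocID = 1;
```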
Conditions
• The database has to be under the bulk-logged or simple recovery model
• The table is not being replicated
• The TABLOCK hint needs to be used
• More conditions, covered next
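Putting the conditions together, a minimally logged load into a heap might look like the following sketch (database, table and column names are hypothetical):

```sql
ALTER DATABASE MyDB SET RECOVERY BULK_LOGGED;
GO
-- Heap target, not replicated, TABLOCK hint taken: eligible for minimal logging
INSERT INTO dbo.Target_Heap WITH (TABLOCK) (Col1, Col2)
SELECT Col1, Col2
FROM dbo.Source_Table;
GO
ALTER DATABASE MyDB SET RECOVERY FULL;  -- remember to take a log backup afterwards
```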
Conditions (continued)

Heap + TABLOCK:

                  Empty      Not empty
Heap (data)       Minimal    Minimal
Heap + index      Full       Full
Conditions (continued)

Clustered + TABLOCK:

                    Empty      Not empty
Clustered (data)    Minimal    Full
Clustered + index   Full       Full
Conditions (continued)
• Trace flag 610:
 - Can be used to get minimal logging for an empty heap or clustered table
 - Can be used to get minimal logging in a non-empty B-tree
 - The TABLOCK hint is not needed for a table without nonclustered indexes
 - The first page is always fully logged
• Can be turned on instance-wide or session-wide
• Fully test in your environment before use
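The trace flag can be enabled per session, or instance-wide with the -1 argument:

```sql
DBCC TRACEON (610);        -- current session only
DBCC TRACEON (610, -1);    -- instance-wide
DBCC TRACESTATUS (610);    -- verify it is on
DBCC TRACEOFF (610, -1);   -- turn it off again
```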
Metadata-Only Operations
• Truncate Table
• Drop Table
• Partition Switch
• Partition Merge for 2 empty partitions
• Partition Split for an empty partition
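The partition operations above can be sketched as follows, with a hypothetical partition function pfDaily and scheme psDaily; both statements are metadata-only because the affected partitions are empty:

```sql
-- Merge two empty partitions by removing the boundary between them
ALTER PARTITION FUNCTION pfDaily() MERGE RANGE ('2011-01-31');

-- Split an empty partition: first mark which filegroup the new partition will use
ALTER PARTITION SCHEME psDaily NEXT USED FG1;
ALTER PARTITION FUNCTION pfDaily() SPLIT RANGE ('2011-06-01');
```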
Things Under the Hood (Demo)
Data manoeuvring techniques on very large tables
• Bulk load a partitioned table:
 - Use parallelism: one bulk load per CPU core
 - Create staging tables with exactly the same structure as the target table, in the same filegroups, with no indexes
 - Bulk load the data into the staging tables
 - Create indexes and constraints on the staging tables
 - Switch the staging tables into the partitioned table's partitions
Alter Database MyDB Set Recovery Bulk_Logged;
Go
Alter Database MyDB Modify Filegroup FG1 Default;
GO
Select Col1, Col2, Col3, …
Into Stage_1
From Target_Table Where 0=1
Go
...
Go
Alter Database MyDB Modify Filegroup FG4 Default;
GO
Select Col1, Col2, Col3, …
Into Stage_4
From Target_Table Where 0=1
Go
[Diagram: source data files Data_201101–Data_201104 are bulk inserted in parallel into staging tables Stage_1(FG1)–Stage_4(FG4), which correspond to partitions Partition_1(FG1)–Partition_4(FG4) of the destination partitioned table.]
[Diagram: after the bulk inserts, each staging table is prepared: 1. create indexes; 2. apply other settings to comply with the target; 3. create constraints.]
Alter Table Stage_1 SWITCH TO MyTargetTable Partition 1;
Go
...
Go
Alter Table Stage_4 SWITCH TO MyTargetTable Partition 4;

[Diagram: the staging tables Stage_1(FG1)–Stage_4(FG4) are switched in to partitions Partition_1(FG1)–Partition_4(FG4) of the destination partitioned table.]
Data manoeuvring techniques on very large tables (continued)
• Deleting a very large number of rows from a table (for example, over 50% of the total): don't delete! Think of using TRUNCATE TABLE or DROP TABLE instead.
How?
1. Alter Database MyDB Set Recovery Bulk_Logged
2. Select the data that needs to be kept into a new table
3. Create indexes and other settings as needed
4. Drop the original table
5. Rename the new table to the original table name
6. Set the recovery model back to full
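Written out as T-SQL, the flow above might look like this sketch (the database, table and keep-predicate are hypothetical):

```sql
ALTER DATABASE MyDB SET RECOVERY BULK_LOGGED;
GO
-- Minimally logged copy of only the rows to keep
SELECT *
INTO dbo.MyTable_New
FROM dbo.MyTable
WHERE OrderDate >= '2011-01-01';
GO
-- Recreate indexes, constraints and permissions on dbo.MyTable_New, then:
DROP TABLE dbo.MyTable;
GO
EXEC sp_rename 'dbo.MyTable_New', 'MyTable';
GO
ALTER DATABASE MyDB SET RECOVERY FULL;
```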
Delete a very large amount of data from a non-partitioned table
Delete a very large amount of data from a partitioned table

[Diagram: partitioned table with partitions Partition_1(FG1)–Partition_4(FG4) and staging tables on matching filegroups.]

• If all data in a partition needs to be deleted (e.g. Partition_3, Partition_4): switch the partition out to a staging table on the same filegroup (Stage_3(FG3), Stage_4(FG4)), then drop the staging table.
• If more than half of the data in a partition needs to be deleted (e.g. Partition_1): select the rows to keep into Stage_1(FG1), switch the partition out, drop the switched-out staging table, then switch Stage_1 back in.
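Both partitioned cases can be sketched as follows (table, partition and function names are hypothetical; the switched-in table needs matching indexes and a check constraint on the partitioning column):

```sql
-- Stage tables are assumed to already exist, empty, with identical structure,
-- on the same filegroup as their partition.

-- Case 1: every row in partition 4 must go: switch out and drop
ALTER TABLE dbo.MyTable SWITCH PARTITION 4 TO dbo.Stage_4;
DROP TABLE dbo.Stage_4;

-- Case 2: more than half of partition 1 must go
SELECT *
INTO dbo.Stage_1
FROM dbo.MyTable
WHERE $PARTITION.pfDaily(PartitionCol) = 1
  AND KeepRow = 1;                              -- hypothetical keep-predicate
-- Index and constrain dbo.Stage_1 to match the target, then:
ALTER TABLE dbo.MyTable SWITCH PARTITION 1 TO dbo.Stage_Old;
DROP TABLE dbo.Stage_Old;
ALTER TABLE dbo.Stage_1 SWITCH TO dbo.MyTable PARTITION 1;
```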
Data manoeuvring techniques on very large tables (continued)
Really????
• Updating a very large number of rows in a table across the majority of columns: don't update! Think of inserting the data instead to achieve minimal logging.
Update a very large amount of data for a non-partitioned table
Existing Data
MyData
New Data
MyData_New
Select A.pk_ID,
       Coalesce(B.Col1, A.Col1) As Col1,
       Coalesce(B.Col2, A.Col2) As Col2,
       ...
Into MyData_Temp
From MyData A
Left Outer Join MyData_New B
  On A.pk_ID = B.pk_ID
Updated Data
MyData_Temp
• Drop MyData
• Truncate MyData_New
• Rename MyData_Temp to MyData
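The three finishing steps, written out for the tables in the diagram:

```sql
DROP TABLE dbo.MyData;                        -- metadata-only
TRUNCATE TABLE dbo.MyData_New;                -- metadata-only
EXEC sp_rename 'dbo.MyData_Temp', 'MyData';   -- the updated copy takes the old name
```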
Update a very large amount of data for a partitioned table
I believe you already have a fairly good idea of how to apply similar techniques to perform the update on a partitioned table.
Real World Example
• The table has to be partitioned daily
• The data has to be kept for more than 7 years
• On the first day of every month, a partition merge operation is performed to consolidate the partitions of the month that is 30 months old into one partition
• How will you do it?
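One possible answer, combining the techniques above into a sketch (all object names, partition numbers and dates are hypothetical; the merges become metadata-only because each daily partition is emptied first):

```sql
-- Repeat for each day of the month that is 30 months old:
-- 1. Switch the daily partition out to a staging table on its filegroup
ALTER TABLE dbo.FactData SWITCH PARTITION 912 TO dbo.Stage_Day;
-- 2. Accumulate the day's rows into a monthly staging table (minimally logged)
INSERT INTO dbo.Stage_Month WITH (TABLOCK)
SELECT * FROM dbo.Stage_Day;
DROP TABLE dbo.Stage_Day;
-- 3. Merge the now-empty daily boundary (metadata-only)
ALTER PARTITION FUNCTION pfDaily() MERGE RANGE ('2008-10-02');
-- 4. When the month has collapsed to a single partition, switch the
--    monthly staging table back in
ALTER TABLE dbo.Stage_Month SWITCH TO dbo.FactData PARTITION 912;
```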
Thank You
• Email: [email protected]
• Blogs: www.msbicoe.com
• LinkedIn: http://nz.linkedin.com/pub/steven-wang/1b/ab/b8b
Premier Partners
Sponsor
Thanks to our sponsors and partners!
Associated Partners
Supporting Partners