Download - SQL Server Indexes Management Overviewahmad.jempress.com/attachments/SQL/SQL Server Indexes Manage… · Version 1.0 SQL Server Indexes Management Overview A h m a d A d e l G a d

Version 1.0

SQL Server Indexes Management

Overview

A h m a d A d e l G a d

a h m a d . a d e l @ j e m m a i l . c o m

2 1 / 0 3 / 2 0 1 5

Ahmad Adel Gad

Outlines the SQL Server Indexes types and management.

Illustrates the evolution of the databases in forms of the

RDBMS concepts using the Binary Tree and the B-Tree

algorithms.

Providing the T-SQL statements to shrink and defrag

databases.

SQL Server Indexes Management Overview

1

Ahmad Adel Gad ([email protected])

Mobile: +20-10-02501021 | +353-83-8489055 | Skype ID: eng.ahmad.adel | JemPress Blog WordPress Blog LinkedIn Twitter Google+


Author ........................................................................................................................................................... 3

Introduction .................................................................................................................................................. 4

Tree Search Algorithms ................................................................................................................................. 5

Binary Tree .................................................................................................................................... 5

B-Tree ............................................................................................................................................ 6

B+Tree ........................................................................................................................................... 6

B+Trees in SQL Server ................................................................................................................... 7

UB-Tree ......................................................................................................................................... 8

SQL Language ................................................................................................................................................ 9

History ........................................................................................................................................... 9

Language Elements ..................................................................................................................... 10

T-SQL Languages ......................................................................................................................... 10

Microsoft SQL Server Overview .................................................................................................................. 11

Microsoft & Sybase ..................................................................................................................... 11

SQL Server Release History ......................................................................................................... 12

DBMS ........................................................................................................................................................... 13

RDBMS ........................................................................................................................................................ 13

Data Access Protocols ................................................................................................................................. 14

ODBC ........................................................................................................................................... 14

OLEDB .......................................................................................................................................... 14

OLEDB vs ODBC ........................................................................................................................... 15

JDBC ............................................................................................................................................ 15

ADO.Net Data Provider ............................................................................................................... 15

Indexes ........................................................................................................................................................ 17

Index Architecture & Types ......................................................................................................... 17

Clustered Index ........................................................................................................................... 18

Non-Clustered Index ................................................................................................................... 18

Fill Factor ..................................................................................................................................... 20

Pad Index ..................................................................................................................................... 21

mailto:[email protected]

http://ahmad.jempress.com/

http://engahmadadel.wordpress.com/

http://eg.linkedin.com/in/ahmadadel/

https://twitter.com/MCTAhmad

https://plus.google.com/109354332545155785962


2



XML Index.................................................................................................................................... 21

Spatial Index ................................................................................................................................ 22

Database Shrinking ..................................................................................................................................... 23

Shrinking Log Files ....................................................................................................................... 23

Shrinking Data Files ..................................................................................................................... 23

Check Fragmentations ................................................................................................................ 23

Defragmentation ......................................................................................................................... 23

References .................................................................................................................................................. 26








3



Author

Ahmad Adel Gad was born in Egypt-Cairo on June, 1980.

He was graduated from faculty of engineering (Civil Engineering Department) on 2003.

He found himself in Software Developing rather than the civil engineering since the first year in the

college as they were teaching them how to write a code since the first year in order to implement

applications to support their work.

After he was graduated from college by two years he decided to shift to the Software Engineering.

On the early 2007, he attended a scholarship for Software Engineering in C#.

On the early of 2008, he had certified as Microsoft Solutions Developer (MCSD) for C# 2003.

On the middle of 2008 he had certified on (Microsoft Certified Trainer) MCT where he got an official

license from Microsoft to deliver Microsoft official courses.

On 2004 he started to develop Desktop Applications for Windows with his partner the CEO Wael

Ibrahim Arafa in his company Arafasoft and his role was the executive manager for the company.

Currently, Ahmad is working as fulltime in the IT field and as freelancer in the following fields as training,

consultation and developing with the two the languages VB.Net and C#.Net:

1. SharePoint Developing/Administration

2. ASP.Net Applications

3. Windows Applications

4. SQL Server Developing/Administration

5. AJAX Implementation (JavaScript/jQuery + XML Web Service)

6. Cryptography/Digital Signature Consultation

Currently, Ahmad is officially certified as:

ITIL Foundation

MCT

MCAD

MCSD

MCPD Enterprise

MCTS - SQL Server 2005

MCPD – SharePoint 2010

MCITP – SharePoint 2010

MCTS - Web Application 3.5

MCTS - Windows Application 3.5

MCTS - ADO.Net 3.5

MCPD - Web Developer 4

MCTS - .NET Framework 4, Windows Applications

MCTS - Programming in HTML5 with JavaScript and CSS3

MCITP - Database Administrator 2008

MCTS - Microsoft SQL Server 2008, Database Development

MCTS - Core Solutions of Microsoft SharePoint Server 2013

MCTS - Advanced Solutions of Microsoft SharePoint Server 2013







http://www.arafasoft.com/


4



Introduction

This article will handle the SQL database shrinking concept in consideration of the fragmentations after

the shrinking and how rebuild the index to reduce those fragmentations.

But before going in the technical details, this article will take the readers in backward to illustrate the

evolution of the databases and the RDBMS (Relational Database Management System) using the Binary

Tree and B-Tree algorithms which were evaluated to create the modern RDBMS databases with the

required indexes.

Before inventing this magnificent algorithm the computers’ users or the developers used to store their

data in plain text files (flat files) with different formats, and the most popular one is the Comma

Separated Values (CSV).

This method was very poor including the following shortfalls:

1. Not secured and can be accessible by anyone and many simple tools.

2. Unicode characters not allow.

3. Performance as the only way to get some certain data is by reading the whole file sequentially.

4. Relational Model not supported, and to overcome this scenario, they were creating file for each

table and maintaining the consistency was very complicated and kind of impossible.

Therefore, the Binary Tree was a huge revolution of the database concepts while it solved almost most

of the plain text’s short falls. And by developing this algorithm as B-Tree and later B+Tree, the RDBMS

became the common concept with any new database storage system.








5



Tree Search Algorithms

The tree methodologies were invented to store and search the data to ease and speeding of the data

query and manipulation (inserting, updating and deleting). It called a tree or a reversed tree, because

the root contains the keys while the tree leafs contain the data.

Binary Tree

The Binary Tree is a tree data structure into two nodes, which each node has at most two

children, which are referred to as the left child and the right child. The following figures are

giving an overview how the data is sorted using the Binary Tree.

The Binary Tree was a smart invention to sort and query the data but it contains a remarkable

disadvantage which is represented by its name (Binary). Each node can be branched only into

two sub nodes and each sub node can be branched only into two nodes until the end of the tree.








6



In most cases some nodes can grow more than others and we can get a tree with very short

branches and very long branches. In this case the search could take longer to query some data

rather than others.

B-Tree

In 1972, Rudolf Bayer and Ed McCreight, both working at Boeing at the time, were looking for a

way to overcome some of the shortfalls of the binary tree. Their solution was the B-Tree.

The B-Tree is the evolution of the Binary Tree algorithm which overcomes its disadvantage.

In the B-Tree, each node can be branched into many nodes as required to keep the tree nodes

balanced and symmetric as possible as could be.

Unlike the binary search trees, the B-Tree is optimized for systems that read and write large

blocks of data. It is commonly used in databases and file systems.

The “B” character confused many about the name, some of them where referring to Binary Tree

wrongly, but the most common naming is “Balanced”.

For a long time it was unclear what the "B" in the name represented. Candidates discussed in

public where "Boeing", "Bayer", "Balanced", "Bushy" and others. In 2013, the B-Tree had just

turned 40, Ed McCreight revealed in an interview, that they intentionally never published an

answer to this question. They were thinking about many of these options themselves at the time

and decided to just leave it an open question. The CPU processing of the search for Binary Tree

is the same as B-Tree, but for the memory consuming (RAM) B-Tree consumes less because the

node would be much shorter than the binary one.

B+Tree

B-Trees are a lot more efficient than binary trees when it comes to updates, but some

operations can still turn out expensive, depending on where the node that will hold the new or

updated data lives in the tree. Therefore, another optimization was made to B-Trees to help

with this problem. Instead of treating all nodes equal, the new structure has two types of nodes.

The lowest level nodes, also called leaf nodes, hold the actual data. All other nodes including the

root node only hold the key values and pointers to the next nodes. This type of tree is called a

B+Tree as seen the following figure:








7



There are no limitations on the number of key-pointer-pairs or data rows within a node. The

only limitation which is a significant advantage in the same time is that, all leaf nodes have the

same distance from the root node. That means that the work to seek for a particular data point

is always the same, no matter what the key value is. It also keeps updates very localized in the

tree. I might have to move a few rows to a new node during an update but then I probably only

need to change a single parent node to integrate that new node. It is however possible for a

single change to affect every level of the tree, but those changes are rare.

B+Trees in SQL Server

SQL Server stores its indexes in B+Tree format. There are a few exceptions - for example

temporary hash indexes, created during a hash join operation, or column store indexes, which

are not really indexes at all. However, all key-based clustered and non-clustered persisted SQL

Server indexes are organized and stored as B+Trees.

Each node in such a tree is a page in SQL Server terms and you will find two page types in each

index. The first type are the data pages (pages of type 1). Each leaf level node in a SQL Server

B+Tree index is a single data page. The second type are intermediate index pages (pages of type

2). Each node in an index B+Tree that is not a leaf level node is a single page of type 2. Those

pages contain rows just like the data pages. But in addition they contain a pointer for each row

that identifies the next child page. That child page can be either of type 1 or of type 2,

depending on the location in the B+Tree.








8



UB-Tree

The UB-Tree as proposed by Rudolf Bayer and Volker Markl is a balanced tree for storing and

efficiently retrieving multidimensional data. It is basically a B+ tree with records stored according

to Z-order, also called Morton order. Z-order is simply calculated by bitwise interlacing the keys.

Insertion, deletion, and point query are done as with ordinary B+ trees. To perform range

searches in multidimensional point data, however, an algorithm must be provided for

calculating, from a point encountered in the data base, the next Z-value which is in the

multidimensional search range.








9



SQL Language

History

The first version of SQL (Structured Query Language) was developed at IBM by Donald D.

Chamberlin and Raymond F. Boyce in the early 1970s. This version, initially called SEQUEL

(Structured English Query Language).

SQL was designed to manipulate and retrieve data stored in IBM's original relational database

product, System R. The SQL language was later formally standardized by the American National

Standards Institute (ANSI) in 1986. Subsequent versions of the SQL standard have been released

as International Organization for Standardization (ISO) standards.

The SQL language was standardized and developed with many vendors as illustrated in the

following table:

Source Common Name Full name

ANSI/ISO Standard SQL/PSM SQL/Persistent Stored Modules

Interbase / Firebird PSQL Procedural SQL

IBM DB2 SQL PL SQL Procedural Language (implements SQL/PSM)

IBM Informix SPL Stored Procedural Language

IBM Netezza NZPLSQL (based on Postgres PL/pgSQL)

Microsoft / Sybase T-SQL Transact-SQL

Mimer SQL SQL/PSM SQL/Persistent Stored Module (implements SQL/PSM)

MySQL SQL/PSM SQL/Persistent Stored Module (implements SQL/PSM)

MonetDB SQL/PSM SQL/Persistent Stored Module (implements SQL/PSM)

NuoDB SSP Starkey Stored Procedures

Oracle PL/SQL Procedural Language/SQL (based on Ada)

PostgreSQL PL/pgSQL Procedural Language/PostgreSQL (based on Oracle PL/SQL)

PostgreSQL PL/PSM

Procedural Language/Persistent Stored Modules (implements SQL/PSM)

Sybase Watcom-SQL SQL Anywhere Watcom-SQL Dialect

Teradata SPL Stored Procedural Language

SAP SAP HANA SQL Script







http://en.wikipedia.org/wiki/IBM

http://en.wikipedia.org/wiki/Donald_D._Chamberlin

http://en.wikipedia.org/wiki/Donald_D._Chamberlin

http://en.wikipedia.org/wiki/Raymond_F._Boyce

http://en.wikipedia.org/wiki/System_R

http://en.wikipedia.org/wiki/Standardization

http://en.wikipedia.org/wiki/American_National_Standards_Institute

http://en.wikipedia.org/wiki/American_National_Standards_Institute

http://en.wikipedia.org/wiki/International_Organization_for_Standardization

http://en.wikipedia.org/wiki/SQL/PSM

http://en.wikipedia.org/w/index.php?title=Procedural_SQL&action=edit&redlink=1

http://en.wikipedia.org/wiki/IBM_DB2

http://en.wikipedia.org/wiki/SQL_PL

http://en.wikipedia.org/wiki/IBM_Informix

http://en.wikipedia.org/w/index.php?title=Stored_Procedural_Language&action=edit&redlink=1

http://en.wikipedia.org/wiki/Netezza

http://en.wikipedia.org/wiki/Transact-SQL

http://en.wikipedia.org/wiki/Mimer_SQL


http://en.wikipedia.org/wiki/MySQL


http://en.wikipedia.org/wiki/MonetDB


http://en.wikipedia.org/wiki/NuoDB

http://en.wikipedia.org/wiki/Oracle_Database

http://en.wikipedia.org/wiki/PL/SQL

http://en.wikipedia.org/wiki/Ada_%28programming_language%29

http://en.wikipedia.org/wiki/PostgreSQL

http://en.wikipedia.org/wiki/PL/pgSQL

http://en.wikipedia.org/wiki/PostgreSQL

http://en.wikipedia.org/w/index.php?title=PL/PSM&action=edit&redlink=1

http://en.wikipedia.org/wiki/Sybase

http://en.wikipedia.org/w/index.php?title=Watcom-SQL&action=edit&redlink=1

http://en.wikipedia.org/wiki/Teradata

http://en.wikipedia.org/wiki/SPL

http://en.wikipedia.org/wiki/SAP_SE

http://en.wikipedia.org/wiki/SAP_HANA


10



Language Elements

The SQL language is subdivided into several language elements, including:

Queries, which retrieve the data based on specific criteria. This is an important element

of SQL.

Clauses, which are constituent components of statements and queries. (In some cases,

these are optional.) (Where clauses)

Expressions, which can produce either scalar values, or tables consisting of columns and

rows of data. (Select expression From ….;)

Predicates, which specify conditions that can be evaluated to SQL three-valued logic

(3VL) (true/false/unknown) or Boolean truth values and are used to limit the effects of

statements and queries, or to change program flow.

Statements, which may have a persistent effect on schemata and data, or may control

transactions, program flow, connections, sessions, or diagnostics. (DDL/DCL)

T-SQL Languages

The following languages are popular in any RDBMS SQL language, but we are going to expand

the commands which language in prospective of the Microsoft T-SQL.

DQL (Data Query Language) – Select

DML (Data Manipulating Language) – Insert, Update and Delete

DDL (Data Definition Language) – Create, Alter and Drop

DCL (Data Control Language) – Grant, Deny and Revoke

TCL (Transaction Control Language) – Begin Tran, Commit Tran and Rollback Tran








11



Microsoft SQL Server Overview

Microsoft & Sybase

Prior to version 7.0 the code base for MS SQL Server was sold by Sybase SQL Server to Microsoft,

and was Microsoft's entry to the enterprise-level database market, competing against Oracle,

IBM, and, later, Sybase. Microsoft, Sybase and Ashton-Tate originally worked together to create

and market the first version named SQL Server 1.0 for OS/2 (about 1989) which was essentially

the same as Sybase SQL Server 3.0 on Unix, VMS, etc. Microsoft SQL Server 4.2 was shipped

around 1992 (available bundled with IBM OS/2 version 1.3). Later Microsoft SQL Server 4.21 for

Windows NT was released at the same time as Windows NT 3.1. Microsoft SQL Server v6.0 was

the first version designed for NT, and did not include any direction from Sybase.

About the time Windows NT was released in July 1993, Sybase and Microsoft parted ways and

each pursued its own design and marketing schemes. Microsoft negotiated exclusive rights to all

versions of SQL Server written for Microsoft operating systems. (In 1996 Sybase changed the

name of its product to Adaptive Server Enterprise to avoid confusion with Microsoft SQL Server.)

Until 1994, Microsoft's SQL Server carried three Sybase copyright notices as an indication of its

origin.

SQL Server 7.0 and SQL Server 2000 included modifications and extensions to the Sybase code

base, adding support for the IA-64 architecture. By SQL Server 2005 the legacy Sybase code had

been completely rewritten.

Since the release of SQL Server 2000, advances have been made in performance, the client IDE

tools, and several complementary systems that are packaged with SQL Server 2005. These

include:

An extract-transform-load (ETL) tool (SQL Server Integration Services or SSIS)

A Reporting Server

An OLAP and data mining server (Analysis Services)

Several messaging technologies, specifically Service Broker and Notification Services








12



SQL Server Release History

Version Year Release Name Code Name Internal Version

1.0 (OS/2) 1989 SQL Server 1.0 (16 bit) Ashton-Tate / Microsoft SQL Server

-

1.1 (OS/2) 1991 SQL Server 1.1 (16 bit)

- -

4.21 (WinNT) 1993 SQL Server 4.21 SQLNT -

6 1995 SQL Server 6.0 SQL95 -

6.5 1996 SQL Server 6.5 Hydra -

7 1998 SQL Server 7.0 Sphinx 515

- 1999 SQL Server 7.0

(OLAP Tools) Palato mania -

8 2000 SQL Server 2000 Shiloh 539

8 2003 SQL Server 2000 Liberty 539

64-bit Edition

9 2005 SQL Server 2005 Yukon 611/612

10 2008 SQL Server 2008 Katmai 661

10.25 2010 Azure SQL DB Cloud Database or CloudDB

-

10.5 2010 SQL Server 2008 R2 Kilimanjaro (aka KJ) 665

11 2012 SQL Server 2012 Denali 706

12 2014 SQL Server 2014 SQL14 782







http://en.wikipedia.org/wiki/OLAP


13



DBMS Database Management Systems (DBMS) are computer software applications that interact with the user,

other applications, and the database itself to capture and analyze data. A general-purpose DBMS is

designed to allow the definition, creation, querying, update, and administration of databases. Well-

known DBMSs include MySQL, PostgreSQL, Microsoft SQL Server, Oracle, Sybase and IBM DB2. A

database is not generally portable across different DBMSs, but different DBMS can interoperate by using

standards such as SQL and ODBC or JDBC to allow a single application to work with more than one

DBMS. Database management systems are often classified according to the database model that they

support; the most popular database systems since the 1980s have all supported the relational model as

represented by the SQL language.

RDBMS A Relational Database Management System (RDBMS) is a database management system (DBMS) that is

based on the relational model as invented by E. F. Codd, of IBM's San Jose Research Laboratory.

Many popular databases currently in use are based on the relational database model.

RDBMSs are a common choice for the storage of information in new databases used for financial

records, manufacturing and logistical information, personnel data, and other applications since the

1980s. Relational databases have often replaced legacy hierarchical databases and network databases

because they are easier to understand and use. However, relational databases have been challenged by

object databases, which were introduced in an attempt to address the object-relational impedance

mismatch in relational databases, and XML databases.

In 1974, IBM began developing System R, a research project to develop a prototype RDBMS. Its first

commercial product was SQL/DS, released in 1981.However, the first commercially available RDBMS was

Oracle, released.

In 1979 by Relational Software, now Oracle Corporation.[6] Other examples of an RDBMS include DB2,

SAP Sybase ASE, and Informix.








14



Data Access Protocols

Any DBMS requires a connectivity driver/provider to be installed on the client’s machine.

This is very close to the PC hardware drivers when it is required to install them into the PC in order to be

able to connect the driver. Ex, the printer was coming in the past with Parallel Port connection and later

with USB port connection, and in order to prepare a PC to connect it, the proper driver get to be

installed compatible with the connected port.

The same with the DBMS, the prober driver/provider get to be installed on the client device to connect

to the target DBMS.

We will list the common standard known providers to the developers:

ODBC

In computing, ODBC (Open Database Connectivity) is a standard programming language

middleware API for accessing database management systems (DBMS). The designers of ODBC

aimed to make it independent of database systems and operating systems. An application

written using ODBC can be ported to other platforms, both on the client and server side, with

few changes to the data access code.

ODBC accomplishes DBMS independence by using an ODBC driver as a translation layer between

the application and the DBMS. The application uses ODBC functions through an ODBC driver

manager with which it is linked, and the driver passes the query to the DBMS. An ODBC driver

can be thought of as analogous to a printer or other driver, providing a standard set of functions

for the application to use, and implementing DBMS-specific functionality. An application that can

use ODBC is referred to as "ODBC-compliant". Any ODBC-compliant application can access any

DBMS for which a driver is installed. Drivers exist for all major DBMSs, many other data sources

like address book systems and Microsoft Excel, and even for text or CSV files.

ODBC was originally developed by Microsoft during the early 1990s, and became the basis for

the Call Level Interface (CLI) standardized by SQL Access Group in the Unix and mainframe world.

OLEDB

OLE DB (Object Linking and Embedding Database), sometimes written as OLEDB or OLE-DB.

It is an API designed by Microsoft, allows accessing data from a variety of sources in a uniform

manner. The API provides a set of interfaces implemented using the Component Object Model

(COM); it is otherwise unrelated to OLE. Microsoft originally intended OLE DB as a higher-level

replacement for, and successor to, ODBC, extending its feature set to support a wider variety of

non-relational databases, such as object databases and spreadsheets that do not necessarily

implement SQL.








15



OLEDB vs ODBC

The differences between these two data access protocols are technical, but in the most general

terms, OLE DB is newer, more advanced, and compatible with a wider selection of database

technologies. In addition, OLE DB is more general, in that it includes the ODBC functionality.

Technically speaking, ODBC (Open Database Connectivity) is designed to provide access primarily

to SQL data in a multi-platform environment. OLE DB (Object Linking and Embedding Database)

is designed to provide access to all types of data in an OLE Component Object Model (COM)

environment. OLE DB includes the SQL functionality defined in ODBC but also defines interfaces

suitable for gaining access to data other than SQL data.

A query generated via ODBC is functionally similar to a query generated by OLE-DB. However, be

aware that if you generate a query (via STATISTICA Query) using an ODBC connection,

STATISTICA uses OLE DB to connect to the ODBC connection. Because there is an intermediate

connection, ODBC queries may be slower than OLE DB queries.

JDBC

JDBC is a Java Database Connectivity technology (Java Standard Edition platform). This

technology is an API for the Java programming language that defines how a client may access a

database. It provides methods for querying and updating data in a database. JDBC is oriented

towards relational databases. A JDBC-to-ODBC bridge enables connections to any ODBC-

accessible data source in the JVM host environment.

Sun Microsystems released JDBC as part of JDK 1.1 on February 19, 1997. Since then it has been

part of the Java Standard Edition.

The JDBC classes are contained in the Java package java.sql and javax.sql.

ADO.Net Data Provider

An ADO.NET data provider is a software component that interacts with a data source comes with

the Microsoft .Net Framework. ADO.NET data providers are analogous to ODBC drivers, JDBC

drivers, and OLE DB providers.

ADO.NET providers can be created to access such simple data stores as a text file and

spreadsheet, through to such complex databases as Oracle Database, Microsoft SQL Server,

MySQL, PostgreSQL, SQLite, IBM DB2, Sybase ASE, and many others. They can also provide

access to hierarchical data stores such as email systems.

However, because different data store technologies can have different capabilities, every

ADO.NET provider cannot implement every possible interface available in the ADO.NET

standard.







http://docs.oracle.com/javase/8/docs/api/java/sql/package-summary.html

http://docs.oracle.com/javase/8/docs/api/javax/sql/package-summary.html


16



Starting from Microsoft .Net Framework v1.1, the developers don’t need to install any extra

API’s on the clients’ machines to connect to both SQL Server and Oracle Database Servers.

Just one important remark, the Microsoft ADO.Net Provider can connect to Microsoft SQL

Server v7.0 and higher. If the developers want to connect to Microsoft SQL Server 6.5 or lower,

they get to use the ODBC or OLEDB connection.








17



Indexes

A database index is a data structure that improves the speed of data retrieval operations on a database

table at the cost of additional writes and storage space to maintain the index data structure.

Indexes are used to quickly locate data without having to search every row in a database table every

time a database table is accessed.

Indexes can be created using one or more columns of a database table, providing the basis for both

rapid random lookups and efficient access of ordered records.

An index is a copy of select columns of data from a table that can be searched very efficiently that also

includes a low-level disk block address or direct link to the complete row of data it was copied from.

Index Architecture & Types

The data generally stored into records and each record cannot exceed the size of 8 KB (except

the very large data such as Binary, VarChar(Max), nVarChar(Max), Image, etc which are stored in

different pages with different structure). The records in turn are stored in pages and each page

cannot exceed the size 8 KB. The pages are grouped in extents while each extent can contain up

to 8 pages. That means each extent cannot exceed 64 KB.

The data in a table without any index created, are located in a logical partition calls Heap, while

with the index structure, that data is stored in the Leaf Level (LL), and the index definition keys

are stored in the Intermediate Level (IL) which is calls also as Non-Leaf Level.

If the table doesn’t contain any index (Clustered or Non-Clustered), the insert of the data would

be very fast because all the new records are added sequentially into the heap without taking

time to reorder them based on a sorting criteria such as in the case of the indexes. But in the

other hand, the search, update or delete would be very slow in comparing with the index

because it needs to scan the whole pages in the heap to retrieve the required data, update them

or delete them.

In SQL Server databases, each table each table can only contain one Clustered Index and up to

249 or 999 of Non-Clustered Indexes based on the SQL version.

Additional two types of indexes (XML and Spatial) were added initiated by SQL Server 2005 and

SQL Server 2008 respectively. The following points are representing the four types of indexes

(Clustered, Non-Clustered, XML and Spatial).








18



Clustered Index

The cluster index alters the data block into a certain distinct order to match the index, resulting

in the row data being stored physically in order for a specific column or combined with multiple

column which calls composite index. Therefore, only one clustered index can be created on a

given database table. Clustered indices can greatly increase overall speed of retrieval. Usually it

comes with the primary key index but not a necessary.

If a database table contains a clustered index, there won’t be an existing for heap because all the

data would be restructured on the Leaf Level with the extension of the intermediate level (Non-

Leaf Level).

The following figure illustrates the process of finding rows in a Clustered Index:

Syntax:

Create Clustered Index Index_Name On Table_Name(Col1, Col2, …);

Create Unique Clustered Index Index_Name On Table_Name(Col1, Col2, …);

Non-Clustered Index

The data is present in arbitrary order, but the logical ordering is specified by the index. The data

rows may be spread throughout the table regardless of the value of the indexed column or

expression. The non-clustered index tree contains the index keys in sorted order, with the leaf

level of the index containing the pointer to the record (page and the row number in the data

page in page-organized engines; row offset in file-organized engines).








19



The database table can contain one or more Non-Cluster index plus Clustered Index or without.

In case of a Clustered Index plus Non-Clustered Index, the data exists in the leaf level of the

Clustered Index while the Non-Clustered Index is creating a tree which is pointing into the data

in the Leaf Level of the clustered index as represented the next figure:

In case of a Non-Clustered Index without a Clustered Index, the data will be still lying in the

heap while the Non-Clustered Index’s is pointing to the data into the heap.

The next figure illustrates the process of finding rows in a Heap with a Non-Clustered Index:








20



Syntax:

Create NonClustered Index Index_Name On Table_Name(Col1, Col2, …);

Create Unique NonClustered Index Index_Name On Table_Name(Col1, Col2, …);

Note: If the index type (Clustered/NonClustered) not mentioned whether it is unique or not in

the syntax, the index is created as “NonClustered”.

Fill Factor

The Fill Factor is the defined ratio of filling the data into each page in the leaf level with each

index created.

If no Fill Factor is defined within an index, the page is filled with 100% of its capacity (8 KB).

Note: If the Fill Factor is defined as 0%, it would be the same respect of 100% which means all

the pages is filled with data.

It is very significant in performance to specify a fill factor to keep reserved empty blank space in

each page for updating data or deleting data. If the page is already 100% full and new item

should be inserted into this page, the page would be split in order to hold the new record based

on its order, while if it was having a predefined fill factor, the record would have a place to be

stored in. Defining the fill factor ratio should be calculated based on the amount of data

manipulation (insert, update or delete).

The disadvantage of the Fill factor, that is preserve extra disk space and it increases with the

increase of the Fill factor ratio, therefore, it must be revised wisely.

In SharePoint Content Databases, the fill factor is 80%.








21



It is healthy to rebuild the index from time to other to organize the data into the pages based on

the defined fill factor which could increase the data base size on the disk but in return, it

maintains the performance.

Syntax:

Create Clustered /NonClustered Index Index_Name On Table_Name(Col1, Col2, …) With

FillFactor = Fill_Ratio_As_Integer;

Create Unique Clustered /NonClustered Index Index_Name On Table_Name(Col1, Col2, …) With FillFactor = Fill_Ratio_As_Integer;

Pad Index

The Pad Index is optional keyword defined with the index creation statement to preserve the

same defined Fill Factor value to the Intermediate Level pages (Non-Leaf Level) as well;

otherwise, it will be 100% by default. (The intermediate Level fill factor cannot be different than

the leaf level fill factor).

Syntax:

Create Clustered /NonClustered Index Index_Name On Table_Name(Col1, Col2, …) With

Pad_Index, FillFactor = Fill_Ratio_As_Integer;

Create Unique Clustered /NonClustered Index Index_Name On Table_Name(Col1, Col2, …) With Pad_Index, FillFactor = Fill_Ratio_As_Integer;

XML Index

New index type was initiated by SQL Server 2005. XML indexes can be created on xml data type

columns. They index all tags, values and paths over the XML instances in the column and benefit

query performance. Your application may benefit from an XML index in the following situations:

Queries on XML columns are common in your workload. XML index maintenance cost

during data modification must be considered.

Your XML values are relatively large and the retrieved parts are relatively small. Building

the index avoids parsing the whole data at run time and benefits index lookups for

efficient query processing.

XML indexes fall into the following categories:

Primary XML index

Secondary XML index

The first index on the xml type column must be the primary XML index. Using the primary XML

index, the following types of secondary indexes are supported: PATH, VALUE, and PROPERTY.

Depending on the type of queries, these secondary indexes might help improve query

performance.








22



Spatial Index

Starting from SQL Server 2008 and higher are supporting spatial data and spatial indexes. A

spatial index is a type of extended index that allows you to index a spatial column. A spatial

column is a table column that contains data of a spatial data type, such as geometry or

geography.

The following table is representing the maximum indexes count per table:

Version Columns/

Table Columns/Index Key

Clustered Indexes

Non-Clustered Indexes

XML Indexes

SQL Server 2005 1024 16 1 249 249

SQL Server 2008 1024* 16 1 999 249

SQL Server 2008 R2 1024* 16 1 999 249

SQL Server 2012 1024* 16 1 999 249

SQL Server 2014 1024* 16 1 999 249

* In wide tables which is initiated by SQL Server 2008, it could take up to 30,000 columns per

table and also up to 1,000 indexes but with maximum tow size of 8KB such as the non-wide

table.








23



Database Shrinking

When the database is consuming much space on the hard-disk with much unallocated data, we can think

seriously on database shrinking which is trimming the unallocated data on the disk.

The unallocated data is caused by the much manipulation operations such as the data updating and

deleting; because if the data is moved from a page to another due to an update or even delete, the

reserved space on the disk is not freed up, furthermore, it may increase the index fragmentation.

Shrinking Log Files

The log can truncated with the Transaction Log backups. Otherwise, to shrink a log file, the

recovery model of database mush be set to “Simple” first and then run the “DBCC SHRINKFILE”

command.

Syntax:

USE DatabaseName;

GO

DBCC SHRINKFILE (LogFileName);

or

DBCC SHRINKFILE (LogFileName, NewFileSize_in_MB);

Shrinking Data Files

The data files can be shrunk with any recovery model using the “DBCC SHRINKFILE”:

Syntax:

USE DatabaseName;

GO

DBCC SHRINKFILE (DataFileName);

or

DBCC SHRINKFILE (DataFileName, NewFileSize_in_MB);

To avoid the data file size expansion after the defragmentation (index rebuilding) which would

be performed after the shrinking, always specify the new file size which is equal or greater than

the current allocated size with extra few gigabytes.

Check Fragmentations

You can always check the database fragmentation using the following T-SQL query:

Defragmentation

Use Database_Name;

Go

SELECT (Sch.Name + '.[' + T.Name + ']') As TableName, I.Name AS IndexName,

INS.index_type_desc As IndexType, Fill_Factor, Page_Count,

INS.avg_fragmentation_in_percent

FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) As INS

INNER JOIN sys.indexes I ON I.object_id = INS.object_id AND I.index_id = INS.index_id

Inner Join sys.Tables As T On T.object_id = INS.object_id

Inner Join sys.Schemas As Sch On T.Schema_ID = Sch.Schema_ID

WHERE I.Name IS Not Null And INS.avg_fragmentation_in_percent > 0

ORDER BY INS.avg_fragmentation_in_percent DESC;








24



The defragmentation is done basically by rebuilding the index or reorganizing it using the Alter

Index command which would replace the usage of the two commands “DBCC INDEXDEFRAG”

and “DBCC DBREINDEX” in a future release of Microsoft SQL.

Defragmentation can be performed by rebuilding (if fragmentation is above 30%) and

reorganizing (if fragmentation is equal to or less than 30%) all the indexes in the database using

one of the following attached T-SQL scripts:

1. “Defrag_Database.sql” can be executed this script direct using the parameters listed on

the underneath table.

2. “JSP_Defrag_Database - Stored Procedure.sql” can be executed once per a time to create

the stored procedure in the memory until the current query session is closed. This

Stored Procedure can be used to defrag multiple databases in the same opened query

session using the following syntax:

JSP_Defrag_Database - Stored Procedure.sql

Defrag_Database.sql

Parameter Name Required/Optional

Data Type Default Value

Values Range

Description

@DataBase Optional nVarChar(1000) Null N/A The target database to be defragged. If not set, it would defrag the current connected database.

@FillFactor Optional SmallInt -1 0 - 100 The fill factor ratio; between 0 to 100

while 0 is the same respects as 100.

@Online Optional VarChar(3) On On | Off

If set to “On”, the database would be kept in the “Online” mode, if set to “Off” the database would be set to the “Offline” mode.

@XMLOnlineMode Optional VarChar(10) Reorg Off | Reorg

Both XML and Spatial indexes are not supported in rebuilding with “Online”

mode (@Online = ‘On’), therefore, it must be to set wither to “Off” to be rebuilt with “Offline” mode, or “Reorg” to be reorganized instead of the rebuilding.

@SpatialOnlineMode Optional VarChar(10) Reorg Off | Reorg

Execute ##JSP_Defrag_Database @DataBase = 'AdventureWorks2008R2';







http://ahmad.jempress.com/attachments/SQL/Defrag_Database.zip

http://ahmad.jempress.com/attachments/SQL/JSP_Defrag_Database%20-%20Stored%20Procedure.zip


25



Important Remarks:

1. The “Online = On” option will not work with the following cases:

Non Enterprise Edition of SQL.

The index is an XML index.

The index is a Spatial index.

The index is on a local temp table.

The index is clustered and the table contains a LOB database column(s).

The index is not clustered and the index itself contains a LOB database

column(s).

2. Do not execute the above script on Microsoft SQL Server 2012 or Microsoft SQL Server 2014

with “Online” option.

If using SQL Server 2012 or SQL Server 2014, mind there is a bug where the online index rebuild

can cause index corruption or data loss when it is used together with concurrent queries that

modify many rows.

Apply a CU that includes hotfix http://support.microsoft.com/kb/2969896.

For SQL Server 2012 SP1, apply at least CU11.

For SQL Server 2012 SP2, apply at least CU1.

For SQL Server 2014, apply at least CU2.

The following T-SQL script is useful to get all the info about the current installed SQL:

SELECT

SERVERPROPERTY('ServerName') AS ServerName,

SERVERPROPERTY('MachineName') AS MachineName,

SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS CurrentActiveClusterMachine,

SERVERPROPERTY('InstanceName') AS InstanceName,

SERVERPROPERTY('ProductVersion') AS ProductVersion,

SERVERPROPERTY('ProductLevel') AS ProductLevel,

SERVERPROPERTY('Edition') AS Edition,

SERVERPROPERTY('EngineEdition') AS EngineEdition,

SERVERPROPERTY('LicenseType') AS LicenseType,

SERVERPROPERTY('Collation') AS Collation,

SERVERPROPERTY('BuildClrVersion') AS BuildClrVersion;







http://support.microsoft.com/kb/2969896


26



References

http://sqlity.net/en/2445/b-plus-tree/

https://technet.microsoft.com/en-us/magazine/2009.07.sqlbackup.aspx

https://msdn.microsoft.com/en-us/library/ms188388.aspx





http://en.wikipedia.org/wiki/Database_index


http://en.wikipedia.org/wiki/SQL

http://sqlity.net/en/2462/intermediate-index-pages/

http://sqlity.net/en/563/index-misconceptions-tsql-tuesday-026-second-chances/

https://www.simple-talk.com/sql/learn-sql-server/sql-server-index-basics/

http://use-the-index-luke.com/sql/anatomy/the-tree

http://en.wikipedia.org/wiki/Tree_%28data_structure%29

http://en.wikipedia.org/wiki/Binary_tree

http://en.wikipedia.org/wiki/B-tree


http://en.wikipedia.org/wiki/Microsoft_SQL_Server


http://en.wikipedia.org/wiki/OLE_DB

http://en.wikipedia.org/wiki/Open_Database_Connectivity

http://en.wikipedia.org/wiki/Java_Database_Connectivity

http://en.wikipedia.org/wiki/Database

http://en.wikipedia.org/wiki/Relational_database_management_system

http://documentation.statsoft.com/STATISTICAHelp.aspx?path=Query/StatQuery/FAQ/WhatIstheDifferencebe

tweenODBCandOLEDB

http://sqlmag.com/sql-server/ole-db-or-odbc

http://en.wikipedia.org/wiki/ADO.NET_data_provider

https://msdn.microsoft.com/en-us/library/system.data.oracleclient%28v=vs.71%29.aspx

https://msdn.microsoft.com/en-us/library/system.data.sqlclient%28v=vs.110%29.aspx

http://blogs.msdn.com/b/blogdoezequiel/archive/2011/07/03/adaptive-index-defrag.aspx#.VRLtM3vMhWw


https://msdn.microsoft.com/en-us/library/ms143432%28v=sql.90%29.aspx






https://technet.microsoft.com/en-us/library/bb964712%28v=sql.105%29.aspx







http://sqlity.net/en/2445/b-plus-tree/







http://en.wikipedia.org/wiki/Database_index


http://en.wikipedia.org/wiki/SQL

http://sqlity.net/en/2462/intermediate-index-pages/

http://sqlity.net/en/563/index-misconceptions-tsql-tuesday-026-second-chances/

https://www.simple-talk.com/sql/learn-sql-server/sql-server-index-basics/

http://use-the-index-luke.com/sql/anatomy/the-tree

http://en.wikipedia.org/wiki/Tree_%28data_structure%29

http://en.wikipedia.org/wiki/Binary_tree

http://en.wikipedia.org/wiki/B-tree


http://en.wikipedia.org/wiki/Microsoft_SQL_Server


http://en.wikipedia.org/wiki/OLE_DB

http://en.wikipedia.org/wiki/Open_Database_Connectivity

http://en.wikipedia.org/wiki/Java_Database_Connectivity

http://en.wikipedia.org/wiki/Database

http://en.wikipedia.org/wiki/Relational_database_management_system

http://documentation.statsoft.com/STATISTICAHelp.aspx?path=Query/StatQuery/FAQ/WhatIstheDifferencebetweenODBCandOLEDB

http://documentation.statsoft.com/STATISTICAHelp.aspx?path=Query/StatQuery/FAQ/WhatIstheDifferencebetweenODBCandOLEDB

http://sqlmag.com/sql-server/ole-db-or-odbc

http://en.wikipedia.org/wiki/ADO.NET_data_provider

https://msdn.microsoft.com/en-us/library/system.data.oracleclient%28v=vs.71%29.aspx

https://msdn.microsoft.com/en-us/library/system.data.sqlclient%28v=vs.110%29.aspx

http://blogs.msdn.com/b/blogdoezequiel/archive/2011/07/03/adaptive-index-defrag.aspx#.VRLtM3vMhWw







27



https://msdn.microsoft.com/en-us/library/bb934196.aspx

https://msdn.microsoft.com/en-us/library/bb895265.aspx

https://technet.microsoft.com/en-us/library/ms186986%28v=sql.100%29.aspx


http://en.wikipedia.org/wiki/UB-tree

http://www.scholarpedia.org/article/B-tree_and_UB-tree

http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&ved=0CCwQFjAC&url=http%3A%2F%

2Fwww.vldb.org%2Fconf%2F2000%2FP263.pdf&ei=DusVVePmDMTiapCtgmg&usg=AFQjCNETDk6oAlHSAY36Xj

W9v_ylryPorg&sig2=OX_VuUsxUW4RKI32a7Ic9w&bvm=bv.89381419,d.d24&cad=rja

https://www.simple-talk.com/sql/t-sql-programming/sql-server-spatial-indexes/








http://en.wikipedia.org/wiki/UB-tree

http://www.scholarpedia.org/article/B-tree_and_UB-tree

http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&ved=0CCwQFjAC&url=http%3A%2F%2Fwww.vldb.org%2Fconf%2F2000%2FP263.pdf&ei=DusVVePmDMTiapCtgmg&usg=AFQjCNETDk6oAlHSAY36XjW9v_ylryPorg&sig2=OX_VuUsxUW4RKI32a7Ic9w&bvm=bv.89381419,d.d24&cad=rja



Download - SQL Server Indexes Management Overviewahmad.jempress.com/attachments/SQL/SQL Server Indexes Manage… · Version 1.0 SQL Server Indexes Management Overview A h m a d A d e l G a d

Top Related