MCTS Self-Paced Training Kit (Exam 70-433): Microsoft SQL Server 2008—Database Development
Notes from the Book

Upload: martin-surasky
Posted on 28-Jul-2015


Chapter 1 - Data Retrieval

When the ANSI_NULLS setting is set to ON, the default configuration, you cannot equate values to NULL. If you try, the comparison evaluates to UNKNOWN rather than TRUE.

When the database engine parses and compiles a query, conditions that include Boolean operators are evaluated in the following order: NOT, AND, OR

Neither leading wildcard characters nor NOT logic allow the query optimizer to use indexes to optimize the search. For optimal performance, you should avoid using the NOT keyword and leading wildcard symbols.
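As a sketch (table, column, and index names are hypothetical), the first predicate below is sargable, while the leading wildcard is not:

```sql
-- Assumes HR.Employees has an index on LastName
SELECT EmployeeID, LastName
FROM HR.Employees
WHERE LastName LIKE 'And%';   -- trailing wildcard: the optimizer can seek the index

SELECT EmployeeID, LastName
FROM HR.Employees
WHERE LastName LIKE '%son';   -- leading wildcard: forces an index or table scan
```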

The CHECKSUM_AGG aggregate function is not compatible with ROLLUP, CUBE, or GROUPING SETS

Rollup & Cube defined

GROUP BY [GROUP_BY_COLUMN_1], [GROUP_BY_COLUMN_2], … WITH [ROLLUP|CUBE]

Grouping Sets defined

GROUP BY GROUPING SETS (([GROUPING_SET_1]), ([GROUPING_SET_2]), …)

I would like to read a bit more on Rollup, CUBE and Grouping Sets
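As a sketch (table and column names are hypothetical), one GROUPING SETS query can replace several separate GROUP BY queries:

```sql
-- Subtotals per (Region, Product), per Region alone, and a grand total ()
SELECT Region, Product, SUM(Amount) AS Total
FROM dbo.Sales
GROUP BY GROUPING SETS ((Region, Product), (Region), ());
```

The same result as WITH ROLLUP on (Region, Product) can be written explicitly as the three grouping sets shown.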

EXCEPT and INTERSECT operators were introduced in SQL Server 2005 to provide the ability to compare the results of two queries and return a new result set based on whether rows are common to both.

When using UNION, the ALL keyword specifies that all rows, including duplicate rows, should be returned. By default, duplicate rows are not returned.
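A sketch of the three set operators side by side (table names are hypothetical):

```sql
SELECT CustomerID FROM dbo.Orders2007
INTERSECT                       -- customers present in both years
SELECT CustomerID FROM dbo.Orders2008;

SELECT CustomerID FROM dbo.Orders2007
EXCEPT                          -- customers who ordered in 2007 but not in 2008
SELECT CustomerID FROM dbo.Orders2008;

SELECT CustomerID FROM dbo.Orders2007
UNION ALL                       -- keeps duplicate rows; plain UNION removes them
SELECT CustomerID FROM dbo.Orders2008;
```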

DateTime functions:

Returning types that use more memory (datetime2(7) & datetimeoffset(7)): SYSDATETIME, SYSDATETIMEOFFSET, and SYSUTCDATETIME

Returning types that use less memory (datetime): GETDATE, CURRENT_TIMESTAMP, and GETUTCDATE

CURRENT_TIMESTAMP and GETDATE functions produce the same results


Chapter 2: Modifying Data—The INSERT, UPDATE, DELETE, and MERGE Statements

OUTPUT clause

MERGE statement

SAVE TRANSACTION

Locks can be assigned to resources such as rows, keys, pages, extents, tables, and databases.

Shared (S) Placed on resources for read (SELECT) operations. Shared locks are compatible with other shared locks. Shared locks are not compatible with exclusive locks. When the isolation level is set to REPEATABLE READ or higher, or a locking hint is used, the shared locks are retained for the duration of the transaction. Otherwise, shared locks are released as soon as the read is completed

Update (U) Placed on resources where a shared (S) lock is required, but the need to upgrade to an exclusive (X) lock is anticipated. Only one transaction at a time can obtain an update lock on a resource. When modification to the resource is required, the update lock is upgraded to an exclusive lock.

Exclusive (X) Placed on resources for data modification. An exclusive lock is not compatible with any other type of lock. Only the NOLOCK hint or the READ UNCOMMITTED isolation level overrides an exclusive lock's functionality

Intent (IS, IX, SIX) Placed on resources to improve performance and locking efficiency by placing intent (IS, IX, SIX) locks at a high-level object (such as a table) before placing shared (S) or exclusive (X) locks at a lower level (such as the page level).

Schema (Sch-M, Sch-S) Schema modification (Sch-M) locks are placed on objects during schema modification operations, such as adding a new column to a table. Schema stability (Sch-S) locks are placed on objects while queries are being compiled or executed. Sch-M locks block all other operations until the lock is released. Sch-S locks are not compatible with Sch-M locks

Bulk Update (BU) Placed on tables for bulk insert. These locks allow multiple bulk insert threads to access the table but do not allow other processes to access the table. These locks are enabled by either using the TABLOCK hint or by using the sp_tableoption stored procedure to enable the Table lock on bulk load table option

Key-range Placed on a range of rows to protect against phantom insertions and deletions in a record set that is being accessed by a transaction. These locks are used by transactions using the SERIALIZABLE transaction isolation level.

Database concurrency issues

Dirty Reads occur when one transaction reads data written by another, uncommitted, transaction. The danger with dirty reads is that the other transaction might never commit, leaving the original transaction with "dirty" data.

Non-repeatable Reads occur when one transaction attempts to access the same data twice and a second transaction modifies the data between the first transaction's read attempts. This may cause the first transaction to read two different values for the same data, causing the original read to be non-repeatable

Phantom Reads occur when one transaction accesses a range of data more than once and a second transaction inserts or deletes rows that fall within that range between the first transaction's read attempts. This can cause "phantom" rows to appear or disappear from the first transaction's perspective.

Types of transaction isolation levels (SET TRANSACTION ISOLATION LEVEL)

The READ UNCOMMITTED Isolation Model offers essentially no isolation between transactions. Any transaction can read data written by an uncommitted transaction.


This leaves the transactions vulnerable to dirty reads, phantom reads and non-repeatable reads.

The READ COMMITTED Isolation Model is SQL Server’s default behavior. In this model, the database does not allow transactions to read data written to a table by an uncommitted transaction. This model protects against dirty reads, but provides no protection against phantom reads or non-repeatable reads

The REPEATABLE READ Isolation Model goes a step further than the Read Committed model by preventing transactions from writing data that was read by another transaction until the reading transaction completes (using shared locks that are maintained until the current transaction is completed). This isolation model protects against both dirty reads and non-repeatable reads.

The SERIALIZABLE Isolation Model uses range locks to prevent transactions from inserting or deleting rows in a range being read by another transaction. The Serializable model protects against all three concurrency problems

The SNAPSHOT Isolation Model also protects against all three concurrency problems, but does so in a different manner. It provides each transaction with a "snapshot" of the data it requests. The transaction may then access that snapshot for all future references, eliminating the need to return to the source table for potentially dirty data. Requires the ALLOW_SNAPSHOT_ISOLATION database option to be set to ON.
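A minimal sketch of enabling and using snapshot isolation (database and table names are hypothetical):

```sql
-- Enable the option once per database
ALTER DATABASE SalesDB SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then, on a connection:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT SUM(Amount) FROM dbo.Orders;  -- reads the row-version snapshot,
                                         -- unaffected by concurrent writers
COMMIT TRANSACTION;
```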


Chapter 3: Tables, Data Types, and Declarative Data Integrity

COLLATE statement (WHERE Name = 'öqvist' COLLATE Finnish_Swedish_CI_AS). By changing the collation in the expression, SQL Server cannot use an index defined on the column to perform a seek operation because that index is sorted according to another collation

Type groups: exact numeric, approximate numeric, character, date and time, and binary

Exact: tinyint (1), smallint(2), int(4), bigint (8), decimal(p,s) & numeric(p,s)

Precision to storage requirements:

1 to 9: 5 bytes
10 to 19: 9 bytes
20 to 28: 13 bytes
29 to 38: 17 bytes

Approximate: real & float

Date and time: datetime (8 bytes – backward compatibility), smalldatetime (4 bytes – backward compatibility), datetime2 (6 to 8 bytes – if time offset is not required), datetimeoffset (8 to 10 bytes – when time zones are required), date (3 bytes – when only date is required), time (3 to 4 bytes – when only time is required)

Data compression is implemented in two levels: row and page. The following statement configures a table to use compression:

ALTER TABLE HR.Employees REBUILD WITH (DATA_COMPRESSION = PAGE|ROW);

If you turn on row-level compression, SQL Server changes the format used to store rows. In simple terms, this row format converts all data types to variable-length data types. It also uses no storage space to store NULL values.

Page-level compression includes row-level compression and also introduces pointers between rows in the same page to avoid storing redundant data.

There are five types of constraints: PRIMARY KEY, UNIQUE, FOREIGN KEY, CHECK, and DEFAULT

sys.foreign_keys and sys.foreign_key_columns

One important aspect of check constraints is that they reject only values that evaluate to FALSE; a condition that evaluates to UNKNOWN because of NULL is accepted.

Now turn off the foreign key:

ALTER TABLE [TABLE_NAME] NOCHECK CONSTRAINT [CONSTRAINT_NAME]

Turn on the foreign key:

ALTER TABLE [TABLE_NAME] [WITH CHECK] CHECK CONSTRAINT [CONSTRAINT_NAME]
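Concretely (table and constraint names are hypothetical):

```sql
-- Disable checking: existing and new rows are no longer validated
ALTER TABLE dbo.Orders NOCHECK CONSTRAINT FK_Orders_Customers;

-- Re-enable; WITH CHECK also re-validates the existing rows, so the
-- constraint is marked trusted again and usable by the optimizer
ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customers;
```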


Chapter 4 - Using Additional Query Techniques

A recursive CTE expands the definition of the table expression and consists of two parts:

An anchor query, which is the source of the recursion, along with a UNION ALL statement and a second query, which recurses across the anchor query.

An outer query, which references the routine and specifies the number of recursion levels OPTION (MAXRECURSION [MAX_RECURSION])
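A minimal sketch of those parts, walking a hypothetical employee hierarchy (table and column names are assumptions):

```sql
WITH EmpCTE AS
(
    SELECT EmployeeID, ManagerID, 0 AS Lvl          -- anchor query: the root rows
    FROM HR.Employees
    WHERE ManagerID IS NULL
    UNION ALL
    SELECT e.EmployeeID, e.ManagerID, c.Lvl + 1     -- recursive member
    FROM HR.Employees AS e
    JOIN EmpCTE AS c ON e.ManagerID = c.EmployeeID
)
SELECT EmployeeID, ManagerID, Lvl                   -- outer query
FROM EmpCTE
OPTION (MAXRECURSION 10);                           -- cap the recursion depth
```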

You can construct queries with two types of subqueries: correlated and noncorrelated

When you embed a SELECT statement into a FROM clause, you are using a feature referred to as derived tables

SQL has four functions that can be used for ranking data: ROW_NUMBER, RANK, DENSE_RANK, and NTILE

OPTION (FAST 50)

Parameterization

SQL Server can auto-parameterize some queries to aid with execution plan reuse. This process is called simple parameterization

With simple parameterization SQL Server can only parameterize a small subset of queries

The performance of transactional applications with heavy volumes of data changes can benefit from forced parameterization by virtue of increased reuse of query execution plans.

To change parameterization, this is the T-SQL:

ALTER DATABASE AdventureWorksDW SET PARAMETERIZATION FORCED


Chapter 5: Programming Microsoft SQL Server with T-SQL User-Defined Stored Procedures, Functions, Triggers, and Views

The only commands that cannot be used in a stored procedure are the following (13):

USE <database name>

SET SHOWPLAN_TEXT

SET SHOWPLAN_ALL

SET PARSEONLY

SET SHOWPLAN_XML

CREATE AGGREGATE

CREATE RULE

CREATE DEFAULT

CREATE SCHEMA

CREATE FUNCTION or ALTER FUNCTION

CREATE TRIGGER or ALTER TRIGGER

CREATE PROCEDURE or ALTER PROCEDURE

CREATE VIEW or ALTER VIEW

@@IDENTITY vs. SCOPE_IDENTITY()
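A sketch of the difference (table name hypothetical):

```sql
INSERT INTO dbo.Orders (CustomerID) VALUES (42);

-- SCOPE_IDENTITY(): last identity value generated in the current scope;
-- not affected by identity values generated inside triggers
SELECT SCOPE_IDENTITY();

-- @@IDENTITY: last identity value on the connection in any scope, so a
-- trigger that inserts into an audit table can change this value unexpectedly
SELECT @@IDENTITY;
```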

Error Messages

The error number is an integer value. Error messages that ship with SQL Server are numbered from 1 to 49999

SQL Server defines 26 severity levels numbered from 0 through 25. Any error with a severity level of 16 or higher is logged automatically to the SQL Server error log and the Windows Application Event Log. Errors with a severity level of 19 to 25 can be specified only by members of the sysadmin fixed server role. Errors with a severity level of 20 to 25 are considered fatal and cause the connection to be terminated and any open transactions to be rolled back

You can create your own custom error messages, which must be numbered 50001 and higher. (The number 50000 is reserved to designate a message whose number is not specified.)
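A minimal sketch of registering and raising a custom message (message number and text are hypothetical):

```sql
-- Register a custom message; the number must be 50001 or higher
EXEC sp_addmessage @msgnum = 50001, @severity = 16,
    @msgtext = N'Customer %d has exceeded its credit limit.';

-- Raise it, substituting the %d placeholder
RAISERROR (50001, 16, 1, 42);
```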

sys.messages catalog view

If you want the transaction to either complete entirely or fail entirely, you can use the SET command to change the XACT_ABORT setting on your connection

If an error with a severity less than 20 is encountered within the TRY block, control passes to the corresponding CATCH block

XACT_ABORT behaves differently when used in conjunction with a TRY block. Instead of terminating the transaction, control is transferred to the CATCH block. However, if XACT_ABORT is turned on, any error is fatal. The transaction is left in a doomed state and XACT_STATE returns -1. Therefore, you cannot commit a transaction inside a CATCH block if XACT_ABORT is turned on
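A sketch of that interaction (table names hypothetical):

```sql
SET XACT_ABORT ON;
BEGIN TRY
    BEGIN TRANSACTION;
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- With XACT_ABORT ON, any error dooms the transaction: XACT_STATE()
    -- returns -1, so it can only be rolled back, never committed
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;
    SELECT ERROR_NUMBER() AS ErrNum, ERROR_MESSAGE() AS ErrMsg;
END CATCH;
```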

EXECUTE AS has three possible arguments:

OWNER Specifies that the statements inside the module execute in the context of the current owner of the module.

[USER_NAME] Executes under the security context of the specified database user. This account can't be a role, group, certificate, or asymmetric key.


CALLER Executes under the security context of the routine that called the module (default)

SELF is equivalent to EXECUTE AS user_name, where the specified user is the person creating or altering the module

You can declare four different types of cursors (4):

FAST_FORWARD The fastest performing cursor type because it allows you only to move forward one row at a time. Scrolling (discussed later in this section) is not supported. A FAST_FORWARD cursor is the same as declaring a FORWARD_ONLY, READ_ONLY cursor. FAST_FORWARD is the default option for cursors.

STATIC The result set is retrieved and stored in a temporary table in the tempdb database. All fetches go against the temporary table and modifications to the underlying tables for the cursor are not visible. A STATIC cursor supports scrolling, but modifications are not allowed.

KEYSET The set of keys that uniquely identify each row in the cursor result set is stored in a temporary table in tempdb. As you scroll within the cursor, non-key columns are retrieved from the underlying tables. Therefore, any modifications to rows are reflected as the cursor is scrolled. Any inserts into the underlying table are not accessible to the cursor. If you attempt to access a row that has been deleted, @@FETCH_STATUS returns -2.

DYNAMIC The most expensive cursor to use. The cursor reflects all changes made to the underlying result set, including newly inserted rows as the cursor is scrolled. The position and order of rows within the cursor can change each time a fetch is made. The FETCH ABSOLUTE option is not available for dynamic cursors.

T-SQL has three concurrency options available for cursors (3):

READ_ONLY SQL Server does not acquire a lock on the underlying row in the table because a cursor marked as READ_ONLY cannot be updated.

SCROLL_LOCKS A lock is acquired as each row is read into the cursor, guaranteeing that any transaction executed against the cursor succeeds.

OPTIMISTIC A lock is not acquired. SQL Server instead uses either a timestamp column or, if one does not exist, a calculated checksum to detect whether the data has changed since being read into the cursor. If the data has changed, the modification fails.

SQL Server limits the maximum size of the query cache to approximately 20 percent of the memory allocated to the SQL Server instance

WHERE CURRENT OF {cursor_name}
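A sketch of an updatable cursor with a positioned update (table and column names hypothetical):

```sql
DECLARE cur CURSOR FORWARD_ONLY SCROLL_LOCKS
    FOR SELECT EmployeeID FROM HR.Employees
    FOR UPDATE OF Salary;

OPEN cur;
FETCH NEXT FROM cur;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Positioned update against the row most recently fetched
    UPDATE HR.Employees SET Salary = Salary * 1.05
    WHERE CURRENT OF cur;
    FETCH NEXT FROM cur;
END;
CLOSE cur;
DEALLOCATE cur;
```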

As a best practice, you should try to avoid creating queries that use a function in the WHERE clause because the function would have to execute for each potential row returned from the results of the FROM clause

sp_settriggerorder system stored procedure

NOT FOR REPLICATION option on triggers

ON ALL SERVER option vs ON DATABASE option on DDL triggers

While DML triggers have access to the inserted and deleted tables, DDL triggers have access to the EVENTDATA function. EVENTDATA returns an Extensible Markup Language (XML) document with some valuable data about the DDL transaction.

If you execute a ROLLBACK statement within a logon trigger, the connection to the instance terminates.

The VIEW_METADATA option is used when creating an updatable view and causes SQL Server to return to client applications metadata about the view, instead of about the tables underlying the view.


The WITH CHECK OPTION clause requires that the only data manipulation that can occur through the view must also be retrievable when you select from the view.

You can modify data through a view so long as the following requirements are met (6):

The data modification must reference exactly one table.

Columns in the view must reference columns in a table directly.

The column cannot be derived from an aggregate.

The column cannot be computed as the result of a UNION/UNION ALL, CROSS JOIN, EXCEPT, or INTERSECT.

The column being modified cannot be affected by the DISTINCT, GROUP BY, or HAVING clause.

The TOP operator is not used.

When you split member tables of a partitioned view across SQL Server instances, you create a special case called a distributed partitioned view

When an index is created against a view, the data is materialized
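A minimal indexed-view sketch (view and table names hypothetical). SCHEMABINDING is required before a view can be indexed, and an aggregated view must include COUNT_BIG(*):

```sql
CREATE VIEW dbo.vSalesByProduct
WITH SCHEMABINDING
AS
SELECT ProductID, SUM(Amount) AS TotalAmount, COUNT_BIG(*) AS RowCnt
FROM dbo.Sales
GROUP BY ProductID;
GO
-- The unique clustered index materializes the view's result set
CREATE UNIQUE CLUSTERED INDEX IX_vSalesByProduct
    ON dbo.vSalesByProduct (ProductID);
```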


Chapter 6: Techniques to Improve Query Performance

The query cost is typically a good performance measurement, but when certain items are used in a query, such as scalar user-defined functions (UDFs) and Common Language Runtime (CLR) routines, the cost for these items is not calculated, which renders the query cost lower than a truly accurate assessment

Page Reads (enabled by entering SET STATISTICS IO ON) also has the same problem with scalar UDFs and CLR routines as the query cost, which is that page reads caused by these routines are not included in the output of STATISTICS IO

Theoretical Query Execution Order (8)

FROM, JOIN, APPLY, and ON

WHERE

GROUP BY and aggregate functions (such as SUM, AVG, and so on) that are included in the query

HAVING

SELECT

ORDER BY

TOP

FOR XML

Theoretical Query Execution Order (When using UNION) (8)

FROM, JOIN, APPLY, and ON

WHERE

GROUP BY and aggregate functions (such as SUM, AVG, and so on) that are included in the query

HAVING

TOP (as a result of this, some queries do not return the correct values because, on UNION statements, the ORDER BY clause is executed after the TOP clause)

UNION and SELECT

ORDER BY

FOR XML

Things to watch out for on a graphical execution plan

Thick Arrows

Hash operations on GROUP BY or JOINS

Sorts with high percentage of query cost (consider creating an index)

Large Plans

Table or clustered index scans

Scalar UDFs can significantly degrade performance. The reason for this is that these functions are not expanded and optimized into the main query plan by the optimizer; rather, they are just called from the execution plan without any optimization based on the context into which they are inserted in the plan

To identify queries that need to be optimized, a good starting point with profiler would be the SQL:BatchCompleted and RPC:Completed events (use CPU, Reads, Writes and Time columns to start)


A clustered index should generally be defined on columns that are often queried and typically return a lot of data (a very frequently queried FK column or a very frequently queried date column)

After you have created the unique clustered index on the view, it is possible to create additional nonclustered indexes on the same view

If you are using any edition of SQL Server other than Enterprise Edition or Developer Edition, your query must reference the view directly, and you must add the optimizer hint WITH(NOEXPAND) to the query

You can query the sys.dm_db_index_usage_stats dynamic management view (DMV) to find index-usage information

To be able to partition tables and indexes, you first need to create two objects: a partition function and a partition scheme

Partition functions are created using the CREATE PARTITION FUNCTION statement

CREATE PARTITION FUNCTION pf (INT)

AS RANGE [LEFT|RIGHT]

FOR VALUES ({ENTER_VALUES_HERE});

Partition schemes are created using the CREATE PARTITION SCHEME statement

CREATE PARTITION SCHEME PS

AS PARTITION PF TO ({ENTER_PARTITIONS_HERE});

It is recommended that a table and all its keys be created on the same partition scheme. A table with all indexes defined on the same partition scheme is said to have "aligned partitions."

Nonunique, nonclustered indexes are aligned automatically with the table's partition scheme when created; that is, you don't even need to specify the ON clause for the CREATE INDEX statement. For unique indexes, however, you must include the partitioning column in the index key to make it aligned with the table

Alters the partition function to add a new divider

ALTER PARTITION FUNCTION PFCustomerID()

SPLIT RANGE ({NUMBER_THAT_YOU_WANT_TO_BE_THE_DIVIDER});

Add an existing table to a partition

ALTER TABLE {NEW_TABLE} SWITCH TO {PARTITIONED_TABLE} PARTITION {PARTITION_NUMBER};
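Pulling the pieces above together, a minimal end-to-end sketch (all object names are hypothetical):

```sql
-- Three boundary values give four partitions; RANGE LEFT puts each boundary
-- value into the lower partition
CREATE PARTITION FUNCTION PFCustomerID (INT)
    AS RANGE LEFT FOR VALUES (1000, 2000, 3000);

-- Map every partition to the PRIMARY filegroup for simplicity
CREATE PARTITION SCHEME PSCustomerID
    AS PARTITION PFCustomerID ALL TO ([PRIMARY]);

-- Table created on the scheme; indexes created without an ON clause
-- are aligned automatically
CREATE TABLE dbo.Customers
(
    CustomerID INT NOT NULL PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
) ON PSCustomerID (CustomerID);
```

To switch a staging table in as partition 1, the staging table must have an identical structure and a check constraint restricting CustomerID to that partition's range, for example: ALTER TABLE dbo.CustomersStage SWITCH TO dbo.Customers PARTITION 1;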

Tuning Indexes Automatically

When viewing the graphical execution plan, you may see a note about a "Missing Index." In this case, you can right-click the missing index note, which lets you retrieve the script needed to create the missing index

You can also use the Database Engine Tuning Advisor graphical utility to retrieve information on indexes, indexed views, and even partitioning solutions that may help query performance


Chapter 7: Extending Microsoft SQL Server Functionality with XML, SQLCLR, and Filestream

In SQL Server 2008, there are four flavors (or modes) of the FOR XML clause: RAW, AUTO, EXPLICIT, and PATH (PATH mode is the best choice of the different FOR XML modes for most solutions)

RAW supports naming the element (as opposed to just <row>) by using: FOR XML RAW('{Element_name}')

ROOT directive

ELEMENTS [XSINIL] directive (exists only in RAW and AUTO)

AUTO doesn't support more than one path of branches

EXPLICIT is difficult to write and maintain, but is very flexible (you can create virtually any XML structure, even some not supported by the PATH mode)

When using a subquery that returns XML data inside another query that returns XML data, the inner query needs to use the TYPE directive. Without the TYPE directive, SQL Server interprets the result of the subquery as an XML-encoded string.

WITH XMLNAMESPACES clause

The XML data type can be either typed or untyped. Typed simply means that an XML schema collection is assigned to the type (SCHEMA) to verify its contents.

The exist method returns a bit value and is used to verify if an XPath expression is found within an XML instance

The value method is used to perform an XQuery against the XML instance to fetch a single scalar value from it

The nodes method is used to shred the XML into a tabular form

The query method is used to perform an XQuery against the XML instance to retrieve an XML fragment rather than a scalar value or a tabular result. In the query method, you use the XQuery language's FLWOR expressions to retrieve the parts of the XML document that you need and present it in the way you want.

You can optionally associate a collection of XML schemas with a variable, parameter, or column of xml type. In this case, the xml data type instance is called typed

You must first register the XML schema collection by using CREATE XML SCHEMA COLLECTION

Then define your typed XML data type like this:

DECLARE @{variable_name} xml ({schema_name})

Once typed xml is defined, you can take advantage of storage and query optimizations based on type information

Internal representation of XML documents in SQL Server (tree representation of the various nodes)

To create an internal representation of the XML document, use:

EXEC sp_xml_preparedocument @hdoc OUTPUT, @doc

To remove the internal representation, use:

EXEC sp_xml_removedocument @hdoc
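A sketch of the full prepare/shred/remove cycle using OPENXML (the XML content is hypothetical):

```sql
DECLARE @doc NVARCHAR(MAX) =
    N'<Customers><Customer id="1" name="Ann"/><Customer id="2" name="Bo"/></Customers>';
DECLARE @hdoc INT;

EXEC sp_xml_preparedocument @hdoc OUTPUT, @doc;

-- Shred the node tree into rows; flag 1 = attribute-centric mapping
SELECT id, name
FROM OPENXML(@hdoc, '/Customers/Customer', 1)
     WITH (id INT '@id', name NVARCHAR(50) '@name');

EXEC sp_xml_removedocument @hdoc;  -- free the memory used by the tree
```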

What is the modify method?

To use CLR within the database, you must perform the following steps (5):

1. You must set the SQL Server instance to allow CLR code to run (EXEC sp_configure 'clr enabled', 1; RECONFIGURE;)

2. You must write the code that the object uses with a .NET language


3. You must compile the code to an assembly (a CLR executable).

4. You must load the assembly into SQL Server (CREATE ASSEMBLY statement).

5. Finally, you must create the database object and point it to the assembly using Data Definition Language (DDL) (EXTERNAL NAME).

When the statement is executed, the assembly is copied physically into the database to which you are attached. Therefore, you can delete the .dll assembly file after it has been loaded because SQL Server doesn't need it.

To see which assemblies exist within a database, you can query the sys.assemblies catalog view

You can also store the CLR source code in the database. SQL Server does not use the source code, but it is handy to store the code in the database because it is then possible to retrieve it from the database later (you may have lost the actual source code by then…). To add a source code file to the database, execute the following statement:

ALTER ASSEMBLY {ASSEMBLY_NAME} ADD FILE FROM {'PATH_TO_.CS_OR_.VB_FILE'};

Table-Valued CLR UDF

While a scalar UDF consists of a single method, a table-valued UDF consists of two methods:

A method that acts as the iterator, looping over the values that should be returned as rows

A method that is executed for every row and populates the actual row being returned to the SQL Server execution engine

[SqlFunction(FillRowMethodName = "{METHOD_TO_POPULATE_THE_ACTUAL_ROW}", TableDefinition = "{TABLE_COLUMNS_AND_TYPES}")] attribute

yield statement (used for the method that acts as the iterator to return values to the enumerator object)

Creating a CLR trigger is very similar to creating a CLR stored procedure.

When creating a CLR user-defined aggregate, you must create an entire CLR type (either class or struct) rather than just one or two methods as with the previous CLR objects. This is because the aggregate needs to be able to do a few things, including the following:

Initialize itself: This is performed using the Init method of the aggregate CLR type.

Add another value to the calculation: For each value that needs to be added to the calculation, the Accumulate method is called.

Combine itself with another instance of the same aggregate function: This is performed by calling the Merge method.

Return its result: This is performed by calling the Terminate method.

When creating the CLR user-defined aggregate, .NET code uses the SqlUserDefinedAggregate attribute to inform SQL Server of its various behaviors

You create an aggregate in T-SQL by using the CREATE AGGREGATE statement

When creating the CLR user-defined data type, .NET code uses the SqlUserDefinedType attribute to inform SQL Server of its various behaviors

You create a type in T-SQL by using the CREATE TYPE statement

The CLR code that is used within your database can be placed in one of three different "permission sets": SAFE (default), EXTERNAL_ACCESS, and UNSAFE.

SAFE: Only current database

EXTERNAL_ACCESS: Other SQL Server databases and instances, and even the network or file system


UNSAFE: Least safe of the three; this even allows you to execute unmanaged code

Filestream is typically recommended if the data that you store in the column is at least 1 megabyte (MB) in size

To use Filestream, you must also create a filegroup that contains a Filestream database file (CONTAINS FILESTREAM)

For a table to contain Filestream columns, it must have a uniqueidentifier column marked with the property ROWGUIDCOL and having a unique constraint defined on it.

When you delete Filestream data (either by using an UPDATE or a DELETE statement), SQL Server doesn't immediately delete the file. Instead, the files are deleted when the Filestream garbage collection process is run. This process in turn is run when the database checkpoint process is executed.

Chapter 8: Extending Microsoft SQL Server Functionality with the Spatial, Full-Text Search, and Service Broker

Spatial Data

Difference between geometry and geography

Open Geospatial Consortium (OGC) An international, nonprofit organization that develops standards for geospatial- and location-based services.

geography::Parse (static method)

{geography_column}.ToString() (instance method)

Both the Parse and STGeomFromText methods return a geometry instance from an OGC WKT representation. The difference between the commands is that Parse assumes an SRID of 0 as a parameter. STGeomFromText includes an argument to specify the SRID

The book includes a figure depicting the geometry hierarchy upon which the geometry and geography data types are based; the instantiable types of geometry and geography are indicated in blue.

Properties for a geometry are {geometry_object}.STX, {geometry_object}.STY, {geometry_object}.Z, {geometry_object}.M


How to create a multipoint?

geometry::STGeomFromText('MULTIPOINT((2 3), (7 8 9.5))', 23);
or
geometry::STMPointFromText('MULTIPOINT((2 3), (7 8 9.5))', 23);

And how to retrieve a subset of that data?

SELECT @g.STGeometryN(2).STAsText();

For a LineString instance to be valid, it must meet the following criteria:

1. The LineString instance must be accepted (2 points or empty).

2. If a LineString instance is not empty then it must contain at least two distinct points.

3. The LineString instance cannot overlap itself over an interval.

{linestring}.STIsValid() method

{linestring}.STLength() method

Full Text Search

By using the CONTAINS and FREETEXT predicates as well as the CONTAINSTABLE and FREETEXTTABLE functions, you can write queries that return the following:

o Inflectional forms of a verb that you input

o Matched results from data stored as a PDF file in a varbinary(max) column

o Synonyms of the search term located through a thesaurus search

The MSFTESQL Service no longer exists in SQL Server 2008 (it’s all integrated in the SQL Server process – Sqlservr.exe)

Filter daemon host process (Fdhost.exe) Runs as an isolated process to host third-party components, thus protecting the SQL Server process from those components.

SQL Full-Text Filter Daemon Launcher (Fdlauncher.exe) Starts Fdhost.exe processes when required.

Full-Text Terminology: Term, Full-text catalog, Full-text index, Word breaker, Token, Stemmer, Thesaurus, Stopword, Stoplist, Filter, Population (crawl), Full-text engine

The first step in configuring full-text indexing on a database is to create a full-text catalog on the database where you want to query data by using full-text search capabilities. You cannot create full-text catalogs in the master, model, or tempdb databases. You can use the CREATE FULLTEXT CATALOG command to create the full-text catalog

Before you can create a full-text index by using the CREATE FULLTEXT INDEX command, you must satisfy the following requirements:

o A full-text index cannot already exist on the table. You can create only one full-text index per table.

o A unique key index must exist on the table. This key index must be based on a unique, single-key column that does not allow NULL values.

o A full-text catalog must exist in the respective database. If a default catalog does not exist in the table's database, you must specify a catalog name in your CREATE FULLTEXT INDEX command
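A minimal sketch that satisfies those requirements (catalog, table, and key-index names are hypothetical):

```sql
CREATE FULLTEXT CATALOG ProductCatalog AS DEFAULT;

-- The KEY INDEX must be a unique, single-column, non-nullable index
CREATE FULLTEXT INDEX ON Production.ProductDescription (Description)
    KEY INDEX PK_ProductDescription
    WITH CHANGE_TRACKING AUTO;
```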

When creating full text indexes you can use the CHANGE_TRACKING option that specifies how changes to the table are propagated to the full-text index. The settings for this argument are as follows:

o AUTO Specifies that the propagation of changes happens automatically. Changes still might not be reflected immediately in the full-text index. This is the default setting.


o MANUAL Specifies that the ALTER FULLTEXT INDEX…START UPDATE POPULATION statement must be run either manually or by using SQL Server Agent jobs to propagate changes to the full-text index.

o OFF Specifies that SQL Server does not track changes to the table. A full population must be performed for any changes to be propagated to the full-text index. Unless the NO POPULATION option is specified, an initial population of the full-text index will occur automatically after the full-text index is created.

o OFF, NO POPULATION Specifies that the initial population of the full-text index does not occur after the full-text index is created. The NO POPULATION option is valid only with the OFF change tracking option.
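The requirements and CHANGE_TRACKING options above can be sketched as follows (dbo.Documents, DocumentSummary, and PK_Documents are hypothetical names; PK_Documents stands in for the required unique, single-column, non-nullable key index):

```sql
-- Create the one allowed full-text index on the table, naming the catalog
-- explicitly in case no default catalog exists in the database
CREATE FULLTEXT INDEX ON dbo.Documents
    (DocumentSummary LANGUAGE 1033)    -- column to index; 1033 = English
    KEY INDEX PK_Documents             -- unique, single-column, NOT NULL key index
    ON SampleCatalog
    WITH CHANGE_TRACKING = AUTO;       -- propagate changes automatically (the default)
```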

You can use the FULLTEXTCATALOGPROPERTY function to determine the population status of the full-text catalog. If a population is in progress, a value of 1 is returned
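For example (SampleCatalog is a hypothetical catalog name):

```sql
-- Returns 1 while a population of the catalog is in progress, 0 when idle
SELECT FULLTEXTCATALOGPROPERTY('SampleCatalog', 'PopulateStatus');
```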

To view what columns are included in a full-text index, you can use the sys.fulltext_index_columns catalog view

You can view the available languages for a particular SQL Server instance by looking into the sys.fulltext_languages catalog view

If a full-text query does not return the expected result set, you can use the sys.dm_fts_parser DMF to view the final tokenization result from the query
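A sketch of such a diagnostic query; the parameters are the query string, the LCID, the stoplist ID (0 for no stoplist), and the accent-sensitivity flag:

```sql
-- Show how the word breaker and stemmer tokenize a search condition,
-- here the inflectional forms of "run" in English (LCID 1033)
SELECT *
FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "run")', 1033, 0, 0);
```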

CONTAINS vs. FREETEXT

o The CONTAINS predicate allows you to write queries to return exact matches to your input as well as fuzzy (less precise) matches to the input term

o The FREETEXT predicate allows you to write queries that return values that match the meaning of the search condition, not simply the exact words or synonyms of the search condition
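The contrast can be sketched with two queries (dbo.SampleDocs and its Description column are hypothetical):

```sql
-- CONTAINS: precise and fuzzy matches, here all inflectional forms of "ride"
SELECT DocId, Title
FROM dbo.SampleDocs
WHERE CONTAINS(Description, 'FORMSOF(INFLECTIONAL, ride)');

-- FREETEXT: matches on the meaning of the phrase, not the exact words
SELECT DocId, Title
FROM dbo.SampleDocs
WHERE FREETEXT(Description, 'comfortable mountain bicycle');
```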

Managing a stoplist

o Create the stoplist: CREATE FULLTEXT STOPLIST

o Add or remove stopwords to the stop list: ALTER FULLTEXT STOPLIST

o Drop the stoplist: DROP FULLTEXT STOPLIST

o The system stoplist is installed in the Resource database.
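The stoplist lifecycle above might look like this (SampleStoplist and the stopword are hypothetical):

```sql
-- Create a stoplist seeded from the system stoplist in the Resource database
CREATE FULLTEXT STOPLIST SampleStoplist FROM SYSTEM STOPLIST;

-- Add and remove a stopword for English (LCID 1033)
ALTER FULLTEXT STOPLIST SampleStoplist ADD 'sample' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST SampleStoplist DROP 'sample' LANGUAGE 1033;

-- Drop the stoplist when it is no longer needed
DROP FULLTEXT STOPLIST SampleStoplist;
```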

Thesaurus

o Thesaurus matching occurs only for CONTAINS and CONTAINSTABLE queries that specify the FORMSOF(THESAURUS, …) clause and for FREETEXT and FREETEXTTABLE queries

o A thesaurus is built by editing an XML file in a specific format. The XML file contains <expansion> and <replacement> sections, which form the foundation of thesaurus matching.

o Once the thesaurus file is finalized, use sp_fulltext_load_thesaurus_file to load the content of the thesaurus file into tempdb
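A minimal sketch of such a thesaurus file, following the documented layout (the terms themselves are illustrative): an <expansion> makes its <sub> terms interchangeable, while a <replacement> substitutes the <sub> term for the <pat> pattern:

```xml
<XML ID="Microsoft Search Thesaurus">
  <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
    <expansion>
      <sub>Internet Explorer</sub>
      <sub>IE</sub>
    </expansion>
    <replacement>
      <pat>NT5</pat>
      <sub>Windows 2000</sub>
    </replacement>
  </thesaurus>
</XML>
```

Once edited, the file is loaded for a given language with, for example, EXEC sys.sp_fulltext_load_thesaurus_file 1033;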

SQL Server Service Broker

Service Broker Concepts

o A Service Broker solution is made of messages, message types, queues, services, and contracts

o Conversations are made of messages. A message can belong to one and only one conversation and is made of a specific message type.

o A conversation between two specific Service Broker services is called a dialog


Service definition components

o Queues are tables where messages are stored until they are processed

o Contracts are agreements between two services that define the message types the services send to accomplish certain tasks

o Message types are objects that define the name and contents of a message. Every database contains a message type named DEFAULT that uses a validation of NONE

o Contracts are associated with services to manage incoming conversations

SQL Server 2008 provides the following new Service Broker features: broker priorities, the Ssbdiagnose utility, and System Monitor objects and counters

To start a conversation and send messages, an application for the initiating service issues the BEGIN DIALOG statement

Activation methods: start when SQL Server starts, schedule the application to run at specific times, or start automatically when there is work for the program to do (internal activation, activated by stored procedures, and event-based activation, activated by the SQL Server QUEUE_ACTIVATION event)

Before you can build your Service Broker applications, you need to be able to create and configure the individual components. Once you have configured the service definition components and the network and security components (part one), you configure your applications to initiate dialogs (part two).

If a contract is not defined in the CREATE SERVICE command, the service can only initiate conversations and cannot be the target of one

SQL Statements used for service broker (in the order they should be called) (part one)

o CREATE MESSAGE TYPE SampleType

o CREATE QUEUE SampleQueue

o CREATE CONTRACT SampleContract (SampleType SENT BY INITIATOR, SampleType2 SENT BY ANY)

o CREATE SERVICE SampleService ON QUEUE SampleQueue (SampleContract);

o CREATE BROKER PRIORITY SampleBrokerPriority FOR CONVERSATION SET (CONTRACT_NAME = SampleContract, LOCAL_SERVICE_NAME = SampleService, REMOTE_SERVICE_NAME = 'SampleService2', PRIORITY_LEVEL = 2);

You use the BEGIN DIALOG, SEND, and RECEIVE commands to control individual conversations between two services. These are the SQL Statements used for service broker (in the order they should be called) (part two)

o DECLARE @dialog_handle uniqueidentifier

BEGIN DIALOG @dialog_handle FROM SERVICE SampleService TO SERVICE 'SampleService2'

o SEND ON CONVERSATION @dialog_handle ( message_body_expression );

o RECEIVE TOP (1) @dialog_handle = conversation_handle FROM SampleQueue;
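Putting the part-two statements together, one conversation round trip might be sketched as follows, reusing the sample names above; note that BEGIN DIALOG can also name the contract explicitly with an ON CONTRACT clause, and that END CONVERSATION closes the dialog:

```sql
DECLARE @dialog_handle uniqueidentifier;

-- Open a dialog from the initiating service to the target service
BEGIN DIALOG @dialog_handle
    FROM SERVICE SampleService
    TO SERVICE 'SampleService2'
    ON CONTRACT SampleContract;

-- Send a message of a type the contract allows the initiator to send
SEND ON CONVERSATION @dialog_handle
    MESSAGE TYPE SampleType ('<sample>Hello</sample>');

-- On the target side, pull the next message off the queue
RECEIVE TOP (1)
    @dialog_handle = conversation_handle,
    CAST(message_body AS nvarchar(max))
FROM SampleQueue;

-- End the conversation when the exchange is complete
END CONVERSATION @dialog_handle;
```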


Service Broker messages between SQL Server instances (different database, different server). In this case, there is a lot more configuration needed (endpoints, master keys, certificates, users, routes and bindings)

o A Service Broker endpoint establishes the network address to which Service Broker messages are sent

o By default, Service Broker cannot communicate on the network because there are no Service Broker endpoints unless you configure them.

o In this case, use this statement to create an endpoint (on both the initiator and target instances): CREATE ENDPOINT InstTargetEndpoint STATE = STARTED AS TCP (LISTENER_PORT = 4022) FOR SERVICE_BROKER (AUTHENTICATION = WINDOWS);

o You also need (both on the initiator and target databases) a master key (used to protect private keys of certificates). Use the CREATE MASTER KEY T-SQL statement for that purpose

o Also, you need to create (in both the initiator and target databases) a user to support encryption and remote connections; this user will also own the certificates

o Finally, you also need to create and back up (in both the initiator and target databases) a certificate:

CREATE CERTIFICATE Inst[Target|Initiator]Certificate AUTHORIZATION [Target|Initiator]User WITH SUBJECT = '[Target|Initiator] Certificate', EXPIRY_DATE = N'12/31/2010';

BACKUP CERTIFICATE Inst[Target|Initiator]Certificate TO FILE = N'C:\storedcerts\SampleSSBCerts\Inst[Target|Initiator]Certificate.cer';

o Two more things: in the initiator database, you need to create the TargetUser (and vice versa, the InitiatorUser in the target database), and you also need to create the target certificate in the initiator database (and vice versa), this time using the FROM FILE clause (pointing, on the initiator, to the .cer file created on the target, and vice versa)

o Then, on the target, create routes (CREATE ROUTE) to the target service and initiator service and a remote service binding (CREATE REMOTE SERVICE BINDING) that associates the TargetUser with the target service route. Do the same (but inverse) on the initiator.
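The route and binding step might be sketched like this for the initiator side (the address, route, and binding names are hypothetical; the mirror-image objects are created on the target):

```sql
-- Route messages addressed to the remote target service to its endpoint
CREATE ROUTE InstTargetRoute
    WITH SERVICE_NAME = 'SampleService2',
    ADDRESS = 'TCP://TargetServer:4022';

-- Associate the target's user (owner of its certificate) with the remote service
CREATE REMOTE SERVICE BINDING TargetBinding
    TO SERVICE 'SampleService2'
    WITH USER = TargetUser;
```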


Chapter 9: An Introduction to Microsoft SQL Server Manageability Features

You can use the features of Database Mail to integrate e-mail messages into your applications

The sp_send_dbmail system stored procedure is used to send e-mail messages from the SQL Server service to e-mail recipients
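A minimal call might look like this ('SampleProfile' and the recipient address are hypothetical; the procedure lives in msdb):

```sql
-- Send a plain-text message through an existing Database Mail profile
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = 'SampleProfile',
    @recipients   = 'dba@example.com',
    @subject      = 'Nightly job status',
    @body         = 'The nightly load completed successfully.';
```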

One of the primary benefits of Database Mail (over SQL Mail) is that it communicates by using the SMTP protocol and does not require an Extended MAPI-compliant e-mail application, such as Microsoft Office Outlook, on the server running SQL Server. In addition, Database Mail provides fault tolerance by supporting multiple SMTP servers and multiple SMTP user accounts and profiles

You can invoke Sqlcmd through SQL Server PowerShell, allowing you to execute any valid SQL commands from within the Windows PowerShell environment.

sqlps is a utility to enable SQL Server functionality within the PowerShell environment

The root node for SQL Server is the SQLSERVER: drive. Under the SQLSERVER: drive, the SQL Server PowerShell provider implements the following three folders:

o SQLSERVER:\SQL Contains database objects, such as databases, tables, views, and stored procedures

o SQLSERVER:\SQLPolicy Contains policy-based management objects, such as policies and facets

o SQLSERVER:\SQLRegistration Contains registered server objects, such as server groups and registered servers

PowerShell commands are known as cmdlets

The cmdlets implemented in SQL Server 2008 and their functions are as follows:

o Get-Location Returns the current node name.

o Set-Location Changes the current node.

o Get-ChildItem Lists the objects stored at the current node.

o Get-Item Returns the properties of the current item.

o Move-Item Moves an item.

o Rename-Item Renames an object.

o Remove-Item Deletes an object.

In addition to the cmdlets listed previously, the SQL Server PowerShell provider includes other built-in cmdlets:

Get-Help Provides help information about each cmdlet

Invoke-Sqlcmd Runs a Transact-SQL (T-SQL) or XQuery script

Tracking Data Changes

In SQL Server 2008, Microsoft introduced change tracking and change data capture (CDC) to help answer questions about data that changes in a database. In addition, SQL Server 2008 Enterprise Edition includes SQL Server Audit, which provides automatic auditing of a SQL Server instance

Change Tracking

o Provides functionality with DML statements.

o Can answer questions such as:

What rows in the table have changed?


What columns have changed?

Has a particular row been updated?

Did an INSERT, UPDATE, or DELETE occur?

o Operates synchronously to provide change information immediately.

o Provides a lower storage overhead than CDC.

o Provides a built-in cleanup mechanism.

o Uses the transaction commit time to determine the order of the changes.

o Works without requiring schema changes to the table or additional triggers.

o Must be enabled at the database level by using ALTER DATABASE

CDC

o Provides functionality with DML statements.

o Can answer the same questions as change tracking, as well as the following ones:

What were the intermediate changes made to the data since the last synchronization?

How many times has a row been updated since the last synchronization?

o Uses change tables to record modified column data and metadata that is required to apply changes to a target environment. The column structure of the change table mirrors the structure of the source table.

o Uses the transaction log as input for the information added to the change tables.

o Operates asynchronously and changes are available only after the DML statement has completed.

o Provides table-valued functions to allow access to the data in the change tables.

o Requires the database to be enabled by using sys.sp_cdc_enable_db.

o Enable CDC on a table by using the sys.sp_cdc_enable_table stored procedure

o Requires SQL Server 2008 Enterprise, Developer, or Evaluation Edition.

Change Tracking configuration options for the database

o Change Tracking Set to True or False (the default) to enable or disable change tracking.

o Retention Period Set to a numeric value that represents the minimum amount of time that changes are maintained. The default retention period is 2 (days, given the default units).

o Retention Period Units Set to Days (the default), Hours, or Minutes.

o Auto CleanUp Set to ON (the default) or OFF to enable or disable the process that removes outdated change tracking information.

Sample T-SQL for change tracking:

ALTER DATABASE AdventureWorksDW2008 SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON)

ALTER TABLE {SAMPLE_TABLE} ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

To determine which tables in the AdventureWorksDW2008 database have change tracking enabled, you can query the sys.change_tracking_tables catalog view

To see which databases have change tracking enabled, query sys.change_tracking_databases


The following functions allow you to query change information and manage the change tracking environment

o CHANGETABLE

o CHANGE_TRACKING_MIN_VALID_VERSION

o CHANGE_TRACKING_CURRENT_VERSION

o CHANGE_TRACKING_IS_COLUMN_IN_MASK

o WITH CHANGE_TRACKING_CONTEXT
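A sketch of how the first three functions combine in a synchronization query (dbo.SampleTable, its PK_Column, and the saved version value are hypothetical):

```sql
DECLARE @last_sync_version bigint = 0;  -- version saved at the previous synchronization

-- Guard against the cleanup process having discarded the versions we need
IF @last_sync_version >= CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('dbo.SampleTable'))
BEGIN
    -- Rows changed since the last synchronization, with the operation type
    SELECT CT.PK_Column, CT.SYS_CHANGE_OPERATION, CT.SYS_CHANGE_VERSION
    FROM CHANGETABLE(CHANGES dbo.SampleTable, @last_sync_version) AS CT;
END;

-- Save this value to use as @last_sync_version next time
SELECT CHANGE_TRACKING_CURRENT_VERSION();
```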

When you enable CDC for a database, the system sets the is_cdc_enabled column in the sys.databases catalog view to 1 and creates the system objects that CDC requires. These objects include the cdc schema, the cdc database user account, and the tables, jobs, stored procedures, and functions that the CDC process requires

You can execute sys.sp_cdc_help_change_data_capture to obtain information about all tables currently tracked by CDC

You use either the cdc.fn_cdc_get_all_changes_<capture_instance> or the cdc.fn_cdc_get_net_changes_<capture_instance> function to query data in the change table

To map a time interval to a change data capture query range:

SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @begin_time);
SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
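Combining the LSN mapping with a change-table query might look like this (the capture instance dbo_SampleTable and the time window are hypothetical; the instance name is generated when sys.sp_cdc_enable_table is run):

```sql
DECLARE @begin_time datetime = '2008-01-01',
        @end_time   datetime = '2008-01-02',
        @from_lsn   binary(10),
        @to_lsn     binary(10);

-- Translate the time window into an LSN range
SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @begin_time);
SET @to_lsn   = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);

-- 'all' returns every intermediate change recorded in the range
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_SampleTable(@from_lsn, @to_lsn, 'all');
```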

To use SQL Server Audit, you must create and configure at least one SQL Server Audit object for each instance where auditing occurs (in SSMS, Instance > Security > Audits > New Audit…)

After creating the instance-level audit object, you can create an audit specification at the server or database level by using T-SQL or by using Object Explorer in SSMS (Instance > Databases > [database name] > Security)

The information provided by this event class reports the query syntax that was issued but does not maintain before-and-after values of updated or deleted data

Before the Database Audit Specification begins to collect information, you must enable it by using SSMS or T-SQL
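The whole path, from the instance-level audit object to an enabled database specification, might be sketched as follows (the audit, specification, table names, and file path are hypothetical):

```sql
-- Instance-level audit object writing events to a file target
CREATE SERVER AUDIT SampleAudit
    TO FILE (FILEPATH = 'C:\AuditLogs\');
ALTER SERVER AUDIT SampleAudit WITH (STATE = ON);

-- Database-level specification: record SELECTs on dbo.SampleTable by any user,
-- enabled immediately so it starts collecting information
CREATE DATABASE AUDIT SPECIFICATION SampleDbAuditSpec
    FOR SERVER AUDIT SampleAudit
    ADD (SELECT ON dbo.SampleTable BY public)
    WITH (STATE = ON);
```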