Interview Topics on SQL
8/9/2019

Interview Topics for SQL & MSBI
Author: Vinay Kotha, CSC
11/5/2009
Table of Contents

Recovery Models
    Simple Recovery Model
    Full Recovery Model
    Bulk-Logged
Back-ups
    Back-up Scopes
        A) Database Backups
        B) Partial Back-ups
        C) File Back-ups
Back-Up Types
    A) Full Backups
    B) Differential Backups
SQL SERVER REPLICATION
    A) Load Balancing
    B) Offline Processing
    C) Redundancy
    A) Publishers
    B) Subscribers
    A) Snapshot Replication
    B) Transactional Replication
    C) Merge Replication
    A) Express Edition
    B) Workgroup Edition
    C) Standard Edition
    D) Enterprise Edition
Difference between Temp Tables and Table Variables in SQL Server
    Suggestions for choosing between these two
Stored Procedures
    Advantages of Stored Procedures
Differences between User-Defined Functions and Stored Procedures
SSAS
    Different Dimension Types by Microsoft Available in Analysis Services
Different Types of Dimensions
    Conformed Dimension
    Junk Dimension
    Degenerate Dimension
    Slowly Changing Dimensions
    There are 10 types of dimension tables
Differences between Analysis Services 2005 and 2008
Define temporary and extended stored procedures
Differences between SSRS 2005 and SSRS 2008
Performance Tuning of SSRS: Handling a Large Workload
    Steps to Improve Performance
        Control the Size of Your Reports
        Use Cache Execution
        Configure and Schedule Your Reports
        Deliver Rendered Reports for Non-browser Formats
        Populate the Report Cache by Using Data-Driven Subscriptions for Parameterized Reports
        Back to Report Catalogs
        Tuning with Web Service
    Memory Limits in SQL Server Reporting Services 2008
        Memory Limit
        Maximum Memory Limit
Performance Tuning of SQL Server
    Section A
    Microsoft Tips on Performance Tuning
        Not knowing the performance and scalability characteristics of your system
        Retrieving too much data
        Misuse of transactions
        Misuse of indexes
        Mixing OLTP, OLAP and reporting workloads
        Inefficient schemas
        Using an inefficient disk sub-system
SSIS 10 Best Practices
SSIS Performance Tuning
    Data Flow Optimization Modes
    Buffers
    Buffer Sizing
    Buffer Tuning
    Parallelism
    Extraction Tuning
    Transformation Tuning
    Merge-Join Transformation
    Slowly Changing Dimensions
    Data Types
    Miscellaneous
    Load Tuning
Differences between SSIS 2005 and SSIS 2008
    Look-up
    Cache Transformation
    Data Profiling Task
    Script Task and Transformation
Recovery Models:
There are three recovery models in SQL Server:
1) Simple
2) Full
3) Bulk-Logged

Simple Recovery Model: The simple recovery model allows you to recover data only to the most recent full database or differential back-up. Transaction log back-ups are not available because the contents of the transaction log are truncated each time a checkpoint is issued for the database.

Or
The simple recovery model is just that: simple. In this approach, SQL Server maintains only a minimal amount of information in the transaction log. SQL Server truncates the transaction log each time the database reaches a transaction checkpoint, leaving no log entries for disaster-recovery purposes.

In databases using the simple recovery model, you may restore full or differential back-ups only. It is not possible to restore such a database to a given point in time; you may only restore it to the exact time when a full or differential back-up occurred. Therefore, you will automatically lose any data modifications made between the time of the most recent full/differential back-up and the time of failure.
Full Recovery Model: The full recovery model uses database back-ups and transaction log back-ups to provide complete protection against failure. Along with being able to restore a full or differential back-up, you can recover the database to the point of failure or to a specific point in time. All operations, including bulk operations such as SELECT INTO, CREATE INDEX and bulk-loading data, are fully logged and recoverable.

Or

The full recovery model also bears a self-descriptive name. In this model, SQL Server preserves the transaction log until you back it up. This allows you to design a disaster-recovery plan using database back-ups in conjunction with transaction log back-ups.

In the event of a database failure, you have the most flexibility restoring databases using the full recovery model. In addition to preserving data modifications stored in the transaction log, the full recovery model allows you to restore a database to a specific point in time. For example, if an erroneous modification corrupted your data at 2:36 AM on Monday, you could use SQL Server's point-in-time restore to roll your database back to 2:35 AM, wiping out the effects of the erroneous modification.

Bulk-Logged: The bulk-logged recovery model provides protection against failure combined with the best performance. In order to get better performance, the following operations are minimally logged and not fully recoverable: SELECT INTO and bulk-load operations.

Or

The bulk-logged recovery model is a special-purpose model that works in a similar manner to the full recovery model. The only difference is in the way it handles bulk data modification operations. The bulk-logged model records these operations in the transaction log using a technique known as minimal logging. This saves significantly on processing time but prevents you from using the point-in-time restore option.

Microsoft recommends that the bulk-logged recovery model be used only for short periods of time. Best practice dictates that you switch a database to the bulk-logged recovery model immediately before conducting bulk operations and return it to the full recovery model when those operations complete.
Back-ups
One of the major advantages that enterprise-class databases offer over their desktop counterparts is a robust back-up and recovery feature set. Microsoft SQL Server provides database administrators with the ability to customize a database backup and recovery plan to the business and technical requirements of an organization.

In this article, we explore the process of backing up data with Microsoft SQL Server. When you create a backup plan, you will need to create an appropriate mix of backups with varying backup scopes and backup types that meet the recovery objectives of your organization and are suitable for your technical environment.

Back-up Scopes: The scope of a back-up defines the portion of the database covered by the backup. It defines the database, file and/or file-group that SQL Server will back up. There are three different types of back-up scope available in Microsoft SQL Server:
A) Database backups: These cover the entire database, including all structural schema information, the entire data contents of the database, and any portion of the transaction log necessary to restore the database from scratch to its state at the time of the backup. Database backups are the simplest way to restore your data in the event of a disaster, but they consume a large amount of disk space and time to complete.

B) Partial back-ups: These are good alternatives to database back-ups for very large databases that contain significant quantities of read-only data. If you have read-only file-groups in your database, it probably doesn't make sense to back them up frequently, as they do not change. Therefore, the scope of a partial back-up includes all files in the primary file-group, all read/write file-groups, and any read-only file-groups that you explicitly specify.

C) File back-ups: These allow you to individually back up files and/or file-groups from your database. They may be used to complement partial back-ups by creating one-time-only backups of your read-only file-groups. They may also play a role in complex back-up models.
Back-Up Types
The second decision you need to make when planning a SQL Server database backup model is the type of each backup included in your plan. The backup type describes the temporal coverage of the database backup. SQL Server supports two different back-up types:

A) Full Backups: A full backup includes all data within the backup scope. For example, a full database backup will include all data in the database, regardless of when it was last created or modified. Similarly, a full partial backup will include the entire contents of every file and file-group within the scope of the partial backup.

B) Differential Backups: A differential backup includes only the portion of data that has changed since the last full backup. For example, if you perform a full database backup on Monday morning and then perform a differential backup on Monday evening, the differential backup will be a much smaller file that takes much less time to create, because it includes only the data changed during the day on Monday.

You should keep in mind that the scope and type of a backup are two independent decisions made when creating your backup plan. As described above, each type and scope allows you to customize the amount of data included in the backup and, therefore, the amount of time required to back up and restore the database in the event of a disaster.
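The scope and type choices above map directly onto options of the BACKUP statement. A minimal weekly-full-plus-nightly-differential sketch (database name and file paths are placeholders):

```sql
-- Monday morning: full database backup (scope = database, type = full).
BACKUP DATABASE AdventureWorks
TO DISK = 'C:\Backups\AdventureWorks_full.bak'
WITH INIT;

-- Monday evening: differential backup; captures only the extents
-- changed since the last full backup.
BACKUP DATABASE AdventureWorks
TO DISK = 'C:\Backups\AdventureWorks_diff.bak'
WITH DIFFERENTIAL;

-- Partial scope: the primary file-group plus all read/write file-groups,
-- skipping read-only file-groups that rarely change.
BACKUP DATABASE AdventureWorks READ_WRITE_FILEGROUPS
TO DISK = 'C:\Backups\AdventureWorks_partial.bak';
```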
SQL SERVER REPLICATION
SQL Server replication allows database administrators to distribute data to various servers throughout an organization. You may wish to implement replication in your organization for a number of reasons, such as:

A) Load Balancing: Replication allows you to disseminate your data to a number of servers and then distribute the query load among those servers.

B) Offline Processing: You may wish to manipulate data from your database on a machine that is not always connected to the network.

C) Redundancy: Replication allows you to build a fail-over database server that's ready to pick up the processing load at a moment's notice.

In any replication scenario there are two main components:

A) Publishers have data to offer to the other servers. Any given replication scheme may have one or more publishers.

B) Subscribers are database servers that wish to receive updates from the publisher when the data is modified.

There's nothing preventing a single system from acting in both of these capacities; in fact, this is often done in large-scale distributed database systems. Microsoft SQL Server supports three types of database replication:

A) Snapshot Replication: It acts in the manner its name implies. The publisher simply takes a snapshot of the entire replicated database and shares it with the subscribers. Of course, this is a very time- and resource-intensive process. For this reason, most administrators don't use snapshot replication on a recurring basis for databases that change frequently. There are two scenarios where snapshot replication is commonly used. First, it is used for databases that rarely change. Second, it is used to set a baseline to establish replication between systems, while future updates are propagated using transactional or merge replication.

B) Transactional Replication: This offers a more flexible solution for databases that change on a regular basis. With transactional replication, the replication agent monitors the publisher for changes to the database and transmits those changes to the subscribers. This transmission can take place immediately or on a periodic basis.
C) Merge Replication: It allows the publisher and subscriber to independently make changes to the database. Both entities can work without an active network connection. When they are reconnected, the merge replication agent checks for changes on both sets of data and modifies each database accordingly. If changes conflict with each other, it uses a predefined conflict-resolution algorithm to determine the appropriate data. Merge replication is commonly used by laptop users and others who cannot be constantly connected to the publisher.

Each one of these replication techniques serves a useful purpose and is well-suited to particular database scenarios.

If you are working with SQL Server 2005, you'll need to choose your edition based upon your replication needs. Each edition has differing capabilities:

A) Express edition has extremely limited replication capabilities. It is able to act as a replication client only.

B) Workgroup edition adds limited publishing capabilities. It is able to serve five clients using transactional replication and up to 25 clients using merge replication. It can also act as a replication client.

C) Standard edition has full, unlimited replication capabilities with other SQL Server databases.

D) Enterprise edition adds a powerful tool for those operating in a mixed database environment: it's capable of replication with Oracle databases.

As you have undoubtedly recognized by this point, SQL Server's replication capabilities offer database administrators a powerful tool for managing and scaling databases in an enterprise environment.
Difference between Temp Tables and Table Variables in SQL Server
1) Transaction logs are not recorded for table variables, so they are transactionally neutral; you can say that they are out of scope of the transaction mechanism. Temp tables, by contrast, participate in transactions just like normal tables.

2) Table variables cannot be altered, meaning no DDL action is allowed on them, whereas temp tables can be altered.

3) Stored procedures with a temporary table cannot be pre-compiled, while an execution plan of procedures with table variables can be statically compiled in advance. Pre-compiling a script gives a major advantage to its speed of execution. This advantage can be dramatic for long procedures, where recompilation can be too pricey.

4) Unlike temp tables, table variables are memory resident, but not always. Under memory pressure, the pages belonging to a table variable can be pushed out to tempdb.

5) There can be big performance differences between using table variables and temporary tables. In most cases, temporary tables are faster than table variables. Although queries using table variables didn't generate parallel query plans on a large SMP box, similar queries using temporary tables (local or global) running under the same circumstances did generate parallel plans.

6) Table variables use internal metadata in a way that prevents the engine from using a table variable with a parallel query. SQL Server maintains statistics for queries that use temporary tables but not for queries that use table variables. Without statistics, SQL Server might choose a poor processing plan for a query that contains a table variable. No statistics are maintained on a table variable, which means that any changes in data impacting the table variable will not cause recompilation of queries accessing it. Queries involving table variables don't generate parallel plans.
Suggestions for choosing between these two:
1) Use a table variable where you want to pass a table to an SP as a parameter, because there is no other choice.

2) It's found that table variables are slower in SQL Server 2005 than in 2000 on similar data and circumstances, so if you have used table variables extensively in your database and are planning to migrate from 2000 to 2005, make your choice carefully.

3) Table variables are OK if used in small queries and for processing small amounts of data; otherwise go for temp tables.

4) If you are using very complex business logic in your SP, it's better to use temp tables than table variables.
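The two constructs being compared above look like this side by side (a minimal sketch; table and column names are invented):

```sql
-- Temporary table: lives in tempdb, supports DDL, statistics, transactions.
CREATE TABLE #Orders (OrderID int PRIMARY KEY, Amount money);
INSERT INTO #Orders VALUES (1, 19.99);
CREATE INDEX IX_Amount ON #Orders (Amount);  -- DDL is allowed (point 2)
DROP TABLE #Orders;

-- Table variable: no statistics, no DDL after declaration,
-- and not rolled back by a surrounding transaction (point 1).
DECLARE @Orders TABLE (OrderID int PRIMARY KEY, Amount money);
INSERT INTO @Orders VALUES (1, 19.99);
SELECT OrderID, Amount FROM @Orders;
```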
Stored Procedures
A stored procedure is a group of SQL statements that form a logical unit and perform a particular task. Stored procedures are used to encapsulate a set of operations or queries to execute on a database server. For example, operations on an employee database (hire, fire, promote, lookup) could be coded as stored procedures executed by application code. Stored procedures can be compiled and executed with different parameters and results, and they may have any combination of input, output, and input/output parameters.

Advantages of Stored Procedures:
A) Precompiled execution: SQL Server compiles each stored procedure once and then reutilizes the execution plan. This results in tremendous performance boosts when stored procedures are called repeatedly.

B) Reduced client/server traffic: If network bandwidth is a concern in your environment, you'll be happy to learn that stored procedures can reduce long SQL queries to a single line that is transmitted over the wire.

C) Efficient re-use of code and programming abstraction: Stored procedures can be used by multiple users and client programs. If you utilize them in a planned manner, you'll find the development cycle takes less time.

D) Enhanced security controls: You can grant users permission to execute a stored procedure independently of underlying table permissions.
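Points A) and D) can be sketched together: a procedure for the employee example, plus an EXECUTE grant that bypasses table permissions (all object, column, and role names here are invented for illustration):

```sql
-- Encapsulate the "promote" operation from the employee example.
CREATE PROCEDURE dbo.PromoteEmployee
    @EmployeeID int,
    @NewTitle   nvarchar(50)
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE dbo.Employees
    SET Title = @NewTitle
    WHERE EmployeeID = @EmployeeID;
END;
GO

-- Callers need only EXECUTE on the procedure,
-- not UPDATE permission on dbo.Employees itself.
GRANT EXECUTE ON dbo.PromoteEmployee TO HRClerks;

EXEC dbo.PromoteEmployee @EmployeeID = 42, @NewTitle = N'Senior Analyst';
```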
Differences between User-Defined Functions and Stored Procedures
Stored procedures are very similar to user-defined functions, but there are subtle differences. Both allow you to create bundles of SQL statements that are stored on the server for future use. This offers you a tremendous efficiency benefit, as you can save programming by:

A) Reusing code from one program to another, cutting down on program development time
B) Hiding the SQL details, allowing database developers to worry about SQL and application developers to deal only in higher-level languages
C) Centralizing maintenance, allowing you to make business-logic changes in a single place that automatically affect all dependent applications

At first glance, functions and stored procedures seem identical. However, there are several subtle, yet important differences between the two:

A) Stored procedures are called independently, using the EXEC command, while functions are called from within another SQL statement.

B) Stored procedures allow you to enhance application security by granting users and applications permission to use stored procedures, rather than permission to access the underlying tables. Stored procedures provide the ability to restrict user actions at a much more granular level than standard SQL Server permissions. For example, if you have an inventory table that cashiers must update each time an item is sold (to decrement the inventory for that item by 1 unit), you can grant cashiers permission to use a "decrement item" stored procedure, rather than allowing them to make arbitrary changes to the inventory table.

C) Functions must always return a value (either a scalar value or a table). Stored procedures may return a scalar value, a table value, or nothing at all.

Overall, stored procedures are one of the greatest treasures available to SQL Server developers. The efficiency and security benefits are well worth the upfront investment in time.
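The calling-convention difference in A) and the return-value rule in C) can be sketched as follows (function, table, and procedure names are invented):

```sql
-- A scalar function: must return a value and is callable inside a SELECT.
CREATE FUNCTION dbo.NetPrice (@Price money, @TaxRate decimal(4, 2))
RETURNS money
AS
BEGIN
    RETURN @Price * (1 + @TaxRate);
END;
GO

-- Used from within another SQL statement:
SELECT ProductID, dbo.NetPrice(ListPrice, 0.08) AS PriceWithTax
FROM dbo.Products;

-- A procedure, by contrast, is invoked independently with EXEC and is
-- free to return nothing at all (dbo.ArchiveOldOrders is hypothetical).
EXEC dbo.ArchiveOldOrders;
```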
SSAS
Different Dimension Types by Microsoft Available in Analysis Services
1) Regular
2) Time
3) Organization
4) Geography
5) Bill of Materials
6) Accounts
7) Customers
8) Products
9) Scenario
10) Quantitative
11) Utility
12) Currency
13) Rates
14) Channel
15) Promotion

Regular: A dimension whose type has not been set to a special dimension type.

Time: A dimension whose attributes represent time periods, such as years, semesters, quarters, months and days.

Organization: A dimension whose attributes represent organizational information, such as employees or subsidiaries.

Geography: A dimension whose attributes represent geographic information, such as cities or postal codes.

Bill of Materials: A dimension whose attributes represent inventory or manufacturing information, such as parts lists for products.

Accounts: A dimension whose attributes represent a chart of accounts for financial reporting purposes.

Customers: A dimension whose attributes represent customer or contact information.

Products: A dimension whose attributes represent product information.

Scenario: A dimension whose attributes represent planning or strategic analysis information.

Quantitative: A dimension whose attributes represent quantitative information.

Utility: A dimension whose attributes represent miscellaneous information.

Currency: A dimension whose attributes represent currency information.

Rates: A dimension whose attributes represent currency rate information.

Channel: A dimension whose attributes represent channel information.

Promotion: A dimension whose attributes represent marketing promotion information.
Different Types of Dimensions
1) Conformed Dimension
2) Junk Dimension
3) Degenerate Dimension
4) Slowly Changing Dimensions

Conformed Dimension: These dimensions are built once in your model and can be reused multiple times with different fact tables. For example, consider a model containing multiple fact tables representing different data marts. Now look for a dimension that is common to these fact tables. In this example, let's consider that the product dimension is common and hence can be reused by creating shortcuts and joining it to the different fact tables. Some examples are the time dimension, customer dimension and product dimension.

Junk Dimension: Instead of having hundreds of small dimensions that each hold only a few records, cluttering your database with mini identifier tables, you consolidate them: all records from all these small dimension tables are loaded into ONE dimension table, and we call this a JUNK dimension table (since we are storing all the "junk" in this one table). For example, a company might have a handful of manufacturing plants, a handful of order types, and so on, and we can consolidate them into one dimension table called a junk dimension table.

Degenerate Dimension: An item that is in the fact table but is stripped of its description, because the description belongs in a dimension table, is referred to as a degenerate dimension. Since it looks like a dimension but really lives in the fact table and has been "degenerated" of its description, it is called a degenerate dimension.

Slowly Changing Dimensions: These are dimensions where the key value remains static but the description might change over a period of time.

There are 10 types of dimension tables (this is not the case in most instances):
1) Primary Dimensions
2) Secondary Dimensions
3) Degenerate Dimensions
4) Conformed Dimensions
5) Slowly Changing Dimensions
6) Rapidly Changing Dimensions
7) Large Dimensions
8) Rapidly Changing Monster Dimensions
9) Junk Dimensions
10) Role-Playing Dimensions
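The slowly-changing-dimension behavior described above is commonly implemented as Type 1 (overwrite the description) or Type 2 (keep history by adding a new row with validity dates). A minimal Type 2 sketch, with all table and column names invented:

```sql
-- Type 2 SCD: the business key stays static; each description change
-- closes the current row and opens a new one.
CREATE TABLE dbo.DimCustomer (
    CustomerSK int IDENTITY PRIMARY KEY,  -- surrogate key
    CustomerID int,                       -- static business key
    City       nvarchar(50),              -- attribute that may change
    ValidFrom  date,
    ValidTo    date NULL                  -- NULL = current row
);

-- Customer 7 moves from Austin to Denver: close the old row...
UPDATE dbo.DimCustomer
SET ValidTo = '2009-11-05'
WHERE CustomerID = 7 AND ValidTo IS NULL;

-- ...and insert the new current row.
INSERT INTO dbo.DimCustomer (CustomerID, City, ValidFrom, ValidTo)
VALUES (7, N'Denver', '2009-11-05', NULL);
```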
Differences between Analysis Services 2005 and 2008
A) Real-time best-practice design warnings. These warnings are implemented in AMO, exposed in the UI via blue squiggly lines, and can be dismissed individually (a single occurrence) or turned off altogether. To disable/re-enable, build the project and then, in the warning window, select the warning message and right-click to choose disable or enable.

B) New Dimension Design Wizard
C) New Cube Design Wizard
D) Attribute relationship tab in the dimension designer. Makes attribute relationships easier to define and understand.
E) CREATE MEMBER syntax extensions to support defining the caption, display folders, and associated measure group.
F) CREATE SET syntax extensions to support defining the caption and display folders, as well as the ability to define dynamic named sets.
G) CREATE KPI command is added.
H) Backup performance improvements. In SSAS 2005, backup time for big databases grew
exponentially. In SSAS 2008, backup time grows linearly, and the redesigned backup storage removes
backup size limits.
I) Write-back to MOLAP. Analysis Services 2008 removes the requirement to query ROLAP partitions when performing write-backs, which results in huge performance gains.
J) Scale-out Analysis Services. A single read-only copy of an Analysis Services database can be shared between many Analysis Services instances through a virtual IP address. This creates a highly scalable
deployment option for an Analysis Services solution.
K) UPDATE MEMBER new statement. The UPDATE MEMBER statement updates an existing calculated member while preserving the relative precedence of that member with respect to
other calculations. Therefore, you cannot use the UPDATE MEMBER statement to change
SOLVE_ORDER. An UPDATE MEMBER statement cannot be specified in the MDX script for a cube.
L) Block computation. This eliminates unnecessary aggregation calculations (for example, when the values to be aggregated are NULL) and provides a significant improvement in cube
performance, which enables users to increase the depth of their hierarchies and the complexity of
computations.
M) Aggregation Designer changes. The algorithm that builds aggregations is improved, there is support for manually creating, editing, and deleting aggregations, and you can see which
aggregates were designed. The Aggregation Designer also has built-in validations for optimal
design assistance.
N) Dynamic Management Views (DMVs). These DMVs allow writing SELECT-type statements against an SSAS instance to get performance and statistics information.
O) SSAS database attach/detach.
P) Analysis Services personalization extensions.
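Hedged sketches of the new CREATE SET and UPDATE MEMBER syntax described above (the cube, measure, and member names are illustrative, loosely modeled on the Adventure Works sample; exact property placement may vary):

```mdx
-- A dynamic named set with a caption and display folder.
CREATE DYNAMIC SET CURRENTCUBE.[Top 10 Products] AS
    TopCount([Product].[Product].[Product].Members,
             10, [Measures].[Sales Amount]),
    CAPTION = 'Top 10 Products by Sales',
    DISPLAY_FOLDER = 'Sets';

-- A calculated member defined on a session ...
CREATE MEMBER [Measures].[Margin] AS
    [Measures].[Sales Amount] - [Measures].[Total Cost];

-- ... and later updated in place. The member keeps its precedence
-- relative to other calculations, so SOLVE_ORDER cannot change here.
UPDATE MEMBER [Measures].[Margin] AS
    ([Measures].[Sales Amount] - [Measures].[Total Cost])
    / [Measures].[Sales Amount];
```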
Define temporary and extended stored procedures.
Answer - A temporary stored procedure is stored in the tempdb database. It is volatile and is deleted once
the connection is terminated or the server is restarted. An extended stored procedure is a routine implemented in an external DLL (conventionally prefixed xp_) that SQL Server loads and executes like a regular stored procedure.
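A quick sketch of a local temporary procedure (the table name is illustrative):

```sql
-- The # prefix stores the procedure in tempdb; it is visible only to the
-- current connection and dropped when that connection closes.
CREATE PROCEDURE #GetOrderCount
AS
    SELECT COUNT(*) AS OrderCnt FROM dbo.Orders;
GO

EXEC #GetOrderCount;
```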
Differences between SSRS 2005 and SSRS 2008
1) SSRS 2005 required Internet Information Services (IIS) to run, whereas SSRS 2008 no
longer requires IIS. 2008 uses the http.sys driver and listens for report requests through http.sys.
Not only does this reduce deployment headaches, it also reduces server overhead.
2) SSRS 2005 used more memory and was extremely resource intensive, so much so that many
companies would install it on a machine apart from SQL Server, but 2008 utilizes memory
more efficiently, especially when working with reports that contain large sets of data.
Additionally, SSRS 2008 will often load the first page of a report much faster than 2005.
Performance Tuning of SSRS: Handling a Large Workload
To get the highest performance when handling large workloads that include user requests for large
reports, implement the following recommendations.
Steps to Improve Performance
1) Control the Size of Your Reports
2) Use Cache Execution
3) Configure and Schedule Your Reports
4) Deliver Rendered Reports for Non-browser Formats
5) Populate the Report Cache by Using Data-Driven Subscriptions for Parameterized Reports
6) Back to Report Catalogs
7) Tuning the Web Service
Control the Size of Your Reports
You will first want to determine the purpose of these reports and whether a large multi-page report is
even necessary. If a large report is necessary, how frequently will it be used? If you provide users with
smaller summary reports, can you reduce the frequency with which users attempt to access this large
multi-page report? Large reports place a significant processing load on the report server, the report
server catalog, and report data, so it is necessary to evaluate each report on a case-by-case basis.
Some common problems with these large reports are that they contain data fields that are not used in
the report, or they contain duplicate datasets. Often users retrieve more data than they really need. To
significantly reduce the load placed on your Reporting Services environment, create summary reports
that use aggregates created at the data source, and include only the necessary columns. If you want to
provide data feeds, you can do this asynchronously using more appropriate tools, such as SSIS, to provide
the data feed.
Use Cache Execution
If the reports do not need to have live execution, enable the cache execution setting for each of your
appropriate reports. This setting causes the report server to cache a temporary copy of those reports in
memory.
Configure and Schedule Your Reports
For your large reports, use the report execution timeout setting to control how long a report can
execute before it times out. Some reports simply need a long time to run, so timeouts will not help you
there, but if reports are based on bad or runaway queries, execution timeouts ensure that resources are
not being inappropriately utilized.
If you have large reports that create data processing bottlenecks, you can mitigate resource contention
issues by using scheduled snapshots. Instead of the report data itself, a regularly scheduled report
execution snapshot is used to render the report. The scheduled snapshot can be executed during off-
peak hours, leaving more resources available for live reports for users during peak hours.
Deliver Rendered Reports for Non-browser Formats
Rendering performance of non-browser formats such as PDF and XLS has improved in SQL Server 2008
Reporting Services. Nevertheless, to reduce the load on your SQL Server Reporting Services
environment, you can place non-browser-format reports onto a file share and/or SharePoint, so users
can access the file directly instead of continually regenerating the report.
Populate the Report Cache by Using Data-Driven Subscriptions for
Parameterized Reports
For your large parameterized reports, you can improve performance by pre-populating the report cache
using data-driven subscriptions. Data-driven subscriptions enable easier population of the cache for set
combinations of parameter values that are frequently used when the parameterized report is executed.
Note that if you choose a set of parameters that are not used, you take on the cost of running the cache
with little value in return. Therefore, to identify the more frequent parameter value combinations,
analyze the ExecutionLog2 view. Ultimately, when a user opens the report, the report server can now
use a cached copy of the report instead of creating the report on demand. You can schedule and
populate the report cache by using data-driven subscriptions.
Back to Report Catalogs
You can also increase the size of your report server catalogs, which allows the database to store more of
the snapshot data.
Tuning the Web Service
IIS and http.sys tuning helps get the last incremental performance out of the report server computer.
The low-level options allow you to change the length of the HTTP request queue, the duration that
connections are kept alive, and so on. For large concurrent reporting loads, it may be necessary to
change these settings to allow your server computer to accept enough requests to fully utilize the server
resources.
You should consider this only if your servers are at maximum load and you do not see full resource
utilization, or if you experience connection failures to Reporting Services.
Memory Limits in SQL Server Reporting Services 2008
Memory Limit
This configuration is similar to WorkingSetMinimum in SQL Server 2008. Its default is 60% of physical
memory. Increasing the value helps Reporting Services handle more requests. After this threshold is
reached, no new requests are accepted.
Maximum Memory Limit
This configuration is similar to WorkingSetMaximum in SQL Server 2008. Its default is 80% of physical
memory. But unlike the SQL Server 2008 version, when its threshold is reached, it starts aborting
processes instead of rejecting new requests.
Performance Tuning of SQL Server
Section A:
Increase the min memory per query option to improve the performance of queries that use hashing or sorting operations, if your SQL Server has a lot of memory available and there are
many queries running concurrently on the server. The default min memory per query value is 1024 KB.
Increase the max async IO option if SQL Server runs on a high-performance server with a high-speed intelligent disk subsystem (such as hardware-based RAID with more than 10 disks).
Change the network packet size option to an appropriate value. By default the packet size is 4096 bytes; for queries moving large amounts of data, the packet size can be increased accordingly.
You can increase the recovery interval value.
Increase the priority boost option for SQL Server to 1. By default it is set to 0.
Set the max worker threads option to the maximum number of user connections to your SQL
Server box.
The default setting for the max worker threads option is 255. If the number of user
connections is less than the max worker threads value, a separate operating system
thread is created for each client connection, but if the number of user connections
exceeds this value, thread pooling is used. For example, if the maximum number of
user connections to your SQL Server box is 50, you can set the max worker threads
option to 50; this frees up resources for SQL Server to use elsewhere. If the maximum number of
user connections to your SQL Server box is 500, you can set the max worker
threads option to 500; this can improve SQL Server performance because thread pooling will
not be used.
Specify the min server memory and max server memory options.
Specify the set working set size SQL Server option to reserve the amount of physical memory
space for SQL Server.
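The server-level settings above are changed through sp_configure; a sketch (the values shown are illustrative, not recommendations for any particular server):

```sql
-- Most of these are advanced options, so expose them first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'min memory per query (KB)', 2048;
EXEC sp_configure 'max worker threads', 500;
EXEC sp_configure 'min server memory (MB)', 1024;
EXEC sp_configure 'max server memory (MB)', 4096;
RECONFIGURE;
```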
Microsoft Tips on Performance Tuning:
Not knowing the performance and scalability characteristics of your system: If performance and scalability of a system are important to you, the biggest
mistake that you can make is to not know the actual performance and scalability
characteristics of important queries, and the effect the different queries have on each other
in a multiuser system. You achieve performance and scalability when you limit resource use
and handle contention for those resources. Contention is caused by locking and by physical
contention. Resource use includes CPU utilization, network I/O, disk I/O, and memory use.
Retrieving too much data: A common mistake is to retrieve more data than you
actually require. Retrieving too much data leads to increased network traffic and increased
server and client resource use. This can include both the columns and the rows.
Misuse of transactions: Long-running transactions, transactions that depend on user
input to commit, transactions that never commit because of an error, and non-transactional
queries inside transactions cause scalability and performance problems because they lock
resources longer than needed.
Misuse of indexes: If you do not create indexes that support the queries issued
against your server, the performance of your application suffers as a result. However, if you
have too many indexes, then the insert and update performance of your application suffers. You
have to find a balance between the indexing needs of the writes and the reads, based on
how your application is used.
Mixing OLTP, OLAP, and reporting workloads: OLTP workloads are characterized by many small transactions, with an expectation of very quick response time
from the user. OLAP and reporting workloads are characterized by a few long-running
operations that might consume more resources and cause more contention. The long-
running operations are caused by locking and by the underlying physical sub-system. You
must resolve this conflict to achieve a scalable system.
Inefficient schemas: Adding indexes can help improve performance; however, their
impact may be limited if your queries are inefficient because of poor table design that
results in too many join operations or in inefficient join operations. Schema design is a key
performance factor. It also provides information to the server that may be used to optimize
query plans. Schema design is largely a tradeoff between good read performance and good
write performance. Normalization helps write performance. De-normalization helps read
performance.
Using an inefficient disk sub-system: The physical disk sub-system must provide a
database server with sufficient I/O processing power to permit the database server to run
without disk queuing or long I/O waits.
SSIS 10 Best Practices:
1) SSIS is an in-memory pipeline, so ensure all transformations occur in memory
2) Plan for capacity by understanding resource utilization
3) Baseline source system extract speed
4) Optimize SQL data sources, lookup transformations, and destinations
5) Tune your network
6) Use data types wisely
7) Change the design
8) Partition the problem
9) Minimize logged operations
10) Schedule and distribute it correctly
SSIS Performance Tuning
The SSIS architecture has two engines, the run-time engine and the data flow engine. The run-time engine is a highly
parallel control flow engine that coordinates the execution of tasks or units of work within SSIS and
manages the engine threads that carry out those tasks. The data flow engine manages the data pipeline
within a data flow task.
Data Flow Optimization Modes
The data flow task has a property called RunInOptimizedMode. When this property is enabled,
any downstream component that doesn't use any of the source component's columns is
automatically disabled, and unused columns are also automatically removed. The net result of
enabling the RunInOptimizedMode property is that the performance of the entire data flow task is
improved.
SSIS projects also have a RunInOptimizedMode property. This indicates that the
RunInOptimizedMode property of all the data flow tasks in the project is overridden at design
time, and that all data flow tasks in the project run in optimized mode during debugging.
Buffers:
A buffer is an in-memory dataset object utilized by the data flow engine to transform data. The
data flow task has a configurable property called DefaultBufferMaxRows, which is set to 10,000
by default. The data flow task also has a configurable property called DefaultBufferSize, which is
set to 10 MB by default. Additionally, the data flow task has an internal limit called MaxBufferSize,
which is set to 100 MB and cannot be changed.
Buffer Sizing:
When performance-tuning a data flow task, the goal should be to pass as many records as
possible through a single buffer while efficiently utilizing memory. This begs the question: what
does efficiently utilizing memory mean? SSIS estimates the size of a buffer row by examining
the data source metadata at design time. Optimally, the buffer row size should be as small as
possible, which can be accomplished by employing the smallest possible data type for each
column. SSIS automatically multiplies the estimated buffer row size by the
DefaultBufferMaxRows setting to determine how much memory to allocate to each buffer in
the data flow engine. If this amount of memory exceeds MaxBufferSize (100 MB), SSIS
automatically reduces the number of buffer rows to fit within the 100 MB boundary.
The data flow task has another property called MinBufferSize, which is 64 KB and cannot be
changed. If the amount of memory SSIS estimates should be allocated for each buffer is below 64
KB, SSIS will automatically increase the number of buffer rows per buffer in order to exceed
the MinBufferSize memory boundary.
Buffer Tuning:
The data flow task has a property called BufferSizeTuning. When the value of this property is set
to true, SSIS will add information to the SSIS log indicating where SSIS has adjusted the buffer
size. While buffer tuning, the goal should be to fit as many rows into a buffer as possible. Thus,
the value for DefaultBufferMaxRows should be as large as possible without exceeding a total
buffer size of 100 MB.
Parallelism:
SSIS natively supports the parallel execution of packages, tasks, and transformations. Therefore,
parallelism can greatly improve the performance of a package when it is configured within the
constraints of system resources. A package has a property called MaxConcurrentExecutables,
which can be configured to set the maximum number of threads that can execute in parallel per
package. By default this is set to -1, which translates to the number of logical machine
processors plus 2. All or some of the operations in a package can execute in parallel.
Additionally, the data flow task has a property called EngineThreads, which defines how many
threads the data flow engine can create and run in parallel. This property applies equally to both
the source threads that the data flow engine creates for sources and the worker threads that
the engine creates for transformations and destinations. For example, setting the EngineThreads
property to 10 indicates that the data flow engine can create up to 10 source threads and 10
worker threads.
Extraction Tuning
a) Increase the connection manager's packet size property: Use separate connection
managers for bulk loading, and a smaller packet size for OLE DB Command transformations.
b) Affinitize network connections: This can be accomplished if a machine has multiple
cores and multiple NICs.
c) Tune queries:
--Select only needed columns
--Use a hint to specify that no shared locks be used during the select (the query can potentially
read uncommitted data). Use this only if the query must have the best performance.
d) Look-ups
-- Select only needed columns
--Use the Shared Look-up Cache (available in 2008)
e) Sorting
The Merge and Merge-Join transformations require sorted inputs. Source data for these
transformations that is already sorted obviates the need for an upstream Sort transformation
and improves data flow performance. The following properties must be configured on a source
component if the source data is already sorted:
a) IsSorted: The output of a source component has a property called IsSorted. The value of
this property must be true.
b) SortKeyPosition: Each output column of a source component has this property, which
indicates whether a column is sorted, the column's sort order, and the sequence in which
multiple columns are sorted. This property must be set for each column of sorted data.
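The extraction-tuning points above can be combined into one source query sketch (table and column names are illustrative): select only needed columns, skip shared locks where dirty reads are acceptable, and sort at the database so the data flow can skip an SSIS Sort transformation (setting IsSorted = True and SortKeyPosition = 1 on OrderID at the source output).

```sql
-- Tuned extraction query: narrow column list, no shared locks,
-- sorted at the source for a downstream Merge-Join.
SELECT OrderID, CustomerID, OrderAmount
FROM   dbo.Orders WITH (NOLOCK)   -- dirty reads possible; use with care
WHERE  OrderDate >= '2009-01-01'
ORDER  BY OrderID;                -- matches SortKeyPosition = 1 on OrderID
```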
Transformation Tuning
Partially blocking (asynchronous): Merge, Merge-Join, and Union All can possibly be optimized in the
source query.
Use SSIS 2008:
--Improved data flow task scheduler
--Union All transforms are no longer necessary to split up and parallelize execution trees
Blocking transformations (asynchronous): Aggregate, Sort, Pivot, and Unpivot should be limited
to one per data flow on the same data.
Aggregate transformation: This transformation includes the Keys, KeyScale, CountDistinctKeys,
and CountDistinctScale properties, which improve performance by enabling the transformation
to pre-allocate the amount of memory that it needs for the data that it
caches. If the exact or approximate number of groups expected to result
from a Group By operation is known, then set the Keys and KeyScale properties, respectively. If
the exact or approximate number of distinct values expected to result from a Distinct
Count operation is known, then set the CountDistinctKeys and CountDistinctScale properties,
respectively.
If the creation of multiple aggregations in a data flow is necessary, then consider creating
multiple aggregations within one Aggregate transformation instead of creating multiple
transformations. Performance is improved with this approach because, when one aggregation is
a subset of another aggregation, the transformation's internal storage is optimized by scanning
the incoming data only once. For example, if an aggregation uses a Group By clause and an AVG
aggregation, then performance can be improved by combining them into one transformation.
However, aggregation operations are serialized when multiple aggregations are performed
within one Aggregate transformation. Therefore, performance might not be improved when
multiple aggregations must be computed independently.
Merge-Join Transformation
MaxBuffersPerInput: This property specifies the maximum number of buffers that can be
active for each input at one time. This property can be used to tune the amount of memory that
buffers consume, and consequently the performance of the transformation. As the number of
buffers increases, the more memory the transformation uses, which improves performance. The
default value of this property is 5, which is the number of buffers that works well in most
scenarios. Performance can be tuned by using a slightly different number of buffers, such as 4 or
6. Using a very small number of buffers should be avoided if possible. For example, there is a
significant impact on performance when MaxBuffersPerInput is set to 1 instead of 5.
Additionally, MaxBuffersPerInput shouldn't be set to 0 or less; throttling doesn't occur within this
range of values, and depending on the data load and the amount of memory available, the package
may not complete.
Slowly Changing Dimensions
This wizard creates a set of data flow transformation components that work together with the
Slowly Changing Dimension transformation component. The wizard creates OLE DB Command
transformation components that perform updates against a single row at a time. Performance
can be improved by replacing these transformation components with destination components
that save all rows to be updated to a staging table. Then, an Execute SQL Task can be added that
performs a single set-based T-SQL UPDATE statement against all rows at the same time.
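The set-based statement that replaces the row-by-row OLE DB Command might look like this (dimension and staging table names are illustrative):

```sql
-- One set-based update against all staged changes, instead of one
-- OLE DB Command execution per row.
UPDATE d
SET    d.City    = s.City,
       d.Segment = s.Segment
FROM   dbo.DimCustomer AS d
JOIN   dbo.Staging_CustomerUpdates AS s
       ON s.CustomerID = d.CustomerID;
```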
Data Types
1) Use the smallest possible data types in the data flow.
2) Use the CAST or CONVERT functions in the source query if possible.
Miscellaneous
1) Sort in the query if possible.
2) If possible, use the T-SQL MERGE statement instead of the SCD transformation.
3) If possible, use the T-SQL INSERT INTO statement instead of the data flow task.
4) A data reload may perform better than a delta refresh.
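For point 2 above, a minimal Type 1 style load done with a single MERGE (table names illustrative): matching rows are updated in place, new rows are inserted.

```sql
MERGE dbo.DimCustomer AS d
USING dbo.Staging_Customer AS s
      ON d.CustomerID = s.CustomerID
WHEN MATCHED THEN
    UPDATE SET d.City = s.City          -- overwrite the description (Type 1)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, City)
    VALUES (s.CustomerID, s.City);
```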
Load Tuning
Use the SQL Server Destination
1) Only helps if the data flow and the destination database are on the same machine
2) Weaker error handling than the OLE DB Destination
3) Set Commit Size = 0
Use the OLE DB Destination
1) Set Commit Size = 0
Drop indexes based on the expected % load growth
1) Don't drop an index if it's the only clustered index: Data in a table is sorted by the clustered
index, and primary keys are clustered indexes by default. Loading will always be faster than dropping and
recreating a primary key, and will usually be faster than dropping and recreating a clustered index.
2) Drop a non-clustered index if the load will cause a 100% increase: This is the rule of thumb.
3) Don't drop a non-clustered index if the load increase is under 10%: Not a rule of thumb;
experiment to find the optimal value.
Use partitions if necessary
1) Use the SQL Server Profiler to trace the performance
2) See The Data Load Performance Guide
3) Use the TRUNCATE statement instead of the T-SQL DELETE statement. DELETE is a logged
operation, which performs slower than TRUNCATE.
4) Affinitize the network
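The TRUNCATE vs. DELETE point in practice (staging table name illustrative):

```sql
-- TRUNCATE deallocates pages with minimal logging, so it is much faster
-- than a fully logged, row-by-row DELETE of every row.
TRUNCATE TABLE dbo.Staging_Orders;
-- Slower alternative: DELETE FROM dbo.Staging_Orders;
```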
Differences between SSIS 2005 and SSIS 2008
There is no difference between the architecture of SSIS 2005 and SSIS 2008. 2008 has some
additional features which 2005 did not have; it can be said that 2008 is an enhancement of the features of
the 2005 version.
Look-up
In 2005, for error output, look-ups had only 3 options: Fail Component, Ignore Failure, and Redirect Row.
But 2008 has an additional option, No Match Output.
2005 did not have cache modes, while 2008 has 3 different cache modes: Full Cache, Partial Cache,
and No Cache.
2005 didn't have the connection manager types, while 2008 has the OLE DB Connection Manager and the Cache
Connection Manager.
Cache Transformation
2005 did not have this transformation; it was introduced in the 2008 version. This is a data flow
transformation. The Cache transformation writes data from a connected data source in the data flow to a
Cache Connection Manager; the Look-up transformation in a package can then perform lookups on that data.
In a single package, only one Cache transformation can write data to the same connection manager. If
the package contains multiple Cache transforms, the first Cache transform called when the
package runs writes the data to the connection manager, and the write operations of subsequent Cache
transforms fail. The Cache transformation can be configured in the following way:
1) Specify the connection manager
2) Map the input columns in the Cache transform to destination columns in the Cache
Connection Manager
Data Profiling Task
2005 did not have this task; it was introduced in 2008. This is a control flow task. It lets you analyze
data in a SQL Server database and, from the results of that analysis, generate XML reports that can be
saved to a file or an SSIS variable. By configuring one or more of the task's profile types, you can
generate a report that provides details such as a column's minimum and maximum values, or the
number and percentage of null values.
Script Task and Transformation
2008 gives the option of writing scripts in either VB or C#, whereas 2005 only enabled users to
write scripts in VB.