Insert, Update, and Delete Destination table with SSIS - Reza Rad's Technical blog

  • 30/6/2014 Insert, Update, and Delete Destination table with SSIS - Reza Rad's Technical blog

    http://www.rad.pasfu.com/index.php?/archives/150-Insert,-Update,-and-Delete-Destination-table-with-SSIS.html#extended 1/17

    Previously I wrote about the design and implementation of an UPSERT with SSIS. An UPSERT updates existing records and inserts new records. Today I want to extend this to cover DELETED records as well, so the method used in this post can find INSERTED / UPDATED / DELETED records in the source table and apply those changes to the destination table.

    In this example I use the Merge Join transformation, the Conditional Split, and the OLE DB Command transformation to implement the solution. First we apply a full outer join on the source and destination tables on the key column(s) with the Merge Join transformation. Then we use a Conditional Split to find the change type (removed, new, or existing records). Existing records need further processing to find out whether anything actually changed, so we use another Conditional Split to compare the values of equivalent columns in source and destination.

    The source table used in this example is the Department table from the AdventureWorks2012 sample database, which you can download online for free.

    Solution:

    1- Create an OLE DB Source for the source table and use the select command below to read the data:

    select *
    from dbo.Department
    order by DepartmentID

    Note the ORDER BY clause in this statement. It is required because the Merge Join transformation requires sorted inputs. Name this component Source Table.

    2- Create another OLE DB Source for the destination table. In this example the source and destination tables have the same name but are in different databases, so we use the same script as in step 1 for this one as well. Name this component Destination Table.

    3- Right-click the OLE DB Source and choose Show Advanced Editor. In the Advanced Editor window go to the Input and Output Properties tab. Select the OLE DB Source Output and change the IsSorted property to True.

    4- Expand OLE DB Source output, and then under Output Columns select DepartmentID. Then change the

    SortKeyPosition to 1.

    5- Apply steps 3 and 4 for both OLE DB Sources (Source Table and Destination Table)

    6- Drag and drop a Merge Join transformation and connect the two OLE DB Sources to it. Set Source Table as the left input and Destination Table as the right input of this transformation.

    7- Open the Merge Join transformation editor. DepartmentID will be used as the joining column (it is selected based on the sort properties of the previous components). Set the join type to Full outer join, as described in the introduction. Note that if you don't sort the inputs of the Merge Join transformation, you cannot open its editor and you get an error about the sorting of the inputs.

    Select all columns from the Source and Destination tables in the Merge Join transformation, and rename them as the picture below shows (add a Source or Destination prefix to each column).
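To make the sorting requirement concrete, here is a minimal Python sketch (not SSIS code) of how a merge-style full outer join can walk two inputs in a single pass. It assumes both inputs are already sorted on a unique key, which is what the IsSorted and SortKeyPosition settings declare. The function and helper names are hypothetical; the Source and Destination prefixes mirror the renaming done in step 7.

```python
from typing import Dict, Iterator, List, Optional

Row = Dict[str, object]  # one table row, keyed by column name


def prefixed(row: Optional[Row], prefix: str, columns: List[str]) -> Row:
    """Rename columns with a prefix; a missing side yields None (NULL) values."""
    return {prefix + c: (row[c] if row is not None else None) for c in columns}


def merge_full_outer_join(source: List[Row], destination: List[Row],
                          key: str, columns: List[str]) -> Iterator[Row]:
    """Full outer join of two inputs that are ALREADY sorted by a unique key."""
    i = j = 0
    while i < len(source) or j < len(destination):
        s = source[i] if i < len(source) else None
        d = destination[j] if j < len(destination) else None
        if d is None or (s is not None and s[key] < d[key]):
            # Key exists only on the source side.
            yield {**prefixed(s, "Source", columns),
                   **prefixed(None, "Destination", columns)}
            i += 1
        elif s is None or d[key] < s[key]:
            # Key exists only on the destination side.
            yield {**prefixed(None, "Source", columns),
                   **prefixed(d, "Destination", columns)}
            j += 1
        else:
            # Same key on both sides: emit one combined row.
            yield {**prefixed(s, "Source", columns),
                   **prefixed(d, "Destination", columns)}
            i += 1
            j += 1


# Hypothetical sample data, sorted by DepartmentID on both sides.
sample_source = [{"DepartmentID": 1, "Name": "A"}, {"DepartmentID": 2, "Name": "B"}]
sample_destination = [{"DepartmentID": 2, "Name": "B"}, {"DepartmentID": 3, "Name": "C"}]
joined = list(merge_full_outer_join(sample_source, sample_destination,
                                    "DepartmentID", ["DepartmentID", "Name"]))
```

Because each input is read strictly forward, unsorted inputs would pair the wrong rows, which is why the transformation insists on sorted inputs before it lets you configure the join.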

    8- Add a Conditional Split transformation and write the two expressions below to find new records and removed records. Also rename the default output to Existing Records, as the screenshot below shows.

    The expressions used in this sample are simple; each one identifies one type of change. For example, the expression below:

    !ISNULL(SourceDepartmentID) && ISNULL(DestinationDepartmentID)

    is used to find new records. It literally means records that have a SourceDepartmentID but no DestinationDepartmentID.

    And this expression is used to find deleted records:

    ISNULL(SourceDepartmentID) && !ISNULL(DestinationDepartmentID)
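The routing done by this Conditional Split can be sketched in Python as follows. This is only an illustration of the logic, assuming each joined row is a dictionary in which a missing side carries None in place of NULL; the function name and the output labels are chosen here for illustration.

```python
def split_by_change_type(row: dict) -> str:
    """Mimic the first Conditional Split: conditions are checked in order,
    and a row that satisfies none of them goes to the default output."""
    source_id = row["SourceDepartmentID"]
    destination_id = row["DestinationDepartmentID"]
    if source_id is not None and destination_id is None:
        # !ISNULL(SourceDepartmentID) && ISNULL(DestinationDepartmentID)
        return "New Records"
    if source_id is None and destination_id is not None:
        # ISNULL(SourceDepartmentID) && !ISNULL(DestinationDepartmentID)
        return "Removed Records"
    return "Existing Records"  # default output: key present on both sides
```

For instance, a row with SourceDepartmentID 17 and no DestinationDepartmentID would be routed to the new-records output, and a row with the same ID on both sides would fall through to the default output.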

    9- Add an OLE DB Destination and connect the New Records output to it. Configure it for the destination table and use the columns with the Source prefix in the column mapping of the OLE DB Destination. This destination component will insert new records into the destination table.

    10- Add an OLE DB Command and connect the Removed Records output to it. Create a connection to the destination database, and write the script below to delete records by the input department ID:

    delete from dbo.Department where DepartmentID = ?

    In the column mappings, map DestinationDepartmentID to the parameter of the statement.

    11- Add another Conditional Split and connect the Existing Records output to it. We use this component to find only the records in which at least one value has changed, so we compare equivalent source and destination columns to find non-matching data.

    This is the expression used to find matching data in the screenshot below; rows that do not satisfy it are the non-matching ones:

    (SourceName == DestinationName) && (SourceGroupName == DestinationGroupName) &&

    (SourceModifiedDate == DestinationModifiedDate)
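As a rough Python sketch of this second split, the comparison can be written once over a list of column names. The names COMPARED_COLUMNS and is_match are hypothetical, and the rows are assumed to be dictionaries with the Source and Destination prefixes from step 7.

```python
# Columns compared between source and destination (the key is excluded,
# because it is already equal for every row on the existing-records path).
COMPARED_COLUMNS = ["Name", "GroupName", "ModifiedDate"]


def is_match(row: dict) -> bool:
    """True when every compared column has the same value on both sides."""
    return all(row["Source" + c] == row["Destination" + c] for c in COMPARED_COLUMNS)


def split_existing(rows: list) -> tuple:
    """Return (match, non_match); only the non-matching rows need an UPDATE."""
    match = [r for r in rows if is_match(r)]
    non_match = [r for r in rows if not is_match(r)]
    return match, non_match
```

One difference is worth keeping in mind: Python treats None == None as true, while an SSIS expression with a NULL operand evaluates to NULL rather than to true or false. The Department columns compared here do not allow NULLs, but for nullable columns the SSIS expression would most likely need explicit ISNULL handling.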

    12- Create a stored procedure in destination database to update the Department table.

    CREATE PROCEDURE dbo.UpdateDepartment
        @DepartmentID smallint
        ,@Name nvarchar(50)
        ,@GroupName nvarchar(50)
        ,@ModifiedDate datetime
    AS
    BEGIN
        SET NOCOUNT ON;

        UPDATE [dbo].[Department]
        SET
            [Name] = @Name
            ,[GroupName] = @GroupName
            ,[ModifiedDate] = @ModifiedDate
        WHERE [DepartmentID] = @DepartmentID
    END

    13- Add another OLE DB Command and use the non-match output as its input data stream. Connect it to the destination database, and write the statement below in the SqlCommand property on the Component Properties tab.

    exec dbo.UpdateDepartment ?,?,?,?

    14- Map the input columns (with the Source prefix) to the parameters of the stored procedure, as the screenshot below shows.
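The OLE DB Command runs its statement once for every row that reaches it and binds each ? marker by position. The sketch below imitates that behaviour in Python with the standard sqlite3 module, which happens to use the same ? marker. The in-memory table and the sample values are assumptions made for illustration, and an inline UPDATE stands in for the stored procedure.

```python
import sqlite3

# Hypothetical stand-in for the destination database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department ("
    "DepartmentID INTEGER PRIMARY KEY, Name TEXT, GroupName TEXT, ModifiedDate TEXT)"
)
conn.execute("INSERT INTO Department VALUES (1, 'Engineering', 'R&D', '2002-06-01')")

# Rows arriving on the non-match path (sample values, not real data).
changed_rows = [
    {"SourceDepartmentID": 1, "SourceName": "Engineering",
     "SourceGroupName": "Research and Development",
     "SourceModifiedDate": "2013-09-10"},
]

update_sql = (
    "UPDATE Department SET Name = ?, GroupName = ?, ModifiedDate = ? "
    "WHERE DepartmentID = ?"
)

for row in changed_rows:
    # One execution per row; values are bound in the order the ? markers appear.
    conn.execute(update_sql, (row["SourceName"], row["SourceGroupName"],
                              row["SourceModifiedDate"], row["SourceDepartmentID"]))
conn.commit()

updated = conn.execute(
    "SELECT Name, GroupName, ModifiedDate FROM Department WHERE DepartmentID = 1"
).fetchone()
```

Since the binding is positional, the mapping has to follow the order of the markers in the statement: with the stored procedure call above that order is DepartmentID, Name, GroupName, ModifiedDate, while in this inline UPDATE the ID comes last.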

    15- Run the package and you will see the changes applied to the destination table.

    Testing the solution:

    Here are the data rows from the source table:

    And the data rows from the destination table:

    Yellow records are new records.

    Pink records are updated records.

    The green record is a deleted record (it exists only in the destination table).

    After running the package you will see the records redirected to the data paths as implemented:

    And the destination table will pick up the changes:

    Posted by Reza Rad on Tuesday, September 10. 2013 at 20:26

    Comments

    anudeep says,

    Friday, September 27. 2013 at 04:30

    Hi, nice explanation. I tried your solution and it's working fine. BUT I am getting millions of records after a union all from different local DBs; after that, sorting and sending to the merge join, it's blocking at the sort. Can you tell me any better way?

    Reza Rad says,

    Wednesday, November 06. 2013 at 11:22 (Link)

    Hi Anudeep, the blocking part of your package, as you've mentioned, is the Sort Transformation. I don't recommend using the Sort Transformation because it first loads all records into memory and then starts sorting them, which causes performance issues and blocking. The best way is always to sort with ORDER BY clauses in the source T-SQL commands. But if you are loading data from different sources, the solution for you is to load them all (with a union in SSIS or T-SQL) into a staging table, and then sort the staging table with an ORDER BY clause. Then you can set the sort properties of the OLE DB Source in the data flow task and work with the Merge Join simply. Regards, Reza

    Nilesh says,

    Thursday, October 24. 2013 at 04:28

    Hi, creating the package as above is good, but the merge join and update process took a lot of time in my case; my tables have at least 15K rows at the source and the same at the destination. It takes at least 3-4 minutes to execute. I want a fast way to execute this package within 1 minute. Is it possible? Please guide.

    Reza Rad says,

    Wednesday, November 06. 2013 at 11:08 (Link)

    Hi Nilesh, there are other methods to do that, and each method has pros and cons. The best method in terms of performance and speed of running the ETL is using the MERGE command of T-SQL. There is also the option of using the Lookup component, which performs better, especially if the reference table is small. The only consideration is that Lookup only works with an OLE DB Connection, but if you have other types of data sources, don't worry: you can use the Cache Component and Cache connection to bring them in and load them into the Lookup. Regards, Reza
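The Lookup approach mentioned in this reply can be pictured with the following Python sketch. It assumes the reference (destination) rows fit in memory and that the key is unique; the function name lookup_split and the sample shapes are hypothetical, and this is an illustration of the idea rather than how the SSIS component is implemented.

```python
def lookup_split(source_rows: list, reference_rows: list, key: str) -> tuple:
    """Probe an in-memory cache of reference rows for every source row.

    Returns (matched, no_match): matched pairs are update candidates,
    no_match rows are inserts.
    """
    cache = {ref[key]: ref for ref in reference_rows}  # built once, up front
    matched, no_match = [], []
    for row in source_rows:
        ref = cache.get(row[key])
        if ref is None:
            no_match.append(row)        # not in the destination yet: insert
        else:
            matched.append((row, ref))  # exists: compare columns, maybe update
    return matched, no_match
```

No sorting is needed because each source row is probed against the cache independently. Note that rows which exist only in the destination are never seen by this pattern, so it covers inserts and updates but not deletes.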

    M Saidul Karim says,

    Wednesday, November 13. 2013 at 16:58

    I just built my required SSIS package using VS2005. Your article made my day. Keep up the good work. Thank you.

    Reza Rad says,

    Wednesday, November 13. 2013 at 18:15 (Link)

    Hi Saidul, glad to hear that. Regards, Reza

    Sebastin says,

    Monday, November 18. 2013 at 12:04

    Hi Reza, I need to develop a package that does the following. Objects: - Source: table People in a MySQL database. Columns: PK DNI_Number int, name varchar(100), LastName varchar(100). - Destination: table People in a SQL Server 2012 database. Columns: PK PeopleID int (identity), DNI_Number int, name varchar(100), LastName varchar(100). Actions I need to perform, and assumptions: - In the target table, I insert all persons that are in the source and not yet in the destination. - In the target table, I need to update all those that have been modified at the source. - The People source table is read-only (I must not update anything in it). - The People destination table cannot be deleted and refilled, as its primary key is a foreign key in other tables of the model. - The business key of both tables is DNI_Number. Thank you very much for your help from Argentina - the world's ass, hehe -

    Reza Rad says,

    Tuesday, November 26. 2013 at 11:23 (Link)

    Hi Sebastian, do you want to apply rules on the tables (such as "people in the destination table cannot be deleted"), or do you want to do the data transfer? For applying rules you can write constraints and use triggers on that table and its fields in the database. For data transfer you can use data flow transformations such as Lookup to find the record; if it exists, update it with an OLE DB Command, otherwise create it with an OLE DB Destination. If you tell me which part of the scenario your question is about, that will help in providing an exact answer. Regards, Reza

    Abhas says,

    Monday, December 23. 2013 at 22:10

    Hi Reza, very nice explanation. Thanks, Abhas.

    Reza Rad says,

    Sunday, December 29. 2013 at 21:29 (Link)

    Hi Abhas, Thanks for your visit, and the feedback. Regards, Reza

    Nicolas says,

    Friday, January 10. 2014 at 12:47

    Hi Reza, your explanation was great, but I'm having trouble with the stored procedure: every time a row gets into the update proc, the flow freezes and I have to stop it. Any suggestions?

    Reza Rad says,

    Saturday, January 11. 2014 at 22:33 (Link)

    Hi Nicolas, freezing might have different reasons. Could you just use an UPDATE command instead of exec stored procedure inside the OLE DB Command transformation? Please let me know if it doesn't make any difference. Regards, Reza

    saurabh says,

    Sunday, February 09. 2014 at 23:30 (Link)

    I don't want to create a stored procedure in my database. Then how can I update the records of the table? If you create a stored procedure anyway, we could write just a MERGE statement for delete, insert, and update in the stored procedure, so what is the need for these conditional splits and the other components? Please reply. I want to update records without creating a stored procedure in my DB.

    Reza Rad says,

    Monday, February 10. 2014 at 00:47 (Link)

    Hi Saurabh, you can write your T-SQL command inside the OLE DB Command transformation. I just used a stored procedure to show it cleanly, and also with more friendly parameter names. But there is no mandatory requirement for using a stored procedure. If you don't want to create a stored procedure, then simply write your script directly in the OLE DB Command. Regards, Reza

    Dana says,

    Thursday, March 20. 2014 at 14:57

    Great explanation. I found this article very detailed and understandable. It helped guide me in the direction to go for my package. The stored procedure helped me out too; rather than passing to generic param_0, param_1 I can see my parameter names, which makes it a lot cleaner and easier to read.

    Reza Rad says,

    Wednesday, March 26. 2014 at 03:23 (Link)

    Hi Dana, I am glad that my post helped you. Thanks for your comment. Regards, Reza

    Shruti says,

    Friday, March 28. 2014 at 00:59

    Hi, thanks for your article, it really helps. But my requirement is that I have to apply the above approach to multiple tables (nearly 40), that is, an entire database. So, what is the best way to do this? Thanks in advance for your help.

    Reza Rad says,

    Saturday, May 24. 2014 at 19:41 (Link)

    Hi Shruti, sorry for my late response. Do you want to use SSIS for syncing two databases? Of course you can do that with SSIS, but Microsoft Sync Framework might be a better option for that; it will give you what you want for syncing two databases with much easier steps. But if you want to do such a scenario in ETL, then I recommend using SSIS package generators such as BIML, which helps a lot when you want to replicate the same logic through multiple SSIS packages. With a few lines of BIML script you will get all your packages generated. But for sure you will require some customization at the end for each package. Regards, Reza

    David Burghgraeve says,

    Thursday, May 22. 2014 at 00:47 (Link)

    Thanks! This post helped me a lot! I'm new to SSIS but I like it a lot. I'm doing this with a source on Oracle and a destination on SQL Server. One catch I've had was with the data types nvarchar vs. nchar between the two tables on Oracle and SQL Server. This makes the Merge Join step not work properly.

    Reza Rad says,

    Saturday, May 24. 2014 at 19:33 (Link)

    Hi David, you can use a Data Conversion Transformation after one of the data sources and change the data types from DT_STR to DT_WSTR or the reverse; then, when both data types match, you can use the Merge Join transformation to join them together. Regards, Reza