SAP Data Services 4.x Cookbook


Table of Contents

SAP Data Services 4.x Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
  Support files, eBooks, discount offers, and more; Why subscribe?; Free access for Packt account holders; Instant updates on new Packt books
Preface
  What this book covers; What you need for this book; Who this book is for; Sections: Getting ready, How to do it…, How it works…, There's more…, See also
  Conventions; Reader feedback; Customer support: Downloading the example code, Downloading the color images of this book, Errata, Piracy, Questions
1. Introduction to ETL Development
  Introduction
  Preparing a database environment: Getting ready; How to do it…; How it works…
  Creating a source system database: How to do it…; How it works…; There's more…
  Defining and creating staging area structures: How to do it… (Flat files; RDBMS tables); How it works…
  Creating a target data warehouse: Getting ready; How to do it…; How it works…; There's more…
2. Configuring the Data Services Environment
  Introduction
  Creating IPS and Data Services repositories: Getting ready…; How to do it…; How it works…; See also
  Installing and configuring Information Platform Services: Getting ready…; How to do it…; How it works…
  Installing and configuring Data Services: Getting ready…; How to do it…; How it works…
  Configuring user access: Getting ready…; How to do it…; How it works…
  Starting and stopping services: How to do it…; How it works…; See also
  Administering tasks: How to do it…; How it works…; See also
  Understanding the Designer tool: Getting ready…; How to do it…; How it works… (Executing ETL code in Data Services; Validating ETL code; Template tables; Query transform basics; The Hello World example)
3. Data Services Basics – Data Types, Scripting Language, and Functions
  Introduction
  Creating variables and parameters: Getting ready; How to do it…; How it works…; There's more…
  Creating a script: How to do it…; How it works…
  Using string functions: How to do it… (Using string functions in the script); How it works…; There's more…
  Using date functions: How to do it… (Generating current date and time; Extracting parts from dates); How it works…; There's more…
  Using conversion functions: How to do it…; How it works…; There's more…
  Using database functions: How to do it… (key_generation(); total_rows(); sql()); How it works…
  Using aggregate functions: How to do it…; How it works…
  Using math functions: How to do it…; How it works…; There's more…
  Using miscellaneous functions: How to do it…; How it works…
  Creating custom functions: How to do it…; How it works…; There's more…
4. Dataflow – Extract, Transform, and Load
  Introduction
  Creating a source data object: How to do it…; How it works…; There's more…
  Creating a target data object: Getting ready; How to do it…; How it works…; There's more…
  Loading data into a flat file: How to do it…; How it works…; There's more…
  Loading data from a flat file: How to do it…; How it works…; There's more…
  Loading data from table to table – lookups and joins: How to do it…; How it works…
  Using the Map_Operation transform: How to do it…; How it works…
  Using the Table_Comparison transform: Getting ready; How to do it…; How it works…
  Exploring the Auto correct load option: Getting ready; How to do it…; How it works…
  Splitting the flow of data with the Case transform: Getting ready; How to do it…; How it works…
  Monitoring and analyzing dataflow execution: Getting ready; How to do it…; How it works…; There's more…
5. Workflow – Controlling Execution Order
  Introduction
  Creating a workflow object: How to do it…; How it works…
  Nesting workflows to control the execution order: Getting ready; How to do it; How it works…
  Using conditional and while loop objects to control the execution order: Getting ready; How to do it…; How it works…; There is more…
  Using the bypassing feature: Getting ready…; How to do it…; How it works…; There is more…
  Controlling failures – try-catch objects: How to do it…; How it works…
  Use case example – populating dimension tables: Getting ready; How to do it…; How it works… (Mapping; Dependencies; Development; Execution order; Testing ETL; Preparing test data to populate DimSalesTerritory; Preparing test data to populate DimGeography)
  Using a continuous workflow: How to do it…; How it works…; There is more…
  Peeking inside the repository – parent-child relationships between Data Services objects: Getting ready; How to do it…; How it works… (Get a list of object types and their codes in the Data Services repository; Display information about the DF_Transform_DimGeography dataflow; Display information about the SalesTerritory table object; See the contents of the script object)
6. Job – Building the ETL Architecture
  Introduction
  Projects and jobs – organizing ETL: Getting ready; How to do it…; How it works… (Hierarchical object view; History execution log files; Executing/scheduling jobs from the Management Console)
  Using object replication: How to do it…; How it works…
  Migrating ETL code through the central repository: Getting ready; How to do it…; How it works… (Adding objects to and from the Central Object Library; Comparing objects between the Local and Central repositories); There is more…
  Migrating ETL code with export/import: Getting ready; How to do it… (Import/Export using ATL files; Direct export to another local repository); How it works…
  Debugging job execution: Getting ready…; How to do it…; How it works…
  Monitoring job execution: Getting ready; How to do it…; How it works…
  Building an external ETL audit and audit reporting: Getting ready…; How to do it…; How it works…
  Using built-in Data Services ETL audit and reporting functionality: Getting ready; How to do it…; How it works…
  Auto Documentation in Data Services: How to do it…; How it works…
7. Validating and Cleansing Data
  Introduction
  Creating validation functions: Getting ready; How to do it…; How it works…
  Using validation functions with the Validation transform: Getting ready; How to do it…; How it works…
  Reporting data validation results: Getting ready; How to do it…; How it works…
  Using regular expression support to validate data: Getting ready; How to do it…; How it works…
  Enabling dataflow audit: Getting ready; How to do it…; How it works…; There's more…
  Data Quality transforms – cleansing your data: Getting ready; How to do it…; How it works…; There's more…
8. Optimizing ETL Performance
  Introduction
  Optimizing dataflow execution – push-down techniques: Getting ready; How to do it…; How it works…
  Optimizing dataflow execution – the SQL transform: How to do it…; How it works…
  Optimizing dataflow execution – the Data_Transfer transform: Getting ready; How to do it…; How it works… (Why we used a second Data_Transfer transform object; When to use the Data_Transfer transform); There's more…
  Optimizing dataflow readers – lookup methods: Getting ready; How to do it… (Lookup with the Query transform join; Lookup with the lookup_ext() function; Lookup with the sql() function); How it works… (Query transform joins; lookup_ext(); sql(); Performance review)
  Optimizing dataflow loaders – bulk-loading methods: How to do it…; How it works… (When to enable bulk loading?)
  Optimizing dataflow execution – performance options: Getting ready; How to do it… (Dataflow performance options; Source table performance options; Query transform performance options; lookup_ext() performance options; Target table performance options)
9. Advanced Design Techniques
  Introduction
  Change Data Capture techniques: Getting ready (No history SCD (Type 1); Limited history SCD (Type 3); Unlimited history SCD (Type 2)); How to do it…; How it works… (Source-based ETL CDC; Target-based ETL CDC; Native CDC)
  Automatic job recovery in Data Services: Getting ready; How to do it…; How it works…; There's more…
  Simplifying ETL execution with system configurations: Getting ready; How to do it…; How it works…
  Transforming data with the Pivot transform: Getting ready; How to do it…; How it works…
10. Developing Real-time Jobs
  Introduction
  Working with nested structures: Getting ready; How to do it…; How it works…; There is more…
  The XML_Map transform: Getting ready; How to do it…; How it works…
  The Hierarchy_Flattening transform: Getting ready; How to do it… (Horizontal hierarchy flattening; Vertical hierarchy flattening); How it works… (Querying result tables)
  Configuring Access Server: Getting ready; How to do it…; How it works…
  Creating real-time jobs: Getting ready (Installing SoapUI); How to do it…; How it works…
11. Working with SAP Applications
  Introduction
  Loading data into SAP ERP: Getting ready; How to do it…; How it works… (IDoc; Monitoring IDoc load on the SAP side; Post-load validation of loaded data); There is more…
12. Introduction to Information Steward
  Introduction
  Exploring Data Insight capabilities: Getting ready; How to do it… (Creating a connection object; Profiling the data; Viewing profiling results; Creating a validation rule; Creating a scorecard); How it works… (Profiling; Rules; Scorecards); There is more…
  Performing Metadata Management tasks: Getting ready; How to do it…; How it works…
  Working with the Metapedia functionality: How to do it…; How it works…
  Creating a custom cleansing package with Cleansing Package Builder: Getting ready; How to do it…; How it works…; There is more…
Index

SAP Data Services 4.x Cookbook

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: November 2015

Production reference: 1261115

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78217-656-5

www.packtpub.com

Credits

Author: Ivan Shomnikov

Reviewers: Andrés Aguado Aranda, Dick Groenhof, Bernard Timbal Duclaux de Martin, Sridhar Sunkaraneni, Meenakshi Verma

Commissioning Editor: Vinay Argekar

Acquisition Editors: Shaon Basu, Kevin Colaco

Content Development Editor: Merint Mathew

Technical Editor: Humera Shaikh

Copy Editors: Brandt D'mello, Shruti Iyer, Karuna Narayanan, Sameen Siddiqui

Project Coordinator: Francina Pinto

Proofreader: Safis Editing

Indexer: Monica Ajmera Mehta

Production Coordinator: Nilesh Mohite

Cover Work: Nilesh Mohite

About the Author

Ivan Shomnikov is an SAP analytics consultant specializing in the area of Extract, Transform, and Load (ETL). He has in-depth knowledge of the data warehouse life cycle processes (DWH design and ETL development) and extensive hands-on experience with both the SAP Enterprise Information Management (Data Services) technology stack and the SAP BusinessObjects reporting products stack (Web Intelligence, Designer, Dashboards).

Ivan has been involved in the implementation of complex BI solutions on the SAP BusinessObjects Enterprise platform in major New Zealand companies across different industries. He also has a strong background as an Oracle database administrator and developer.

This is my first experience of writing a book, and I would like to thank my partner and my son for their patience and support.

About the Reviewers

Andrés Aguado Aranda is a 26-year-old computer engineer from Spain. His experience has given him a strong technical background in databases, data warehousing, and business intelligence.

Andrés has worked in different business sectors, such as banking, public administration, and energy, since 2012 in data-related positions.

This book is my first stint as a reviewer, and it has been really interesting and valuable to me, both personally and professionally.

I would like to thank my family and friends for always being willing to help me when I needed it. I would also like to thank my former coworker and current friend, Antonio Martín-Cobos, a BI reporting analyst who really helped me get this opportunity.

Dick Groenhof started his professional career in 1990 after finishing his studies in business information science at Vrije Universiteit Amsterdam. Having worked as a software developer and service management consultant for the first part of his career, he has been active as a consultant in the business intelligence arena since 2005.

Dick has been a lead consultant on numerous SAP BI projects, designing and implementing successful solutions for his customers, who regard him as a trusted advisor. His core competences include both frontend (such as Web Intelligence, Crystal Reports, and SAP Design Studio) and backend tools (such as SAP Data Services and Information Steward). Dick is an early adopter of the SAP HANA platform, creating innovative solutions using HANA Information Views, the Predictive Analysis Library, and SQLScript.

He is a Certified Application Associate in SAP HANA and SAP BusinessObjects Web Intelligence 4.1. Currently, Dick works as a senior HANA and big data consultant for a highly respected and innovative SAP partner in the Netherlands.

He is a strong believer in sharing his knowledge with regard to SAP HANA and SAP Data Services by writing blogs (at http://www.dickgroenhof.com and http://www.thenextview.nl/blog) and speaking at seminars.

Dick is happily married to Emma and is a very proud father of his son, Christiaan, and daughter, Myrthe.

Bernard Timbal Duclaux de Martin is a business intelligence architect and technical expert with more than 15 years of experience. He has been involved in several large business intelligence system deployments and administration projects in banking and insurance companies. In addition, Bernard has skills in modeling, data extraction, transformation, loading, and reporting design. He has authored four books, including two regarding SAP BusinessObjects Enterprise administration.

Meenakshi Verma has been a part of the IT industry since 1998. She is an experienced business systems specialist holding the CBAP and TOGAF certifications. Meenakshi is well-versed with a variety of tools and techniques used for business analysis, such as SAP BI, SAP BusinessObjects, Java/J2EE technologies, and others. She is currently based in Toronto, Canada, and works with a leading utility company.

Meenakshi has helped technically review many books published by Packt Publishing across various enterprise solutions. Her earlier works include JasperReports for Java Developers, Java EE 5 Development using GlassFish Application Server, Practical Data Analysis and Reporting with BIRT, EJB 3 Developer Guide, Learning Dojo, and IBM WebSphere Application Server 8.0 Administration Guide.

I'd like to thank my father, Mr. Bhopal Singh, and mother, Mrs. Raj Bala, for laying a strong foundation in me and giving me their unconditional love and support. I also owe thanks and gratitude to my husband, Atul Verma, for his encouragement and support throughout the reviewing of this book and many others; my ten-year-old son, Prieyaansh Verma, for giving me the warmth of his love despite my hectic schedules; and my brother, Sachin Singh, for always being there for me.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Instant updates on new Packt books

Get notified! Find out when new books are published by following @PacktEnterprise on Twitter or the Packt Enterprise Facebook page.

Preface

SAP Data Services delivers an enterprise-class solution to build data integration processes as well as perform data quality and data profiling tasks, allowing you to govern your data in a highly efficient way.

Some of the tasks that Data Services helps accomplish include: migration of data between databases or applications, extracting data from various source systems into flat files, data cleansing, data transformation using either common database-like functions or complex custom-built functions that are created using an internal scripting language, and of course, loading data into your data warehouse or external systems. SAP Data Services has an intuitive, user-friendly graphical interface, allowing you to access all its powerful Extract, Transform, and Load (ETL) capabilities from the single Designer tool. However, getting started with SAP Data Services can be difficult, especially for people who have little or no experience in ETL development. The goal of this book is to guide you through easy-to-understand examples of building your own ETL architecture. The book can also be used as a reference to perform specific tasks as it provides real-world examples of using the tool to solve data integration problems.

What this book covers

Chapter 1, Introduction to ETL Development, explains what Extract, Transform, and Load (ETL) processes are, and what role Data Services plays in ETL development. It includes the steps to configure the database environment used in the recipes of the book.

Chapter 2, Configuring the Data Services Environment, explains how to install and configure all Data Services components and applications. It introduces the Data Services development GUI—the Designer tool—with the simple example of "Hello World" ETL code.

Chapter 3, Data Services Basics – Data Types, Scripting Language, and Functions, introduces the reader to the Data Services internal scripting language. It explains the various categories of functions that are available in Data Services, and gives the reader an example of how the scripting language can be used to create custom functions.

Chapter 4, Dataflow – Extract, Transform, and Load, introduces the most important processing unit in Data Services, the dataflow object, and the most useful types of transformations that can be performed inside a dataflow. It gives the reader examples of extracting data from source systems and loading data into target data structures.

Chapter 5, Workflow – Controlling Execution Order, introduces another Data Services object, the workflow, which is used to group other workflows, dataflows, and script objects into execution units. It explains the conditional and loop structures available in Data Services.

Chapter 6, Job – Building the ETL Architecture, brings the reader to the job object level and reviews the steps used in the development process to make a successful and robust ETL solution. It covers the monitoring and debugging functionality available in Data Services and the embedded audit features.

Chapter 7, Validating and Cleansing Data, introduces the concepts of validation methods, which can be applied to the data passing through the ETL processes in order to cleanse and conform it according to the defined Data Quality standards.

Chapter 8, Optimizing ETL Performance, is one of the first advanced chapters, which starts explaining complex ETL development techniques. This particular chapter helps the user understand how existing processes can be optimized further in Data Services in order to make sure that they run quickly and efficiently, consuming as few computing resources as possible with the least amount of execution time.

Chapter 9, Advanced Design Techniques, guides the reader through advanced data transformation techniques. It introduces the concepts of the Change Data Capture methods that are available in Data Services, pivoting transformations, and automatic recovery concepts.

Chapter 10, Developing Real-time Jobs, introduces the concept of nested structures and the transforms that work with nested structures. It covers the main aspects of how they can be created and used in Data Services real-time jobs. It also introduces a new Data Services component—Access Server.

Chapter 11, Working with SAP Applications, is dedicated to the topic of reading and loading data from SAP systems with the example of the SAP ERP system. It presents the real-life use case of loading data into an SAP ERP system module.

Chapter 12, Introduction to Information Steward, covers another SAP product, Information Steward, which accompanies Data Services and provides a comprehensive view of the organization's data, and helps validate and cleanse it by applying Data Quality methods.

What you need for this book

To use the examples given in this book, you will need to download and make sure that you are licensed to use the following software products:

SQL Server Express 2012
SAP Data Services 4.2 SP4 or higher
SAP Information Steward 4.2 SP4 or higher
SAP ERP (ECC)
SoapUI 5.2.0

Who this book is for

The book will be useful to application developers and database administrators who want to get familiar with ETL development using SAP Data Services. It can also be useful to ETL developers or consultants who want to improve and extend their knowledge of this tool. The book can also be useful to data and business analysts who want to take a peek at the backend of BI development. The only requirement of this book is that you are familiar with the SQL language and general database concepts. Knowledge of any kind of programming language will be a benefit as well.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it, How it works, There's more, and See also).

To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

select *
from dbo.al_langtext txt
JOIN dbo.al_parent_child pc
on txt.parent_objid = pc.descen_obj_key
where
pc.descen_obj = 'WF_continuous';

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

AlGUIComment("ActaName_1" = 'RSavedAfterCheckOut', "ActaName_2" = 'RDate_created',
"ActaName_3" = 'RDate_modified', "ActaValue_1" = 'YES',
"ActaValue_2" = 'Sat Jul 04 16:52:33 2015', "ActaValue_3" = 'Sun Jul 05 11:18:02 2015',
"x" = '-1', "y" = '-1')

CREATE PLAN WF_continuous::'7bb26cd4-3e0c-412a-81f3-b5fdd687f507' ()

DECLARE
$l_Directory VARCHAR(255);
$l_File VARCHAR(255);

BEGIN

AlGUIComment("UI_DATA_XML" = '<UIDATA><MAINICON><LOCATION><X>0</X>
<Y>0</Y></LOCATION><SIZE><CX>216</CX><CY>-179</CY></SIZE></MAINICON>
<DESCRIPTION><LOCATION><X>0</X><Y>-190</Y></LOCATION><SIZE><CX>200</CX>
<CY>200</CY></SIZE><VISIBLE>0</VISIBLE></DESCRIPTION></UIDATA>',
"ui_display_name" = 'script',
"ui_script_text" = '$l_Directory='C:\\AW\\Files\\';
$l_File='flag.txt';
$g_count = $g_count + 1;
print('Execution #' || $g_count);
print('Starting ' || workflow_name() || '…');
sleep(10000);
print('Finishing ' || workflow_name() || '…');',
"x" = '116', "y" = '-175')

BEGIN_SCRIPT
$l_Directory = 'C:\AW\Files\';
$l_File = 'flag.txt';
$g_count = ($g_count + 1);
print(('Execution #' || $g_count));
print((('Starting ' || workflow_name()) || '…'));
sleep(10000);
print((('Finishing ' || workflow_name()) || '…'));
END

END

SET("loop_exit" = 'fn_check_flag($l_Directory, $l_File)',
"loop_exit_option" = 'yes', "restart_condition" = 'no', "restart_count" = '10',
"restart_count_option" = 'yes', "workflow_type" = 'Continuous')

Any command-line input or output is written as follows:

setup.exe SERVERINSTALL=Yes

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Open the workflow properties again to edit the continuous options using the Continuous Options tab."

Note
Warnings or important notes appear in a box like this.

Tip
Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/6565EN_Graphics.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Introduction to ETL Development

In this chapter, we will cover:

Preparing a database environment
Creating a source system database
Defining and creating staging area structures
Creating a target data warehouse

Introduction

Simply put, Extract-Transform-Load (ETL) is the engine of any data warehouse. The nature of the ETL system is straightforward:

Extract data from operational databases/systems
Transform data according to the requirements of your data warehouse so that the different pieces of data can be used together
Apply data quality transformation methods in order to cleanse data and ensure that it is reliable before it gets loaded into a data warehouse
Load conformed data into a data warehouse so that end users can access it via reporting tools, using client applications directly, or with the help of SQL-based query tools

While your data warehouse delivery structures or data marts represent the frontend or, in other words, what users see when they access the data, the ETL system itself is a backbone backend solution that does all the work of moving data and getting it ready in time for users to use. Building the ETL system can be a really challenging task, and though it is not part of the data warehouse data structures, it is definitely the key factor in defining the success of the data warehouse solution as a whole. In the end, who wants to use a data warehouse where the data is unreliable, corrupted, or sometimes even missing? This is exactly what ETL is responsible for getting right.

The data structure types most often used in ETL development to move data between sources and targets are flat files, XML datasets, and DBMS tables, both in normalized schemas and dimensional data models. When choosing an ETL solution, you might face two simple choices: building a hand-coded ETL solution or using a commercial one.

The following are some advantages of a hand-coded ETL solution:

A programming language allows you to build your own sophisticated transformations
You are more flexible in building the ETL architecture as you are not limited by the vendor's ETL abilities
Sometimes, it can be a cheap way of building a few simplistic ETL processes, whereas buying an ETL solution from a vendor can be overkill
You do not have to spend time learning the commercial ETL solution's architecture and functionality

Here are some advantages of a commercial ETL solution:

This is more often a simpler, faster, and cheaper development option as a variety of existing tools allow you to build a very sophisticated ETL architecture quickly
You do not have to be a professional programmer to use the tool
It automatically manages ETL metadata by collecting, storing, and presenting it to the ETL developer, which is another important aspect of any ETL solution
It has a huge range of additional ready-to-use functionality, from built-in schedulers to various connectors to existing systems, built-in data lineage, impact analysis reports, and many others

In the majority of DWH projects, the commercial ETL solution from a specific vendor, in spite of the higher immediate cost, eventually saves you a significant amount of money on the development and maintenance of ETL code.

SAP Data Services is an ETL solution provided by SAP and is part of the Enterprise Information Management product stack, which also includes SAP Information Steward; we will review this in one of the last chapters of this book.

Preparing a database environment

This recipe will lead you through the steps of preparing the working environment: a database environment to be utilized by ETL processes as the source, staging, and target systems for the migrated and transformed data.

Getting ready

To start the ETL development, we need to think about three things: the system that we will source the data from, our staging area (for initial extracts and as preliminary storage for data during subsequent transformation steps), and finally, the data warehouse itself, to which the data will eventually be delivered.

How to do it…

Throughout the book, we will use a 64-bit environment, so ensure that you download and install the 64-bit versions of the software components. Perform the following steps:

1. Let's start by preparing our source system. For quick deployment, we will choose the Microsoft SQL Server 2012 Express database, which is available for download at http://www.microsoft.com/en-nz/download/details.aspx?id=29062.
2. Click on the Download button and select the SQLEXPRWT_x64_ENU.exe file in the list of files that are available for download. This package contains everything required for the installation and configuration of the database server: the SQL Server Express database engine and the SQL Server Management Studio tool.
3. After the download is complete, run the executable file and follow the instructions on the screen. The installation of SQL Server 2012 Express is extremely straightforward, and all options can be set to their default values. There is no need to create any default databases during or after the installation as we will do it a bit later.

How it works…

After you have completed the installation, you should be able to run the SQL Server Management Studio application and connect to your database engine using the settings provided during the installation process.

If you have done everything correctly, you should see the "green" state of your Database Engine connection in the Object Explorer window of SQL Server Management Studio, as shown in the following screenshot:

We need an "empty" installation of MS SQL Server 2012 Express because we will create all the databases we need manually in the next steps of this chapter. This database engine installation will host all our source, stage, and target relational data structures. This option allows us to easily build a test environment that is perfect for learning purposes in order to become familiar with ETL development using SAP Data Services.

In a real-life scenario, your source databases, staging area database, and DWH database/appliance will most likely reside on separate server hosts, and they may sometimes be from different vendors. So, the role of SAP Data Services is to link them together in order to migrate data from one system to another.
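Before moving on, you can optionally verify the local (local)\SQLEXPRESS instance from a query window as well as from the Object Explorer. A minimal T-SQL check is shown below; the server and version strings returned will, of course, depend on your machine:

-- Confirm the instance is reachable and list the databases it currently hosts
SELECT @@SERVERNAME AS server_name, @@VERSION AS server_version;
SELECT name, state_desc FROM sys.databases ORDER BY name;

At this point, the second query should only return the system databases; the databases used in this book will be added in the next recipes.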

Creating a source system database

In this section, we will create our source database, which will play the role of an operational database that we will pull data from with the help of Data Services in order to transform the data and deliver it to a data warehouse.

How to do it…

Luckily for us, there are plenty of different flavors of ready-to-use databases on the Web nowadays. Let's pick one of the most popular ones: AdventureWorks OLTP for SQL Server 2012, which is available for download on the CodePlex website. Perform the following steps:

1. Use the following link to see the list of the files available for download: https://msftdbprodsamples.codeplex.com/releases/view/55330
2. Click on the AdventureWorks2012 Data File link, which should download the AdventureWorks2012_Data.mdf data file.
3. When the download is complete, copy the file into the C:\AdventureWorks\ directory (create it before copying if necessary).

The next step is to map this database file to our database engine, which will create our source database. To do this, perform the following steps:

1. Start SQL Server Management Studio.
2. Click on the New Query button, which will open a new session connection to the master database.
3. In the SQL Query window, type the following command and press F5 to execute it:

CREATE DATABASE AdventureWorks_OLTP ON
(FILENAME = 'C:\AdventureWorks\AdventureWorks2012_Data.mdf')
FOR ATTACH_REBUILD_LOG;

4. After a successful command execution and upon refreshing the database list (using F5), you should be able to see the AdventureWorks_OLTP database in the list of the available databases in the Object Explorer window of SQL Server Management Studio.

Tip
Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

How it works…

In a typical scenario, every SQL Server database consists of two files: a database file and a transaction log file. A database file contains the actual data structures and data, while a transaction log file keeps the transactional changes applied to the data.

As we only downloaded the data file, we had to execute the CREATE DATABASE command with a special ATTACH_REBUILD_LOG clause, which automatically creates the missing transaction log file so that the database can be successfully deployed and opened.

Now, our source database is ready to be used by Data Services in order to access, browse, and extract data from it.
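As a quick, optional sanity check that the attach worked and that a new transaction log file was indeed created, you can query the SQL Server catalog views (the physical path reported for the log file depends on where SQL Server created it):

-- The attached database and its files; ATTACH_REBUILD_LOG creates the missing log file
SELECT name, state_desc FROM sys.databases WHERE name = 'AdventureWorks_OLTP';
SELECT type_desc, name, physical_name
FROM sys.master_files
WHERE database_id = DB_ID('AdventureWorks_OLTP');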

There's more…

There are different ways to deploy test databases. This mainly depends on which RDBMS you use. Sometimes, you may find a package of SQL scripts that contains the commands required to create all the database structures and the commands used to insert data into these structures. This option may be useful if you have problems with attaching the downloaded mdf data file to your database engine or, for example, if you find SQL scripts created for the SQL Server RDBMS but have to apply them to an Oracle DB. With slight modifications to the commands, you can run them in order to create an Oracle database.

Explaining RDBMS technologies lies beyond the scope of this book. So, if you are looking for more information regarding how a specific RDBMS works, refer to the official documentation.

What has to be said here is that from the perspective of using Data Services, it does not matter which source or target systems you use. Data Services not only supports the majority of them, but it also creates its own representation of the source and target objects; this way, they all look the same to Data Services users and abide by the same rules within the Data Services environment. So, you really do not have to be a DBA or database developer to easily connect to any RDBMS from Data Services. All that is required is a knowledge of the SQL language to understand the principles of the methods that Data Services uses when extracting and loading data or creating database objects for you.

Defining and creating staging area structures

In this recipe, we will talk about the ETL data structures that will be used in this book. Staging structures are important storage areas where extracted data is kept before it gets transformed or stored between the transformation steps. The staging area in general can be used to create backup copies of data or to run analytical queries on the data in order to validate the transformations made or the extract processes. Staging data structures can be quite different, as you will see. Which one to use depends on the tasks you are trying to accomplish, your project requirements, and the architecture of the environment used.

How to do it…

The most popular data structures that could be used in the staging area are flat files and RDBMS tables.

Flat files

One of the perks of using Data Services over a hand-coded ETL solution is that Data Services allows you to easily read information from and write it to a flat file.

Create the C:\AW\ folder, which will be used throughout this book to store flat files.

Note
Inserting data into a flat file is faster than inserting data into an RDBMS table. So, during ETL development, flat files are often used to reach two goals simultaneously: creating a backup copy of the data snapshot and providing you with the storage location for your preliminary data before you apply the next set of transformation rules.

Another common use of flat files is the ability to exchange data between systems that cannot communicate with each other in any other way.

Lastly, it is very cost-effective to store flat files (OS disk storage space is cheaper than DB storage space).

The main disadvantage of the flat file storage method is that the modification of data in a flat file can sometimes be a real pain, not to mention that it is much slower than modifying data in a relational DB table.

RDBMS tables

These ETL data structures will be used more often than others to stage the data that is going through the ETL transformation process.

Let's create two separate databases for relational tables, which will play the role of the ETL staging area in our future examples (a scripted alternative is shown after these steps):

1. Open SQL Server Management Studio.
2. Right-click on the Databases icon and select the New Database… option.
3. On the next screen, input ODS as the database name, and specify 100 MB as the initial size value of the database file and 10 MB as that of the transaction log file:
4. Repeat the last two steps to create another database called STAGE.
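The scripted alternative mentioned above is a minimal T-SQL sketch of the same settings; the logical file names and paths here are only examples—adjust them to your environment and repeat the statement for the STAGE database:

-- Same 100 MB data file / 10 MB log file sizes as specified in the New Database… dialog
CREATE DATABASE ODS
ON PRIMARY (NAME = ODS_data, FILENAME = 'C:\AdventureWorks\ODS.mdf', SIZE = 100MB)
LOG ON (NAME = ODS_log, FILENAME = 'C:\AdventureWorks\ODS_log.ldf', SIZE = 10MB);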

How it works…

Let's recap. The ETL staging area is a location to store the preliminary results of our ETL transformations and also a landing zone for the extracts from the source system.

Yes, Data Services allows you to extract data and perform all transformations in memory before loading to the target system. However, as you will see in later chapters, an ETL process which does everything in one "go" can be complex and difficult to maintain. Plus, if something goes wrong along the way, all the changes that the process has already performed will be lost and you may have to start the extraction/transformation process again. This obviously creates an extra workload on the source system because you have to query it again in order to get the data. Finally, big does not mean effective. We will show you how splitting your ETL process into smaller pieces helps you to create a well-performing sequence of dataflows.

The ODS database will be used as a landing zone for the data coming from source systems. The structure of the tables here will be identical to the structure of the source system tables.

The STAGE database will hold the relational tables used to store data between the data transformation steps.

We will also store some data extracted from a source database in flat file format to demonstrate the ability of Data Services to work with flat files and show the convenience of this data storage method in the ETL system.

Creating a target data warehouse

Finally, this is the time to create our target data warehouse system. The data warehouse structures and tables will be used by end users with the help of various reporting tools to make sense of the data and analyze it. As a result, it should help business users to make strategic decisions, which will hopefully lead to business growth.

We should not forget that the main purpose of a data warehouse, and hence that of our ETL system, is to serve business needs.

Getting ready

The data warehouse created in this recipe will be used as a target database populated by the ETL processes developed in SAP Data Services. This is where the data modified and cleansed by ETL processes will be inserted in the end. Plus, this is the database that will mainly be accessed by business users and reporting tools.

How to do it…

Perform the following steps:

1. AdventureWorks comes to the rescue again. Use another link to download the AdventureWorks data warehouse data file, which will be mapped in the same manner to our SQL Server Express database engine in order to create a local data warehouse for our own learning purposes. Go to the following URL and click on the AdventureWorks DW for SQL Server 2012 link: https://msftdbprodsamples.codeplex.com/releases/view/105902
2. After you have successfully downloaded the AdventureWorksDW2012.zip file, unpack its contents into the same directory as the previous file: C:\AdventureWorks\
3. There should be two files in the archive:
AdventureWorksDW2012_Data.mdf—the database data file
AdventureWorksDW2012_Log.ldf—the database transaction log file
4. Open SQL Server Management Studio and click on the New Query… button in the uppermost toolbar.
5. Enter and execute the following command in the SQL Query window:

CREATE DATABASE AdventureWorks_DWH ON
(FILENAME = 'C:\AdventureWorks\AdventureWorksDW2012_Data.mdf'),
(FILENAME = 'C:\AdventureWorks\AdventureWorksDW2012_Log.ldf')
FOR ATTACH;

6. After a successful command execution, right-click on the Databases icon and choose the Refresh option in the opened menu list. This should refresh the contents of your object library, and you should see the following list of databases:

ODS
STAGE
AdventureWorks_OLTP
AdventureWorks_DWH

How it works…

Get yourself familiar with the tables of the created data warehouse. Throughout the whole book, you will be using them in order to insert, update, and delete data using Data Services.

There are also some diagrams available that could help you see the visual data warehouse structure. To get access to them, open SQL Server Management Studio, expand the Databases list in the Object Explorer window, then expand the AdventureWorks_DWH database object list, and finally open the Diagrams tree. Double-clicking on any diagram in the list opens a new window within Management Studio with the graphical presentation of tables, key columns, and links between the tables, which shows you the relationships between them.

There's more…

In the next recipe, we will have an overview of the knowledge resources that exist on the Web. We highly recommend that you get familiar with them in order to improve your data warehousing skills, learn about the data warehouse life cycle, and understand what makes a successful data warehouse project. In the meantime, feel free to open New Query in SQL Server Management Studio and start running SELECT commands to explore the contents of the tables in your AdventureWorks_DWH database.

Note
The most important asset of any DWH architect or ETL developer is not the knowledge of a programming language or the available tools but the ability to understand the data that is, or will be, populating the data warehouse and the business needs and requirements for this data.

Chapter 2. Configuring the Data Services Environment

In this chapter, we will install and configure all the components required for SAP Data Services, covering the following topics:

Creating IPS and Data Services repositories
Installing and configuring Information Platform Services
Installing and configuring Data Services
Configuring user access
Starting and stopping services
Administering tasks
Understanding the Designer tool

Introduction

The same thing that makes SAP Data Services a great ETL development environment also makes it a non-trivial one to install and configure. Here though, you have to remember that Data Services is an enterprise-class ETL solution that is able to solve the most complex ETL tasks.

See the following image for a very high-level Data Services architecture view. Data Services has two basic groups of components: client tools and server-based components.

Client tools include the following (there are more, but we mention the ones most often used):

The Designer tool: This is the client-based main GUI application for ETL development
Repository Manager: This is a client-based GUI application used to create, configure, and upgrade Data Services repositories

The main server-based components include the following ones:

IPS services: These are used for user authentication, system configuration storage, and internal metadata management
Job Server: This is the core engine service that executes ETL code
Access Server: This is a real-time request-reply message broker, which implements real-time services in the Data Services environment
Web application server: This provides access to some Data Services administration and reporting tasks via the DS Management Console and Central Management Console web-based applications

In the course of the next few recipes, we will install, configure, and access all the components required to perform the majority of ETL development tasks. You will learn about their purposes and some useful tips that will help you work effectively in the Data Services environment throughout the book and in your future work.

Data Services installation supports all major OS and database environments. For learning purposes, we have chosen the Windows OS as it involves the least configuration on the user's part. Both client tools and server components will be installed on the same Windows host.

Creating IPS and Data Services repositories

The IPS repository is storage for environment and user configuration information and for metadata collected by various services of IPS and Data Services. It has another name: the CMS database. This name should be quite familiar to those who have used SAP Business Intelligence software. Basically, IPS is a light version of the SAP BI product package. You will always use only one IPS repository per Data Services installation and most likely will deal with it only once: when configuring the environment at the very beginning. Most of the time, Data Services will be communicating with IPS services and the CMS database in the background, without you even noticing.

The Data Services repository is a different story. It is much closer to an ETL developer as it is a database that stores your developed code. In a multiuser development environment, every ETL developer usually has their own repository. Repositories can be of two types: central and local. They serve different purposes in the ETL life cycle, and I will explain this in more detail in the upcoming chapters. Meanwhile, let's create our first local Data Services repository.

Getting ready…

Both repositories will be stored in the same SQL Server Express RDBMS ((local)\SQLEXPRESS) that we used to create our source OLTP database, ETL staging databases, and target data warehouse. So, at this point, you only need to have access to SQL Server Management Studio, and your SQL Server Express services need to be started.

How to do it…

This will consist of two major tasks:

1. Creating the databases:
   1. Log in to SQL Server Management Studio and create two databases: IPS_CMS and DS_LOCAL_REPO (a minimal scripted alternative is shown after these steps).
   2. Right now, your database list should look like this:
2. Configuring the ODBC layer: The installation requires that you create the ODBC data source for the IPS_CMS database.
   1. Go to Control Panel | Administrative Tools | ODBC Data Sources (64-bit).
   2. Open the System DSN tab and click on the Add… button.
   3. Choose the name of the data source: SQL_IPS, the description SQL Server Express, and the SQL Server you want to connect to through this ODBC data source: (local)\SQLEXPRESS. Then, click on Next.
   4. Choose SQL Server authentication and select the checkbox Connect to SQL to obtain the default settings. Enter the login ID (the sa user) and password. Click on Next.
   5. Select the checkbox and change the default database to IPS_CMS. Click on Next.
   6. Skip the next screen by clicking on Next.
   7. The final screen of the ODBC configuration should look like the following screenshot. Then, clicking on the Test Data Source button should give you the message, TESTS COMPLETED SUCCESSFULLY!
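The scripted alternative mentioned in step 1 is just two statements; a minimal sketch, assuming the default file settings are acceptable for these empty repository databases:

-- Empty databases to be populated later by the IPS and Data Services installers
CREATE DATABASE IPS_CMS;
CREATE DATABASE DS_LOCAL_REPO;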

How it works…

These two empty databases will be used by the Data Services tools during installation and post-installation configuration tasks. All structures inside them will be created and populated automatically.

Usually, they are not built for users to access them directly, but in the upcoming chapters, I will show you a few tricks on how to extract valuable information from them in order to troubleshoot potential problems, do a little bit of ETL metadata reporting, or use an extended search for ETL objects, which is not possible in the GUI of the Designer tool.

The ODBC layer configured for the IPS_CMS database allows you to access it from the IPS installation. When we install both IPS and Data Services, you will be able to connect to the databases directly from the Data Services applications, as Data Services has native drivers for various types of databases and also allows you to connect through ODBC layers if you want.

See also

Later chapters of this book (see the Peeking inside the repository recipe in Chapter 5, Workflow – Controlling Execution Order) cover the repository query techniques mentioned in the preceding paragraph.

Installing and configuring Information Platform Services

The Information Platform Services (IPS) product package was added as a component to the Data Services bundle starting from the Data Services 4.x version. The reason for this was to make the Data Services architecture flexible and robust and to introduce some extra functionality, that is, a user management layer, to the existing SAP Data Services solution. As we mentioned before, IPS is a light version of the SAP BI core services and has a lot of similar functionality.

In this recipe, we will perform the installation and basic configuration of IPS, which is a mandatory component for the upcoming Data Services installation.

Tip
As an option, you could always use an existing full enterprise SAP BI solution if you have it installed in your environment. However, this is generally considered a bad practice. It is like storing all your eggs in one basket: whenever you need to plan downtime for your BI system, you should keep in mind that it will affect your ETL environment as well, and you will not be able to run any Data Services jobs during this period. That is why IPS is installed to be used only by Data Services, as a safer and more convenient option in terms of support and maintenance.

Getting ready…

Download the Information Platform Services installation package from the SAP support portal and unzip it to the location of your choice. The main requirement for installing IPS, as well as Data Services in the next recipe, is that your OS should have a 64-bit architecture.

How to do it…

1. Create an EIM folder on your C drive to store your installation in one place.
2. Launch the IPS installer by executing InstallIPS.exe.
3. Make sure that all your critical prerequisites have the Succeeded status on the Check Prerequisites screen. Continue to the next screen.
4. Choose C:\EIM\ as the installation destination folder. Continue to the next screen.
5. Choose the Full installation type. Continue to the next screen.
6. On Select Default or Existing Database, choose Configure an existing database and continue to the next screen.
7. Select Microsoft SQL Server using ODBC as the existing CMS database type.
8. Select No auditing database on the next screen and continue.
9. Choose Install the default Tomcat Java Web Application Server and automatically deploy web applications. Continue to the next screen.
10. For version management, choose Do not configure a version control system at this time.
11. On the next screen, specify the SIA name in the Node name field as IPS and the SIA port as 6410.
12. Do not change the default CMS port, 6400.
13. On the CMS account configuration screen, input passwords for the administrator user account and the CMS cluster key (they can be the same if you want). Continue further.
14. Use the settings from the following screenshot to configure the CMS Repository Database:
15. Leave the default values for the Tomcat ports on the next screen and click on Next. Remember the Connection Port setting (the default is 8080) as you will require it to connect to the IPS and Data Services web applications.
16. Do not configure connectivity to SMD Agent.
17. Do not configure connectivity to Introscope Enterprise Manager.
18. Finally, the installation will begin. It should take approximately 5–15 minutes, depending on your hardware.

How it works…

Now, by installing IPS, we have prepared the base layers on top of which we will install the Data Services installation package itself.

To check that your IPS installation was successful, start the Central Management Console web application using the http://localhost:8080/BOE/CMC URL and use the administrator account that you set up during the IPS installation to log in. In the System field, use localhost:6400 (your hostname and the CMS port number specified during the IPS installation).

Check out the Core Services tree in the Servers section of CMC. All services listed should have the Running and Enabled statuses.

Installing and configuring Data Services

The installation of Data Services in a Windows environment is a smooth and quick process. Of course, you have various installation options, but here, we will choose the easiest path: the full installation of all components on the same host, with the IPS services installed and the local repository already created and configured.

Getting ready…

Completion of the previous recipe should prepare your environment to install Data Services. Download the Data Services installation package from the SAP support portal and unzip it to a local folder.

How to do it…

1. Start the Data Services installer from the Windows command line (cmd) by executing this command:

setup.exe SERVERINSTALL=Yes

2. Make sure that all your critical prerequisites have the Succeeded status on the Check Prerequisites screen.
3. Choose the destination folder as C:\EIM\ if required.
4. On the CMS connection information step, specify the connection details to your previously installed CMS (part of IPS). The system is localhost:6400, and the user is Administrator. Click on Next.
5. In the CMS Service Stop/Start pop-up window, agree to restart the SIA servers.
6. Choose Install with default configuration on the Installation Type selection screen.
7. Make sure that you select all features by selecting all the checkboxes on the next feature selection screen and click on Next.
8. Specify Microsoft_SQL_Server as the database type for a local repository.
9. Use the following details as a reference to configure your local repository database connection on the next screen:

Option                      Value
Registration name for CMS   DS4_REPO
Database Type               Microsoft_SQL_Server
Database server name        (local)\SQLEXPRESS
Database port               50664
Database name               DS_LOCAL_REPO
User Name                   sa
Password                    <sa user password>

10. For login information, choose the account recommended by the installation.
11. The installation should be completed in 5–10 minutes, depending on your environment.

How it works…

After finishing this recipe, you will have all the Data Services server and client components installed on the same Windows host. Also, your Data Services installation is integrated with the IPS services.

To check that the installation and integration were successful, log in to CMC and see that in the main menu, there is a new section called Data Services (see the Organize column). Go to this section and see whether your DS4_REPO exists in the list of local repositories.

Configuring user access

In this recipe, I will show you how to configure your access as a fresh ETL developer in a Data Services environment. We will create a user account, assign all the required functional privileges, and assign owner privileges for our local Data Services repository. In a multiuser development environment, you would need to perform this step for every newly created user.

Getting ready…

Choose the username and password for your ETL developer user account. We will log in to the CMC application to create a user account and grant it the required set of privileges.

How to do it…

1. Launch the Central Management Console web application.
2. Go to Users and Groups.
3. Click on the Create a user button (see the following screenshot):
4. In the opened window, choose a username (we picked etl) and password. Also, select the Password never expires option and unselect User must change password at next logon. Choose Concurrent User as the connection type.
5. Now, we should add our newly created account to two pre-existing user groups. Right-click on the user and choose the Member Of option in the right-click menu.
6. Click on the Join Group button in the newly opened window and add two groups from the group list to the right window panel: Data Services Administrator Users and Data Services Designer Users. Click on OK.
7. From the left-side instrument panel, click on the CMC Home button to return to the main CMC screen.
8. Now, we have to grant our user extra privileges on the local repository. For this, open the Data Services section, right-click on DS4_REPO, and choose User Security from the context menu.
9. Click on the Add principals button, move the etl user to the right panel, and click on the Add and Assign Security button at the bottom of the screen.
10. On the next screen, assign the full control (owner) access level on the Access Levels tab and go to the Advanced tab.
11. Click on the Add/Remove Rights link and set the following two options that appear to Granted for the Data Services Repository application (see the following screenshot):
12. Click on OK in the Assign Security window to confirm your configuration.
13. As a test, log out of the CMC and log in using the newly created user account.

How it works…

In a complex enterprise environment, you can create multiple groups for different categories of users. You have full flexibility to provide users with various kinds of permissions, depending on their needs.

Some users might require administration privileges to start/stop services and to manage repositories, without the need to develop ETL and access Designer.

The ETL developer role might require only permissions for the Designer tool to develop ETL code.

In our case, we have created a single user account that has both administration and developer privileges.


Starting and stopping services
In this recipe, I will explain how you can restart the services of all the main components in your Data Services environment.

How to do it…
This relates to the three different services:

Web application server:
The Tomcat application server configured in our environment can be managed from two places:
Computer Management | Services and Applications | Services, where it exists as a standard Windows service, BOEXI40Tomcat
The Central Configuration Management tool installed as a part of the IPS product package:
Using this tool, you can:
1. Start/stop services.
2. Back up and restore the system configuration.
3. Specify the Windows user who starts and stops the underlying services.

Data Services Job Server:
To manage the Data Services Job Server in the Windows environment, SAP created a separate GUI application called Data Services Server Manager. Using this tool, you can perform the following tasks:
1. Restart the Job Server.
2. Create and configure Job Servers.
3. Create and configure Access Servers.
4. Perform SSL configuration.
5. Set up a pageable cache directory.
6. Perform SMTP configuration for the smtp_to() Data Services function.

Information Platform Services:
To manipulate these services, you have two options:
Central Management Console (to stop/start services and configure service parameters)
Central Configuration Management (to stop/start services)

In most cases, you will be using the CMC option, as it is a quick and convenient way to access all the services included in the IPS package. It also allows you to see much more service-related information.

The second option is useful if you have the application server stopped for some reason (CMC, as a web-based application, will not be working, of course) and you still need to access the IPS services to perform basic administration tasks such as restarting them.

How it works…
Sometimes things turn sour, and restarting services is the quickest and easiest way to return them to a normal state. In this recipe, I mentioned all the main server components and the points of access for performing such a task.

The last thing you should keep in mind here is the recommended startup/shutdown sequence of those components:

1. The first thing that should start after Windows starts is your database server, as it hosts the CMS database required for the IPS services and the Data Services local repository.
2. Second, you should start the IPS services (the main one is the CMS service) as an underlying layer for Data Services.
3. Then, it is the turn of the Data Services Job Server.
4. Finally, start Tomcat (the web application server), which provides users with access to the web-based applications.

See also
I definitely recommend that you get familiar with the SAP Data Services Administrator's Guide to understand the details regarding IPS and Data Services component management and configuration. See also the knowledge sources and documentation links from Chapter 1, Introduction to ETL Development.


Administering tasks
The previous recipe is part of the basic administration tasks too, of course. I separated it from the current one as I wanted to put an accent on Data Services architecture details by explaining the main Data Services components in relation to the methods and tools you can use to manipulate them.

How to do it…
Here, we will look at some of the most important administrative tasks.

1. Using Repository Manager:
As you can probably remember, there are two types of repositories in Data Services: the local repository and the central repository. They serve different purposes but can be created in quite a similar way: with the help of the Data Services Repository Manager tool.
This is a GUI-based tool available on your Windows machine and installed with the other client tools.
As we already have one repository created and configured automatically during the Data Services installation, let's check its version using the Repository Manager tool.
Launch Repository Manager and enter the following values for the corresponding options:

Field                  Value
Repository type        Local
Database Type          Microsoft SQL Server
Database server name   (local)\SQLEXPRESS
Database name          DS_LOCAL_REPO
User Name              sa
Password               *******

After entering these details, you have several options:

Create: This option creates repository objects in the defined database. As we already have a repository in DS_LOCAL_REPO, the application will ask us whether we want to reset the existing repository. Sometimes, this can be useful, but keep in mind that it will cleanse the repository of all objects, and if you are not careful, all your ETL that resides in the repository can be lost.

Upgrade: This option upgrades the repository to the version of the Repository Manager tool. It is useful during software upgrades. After installing the new version of IPS and Data Services, you have to upgrade your repository contents as well. This is when you launch the Repository Manager tool (which has already been updated) and upgrade your repository to the current version.

Get version: This is the safest option of them all. It just returns the string containing the repository version number. In our case, it returned: BODI-320030: The local repository version: <14.2.4.0>.

2. Using Server Manager and CMC to register the new repository:
After you create the new repository with Repository Manager, you have to register it in IPS and link it to the existing Job Server.
To register a new repository in IPS, use the following steps:
1. Launch Central Management Console.
2. Open the Data Services section from the CMC home page.
3. Go to Manage | Configure Repository.
4. Enter the database details of your newly created repository and click on Save.
5. To assign users a required set of privileges, use User Security when right-clicking on the repository in the list. For details, see the Configuring user access recipe.
To link a new repository to the Job Server, perform these steps:
1. Launch the Data Services Server Manager tool.
2. Choose the Job Server tab.
3. Click on the Configuration Editor… button.
4. Select the Job Server and click on the Edit… button.
5. In the Associated Repositories panel, click on the Add… button and fill in the database-related information of the new repository in the corresponding fields on the right-hand side.
6. Use the Close and Restart button in the Data Services Server Manager tool to apply the changes made to the Job Server.

3. Using License Manager:
1. License Manager exists only in a command-line mode.
2. Use the following syntax to run License Manager:
LicenseManager [-v | -a <keycode> | -r <keycode> [-l <location>]]
3. Use the -v option to view existing license keys, -a to add a new license key, and -r to remove the existing license key from the -l location specified.
This tool is available at C:\EIM\DataServices\bin\.
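
For example, to list the license keys currently registered in your installation, you can call the tool with the -v option described above (a quick sketch run from a command prompt; the exact output format depends on your installation):

cd C:\EIM\DataServices\bin
LicenseManager -v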


How it works…
Creating and configuring a new local repository is usually required when you set up an environment for a new ETL developer, or when you want to use an extra repository to migrate your ETL for testing purposes or to test a repository upgrade.

After creating a new local repository, you should always link it to an existing Job Server. This link ensures that the Job Server is aware of the repository and can execute jobs from it.

Finally, License Manager can be used to see the license keys used in your installation and to add new ones if required.

See also
You can practice your Data Services admin skills by creating a new database and a new local Data Services repository. Do not forget that you do not just have to create it, but also register it with the IPS services and the Data Services Job Server so that you can successfully run jobs from it.

Some other administrative tasks can be found in the following recipes:

The Starting and stopping services recipe from this chapter
The Configure ODBC layer point from the How to do it… section of the Creating IPS and Data Services repositories recipe of this chapter


Understanding the Designer tool
Now that we have reviewed all the important server and client components of our new Data Services installation, it is time to get familiar with the most heavily used and most important tool in the Data Services product package. It will be our main focus in the following chapters, and of course, I am talking about our development GUI: the Designer tool.

Every object you create in Designer is stored in a local object library, which is a logical storage unit that is part of the physical local repository database. In this recipe, we will log in to a local repository via Designer, set up a couple of settings, and write our first "Hello World" program.

Getting ready…
Your Data Services ETL development environment is fully deployed and configured, so go ahead and start the Designer application.

How to do it…
First, let's change some default options to make our development life a little bit easier and to see how the options windows in Data Services look:

1. When you launch your Designer application, you see quite a sophisticated login screen. Enter the etl username we created in one of the previous recipes and its password to see the list of repositories available in the system.
2. At this point, you should see only one local repository, DS4_REPO, which was created by default during the Data Services installation. Double-click on it.
3. You should see your Designer application started.
4. Go to Tools | Options.
5. In the opened window, expand the Designer tree and choose General.
6. Set the Number of characters in workspace icon name option to 50 and select the Automatically calculate column mappings checkbox.
7. Click on OK to close the options window.

Before we create our first "Hello World" program, let's quickly take a look at Designer's user interface.

In this recipe, you will be required to work with only two areas: Local Object Library and the main development area. The biggest window on the right-hand side, with the Start Page tab, will open by default.

Local Object Library contains tabs with lists of objects you can create or use during your ETL development. These objects include Projects, Jobs, Work Flows, Data Flows, Transforms, Datastores, Formats, and Custom Functions:

All tabs are empty, as you have not created any objects of any kind yet, except for the Transforms tab. This tab contains a predefined set of transforms available for you to use for ETL development. Data Services does not allow you to create your own transforms (there is an exception that we will discuss in the upcoming chapters). So, everything you see on this tab is basically everything that is available for you to manipulate your data with.

Now, let's create our first "Hello World" program. As ETL development in Data Services is not quite the usual experience of developing with a programming language, we should agree on what our first program should do. In almost any programming-language-related book, this kind of program just outputs a "Hello World" string onto your screen. In our case, we will generate a "Hello World" string and output it into a table that will be automatically created by Data Services in our target database.


In the Designer application, go to the Local Object Library window, choose the Jobs tab, right-click on the Batch Jobs tree, and select New from the list of options that appears.

1. Choose the name for the new job, Job_HelloWorld, and enter it. After the job is created, double-click on it.
2. You will enter the job design window (see Job_HelloWorld – Job at the bottom of the application), and now you can add objects to your job and set up its variables and parameters.
3. In the design window of the Job_HelloWorld – Job tab, create a dataflow. To do this, choose the Data Flow object from the right tool panel and left-click on the main design window to create it. Name it DF_HelloWorld.
4. Double-click on the newly created dataflow (or just click once on its title) to open the Data Flow design window. It appears as another tab in the main design window area.
5. Now, when we are designing the processing unit, or dataflow, we can choose transforms from the Transforms tab of the Local Object Library window to perform manipulations with the data. Click on the Transforms tab.
6. Here, select the Platform transforms tree and drag and drop the Row_Generation transform from it to the Data Flow design window.

Note
As we are generating a new "Hello World!" string, we should use the Row_Generation transform. It is a very useful way of generating rows in Data Services. All other transforms perform operations on the rows extracted from source objects (tables or files) that are passing from source to target within a dataflow. In this example, we do not have a source table. Hence, we have to generate a record.

7. By default, the Row_Generation transform generates only one row with the ID as 0. Now, we have to create our string and present it as a field in a future target table. For this, we need to use the Query transform. Select it from the right tool panel or drag and drop it from Transforms | Platform. The icon of the Query transform looks like this:
8. In the Data Flow design window, link Row_Generation to Query, as shown here, and double-click on the Query transform to open the Query Editor tab:

Note
In the next chapter, we will explain the details of the Query transform. In the meantime, let's just say that this is one of the most used transforms in Data Services. It allows you to join flows of your data and modify the dataset by adding/removing columns in the row, changing data types, and performing grouping operations. On the left-hand side of the Query Editor, you will see the incoming set of columns, and on the right-hand side, you will see the output. This is where you will define all your transformation functions for specific fields or assign hard-coded values. We are not interested in the incoming ID generated by the Row_Generation transform. For us, it served the purpose of creating a row that will hold our "Hello World!" value and will be inserted into a table.

9. In the right panel of the Query Editor, right-click on Query and choose New Output Column…:
10. Select the following settings in the opened Column Properties window to define the properties of our newly created column and click on OK:
11. Now that our generated row has one column, we have to populate it with a value. For this, we have to use the Mapping tab in the Query Editor. Select our output field TEXT and enter the 'Hello World!' value in the Mapping tab window. Do not forget the single quotes, which denote a string in DS. Then, close the Query Editor either with the tab cross in the top-right corner (do not confuse it with the Designer application cross that is located dangerously close to it) or just use the Back button (Alt+Left), a green arrow icon in the top instrument panel.

At this point, we have a source in our dataflow. We also have a transformation object (the Query transform), which defines our text column and assigns a value to it. What is missing is a target object we will insert our row into.

As we will use a table as a target object, we have to create a reference to a database within Data Services. We will use this reference to create a target table. Those database references are called datastores and are used as a representation of the database layer. In the next step, we will create a reference to our STAGE database created in the previous chapter.

12. Go to the Datastores tab of the Local Object Library. Then, right-click on the empty window and select New to open the Create New Datastore window.
13. Choose the following settings for the newly created datastore object:


14. Repeat steps 12 and 13 to create the rest of the datastore objects connected to the databases we created in the previous recipes. Use the same database server name and user credentials and change only the Datastore Name and Database name fields when creating the new datastores. See the following table for reference:

Datastore Name   Database name
DS_ODS           ODS
DWH              AdventureWorks_DWH
OLTP             AdventureWorks_OLTP

Now, you should have four datastores created, referencing all the databases created in the SQL Server: DS_STAGE, DS_ODS, DWH, and OLTP.

15. Now, we can use the DS_STAGE datastore to create our target table. Go back to DF_HelloWorld in the Data Flow tab of the design window and select Template Table on the right tool panel. Put it on the right-hand side of the Query transform and choose HELLO_WORLD as the table name in the DS_STAGE datastore.
16. Our final dataflow should look like this now:
17. Go back to the Job_HelloWorld – Job tab and click on the Validate All button in the top instrument panel. You should get the following message in the output window of Designer on the left-hand side of your screen: Validate: No Errors Found (BODI-1270017).
18. Now, we are ready to execute our first job. For this, use the Execute… (F8) button from the top instrument panel. Agree to save the current objects and click on OK on the following screen.
19. Check that the log screen showing the execution steps contains no execution errors. Then, go to your SQL Server Management Studio, open the STAGE database, and check the contents of the newly created HELLO_WORLD table. It has just one column, TEXT, with only one value, "Hello World!".


How it works…
"Hello World!" is a small example that introduces a lot of general and even sophisticated concepts. In the following sections, we will quickly review the most important ones. They will help you get familiar with the development environment in Data Services Designer. Keep in mind that we will return to all these subjects again throughout the book, discussing them in more detail.

Executing ETL code in Data Services
To execute any ETL code developed in the Data Services Designer tool, you have to create a job object. In Data Services, the only executable object is the job. Everything else goes inside the job.

ETL code is organized as a hierarchy of objects inside the job object. To modify an object by placing another object in it, you have to open the edited object in the main workspace design area and then drag and drop the required object inside it, placing it in the workspace area. In our recipe, we created a job object and placed the dataflow object in it. We then opened the dataflow object in the workspace area and placed transform objects inside it. As you can see in the following screenshot, workspace areas opened previously can be accessed through the tabs at the bottom of the workspace area:

The Project Area panel can display the hierarchy of objects in the form of a tree. To see it, you have to assign your newly created job to a specific project and open the project in Project Area by double-clicking on the project object in Local Object Library.

Executable ETL code contains one job object and can contain script, dataflow, and workflow objects combined in various ways inside the job.

As you saw from the recipe steps, you can create a new job by going to Local Object Library | Jobs.

Although you can combine all types of objects by placing them in the job directly, some objects, for example, transform objects, can be placed only into dataflow objects, as the dataflow is the only type of object that can process and actually migrate data (on a row-by-row basis). Hence, all transformations should happen only inside a dataflow. In the same way, you can only place datastore objects, such as tables and views, directly in dataflows as source and target objects for data to be moved from source to target and transformed along the way. When a dataflow object is executed within the job, it reads data row by row from the source and moves each row from left to right to the next transform object inside the dataflow until it reaches the end and is sent to the target object, which usually is a database table.

Throughout this book, you will learn the purpose of each object type and how and when it can be used.

For now, remember that all objects inside the job are executed in sequential order from left to right if they are connected, and simultaneously if they are not. Another important rule is that the parent object starts executing first and then all the objects inside it. The parent object completes its execution only after all child objects have completed successfully.

Validating ETL code
To avoid job execution failures due to incorrect ETL syntax, you can validate the job and all its objects with the Validate Current or Validate All button on the top instrument panel inside the Designer tool:

Validate Current validates only the current object opened in the workspace design area and the script objects in it; it does not validate the underlying child objects such as dataflows and workflows. In the preceding example, the object opened in the workspace is a job object that has one child dataflow object called DF_HelloWorld inside it. Only the job object will be validated and not DF_HelloWorld.

Validate All validates the current and all underlying objects. So, both the object currently opened in the workspace and all the objects you see in the workspace are validated. The same applies to the objects nested inside them, down to the very end of the object hierarchy.

So, to validate the whole job and its objects, you have to go to the job level by opening the job object in the workspace area and clicking on the Validate All button on the top instrument panel.

Validation results are displayed in the Output panel. Warning messages do not affect the execution of the job and often indicate possible ETL design problems or show data type conversions performed by Data Services automatically. Error messages in the Output | Errors tab mean syntax or critical design errors made in the ETL. Whenever you try to run the job after seeing "red" error validation messages, the job will fail with exactly the same errors that you saw at the beginning of execution, as every job is implicitly validated when executed.

Always validate your job manually before executing it to avoid job failures due to incorrect syntax or incorrect ETL design.

Template tables
This is a convenient way to specify a target table that does not yet exist in the database and send data to it. When the dataflow object in which the template target table object is placed is executed, it runs two DDL commands, DROP TABLE <template table name> and CREATE TABLE <template table name>, using the output schema (set of columns) of the last object inside the dataflow before the target template table. Only after that does the dataflow process all the data from the source, passing rows from left to right through all transformations, and finally insert the data into the freshly created target table.

Note
Note that tables are not created on the database level from template tables until the ETL code (dataflow object) is executed within Data Services. Simply placing the template table object inside a dataflow and creating it in a datastore structure is not enough for the actual physical table to be created in the database. You have to run your code.

Template tables are displayed under a different category in the datastore. They appear separately from normal table objects:

The usage of template tables is extremely useful during ETL development and testing. It saves you from having to go to the database level and change the structure of the tables by altering, deleting, or creating them manually whenever the ETL code that inserts data into the table changes. Every time the dataflow runs, it deletes and recreates the database table defined through the template table object, with the currently required table structure defined by your current ETL code.
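
To make the mechanism more tangible, here is a rough sketch of the statements Data Services implicitly issues against the STAGE database for the HELLO_WORLD template table from our example. The column definition is an assumption derived from the output schema of the preceding Query transform (a varchar(100) TEXT column), so treat it as an illustration rather than the exact DDL generated:

DROP TABLE DBO.HELLO_WORLD;
CREATE TABLE DBO.HELLO_WORLD (TEXT varchar(100));
-- only after these DDL commands does the dataflow start loading rows, e.g.:
INSERT INTO DBO.HELLO_WORLD (TEXT) VALUES ('Hello World!');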

Template table objects are easily converted to normal table objects using the Import command on them. This command is available from the object's context menu in the dataflow workspace or on the Datastores tab in Local Object Library.

Query transform basics
The Query transform is one of the most important and most often used transform objects in Data Services. Its main purpose is to read data from the left object(s) (input schema(s)) and send data to the output schema (the object to the right of the Query transform). You can join multiple datasets with the help of the Query transform using the syntax rules of the SQL language.

Additionally, you can specify the mapping rules for the output schema columns inside the Query transform by applying various functions to the mapped fields. You can also specify hard-coded values or even create additional output schema columns, like we did in our Hello World example.

The example in the next screenshot is not from our Hello World example. However, it demonstrates how the row extracted previously from the source object (input schema) can be augmented with extra columns, or can get its columns renamed or its values transformed by functions applied to the columns:

See how columns from two different tables are combined into a single dataset in the output schema, with columns renamed according to new standards and new columns created with NULL values in them.

The Hello World example
You have just created the simplest dataflow processing unit and executed it within your first job.

The dataflow object in our example has the Row_Generation transform, which generates rows with only one field. We generated one row with the help of this transform and added an extra field to the row with the help of the Query transform. We then inserted our final row into the HELLO_WORLD table created automatically by Data Services in the STAGE database.


You have also configured a couple of Designer properties and created a datastore object that represents the Data Services view of the underlying database level. Not all database objects (tables and views) are visible within your datastore by default. You have to import only those you are going to work with. In our Hello World example, we did not import the table into the datastore, as we used the template table. To import a table that exists in the database into your datastore so that it can be used in ETL development, you can perform the following steps:

1. Go to Local Object Library | Datastores.
2. Expand the datastore object you want to import the table into.
3. Double-click on the Tables section to open the list of database tables available for import:
4. Right-click on the specific table in the External Metadata list and choose Import from the table context menu.
5. The table object will now appear in the Tables section of the chosen datastore. As it has not yet been placed in any dataflow object, the Usage column shows a 0 value:

Creating different datastores for the same database can also be a flexible and convenient way of categorizing your source and target systems.

There is also a concept of configurations: you can create multiple configurations of the same datastore with different parameters and switch between them. This is very useful when you are working in a complex development environment with development, test, and production databases. However, this is a topic for future discussion in the upcoming chapters.


Chapter 3. Data Services Basics – Data Types, Scripting Language, and Functions
In this chapter, I will introduce you to the scripting language in Data Services. We will cover the following topics:

Creating variables and parameters
Creating a script
Using string functions
Using date functions
Using conversion functions
Using database functions
Using aggregate functions
Using math functions
Using miscellaneous functions
Creating custom functions

Introduction
It is easy to underestimate the importance of the scripting language in Data Services, but you should not fall into this pitfall. In simple words, the scripting language is the glue that allows you to build smart and reliable ETL and unite all processing units of work (which are dataflow objects) together.

The scripting language in Data Services is mainly used to create custom functions and script objects. Script objects rarely perform data movement and data transformation. They are used to assist the dataflow objects (the main data migration and transformation processes). They are usually placed before and after them to assist with execution logic and to calculate the execution parameter values for the processes that extract, transform, and load the data.

The scripting language in Data Services is armed with powerful functions that allow you to query databases, execute database stored procedures, and perform sophisticated calculations and data validations. It even supports regular expression matching techniques, and, of course, it allows you to build your own custom functions. These functions can be used not just in scripts but also in the mappings of Query transforms inside dataflows.

Without further delay, let's get to learning the scripting language.


Creating variables and parameters
In this recipe, we will extend the functionality of our HelloWorld dataflow (see the Understanding the Designer tool recipe from Chapter 2, Configuring the Data Services Environment). Along with the first row saying "Hello World!", we will generate a second row providing the name of the Data Services job that generated the greetings.

This example will not just allow us to get familiar with how variables and parameters are created but will also introduce us to one of the Data Services functions.

Getting ready
Launch your Designer tool and open the Job_HelloWorld job created in the previous chapter.

How to do it…
We will parameterize our dataflow so that it can receive the external value of the name of the job in which it is being executed, and create the second row accordingly.

We will also require an extra object in our job, in the form of a script that will be executed before the dataflow and that will initialize our variables before passing their values to the dataflow parameters.

1. Using the script button from the right instrument panel, create a script object. Name it scr_init, and place it to the left of your dataflow. Do not forget to link them, as shown in the following screenshot:
2. To create dataflow parameters, click on the dataflow object to open it in the main workspace window.
3. Open the Variables and Parameters panel. All panels in Designer can be enabled/displayed with the help of the buttons located in the top instrument panel, as in the following screenshot:
4. If they are not displayed on your screen, click on the Variables button on the top instrument panel. Then, right-click on Parameters and choose Insert from the context menu. Specify the following values for the new input parameter:

Note
Note that the $ sign is very important when you reference a variable or parameter, as it defines the parameter in Data Services and is required so that the compiler can parse it correctly. Otherwise, it will be interpreted by Data Services as a text string. Data Services automatically puts the dollar sign in when you create a new variable or parameter from the panel menus. However, you should not forget to use it when you are referencing the parameter or variable in your script or in the Calls section of the dataflow.

5. Now, let's create a job variable that we will use to pass the value defined in the script to the dataflow parameter. For this, use the Back (Alt+Left) button to go to the job level (so that its content is displayed in the main design window). Then, right-click on Variables in the Variables and Parameters panel and choose Insert from the context menu to insert a new variable. Name it $l_JobName and assign the varchar(100) data type to it, which is the same as the dataflow parameter created earlier.
6. To pass variable values from the job to the input parameter of the dataflow, go to the Calls tab of the Variables and Parameters panel on the job design level. Here, you should see the input dataflow $p_JobName parameter with an empty value.
7. Double-click on the $p_JobName parameter and reference the $l_JobName variable in the Value field of the Parameter Value window. Click on OK:
8. Assign a value to the job variable in the previously created script object. To do this, open the script in the main design window and insert the following code in it: $l_JobName = 'Job_HelloWorld';
9. Finally, let's modify the dataflow to generate a new column in the target table. For this, open the dataflow in the main design window.
10. Open the Query transform and right-click on the TEXT column to go to New Output Column… | Insert Below.
11. In the opened Column Properties window, specify JOB_NAME as the name of the new column and assign it the same data type, varchar(100).
12. In the Mapping tab of the Query transform for the JOB_NAME column, specify the 'Created by ' || $p_JobName string.
13. Go back to the job context and create a new global variable, $g_JobName, by right-clicking on the Global Variables section and selecting Insert from the context menu.
14. Your final Query output should look like this:
15. Now, go back to the job level and execute it. You will be asked to save your work and choose the execution parameters. At this point, we are not interested in modifying them, so just continue with the default ones.
16. After executing the job in Designer, go to Management Studio and query the HELLO_WORLD table to see that a new column has appeared with the 'Created by Job_HelloWorld' value.


How it works…
All main objects in Data Services (dataflow, workflow, and job) can have local variables or parameters defined. The difference between an object variable and an object parameter is very subtle. Parameters are created and used to accept values from other objects (input parameters) or pass them outside of the object (output parameters). Otherwise, parameters can behave in the same way as local variables — you can use them in the local functions or use them to store and pass values to other variables or parameters. Dataflow objects can only have parameters defined, but not local variables. See the following screenshot of the earlier example:

Workflow and job objects, on the other hand, can only have local variables defined, but not parameters. Local variables are used to store values locally within the object to perform various operations on them. As you have seen, they can be passed to the objects that are "calling" for them (go to Variables and Parameters | Calls).

There is another type of variable called a global variable. These variables are defined at the job level and shared among all objects placed in the job structure.

What you have done in this recipe is a common practice in Data Services ETL development: passing variable values from the parent object (the job in our example) to the child object (dataflow) parameters.

To keep things simple, you can specify hard-coded values for the input dataflow parameters, but this is usually considered bad practice.

What we could also do in our example is pass global variable values to dataflow parameters. Global variables are created at the very top job level and are shared by all nested objects, not just the immediate job child objects. That is why they are called global. They can be created only in the job context, as shown here:

Also, note that in Data Services, you cannot reference parent object variables directly in child objects. You always have to create input child object parameters and map them on the parent level (using the Calls tab of the Variables and Parameters panel) to local parent variables. Only after doing this can you go into your child object and map its parameters to the local child object's variables.

Now, you can see that parameters are not the same thing as variables, and they carry an extra function of bridging variable scope between parent and child. In fact, you do not have to map them to a local variable inside a child object if you are not going to modify them. You can use parameters directly in your calculations/column mappings.

The last thing to say here is that dataflows do not have local variables at all. They can only accept values from their parents and use them in function calls/column mappings. That is because you do not write scripts inside a dataflow object. Scripts are only created at the job or workflow level, or inside custom functions, which have their own variable scope.

Data types available in Data Services are similar to common programming language data types. For a more detailed description, reference the official Data Services documentation.

Note
The blob and long data types can only be used by structures created inside a dataflow or, in other words, columns. You cannot create script variables or dataflow/workflow parameters of the blob or long data types.

There's more…
Try to modify your Job_HelloWorld job to pass a global variable value to the dataflow parameter directly. To do this, use the previously created global variable $g_JobName, specify a hard-coded value for it (or assign it a value inside a script, as we did with the local variable), and map it to the input dataflow parameter on the Calls tab of the Variables and Parameters panel in the job context. Do not forget to run the job and see the result.


Creating a script
Yes, technically we created our first script in the previous recipe, but let's be honest — it is not the most advanced script in the world, and it does not provide us with much knowledge regarding scripting language capabilities in Data Services. Finally, although simplicity is usually a virtue, it would be nice to create a script that has more than one row in it.

In the following recipe, we will create a script that does some data manipulation and a little bit of text processing before passing a value to a dataflow input parameter.

How to do it…
Clear the contents of your scr_init script object and add the following lines. Note that every command or function call should end with a semicolon:

# Script which determines the name of the job and
# prepares it for the dataflow input parameter
print('INFO: scr_init script has started...');
while ($l_JobName IS NULL)
begin
    if ($g_JobName IS NOT NULL)
    begin
        print('INFO: assigning $g_JobName value'
            || ' of {$g_JobName} to a $l_JobName variable...');
        $l_JobName = $g_JobName;
    end
    else
    begin
        print('INFO: global variable $g_JobName is empty,'
            || ' calculating value for $l_JobName'
            || ' using Data Services function...');
        $l_JobName = job_name();
        print('INFO: new value assigned to a local'
            || ' variable: $l_JobName = {$l_JobName}!');
    end
end
print('INFO: scr_init script has successfully completed!');

Try to run the job now and confirm that the row inserted into the target HELLO_WORLD table has a proper job name in the second column.


How it works…
We introduced a couple of new elements of scripting language syntax. The # sign defines a comment section in Data Services scripts.

Note that we also referenced variable values in the text string using curly brackets: {$l_JobName}. If you skip them, the Data Services compiler will not recognize variables marked with the $ sign and will use the variable name and dollar sign as part of the string.

Tip
You can also use square brackets [] instead of curly brackets to reference variable/parameter values within a text string. The difference between them is that if you use curly brackets, the compiler puts the variable value into the string as a quoted string 'value', whereas square brackets substitute the value as it is.

The scripting language in Data Services is easy to learn, as it does not have much variety in terms of conditional constructs. It has a simple syntax, and all its power comes from functions.

In this particular example, you can see one while loop and one conditional construct. The while loop is the only type of loop supported in the Data Services scripting language, and if is the only conditional construct supported as well. This is really all you need in most cases.

The while (<condition>) loop expression should include a block of code starting with begin and ending with end. The condition check happens at the beginning of each iteration (even the very first one), so keep this in mind, as even your very first loop iteration can be skipped. In our example, the loop runs while the $l_JobName local variable is empty.

The syntax of the if conditional element is the same — each conditional block should be wrapped in begin/end. It supports else if, and you can include multiple conditions separated by AND or OR. We use the conditional to check whether the global variable from which we will be sourcing the value for the local variable is empty or not. If it is not empty, we assign it to the local variable, and if it is empty, we generate a job name using the job_name() function, which returns the name of the job it is executed in.
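
Putting those rules together, the bare skeleton of the two constructs looks like this (a minimal sketch with placeholder conditions and comments instead of real statements):

while (<condition>)
begin
    # statements repeated while the condition stays true
end

if (<condition>)
begin
    # statements executed when the condition is true
end
else
begin
    # statements executed otherwise
end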

The print() function is the main logging function in the Data Services scripting language. It allows you to print out messages to the trace log file. Look at the following screenshot. It shows an excerpt from the trace log file displayed in one of the tabs in the main design window after you execute the job.

Note
When you execute the job, Data Services generates three log files: the trace log, the monitor log, and the error log. We will explain these logs in detail in the upcoming recipes and chapters. For now, use the trace log button to see the result of your job execution.

Messages generated by the print() function are marked in the trace log as PRINTFN (see the following screenshot). You can also add your own formatting in the print() function to make the messages more distinguishable from the rest of the log messages (see the INFO word added in the example here):


Using string functions
Here, we will explore a few useful string functions by updating our HelloWorld code to include some extra functionality. There is only one data type in Data Services used to store character strings, and that is varchar. It keeps things pretty simple for string-related and conversion operations.

How to do it…
Here, you will see two examples: applying string function transformations within a dataflow and using string functions in the script object.

Follow these steps to use string functions in Data Services, using the example of the replace_substr() function, which substitutes part of a string with another substring:

1. Open the DF_HelloWorld dataflow in the workspace window and add a new Query transform named Who_says_What. Put it after the Query transform and before the target template table.
2. Open the Who_says_What Query transform and add a new WHO_SAYS_WHAT output column of the varchar(100) type.
3. Add the following code into the Mapping tab of the new column:

replace_substr($p_JobName, '_', ' ') || ' says ' || word(Query.TEXT, 1)

4. Your new Query transform should look like the one in the following screenshot. Note that you should use single quotes to define string text in a mapping or script:
5. The final version of the dataflow should look like this:

Save your work and execute the job. Go to Management Studio to see the contents of the dbo.HELLO_WORLD table. The table now has a new column containing the Job HelloWorld says Hello string.

Using string functions in the script
We are not quite happy with the Who_says_What string. Obviously, only HelloWorld should be put in double quotes (they do not affect the behavior of string text in Data Services). Also, we will use the init_cap() function to make sure that only the first letter of our job name is capitalized.

Change the mapping of WHO_SAYS_WHAT to the following code:

'Job "' || init_cap(ltrim(lower($p_JobName), 'job_')) || '"' || ' says ' || word(Query.TEXT, 1)

According to this logic, we are expecting the job name to start with the Job_ prefix. In this case, we have to add extra logic to the script running before the dataflow to make sure that we have this prefix in our job name. The following code will add it if the job name is not valid according to our naming standards. Add the following code before the last print() function call:

# Check that job is named according to the naming standards
if (match_regex($l_JobName, '^(job_).*$', 'CASE_INSENSITIVE') = 1)
begin
    print('INFO: the job name is correct!');
end
else
begin
    print('WARNING: job has not been named according'
        || ' to the standards.'
        || ' Changing the name of {$l_JobName}...');
    $l_JobName = 'Job_' || $l_JobName;
    print('INFO: new job name is ' || $l_JobName);
end

As the final step, save the job and execute it. The string in your third column should now be Job "Helloworld" says Hello. Even if you rename your job and remove the Job_ prefix, your script will detect this and add the prefix to your job name.


How it works…
As you can see in the preceding example, we used common string manipulation functions similar to those in other programming languages.

In the first part of the recipe, we transformed the mapping of the WHO_SAYS_WHAT column to strip out the Job_ prefix from the parameter value. This allows us to correctly wrap the rest of the job name in double quotes for better presentation.

The init_cap() function capitalizes the first character of the input string.

The lower() function transforms the input string to lowercase.

The ltrim() function trims the specified characters from the left-hand side of the input string. Usually, it is used to quickly remove leading blank characters in strings. The rtrim() function does the same thing but for trailing characters.

The word() function is extremely useful for parsing the input string to extract "words", or parts of a string separated by space characters. There is an extended version, the word_ext() function, which accepts a specified separator as the third parameter. As the second parameter in both of these versions, you specify the number of the word to be extracted from the string.
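
As a quick illustration of the difference between the two, here is a small sketch you can paste into any test script; the values named in the comments are the expected results:

# word() splits on blank characters: returns 'Hello'
print(word('Hello World!', 1));
# word_ext() accepts its own separator: returns 'Australia'
print(word_ext('AU,Australia,Oceania', 2, ','));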

You probably have already guessed that || is used as the string concatenation operator.

The second part of the changes implemented in this recipe, in the script object, contained the very interesting and powerful match_regex() function. It is one of the few functions that represent regular expression support within Data Services. If you are not familiar with the regular expression concept, you can find many sources on the Internet explaining it in detail. Regular expressions are supported in almost all major programming languages and allow you to specify matching patterns in a very short form. This makes them very effective for parsing a string and finding a matching substring or pattern in it.

In the Data Services match_regex() function, if you specify a regular expression pattern string as the second input parameter, it will return 1 if it finds a match of the pattern in the input string. It will return 0 if it does not find a match. It is a very effective way to validate the format of a text string or to look for specific characters or patterns in the string.

Here, we checked whether our job has the prefix Job_ in its name. If not, we add it to the beginning of the job name before passing the value to the dataflow.

There's more…
Feel free to explore the existing string functions available in Data Services. There are extended versions of some of the functions we already used in the preceding recipe that you can take a look at. For example, the ltrim_blanks() function allows you to quickly remove leading blank characters without specifying extra parameters, and it has an extended version, ltrim_blanks_ext(). The substr() function returns part of a string, and the replace_substr() function is used to substitute part of the string with another string.

We will definitely use some of them in our future recipes throughout the book.


Using date functions
Correctly dealing with dates and time is critically important in data warehouses. In the end, you should understand that this is one of the most important attributes in the majority of fact tables in your DWH, as it defines the "position" of your data records. Lots of reports filter data by date-time fields before performing data aggregation. This is probably why Data Services has a decent amount of date functions, allowing a variety of operations on date-time variables and table columns.

Data Services supports the following date data types: date, datetime, time, and timestamp. They define which time units are stored in the field:

date: This stores the calendar date
datetime: This stores the calendar date and the time of the day
time: This stores only the time of the day, without the calendar date
timestamp: This stores the calendar date and the time of the day, down to subseconds

How to do it…
Generating current date and time
Here is a script that can be included in your current script object in the HelloWorld job to display the generated date values in the job trace log.

To test this script, create a new job called Job_Date_Functions and a new script within it called SCR_Date_Functions. Also, create four local variables in the job: $l_date of the date data type, $l_datetime of the datetime data type, $l_time of the time data type, and $l_timestamp of the timestamp data type.

Print out the date function examples to the trace log:

$l_date = sysdate();
print('$l_date = [$l_date]');
$l_datetime = sysdate();
print('$l_datetime = [$l_datetime]');
$l_time = systime();
print('$l_time = [$l_time]');
$l_timestamp = systime();
print('$l_timestamp = [$l_timestamp]');
$l_timestamp = sysdate();
print('$l_timestamp = [$l_timestamp]');

The trace log file displays the following information:

$l_date = 2015.05.05
$l_datetime = 2015.05.05 18:47:27
$l_time = 18:47:27
$l_timestamp = 1900.01.01 18:47:27.030000000
$l_timestamp = 2015.05.05 18:15:21.472000000

As you can see, different data types are able to store different amounts of data. Also, you can see that the systime() function does not generate date-related data (days, months, and years), and the 1900.01.01 that you see in the first timestamp variable output is a dummy default date value. The second timestamp output shows that we used the sysdate() function to get this information.

Extracting parts from dates
Here are some useful operations you can perform to extract parts from date type values. Note that all of them return integer values. You can append these commands to the script object already created in order to test how they work:

$l_datetime = sysdate();
print('$l_datetime = [$l_datetime]');
# Extract Year from date field
print('Year = ' || date_part($l_datetime, 'YY'));
# Extract Day from date field
print('Day = ' || date_part($l_datetime, 'DD'));
# Extract Month from date field
print('Month = ' || date_part($l_datetime, 'MM'));
# Display day in month for the input date
print('Day in Month = ' || day_in_month($l_datetime));
# Display day in week for the input date
print('Day in Week = ' || day_in_week($l_datetime));
# Display day in year for the input date
print('Day in Year = ' || day_in_year($l_datetime));
# Display number of week in year
print('Week in Year = ' || week_in_year($l_datetime));
# Display number of week in month
print('Week in Month = ' || week_in_month($l_datetime));
# Display last day of the current month for the provided input date
print('Last date of the month = ' || last_date($l_datetime));

The output in the trace log should be similar to this:

$l_datetime = 2015.05.05 15:55:09
Year = 2015
Day = 5
Month = 5
Day in Month = 5
Day in Week = 2
Day in Year = 125
Week in Year = 18
Week in Month = 1
Last date of the month = 2015.05.31 15:55:09


How it works…
Some functions use an extra formatting parameter; for example, date_part() does. You can also use 'HH', 'MI', and 'SS' to extract hours, minutes, and seconds, respectively.

There are also shorter versions of the date_part() function that allow you to extract the year, month, or quarter without specifying any extra formatting parameters. For this, you can use the year(), month(), and quarter() functions.

An interesting function is the isweekend() function. It returns 1 if the specified date value falls on a weekend, and 0 if it does not.
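
A few lines you could append to the SCR_Date_Functions script to try these shorthand functions (a small sketch; the printed values will, of course, depend on the current date):

print('Year       = ' || year(sysdate()));
print('Month      = ' || month(sysdate()));
print('Quarter    = ' || quarter(sysdate()));
print('Is weekend = ' || isweekend(sysdate()));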


There's more…
You can access the full list of functions available in Data Services from different places in Designer. One option is to open a script object. There is a Functions… button at the top of the main design window. Click it to open the Select Function window. All functions are categorized and have a short description explaining how they work and what they require as input parameters. Look at this screenshot:

The same button is also available on the Mapping tab of the Query transform inside a dataflow, so you can access it if you are trying to create a transformation rule for one of the columns.

This list is also available in Smart Editor, but we will discuss that in detail in one of the next recipes. Of course, you can always reference the Data Services documentation to see all the functions available in Data Services and some examples of their usage.


Using conversion functions
Conversion functions allow you to change the data type of a variable, or of a column in the Query transform, from one type to another. This is very handy, for example, when you receive date values as string characters and want to convert them to internal date data types to apply date functions or perform arithmetic operations on them.

How to do it…
One of the most used functions to convert from one data type to another is the cast() function. Look at the examples here. As usual, create a new job with an empty script object and type this code in it. Create a $l_varchar job local variable of the varchar(10) data type:

$l_varchar = '20150507';
# Casting varchar to integer
print(cast($l_varchar, 'integer'));
# Casting varchar to decimal
print(cast($l_varchar, 'decimal(10,0)'));
# Casting integer value to varchar
print(cast(987654321, 'varchar(10)'));
# Casting varchar to double
print(cast($l_varchar, 'double'));

The output is shown here:

Remember that the print() function automatically converts the input to varchar in order to display it in the trace file. Note how casting to the double data type changed the appearance of the number.

Casting is helpful in order to make sure that you are sending values of the correct data type to a column of a specific data type or to a function that expects data of the particular data type it requires to work correctly. Automatic conversions performed by Data Services when a value of one data type is assigned to a variable or column of a different data type could produce unexpected results and lead to errors.

However, the most useful conversion functions are the ones used to convert a string to a date and vice versa. Add the following lines to your script and run the job:

$l_varchar = '20150507';
# Casting varchar to a date
print(to_date($l_varchar, 'YYYYMMDD'));
# Converting (changing the format of) the input date
# from 'YYYYMMDD' to 'DD.MM.YYYY'
print(
    to_char(to_date($l_varchar, 'YYYYMMDD'), 'DD.MM.YYYY')
);


When converting a text string to a date, you have to specify the format of the string so that the Data Services compiler can interpret and convert the values correctly. The full table of possible formats available in these two functions can be found in the Data Services Reference Guide, available for download at http://help.sap.com. Refer to it for more details. Here are some more examples of to_char() conversions of a date variable:

$l_date = sysdate();
print(to_char($l_date, 'DD MON YYYY'));
print(to_char($l_date, 'MONTH-DD-YYYY'));

The trace log should be similar to the following one:

07 MAY 2015
MAY-07-2015

Let's get familiar with another interesting data type: interval. It helps you perform arithmetic operations on dates. The script here performs arithmetic operations on a date stored in the $l_date variable by first adding 5 days to it, then calculating the first date of the next month, and finally subtracting 1 second from the date-time value stored in the $l_datetime variable.

See the example here:

$l_date = to_date('01/05/2015', 'DD/MM/YYYY');
print('Date = ' || $l_date);
# Add 5 days to the $l_date value
print('{$l_date} + 5 days = ' || $l_date + num_to_interval(5, 'D'));
# Calculate first day of next month
print('First day of next month = ' || last_date($l_date) + num_to_interval(1, 'D'));
# Subtract 1 second out of the datetime
$l_datetime = to_date('01/05/2015 00:00:00', 'DD/MM/YYYY HH24:MI:SS');
print('{$l_datetime} minus 1 second = ' || $l_datetime - num_to_interval(1, 'S'));

How it works…
You probably have not noticed, but you have already seen the results of implicit data type conversion made automatically by Data Services in the previous recipes. For example, the date extraction functions returned integer values that were converted automatically to varchar so that they could be concatenated with the string part and displayed using the print() function, which, by the way, can accept only varchar as an input parameter.

Data Services does data type conversions automatically whenever you assign a value of one data type to a variable or column of a different data type. The only potential pitfall here is that if you rely on automatic conversion, you are leaving some guessing work to Data Services and can get unexpected results in the end. So, understanding how and when conversion happens automatically, so that you can implement manual conversions instead, could be critical. Many bugs in ETL code are related to incorrect data type conversion, so you should be extra careful.


There's more…
Try to experiment with automatic conversion, for example, by adding integer numbers to date variables: sysdate() + 10. See how Data Services behaves and which default parameters it uses for formatting the automatically converted value.
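
A minimal sketch of such an experiment, assuming the $l_date (date) and $l_varchar (varchar) local variables from the earlier scripts in this recipe:

# integer added to a date variable
$l_date = sysdate() + 10;
print('sysdate() + 10 = [$l_date]');
# date value assigned to a varchar variable
$l_varchar = sysdate();
print('sysdate() stored in a varchar = [$l_varchar]');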


Using database functions
There is no great variety of functions in this area, as Data Services encourages you to communicate with database objects and control the flow of data from within a dataflow.

How to do it…
You will learn a little more about these functions here.

key_generation()
First, let's look at the key_generation() function. This is a function that can be called only from a dataflow (when used in a column mapping), so we are not interested in it at this point, as we cannot use it in Data Services scripts.

This function is actually similar to the Key_Generation transform object, which can be used as part of a dataflow as well; it looks up the highest key value in a table column and generates the next one. This is often used to populate the key column of a new record with unique values before inserting the record into a target table. We will take a closer look at the Key_Generation transform in the upcoming chapters.
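
As a preview, a column mapping using it could look roughly like the following sketch, which would produce the next surrogate key based on the highest existing ACCOUNTKEY value. The three-part table name and the increment of 1 are assumptions for illustration; check the function description in the Select Function window for the exact signature before relying on it:

key_generation('DWH.DBO.DIMACCOUNT', 'ACCOUNTKEY', 1)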

total_rows()
This function is used to calculate the total number of rows in a database table. Running it is the easiest and quickest way to check in a script whether a table is empty or not before running the dataflow that populates it. Then, according to the result, you can make further decisions, that is, truncate the table directly from a script before running the next dataflow. Alternatively, you can use conditionals to skip the next portion of ETL code entirely.

Here is an example of how this function is used. As usual, you can create a new job with a script object inside it. Type the following code and run the job:

print(
    total_rows('DWH.DBO.DIMACCOUNT')
);

Do not forget to import the table into your DWH datastore, as you can reference only tables that have been imported into your Data Services repository. Look at this screenshot:
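
Combining total_rows() with the conditional syntax from the Creating a script recipe and the sql() function described next gives you the "truncate only when needed" pattern mentioned above. A sketch, assuming the HELLO_WORLD table from the earlier recipes has been imported into the DS_STAGE datastore:

if (total_rows('DS_STAGE.DBO.HELLO_WORLD') > 0)
begin
    print('INFO: table is not empty - truncating it...');
    sql('DS_STAGE', 'TRUNCATE TABLE DBO.HELLO_WORLD');
end
else
begin
    print('INFO: table is already empty - nothing to do.');
end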

sql()
The sql() function is a universal function that allows you to perform SQL calls to any database for which you have created a datastore object. You can run DDL and DML statements and SELECT queries, and even call stored procedures and database functions.

Note
You should use the sql() function very carefully in your scripts, and we do not recommend that you use it at all in column mappings inside a dataflow. This function should only be used to return one record with as few fields as possible. So, always test the statement you place inside the sql() function directly in the database first to make sure it behaves as expected.

For example, to calculate the total number of rows in the DimAccount table with the sql() function, you can use the following code:

print('Total number of rows in DBO.DIMACCOUNT table is: ' ||
    sql('DWH', 'SELECT COUNT(*) FROM DBO.DIMACCOUNT')
);


How it works…
The sql() function is very convenient for executing stored procedures, truncating and creating database objects, and doing lookups for aggregated values when the query returns only one row or even one value. If you try to return a dataset of multiple rows, you will get only the value of the first field from the first row. It is still possible to query multiple fields, but it will require that you modify the query itself and add extra code to parse the returned string (see the example here):

# returning multiple fields from a database table
$l_row = sql('DWH', 'SELECT CONVERT(VARCHAR(10), ACCOUNTKEY)' ||
    ' + \',\' + CONVERT(VARCHAR(50), ACCOUNTDESCRIPTION)' ||
    ' FROM DBO.DIMACCOUNT');
$l_AccountKey = word_ext($l_row, 1, ',');
$l_AccountDescription = word_ext($l_row, 2, ',');
print('AccountKey = {$l_AccountKey}');
print('AccountDescription = {$l_AccountDescription}');

As you can see, this is a lot of code for such a simple procedure. If you want to extract and parse multiple rows in a Data Services script, you will have to create a row-counting mechanism and loop through the rows by doing multiple query executions within the loop. You can try to do this yourself as an exercise to practice a little bit of the Data Services scripting language.

Note
Note that you do not have to import the table you want to reference in the sql() function into a datastore.
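
If you attempt that exercise, one possible shape for the loop is sketched below. It assumes integer local variables $l_total and $l_i and a varchar $l_row defined in the job, reuses the DimAccount table from above, and relies on the SQL Server ROW_NUMBER() function to fetch a single row per iteration; treat it as one workable approach rather than the only one:

$l_total = total_rows('DWH.DBO.DIMACCOUNT');
$l_i = 1;
while ($l_i <= $l_total)
begin
    # fetch the key of row number $l_i only
    $l_row = sql('DWH', 'SELECT CONVERT(VARCHAR(10), ACCOUNTKEY) FROM '
        || '(SELECT ACCOUNTKEY, ROW_NUMBER() OVER (ORDER BY ACCOUNTKEY) AS RN'
        || ' FROM DBO.DIMACCOUNT) T WHERE RN = ' || $l_i);
    print('Row {$l_i}: ACCOUNTKEY = {$l_row}');
    $l_i = $l_i + 1;
end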


Using aggregate functions
Aggregate functions are used in dataflow Query transforms to perform aggregation on a grouped dataset.

You should be familiar with these functions, as they are the same ones used in the SQL language: avg(), min(), max(), count(), count_distinct(), and sum().

How to do it…
To demonstrate the use of aggregate functions, we will perform a simple analysis of one of our tables. Import the DimGeography table into the DWH datastore and create a new job with a single dataflow inside it using these steps:

1. Your dataflow should include the DimGeography source table and a DimGeography target template table in the STAGE database to send the output to:
2. Open the Query transform and create the following output structure:

The COUNTRYREGIONCODE column contains country code values and will be the column on which we perform the grouping of the dataset. It is mapped from the input dataset to the output. Also, drag and drop it from the input dataset to the GROUP BY tab of the Query transform to specify it as a grouping column. The other columns are created as New Output Column… (choose this option from the context menu of the COUNTRYREGIONCODE column) and contain the following mappings (see the table here):

Output column name        Mapping expression
COUNT_DISTINCT_PROVINCE   count_distinct(DIMGEOGRAPHY.STATEPROVINCENAME)
COUNT_PROVINCE            count(DIMGEOGRAPHY.STATEPROVINCENAME)
MIN_KEY                   min(DIMGEOGRAPHY.GEOGRAPHYKEY)
MAX_KEY                   max(DIMGEOGRAPHY.GEOGRAPHYKEY)

3. Save the changes and run the job. Now, go to Management Studio and query the contents of the newly created DimGeography table in the STAGE database. You should get the results shown in this screenshot:


How it works…
What we have just built in the Query transform of the dataflow can be done with the following SQL statement:

select
    CountryRegionCode,
    COUNT(DISTINCT StateProvinceName),
    COUNT(StateProvinceName),
    MIN(GeographyKey),
    MAX(GeographyKey)
from
    dbo.DimGeography
group by
    CountryRegionCode;

The count_distinct() function calculates the number of distinct provinces within each country, count() calculates the total number of rows for each country, and min() and max() show the lowest and highest GeographyKey values within each country group, respectively.

Note
You cannot use these functions directly in the scripting language, only in the Query transform. If you need to extract aggregated values from database tables within a Data Services script, you can use sql() containing a SELECT statement with the aggregate database functions.
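
For instance, the highest key we computed above could be fetched in a script with a one-line lookup like this (a sketch that reuses the DWH datastore and the DimGeography table from this recipe):

print('Max GeographyKey = ' ||
    sql('DWH', 'SELECT MAX(GEOGRAPHYKEY) FROM DBO.DIMGEOGRAPHY'));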


UsingmathfunctionsDataServiceshasastandardsetoffunctionsavailabletoperformmathematicaloperations.Inthisrecipe,wewillusethemostpopularofthemtoshowyouwhatoperationscanbeperformedonnumericdatatypes.


How to do it…
1. Create a new job and name it Job_Math_Functions.
2. Inside this job, create a single dataflow called DF_Math_Functions.
3. Import the FactResellerSales table in your DWH datastore and add it to the dataflow as a source object.
4. Add the first Query transform after the source table and link them together. Then, open it and drag two columns to the output schema: PRODUCTKEY and SALESAMOUNT. Specify the FACTRESELLERSALES.PRODUCTKEY = 354 filtering condition in the WHERE tab:
5. Add the second Query transform and rename it Group. Here, we will perform a grouping operation on the product key we selected in the previous transform. To do this, add the PRODUCTKEY column in the GROUP BY tab and apply the sum() aggregate function on SALESAMOUNT in the Mapping tab:
6. Finally, add the last Query transform, called Math, and link it to the previous one. Inside it, drag all columns from the source to the target schema and add the new ones using New Output Column…. Specify mapping expressions, as in the following screenshot:


7. As the last step, add a new template table located in the STAGE database and owned by the dbo user. This template table is called FACTRESELLERSALES. Your dataflow should look like this now:
8. Save and run the job. Then, to check the result dataset, either query the new table from SQL Server Management Studio, or open your dataflow in Data Services and click on the magnifying glass icon of your FACTRESELLERSALES (DS_STAGE.DBO) target table object to browse the data directly from Data Services.


How it works…
The result you see here illustrates the effect of the math functions applied to your SALESAMOUNT column value:

The ceil() function returns the smallest integer value (automatically converted to the input column data type, which is why you see trailing zeroes) equal to or greater than the specified input number.

The floor() function returns the largest integer value equal to or less than the input number.

The rand_ext() function returns a random real number between 0 and 1. In Data Services, you do not have much control over the behavior of the functions that generate random numbers, so you have to apply extra mathematical operations to define the range and type of the generated random numbers. In the example earlier, we generated random integer numbers from 0 to 10 inclusive.

The trunc() and round() functions perform rounding operations similar to ceil() and floor(), but trunc() simply truncates the number to the number of decimal places specified in the second parameter and returns the result as is, whereas round() rounds the number according to the precision specified.
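If you prefer to experiment with these functions outside a dataflow, they can also be called from a script object. The following is only a small sketch; the local variables $l_amount and $l_result of the decimal(10,4) data type are assumptions made for this illustration:

$l_amount = 2024.5678;
$l_result = ceil($l_amount);
print('ceil of [$l_amount] is [$l_result]');
$l_result = floor($l_amount);
print('floor of [$l_amount] is [$l_result]');
$l_result = trunc($l_amount, 2);
print('trunc to 2 decimal places is [$l_result]');
$l_result = round($l_amount, 2);
print('round to 2 decimal places is [$l_result]');
# a random integer from 0 to 10 inclusive, as in the dataflow example
$l_result = floor(rand_ext() * 11);
print('a random integer is [$l_result]');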


There's more…
As an exercise, try the other Data Services mathematical functions. Modify the created dataflow to include examples of their usage. To see the full list of mathematical functions available, use the Functions… button in the script object or column mapping field and choose the Math Functions category in the Select Function window:


Using miscellaneous functions
The miscellaneous group includes the functions that cannot easily be categorized. Among them are functions that allow you to extract useful information from the Data Services repository (for example, the name of the job, workflow, or dataflow an object is executed from), functions that perform advanced string searches, functions similar to other standard SQL functions, and many others. Throughout the book, we will very often use Data Services miscellaneous functions. So, in this recipe, we will take a look at some of those that are usually used in scripts and that help you query the Data Services repository.


How to do it…
At this point, you should be pretty comfortable creating new jobs, script objects, and dataflow objects, so I will not explain these steps in detail every time we need to create a new test job. If you forget how to do it, refer to the previous recipes in the book.

1. Create a new job and add a script object in it.
2. Open the script and populate it with the following code. This code shows you an example of how to use three miscellaneous functions: ifthenelse(), decode(), and nvl():
# Conditional functions

$l_string = 'Length of that string is 38 characters';
$l_result = ifthenelse(length($l_string) = 8, print('TRUE'), print('FALSE'));

$l_string = 'Length of that string is 38 characters';
$l_result = decode(
    length($l_string) = 10, print('TRUE'),
    length($l_string) = 12, print('TRUE'),
    length($l_string) = 38, print('TRUE'),
    print('FALSE')
);

$l_string = NULL;
$l_string = nvl($l_string, 'Empty string');
print($l_string);

3. For this script to work, you should also make sure that you have local variables created at the job level: $l_string and $l_result, both of the varchar(255) data type.


How it works…
Most of the miscellaneous functions require advanced knowledge of Data Services. In this book, you will see a lot of examples of how they can be used in complex dataflows and Data Services scripts.

In this recipe, we can see three conditional functions: ifthenelse(), decode(), and nvl(). They allow you to evaluate the result of an expression and execute other expressions, depending on the result of the initial evaluation.

After executing the earlier script, you can see the following trace log records:
817212468 PRINTFN 18/05/2015 8:12:34 p.m. FALSE
817212468 PRINTFN 18/05/2015 8:12:34 p.m. TRUE
817212468 PRINTFN 18/05/2015 8:12:34 p.m. Empty string

The first parameter of ifthenelse() is a comparison expression, which returns either TRUE or FALSE. If TRUE, the second parameter of ifthenelse() is executed (if it is an expression) or just returned as the result of the function. The third parameter is executed (or returned) if the comparison expression returns FALSE.

The decode() function does the same thing as the ifthenelse() function, except that it allows you to evaluate multiple expressions. Its parameters go in pairs, as you can see in the example. The first parameter in a pair is a comparison expression, and the second parameter is what is returned by the function if the comparison expression is TRUE. If it returns FALSE, then decode() moves to the next pair, and then the next one, until it reaches the last pair. If none of the expressions returned TRUE, then the last parameter of decode() is returned as a default value.

Note
Bear in mind that the decode() function returns the result of the first condition that evaluates to TRUE and does not evaluate the rest of the conditions. So, be careful with the order of conditional expressions in the decode() function.

Finally, the last function in the example is the common SQL function nvl(). It returns the value specified in the second parameter if the first parameter is NULL. This function is very useful in dataflows. Usually, it is used as a mapping expression in the Query transform to prevent NULL values from coming through for a specific column: all NULL values will be converted to the value you define in the nvl() function.
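As an illustration only (the column name and the replacement value below are made-up examples, not columns from this recipe), such a mapping expression in a Query transform output column could look like this:

# replace NULL middle names with a readable placeholder
nvl(PERSON.MIDDLENAME, 'N/A')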


Creating custom functions
In this recipe, we will get familiar with the Smart Editor tool available in Designer to help you write your scripts or functions in a convenient way.

We will create a new function that can be executed either within a script or within a dataflow. This function accepts two parameters: a date value and a number of days. It adds the number of days to the input date and returns the resulting date.


How to do it…
1. Open Designer and go to Tools | Custom Functions… from the top-level menu:
2. In the opened window, right-click in the area with the list of functions and choose New….
3. Choose the name of the new fn_add_days function and populate the description section, as shown in this screenshot:

4. Then, click on Next to open a Smart Editor window and input the following code:
try
begin
    $l_Date = to_date($p_InputDate, 'DD/MM/YYYY');
    $l_Days = num_to_interval($p_InputDays, 'D');
end
catch(all)
begin
    print('fn_add_days() FAILED: check input parameters');
    raise_exception('fn_add_days() FAILED: check input parameters: ' ||
        'Date format DD/MM/YYYY and number of days should be an integer value');
end
$l_Result = $l_Date + $l_Days;
Return $l_Result;

5. For it to work, you have to create a set of required input/output parameters and local variables for this custom function. Your function in the Smart Editor should look like the one shown in this screenshot:
6. Create the following input parameters: $p_InputDate of the varchar data type and $p_InputDays of the integer data type. Use the left panel, Variables, inside the Custom Function window.
7. The local variables will be used only within the function and will not be accessible from outside of it. Create $l_Date of the date data type, $l_Days of the interval data type, and $l_Result of the date data type.
8. Now, it is time to click on OK to create our first custom function and use it in a job. For this, you can create a simple job with one script object inside it using the following code:
print(to_char(fn_add_days('10/10/2015', 12), 'DD-MM-YYYY'));


How it works…
We made the input parameters of the varchar and integer data types for convenience when calling the function. The function itself performs the conversion to the correct date and interval data types before returning the result of the date sum operation.

Even if we had not used the num_to_interval() function to convert the integer value to an interval, Data Services would still perform the correct sum operation, because it automatically converts numeric data types into day intervals when they are used in an arithmetic operation with dates. That is why print(sysdate() + 1) returns tomorrow's date.

In the code mentioned earlier, you can also see the error-handling mechanism that can be used in Data Services scripts: the try-catch block. If anything executed between try and catch fails, it will not fail the parent object's execution. This is very useful if you do not want to fail your job because of a non-critical piece of code failing somewhere inside it. In case of a failed execution, control is passed to the second begin-end block of the try-catch. Here, you can write extra log messages to the trace log file and still fail the job execution with the raise_exception() function if you want to. We will discuss this in more detail in Chapter 5, Workflow – Controlling Execution Order, and Chapter 9, Advanced Design Techniques.
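As a minimal standalone sketch of the same pattern inside a script object (the datastore name, the query, and the $l_count local variable of the varchar data type are assumptions for illustration only), a non-critical statement can be wrapped like this:

try
begin
    # a statement that might fail, for example against a table that may not exist yet
    $l_count = sql('DS_STAGE', 'SELECT COUNT(*) FROM DBO.PERSON');
end
catch(all)
begin
    # log the problem but let the job continue;
    # call raise_exception() here instead if the failure should stop the job
    print('Optional pre-check failed, continuing the job');
end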


There's more…
The scripting language in Data Services is a very important tool, extensively used in both simple and complex jobs. In this chapter, we established a good base for building Data Services scripting language skills. You will find a lot more examples throughout this book.


Chapter 4. Dataflow – Extract, Transform, and Load
In this chapter, we will take a look at examples of the most important processing unit in Data Services, the dataflow object, and the most useful types of transformations you can use inside it. We will cover:

Creating a source data object
Creating a target data object
Loading data into a flat file
Loading data from a flat file
Loading data from table to table – lookups and joins
Using the Map_Operation transform
Using the Table_Comparison transform
Exploring the Autocorrect load option
Splitting the flow of data with the Case transform
Monitoring and analyzing dataflow execution


Introduction
In this chapter, we move to the most important component of ETL design in Data Services: the dataflow object. The dataflow object is the container that holds all the transformations that can be performed on data.

The structure of the dataflow object is simple: one or many source objects (which we extract the data from) are placed on the left-hand side, the source objects are linked to a series of transform objects (which perform manipulations on the extracted data), and finally, the transform objects are linked to one or many target table objects (telling Data Services where the transformed data should be inserted). During the transformation of the dataset inside the dataflow, you can split the dataset into multiple flows or, conversely, merge multiple separately transformed flows together.

Manipulations performed on data inside dataflows are done on a row-by-row basis. The rows extracted from the source go from left to right through all the objects placed inside the dataflow.

We will review all major aspects of dataflow design in Data Services, from creating source and target objects to the usage of the complex transformations available as part of the Data Services functionality.


Creating a source data object
In a couple of previous recipes, you already became familiar with data sources, importing tables, and using imported tables inside dataflows as source and target objects. In this recipe, we will create the rest of the datastore objects linking all our existing databases to the Data Services repository and will spend more time explaining this process.


How to do it…
In the Understanding the Designer tool recipe in Chapter 2, Configuring the Data Services Environment, we already created our first datastore object, STAGE, for the "Hello World" example.

So, why do you need a datastore object and what is it exactly? Datastore objects are containers representing connections to specific databases and storing imported database structures that can be used in your Data Services ETL code. In reality, datastore objects do not store the database objects themselves but rather the metadata for the objects belonging to the application system or database that the datastore object connects to. These objects most commonly include tables, views, database functions, and stored procedures.

If you have not followed the steps in the "Hello World" example presented in Chapter 2, Configuring the Data Services Environment, you can find here the steps to create all the datastore objects that will be used in this book, explained in more detail. With these steps, we will create datastore objects referencing all the databases we created in SQL Server in the first two chapters:

1. Open the Datastores tab in the Local Object Library.
2. Right-click on any empty space in the window and choose New from the context menu:


3. First, specify Datastore Type for the datastore object. The datastore type defines the connectivity type and datastore configuration options that will be used by Data Services to communicate with the referenced source/target system objects lying behind this datastore connection. In this book, we will mainly be working with datastores of the Database type. Notice that as soon as the datastore type Database is selected, a second Database Type option appears with a list of available databases:
4. The Create New Datastore window, with all options expanded after you choose Datastore Type and Database Type, looks like this screenshot:


5. Leave all advanced options at their default values and configure only the mandatory options in the top window panel: the database connectivity details and user credentials, which will be used by Data Services to access the database and read/insert the data.
6. Using the previous steps, create another datastore named ODS.
7. Altogether, you should have the following list of datastore objects created for all our local test databases. If you do not have all of them, please create the missing ones using the same steps just mentioned:

DS_ODS: This is the datastore linking to the ODS database
DS_STAGE: This is the datastore linking to the STAGE database
DWH: This is the datastore linking to the AdventureWorks_DWH database
OLTP: This is the datastore linking to the AdventureWorks_OLTP database

8. To create a reference to a database table in the OLTP datastore, expand the OLTP datastore in the Local Object Library tab and double-click on the Tables list.
9. The Database Explorer window opens in the Designer workspace section, showing you all the table and view objects in the OLTP database.
10. Find the HumanResources.Employee table in the External Metadata list, right-click on it, and choose the Import option from the context menu:
11. You can see how the table status has changed in Database Explorer to Yes under Imported and No under Changed.
12. Also, you can see the table reference appear in the OLTP datastore table list. As it is not used anywhere in ETL code yet, the Usage column in the Local Object Library shows 0 for that table.
13. Now, close the Database Explorer window and double-click on the imported table name in the Local Object Library window. The Table Metadata window opens, showing your table attributes and even allowing you to view the contents of the table:

Note
This Table Metadata window is extremely useful for performing source system analysis, when you have to learn the source data and understand it before starting to develop your ETL code and apply transformation rules to it.

14. The View Data tab has three subtabs within it: the Data, Profile, and Column profile tabs. Choose the Column profile tab and select the GENDER column in the drop-down list.
15. Click on the Update button to see the column profile data:

The column profiling data shows that there are 206 male employees (71.03%) against 84 (28.97%) female ones.


How it works…
The most important thing you should understand about datastore objects is that when you import a database object into a datastore, all you do is create a reference to the database object. You are not creating a physical copy of the table in your Data Services datastore when you import a table. Hence, when you use View data in the Table Metadata window for that table, Data Services executes a SELECT query in the background to extract this data for you.

Looking at the external metadata browsing screen again, you can see that there are two other options available in the table context menu: Open and Reconcile:

The Open option allows you to open an external table metadata window, which displays table definition information, partitions, indices, table attributes, and other useful information.

The Reconcile option simply updates the two columns, Imported and Changed, in the External Metadata list. It is useful when you want to check whether the table object has already been imported into a datastore and whether it has changed in the database since the last time it was imported.

Note
It is the ETL developer's responsibility to reimport table objects in the datastore if their definition or structure has been changed at the database level. Data Services does not perform this operation automatically. The most common problem with table object synchronization is when a column populated by ETL gets removed from the table in the database. To reflect this change, the developer has to reimport the table object in the datastore to update the table object structure in Data Services and then update the ETL code to make sure that the non-existing column is not referenced as a target column anymore.

Views as source objects behave exactly like tables. They can be imported into the datastore in the same Tables section along with other table objects. The only difference is that you cannot specify an imported view as a target object in your dataflow.

You may also wonder, if the datastore object represents the connection to a specific database, why you do not see all the database objects straight away after creating it. The answer is simple: you import only those database objects you will be using in your ETL code. If the database has a few hundred tables, it would be extremely time- and resource-intensive for Data Services to automatically synchronize all datastore object references with the actual database objects each time you open the Designer application. It is also easier for the developer to see only the tables used in ETL development. Plus, with the datastore configurations feature, you can use the same datastore object to connect to different physical databases, which might have different versions of the tables with the same names, so the synchronization of objects imported in the datastore is solely your responsibility and has to be done manually. We will discuss configurations in future chapters.

The profiling functionality of Data Services that we used in this recipe allows you to look into the data without needing to go to SQL Server Management Studio and manually query the tables. It is easy and convenient to use during ETL development.


There's more…
It is quite difficult to cover all the information about all datastore settings in one chapter, as Data Services is able to connect to so many different databases and application systems. As the datastore options are database specific, the number of options and their behavior vary depending on which database or system you are trying to connect to.


Creating a target data object
A target data object is the object to which we send data within a dataflow. There are a few different types of target data objects, but the two main ones are tables and flat files. In this recipe, we will take a look at a target table object.

Note
Views imported into a datastore cannot be target objects within a dataflow. They can only be a source of data.


Getting ready
To prepare for this recipe, we need to create a table in our STAGE database. To do that, please connect to SQL Server Management Studio and create the Person table in the STAGE database using the following command:
CREATE TABLE dbo.Person
(
  FirstName varchar(50),
  LastName varchar(50),
  Age integer
);

This table will be used as a target table, which we will load data into using Data Services. We will use the data stored in the Person table from the OLTP database as the source data to be loaded.


How to do it…
1. Open the Data Services Designer application.
2. In the DS_STAGE datastore, right-click on Tables and choose the option Import By Name… (another quick method to import a table definition into Data Services without opening Database Explorer). Of course, in order to do that, you should know the exact table name and the schema it was created in.
3. In the opened window, enter the required details, as in the following screenshot:
4. Click on the Import button to finish.
5. Also, in the OLTP datastore, import a new table, Person, from the Person schema inside the AdventureWorks_OLTP database. We will use this table as a source of data.

Note
When using SQL Server as the underlying database, the owner is synonymous with the database schema. When importing a table by name in the datastore or creating template tables, the Owner field defines the schema the table will be imported from or created in within the database. So, keep in mind that you have to use an existing schema created previously.

6. Create a new job with a new dataflow object, open the dataflow, and drag the Person table from the OLTP datastore into this dataflow as a source. Then, drag the Person table from the DS_STAGE datastore as a target.
7. Create a new Query transform between them and link it to both source and target tables:
8. As soon as you open the Query transform, you will see that both input and output structures were created for you. All column names and data types were imported from the source and target objects you linked the Query transform to, and all you have to do is map the source columns to the target columns you want to pass their values to.
9. Map the source FIRSTNAME column to the target FIRSTNAME column and perform the same mapping for LASTNAME. As there is no AGE column in the source, put NULL as the mapping expression for the AGE target column in the Query transform. This can be done by dragging and dropping from the input to the output schema or by typing the mapping manually:
10. Each target object within a dataflow has a set of options available in the Target Table Editor window. To open it, double-click on a target table object in the dataflow workspace:
11. For now, let's just select the Delete data from table before loading checkbox. This option makes sure that each time the dataflow runs, all target table records are deleted by Data Services before populating the target table with data from the source object.
12. Validate the dataflow by clicking on the Validate Current button when the dataflow is opened in the main workspace to make sure that you have not made any design errors.
13. Now execute the job and click on the View Data button in the bottom-right corner of the target table icon within the dataflow to see the data loaded into the target table.


How it works…
You can see that the target table object has a lot of options. Data Services can perform different types of loading of the same dataset, and all those types are configured in the target table object tabs. Some of them are used if the inserted dataset is voluminous, while some of them allow you to insert data without duplicating it. We will discuss all of this in detail in later chapters.

When Data Services selects data from source tables, all it does is execute a SELECT statement in the background. But when Data Services inserts the data, there are risks such as incompatible data types/values, duplicate data (which violates referential integrity in the target table), slow performance, and so on. Do not forget that you insert data after transforming it, so it is your responsibility to understand the target database object requirements and the specifics of the data you are inserting.

That is why the loading mechanism in Data Services has many more settings to configure and is much more flexible than the mechanism for getting source data inside a dataflow.


There's more…
As you might remember from the "Hello World" example in Chapter 2, Configuring the Data Services Environment, there is a great and simple way to create target table objects in a dataflow without having to create a physical table in the database first and import it into the datastore. We have used this type of target table before: I am talking about template tables, the objects we used in the previous recipes when we wanted Data Services to create a physical target table for us from the mappings we defined in a Query transform inside our ETL code in a dataflow.

Note
Note that the template target table has an extra target table option: Drop and re-create table. By default, it is ticked, and the table gets physically dropped and recreated each time the dataflow runs. Data Services generates the table definition from the output schema of the last transform object in the dataflow linked to the target table object.

As you can see in the following figure, you can specify multiple target tables. They get populated with the same dataset coming from the source table, and as they are populated from the same output schema of the Query transform, they have the same table definition format:

To create a template table, you use the right-hand side tool menu in the Designer and the template table icon shown in the following screenshot:


Click on the template table button in the tool menu and then on the empty space in the dataflow workspace to place it as a target table object.

Specify the template table name, the Data Services datastore where it should be created, and the database owner (schema) name where the table gets created physically when the dataflow is executed:


Loading data into a flat file
This recipe will teach you how to export information from a table into a flat file using Data Services.

Flat files are a popular choice when you need to store data externally for backup purposes, or in order to transfer and feed it into another system, or even send it to another company.

The simplest file format usually describes a list of columns in a specific order and a delimiter used to separate field values. In most cases, that is all you need. As you will see a bit later, Data Services has many extra configuration options available in the File Format object, allowing you to load the contents of flat files into a database or export the data from a database table to a delimited text file.


How to do it…
1. Create a new dataflow and use the EMPLOYEE table from the OLTP datastore imported earlier as a source object.
2. Link the source table with a Query transform and drag-and-drop all source columns to the output schema for mapping configuration.
3. In the Query transform, right-click on the parent Query item, which includes all output mapping columns, and choose the Create File Format… option at the bottom of the opened context menu:
4. The main File Format Editor window opens:


5. Refer to the following table for more details about the File Format options and their corresponding values:

Type
Description: Specifies the type of the file format: Delimited, Fixed Width, Unstructured Text, and so on.
Value: In this recipe, we are creating a plain text file with row fields separated by a comma. Choose the Delimited option.

Name
Description: The name of the File Format. Note that this is not the name of the file that will be created but the general name of the File Format.
Value: Type F_EMPLOYEE.

Location
Description: This is the physical location of the file referenced using this file format. In our case, the Job Server and Local locations are the same, as Data Services is installed on the same machine where we executed our Designer application.
Value: Choose Job Server.

Root directory
Description: The directory path to the file. Make sure that this directory exists.
Value: Type C:\AW\Files.

File name(s)
Description: The name of the file that we read data from or write into.
Value: Type HR_Employee.csv.

Delimiters | Column
Description: You can either choose from the existing options: Tab, Semicolon, Comma, Space, or just type in your own custom delimiter as one character or a sequence of characters.
Value: Choose Comma.

Delimiters | Text
Description: You can specify whether you want character values to be wrapped in quotes/double quotes or not.
Value: Choose " (the double quote character).

Skip row header
Description: When you read from the file, use this option to skip the row header so it is not confused with the first data record.
Value: We do not have to change this option, as it would have no effect; we are going to write to a flat file, not read from it.

Write row header
Description: The same option as the previous one, but for cases when you write into a file. If set to Yes, the row header will be created as the first line in the file. If No, the first line in the file will be a data record.
Value: Choose Yes to create a row header when writing to a file.

6. Click on the Save & Close button to close the File Format Editor and save the new File Format.
7. Now you can open the Local Object Library | Formats tab and see your newly created file format F_EMPLOYEE.
8. Open the dataflow workspace, drag-and-drop this file format from the Local Object Library tab into the dataflow, and choose the Make Target… option.
9. Link your Query transform to the target file object and validate your dataflow to make sure that there are no errors.
10. Run the job. You will see that the file HR_Employee.csv appears in C:\AW\Files and gets populated with 292 records (1 header record + 291 data records).


How it works…
File format configuration provides you with a flexible solution for reading data from and loading data into flat files. You can even set up automatic date recognition and configure an error-handling mechanism to reject rows that do not fit the defined file format structure.

Note that editing the file format from the Local Object Library and editing it directly from the dataflow where it was placed (to read or write flat files) is not the same. If you edit it inside the dataflow, you will notice that some fields in the File Format Editor are grayed out. Opening the same file format for editing from the Local Object Library makes those fields available for editing. This happens because, when imported into a dataflow, the File Format object becomes an instance of the parent File Format object stored in the Local Object Library, and changes applied to an instance inside a dataflow are not propagated to other instances of this File Format object imported into other dataflows. Conversely, when you modify the File Format definition in the Local Object Library, the changes made are propagated to all instances of this File Format object imported into different dataflows across the ETL code.

Note
Some file format configuration parameters can be changed only on the parent file format object in the Local Object Library.

You should also keep in mind that export to a flat file in Data Services is quite a forgiving process. For example, if your file format has a varchar(2) character field and you try to export a 50-character value into this field, Data Services will allow you to do that. In fact, Data Services does not care much about the columns specified in the file format at all if you use your file format to export data to a flat file. The data definition will be sourced from the output schema of the preceding transformation object linked to the target file object.

Importing from a flat file, on the other hand, is a very strict process. Data Services will reject a record immediately if it does not fit the file format definition.


There's more…
There are more ways to create a File Format object than shown in this recipe. Some are listed here:

Creating in the Local Object Library: Open the Formats tab in the Local Object Library window, right-click on Flat Files, and choose New from the context menu. You can use the Location, Root directory, and File Name(s) options to automatically import the format from an external file. Otherwise, you will have to define all columns and their data types manually, one by one.
Replicating a file format from an existing File Format object in the Local Object Library: On the Formats tab, choose the object you want to replicate, right-click on it, and choose the Replicate… option in the context menu.


Loading data from a flat file
You can use the same File Format object created in the previous recipe to load data from a flat file. In the following section, we will take a closer look at the file format options relevant to loading data from files.


How to do it…
1. Create a new job and a new dataflow object in it.
2. Create a new text file, Friends_30052015.txt, with the following lines inside it:

NAME|DOB|HEIGHT|HOBBY

JANE|12.05.1985|176|HIKING

JOHN|07-08-1982|182|FOOTBALL

STEVE|01.09.1976|152|SLEEPING|10

DAVE|27.12.1983|AB5

3. Go to the Local Object Library and create a new file format by right-clicking on Flat Files and choosing New.
4. Populate the File Format options as shown in the following screenshot:

Delimiters | Column is set to | in this case, as our file has the pipe as a delimiter.
NULL Indicator is set to NULL, which means that only NULL values in the incoming file are interpreted as NULL when read by Data Services. The other "empty" values will be interpreted as empty strings.
Date format is set to dd.mm.yyyy, as we specified in the file format that we are loading the DOB (Date of Birth) column as the date data type. Imagine that you have configured the Query transform mapping for that column using the to_date(<date>, 'dd.mm.yyyy') function.
Skip row header is set to Yes in order to specify that the file has a header row which has to be skipped.
Then we set all options related to error capturing to Yes to catch all possible errors. Write errors to a file allows you to record the rejected records in a separate file for further analysis. We will be writing them to the Friends_rejected.txt file.

5. Import this file format object as a data source in your newly created dataflow.
6. Map all source columns in a Query transform and create the target template table FRIENDS in the DS_STAGE datastore.
7. Save and run the job.

As a result, you can see that two records were rejected: one because of an extra column in the row and the other because the row had one column less than the defined file format.

The contents of your target table should look like this:

You can see that some lines are missing here due to the errors in the input file.

What's interesting is that Data Services was smart enough to correctly recognize and convert the date of birth for JOHN. Remember, it was 07-08-1982 in the file, and the date format we specified was dd.mm.yyyy.


How it works…
As you can see, most of the file format options we have used are useful for validating the contents of the source data file in order to reject records with data of an incorrect data type or format.

The main question you have to ask yourself is whether you want all these records to be rejected. The alternative might be to build a dataflow that loads all fields as the varchar data type and tries to cleanse and convert incorrect values to an acceptable format, or puts a default value in place of a wrong one to mark the field. Sometimes you do not want to lose the whole record if just one value is incorrect; a small mapping sketch of this approach is shown below.
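For instance, if the DOB column were declared as varchar in the file format, a hedged sketch of such a cleansing mapping in a Query transform could look like the following; is_valid_date() is a standard Data Services validation function, and FRIENDS_FILE stands for whatever name you gave the file format in this recipe:

# keep correctly formatted dates, turn the rest into NULL instead of rejecting the whole row
ifthenelse(is_valid_date(FRIENDS_FILE.DOB, 'DD.MM.YYYY') = 1,
           to_date(FRIENDS_FILE.DOB, 'DD.MM.YYYY'),
           NULL)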

Now, let's fix the "number of columns" problem in the source file to see how Data Services deals with conversion problems. Do you remember we put a character symbol in one of the integer data type fields?

Change the records for Steve and Dave to the following lines and rerun the job:
STEVE|1976.01.01|152|SLEEPING

DAVE|27.12.1983|AB5|DREAMING

Both records are rejected, with the following error messages appearing in the error log file when you execute the job:

You can see those messages in the job error log and in the Friends_rejected.txt file, along with the rejected records themselves. The name and location of the reject file are defined by two file format options: Error file root directory and Error file name. They become available when you open the file format object instance for editing from within a dataflow:


As we just stated, in order to load those records, you should put in some extra development effort and create logic in your dataflow to deal with all possible scenarios, cleansing and correctly converting the data. Of course, you should also amend the file format, changing all data types to varchar in order to pass those records through for further cleansing.

Note
You can use masks in the File name(s) option when configuring the file object in your dataflow. For example, specifying invoice_*.csv as a file name will allow you to load both invoice_number_1.csv and invoice_number_2.csv files in a single execution of the dataflow. They will be loaded one after another.


There's more…
Try to experiment further with the contents of the Friends_30052015.txt file by adding extra rows with different data types to see whether they will be rejected or loaded, and which error messages you will get from Data Services.


Loading data from table to table – lookups and joins
When you specify a relational source table in the dataflow, Data Services executes simple SQL SELECT statements in the background to fetch the data. If you want to, you can see the list of statements executed for each source table. In this recipe, we explore what happens under the hood when you add multiple source tables, and how Data Services optimizes the extraction of the data from these source tables and even joins them together, executing complex SQL queries instead of multiple SELECT * FROM <table> statements.


How to do it…
In this recipe, we will extract a person's name, address, and phone number from the source OLTP database and populate a new stage table, PERSON_DETAILS, with this dataset.

1. Create a new job and a new dataflow. Specify your own names for the created objects.
2. To extract the required data, you will need to import the tables PERSON, ADDRESS, and BUSINESSENTITYADDRESS (which is a table linking the first two) into your source OLTP datastore. All these tables are located in the Person schema of the AdventureWorks_OLTP database.
3. Place the imported tables as source objects in your dataflow, as shown in the following figure, and link them to the Query transform. Insert the target template table PERSON_DETAILS, to be created in the DS_STAGE datastore:
4. To set the required join conditions, you should use the Join pairs section located on the FROM tab of the Query transform. In this example, these join conditions should be generated automatically as soon as you open the Query transform. If they were not, you can click on the icon with two intersecting green circles (with the hint Click to propose join) to generate them, or click on the Join Condition field and type the required join conditions manually for each table pair. Please use the following screenshot as a reference to create two Inner join pairs (PERSON-BUSINESSENTITYADDRESS and BUSINESSENTITYADDRESS-ADDRESS):


5. At this point, you can see which SQL statement Data Services uses to extract the required information by choosing Validation | Display Optimized SQL from the main menu. It opens the following window, showing you the datastores queried in the pane on the left and the full SELECT statement executed in each of them on the right:
6. We forgot to add country information for each person. It looks like the Address table has only street and city information but no country or state data. Import another two tables into the OLTP datastore: STATEPROVINCE and COUNTRYREGION.
7. Add them as source tables in the dataflow, but do not join them to the already existing ones in the same Query transform. Create another Query transform and call it Get_Country. Use it to join the Query dataset with the two new source tables, as shown in the following figure:


8. Add two new column mappings in the Get_Country Query transform: BUSINESSENTITYID, mapped from the field with the same name in the Query input schema, and the COUNTRY column, mapped from the NAME column of the COUNTRYREGION table input schema.
9. If you check Validation | Display Optimized SQL again, you will see that the SQL statement has changed, now including the two new tables:
10. We are still missing phone information for our PERSON_DETAILS table. Add a third Query transform on the right and call it Lookup_Phone. To look up the phone information, we will use the lookup_ext() function executed from a function call within a Query transform. The lookup_ext() function is most commonly used in column mappings to perform lookup operations for values from other tables.
11. Open the Lookup_Phone Query transform and map all source columns to the target ones except for the BUSINESSENTITYID and ADDRESSLINE2 columns (we are not going to propagate those).
12. Right-click on the last mapped column in the target schema (it should be COUNTRY) and select the option New Function Call… from the context menu:


13. Choose Insert Below…, and in the opened Select Function window, choose Lookup Functions | lookup_ext.
14. The opened Lookup_ext | Select Parameters window allows you to set lookup parameters for the table you want to extract information from. Remember that this is basically a form of join, so you have to specify the join conditions between the input dataset and the lookup table. In our case, the lookup table is PERSONPHONE. If you did not import it into your OLTP datastore earlier, please do that now. Use the lookup parameter details shown in the following screenshot:


15. After you click on Finish, your target schema in the Lookup_Phone transform should look like this:
16. It so happens that the PHONENUMBER field we have extracted from the lookup table is a key column in that table. Data Services automatically defines key columns from source tables as primary keys in the Query transform as well. To change this and make sure that our final dataset does not include duplicates, we are going to create a last Query transform and name it Distinct. Link it on the right to the Lookup_Phone transform, open it, and choose the following options:


To stop the PHONENUMBER column from being a primary key, double-click on the column in the target schema and uncheck the Primary key option. To get rid of duplicate rows, open the SELECT tab and check Distinct rows.

17. Save and run the job and view the data using the dataflow target table option:

As the final step, import the template table PERSON_DETAILS so it is converted into a normal table object inside the DS_STAGE datastore. To do that, right-click on the table either in the Local Object Library or inside the dataflow workspace, as shown in the following screenshot, and choose the Import Table option from the object's context menu:


How it works…
You have seen an example of how multiple tables can be joined in Data Services. The Query transform represents the traditional SQL SELECT statement, with the ability to group the incoming dataset, use various join types (INNER or LEFT OUTER), use the DISTINCT operator, sort data (on the ORDER BY tab), and apply filtering conditions in the WHERE tab.

The Data Services optimizer tries to build as few SQL statements as possible in order to extract the source data, joining tables in a complex SELECT statement. In a future chapter, we will see which factors prevent the propagation of dataflow logic down to the database level.

We have also used a function call in the mappings in order to join a table and extract additional data. It would be perfectly valid to import the PERSONPHONE table as a source table and join it with the rest of the tables with the help of the Query transform, but using the lookup_ext() function gives you a great advantage: it always returns only one record from the lookup table for each record we look up values for, whereas joining with a Query transform does not prevent you from getting duplicated or multiple records, in the same way as if you had joined two tables in a standard SQL query. Of course, if you want your Query transform to behave exactly like a SELECT statement joining tables in the database, producing multiple output records for each lookup record, the lookup_ext() function should not be used.

If you have written a complex SQL SELECT statement, you are probably aware that joining multiple tables can lead to duplicate records in the result dataset. This does not necessarily mean that the joins are incorrectly specified. Sometimes it is the required behavior, or it can be a database design problem, or simply the presence of "dirty" data in one of the source tables.

The lookup_ext() function makes sure that, if it finds multiple records in the lookup table for your source record, it picks only one value, according to the method specified in the Return policy field of the Lookup_ext parameters window:


Note
The main disadvantages of using the lookup_ext() function are the lower transparency of the ETL code (as the lookup logic is hidden inside the Query transform) and the fact that lookup_ext() calls prevent the propagation of execution logic down to the database level. Data Services always extracts the full table specified as the lookup table in the lookup_ext() function parameters.

Depending on which version of the product is used and on the database environment configuration, Data Services can automatically generate the join conditions when you join tables in the Query transform and specify join pairs. This is because, when you import the source tables into a datastore, Data Services imports not just table definitions but also information about primary keys, indexes, and other table metadata. So, if Data Services sees that you are joining two tables with identically named fields that are marked as primary or foreign keys at the database level, it automatically assumes that those tables can be joined using those key fields.

Keep in mind that if business rules or ETL logic dictate join conditions different from what Data Services automatically proposes, you have to modify those values in the Query transform or even write your own join conditions by entering them manually.


Using the Map_Operation transform
Here, we explore a very interesting transformation available to you in Data Services. In fact, it does not perform any transformation of data per se. What it does is change the type of the SQL Data Manipulation Language (DML) operation that should be applied to the row when it reaches the target object. As you probably know already, the DML operations in the SQL language are the operations that modify data, in other words, the INSERT, UPDATE, and DELETE statements.

First, we will see the effect Map_Operation has when used in a dataflow, and then we will explain in detail how it works. In a few words, the Map_Operation transform allows you to choose what Data Services will do with the migrated row when passing it from Map_Operation to the next transform. Map_Operation assigns one of four statuses to each record passing through: normal, insert, update, or delete. By default, the majority of transforms in Data Services produce records with a normal status. This means that the record will be inserted when it reaches the target table object in a dataflow. With Map_Operation, you can control this behavior.


How to do it…
In this exercise, we are going to slightly change the contents of our PERSON_DETAILS table. We will change the country values for records belonging to Samantha Smith from United States to USA and remove the records for the same person with United Kingdom as the country. That means we will specify the same table both as a source and as a target:

1. Create a new job and a new dataflow object, and place the PERSON_DETAILS table from the DS_STAGE datastore as a source table.
2. Link the source table to a new Query transform named Get_Samantha_Smith. Map all columns from source to target and specify the filtering conditions, as shown in the following screenshot. Also, double-click on each of the three columns, FIRSTNAME, LASTNAME, and ADDRESSLINE1, to define them as primary key columns:
3. Split the dataflow in two by creating two new Query transforms: US and UK. Link them to two Map_Operation transforms, imported from Local Object Library | Transforms | Platform | Map_Operation and named update and delete respectively. Then merge the data flows back together with the Merge transform, which can be found in the same Platform category, and finally link it to the same table, PERSON_DETAILS, specified as a target table object. The Merge transform does not perform any transformations and does not have any configuration options, as it simply merges two datasets together (like the UNION ALL operation in SQL). Of course, the input schema formats should be identical for the Merge transform to work. See what the dataflow should look like in the following figure:


4. In the US transform, map all the key columns and the COUNTRY column to the target, and change the mapping for COUNTRY to a hardcoded value, 'USA'. Most importantly, specify Get_Samantha_Smith.COUNTRY = 'United States' in the WHERE tab to select only the United States records:
5. In the UK transform, map only the key columns and the COUNTRY column to the target as well, and put Get_Samantha_Smith.COUNTRY = 'United Kingdom' in the WHERE tab:
6. Now we have to tell Data Services that we want to update one set of records and delete the other. Double-click on your update Map_Operation transform and set up the following options:


By doing this, we change the row type of normal rows (the Query transform produces rows of the normal type) to update. This means that Data Services will execute an UPDATE statement for those rows on the target table.

7. Repeat the same for the delete Map_Operation transform, but now change normal to delete and discard the rest of the row types:
8. For Data Services to correctly perform the update and delete operations, we have to define the correct target table key columns. Double-click on the target table object PERSON_DETAILS in the dataflow and change Use input keys to Yes in the Options tab. That tells Data Services to consider primary key information from the source dataset rather than using the target table primary keys:


9. Before executing the job, let's check what our data looks like in the PERSON_DETAILS table for Samantha Smith. Click on the View data button on the target table and apply filters by clicking on the Filters button. Specify filters on the FIRSTNAME and LASTNAME columns and check the records:
10. Set the filters:


11. This is what the data in the table looks like before job execution:
12. Run the job and view the data using the same filters to see the result:


How it works…
This is the kind of task that would be much easier to accomplish with the following two SQL statements:
update dbo.person_details set country = 'USA'
where firstname = 'Samantha' and lastname = 'Smith' and country = 'United States';

delete from dbo.person_details
where firstname = 'Samantha' and lastname = 'Smith' and country = 'United Kingdom';

But for us, this example perfectly illustrates what can be done with the Map_Operation transform in Data Services.

Each row passed from the source to a target table in a dataflow through various transformation objects can be assigned one of four types: normal, insert, update, or delete.

Some transformations can change the type of the row, while others just behave differently, depending on which type the incoming row has. For the target table object, the type of the row defines which DML instruction it has to execute on the target table using the source row data. This is listed as follows:

insert: If the row comes with the normal or insert type, Data Services executes an INSERT statement in order to insert the source row into the target table. It will check the key columns defined on the target table in order to check for duplicates and prevent them from being inserted.
update: If a row is marked as an update, Data Services determines the key columns it will use to find the corresponding record in the target table and updates all non-key column values of the target table record with the values from the source record.
delete: Data Services determines the key columns to link source rows marked with the delete type to the corresponding target row(s) and then deletes the rows found in the target table.
normal: This is treated as an insert when the row comes to the final target table object. It is the default type of row produced by the Query transform and the majority of other transforms in Data Services.

What the Map_Operation transform allows you to do is change the type of the incoming row. This allows you to implement sophisticated logic in your dataflows, making your data transformations extremely flexible.

Note
Defining primary keys in Data Services objects, such as Query transforms or the table and view objects imported in datastores, does not create the same primary key constraints for the corresponding tables at the database level. If you have them defined at the database level, they will be imported along with the table definition and will appear in Data Services automatically. Otherwise, you define primary key columns manually to help Data Services process the data efficiently and correctly. Many Data Services transforms and target objects rely on this information to correctly process the passing records.


Setting Output row type to Discard in Map_Operation for a specific input row type will completely block the rows of the chosen type, not letting them pass through the Map_Operation transform. This is a great way to make sure that your dataflow does not perform any unexpected inserts when it should, for example, only ever update the target table.

Note how our target table in this recipe does not have primary key constraints specified at the database level. It so happens that we analyzed the data in the PERSON_DETAILS table and know that the FIRSTNAME, LASTNAME, and ADDRESSLINE1 columns define the uniqueness of a record. That is why we manually specify them as primary keys in the Data Services transforms and use the Update control option Use input keys on the target table object, so it knows where to get the key column information to correctly execute the INSERT, UPDATE, and DELETE statements. In the case of UPDATE, all non-key columns will be updated with the values from the source row; that is why we propagated only the COUNTRY column, as we wanted to update only this field. In the case of DELETE, the set of non-key columns does not matter much, as only the source key columns will be considered in order to find the target row to delete.

The other option would be to modify the table object PERSON_DETAILS in the datastore and specify primary keys there (see the following screenshot). In that case, we would not have to define keys in the transforms or use the target table loading option, as Data Services would pick up this information from the target table object. To do that, expand the datastore object and double-click on the table to open the table editor, then double-click on the column and check Primary key in the newly opened window:


Using the Table_Comparison transform
The Table_Comparison transform compares a dataset generated inside a dataflow to a target table dataset and changes the statuses of the dataset rows to different types according to the conditions specified in the Table_Comparison transform.

Data Services uses primary key values for the row comparison and marks the passing row accordingly as: an insert row, which does not exist in the target table yet; an update row, a row for which primary key values exist in the target table but whose non-primary key fields (or comparison fields) have different values; and finally, a delete row (when the target dataset has rows with primary key values that do not exist in the source dataset generated inside the dataflow). In a way, Table_Comparison does exactly the same thing as Map_Operation: it changes the row type of passing rows from normal to insert, update, or delete. The difference is that it does it in a smart way, after comparing the dataset to the target table.


Getting ready
In order to prepare the source data in the OLTP system for this recipe, please execute the following UPDATE in the AdventureWorks_OLTP database. It updates only one row in the table.
update Production.ProductDescription set Description = 'Enhanced Chromoly steel.'
where Description = 'Chromoly steel.';

We performed this modification of the source data so we can use the change to demonstrate the capabilities of the Table_Comparison transform.


How to do it…
Our goal in this recipe is simple. You remember that our DWH database sources data from the OLTP database. One of the tables in the target DWH database we are interested in right now is the DimProduct table, which is a dimension table that holds information about all company products. In this recipe, we are going to build a job which, when executed, will check the product descriptions within the source OLTP tables and, if necessary, apply any changes to the product description in our data warehouse table DimProduct.

This is a small example of propagating data changes happening in the source systems to the data warehouse tables.

As an example, imagine that we need to change the name of one of the materials used to produce one of our products. Instead of the English description "Chromoly steel", we have to use "Enhanced Chromoly steel" now. People working with the OLTP database via application systems have already made the required change, and now it is our responsibility to develop the ETL code that propagates this change from the source to the target data warehouse tables.

1. Create a new job with one dataflow, sourcing data from the following OLTP tables (Production schema):
Product: This is a table containing products with some information (price, color, and so on)
ProductDescription: This is a table containing product descriptions
ProductModelProductDescriptionCulture: This is a linking table, which holds the key references of both the Product and ProductDescription tables
2. If you do not have these tables imported already into your datastore, please do that in order to be able to reference them within your dataflow object.
3. Add the DimProduct table from DWH as a source table. Yes, do not be surprised: we are going to use the same table as a source and as a target within the same dataflow. The Table_Comparison transform will compare two datasets: the source dataset, which is based on the DimProduct table modified with the help of the source OLTP tables, and the target dataset of the DimProduct table itself.
4. Create a new Join Query transform and modify its properties to join all four tables, as shown in the following screenshot:


You can see that we use the Product and ProductModelProductDescriptionCulture tables just to link the ProductDescription table to our target DimProduct table in order to get a dataset of DimProduct primary key values and the corresponding English description values for specific products.

5. Next to your Join Query transform, place the Table_Comparison transform, which can be found in Local Object Library | Transforms | Data Integrator | Table_Comparison.
6. Open the Table_Comparison editor in the workspace and specify the following parameters:


7. Then, place a Map_Operation transform called MO_Update and discard all rows of the normal, insert, and delete types, letting through only rows with the update status:
8. Finally, link MO_Update to the target DimProduct table and check whether your dataflow looks like the following figure:


Now, save the job and execute it. Then, run the following command in SQL Server Management Studio to check the resulting data in the DimProduct table:
select EnglishDescription from dbo.DimProduct where EnglishDescription like '%Chromoly steel%';

You should get the following resulting value:
Enhanced Chromoly steel.


How it works…
To see what exactly is happening with the dataset before and after the Table_Comparison transform, replicate your dataflow and change the copy in the following manner:

Here, we dump the result of the Join Query transform into a temporary table to see which dataset we compare to the DimProduct table inside the Table_Comparison transform.

Extra Map_Operation transforms allow us to capture rows of different types coming out of Table_Comparison. Using Map_Operation, we convert all of them to the normal type in order to insert them into temporary tables and see which rows were assigned which row types by the Table_Comparison transform:


Note
Adding multiple target template tables after your transformations is a very popular method of debugging in ETL development. It allows you to see exactly how your dataset looks after each transformation.

Let's see what is going on in our ETL by analyzing the data inserted into the temporary target tables.

The PRODUCT_TEST_COMPARE table contains the rows starting from ProductKey = 210. This is simply because ProductKeys < 210 in the DimProduct table do not have English descriptions in the source system.

The PRODUCT_DESC_INSERT table is empty. Table_Comparison uses the primary key specified in the Input primary key columns section to identify new rows in the input dataset that do not exist in the specified comparison table, DWH.DBO.DIMPRODUCT. As we used the DimProduct table as the source of the PRODUCTKEY values, there could not be any new values, of course, so no rows were assigned the insert type.

PRODUCT_DESC_UPDATE contains exactly one row, with a new ENGLISHDESCRIPTION value:

As you can see, Data Services has sourced the rest of the row fields from the comparison table, all of them except for the column specified in the Compare columns section of the Table_Comparison transform.

The PRODUCT_DESC_DELETE table, on the other hand, has a lot of records. Those are the target records (from the comparison table DimProduct) for which primary key values do not exist in the dataset coming to the Table_Comparison transform from the Join Query transform. As you may remember, those are records that do not have English description records in the source tables. This is an optional feature of Table_Comparison: Data Services will use the primary key values of those records to execute the DELETE statement on the target table. You can easily prevent delete rows from being generated by unchecking the Detect deleted row(s) from comparison table option in the Table_Comparison transform.

Note
The Filter section of Table_Comparison allows you to apply additional filters on the comparison table in order to restrict the number of rows you are comparing. This is very useful if your comparison table is large. It allows optimizing the resources consumed by Data Services to extract and store the comparison dataset and also speeds up the comparison process itself.


Exploring the Autocorrect load option

The Autocorrect load option is a convenient means Data Services provides for preventing the insertion of duplicates into your target table. It is a method of loading data into a target table object inside the dataflow and can easily be configured by setting the corresponding target table option to Yes, with no further configuration required. This recipe describes the details of using this load method.


Getting ready

For this recipe, we will create a new table in the STAGE database and populate it with a list of currencies from the DimCurrency dimension table in the AdventureWorks_DWH data warehouse.

Execute the following statements in SQL Server Management Studio:

SELECT CurrencyAlternateKey, CurrencyName
INTO STAGE.dbo.NewCurrency
FROM AdventureWorks_DWH.dbo.DimCurrency;

ALTER TABLE STAGE.dbo.NewCurrency
ADD PRIMARY KEY (CurrencyAlternateKey);

We will use the Autocorrect load option to make sure that our dataflow does not insert rows already existing in the target table.


How to do it…

First, we are going to design the dataflow that will populate the target table NewCurrency.

In the dataflow, we will use the Row_Generation transform to generate three new rows, one for each currency, and try to insert them into the previously created currency stage table NewCurrency. The NewCurrency table already has some data prepopulated from the DimCurrency table; that is required if we want to test the Autocorrect load option.

The first generated row will be for the EUR currency (the CURRENCYALTERNATEKEY column), which already exists in the target table but with a different currency name: CURRENCYNAME = 'NEW EURO'.

The second generated row will be a new currency which does not exist in the table yet: 'CRO' with CURRENCYNAME = 'CROWN'.

The third generated row will be 'NZD' with CURRENCYNAME = 'New Zealand Dollar', matching both values in the CURRENCYALTERNATEKEY and CURRENCYNAME fields of the existing record in the NewCurrency table.

1. Create a new job and a new dataflow, picking your own names for the created objects.

2. Open the dataflow in the workspace window to edit it and add three new Row_Generation transforms, which we will use as a source of data with default parameters. By default, this transform object generates one row with a single ID column populated with integer values starting with 0. Name the three newly added Row_Generation transforms Generate_EURO, Generate_NZD, and Generate_CROWN:

3. Link each Row_Generation transform to a respective Query transform to create an output schema matching the target table schema with two columns: CURRENCYALTERNATEKEY and CURRENCYNAME. See the example for EURO shown in the following screenshot:


The other two are CRO (CROWN) and NZD (New Zealand Dollar).

4. Finally, merge these three rows into one dataset with the help of the Merge transform (Local Object Library | Transforms | Platform | Merge).

5. Map the Merge transform output to Query transform columns with the same names and link the Query to the target table NewCurrency previously imported into the DS_STAGE datastore.

6. Check the target data in the NewCurrency table before running this code. Apply filters in the View Data window of the target table, as shown in the following screenshot, to see the existing rows we are interested in:


You can see that we have two records in the target table, for EUR and NZD.

7. Save and run the job. You should get the following error message:

Recall how we applied the primary key constraint on the NewCurrency table. The Data Services job fails in an attempt to insert rows with primary key values that already exist in the target table.

8. Now, to enable the Autocorrect load option, open the target table editor in the workspace. On the Options tab, change Autocorrect load to Yes:

9. Now save the job and run it again. It runs without errors, and if you browse the data in the target table using the same filters as before, you will see that the new CRO currency appears in the list and the EUR currency has a new currency name:


How it works…

Preventing duplicate data from being inserted is often one of the responsibilities of the ETL solution. In this example, we created a constraint on our target table, delegating that control to the database level, but this is not a common practice in modern data warehouses.

If not for that constraint, we would have successfully inserted duplicate rows on the first attempt and our job would not have failed. The beauty of the Autocorrect load option is its simplicity: all it takes is setting a single option on a target object. When this option is enabled, Data Services checks each row before inserting it into the target table.

If the target table has a row identical to the incoming dataflow row, the row is simply discarded. If the target table has a row with the same primary key values but different values in one or more columns, Data Services executes an UPDATE statement, updating all non-primary key columns. And finally, if the target table does not have a row with the same primary key values, Data Services executes an INSERT statement, inserting the row into the target table.
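To make these three cases concrete, here is a rough T-SQL sketch of the decision made for a single incoming row of our NewCurrency example. It is an illustration only, not the actual code the engine generates; the @key and @name variables simply stand for the values of one dataflow row:

declare @key nvarchar(3) = 'EUR', @name nvarchar(50) = 'NEW EURO';  -- values of one incoming row
if exists (select 1 from STAGE.dbo.NewCurrency
           where CurrencyAlternateKey = @key and CurrencyName = @name)
    print 'identical row: discarded';                     -- case 1: nothing to do
else if exists (select 1 from STAGE.dbo.NewCurrency
                where CurrencyAlternateKey = @key)
    update STAGE.dbo.NewCurrency                          -- case 2: same key, changed values
    set CurrencyName = @name                              -- all non-primary key columns are updated
    where CurrencyAlternateKey = @key;
else
    insert into STAGE.dbo.NewCurrency (CurrencyAlternateKey, CurrencyName)
    values (@key, @name);                                 -- case 3: new key, insert the row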

You can build a dataflow with the same logic, preventing duplicates from being inserted, by using the Table_Comparison transform. Autocorrect load performs the comparison between the dataflow dataset and the target table dataset just as well as Table_Comparison does; both methods produce INSERT/UPDATE row types. The only difference is that Autocorrect load cannot perform the deletion of target table records. Thus, the main purpose of the Autocorrect load option is to provide you with a simple and efficient method of protecting your target data from incoming duplicate records.

We also used the Merge transform in this recipe. The Merge transform does the same thing as the SQL UNION ALL operator and has the same requirements: the datasets should have the same format in order to be successfully merged:

Merge is often used in combination with Table_Comparison. First, you split your rows, assigning them different row types with Table_Comparison. Then, you deal with the different types of rows, applying different transformations depending on whether the row is going to be inserted or updated in the target table. Finally, you join both split datasets back into one with the help of the Merge transform, as you cannot link multiple transforms to a single target object.


Splitting the flow of data with the Case transform

The Case transform allows you to put branch logic in a single location inside a dataflow in order to split the dataset and send parts of it to different destinations. These might be target dataflow objects, such as tables and files, or just other transforms. The use of the Case transform simplifies ETL development and increases the readability of your code.


Getting ready

In this recipe, we will build a dataflow that reads the contents of the dimension table DimEmployee and updates it according to the following business requirements:

All male employees in the Production department get 5 extra vacation hours
All female employees in the Production department get 10 extra sick hours
All employees in the Quality Assurance department get their base rate multiplied by 1.5

So, before you begin developing your ETL, make sure you import the DimEmployee table into the DWH datastore. We are going to use it as both the source and target object in our dataflow.


How to do it…

1. First of all, let's calculate the average values per department and gender we are interested in. Execute the following queries in SQL Server Management Studio:

-- Average vacation hours for all males in Production department
select avg(VacationHours) as AvgVacHrs from dbo.DimEmployee
where DepartmentName = 'Production' and Gender = 'M' and Status = 'Current';

-- Average sick hours for all females in Production department
select avg(SickLeaveHours) as AvgSickHrs from dbo.DimEmployee
where DepartmentName = 'Production' and Gender = 'F' and Status = 'Current';

-- Average base rate for all employees in Quality Assurance department
select avg(BaseRate) as AvgBaseRate from dbo.DimEmployee
where DepartmentName = 'Quality Assurance' and Status = 'Current';

2. Please note the resulting values so that you can compare them with the results after we run our dataflow and update those fields:

3. Create a new job and a new dataflow object, and open the dataflow in the workspace window for editing.

4. Put the DimEmployee table object as a source inside your new dataflow and link it to the Case transform, which can be found at Local Object Library | Transforms | Platform | Case.

5. Open the Case Editor in the workspace by double-clicking on the Case transform. Here you can choose one of the three options and specify conditions as label-expression pairs (by modifying the Label and Expression settings), according to which a row will be sent to one output or another:


6. Label values are used to label the different outputs. You will use these labels to route information to different transform objects when you are linking the Case output to the next objects in a dataflow.

7. Check the Row can be TRUE for one case only option and add the following condition expressions by clicking on the Add button:

Label                      Expression
Female_in_Production       DIMEMPLOYEE.DEPARTMENTNAME = 'Production' AND DIMEMPLOYEE.STATUS = 'Current' AND DIMEMPLOYEE.GENDER = 'F'
Male_in_Production         DIMEMPLOYEE.DEPARTMENTNAME = 'Production' AND DIMEMPLOYEE.STATUS = 'Current' AND DIMEMPLOYEE.GENDER = 'M'
All_in_Quality_Assurance   DIMEMPLOYEE.DEPARTMENTNAME = 'Quality Assurance' AND DIMEMPLOYEE.STATUS = 'Current'

8. Your Case Editor should look like the following screenshot:


9. Now we have to link our Case transform output to three different Query transform objects. Each time you link the objects, you will be asked to choose the Case output from the ones we created before.

10. For the Query transform names, let's choose meaningful values that represent the type of transformations we are going to perform inside them:

The Increase_Sick_Hours Query transform is linked to the Female_in_Production Case output
The Increase_Vacation_Hours Query transform is linked to the Male_in_Production Case output
The Increase_BaseRate Query transform is linked to the All_in_Quality_Assurance Case output

11. Lastly, merge all Query outputs with the Merge transform object, link it to the Map_Operation transform object, and finally to the DimEmployee table object brought from the DWH datastore as a target table.

12. Please use the following screenshot as a reference for how your dataflow should look:

13. Now we have to configure the output mappings in our Query transforms. As we are interested in updating only three target columns (VacationHours, SickLeaveHours, and BaseRate), we map them from the source Case transform. The Case transform inherits all column mappings automatically from the source object. We also map the primary key column EmployeeKey so that Data Services knows which rows to update in the target.

14. Then, in each Query transform, modify the mapping expression of the corresponding column according to the business logic. Use the following table for the list of columns and their new mapping expressions. Remember that each of our Query transforms modifies only one corresponding column; the other column mappings should remain intact. We are simply going to propagate them from the source object:

Query transform            Modified column    Mapping expression
Increase_Sick_Hours        SICKLEAVEHOURS     Case_Female_in_Production.SICKLEAVEHOURS + 10
Increase_Vacation_Hours    VACATIONHOURS      Case_Male_in_Production.VACATIONHOURS + 5
Increase_BaseRate          BASERATE           Case_All_in_Quality_Assurance.BASERATE * 1.5

15. See the example of the Increase_Vacation_Hours mapping configuration:

16. The last object we need to configure is the Map_Operation transform object named Update. You should already know by now that the Query transform generates rows of the normal type, which are inserted into a target object when they reach the end of the dataflow.

17. In our example, as we want to perform an update of the non-key columns defined in our source dataset using matching primary key values in the target table, we need to modify the row type from normal to update:


18. To be absolutely clear about the purpose of this Map_Operation object, we change the other row types to discard, even though this dataflow would never produce insert, update, or delete rows on its own.

19. Save and run the job, and then rerun the queries to see the new average results for the columns updated in the table:

The difference between the "before" and "after" values proves that Data Services correctly updates the required rows in the DimEmployee table.


How it works…

The developed dataflow is a good example of a dataflow performing an update of the target table.

We have split the rows according to the conditions specified, performed the required transformation of the data according to the logic provided in the conditions, and then merged all split datasets back together and modified all row types to update. We did this so that Data Services would execute UPDATE statements for the whole dataset, updating the corresponding rows that have the same primary key values.
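To visualize the net effect, the statement for a single update-type row of, say, a male Production employee behaves roughly like the following SQL. This is an illustration only: the EmployeeKey used here is hypothetical, and Data Services actually sends the literal values computed in the dataflow rather than expressions:

update dbo.DimEmployee
set VacationHours  = VacationHours + 5,   -- changed by the Increase_Vacation_Hours branch
    SickLeaveHours = SickLeaveHours,      -- propagated unchanged from the source
    BaseRate       = BaseRate             -- propagated unchanged from the source
where EmployeeKey = 42;                   -- hypothetical primary key of one affected row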

As we used the target table as a source object as well, we can be sure that we will not have any extra rows in our update dataset that do not exist in the target.

Note that the dataset generated in your dataflow does not have to match the target table structure exactly. When you perform the update of the target table, make sure you have the primary key defined correctly, and keep in mind that all columns defined as non-primary key columns in the source schema structure will be updated in the target table.

Note
Data Services uses the primary key columns defined in the target table to find the matching rows. If you want to use a different set of columns to find the corresponding record to update in the target, set them up as primary key columns in the output schema of the Query transform inside the dataflow, and set Use input keys to Yes in the Update control section of the target table object.

There is another, less elegant way of doing the same thing that the Case transform does. It involves using the WHERE tab of the Query transforms to filter the data required for transformation:

That does look like a simpler solution, but there are two main disadvantages:

You lose readability of your code: With the Case transform, you can see the labels of the outputs, which explain the conditions used to split the data.
You lose in performance: Instead of splitting the dataset, you actually send it three times to different Query transforms, each of which performs the filtering. Technically, you are tripling the dataset, making your dataflow consume much more memory.


Monitoring and analyzing dataflow execution

When you execute a job, Data Services populates relevant execution information into three log files: the trace, monitor, and error logs. In later chapters, we will take a closer look at the configuration parameters available at the job level in order to gather more detailed information regarding job execution. Meanwhile, in this recipe, we will spend some time analyzing the monitor log file, which logs processing information from inside the dataflow components.


Getting ready

For simplicity, we will use the second dataflow from the Using the Table_Comparison transform recipe, created for a detailed explanation of the flow of the data before and after it passes the Table_Comparison transform object:

Open the Table_Comparison transform editor in the workspace and change the comparison method to Cached comparison table:

We change this option to slightly alter the behavior of the Data Services optimizer. Now, instead of comparing data row by row, executing a SELECT statement against the comparison table in the database for each input row, Data Services will read the whole comparison table and cache it on the Data Services server side. Only after this will it perform the comparison of the input dataset records with the table records cached on the Data Services server side. That slightly speeds up the comparison process and changes how the information about dataflow execution is logged in the monitor log.
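A rough way to picture the difference between the two comparison methods is the kind of SQL involved. This is illustrative only; the real statements are generated internally by the engine:

-- Row-by-row comparison: one lookup against the comparison table per input row
select ProductKey, EnglishDescription
from DWH.dbo.DimProduct
where ProductKey = 210;          -- repeated for every input row

-- Cached comparison table: the whole comparison table is read once,
-- cached on the job server, and the comparison happens there
select ProductKey, EnglishDescription
from DWH.dbo.DimProduct;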


How to do it…

1. Save the dataflow and execute the job with the default parameters as usual.

2. In the main workspace, open the Job Log tab to show the trace log section, which contains information about job execution. To see the monitor log, click on the second button at the top of the workspace area. For convenience, you may select the records from the log you are interested in and copy and paste them into an Excel spreadsheet using the right-click context menu:

3. The monitor log section displays information about the number of records processed by each dataflow component and how long it takes to process them. The reader components shown in the following screenshot are responsible for extracting information from the source database tables. You can see that the DimProduct table is extracted by a separate process (probably because it is located in a different database), whereas the other three tables are joined and extracted with a single SELECT statement by a single component with quite a sophisticated name, as you can see:

4. The component Join_PRODUCT_TEST_COMPARE passes the dataset from the Join Query transform to the first target table, PRODUCT_TEST_COMPARE. You can see that it has processed 396 rows in 0.136 seconds:

5. Finally, the information about the dataflow components responsible for processing data in the Map_Operation transforms shows that 210 rows were processed by the MO_Delete transform and passed to the target PRODUCT_DESC_DELETE template table. Only one row was processed by MO_Update and passed to the corresponding target table, and no rows were processed by MO_Insert, as there weren't any rows with the insert row type generated by this dataflow:

6. The last column shows the total time elapsed in the executed dataflow object while the component was processing records.


How it works…

Data Services puts processing information from all dataflow objects in a single place. If you have a job with 100 dataflows and some of them run in parallel, you can imagine that the records in the monitor log can get mixed up. That is why copying the log data to a spreadsheet for further searching and filtering with the functionality of Excel is quite useful.

Dataflow execution is a very complex process, and the components you see in the monitor log are not always in a one-to-one relationship with the objects placed inside a dataflow. There are various internal service components performing joins, splits, and the merging of data that will be displayed in the monitor log. Sometimes Data Services creates a few processing components for a single transform object.

If you know what you are looking for, reading the monitor log is much easier. Here is a summary of what the columns mean:

The first column in the monitor log is the name of the component, containing the name of the dataflow and the names of the components inside the dataflow.
The second column is the status of the processing component. READY means that the component has not started processing data; in other words, no records have reached it yet. PROCEED means that the component is processing rows at the moment, and STOP means that all rows have passed the component and it has finished processing them by passing them further down the dataflow execution sequence.
The third column shows you the number of rows processed by a component. This value is in flux while the component has the PROCEED status and attains a final value when the component's status changes to STOP.
The fourth column shows you the execution time of the component.
The fifth column shows you the total execution time of the dataflow while the component was processing the rows. As soon as the component's status changes to STOP, both execution time values freeze and stop changing.

To illustrate this even further, let's count the rows in the source tables to compare them with what we have seen in the monitor log.
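If you prefer to verify the same row counts outside the Designer, equivalent queries can be run in SQL Server Management Studio. The schema names below are only assumptions; adjust them to wherever the source tables live in your copies of the DWH and AdventureWorks_OLTP databases:

select count(*) from DWH.dbo.DimProduct;
select count(*) from dbo.Product;
select count(*) from dbo.ProductDescription;
select count(*) from dbo.ProductModelProductDescriptionCulture;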

First, see the results of counting the number of records in the DIMPRODUCT and PRODUCTMODELPRODUCTDESCRIPTIONCULTURE tables with the help of the View Data function available on the Profile tab for table objects inside a dataflow. Click on the Records button to calculate the number of records in the table:

Now see the result of counting the number of records in the PRODUCT and PRODUCTDESCRIPTION tables with the same View Data | Profile feature:


By using the transform name Join, you can see the components related to the execution of the first Query transform.

You see the DIMPRODUCT_11 component (606 rows) as not being part of the Join transform components because it was executed separately. Data Services could not include it in a single SELECT statement (remember that this table is in the DWH database) with the three other tables that had join conditions specified inside the Join transform. Data Services recognized those three as belonging to the same database and pushed a single SELECT statement down to the database level, extracting 294 rows.

Some components, such as the Map_Operation-related ones, are easily recognizable by name, which includes the name of the current transformation and the name of the next target table object: Join_PRODUCT_TEST_COMPARE, MO_Update_PRODUCT_DESC_UPDATE, and so on.

The Table_Comparison execution is the most complex one, as you can see from the monitor log. All compared datasets are first cached by separate components and then compared to each other by the other ones. You can identify components belonging to a Table_Comparison transform by the keywords TCRdr and Table_Comparison.


There’smore…Readingthemonitorlog,whichisthemainsourceofthedataflowexecutioninformation,canrequirealotofexperience.Inthefollowingchapters,wewillspendalotoftimepeekingintothemonitorlogfordifferentkindsofinformationaboutthedataflowexecution.Often,itisveryusefulforidentifyingpotentialperformancebottlenecksinsidethedataflow.


Chapter 5. Workflow – Controlling Execution Order

This chapter will explain in detail another type of Data Services object: the workflow. Workflow objects allow you to group other workflows, dataflows, and script objects into execution units. In this chapter, we will cover the following topics:

Creating a workflow object
Nesting workflows to control the execution order
Using conditional and while loop objects to control the execution order
Using the bypassing feature
Controlling failures – try-catch objects
Use case example – populating dimension tables
Using a continuous workflow
Peeking inside the repository – parent-child relationships between Data Services objects


Introduction

In this chapter, we will move to the next object in the Data Services hierarchy of objects used in ETL design: the workflow object. Workflows do not perform any movement of data themselves; their main purpose is to group dataflows, scripts, and other workflows together.

In other words, workflows are container objects grouping pieces of ETL code. They help define the dependencies between various pieces of ETL code in order to provide a robust and flexible ETL architecture.

I will also show you how you can query the Data Services repository using database tools in order to explore the hierarchy of objects directly, and how this hierarchy is stored in repository database tables. This may be very useful if you want to understand a bit more about how the software functions "under the hood".

Additionally, we will build real-life use case ETL code by populating dimension tables in the data warehouse. This use case example will include the functionality already reviewed in the previous chapters and will show you how you can augment existing ETL processes that migrate data (dataflows) with the help of workflow objects.


Creating a workflow object

A workflow object is a reusable object in Data Services. Once created, the same object can be used in different places of your ETL code. For example, you can place the same workflow in different jobs or nest it in other workflow objects by placing it in their workspace.

Note
Note that a workflow object cannot be nested inside a dataflow object. Workflows are used to group dataflow objects and other workflows so that you can control their execution order.

Every workflow object has its own local variable scope and can have a set of input/output parameters so that it can "communicate" with the parent object (in which it is nested) by accepting input parameter values or sending values back through output parameters. A script object placed inside the workflow becomes part of the workflow and shares its variable scope. That is why all workflow local variables can be used within the scripts placed directly into the workflow or passed to the child objects by going to Variables and Parameters | Calls.

Later in this chapter, we will explore how this object hierarchy is stored within the Data Services repository.


How to do it…

There are a few ways to create a workflow object. Follow these steps:

1. To create a workflow object in the workspace of another parent object, you can use the tool palette on the right-hand side of the Designer interface. Follow these steps:
   1. Create a new job and open it in the workspace for editing.
   2. Left-click on the Work Flow icon in the workspace tool palette (see the following screenshot), drag it to the job workspace, and left-click on the empty space in the workspace to place the new workflow object:
   3. Name the object WF_example and press Enter to create it. Note that the object immediately appears in the Local Object Library workflow list. The parent object of the WF_example workflow is the job itself.

2. Create another workflow object inside WF_example. Now, we will use a different method, creating workflows directly from the Local Object Library rather than using the workspace tool palette. Perform these steps:
   1. Open WF_example in the main workspace window.
   2. Go to the Local Object Library window and select the Work Flows tab.
   3. Right-click on the empty area of this tab in the Local Object Library and choose New from the context menu.
   4. Fill in the workflow name, WF_example_child, and drag and drop the created object from the Local Object Library to the workspace area of WF_example.


How it works…

A workflow object organizes and groups pieces of ETL processes (dataflows and sometimes scripts). It does not perform any data processing itself. When it is executed, it simply starts executing all its child objects sequentially (or in parallel) in the order defined by the user.

You can think of a workflow as a container that holds executable elements. Just as a project object functions like a root folder, a workflow serves the same "folder" purpose with a few extra features, which you will get familiar with in the next few recipes.

Like the folder structure on your disk, you can create sophisticated nested tree structures with the help of workflow objects by putting them into each other.

One thing to remember is that each workflow has its own scope of variables, or context. To pass variables from a parent workflow to a child object, select the Calls tab on the Variables and Parameters panel. It shows the list of input parameters of the child objects for the object currently open in the main workspace area.

To open the Variables and Parameters window, you can click on the Variables button in the tool menu at the top of your Designer screen.

Here, you see the context of the currently open object, that is, the list of defined local variables, input parameters, and available global variables inherited from the job context:

The Calls section allows you to pass your previously created local variable $WF_example_local_var of the WF_example workflow to the WF_example_child child workflow object's $WF_example_child_var1 input parameter, as shown here:

Of course, you have to open the child object context first and create an input parameter so that its call is visible in the context of the parent.

Scripts are not reusable objects and do not have a local variable scope or parameters of their own. They belong to the workflow or job object they have been placed into. In other words, they can see and operate only on the local variables and parameters defined at the parent object level.

Of course, you can copy and paste the contents of a single script object to another script object in a different workflow. However, it will be a new instance of the script object that runs in the new context of the different parent workflow. Hence, the variables and parameters used could be completely different.


Nesting workflows to control the execution order

In this recipe, we will see how workflow objects are executed in a nested structure.


Getting ready

We will not create dataflow objects in this recipe, so to prepare the environment, just create an empty job object.


How to do it

We will create a nested structure of a few workflow objects, each of which, when executed, will run a script. The script will display the current workflow name and the full path to the root job context. Follow these steps:

1. In the job workspace, create a new workflow object, WF_root, and open it.
2. In the Variables and Parameters window, while in the WF_root context, create one local variable $l_wf_name and one input parameter $p_wf_parent_name, both of the varchar(255) data type.
3. Also, inside WF_root, add a new script object named Script with the following code:

$l_wf_name = workflow_name();
print('INFO: running {$l_wf_name} (parent = {$p_wf_parent_name})');
$l_wf_name = $p_wf_parent_name || ' > ' || $l_wf_name;

4. In the same WF_root workflow workspace, add two other workflow objects, WF_level_1 and WF_level_1_2, and link all of them together.
5. Repeat steps 2 and 3 for both new workflows, WF_level_1 and WF_level_1_2.
6. Open WF_level_1, create a new workflow, WF_parallel, and link it to the script object.
7. Inside the WF_parallel workflow, create two other workflow objects, WF_level_3_1 and WF_level_3_2. Then, for WF_parallel itself, create only one input parameter, $p_wf_parent_name, without creating a local variable.
8. Repeat steps 2 and 3 for both the WF_level_3_1 and WF_level_3_2 workflows.
9. Now, we have to specify mappings for the input parameters of the created workflows. To do this, double-click on the parameter name $p_wf_parent_name by going to Variables and Parameters | Calls and input the name of the $l_wf_name local variable.
10. There are two exceptions to the input parameter mapping settings. In the context of the job, for the input parameter of the WF_root workflow, you have to specify the job_name() function as a value. Perform these steps:
   1. Open the job in the main workspace (so that the WF_root workflow is visible on the screen).
   2. Choose Variables and Parameters | Calls and double-click on the $p_wf_parent_name input parameter name.
   3. In the Value field, enter the job_name() function and click on OK.
11. The second exception is the input parameter mappings for the workflows WF_level_3_1 and WF_level_3_2. Perform the following steps:
   1. Open the WF_parallel workflow to see both WF_level_3_1 and WF_level_3_2 displayed on the screen.
   2. Go to Variables and Parameters | Calls and specify the following value for both input parameter calls:

(($p_wf_parent_name || ' > ') || workflow_name())

12. Your job should have the following nested workflow structure, as shown in the screenshot here:

The only workflow object that does not have a script object inside it is WF_parallel. The reason will be explained later in the recipe.

13. Now, open the job in the workspace area and execute it.
14. The trace log shows the order of workflow executions, the currently executed workflow names, and their location in the object hierarchy within the job. See the following screenshot:


How it works…

As we already passed values to the input parameters of objects in the previous chapter dedicated to the creation of dataflow objects, you probably know how this mechanism works. The object calls for the input parameter value right before its execution in the parent object where it is located.

Every workflow in our structure (except WF_parallel) has a local variable that is used in the script object to save and display the current workflow name and concatenate it to the workflow path in the hierarchy received from the parent object, in order to pass the concatenated value to the child objects in their calls.

Let's follow the execution steps:

When a job executes, it first runs the object located in the job context; in our case, it is WF_root. As we do not specify any local variable for the job, we cannot pass its value to the input parameter of the WF_root object. So, we simply pass it the job_name() function, which returns the name of the job in which it is executed. The job_name() function generates the value that is passed to the input parameter right before the WF_root execution.

The WF_root execution runs the script object from left to right. In the script, the local variable gets its value from the output of the workflow_name() function, which returns the name of the workflow in which it is executed. With the print() function, we display the local variable value and the value of the input parameter received from the parent object (the job). As the next step, the value of the local variable is concatenated with the value of the input parameter to get the current location path in the hierarchy for the child objects WF_level_1 and WF_level_1_2.

As all objects inside WF_root are linked together, they are executed sequentially from left to right. Every next object runs only after successful completion of the previous one.

Data Services runs WF_level_1 and repeats the same sequence of displaying the current workflow name and current path, with the subsequent concatenation and passing of the value to the input parameter of the WF_parallel workflow.

The WF_parallel workflow demonstrates how Data Services executes two workflow objects placed at the same level that are not linked to each other. Here, we cannot use a script to perform our usual sequence of script logic steps. If you try to add a script object not linked to the parallel workflows, Data Services gives you an error message from the job validation process:


If you try to link the script object to one of the workflows, you will get the following error message:

Note
Note how Data Services does not allow you to link the script object to both workflows.

If used within a job or a workflow, script objects disable parallel execution logic, allowing you only sequential execution within the current context:

To make sure that your workflows execute simultaneously and run in parallel, make sure that you do not use a script object in the same workspace.

That is why, when we pass the values to the input parameters of the two workflows executed in parallel, WF_level_3_1 and WF_level_3_2, we specify the concatenation formula right in the input parameter value field:

It is very important to understand that the two occurrences of $p_wf_parent_name in the preceding screenshot are two different parameters. The one on the left-hand side is the $p_wf_parent_name input parameter belonging to the child object WF_level_3_1, which asks for a value. The one on the right-hand side belongs to the current workflow WF_parallel, in whose context we are located at the moment, and it holds the value received from its parent object WF_level_1.

After completion of WF_level_3_1 and WF_level_3_2, Data Services completes the WF_parallel workflow, then the WF_level_1 workflow, and finally runs the WF_level_1_2 workflow. WF_root is the last workflow object to finish its execution within the job, after which the job completes successfully.

See the trace log again to follow the sequence of steps executed, and make sure that you understand why they were executed in this particular order.


Using conditional and while loop objects to control the execution order

Conditional and while loop objects are special control objects that branch the execution logic at the workflow level. In this recipe, we will modify the job from the previous recipe to make the execution of our workflow objects more flexible.

Conditional and loop structures in Data Services are similar to the ones used in other programming languages.

For readers with no programming background, here is a brief explanation of conditional and loop structures:

The IF-THEN-ELSE structure allows you to check the result of the conditional expression presented in the IF block and executes either the THEN block or the ELSE block, depending on whether the result of the conditional expression is TRUE or FALSE.
The LOOP structure in a programming language allows you to execute the same code again and again in a loop until the specified condition is met. You should be very careful when creating loop structures and correctly specify the condition that exits or ends the loop. If it is incorrectly specified, the code in the loop could run indefinitely, making your program hang.


Getting ready

Open the job from the previous recipe.


How to do it…

We will get rid of our WF_parallel workflow and execute only one of the underlying WF_level_3_1 or WF_level_3_2 workflows, chosen randomly. This is not a common scenario you will see in real life, but it gives a perfect example of how Data Services allows you to control your execution logic. Perform these steps:

1. Open WF_level_1 in the workspace and remove WF_parallel from it.
2. Using the tool palette on the right-hand side, create a conditional object and link your script object to it. Name the conditional object If_Then_Else:
3. Double-click on the If_Then_Else conditional object or choose Open from the right-click context menu.
4. You can see three sections: If, Then, and Else. In the Then and Else sections, you can put any executable elements (workflows, scripts, or dataflows). The If field should contain an expression returning a Boolean value. If it returns TRUE, then all objects in the Then section are executed in sequential or parallel order, depending on their arrangement. If the expression returns FALSE, then all elements from the Else section are executed:


5. Put WF_level_3_1 from the Local Object Library into the Then section.
6. Put WF_level_3_2 from the Local Object Library into the Else section.
7. Map the input parameter calls of each workflow to the local $l_wf_name variable of the parent WF_level_1 workflow object. You can now see that, without the WF_parallel workflow, both WF_level_3_1 and WF_level_3_2 operate within the context of the WF_level_1 workflow (remember that the conditional object does not have its own context and variable scope, and it is transparent in that aspect).
8. Type the following expression, which randomly generates 0 or 1, into the If section:

cast(round(rand_ext(), 0), 'integer') = 1

We will use this expression to randomly generate either 0 or 1 in order to execute the ETL placed in the THEN or ELSE block every time we run the Data Services job.
9. Save and execute the job. The trace log shows that only one workflow, WF_level_3_2, was executed. To have more visibility of the values generated by the If expression, you can compute the value in the script before If_Then_Else and assign it to a local variable, which can then be used in the If section of the If_Then_Else object to get the Boolean value:


Now,let’smakeourlastworkflowobjectinthejobrun10timesinaloop,usingthesesteps:

1. Open WF_root in the main workspace.
2. Delete WF_level_1_2 from the workspace.
3. Add a while loop object from the tool palette, name it While_Loop, and link it to WF_level_1, as shown in the following screenshot. As we know that we are going to run the loop for 10 cycles, we need to create a counter that we will use in the loop condition. For this purpose, create a $l_count local integer variable for the WF_root workflow and assign it the value 1 in the initial script. Your code in the Script object should look like this:

$l_wf_name = workflow_name();
print('INFO: running {$l_wf_name} (parent = {$p_wf_parent_name})');
$l_wf_name = $p_wf_parent_name || ' > ' || $l_wf_name;
$l_count = 1;


4. Open the While_Loop object in the workspace and place the WF_level_1_2 workflow inside it by copying or dragging it from the Local Object Library.
5. Place two script objects, script and increase_counter, before and after the workflow, and link all three objects together.
6. The initial script will contain the print() function displaying the current loop cycle, and the final script will increase the counter value by 1 (that is, $l_count = $l_count + 1;). You also have to put the conditional expression that checks the current counter value into the While field of the While_Loop object. The expression is $l_count <= 10:

The conditional expression is evaluated for every loop cycle; the loop completes successfully as soon as the conditional expression returns FALSE.

7. Map the $p_wf_parent_name input parameter of WF_level_1_2 to the local variable from the parent's context, $l_wf_name, by going to Variables and Parameters | Calls.
8. Save and execute the job. Check your trace log file to see that WF_level_1_2 was executed 10 times:


How it works…

The if-then-else construction is available in the scripting language as well, but as you already know, the usage of script objects with workflows is quite limited: you can only join these objects sequentially. This is where conditional objects come into action.

The main characteristic of the conditional and while loop objects is that they are not workflows and do not have their own context. They operate within the variable scope of their parent objects and can only be placed within a workflow or job object. That is why you need to create and define all local variables used in the if-then-else or while conditional expressions inside the parent object context.

Note
Script objects have their own if-then-else and while loop constructions, and to branch logic within dataflows, you can use the Case, Validation, or simply Query (with filtering conditions) transforms.


There is more…

Workflow objects themselves have a few options that control how they are executed within the job and add some flexibility to the ETL design. They will be explained in the following recipes of this chapter. For now, we will just take a look at one of them.

This is the Execute only once option available in the workflow object properties window.

To open it, just right-click on the workflow either in the workspace or in the Local Object Library and choose Properties… from the context menu:

To see the effect this option has on workflow execution, take the job from this recipe and tick this option for the WF_level_1_2 workflow, the one that runs in the loop.

Then, save the job and execute it. The trace log looks like this now:

What happens here is that after successfully executing the workflow in the first loop cycle, the while loop tries to do this another 9 times. However, as the workflow has already run within this job execution, Data Services skips it with a successful workflow completion status.

This option is rarely used within a loop, as you of course do not put anything in a loop that should be executed only once, but it shows how Data Services deals with such workflows.

The most common scenario is when you put a specific workflow in multiple branches of the workflow hierarchy as a dependency for other workflows, and you only need it to be executed once, without caring which branch it is executed in first as long as it completes successfully.

The scope of this option is restricted to the job level. If you place the workflow with this option enabled in multiple jobs and run them in parallel, the workflow will be executed once in each job.


Using the bypassing feature

The bypassing option allows you to configure a workflow or dataflow object to be skipped during job execution.


Getting ready…

We will use the same job as in the previous recipe.


How to do it…

Let's configure the WF_level_1 workflow object, which belongs to the parent WF_root workflow, to be skipped permanently when the job runs.

The configuration of this feature requires two steps: creating a bypassing substitution parameter and enabling the bypassing feature for the workflow using the created substitution parameter.

1. To create a bypassing substitution parameter, follow these steps:
   1. Go to Tools | Substitution Parameter Configurations….
   2. In the Substitution Parameter Editor window, you can see the list of default substitution parameters used by Data Services.
   3. Click on the empty field at the bottom of the list to create a new substitution parameter.
   4. You can choose any name you want, but remember that all substitution parameters start with a double dollar sign.
   5. Call your new substitution parameter $$BypassEnabled and choose the default value YES in the Configuration column to the right:
   6. As a final step, click on OK to create the substitution parameter.

2. Now, you can "label" any workflow object with this substitution parameter if you want it to be bypassed during job execution. Follow these steps:
   1. Open the WF_root workflow within your job to see WF_level_1 in the main workspace window.
   2. Right-click on the WF_level_1 workflow and choose Properties… from the context menu to open the workflow properties window.
   3. Click on the Bypass field combo box and choose the newly created substitution parameter from the list, [$$BypassEnabled]. By default, the {No Bypass} value is chosen in this field:
   4. Click on OK. The workflow becomes marked with a crossed red-circle icon. This means that during job execution, this workflow will be skipped, and the next object in the sequence will be executed straight away:


How it works…

Now, let's see what happens when you run the job:

During job validation, you can see a warning message telling you that a particular workflow will be bypassed:

When the job is executed, it runs the workflow sequence as usual, except when it gets to the bypassed workflow object. The workflow object is skipped, and all dependent objects (the object next in the sequence and the parent workflow where the bypassed object resides) consider its execution to be successful. If you take a look at the trace log of the job execution, you will see something similar to this screenshot:


There is more…

In Data Services, there is more than one way to set up a workflow object as bypassed. If you right-click on the workflow object, you will see that the Bypass option is available in the context menu directly. It opens the Set Bypass window with the same combo box list of substitution parameter values available for this option.

Note
You can not only bypass workflows; dataflow objects can be bypassed in the same manner.


Controlling failures – try-catch objects

In the Creating custom functions recipe in Chapter 3, Data Services Basics – Data Types, Scripting Language, and Functions, we created a custom function showing an example of try-catch exception handling in the scripting language. As with if-then-else and while loops, Data Services has a variation of the try-catch construction for the workflow/dataflow object level as well. You can put a sequence of executable objects (workflows/dataflows) between Try and Catch objects and then catch potential errors in the Catch object, where you can put the scripts, dataflows, or workflows that you want to run to handle the caught errors.


How to do it…

The steps to deploy and enable the exception handling block in your workflow structure are extremely easy and quick to implement.

All you have to do is place the object or sequence of objects from which you want to catch possible exceptions between two special objects, Try and Catch. Follow these steps:

1. Open the job from the previous recipe.
2. Open WF_root in the workspace.
3. Choose the Try object from the right-side tool palette and place it at the beginning of the object sequence. Name it Try:
4. Choose the Catch object from the right-side tool palette and place it at the end of the object sequence. Name it Catch:
5. The Try object is not modifiable and does not have any properties except a description. Its only purpose is to mark the beginning of the sequence for which you want to handle exceptions.
6. Double-click on the Catch object to open it in the main workspace. Note that all exception types are selected by default. This way, we make sure that we catch any possible failures that can happen during our code execution. Of course, there can be scenarios where you want the ETL to fail and do not want to run the code in the Catch block for some types of errors; in this case, you can deselect the exceptions to be handled in the Catch block. In our example, we just want our code to continue to run, putting the error message in the trace log.
7. Create the script object with the following line in it:

print('ERROR: exception has been caught and handled successfully');


8. Save and execute the job. The exception generated in the script (for example, by a raise_exception() call placed in the script inside WF_level_3_1) is successfully handled by the try-catch construction, and the job completes successfully.


How it works…

If you take a look at the trace log of your job run, you can see that the WF_level_3_1 and WF_level_1 workflows failed:

WF_level_3_1 failed as the exception was raised in the script inside it, and WF_level_1 failed because its execution depends on the child object WF_level_3_1. You should remember that if any child object within a workflow fails (another workflow, a dataflow, or a script), the parent object fails immediately. Then, the parent's parent object fails as well, and so on, until the root level of the job hierarchy is reached and the job itself fails and stops its execution.

By placing the try-catch sequence inside WF_root, we made it possible to catch all exceptions inside it, making sure that our WF_root workflow never fails.

Note
Try-catch objects do not prevent a job from failing in the case of a crash of the job server itself. This is, of course, because the successful execution of the try-catch logic depends on the Data Services job server being up and running.

Note that the error log is still generated in spite of the successful job execution. There, you can see the logging message that was generated by the logic from the catch object and the context in which the initial exception happened:


Try-catch objects can be a vital part of your recovery strategy. If your workflow contains a few steps that you can think of as a transactional unit, you will want to clean up when some of these steps fail before running the sequence again. As explained in the recipe dedicated to the recovery topic, the Data Services automatic recovery strategy simply skips the steps that have already been executed, and sometimes this is not enough.

It all depends on how thorough you have to be during your recovery.

Another very important aspect to understand is that try-catch blocks prevent the failure of the workflow in whose context they are placed. This means that the error is hidden inside the try-catch and the parent workflow, and all subsequent objects down the execution path will be executed by Data Services.

There are situations when you definitely want to fail the whole job to prevent any further execution if some of the data processing inside it fails. You can still use try-catch blocks to catch the error in order to log it properly or do some extra steps, but after all this is done, the raise_exception() function is placed at the end of the catch block to fail the workflow.


Use case example – populating dimension tables

In this recipe, we will build an ETL job to populate two dimension tables in the AdventureWorks_DWH database, DimGeography and DimSalesTerritory, with data from the operational database AdventureWorks_OLTP.


Getting ready

For this recipe, you will have to create a new job. Also, create two new schemas in the STAGE database: Extract and Transform. To do this, open SQL Server Management Studio, expand Databases | STAGE | Security | Schemas, right-click on the Schemas folder, and choose the New Schema… option from the context menu. Specify your administrator user account as the schema owner.
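If you would rather create the schemas with a script than through the GUI, a minimal T-SQL equivalent is shown below. Run it in the STAGE database; dbo is used as the owner here only as an assumption, so substitute your administrator account if it differs:

USE STAGE;
GO
CREATE SCHEMA Extract AUTHORIZATION dbo;
GO
CREATE SCHEMA Transform AUTHORIZATION dbo;
GO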


How to do it…

1. In the first step, we will create the extraction processes using these steps:
   1. Open the job context and create the WF_extract workflow.
   2. Open the WF_extract workflow in the workspace and create four workflows, one for every source table we extract from the OLTP database: WF_Extract_SalesTerritory, WF_Extract_Address, WF_Extract_StateProvince, and WF_Extract_CountryRegion. Do not link these workflow objects, so that they run in parallel.
   3. Open WF_Extract_SalesTerritory in the main workspace area and create the DF_Extract_SalesTerritory dataflow.
   4. Open DF_Extract_SalesTerritory in the workspace area.
   5. Add a source table from the OLTP datastore: SalesTerritory.
   6. Place the Query transform after the source table, link them, open the Query transform object in the workspace, and map all source columns to the target schema by selecting them together and dragging them to the empty target schema section.
   7. Exit the Query Editor and add the target template table, SalesTerritory. Choose DS_STAGE as the datastore object and Extract as the owner to create a target stage table in the Extract schema of the STAGE database.
   8. Your dataflow and Query transform mapping should look as shown in the screenshots here:
   9. In the same manner, using steps 3 to 8, create extract dataflow objects for the other OLTP tables: Address (dataflow DF_Extract_Address), StateProvince (dataflow DF_Extract_StateProvince), and CountryRegion (dataflow DF_Extract_CountryRegion). Place each of the created dataflows inside a parent workflow with the same name, substituting the prefix DF_ with WF_, and put all extract workflows to run in parallel inside the WF_extract workflow object. To name the target template tables inside each of the dataflows, choose the same name as the source table object, and select DS_STAGE as the database for the table to be created in and Extract as the owner/schema:

2. Now,let’screatetransformationprocessesusingthesesteps:1. GotothejobcontextlevelinyourDesignerandopentheWF_transformobject.2. Aswewillpopulatetwodimensiontables,wewillcreatetwotransformation

workflowsrunninginparallelforeachoneofthem:WF_Transform_DimSalesTerritoryandWF_Transform_DimGeography.

3. OpenWF_transform_DimSalesTerritoryandcreateanewdataflowinitsworkspace:DF_Transform_DimSalesTerritory.

4. Openthedataflowobjectanddesignitasshowninthefollowingscreenshot:

5. ItisnowimportantforthetransformationdataflowstocreatetargettemplatetablesintheTransformschemacreatedearlier.ThenameofthetargettabletemplateobjectshouldbethesameasthetargetdimensiontableinDWH.

6. TheJoinQuerytransformperformsthejoinoftwosourcetablesandmapsthecolumnsfromeachoneofthemtotheQueryoutputschema.Aswedonotmigrateimagecolumns,specifyNULLasamappingfortheSalesTerritoryImageoutputcolumn.Also,specifyNULLasamappingforSalesTerritoryKey,asitsvaluewillbegeneratedinoneoftheloadprocesses:

Page 277: SAP Data Services 4.x Cookbook

7. TocreatethetransformationprocessforDimGeography,gobacktotheWF_transformworkflowcontextlevelandcreateanewworkflowWF_Transform_DimGeographywithadataflowDF_Transform_DimGeographyinside.

8. Inthedataflow,wewillsourcethedatafromthreeOLTPtables,Address,StateProvince,andCountryRegion,topopulatethestagetransformationtablewiththetabledefinitionthatmatchesthetargetDWHDimGeographytable:

9. SpecifyjoinconditionsforallthreesourcetablesintheJoinQuerytransformandmapthesourcecolumntothetargetoutputschema:

Page 278: SAP Data Services 4.x Cookbook

10. PlaceanotherQuerytransformandnameitMapping.LinktheJoinQuerytransformtotheMappingQuerytransformandmapthesourcecolumnstothetargetschemacolumnswhichmatchthetabledefinitionoftheDWHDimGeographytable.MaponeextracolumnTERRITORYIDfromsourcetotarget:

11. IntheMappingQuerytransform,placeNULLinthemappingsectionsforthecolumnsthatwearenotgoingtopopulatevaluesfor.

3. Now, we need to create the final load processes that will move the data from the stage transformation tables into the target DWH dimension tables. Perform these steps:
   1. Create and open the WF_load workflow, add two workflow objects, WF_Load_DimSalesTerritory and WF_Load_DimGeography, and link them together to run sequentially.
   2. Open WF_Load_DimSalesTerritory and create a dataflow object, DF_Load_DimSalesTerritory, inside it.
   3. This dataflow will compare the source data to the target DimSalesTerritory dimension table data and will produce a set of updates for the existing records whose values have changed in the source system, or will insert records with key column values that do not exist in the dimension table yet:
   4. In the Query transform, simply map all source columns from the DimSalesTerritory transformation table to the output schema.
   5. Inside the Table_Comparison object, define the target DWH DimSalesTerritory as the comparison table and specify SalesTerritoryAlternateKey as the key column and the three compare columns SalesTerritoryRegion, SalesTerritoryCountry, and SalesTerritoryGroup, as shown here:
   6. As the final step in the dataflow, before inserting data into the target table object, the Key_Generation transform helps you populate the SalesTerritoryKey column of the target dimension table with sequential surrogate keys. Surrogate keys are keys usually generated during the population of DWH tables. A surrogate key column identifies the uniqueness of the record: this way, you have a single column with a unique ID that you can use instead of referencing the multiple columns in the table that define the uniqueness of the record (see the illustrative SQL sketch after this list):
   7. By default, all dimension tables in the DWH database we are using have identity columns. In SQL Server, the identity column feature allows you to delegate the process of surrogate key creation to the SQL Server database: you simply insert the record without specifying a value for the identity column, and SQL Server populates the field for you with a sequential unique number. In our case, we want to have control over the key creation ourselves, to be able to generate the keys in the ETL before inserting the data. To do this, we have to enable IDENTITY_INSERT before inserting the records and disable it after the insert; otherwise, you will receive an error message from SQL Server informing you that you cannot populate identity columns with values, as this is done automatically by the database engine.

   To switch on the ability to insert surrogate keys into identity columns from Data Services, open the Target Table Editor of the DimSalesTerritory table and populate the Pre-Load Commands and Post-Load Commands tabs with the following two commands correspondingly:

   set identity_insert dimsalesterritory on
   set identity_insert dimsalesterritory off

   8. Now, let's create the second load process, populating the DimGeography dimension table. Create the DF_Load_DimGeography dataflow inside WF_Load_DimGeography and open it in the workspace area.
   9. The dataflow will have the same structure as the previous one, except that we will look up the SalesTerritoryKey in the already populated DimSalesTerritory dimension table:
   10. In the Query transform, map all columns from the stage Transform.DimGeography table and the SalesTerritoryKey column from the DWH DimSalesTerritory table to the output schema. For the join condition, specify the following one:

   DIMGEOGRAPHY.TERRITORYID = DIMSALESTERRITORY.SALESTERRITORYALTERNATEKEY

   11. The Mapping transform output schema definition matches the target table definition, and here we will finally drop the TERRITORYID column from the mappings, as we do not need it anymore.
   12. Specify the following settings in the Table_Comparison transform:
   13. In the Key_Generation transform, specify DWH.DBO.DIMGEOGRAPHY as the table name and GEOGRAPHYKEY as the generated key column.
   14. Also, do not forget to define the commands in the Pre-Load and Post-Load target table settings to switch on IDENTITY_INSERT and switch it off after the insert is complete. Use the following commands:

   set identity_insert dimgeography on
   set identity_insert dimgeography off
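For readers who like to think in SQL terms, the combined effect of the Key_Generation transform and the pre-load/post-load commands described in the steps above is roughly the following, shown here for DimSalesTerritory. This is an illustration only; Data Services performs the equivalent logic internally:

set identity_insert dbo.DimSalesTerritory on;

-- Key_Generation reads the current maximum surrogate key once...
declare @next_key int = (select isnull(max(SalesTerritoryKey), 0) from dbo.DimSalesTerritory);
-- ...and assigns @next_key + 1, @next_key + 2, and so on to the insert rows
-- before they are written to the target table.

set identity_insert dbo.DimSalesTerritory off;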


How it works…

Let's review the different aspects of the example we just implemented in the previous steps.

Mapping

Before you start the ETL development in Data Services, you have to define the mapping between the source columns of the operational database tables and the target columns of the data warehouse tables, and the transformation rules for the migrated data, if required. At this step, you also have to identify the dependencies between source data structures to correctly identify the types of joins required to extract the correct dataset.

Target column               Source table     Source column   Transformation rule
SalesTerritoryKey           NULL                             Generated surrogate key in DWH
SalesTerritoryAlternateKey  SalesTerritory   TerritoryID     Direct mapping
SalesTerritoryRegion        SalesTerritory   Name            Direct mapping
SalesTerritoryCountry       CountryRegion    Name            Direct mapping
SalesTerritoryGroup         SalesTerritory   Group           Direct mapping
SalesTerritoryImage                                          Not migrated

Table 1: Mappings for the DimSalesTerritory dimension

Here, you can find the mapping table for the DimGeography dimension:

Target column               Source table       Source column        Transformation rule
GeographyKey                NULL                                    Generated surrogate key in DWH
City                        Address            City                 Direct mapping
StateProvinceCode           StateProvince      StateProvinceCode    Direct mapping
StateProvinceName           StateProvince      Name                 Direct mapping
CountryRegionCode           CountryRegion      CountryRegionCode    Direct mapping
EnglishCountryRegionName    CountryRegion      Name                 Direct mapping
SpanishCountryRegionName    NULL                                    Not migrated
FrenchCountryRegionName     NULL                                    Not migrated
PostalCode                  Address            PostalCode           Direct mapping
SalesTerritoryKey           DimSalesTerritory  SalesTerritoryKey    Lookup
IpAddressLocator            NULL                                    Not migrated

Table 2: Mappings for the DimGeography dimension

The majority are direct mappings, which means that we do not change the migrated data and move it as is from source to target. The information in these mapping tables is used primarily in the Query transforms inside the dataflows to join the source tables together and map the source columns from the source to the target schema:

Dependencies

The next step is to define the dependencies between the populated target tables to understand in which order the ETL processes loading data into them should be executed. The preceding diagram shows that SalesTerritoryKey from the DimSalesTerritory dimension table is used as a reference key in the DimGeography dimension table. This means that the ETL processes populating each of these tables cannot be executed in parallel and should run sequentially, as when we populate the DimGeography table, we will require the information in DimSalesTerritory to be already updated.
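Once both dimensions are loaded, this dependency is easy to verify with a quick query in the DWH database. A count of zero means every DimGeography row references an existing DimSalesTerritory row; this is just a sanity check you can run yourself, not part of the job:

select count(*) as OrphanRows
from dbo.DimGeography g
left join dbo.DimSalesTerritory t
  on g.SalesTerritoryKey = t.SalesTerritoryKey
where t.SalesTerritoryKey is null;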

Page 284: SAP Data Services 4.x Cookbook

Development

After defining the mappings and transformation rules and making the decision about the execution order of ETL elements, you can finally open the Designer application and start developing the ETL job.

Note

The naming conventions for the workflow, dataflow, scripts, and different transformation objects, as well as for staging table objects, are very important. They allow you to easily read the ETL code and understand what resides in one table or another and what type of operation is performed by a specific dataflow or transformation object within a dataflow.

Our ETL job contains three main stages that are defined by three workflow objects created in the job's workspace. Each of these workflows plays the role of a container for the underlying workflow objects containing dataflows:

The first workflow container, WF_extract, contains the processing units that extract the data from the OLTP system into the DWH staging area. There are different advantages of this approach rather than extracting and transforming data within the same dataflow. The main reason is that by copying the data as is in the staging area, you access the production OLTP system only once, creating a consistent snapshot of the OLTP data at a specific point in time. You can query extracted tables in staging as many times as you want, without affecting the live production system's performance. We do not apply any transformations or mapping logic in these extraction processes and are simply copying the contents of the source tables as is.

The second workflow container, WF_transform, selects the data from the stage tables, assembles it, and transforms it to match the target table definition. At this stage, we will leave all surrogate key columns empty and NULL-out the columns for which we are not going to migrate values.

Note

In the DF_Transform_DimGeography dataflow, the target template table does not exactly match the DWH table's DimGeography definition. We will keep one extra column from the source, TERRITORYID, to reference another dimension table, DimSalesTerritory, at the load stage. Without this column, we would not be able to link these two dimension tables together.

The third workflow container, WF_load, loads the transformed datasets into the target DWH dimension tables. Another important operation this step performs is generating surrogate keys for the new records to be inserted into the target dimension table.

Another important decision you have to make when you populate dimension tables using the Table_Comparison transform is which set of keys defines a new record in the target dimension table and which columns you are checking for updated values.

In this example, we made a decision to select only two comparison columns, PostalCode and SalesTerritoryKey. Whenever there is a new location (City + State + Country), the record is inserted, and if the location exists, Data Services checks whether the source record coming from the OLTP system contains new values in the PostalCode or SalesTerritoryKey column. If yes, then the existing record in the target dimension table would be updated.
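
To make the comparison logic more tangible, the following query sketches in plain SQL which incoming rows would be treated as inserts and which as updates. It is purely an illustration, not what the Table_Comparison transform executes internally; it assumes the stage and DWH schemas are reachable from one connection (adjust database prefixes as needed):

-- Illustration only: the insert/update decision made per incoming row
with src as (
    -- approximates the dataflow input: transformed stage rows with SalesTerritoryKey looked up
    select g.City, g.StateProvinceName, g.EnglishCountryRegionName,
           g.PostalCode, t.SalesTerritoryKey
    from Transform.DimGeography g
         left join dbo.DimSalesTerritory t
                on g.TerritoryID = t.SalesTerritoryAlternateKey
)
select s.*,
       case
           when d.City is null then 'INSERT'                       -- new location
           when d.PostalCode <> s.PostalCode
             or d.SalesTerritoryKey <> s.SalesTerritoryKey
                then 'UPDATE'                                      -- compare columns changed
           else 'NO CHANGE'
       end as comparison_result
from src s
     left join dbo.DimGeography d
            on d.City = s.City
           and d.StateProvinceName = s.StateProvinceName
           and d.EnglishCountryRegionName = s.EnglishCountryRegionName;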

Note

Note that in the transformation processes we developed, we did not generate DWH surrogate keys for our new records. The main goal of the transformation process is to assemble the dataset for it to match the target table definition and apply all required transformations if the source data does not comply with the data warehouse requirements.

Execution order

All three steps or three workflows, WF_extract, WF_transform, and WF_load, run sequentially one after another. The next workflow starts execution only after successful completion of the previous one.

Child objects of both WF_extract and WF_transform run in parallel as, at those stages, we are not trying to link the migrated datasets to each other with reference keys.

At the final load stage, WF_load contains two workflow objects that run sequentially. First, we will fully populate and update the DimSalesTerritory dimension, and then after it's done, we can safely reference it when populating the DimGeography table.

Testing ETL

The best way to test ETL is to make changes to the source system, run the ETL job, and check the contents of the target data warehouse tables.

Preparing test data to populate DimSalesTerritory

Let's make some changes to the source data. We will add a new sales territory in the Sales.SalesTerritory table and a new state in the Person.StateProvince table. Run the following code in the SQL Server Management Studio:

-- Insert new records into source OLTP tables to test ETL
-- populating DimSalesTerritory
USE [AdventureWorks_OLTP]
GO

-- Insert new sales territory
INSERT INTO [Sales].[SalesTerritory]
       ([Name], [CountryRegionCode], [Group], [SalesYTD], [SalesLastYear]
       ,[CostYTD], [CostLastYear], [rowguid], [ModifiedDate])
VALUES
       ('Russia', 'RU', 'Russia', 9000000.00, 0.00
       ,0.00, 0.00, NEWID(), GETDATE());

-- Insert new state
INSERT INTO [Person].[StateProvince]
       ([StateProvinceCode], [CountryRegionCode], [IsOnlyStateProvinceFlag]
       ,[Name], [TerritoryID], [rowguid], [ModifiedDate])
VALUES
       ('CR', 'RU', 1, 'Crimea', 12, NEWID(), GETDATE());
GO

Preparing test data to populate DimGeography

To update the source tables, run the following script in the SQL Server Management Studio. This should create a new address with a new city which does not yet exist in the DimGeography dimension. You could skip this step as, by default, the OLTP database has multiple address records that do not have corresponding rows in the target DWH dimension, but to make the test more transparent, it is recommended that you create your own new record in the source system:

-- Insert new records into source OLTP tables to test ETL
-- populating DimGeography dimension
USE [AdventureWorks_OLTP]
GO

-- Insert new address
INSERT INTO [Person].[Address]
       ([AddressLine1], [AddressLine2], [City], [StateProvinceID]
       ,[PostalCode], [SpatialLocation], [rowguid], [ModifiedDate])
VALUES
       ('10 Suvorova St.', NULL, 'Sevastopol', 182, '299011', NULL, NEWID(), GETDATE());
GO

Now, execute the job and query both dimension tables. There is one new row inserted in DimSalesTerritory with SalesTerritoryKey = 12, and multiple records were inserted into and updated in the DimGeography table.

Among the new records in DimGeography, you should be able to see the record for the new city of Sevastopol that we inserted manually with the help of the preceding script.
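
If you prefer to verify this from the SQL Server Management Studio rather than from the Designer, queries similar to the following, run against the DWH database, will show the new rows (the column and table names follow the target schema used in this chapter):

select * from dbo.DimSalesTerritory where SalesTerritoryRegion = 'Russia';
select * from dbo.DimGeography where City = 'Sevastopol';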

Note

If you run the job again without making changes to the source system's data, it should not create or update any records in the target dimension tables, as all changes have already been propagated from OLTP to DWH by the first job run. The main object in our ETL driving the change tracking is the Table_Comparison transform.

Using a continuous workflow

In this recipe, we will take a close look at one of the workflow object features that controls how the workflow runs within a job.

How to do it…

1. Create a job with a single workflow inside named WF_continuous. Create a single global variable $g_count of the integer type at the job level context.
2. Open the workflow properties by right-clicking on the workflow object and selecting the Properties… option from the context menu and change the workflow execution type to Continuous on the General workflow properties tab:
3. Exit the workflow properties by clicking on OK. See how the icon of the workflow object changes when its execution type is changed from Regular to Continuous:

4. Go to Local Object Library | Custom Functions.
5. Right-click on the Custom Functions list and select New from the context menu.
6. Name the custom function fn_check_flag and click on Next to open the custom function editor.
7. Create the following parameters and variables:

Variable/parameter | Description
$p_Directory | Input parameter of the varchar(255) type to store the directory path value
$p_File | Input parameter of the varchar(255) type to store the file name value
$l_exist | Local variable of the integer type to store the result of the file_exists() function

8. Add the following code to the custom function body:

$l_exist = file_exists($p_Directory || $p_File);
if ($l_exist = 1)
begin
    print('Check: file exists');
    Return 0;
end
else
begin
    print('Check: file does not exist');
    Return 1;
end

Your custom function should look like this:

9. Open the workflow properties again to edit the continuous options using the Continuous Options tab.
10. On the Continuous Options tab, tick the checkbox When the result of the function is zero in the Stop section at the bottom and input the following line into the empty box: fn_check_flag($l_Directory, $l_File).
11. Click on OK to exit the workflow properties and save the changes.
12. Open the workflow in the main workspace and create two local variables in the Variables and Parameters window: $l_Directory of the varchar(255) type and $l_File of the varchar(255) type.
13. Create a single script object within a workflow and add the following code in it:

$l_Directory = 'C:\AW\Files\';
$l_File = 'flag.txt';
$g_count = $g_count + 1;
print('Execution #' || $g_count);
print('Starting ' || workflow_name() || '...');
sleep(10000);
print('Finishing ' || workflow_name() || '...');

14. Save and validate the job to make sure that there are no errors.
15. Run the job and after a few workflow execution cycles add the flag.txt file in the C:\AW\Files\ directory to stop the continuous workflow execution sequence and the job itself.

How it works…

The Continuous execution type allows you to run the workflow object an indefinite number of times in a loop. There are many restrictions on using the continuous workflow execution mode. Some of them are as follows:

You cannot nest a continuous workflow in another workflow object
Some dataflow transforms are not available for use when placed under a continuous workflow hierarchy structure
A continuous workflow object can be used only in the batch job

The main purpose of the continuous workflow is not to substitute the while loop, as you might have thought at a first glance, but to save memory and processing resources for the tasks that have to be executed again and again, indefinitely in the non-stop mode or for a very long period of time. Data Services saves resources by initializing and optimizing for execution all the underlying structures such as dataflows, datastores, and memory structures required for dataflow processing only once, when the continuous workflow object is executed for the first time.

The release resources section inside the Continuous options controls how often resources used by the underlying objects are released and reinitialized.

It is not possible to specify the exact number of cycles for the continuous workflow directly. The only option to add the stop logic is to write a custom function that is executed after every cycle; if it returns zero, the continuous workflow execution sequence stops.

In the preceding recipe, we created a custom function that checks the presence of the file in the specified folder. If the file appears in there, it returns 0. The job will be running indefinitely until the file appears in the folder, or the job itself is killed manually, or the job server crashes.

To check the existence of the file, the file_exists() function is used. It returns 1 if the file exists and 0 if it does not. The function accepts a single parameter: a full file name that includes the path. As in our case we are interested in stopping the continuous workflow execution, and the stop condition fires when the function returns 0, we had to invert the value returned by file_exists() and created a custom function for that.

We added the sleep() function to imitate the execution of the workflow so that it would be easy to place the file while the execution cycle is still running. The sleep() function accepts an integer parameter in milliseconds, so 10000 is equal to 10 seconds.

The global variable $g_count was added to keep track of the number of cycles that were executed in the continuous workflow sequence.

Another interesting fact about how a continuous workflow behaves is that it always executes another cycle after the stop function returns the zero value. Look at the following screenshot:

See that in spite of the fact that we placed the flag.txt file during the third execution cycle and the stop function found it and returned a zero value (see the Check: file exists print message in the trace log), the fourth cycle was still executed.

Let's try another test to confirm this. Place the flag.txt file before the job is executed and then run it. This is what you see in the trace log file:

You can see that even though the custom function returned 0 after the first cycle, the continuous workflow was executed a second time.

There is more…

You have to understand that continuous workflow usage is very limited in real life because of functional restrictions and also because of the nature of the loop in which the workflow is executed. In the majority of cases, the while loop object is a preferable option to run the workflow or underlying processing sequence of objects.

Peeking inside the repository – parent-child relationships between Data Services objects

With the introduction of workflow objects, which allow the nesting and grouping of objects, you can see that ETL code executed within the Data Services job is a hierarchical structure of objects that can be quite complex. Just imagine: real-life jobs can have hundreds of workflows in their structure and twice as many dataflows.

In this recipe, we will look under the hood of Data Services to see how it stores the object information (our ETL code) in the local Data Services repository. Techniques learned in this recipe can help you browse the hierarchy of objects within your local repository with the help of the database SQL language toolset. This often proves to be a very convenient method to use.

Getting ready

You will not create any jobs or other objects in the Data Services Designer as we are just going to browse the ETL code and run a few queries in the SQL Server Management Studio.

How to do it…

Follow these simple steps to access the contents of the Data Services local repository in this recipe:

1. Start the SQL Server Management Studio and connect to the DS_LOCAL_REPO database created in Chapter 2, Configuring the Data Services Environment.
2. Query the dbo.AL_PARENT_CHILD table for references between Data Services objects and additional info.
3. Query the dbo.AL_LANGTEXT table for extra object properties and script object contents.

How it works…

Querying object-related information from the Data Services repository could be useful if you want to build a report on ETL metadata that does not exist out of the box in Data Services. It also could be useful when troubleshooting potential problems with your ETL code. We will take a look at the different scenarios and briefly explain each case.

Get a list of object types and their codes in the Data Services repository

Use the following query:

select descen_obj_type, descen_obj_r_type, count(*)
from dbo.al_parent_child
group by descen_obj_type, descen_obj_r_type;

The main table of the reference is the AL_PARENT_CHILD table. It contains the full hierarchy of the objects starting from the job object level and finishing with the table object level. The preceding query shows all the possible object types that Data Services registers in the repository.

Display information about the DF_Transform_DimGeography dataflow

Use the query to get this information:

select *
from dbo.al_parent_child
where descen_obj = 'DF_Transform_DimGeography';

All columns and their values are explained in this table:

Column name | Value | Description
PARENT_OBJ | WF_Transform_DimGeography | This is the name of the parent object DF_Transform_DimGeography belongs to. See the following figure.
PARENT_OBJ_TYPE | WorkFlow | This is the type of the parent object.
PARENT_OBJ_R_TYPE | 0 | This is the type code of the parent object.
PARENT_OBJ_DESC | No description available | This is the description of the parent object. This is what you input in the Description field inside the workflow properties window in the Designer. If empty, Data Services uses "No description available" in the repo table.
PARENT_OBJ_KEY | 175 | This is the internal parent object key (ID).
DESCEN_OBJ | DF_Transform_DimGeography | This is the object name we are looking up information for.
DESCEN_OBJ_TYPE | DataFlow | This is the type of the object.
DESCEN_OBJ_R_TYPE | 1 | This is the type code of the object.
DESCEN_OBJ_DESC | No description available | This is the contents of the Description field of dataflow properties in the Designer. It is empty for this specific dataflow.
DESCEN_OBJ_USAGE | NULL | This indicates whether the object is a source or a target within a dataflow. As the object itself is a dataflow, this field is not populated.
DESCEN_OBJ_KEY | 174 | This is the internal object key (ID).
DESCEN_OBJ_DS | NULL | This indicates which datastore the object belongs to. As the object we are looking up is a dataflow, this field is not populated.
DESCEN_OBJ_OWNER | NULL | This is the database owner of the object. It is not applicable to dataflow objects either.
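
Because AL_PARENT_CHILD stores only direct parent-child pairs, you can walk the whole hierarchy of a job with a recursive query. The following is just a sketch for SQL Server; the starting job name is the one used in this chapter:

-- Walk the full object hierarchy of a job, level by level
with hierarchy (parent_obj, descen_obj, descen_obj_type, lvl) as (
    select parent_obj, descen_obj, descen_obj_type, 1
    from dbo.al_parent_child
    where parent_obj = 'Job_DWH_DimGeography'
    union all
    select c.parent_obj, c.descen_obj, c.descen_obj_type, h.lvl + 1
    from dbo.al_parent_child c
    join hierarchy h on c.parent_obj = h.descen_obj
)
select * from hierarchy order by lvl, parent_obj;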

Display information about the SalesTerritory table object

Use the following query:

select parent_obj, descen_obj_desc, descen_obj_usage, descen_obj_key, descen_obj_ds, descen_obj_owner
from dbo.al_parent_child
where descen_obj = 'SALESTERRITORY';

The result is in the following screenshot:

From the preceding screenshot, you can see that two different objects with the same name SALESTERRITORY exist in the Data Services repository with unique keys 37 and 38.

The one with OBJ_KEY as 37 is imported in the OLTP datastore and belongs to the Sales schema. It is used only in DF_Extract_SalesTerritory as it has only one record with the parent object of that name.

The SALESTERRITORY object with OBJ_KEY as 38 is a stage area table and is imported into the DS_STAGE datastore and belongs to the Extract database schema. It has two different parent objects, as in Designer, it was placed into two different dataflows: as a target table object in DF_Extract_SalesTerritory (you can see it from the DESCEN_OBJ_USAGE column) and as a source table object in DF_Transform_DimSalesTerritory.

See the contents of the script object

The one thing you have probably noticed already from the result of the very first query in this recipe is that Data Services does not have a script object type.

As you probably remember, script objects do not have their own context in Data Services and operate in the context of the workflow object they belong to. That is why you have to query the information about workflow properties using another table, AL_LANGTEXT, to find the information about script contents in the Data Services repository.

Use the following query:

select *
from dbo.al_langtext txt
join dbo.al_parent_child pc
    on txt.parent_objid = pc.descen_obj_key
where pc.descen_obj = 'WF_continuous';

We are extracting information about the script object created in the WF_continuous workflow.

All workflow properties with the contents of all scripts that belong to it are stored in a plain text format.

In this table, we are only interested in two columns: SEQNUM, which represents the number of the properties text row, and TEXTVALUE, which stores the properties text row itself.
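
Because the text is split across multiple rows, it is usually convenient to select only these two columns and order the result by SEQNUM so that the rows can be read (or concatenated) in the right sequence. A minimal variant of the previous query:

select txt.seqnum, txt.textvalue
from dbo.al_langtext txt
join dbo.al_parent_child pc
    on txt.parent_objid = pc.descen_obj_key
where pc.descen_obj = 'WF_continuous'
order by txt.seqnum;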

See the concatenated version of information stored in the TEXTVALUE column of the AL_LANGTEXT repository table here:

AlGUIComment("ActaName_1" = 'RSavedAfterCheckOut', "ActaName_2" = 'RDate_created', "ActaName_3" = 'RDate_modified', "ActaValue_1" = 'YES', "ActaValue_2" = 'Sat Jul 04 16:52:33 2015', "ActaValue_3" = 'Sun Jul 05 11:18:02 2015', "x" = '-1', "y" = '-1')
CREATE PLAN WF_continuous::'7bb26cd4-3e0c-412a-81f3-b5fdd687f507' ()
DECLARE
    $l_Directory VARCHAR(255);
    $l_File VARCHAR(255);
BEGIN
AlGUIComment("UI_DATA_XML" = '<UIDATA><MAINICON><LOCATION><X>0</X><Y>0</Y></LOCATION><SIZE><CX>216</CX><CY>-179</CY></SIZE></MAINICON><DESCRIPTION><LOCATION><X>0</X><Y>-190</Y></LOCATION><SIZE><CX>200</CX><CY>200</CY></SIZE><VISIBLE>0</VISIBLE></DESCRIPTION></UIDATA>', "ui_display_name" = 'script', "ui_script_text" = '$l_Directory='C:\\AW\\Files\\';
$l_File='flag.txt';
$g_count=$g_count+1;
print('Execution #'||$g_count);
print('Starting '||workflow_name()||'...');
sleep(10000);
print('Finishing '||workflow_name()||'...');', "x" = '116', "y" = '-175')
BEGIN_SCRIPT
$l_Directory = 'C:\AW\Files\';
$l_File = 'flag.txt';
$g_count = ($g_count + 1);
print(('Execution #' || $g_count));
print((('Starting ' || workflow_name()) || '...'));
sleep(10000);
print((('Finishing ' || workflow_name()) || '...'));
END
END
SET("loop_exit" = 'fn_check_flag($l_Directory, $l_File)', "loop_exit_option" = 'yes', "restart_condition" = 'no', "restart_count" = '10', "restart_count_option" = 'yes', "workflow_type" = 'Continuous')

The first highlighted section of the preceding code is the declaration section of local workflow variables created for WF_continuous. The second highlighted section marks the text that belongs to the underlying script object. You can see that the script object is not considered by Data Services as a separate object entity and is just a property of the parent workflow object. To compare, take a look at what the script contents look like in Designer:

$l_Directory = 'C:\AW\Files\';
$l_File = 'flag.txt';
$g_count = $g_count + 1;
print('Execution #' || $g_count);
print('Starting ' || workflow_name() || '...');
sleep(10000);
print('Finishing ' || workflow_name() || '...');

You can see that the formatting of the same information stored in the TEXTVALUE field is a bit different. So, be careful when extracting and parsing this data from the local repository.

Finally, the third highlighted section marks the workflow properties configured with the Properties… context menu option in Designer.

Note

There is another version of the AL_LANGTEXT table that contains the same properties information but in the XML format. It is the AL_LANGXMLTEXT table.

Chapter 6. Job – Building the ETL Architecture

In this chapter, we will cover the following topics:

Projects and jobs – organizing ETL
Using object replication
Migrating ETL code through the central repository
Migrating ETL code with export/import
Debugging job execution
Monitoring job execution
Building an external ETL audit and audit reporting
Using built-in Data Services ETL audit and reporting functionality
Auto Documentation in Data Services

Introduction

In this chapter, we will go up to the job level and review the steps in the development process that make a successful and robust ETL solution. All recipes presented in this chapter can fall into one of the three categories: ETL development, ETL troubleshooting, and ETL reporting. These categories include design techniques and processes usually implemented and executed sequentially in order within the ETL lifecycle.

Here, you can see which topics fall under which category.

Developing ETL:

Projects and jobs – organizing ETL
Using object replication
Migrating ETL code through the central repository
Migrating ETL code with export/import

The developing category discusses issues faced by ETL developers on a daily basis when they work on designing and implementing an ETL solution in Data Services.

Troubleshooting ETL:

Debugging job execution
Monitoring job execution

The troubleshooting category explains in detail the troubleshooting techniques that can be used in Data Services Designer to troubleshoot the ETL code.

Reporting on ETL:

Building external ETL audit and audit reporting
Using built-in Data Services ETL audit and reporting functionality
Auto Documentation in Data Services

The reporting category reviews the methods used to report on ETL metadata and also explains the Auto Documentation feature available in Data Services to quickly generate and export documentation for the developed ETL code.

Projects and jobs – organizing ETL

Projects are a simple and great mechanism to group your ETL jobs together. They are also mandatory components of ETL code organization for various Data Services features, such as Auto Documentation and batch job configuration available in the Data Services Management Console.

Getting ready

There are no preparation steps. You have everything you need in your local repository that has already been created. In this recipe, we will use Job_DWH_DimGeography developed in Chapter 5, Workflow – Controlling Execution Order, to populate the DWH dimension tables DimSalesTerritory and DimGeography.

How to do it…

To create a project object in Data Services, follow these steps:

1. Open the Local Object Library window and choose the Projects tab.
2. Right-click in the empty space of the Projects tab and select New from the context menu. The Project – New window appears on the screen.
3. Input the project name as DWH_Dimensions in the Project Name field.
4. Open the Project Area window using the Project Area button on the toolbar at the top:
5. Go to Project Area | Designer. You will only see the contents of one selected project. To select the project or make it visible in the Project Area | Designer window, go to Local Object Library | Projects and either double-click on the project you are interested in (in our case, there is only one project created) or choose Open from the context menu of the selected project.
6. To add the job in the project, drag and drop the selected job from Local Object Library | Jobs into the Project Area | Designer tab window or right-click on the job object in Local Object Library and choose the Add To Project option from the context menu. Add Job_DWH_DimGeography created in the previous recipe to the DWH_Dimensions project:

How it works…

This is all you need to do to create a project and place jobs in it. It is a very simple process that, in fact, brings you a few extra advantages that you can use in ETL development. The process also reveals new functionality not accessible otherwise in Data Services. Let's take a look at some of them.

Hierarchical object view

Available in the Project Area | Designer, this view allows you to quickly access any child object within a job. In the following screenshot, the expanding tree shows workflow, dataflow, and transformation objects; by clicking on any of them, you open them in the main workspace window:

History execution log files

These log files are available only if the job was assigned to a project. The Project Area | Log tab allows you to see and access all available log files (trace, performance, and error logs) kept by Data Services for specific jobs:

Executing/scheduling jobs from the Management Console

Yes, this option is available only for jobs that belong to a project.

Use http://localhost:8080/DataServices to start your Data Services Management Console.

Log in to the Management Console using the etluser account created in the Configuring user access recipe of Chapter 2, Configuring the Data Services Environment. It is the same user you use to connect to Data Services Designer.

Go to Administrator | Batch | DS4_REPO.

If you open the Batch Job Configuration tab, you will see that only Job_DWH_DimGeography is available for being executed/scheduled/exported for execution, as it was the only job in our local repository that we added to a created project:

As you can see, projects are the containers for your jobs, allowing you to organize and display your ETL code and perform additional tasks from the Management Console application. Keep in mind that you cannot add anything else except the job object directly into the project level.

Using object replication

Data Services allows you to instantly create an exact replica of almost any object type you are using in ETL development. This feature is useful to create new versions of an existing workflow or dataflow to test or just to create backups at the object level.

How to do it…

We will replicate a job object using these steps:

1. Go to Local Object Library | Jobs.
2. Right-click on the Job_DWH_DimGeography job and select Replicate from the context menu:
3. A copy of the job with the new name is created in the Local Object Library:

How it works…

All objects in Data Services can be identified as either reusable or not reusable.

A reusable object can be used in multiple locations, that is, a table object imported in a datastore can be used as a source or target object in different dataflows. Nevertheless, all these dataflows will reference the same object, and if changed in one place, it would change everywhere it is used.

Not reusable objects represent the instances of a specific object type. For example, if you copy and paste the script object from one workflow to another, these two copies will be two different objects, and by changing one of them, you are not making changes to another.

Let's take another example of a dataflow object. Dataflows are reusable objects. If you copy and paste the selected dataflow object into another workflow, you would create a reference to the same dataflow object.

To be able to make a copy of a reusable object so that the copy does not reference the original object it has been copied from, the replication feature is used in Data Services. Note that the replicated object cannot have the same name as the original object it has been replicated from. That is because for reusable objects such as workflows and dataflows, their names uniquely identify the object.

Note

The rule of thumb for checking whether an object type is reusable or not is to check if it exists in the Local Object Library panel. All objects that can be found on Local Object Library panel tabs are reusable objects, except Projects, as a project is not part of executable ETL code. Instead, it is a location folder that is used to organize job objects. Nevertheless, you cannot create two projects with the same name like you can with the script objects.

The following table shows which object type can be replicated in Data Services and how the replication process behaves for each one of them. All these are reusable object types.

Job | New object automatically created in Local Object Library named as Copy_<ID>_<original job name>
Workflow | New object automatically created in Local Object Library named as Copy_<ID>_<original workflow name>
Dataflow | New object automatically created in Local Object Library named as Copy_<ID>_<original dataflow name>
File format | New File Format Editor window is opened. The new name is already defined as Copy_<ID>_<original File Format name>, but you can change it by adding a new value into the name field
Custom functions | New Custom Function window is opened. You have to select a new name for the replicated function

The replication process is a convenient and easy way to perform object-level backups. All you have to do to create a copy of the object before editing it is to click on the Replicate option from the context menu of the object you are replicating.

It is also an easy way to test the code changes before you decide to update the production version of the ETL.

For example, if you want to see how your dataflow object behaves after you change the properties of the Table_Comparison transform inside it, you can perform the following sequence of steps:

1. Replicate the dataflow and set it up to run separately within a test job.
2. Run the test job and test the output dataset to make sure that it generates the expected result.
3. Rename the original dataflow by adding the _archive or _old suffix to it.
4. Rename the new replicated version to the original dataflow name.
5. Replace the archive dataflow object everywhere it is used with the new version.

To see all the parent objects that a specific object belongs to, in other words, to see all the locations where the specific object was placed, you can use one of the following steps:

1. Choose the DIMGEOGRAPHY object from the DWH datastore in Local Object Library. Right-click on it and choose the View Where Used option from the context menu. The parent objects that the table object belongs to are displayed in the Information tab of the Output window:

You can also see the number of parent objects (locations) for the object right away in Local Object Library in the Usage column available next to the object name. This is useful information that can help you identify unused or "orphaned" objects.

2. Pick the object of interest in the workspace area (for example, a dataflow placed within a workflow workspace or a table object placed in the dataflow), right-click on it, and choose View Where Used from the context menu. The list of parent objects will appear in the Output | Information window:
3. Finally, it is possible to check where the currently opened object is used. When you have the object opened in the workspace area and do not have the ability to right-click on it, instead of going to the Local Object Library lists in order to find the object, try to just click on the View Where Used button from the top tool menu panel:

Note

Remember that it displays the used locations list for the object currently displayed on the active tab of the main workspace area.

Migrating ETL code through the central repository

In this recipe, we will take a brief look at the aspects of working in the multiple-user development environment and how Data Services accommodates the need to migrate the ETL code between local repositories belonging to different ETL developers.

Getting ready

To use all the functionality available in Data Services to work in a multiuser development environment, we are missing a very important component: the configured central repository. So, to get ready, and before we explore this functionality, we have to create and deploy the central repository into our Data Services environment.

Perform all the following steps to create, configure, and deploy the central repository:

1. Open the SQL Server Management Studio and connect to the SQLEXPRESS server engine.
2. Right-click on Databases and choose the New Database… option from the context menu.
3. Name the new database as DS_CENTRAL_REPO and keep all its parameters with default values.
4. Start the SAP Data Services Repository Manager application.
5. Choose Repository type as Central and specify connectivity settings to the new database DS_CENTRAL_REPO. When you finish, click on the Create button to create central Data Services repository objects in the selected database:
6. The process of creating a repository can take a few minutes. If it is successful, you should see the following output on the screen:
7. Now, we need to register our newly created central repository within Data Services and the Information Platform Services (IPS) configuration. Start the Central Management Console web application by going to http://localhost:8080/BOE/CMC and log in to the administrator account. It is the same account that was created during the installation of Data Services (see Chapter 2, Configuring the Data Services Environment, for details).
8. Choose the Data Services link on the home screen to open the Data Services repository configuration area.

9. Right-click on the Repositories folder or in the empty area of the main window and choose the Configure repository option from the context menu:
10. Name the newly configured repository as DS4_CENTRAL and input connectivity settings. After that, click on Test Connection to see the successful connection message:
11. Close the repository properties window. You should see the new non-secured central repository, DS4_CENTRAL, displayed on the screen along with the local repository DS4_REPO:
12. Right-click on DS4_CENTRAL and choose the User Security option from the context menu.

13. Choose Data Services Administrator Users and click on the Assign Security button.
14. On the Assign Security window, go to the Advanced tab and click on the Add/Remove Rights link.
15. On the Add/Remove Rights window, choose Application | Data Services Repository and select/grant the following options in the right-hand side under the Specific Rights for Data Services Repository section:
16. Click on OK to save the changes and close the User Security window.
17. The final step of configuration is to specify the central repository in your Designer configuration settings. This can be configured on the Designer Option window, or you can open the Central Repository Connections section by going to Tools | Central Repositories… from the top menu.
18. In the Central Repository Connections section, click on the Add button to open the list of repositories available and select DS4_CENTRAL.
19. The Activate button activates the central repository from the list (if you add multiple ones, only one of them can be active at a time). You can also specify the Reactivate automatically flag for the central repository to reactivate automatically when the Designer application restarts:
20. After performing all these steps, you should be able to activate the Central Object Library window (see the top tool panel), which looks almost exactly like Local Object Library:

The preceding steps showed you how to create, configure, and deploy the central repository in Data Services. Next, we will see how you can actually use the central repository to migrate the ETL between different local repositories.

How to do it…

The central repository or Central Object Library is a location shared by different ETL developers to exchange and synchronize the ETL code. In this recipe, we will copy the existing job into Central Object Library and see which operations are available in Data Services on the objects stored there. Follow these steps:

1. Go to Local Object Library | Jobs.
2. Right-click on the Job_DWH_DimGeography job object and go to Add to Central Repository | Object and Dependents from the context menu.
3. Open Central Object Library and see that the job object and all dependent objects, workflows, and dataflows appeared on the Central Object Library tab sections. The ETL code for Job_DWH_DimGeography has been successfully migrated to the central repository.
4. Now, go to Local Object Library | Dataflows, find the DF_Load_DimGeography dataflow object, and double-click on it to open it in the workspace area for editing.
5. Rename the first Query transform from Query to Join and save the dataflow.
6. Now that you have changed the ETL code migrated from local to central repository, you can compare the two versions of your job and see the differences displayed in Differences Viewer. Right-click on the job in Local Object Library and go to Compare | Object and dependents to Central from the context menu:
7. When in Central Object Library, you can do the same thing by clicking on a specific object and choosing the preferable option from the Compare context menu.
8. To get the version of the object from the central repository to a local one, select the DF_Load_DimGeography dataflow object in the Central Object Library, right-click on it, and go to Get Latest Version | Object from the context menu.
9. If you compare the local object version to the one stored in the central repository now, you will see that there is no difference, as the central object version has overwritten the local object version.

How it works…

The purpose of the central repository is to provide a centralized location to store ETL code.

The Central Object Library represents the contents of the central repository in the same way that the Local Object Library represents the contents of the local repository.

The ETL code stored in the central repository cannot be changed directly as in the local repository. So, it provides a level of security to make sure that the central repository changes can be tracked, and the history of all operations performed on its objects can be displayed.

Adding objects to and from the Central Object Library

If the object does not exist in the central repository, you can add it using the Add to Central Repository option from the objects context menu.

If the object already exists in the central repository, there are a few extra steps required to update it with a newer version from the local one. We will take a close look at this functionality in the upcoming chapters.

Getting the object from the central to the local repository is much simpler. All you need to do is use the Get Latest Version option from the objects context menu in Central Object Library. It does not matter whether the object exists in the local repository or not: it will be created or overwritten, which means that it will be deleted and copied from the central repository.

Another important aspect of copying an object into, and from, the central repository is the availability of three modes: Object, Object and dependents, and With filtering:

Object: In this mode, it does not matter which operation you perform, whether it is getting the latest object version from central to local, comparing object versions between central and local, or just placing objects from local to central. The operation is performed on this object only.
Object and dependents: This operation affects all the child objects belonging to the selected object, their child objects, their child objects, and so on until the lowest level down the hierarchy (which is usually a table/file format level).
With filtering: This mode is basically the same as Object and dependents, but with the ability to exclude the specific object from the affected objects. When chosen, a new window opens, allowing you to exclude specific objects from the hierarchy tree. Here is the result of choosing Add to Central Repository | With filtering for the Job_DWH_DimGeography object:

Comparing objects between the Local and Central repositories

Designer has a very useful Compare function available for all objects stored in the local or central repositories. When selected from the context menu of the object stored in a central repository location, there are two Compare methods available: Object to Local and Object with dependents to Local.

When selected from the context menu of the object stored in a local repository location, there are two Compare methods available: Object to Central and Object with dependents to Central.

The result is presented in the Difference Viewer window, which opens in the main workspace area in a separate tab and looks similar to the following screenshot:

This is an example of the Difference Viewer window. Note how we have only renamed the Query transform, yet Difference Viewer shows the whole structure of the Join Query object as deleted, and on the Central tab, it shows the new Query Query transform structure. The Mapping and Links sections of the updated dataflow are also affected, as you can see in the preceding screenshot.

There is more…

I have not described one of the most important concepts of the central repository: the ability to check out and check in objects and view the history of changes in the multiuser development environment. I have left it for more advanced chapters, and it will be explained further in the book.

Migrating ETL code with export/import

Data Services Designer has various options to import/export ETL code.

In this recipe, we will review all possible import/export scenarios and take a closer look at the file formats used for import/export in Data Services: ATL files (the main export file format for the Data Services code) and XML structures.

Getting ready

To complete this recipe, you will need another local repository created in your environment. Refer to the first two chapters of the book to create another repository named DS4_LOCAL_EXT in the new database, DS_LOCAL_REPO. Do not forget to assign the proper security settings for Data Services Administrator users in CMC after registering the new repository.

How to do it…

Data Services has two main import/export options:

Using ATL/XML external files
Direct import into another local repository

Import/Export using ATL files

In the following steps, I will show you an example of how to export ETL code from the Data Services Designer into an ATL file.

1. Export Job_DWH_DimGeography into an ATL file. Right-click on the job object in Local Object Library | Jobs and select Export from the context menu. The Export window opens in the main workspace area. Look at the following screenshot:

2. Using the context menu by right-clicking on the specific object or objects in the Export window, you can exclude selected objects with the Exclude option or selected objects with all their dependencies using the Exclude Tree option. Exclude the DF_Extract_SalesTerritory dataflow and all its dependencies from the export, as shown in the following screenshot, using the Exclude Tree option:
3. Objects excluded from the export are marked with red crosses. See both the Objects to export and Datastores to export areas on the Export tab for the objects excluded by the Exclude Tree command executed in the previous step:
4. To execute the export operation, right-click in any area of the Export workspace tab and choose the Export to ATL file… option from the context menu. On the opened Save As screen, choose the name of the ATL file, export.atl, and its location. Then, click on OK and specify the security passphrase for the ATL file.

5. Export could take anything from a few seconds up to a few minutes, depending on the number of objects you are exporting. When it is finished, you will see the following output in the Output | Information window. If you check the chosen location, you should see that the export.atl file was created:
6. Now, log in to the second local repository with Designer. For this, exit the Designer to restart the application. On the logon screen, choose to connect to another local repository:

7. The new local repository is completely empty. We will use the export.atl file created in the previous step to import the job and its dependent objects into this new repository. Select the Import From File… option from the top Tools menu list. Then, select the export.atl file and click on OK, thus agreeing to import all objects from the file into the currently open local repository.
8. As we exported the job object and its dependents, it does not belong to any project in the new repository. Create a new project called TEST and place the job in it to expand its structure:

See that DF_Extract_SalesTerritory and the tables belonging to it are missing from the job structure, although Data Services keeps a reference for WF_Extract_SalesTerritory. If the dataflow is imported in the future, it would automatically be assigned as a child object to the workflow and would fit into the job structure.

Direct export to another local repository

Let's perform a direct export of the missing DF_Extract_SalesTerritory object and its dependents from the DS4_REPO to the DS4_LOCAL_EXT repository:

1. Log in to DS4_REPO, right-click on the DF_Extract_SalesTerritory dataflow object in the Local Object Library, and select Export from the context menu to open the Export tab in the main workspace area. By default, the selected object and all its dependents are added to the Export tab.
2. Right-click on the Export tab and choose the Export to repository… menu item displayed with bold text. Select DS4_LOCAL_EXT as the target repository:

3. On the Export Confirmation window, which opens next, exclude all objects that already exist in the target repository. These are the datastore objects OLTP and DS_STAGE:
4. The output of the direct export command is displayed in the Output | Information window:

(14.2) 07-13-15 21:06:51 (1000:6636) JOB: Exported 1 DataFlows
(14.2) 07-13-15 21:06:51 (1000:6636) JOB: Exported 2 Tables
(14.2) 07-13-15 21:06:51 (1000:6636) JOB: Completed Export. Exported 3 objects.

5. Now, exit the Designer and reopen it by connecting to the DS4_LOCAL_EXT repository. Expand the full project TEST structure to see that all missing dependent objects were imported into the structure of the Job_DWH_DimGeography job:


How it works…

Manipulating objects on the Export tab is a preparation step that allows you to exclude the objects that you do not want to export to the ATL file or directly to another local repository. After preparing the ETL structure for export by excluding specific objects that you do not want to export, in case you do not want to overwrite versions of the same objects in the target repository or are just not interested in migrating them, you have three options:

Direct export into another local repository (a comparison window opens, allowing you to exclude objects from being exported and showing which objects exist in the target repository)
Export to an ATL file
Export to an XML file (this is exactly the same as the previous option, except that a different flat file format is used to store the ETL code)

An ATL file is a structured file that contains properties, links, and references for the objects exported.

An ATL file can be opened in any text editor. It can be useful to browse its contents if you want to check which specific objects are included in the export file. For function objects, it is easy to see the text of the exported function if you want to check its version and so on.

For example, if you open the export.atl file generated in this recipe with Notepad and search for DF_Load_DimGeography, you will see that it can be found in two places within the file:

The first section defines the properties of the object, and the second defines its place within an execution structure.

Debugging job execution

Here, I will explain the use of the Data Services Interactive Debugger. In this recipe, I will debug the DF_Transform_DimGeography dataflow.

The debugging process is the process of defining the points in the ETL code (dataflow in particular) that you want to monitor closely during job execution. By monitoring it closely, I mean to actually see the rows passing through or even to have control to pause the execution at those points to investigate the current passing record more closely.

Those points in code are called breakpoints, and they are usually placed before and after particular transform objects in order to see the effect made by a particular transformation on the passing row.

Getting ready…

The easiest way to debug a specific dataflow is to copy it in a separate test job. Create a new job called Job_Debug and copy DF_Transform_DimGeography in it from the workflow workspace that it's currently located in, or just drag and drop the dataflow object in the Job_Debug workspace from Local Object Library | Dataflows.

How to do it…

Here are the steps to create a breakpoint and execute the job in the debug mode:

1. First, define the breakpoint inside a dataflow. To do this, double-click on the link connecting the two transform objects, Join and Mapping:
2. Created breakpoints are displayed as red dots on the links between transform objects. You can toggle them on/off using the Show Filters/Breakpoints button from the top instrument panel:
3. Go to the Job_Debug context and choose Debug | Start Debug… from the top menu, or just click on the Start Debug… (Ctrl + F8) button on the top instrument panel:

4. The Debug Properties window opens, allowing you to specify or change the debug properties. Do not change them; the default values are suitable for most debugging cases:
5. In the debugging mode, the job executes in the same manner as in the normal execution mode, except that it is possible to pause it at any moment to browse the data between transforms. In our case, the job paused automatically as soon as the first passing row meets the specified breakpoint condition. To view the dataset passed between the transforms, click on the magnifying glass icon on the link between the transform objects:

6. When paused or running, the top-level instrument panel changes, activating debugging buttons that allow you to stop/continue debugging:

Alternatively, step through the passing rows one by one when viewing the dataset between transforms:

7. Along with the breakpoints, you can define the filter in the same window:

The filter is displayed with a different icon in the dataflow and allows you to filter datasets passing through the dataflow in the debugging mode.

How it works…

Debugging is a two-step process:

1. Define the breakpoints where you want the job execution to pause.
2. Run the job in the debugging mode.

Breakpoints allow you to pause job execution on a specific condition so that you are able to investigate the data flowing through your dataflow process. In the debugging mode, it is possible to see all records passed between transform objects inside a dataflow. You can see how a specific record extracted from the source object is transformed and changed while it is making its way into the target object. It is also easy to detect when the record is filtered by the WHERE clause condition, as it will not appear after the Query transform that filters it out.

You can manage filters and breakpoints with the Filters/Breakpoints… (Alt + F9) button from the instrument panel.

Filters applied to links between transform objects are considered only when the job is executed in the debugging mode. Filters as well as breakpoints are not visible to the Data Services engine when the job is executed in the normal execution mode.

Note

Filters are a great way to decrease the number of records passing through the dataflow when you run a job in the debugging mode. If you are interested in debugging/seeing the transformation behavior for a small, specific amount of records that can be defined with filtering conditions, then it could significantly decrease the debugging execution time.

Monitoring job execution

In this recipe, we will take a closer look at the job execution parameters, tracing options, and job monitoring techniques.

Getting ready

We will use the job we developed in the previous chapters, Job_DWH_DimGeography, to see how the job execution can be traced and monitored.

Let's make minor changes to prepare the job for the recipe examples using these steps:

1. On the job-level context, create a global variable, $g_RunDate, of the date data type and assign the sysdate() function to it as a value.
2. At the same job level, before the sequence of workflows, place a new script object with the following code and link it to the first workflow. This script will be the first object executed within a job:

print('*************************************************');
print('INFO: Job ' || job_name() || ' started on ' || $g_RunDate);
print('*************************************************');
How to do it…

Click on the Execute… button to execute the job. Before the job runs, the Execution Properties window opens, allowing you to set up execution options, configure the tracing of the job, or change the predefined values of the global variables for that particular job run to different ones.

Let's take a closer look at the tabs available on this window:

Click on the Execution Options tab.

Here are the options available on this tab:

Print all trace messages: This option displays all the possible trace messages from all components participating in the job execution: object parameters and options, internal system queries and internally executed commands, loader parameters, the data itself, and many other different kinds of information. The log generated is so enormous that we do not recommend using this option if you have more than a few workflow/dataflow objects inside your job or if the volume of data passing through your dataflows is big enough that you would not want to see every row of it passing through the transformations.

This option literally shows what is happening in every Data Services internal component participating in the data processing, and all this information is displayed for every row passing those components.

Monitor sample rate: This option defines how often your logs get updated when the job runs. The default is 5 seconds.

Collect statistics for optimization: This option collects optimization statistics, allowing Data Services to choose optimal cache types for various components when executing dataflows. We will talk about it in more detail in the upcoming chapters.
Collect statistics for monitoring: If set, Data Services will display cache sizes in the trace log when the job runs.
Use collected statistics: This makes Data Services use the statistics collected when the job was executed previously with the Collect statistics for optimization option set up.

Click on the second Trace tab.

This tab has a list of various trace options. Setting up each of these options adds extra information to the contents of the trace log file when the job runs:

By default, only Trace Session, Trace Work Flow, and Trace Data Flow are enabled. Switch their values to No and enable only Trace Row by changing its value to Yes. After you execute the job, you will see the following trace log:

You can see that the information about the statuses of the workflow and dataflow execution that you normally see is no longer there. The trace log file now displays only the output of the print() functions from user script objects and rows passing through the dataflows. Be extra careful: this is a lot of data. Avoid using this option unless you are specifically in a design test environment with just a few rows read from the source table.

Click on the third Global Variable tab.

This tab displays the list of all global variables created within the job, allowing you to modify their values for this specific job execution without changing these values in the job context level:

To change the value, just double-click on the Value field of the specific global variable row and input the new value. Remember that this change applies only to this current job execution. When you run the job next time and open this tab, the global variables will have their default values defined again.

Log in to the Data Services Management Console to monitor job execution and go to Administrator | Batch | DS4_REPO.

The Management Console not only allows Web access to the same three log files (trace, monitor, and error) but also to another one, Performance Monitor:

The top-level section allows easy access to the previous versions of the log files for a specific job. It does not matter whether the job has been placed in the Project folder or not.

In the preceding screenshot, we displayed all log files for the last 5 days for the Job_DWH_DimGeography job.

Click on the Performance Monitor link of the last job execution to open the Performance Monitor page:

The first page of Performance Monitor displays the list of dataflows from the job structure. When clicking on the specific dataflow, it is possible to drill in on the dataflow components level to see how many records passed through the specific dataflow components and the execution time of each of them.

In fact, the information displayed in Performance Monitor is based on the same data as the information displayed in the Monitor log. It is just presented differently, making it sometimes more convenient for analysis.

How it works…

It is simply a matter of personal choice when deciding what to use to monitor job execution: the web application of the Data Services Management Console or the Designer client. Sometimes, due to restricted access to the environment, the Web option is preferable. It is also easier to use if you need to find any old log files of a specific job for analysis or performance comparison, or simply need to copy and paste a few rows from the trace log file.

Building an external ETL audit and audit reporting

In this recipe, we will implement the external user-built ETL audit mechanism. Our ETL audit will include information about the start and stop times of the workflows running within the job, their statuses, names, and information about which job they belong to.

Getting ready…

We need to create an ETL audit table in our database where we will store the audit results.

Connect to the STAGE database using the SQL Server Management Studio and execute the following statement to create the ETL audit table:

create table dbo.etl_audit (
    job_run_id integer,
    workflow_status varchar(50),
    job_name varchar(255),
    start_dt datetime,
    end_dt datetime,
    process_name varchar(255)
);

How to do it…

First, we need to choose objects for auditing. The following steps should be implemented for every workflow or dataflow that you want to collect auditing information about. In this particular example, we will enable ETL auditing for the job object itself.

1. Create extra variables for the job object:

$v_process_name varchar(255)
$v_job_run_id integer

2. Add the following code in the script that starts the job execution:

$v_process_name = job_name();
$v_job_run_id = job_run_id();

# Insert audit record
sql('DS_STAGE',
    'insert into dbo.etl_audit (job_run_id, workflow_status, job_name, start_dt, end_dt, process_name)' ||
    ' values (' || $v_job_run_id || ', ' || '\'STARTED\'' || ', \'' || job_name() || '\', SYSDATETIME(), NULL, \'' || $v_process_name || '\')'
);

3. Create a new script, ETL_audit_update, at the end of the execution sequence inside the job context and put the following code in it:

# Update ETL audit record
sql('DS_STAGE',
    'update dbo.etl_audit' ||
    ' set workflow_status = ' || '\'COMPLETED\'' || ', end_dt = SYSDATETIME()' ||
    ' where job_run_id = ' || $v_job_run_id || ' and process_name = \'' || $v_process_name || '\''
);

4. The job content has now been wrapped in the auditing insert/update commands placed in the initial and final scripts:
5. Implement the preceding steps for WF_Extract_SalesTerritory, which can be found in the WF_extract workflow container, to enable the ETL audit for that object as well. The only change is that in the initial script, the $v_process_name variable value should be changed to the workflow_name() function instead of the job_name() function, as it was done for the job:

How it works…

Now, if you execute the job and query the contents of the ETL audit table in a few seconds, you should see something like this:

A few seconds later, after the job successfully completes, your ETL audit table will look like this:

A simple analysis of this table can answer the following questions:

Which objects are running within the currently running job? This is very useful information, especially if your job contains hundreds of workflows, with 20 of them running in parallel. In this case, it is hard to obtain this information from the trace log.
What was the status of the object when it was executed last time? To be precise, you also have to implement another piece of logic, the third update that changes the status of the workflow to "ERROR" if something unexpected happens and the workflow cannot be considered as successfully completed. This third update usually goes into the catch section of the try-catch block (see the sketch after this list).
What was the execution time for the specific object? The answer speaks for itself.
What was the execution order of the objects? You can compare the execution times. If you know when the objects started and ended, you can easily derive the execution order. When comparable workflows are not directly linked and run within different branches of logic, it is sometimes useful to know which one started or finished earlier.
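
A minimal sketch of that third update is shown here. It reuses the variables and the sql() call pattern from the scripts above; placing it in a script inside the catch block of a try-catch pair is an assumption of this sketch, as the try-catch objects themselves are not part of this recipe:

# Mark the audit record as failed; this would sit in the script inside the catch block
sql('DS_STAGE',
    'update dbo.etl_audit' ||
    ' set workflow_status = ' || '\'ERROR\'' || ', end_dt = SYSDATETIME()' ||
    ' where job_run_id = ' || $v_job_run_id || ' and process_name = \'' || $v_process_name || '\''
);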

The advantage of the external user-built ETL audit is that you can build a flexible solution that gathers any information that you want it to gather.

Note

Note that with insert/update ETL audit statements, you can define the logical borders of a successful object completion. Theoretically, a workflow object and the job itself can still fail right after it successfully executes the sql() command and updates its status in the ETL audit table as successful. However, this is often a good thing, as it is exactly what you are interested in when you make the decision of whether you should rerun a specific workflow or not: has the workflow completed the work it was supposed to?

InformationinETLaudittablescanbeutilizednotonlyinthereportsshowingthe

Page 346: SAP Data Services 4.x Cookbook

executionstatisticsofyourjobsbutalsotoimplementexecutionlogicinsidethejob.

For example, if you want to run a specific workflow only once a week but it is being executed within a daily job, you could add script objects in your workflow. You could check from the ETL audit tables when the workflow was run the last time and skip it if it was executed and successfully completed less than a week ago.
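A minimal sketch of that check follows; the $v_ran_recently variable, its varchar declaration, and the seven-day window are illustrative assumptions layered on top of the etl_audit table defined earlier:

   # Illustrative only: count successful completions of this workflow in the
   # last 7 days ($v_ran_recently is assumed to be declared as varchar(10)).
   $v_ran_recently = sql('DS_STAGE',
       'select count(*) from dbo.etl_audit ' ||
       'where process_name = \'' || workflow_name() || '\' ' ||
       'and workflow_status = \'COMPLETED\' ' ||
       'and end_dt > dateadd(day, -7, sysdatetime())');

   # A conditional object with the expression ($v_ran_recently = '0') can then
   # wrap the weekly workflow so that it only runs when no recent completion exists.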

Finally, it is even possible to audit not only a Data Services object (dataflow, workflow, job, or script object) but also any piece of code: part of a script or a single branch of the logic. You can wrap anything in the insert/update statements sent to an external table to store audit information.

That is the true power of custom ETL auditing. You can collect all the information you want and easily query this information from ETL itself to make various decisions.

Using built-in Data Services ETL audit and reporting functionality
Data Services provides ETL reporting functionality through the Management Console web application. It is available in the form of the Operational Dashboard application on the main Management Console Home page.

Getting ready
You do not have to configure or prepare the operational dashboards feature. It is available by default, and all you have to do to access it is start the Data Services Management Console.

How to do it…
Let's review which ETL reporting capabilities are available in Data Services. Perform these steps:

1. Start the Data Services Management Console.
2. Choose the Operational Dashboard application from the home page:
3. The main interface of Operational Dashboard includes three sections. It includes the pie chart of the general job status statistics per interval for a selected repository. Green shows the number of successfully completed jobs for a specific period of time, yellow shows jobs successfully completed with warning messages, and red shows failed jobs:
4. The section below shows more detailed job execution statistics in the form of a vertical bar chart for specific days or an interval of days. Try to hover your mouse cursor over the bars to see the actual numbers behind the graph. The vertical line shows the number of jobs executed on specific days with different statuses: successful with no errors (green), successful with warning messages (yellow), and failed (red):
5. At the right-hand side, you can see the list of jobs whose execution statistics are represented by graphs on the left-hand side. By clicking on a specific row, you can drill down to see the list of executions for this specific job. The most useful information here is the execution time displayed in seconds, the run ID of the job, and the status of the job, as you can see in the following screenshot:

How it works…
Operational Dashboard reporting can be used to provide job execution history data, analyze the percentage of failed jobs for a specific time interval, and compare those numbers between different days or time intervals.

That is pretty much it. To do more, you would have to build your own ETL metadata collection and build your own reporting functionality on top of this data.

Auto Documentation in Data Services
This recipe will guide you through the Auto Documentation feature available in Data Services. Like Operational Dashboard, this feature is also part of the functionality available in the Data Services Management Console.

How to do it…
These steps will create a PDF document containing a graphical representation, descriptions, and relationships between all underlying objects of the Job_DWH_DimGeography job object:

1. Log in to the Data Services Management Console web application.
2. On the home page, click on the Auto Documentation icon:
3. In the following screen, expand the project tree and left-click on the job object. You can see which object is displayed as current by checking the object name in the top tab name on the right-hand side of the window:
4. Then, click on the small printer icon located at the top of the window:
5. In the pop-up window, just click on the Print button, leaving all options with default values.
6. Data Services, by default, generates a PDF document in the browser's default Downloads folder:

How it works…
As you have probably noticed, the Auto Documentation feature is only available for the jobs included in projects, as it displays the object tree starting from the root Project level. Jobs that were created in the Local Object Library and were not assigned to a specific project will not be visible for auto-documenting.

Auto Documentation export is available in two formats: PDF and Microsoft Word (see the following screenshot):

On the same screen, you can choose the types of information to be included in the documentation file.

Note
Note that dataflow documentation includes the mapping of each and every column from source to target through all dataflow transformations. This is a very detailed level, and even our dataflow inside Job_DWH_DimGeography is not at all complex. The datasets we are migrating are relatively small, but we still get a 34-page document. So, you can see that the documentation level is extremely detailed.

Another extremely useful feature of Data Services Auto Documentation is the Table Usage tab:

It allows us to see which source and target table objects are used within the Job_DWH_DimGeography object tree.

Information like this about relationships between objects within ETL is extremely useful because, during development, some objects often change, and you need to evaluate how that impacts the ETL code. If a table column is changed (renamed or its data type changed) on the database level, you have to apply the same changes to your ETL code. Otherwise, it will fail the next time it runs, as Data Services is not aware of the table changes and still operates with the old version of the table.

Table object dependencies can also be visualized with another Data Services feature: Impact and Lineage Analysis. This functionality will be discussed in Chapter 12, Introduction to Information Steward.

Chapter 7. Validating and Cleansing Data
Here are the recipes presented in this chapter:

Creating validation functions
Using validation functions with the Validation transform
Reporting data validation results
Using regular expression support to validate data
Enabling dataflow audit
Data Quality transforms – cleansing your data

Introduction
This chapter introduces the validation methods that can be applied to the data passing through ETL processes in order to cleanse and conform it according to the defined Data Quality standards. It includes validation methods that consist of defining validation expressions with the help of validation functions and then splitting data into two datasets: valid and invalid data. Invalid data that does not pass the validation function conditions usually gets inserted into a separate target table for further investigation.

Another topic discussed in this chapter is dataflow audit. This feature of Data Services allows the collection of execution statistics about the data processed by the dataflow and even controls the execution behavior depending on the numbers collected.

Finally, we will discuss the Data Quality transforms: the powerful set of instruments available in Data Services to parse, categorize, and make cleansing suggestions in order to increase the reliability and quality of the transformed data.

Creating validation functions
One of the ways to implement the data validation process in Data Services is to use validation functions along with the Validation transform in your dataflow to split the flow of data into two: records that pass the defined validation rule and those that do not. Those validation rules can be combined into validation function objects for your convenience and traceability.

In this recipe, we will create a standard but quite simple validation function. We will deploy it in our dataflow, which extracts the address data from the source system into a staging area. The validation function will check whether the city in the migrated record has Paris as a value, and if it does, it will send the records to a separate reject table.

Getting ready
First, we need to create another schema in our STAGE database to contain reject tables. Creating the Reject schema to store these tables allows us to keep the original table names; that makes writing queries and reporting against those tables, as well as locating them, much easier.

1. Open SQL Server Management Studio.
2. Go to STAGE | Security | Schemas in the Object Explorer window.
3. Right-click on the list and choose New Schema… in the context menu.
4. Choose Reject for the schema name and dbo as the schema owner.
5. Click on OK to create the schema.

How to do it…
Follow these steps to create a validation function:

1. Log in to Data Services Designer and connect to the local repository.
2. Go to Local Object Library | Custom Functions.
3. Right-click on Validation Functions and select New from the context menu.
4. Input the function name fn_Check_Paris, check Validation function, as shown in the following screenshot, and populate the description field.
5. Click on Next and input the following code in the main section of Smart Editor:

   # Validation function to check if the passed value equals
   # to 'Paris'
   # Wrap the function in the try-catch block. We do not want
   # to fail the dataflow process
   # if the function itself fails.
   try
   begin
       # Assign input parameter values to local variables
       $l_City = $p_City;
       $l_AddressID = $p_AddressID;

       # Default "Success" result status
       $l_Result = 1;

       if ($l_City = 'Paris')
       begin
           # Change to "Failure" result status
           $l_Result = 0;
       end

       # Returning result status
       Return $l_Result;
   end
   catch (all)
   begin
       # Writing information about the failure in the
       # trace log
       print('Validation function fn_Check_Paris() failed with error: ' ||
             error_message() || ' while processing AddressID = {$l_AddressID} with City = {$l_City}');

       # Returning the result status
       Return $l_Result;
   end

6. In the same Smart Editor window, create the local variables $l_AddressID int, $l_City varchar(100), and $l_Result int, and the function's input parameters, $p_City varchar(100) and $p_AddressID int.
7. Click on the Validate button to validate the function and click OK to close Smart Editor and save all changes.

How it works…
The function's body is wrapped in a try-catch block to prevent our main dataflow processes from failing if something goes wrong with the validation function. The validation function is executed for each row passing through, so it would be unwise to let a failure of the function determine the execution behavior of the main process.

Try to imagine a situation where your dataflow processes 2 million records from the source table and 50 of them make the function fail for some reason or other. To process all 2 million records in one go, you would need to wrap the logic of the entire function in try-catch and output extra information into the trace log or into an external table in the catch section to perform further analysis of the data after processing is done.

In our example, we only pass the AddressID field for traceability purposes, so it would be easy to find the exact row on which the function failed.

The validation function should return either 1 or 0. The value 1 means that the processed row against which the validation function was executed successfully passed the validation; 0 means failure.

See in the following screenshot that, in the Local Object Library, validation functions are displayed separately from custom functions:

Using validation functions with the Validation transform
This recipe will demonstrate how validation functions are deployed and configured within a dataflow. As the validation function that we created in the previous recipe validates city values, we will deploy it in the DF_Extract_Address dataflow object to perform the validation of data extracted from the Address table located in the source OLTP database.

Getting ready
Open the job containing the dataflow DF_Extract_Address, already created in the Use case example – populating dimension tables recipe in Chapter 5, Workflow – Controlling Execution Order, and copy it into a new job to be able to execute it as a standalone process.

How to do it…
1. Open DF_Extract_Address in the main workspace for editing.
2. Go to Local Object Library | Transforms, find the Validation transform under Platform, and drag it into the DF_Extract_Address dataflow right after the Query transform.
3. Link the output of the Query transform to the Validation transform and double-click on the Validation transform to open it for editing.
4. Open the Validation transform in the workspace and see how Validation splits the flow into three output schemas: Validation_Pass, Validation_Fail, and Validation_RuleViolation:

   The Validation_Pass and Validation_Fail output schemas are identical, except that Validation_Fail contains three extra columns: DI_ERRORACTION, DI_ERRORCOLUMNS, and DI_ROWID.

5. Inside the Validation transform, click on the Add button located on the Validation Rules tab to create the first validation rule. Choose the Validation Function option for the created rule and map the columns sent from the previous transform output to the input parameters, also choosing Send To Fail as the value for Action on Fail. Do not forget to specify the validation rule name and description.
6. Click on OK to create the validation rule. It is now displayed in the Validation transform.
7. Now close the Validation transform editor window and add three Query transforms, one for each validation schema output. Name them Validation_Pass, Validation_Fail, and Validation_Rules. Link the Validation transform output to all Query transforms, choosing the correct logic branch each time Data Services asks you to.
8. Map all input schema columns to the output schemas in all created Query transforms without making any changes to the mappings.
9. Create two additional template target tables to output data from the Rules and Fail transforms. Specify the REJECT owner schema for both of them as follows:

   - The ADDRESS template table for the Fail output
   - The ADDRESS_RULES template table for the Rules output

10. Your final dataflow version should look like the one in the following screenshot:
11. Save and execute the job.
12. After the execution is finished, open the dataflow again and view the data in the REJECT.ADDRESS and REJECT.ADDRESS_RULES tables:

Note
Note that the rows where the value of CITY equals Paris are not passed to the Transform.ADDRESS stage table anymore.

How it works…
Usually, the Validation transform is deployed right before the target object to perform the validation of data changed by previous transformations.

The Pass output schema of the Validation transform is used to output records that have successfully passed the validation rule defined by either validation function(s) or column condition(s).

Note that you can define as many validation functions or column condition rules as you like, and Data Services is very flexible in allowing you to define different Action on Fail options for different functions. This makes it possible to send some "failed" records to both the Pass and Fail outputs, or others only to the Fail output, depending on the severity of the validation rule.

Let's review another feature of the Validation transform: the ability to modify the values of the passing rows depending on the result of the validation rule. Follow these steps:

1. Open the Validation transform for editing in the main workspace.
2. As we are validating the city name, let's change the behavior of the Validation transform to send the rows which did not pass validation to both Pass and Fail. However, in the rows sent to the Pass output, change the city name value from Paris to New Paris. To do that, in the section located at the bottom of the Validation transform editor, choose the Query.CITY column and specify 'New Paris' in the expression field, as shown here:
3. Save and execute the job.
4. Open the dataflow again and view the data from both the Transform.ADDRESS and Reject.ADDRESS tables. You will see that records with the same ADDRESSID field were inserted in both tables, but in the main staging table, the values for the city name were substituted with New Paris.

See the following table for a description of the extra columns from the Fail and RuleViolation Validation transform output schemas:

DI_ERRORACTION    This shows where the output for the specific rule was sent: B means "both", F means "fail", and P means "pass".
DI_ERRORCOLUMNS   This shows the specific columns that were validated (as part of the input values for the validation function or simply as a source for column validation).
DI_ROWID          This is the unique identifier of the failed row.
DI_RULENAME       This is the name of the rule which generated the failed row.
DI_COLUMNNAME     This is the validated column (part of the validation function input values or the source for column validation in the validation rule). Note that in the ADDRESS_RULE output, one row is generated for each validated column separately. So, if your validation function was using five columns from the source object, all five of them are considered to be validated columns, and in case of failure, five rows will be created in the ADDRESS_RULE table, one for each column, with the same ROWID (see the figure showing the contents of the ADDRESS_RULE table in the first example of job execution in this recipe).

Reporting data validation results
One of the advantages of using the Validation transform is that Data Services provides reporting functionality based on validation statistics and sample data collected during validation processes.

Validation reports can be viewed in the Data Services Management Console. In this recipe, we will learn how to collect data for validation reports and access them in the Data Services Management Console.

Getting ready
Use the same job and dataflow, DF_Extract_Address, updated with the Validation transform as in the previous recipes of the current chapter.

How to do it…
1. Open the dataflow DF_Extract_Address and double-click on the Validation transform object to open it for editing.

   Note
   To be able to use Data Services validation reports, validation statistics collection has to be enabled first for a Validation transform object in the ETL code structure that you want to collect the reporting data for.

2. Open the Validation Transform Options tab in the Validation transform editor.
3. Tick both checkboxes, Collect data validation statistics and Collect sample data.
4. Save and run the job to collect the data validation statistics for the dataset processed by DF_Extract_Address. Make sure that you do not have the option Disable data validation statistics collection selected on the job's Execution Properties window:
5. Launch the Data Services Management Console and log in to it.
6. On the Home page, click on the Data Validation link to start the Data Validation dashboard web application:
7. Experiment and hover your mouse over the pie chart to see the detailed information about passed and failed records for your validation rule.
8. Click on a specific area in the pie chart to drill down into another bar chart report showing validation rules. As we only have one validation rule defined in our Validation transform and in the whole repository, there is only one bar displayed for the City_not_Paris validation rule.

How it works…
The options Collect data validation statistics and Collect sample data enable Data Services to collect execution statistics for specific Validation transform rules. In our case, we defined one, so there is not much diversity in the dashboard reports that you can see in the Data Services Management Console.

Here is the pie chart you see after implementing steps 7-8 of this recipe:

By clicking on the object in the bar chart, you can drill down to the actual data sample of the failed rows collected by the Validation transform during job execution.

The information presented in these dashboard reports is a very useful graphical representation of the quality of the data which passes through dataflow objects and gets validated. You can easily see what percentage of data does not pass the validation rules, compare validation statistics between different periods of time, and even inspect the actual rows that did not pass a specific validation rule, all without running SQL queries on your database tables or using any application other than the Data Services Management Console.

Using regular expression support to validate data
In this recipe, we will see how you can use regular expressions to validate your data. We will take a simple example of validating phone numbers extracted from the source OLTP table PERSONPHONE located in the PERSON schema. The validation rule will be to identify all records which have phone numbers different from this pattern: ddd-ddd-dddd (d being a numeral). Let's say that we do not want to reject any data. Our goal is to generate a dashboard report showing the percentage of records in the source table which do not comply with the specified requirement for the phone number pattern.

Getting ready
Make sure that you have the PERSON.PERSONPHONE table imported into the OLTP datastore. We will create a new job and a new dataflow, DF_Extract_PersonPhone, which will migrate PersonPhone records from OLTP to the STAGE database, validating them at the same time.

How to do it…
1. Create a new job with a new dataflow, DF_Extract_PersonPhone, designed as a standard extract dataflow with a deployed Validation transform, as shown in the following figure:
2. You should also create target tables for the RuleViolation and Fail output schemas in the Reject schema of the STAGE database.
3. To configure the validation rule, open the Validation transform for editing in the main workspace. Use Column Validation instead of Validation Function and put the following custom condition into Query.PHONENUMBER:

   match_regex(Query.PHONENUMBER, '^\d{3}-\d{3}-\d{4}$', NULL) = 1

   The validation rule configuration should look like in the following screenshot:

   Note
   Note that for Action on Fail, we set up Send To Both, as we do not want our validation process affecting the migrated dataset.

4. Click on OK to create and save the validation rule.
5. Now go to the second tab, Validation Transform Options, and check all three options: Collect data validation statistics, Collect sample data, and Create column DI_ROWID on Validation_Fail.
6. Your Validation transform should look like this now:
7. Save and execute the job to extract the records into the staging table and collect the validation data for the dashboard report.

How it works…
Regular expressions are a powerful way to validate the data passing through. The match_regex() function used in this recipe returns 1 if the value in the input column matches the pattern specified as the second input parameter.

Data Services supports standard POSIX regular expressions. See the match_regex section (section 6.3.96) in Chapter 6, Functions and Procedures, of the Data Services 4.2 Reference Guide for full syntax and regular expression support details.
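As a quick, hedged illustration of how the pattern behaves (the phone strings below are made-up test values, not taken from the recipe), the same call can be tried in a script object:

   # Hypothetical checks in a script; the phone strings are made-up examples
   if (match_regex('697-555-0142', '^\d{3}-\d{3}-\d{4}$', NULL) = 1)
   begin
       print('697-555-0142 matches the ddd-ddd-dddd pattern');
   end

   if (match_regex('1 (11) 500 555-0110', '^\d{3}-\d{3}-\d{4}$', NULL) = 0)
   begin
       print('1 (11) 500 555-0110 does not match and would be reported as failed');
   end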

Note that in this recipe, we did not reject the records which failed the validation rule. As our goal was simply to evaluate the number of records which do not comply with the phone number standard, both failed and passed records were forwarded to the main target staging table.

Let's see how the dashboard validation report for our job execution looks:

1. Launch the Data Services Management Console and log in to it.
2. Open the Data Validation application on the main Home page.
3. By default, Data Services shows the data validation statistics for all functional areas for the current date (starting from midnight).
4. Hover your mouse pointer over and click on the failed (red) section of the pie chart to see the following details: the percentage and number of rows which did not pass the validation rule.
5. If you did not run any other jobs gathering validation statistics today, the pie chart for DF_Extract_PersonPhone created and executed in this recipe shows that 9,188 records (46%) in the PERSONPHONE table have a phone number in a pattern different from ddd-ddd-dddd, and 10,784 records (54%) have phone numbers matching this pattern.

Enabling dataflow audit
Auditing in Data Services allows the collection of additional information about the data migrated from the source to the target by a specific dataflow on which the audit is enabled, and even allows making decisions according to the rules applied to the audit data. In this recipe, we will see how audit can be enabled and utilized during the extraction of data from the source system.

Getting ready
For this recipe, you can use the dataflow DF_Extract_Address from the previous recipes of this chapter.

How to do it…
Perform the following steps to enable auditing for a specific dataflow:

1. Open DF_Extract_Address in the workspace window and select Tools | Audit from the top-level menu.
2. In the newly opened window, select the Label tab, right-click in the empty space, and choose Show All Objects from the context menu.
3. The Label tab displays the list of objects from within a dataflow. Enable auditing on the Query and Pass Query transform objects by right-clicking on them and selecting the Count option from the context menu.
4. Another way to enable auditing on specific objects from within a dataflow is to right-click on the object and select the Properties option from the context menu.
5. Then, go to the Audit tab in the newly opened Schema Properties window and select the respective audit function from the combobox menu. In our case, both audit points were enabled for Query transforms, and the only audit option available in this case is Count.
6. Data Services creates two variables which are used to store the audit value. For the Pass Query transform, two variables were created by default: $Count_Pass, to store the number of successfully passed records, and $CountError_Pass, to store the number of incorrect or rejected records.
7. Let's change the default audit variable names for the Query object by opening its properties and selecting the Audit tab on the Schema Properties window.
8. Specify the audit variable names to be $Count_Extract and $CountError_Extract. Then, close the window by clicking on the OK button.
9. Now, close the Audit: DF_Extract_Address window by clicking on the Close button.
10. If you take a look at the dataflow objects in the workspace window, you can see that the created audit points were marked with small green icons. To access the dataflow audit configuration, you can also just click on the Audit button in the tools menu.

How it works…
At this point, you have configured the audit collection for rows passing the two Query objects in the DF_Extract_Address dataflow. Auditing, if enabled at the object level, allows only a single audit function to be used: the Count audit function. This audit function simply keeps track of the number of records passing the specific object inside the dataflow.

Auditing can also be enabled at the column level inside an object which resides inside the dataflow, usually on the columns in the Query transforms. In that case, three additional audit functions are available, Sum, Average, and Checksum, if the column is of a numeric data type, and only Checksum is available if the column is of the varchar data type. As you might have guessed, these functions allow you to store either the sum or the average of values in the specific columns for all passing records, or calculate the checksum.

As you can see, the collected audit data can later be accessed from the Operational Dashboard tab in the Data Services Management Console. However, the most useful purpose of the audit feature is the ability to define rules on the collected audit data and perform actions depending on the result of the implemented audit rule.

Here are the steps showing you how to implement a rule on the collected audit data:

1. Open DF_Extract_Address in the workspace and click on the Audit button to open the Audit configuration window for this dataflow.
2. Go to the Rule tab.
3. Click on the Add button to add a new audit rule.
4. Choose the Custom option to define a custom audit rule.
5. Input the custom function shown in the following screenshot:
6. Check the option Raise exception in the Action on failure section. The other options are Email to list and Script.

   The Email to list option allows you to send notifications about rule violations to specific email recipients. Note that to use this functionality, you have to specify SMTP server details in your Data Services configuration.

   The Script option allows you to execute scripts written in the standard Data Services scripting language.

7. The rule that we specified is applied at the very end of the dataflow execution and checks that the percentage of rows which passed the validation rule, taken from the total amount of rows extracted from the source table, is higher than 80 percent (a minimal sketch of such a rule expression is shown after these steps). Remember that our validation rule checks and rejects all Paris records. We know that the number of records with a city value equal to Paris is significantly less than the 20 percent of rows which would have to be rejected during validation to fail the defined audit rule. So, if you run your dataflow now, nothing will happen; the audit rule will not be violated and the job will be successfully completed. To make the audit rule fail, let's change our validation function to reject all records with a city value not equal to Paris, as shown in the following screenshot:
8. As the final step for utilizing audit functionality, on the job's Execution Properties window, you should check the Enable auditing option. If this is not checked, audit data will not be collected and audit rules will not work.
9. Save and execute the job. Dataflow execution fails and relevant information is displayed in the error log, as shown here:

Note
Remember that although the dataflow DF_Extract_Address fails, the audit rule check happens after it completes all the previous steps, and the data is successfully inserted into all targets.
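The screenshot with the custom rule from step 5 is not reproduced here, but based on the description in step 7 it compares the audited row counts. A minimal sketch of what such a custom rule expression could look like, assuming the $Count_Extract and $Count_Pass audit variables defined earlier in this recipe, is:

   # Audit rule sketch: fail (and raise an exception) unless more than 80 percent
   # of the rows counted at the Query audit point also reached the Pass audit point
   (($Count_Pass * 100.0) / $Count_Extract) > 80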

Page 391: SAP Data Services 4.x Cookbook

There’smore…CollectedauditnumberscanbeaccessedviatheOperationalDashboardtabfromtheDataServicesManagementConsole.

Toaccessit,opentheOperationalDashboardtabandselectspecificjobstoopenJobExecutionDetails.Byclickingonthejobexecutioninstancesfurther,youcanopenaJobDetailsview,whichwillcontaininformationaboutalldataflowsexecutedwithinajob.Ifthedataflowhasauditenabledforitcolumns,ContainsAuditDatawillshowyouthat.

ByclickingontheViewAuditDatabutton,youcanopenthenewwindowshowingvaluescollectedduringauditingandtheauditruleresultfortheselectedjobinstanceexecution.

Page 392: SAP Data Services 4.x Cookbook

Data Quality transforms – cleansing your data
Data Quality transforms are available in the Data Quality section of the Local Object Library Transforms tab. These transforms help you build a cleansing solution for your migrated data.

The subject of implementing Data Quality solutions in ETL processes is so vast that it probably requires a whole chapter, or even a whole book, dedicated to it. That is why we will just scratch the surface in this recipe by showing you how to use the most popular of the Data Quality transforms, Data_Cleanse, to perform the simplest data cleansing task.

Getting ready
To build a data cleansing process, it would be ideal if we had source data which required cleansing. Unfortunately, our OLTP data source, and especially the DWH data source, already contain pretty conformed and clean data. Therefore, we are going to create dirty data by concatenating multiple fields together to see how the Data Services cleansing packages will automatically parse and cleanse the data out of the concatenated text field.

As a preparation step, make sure that you have imported these three tables into your OLTP datastore: PERSON, PERSONPHONE, and EMAILADDRESS (all of them are from the PERSON schema of the SQL Server AdventureWorks_OLTP database).

How to do it…
1. As the first step, create a new job with a new dataflow object in it. Name the dataflow DF_Cleanse_Person_Details.
2. Import the three tables PERSON, PERSONPHONE, and EMAILADDRESS from the OLTP datastore as source tables inside the dataflow.
3. Join these tables using the Query transform with the join conditions, as shown in the following screenshot:
4. In the output schema of the Query transform, create two columns: a ROWID column of the data type integer, with the gen_row_num() function as a mapping, and a DATA column of the data type varchar(255), with the following mapping:

   PERSON.FIRSTNAME || ' ' || PERSON.MIDDLENAME || ' ' || PERSON.LASTNAME || ' ' || PERSONPHONE.PHONENUMBER || ' ' || EMAILADDRESS.EMAILADDRESS

5. Now, when we have prepared the source field that we will be cleansing, let's import and configure the Data_Cleanse transforms themselves. Drag and drop the Data_Cleanse transform objects from Local Object Library | Transforms | Data Quality to your dataflow. Please refer to the following steps, as each Data_Cleanse transform object will be imported and configured differently.
6. The first Data_Cleanse object will be parsing our DATA column to extract the email address of the person. When importing the transform object into the dataflow, choose the Base_DataCleanse configuration.
7. Rename the imported Data_Cleanse transform to Email_DataCleanse and join the Query transform output to it.
8. Open the Email_DataCleanse transform editor in the workspace to configure it.
9. On the Input tab, select EMAIL1 in the Transform Input Field Name column and map it to the DATA source field.
10. On the Options tab, choose PERSON_FIRM as the cleansing package name and configure the rest of the options, as shown in the following screenshot:
11. On the Output tab, select the EMAIL field (of the PARSED field class related to the EMAIL1 parent component) to be produced by the Email_DataCleanse transform. That will create the EMAIL1_EMAIL_PARSED column in the output schema of the Email_DataCleanse transform. Propagate the source ROWID column as well, which will be used to join the cleansed datasets together in the later steps.

12. Close the Email_DataCleanse editor and import the second Data_Cleanse transform with the same Base_DataCleanse configuration. Rename the imported transform object to Phone_DataCleanse, join it to the Query transform output, and open it in the main workspace for editing.
13. Select the same transform options on the Options tab as for the Email_DataCleanse transform example we just saw.
14. Choose PHONE1 as the input parsing component (Transform Input Field Name) and map it to the source DATA column from the Query transform output.
15. On the Output tab of the Phone_DataCleanse transform editor, choose the following output fields from the list:

   PARENT_COMPONENT        FIELD_NAME                        FIELD_CLASS
   NORTH_AMERICAN_PHONE1   NORTH_AMERICAN_PHONE              PARSED
   NORTH_AMERICAN_PHONE1   NORTH_AMERICAN_PHONE_EXTENSION    PARSED
   NORTH_AMERICAN_PHONE1   NORTH_AMERICAN_PHONE_LINE         PARSED
   NORTH_AMERICAN_PHONE1   NORTH_AMERICAN_PHONE_PREFIX       PARSED
   PHONE1                  PHONE                             PARSED

16. Also propagate two source fields, ROWID and DATA, into the output schema of the Phone_DataCleanse transform. Close it to finish editing.
17. When importing the third Data_Cleanse transform, select the predefined EnglishNorthAmerica_DataCleanse configuration and rename the transform to Name_DataCleanse.
18. Open the transform in the workspace for editing. You do not have to configure anything on the Options tab this time. So, select the component NAME_LINE1 on the Input tab and the following fields on the Output tab:

   PARENT_COMPONENT   FIELD_NAME     FIELD_CLASS
   PERSON1            FAMILY_NAME1   PARSED
   PERSON1            GENDER         STANDARDIZED
   PERSON1            GIVEN_NAME1    PARSED
   PERSON1            GIVEN_NAME2    PARSED
   PERSON1            PERSON         PARSED

19. Close the Name_DataCleanse transform editor and join all three Data_Cleanse outputs with a single Join Query transform. Use the ROWID column to join the datasets together and remap the default Data_Cleanse output names to more meaningful names, as shown in the following screenshot:
20. Specify Phone_DataCleanse.DATA IS NOT NULL as a join filter in the Join Query transform to exclude the empty records from the migration.
21. Import the target template table CLEANSE_RESULT stored in the STAGE datastore to save the cleansing results in.
22. Finally, your dataflow should look like this:
23. Save and execute the job to see the cleansing results in the CLEANSE_RESULT table.

How it works…
In the first few steps of the preceding sequence, by concatenating multiple fields from the source OLTP database, we prepared our "dirty" data column, DATA, which was used as a source column for all three Data_Cleanse transforms.

When importing the Data_Cleanse transform, Data Services offers you the option to choose one of the predefined configurations. The Base_DataCleanse configuration requires you to configure the mandatory options manually, or your imported transform object will not work.

The Data_Cleanse transform is a mere mapping tool to map your input columns to the required parsing rules and desired output. Parsing rules and reference data are defined in the cleansing package, which can be developed and configured with the Information Steward Cleansing Package Builder tool. This tool provides a graphical user interface for this task. In this recipe, we are using the default cleansing package PERSON_FIRM available in Data Services without the need to have Information Steward installed.

Note
The default PERSON_FIRM cleansing package allows you to parse and standardize dates, emails, firm data, person names, social security numbers, and phone numbers.

The Input tab allows you to choose the type of component you would like to parse from the input dataset. Please note that you cannot specify the same field as a source of data for multiple components. That is why we have to create three distinct Data_Cleanse transform objects to parse the same DATA column for email, person name, and phone data. Each has its own configuration and mappings from input components to a desired set of output fields.

The set of fields available on the Output tab depends on which component you have chosen to be recognized and parsed on the Input tab, but it basically includes all possible information that can be extracted for a selected component. For example, if it is a Person name component, the output data cleanse fields include given name, second given name, last name, gender, and similar others.

Propagation of an artificial ROWID column allows us to join the split datasets together after they are processed by the Data_Cleanse transforms.

To view the result data, use the View data option on the target table object in the dataflow, or open SQL Server Management Studio and run the following query to see the parsed results:

   select DATA, EMAIL, PHONE, GIVEN_NAME, GIVEN_NAME_2ND, FAMILY_NAME,
          GENDER_STANDARDIZED
   from dbo.CLEANSE_RESULT

As you can see in the following screenshot, the Data_Cleanse transforms did a pretty good job of parsing the input DATA field:

An interesting result is stored in the GENDER_STANDARDIZED column. Based on the parsing rules and the reference data available, Data Services suggests how accurately the gender could be determined based solely on the available given and last names.

There's more…
As mentioned before, Data Services has great Data Quality capabilities. This is a huge topic for discussion, and we've just scratched the surface by showing you one transform from this toolset. This powerful functionality works best when Data Services is integrated with Information Steward. You can build your own cleansing packages to parse the migrated data more efficiently and accurately. Please refer to Chapter 12, Introduction to Information Steward, for more details.

Chapter 8. Optimizing ETL Performance
If you have tried all the previous recipes from this book, you can consider yourself familiar with the basic design techniques available in Data Services and can perform pretty much any ETL development task. Starting from this chapter, we will begin using advanced development techniques available in Data Services. This particular chapter will help you understand how existing ETL processes can be optimized further to make sure that they run quickly and efficiently, consuming as few computing resources as possible with the least amount of execution time.

Optimizing dataflow execution – push-down techniques
Optimizing dataflow execution – the SQL transform
Optimizing dataflow execution – the Data_Transfer transform
Optimizing dataflow readers – lookup methods
Optimizing dataflow loaders – bulk-loading methods
Optimizing dataflow execution – performance options

Introduction
Data Services is a powerful development tool. It supports a lot of different source and target environments, all of which work differently with regard to loading and extracting data. This is why it is required of you, as an ETL developer, to be able to apply different design methods, depending on the requirements of your data migration processes and the environment that you are working with.

In this chapter, we will review the methods and techniques that you can use to develop data migration processes in order to perform transformations and migrate data from the source to the target more effectively. The techniques described in this chapter are often considered best practices, but do keep in mind that their usage has to be justified. They allow you to move and transform your data faster, consuming fewer processing resources on the ETL engine's server side.

Optimizing dataflow execution – push-down techniques
The Extract, Transform, and Load sequence can be modified to Extract, Load, and Transform by delegating the power of processing and transforming data to the database itself, where the data is being loaded to.

We know that to apply transformation logic to a specific dataset, we have to first extract it from the database, then pass it through transform objects, and finally load it back into the database. Data Services can (and most of the time should, if possible) delegate some transformation logic to the database from which it performs the extract. The simplest example is when you are using multiple source tables in your dataflow joined with a single Query transform. Instead of extracting each table's contents separately onto the ETL box by sending multiple SELECT * FROM <table> requests, Data Services can send a single generated SELECT statement with the proper SQL join conditions defined in the Query transform's FROM and WHERE tabs. As you can probably understand, this can be very efficient: instead of pulling millions of records into the ETL box, you might end up getting only a few, depending on the nature of your Query joins. Sometimes this process shrinks to zero processing on the Data Services side. Then, Data Services does not even have to extract the data to perform transformations. What happens in this scenario is that Data Services simply sends the SQL statement instructions in the form of INSERT INTO … SELECT or UPDATE … FROM statements to the database, with all the transformations hardcoded in those SQL statements directly.
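As a hedged illustration only (the exact statement Data Services generates depends on the dataflow, and the target column list below is shortened and assumed for demonstration), a full push-down for a dataflow that joins two OLTP tables on BUSINESSENTITYID and loads a staging table could be sent to SQL Server as a single statement of roughly this shape:

   -- Illustrative shape of a generated full push-down statement (not verbatim DS output)
   INSERT INTO dbo.PERSON_DETAILS (BUSINESSENTITYID, FIRSTNAME, LASTNAME, PHONENUMBER)
   SELECT p.BUSINESSENTITYID,
          p.FIRSTNAME,
          p.LASTNAME,
          ph.PHONENUMBER
   FROM   Person.PERSON p
   JOIN   Person.PERSONPHONE ph
          ON ph.BUSINESSENTITYID = p.BUSINESSENTITYID;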

The scenarios where Data Services delegates parts of, or all, the processing logic to the underlying database are called push-down operations.

In this recipe, we will take a look at different kinds of push-down operations, what rules you have to follow to make push-down work in your designed ETL processes, and what prevents push-downs from happening.

Getting ready
As a starting example, let's use the dataflow developed in the Loading data from table to table – lookups and joins recipe in Chapter 4, Dataflow – Extract, Transform, and Load. Please refer to this recipe to rebuild the dataflow if, for some reason, you do not have it in your local repository anymore.

Push-down operations can be of two different types:

- Partial push-downs: A partial push-down is when Optimizer sends the SELECT query joining multiple source tables used in a dataflow, or sends one SELECT statement to extract data from a particular table with mapping instructions and filtering conditions from the Query transform hardcoded in this SELECT statement.
- Full push-downs: A full push-down is when all dataflow logic is reformed by Optimizer into a single SQL statement and sent to the database. The most common statements generated in these cases are complex INSERT/UPDATE and MERGE statements, which include all source tables from the dataflow joined together and transformations in the form of database functions applied to the table columns.

How to do it…
1. To be able to see what SQL queries have been pushed down to the database, open the dataflow in the workspace window and select Validation | Display Optimized SQL….
2. The Optimized SQL window shows all queries generated by the Data Services Optimizer and pushed down to the database level. In the following screenshot, you can see the SELECT query and the part of the dataflow logic which this statement represents:
3. Let's try to push down logic from the rest of the Query transforms. Ideally, we would like to perform a full push-down to the database level.
4. The Lookup_Phone Query transform contains a function call which extracts the PHONENUMBER column from another table. This logic cannot be included as is, because Optimizer cannot translate internal function calls into a SQL construction which could be included in the push-down statement.
5. Let's temporarily remove this function call by specifying a hardcoded NULL value for the PHONENUMBER column. Just delete the function call and create a new output column instead, of the varchar(25) data type.
6. Validate and save the dataflow and open the Optimized SQL window again to see the result of the changes. Straight away, you can see how the logic from both the Lookup_Phone and Distinct Query transforms was included in the SELECT statement: the default NULL value for the new column and the DISTINCT operator at the beginning of the statement:
7. What remains for the full push-down is the loading part, when all transformations and the selected dataset are inserted into the target table PERSON_DETAILS. The reason why this does not happen in this particular example is that the source tables and target tables reside in different datastores which connect to different databases: OLTP (AdventureWorks_OLTP) and STAGE.
8. Substitute the PERSON_DETAILS target table from the DS_STAGE datastore with a new template table, PERSON_DETAILS, created in the DBO schema of OLTP.
9. As a result of the change, you can see that Optimizer now fully transforms the dataflow logic into a pushed-down SQL statement.

How it works…
Data Services Optimizer wants to perform push-down operations whenever possible. The most common reasons, as we demonstrated during the preceding steps, for push-down operations not working are as follows:

- Functions: When functions used in mappings cannot be converted by Optimizer to similar database functions in the generated SQL statements. In our example, the lookup_ext() function prevents push-down from happening. One of the workarounds for this is to substitute the lookup_ext() function with an imported source table object joined to the main dataset with the help of the Query transform (see the following screenshot):
- Transform objects: When transform objects used in a dataflow cannot be converted by Optimizer to equivalent SQL statements. Some transforms are simply not supported for push-down.
- Automatic data type conversions: These can sometimes prevent push-down from happening.
- Different data sources: For push-down operations to work for a list of source or target objects, those objects must reside in the same database or must be imported into the same datastore. If they reside in different databases, dblink connectivity should be configured on the database level between those databases, and it should be enabled as a configuration option in the datastore object properties. All Data Services can do is send a SQL statement to one database source, so it is logical that if you want to join multiple tables from different databases in a single SQL statement, you have to make sure that connectivity is configured between the databases, so that such SQL could be run directly on the database level even before you start developing the ETL code in Data Services.

What is also important to remember is that Data Services Optimizer capabilities depend on the type of underlying database that holds your source and target table objects. Of course, it has to be a database that supports the standard SQL language, as Optimizer can send the push-down instructions only in the form of SQL statements.

Sometimes, you actually want to prevent push-downs from happening. This can be the case if:

- The database is busy to the extent that it would be quicker to do the processing on the ETL box side. This is a rare scenario, but it still sometimes occurs in real life. If this is the case, you can use one of the methods we just discussed to artificially prevent the push-down from happening.
- You want rows to actually go through the ETL box for auditing purposes or to apply special Data Services functions which do not exist at the database level. In these cases, the push-down will automatically be disabled and will not be used by Data Services anyway.

Optimizing dataflow execution – the SQL transform
Simply put, the SQL transform allows you to specify SQL statements directly inside the dataflow to extract source data instead of using imported source table objects. Technically, it has nothing to do with optimizing the performance of ETL, as it is not a generally recommended practice to substitute the source table objects with a SQL transform containing hard-coded SELECT SQL statements.

How to do it…
1. Take the dataflow used in the previous recipe and select Validation | Display Optimized SQL… to see the query pushed down to the database level. We are going to use this query to configure our SQL transform object, which will substitute all source table objects on the left-hand side of the dataflow.
2. On the Optimized SQL window, click on Save As… to save this push-down query to a file.
3. Drag and drop the SQL transform from Local Object Library | Transforms | Platform into your dataflow.

4. Now you can remove all objects on the left-hand side of the dataflow prior to the Lookup_Phone Query transform.
5. Open the SQL transform for editing in a workspace window. Choose OLTP as the datastore and copy and paste the query saved previously from your file into the SQL text field. To complete the SQL transform configuration, create output schema fields of appropriate data types which match the fields returned by the SELECT statement.
6. Exit the SQL transform editor and link it to the next Lookup_Phone Query transform. Open Lookup_Phone and map the source columns to the target.
7. Please note that the dataflow does not perform any native push-down queries anymore, and will give you the following warning message if you try to display optimized SQL:
8. Validate the job before executing it to make sure there are no errors.

How it works…
As you can see, the structure of the SQL transform is pretty simple. There are not many options available for configuration:

- Datastore: This option defines which database connection will be used to pass the SELECT query to.
- Database type: This option pretty much duplicates the value defined for the specified datastore object.
- Cache: This option defines whether the dataset returned by the query has to be cached on the ETL box.
- Array fetch size: This option basically controls the amount of network traffic generated during dataset transfer from the database to the ETL box.
- Update schema: This button allows you to quickly build the list of schema output columns from the SQL SELECT statement specified in the SQL text field.

The most common reasons why you would want to use the SQL transform instead of defining source table objects are as follows:

- Simplicity: Sometimes, you do not care about anything else except getting things done as fast as possible. Sometimes you get the extract requirements in the form of a SELECT statement, or you want to use an already tested SELECT query in your ETL code straight away.
- To utilize database functionality which does not exist in Data Services: This is usually a poor excuse, as experienced ETL developers can do pretty much anything with standard Data Services objects. However, some databases have internal non-standard SQL functions which can perform complex transformations. For example, in Netezza you can have functions written in C++ which can be utilized in standard SQL statements and, most importantly, will use the massively parallel processing functionality of the Netezza engine. Of course, Data Services Optimizer is not aware of these functions, and the only way to use them is to run direct SELECT SQL statements against the database. If you want to call a SQL statement like this from Data Services, the most convenient way to do it from within a dataflow is to use the SQL transform object inside the dataflow.
- Performance reasons: Once in a while, you can get a set of source tables joined to each other in a dataflow for which Optimizer, for some reason or other, does not perform a push-down operation, and you are very restricted in the ways you can create and utilize database objects in this particular database environment. In such cases, using a hard-coded SELECT SQL statement can help you to maintain an adequate level of ETL performance.

As a general practice, I would recommend that you avoid SQL transforms as much as possible. They can come in handy sometimes, but when using them, you not only lose the advantage of utilizing Data Services, the Information Steward reporting functionality, and the ability to perform auditing operations, you also potentially create big problems for yourself in terms of the ETL development process. Tables used in the SELECT statements cannot be traced with the View where used feature. They can be missing from your datastores, which means you do not have a comprehensive view of your environment and the underlying database objects utilized, because the source database tables are hidden inside the ETL code rather than being on display in the Local Object Library.

This obviously makes the ETL code harder to maintain and support, not to mention that migration to another database becomes a problem, as you would most likely have to rewrite all the queries used in your SQL transforms.

Note
The SQL transform prevents the full push-down from happening, so be careful. Only the SELECT query inside the SQL transform is pushed down to the database level. The rest of the dataflow logic will be executed on the ETL box, even if the full push-down was working before, when you had source table objects instead of the SQL transform.

In other words, the result dataset of the SQL transform is always transferred to the ETL box. That can affect the decisions around ETL design. From the performance perspective, it is preferable to spend more time building a dataflow based on the source table objects but for which Data Services performs the full push-down (producing the INSERT INTO … SELECT statement), rather than quickly building a dataflow which will transfer datasets back and forth to the database, increasing the load time significantly.

Optimizing dataflow execution – the Data_Transfer transform
The Data_Transfer transform object is a pure optimization tool helping you to push down resource-consuming operations and transformations, like JOIN and GROUP BY, to the database level.

Getting ready
1. Take the dataflow from the Loading data from a flat file recipe in Chapter 4, Dataflow – Extract, Transform, and Load. This dataflow loads the Friends_*.txt file into a STAGE.FRIENDS table.
2. Modify the Friends_30052015.txt file and remove all lines except the ones about Jane and Dave.
3. In the dataflow, add another source table, OLTP.PERSON, and join it to the source file object in the Query transform by the first-name field. Propagate the PERSONTYPE and LASTNAME columns from the source OLTP.PERSON table into the output Query transform schema, as shown here:

How to do it…
Our goal will be to configure this new dataflow to push down the insert of the joined dataset (data coming from the file and data coming from the OLTP.PERSON table) to the database level.

By checking the Optimized SQL window, you will see that the only query sent to the database from this dataflow is the SELECT statement pulling all records from the database table OLTP.PERSON to the ETL box, where Data Services will perform an in-memory join of this data with the data coming from the file. It's easy to see that this type of processing may be extremely inefficient if the PERSON table has millions of records and the FRIENDS table has only a couple of them. That is why we do not want to pull all records from the PERSON table for the join, and want to push down this join to the database level.

Looking at the dataflow, we already know that for the logic to be pushed down, the database should be aware of all the source datasets and should be able to access them by running a single SQL statement. The Data_Transfer transform will help us to make sure that the Friends file is presented to the database as a table. Follow these steps to see how it can be done:

1. Add the Data_Transfer object from Local Object Library | Transforms | Data Integrator into your dataflow, putting it between the source file object and the Query transform.
2. Edit the Data_Transfer object by opening it in a workspace window. Set Transfer type to Table and specify the new transfer table in the Table options section as STAGE.DBO.FRIENDS_FILE.
3. Close the Data_Transfer transform editor and select Validation | Display Optimized SQL… to see the queries pushed down to the database. You can see that there are now two SELECT statements generated to pull data from the OLTP.PERSON and STAGE.FRIENDS_FILE tables.

   The join between these two datasets happens on the ETL box. Then the merged dataset is sent back to the database to be inserted into the DS_STAGE.FRIENDS table.

4. Add another Data_Transfer transform between the source table PERSON and the Query transform. In the Data_Transfer configuration window, set Transfer type to Table and specify DS_STAGE.DBO.DT_PERSON as the data transfer table.
5. Validate and save the dataflow and display the Optimized SQL window.

Now you can see that we successfully implemented a full push-down of the dataflow logic, inserting merged data from two source objects (one of which is a flat file) into a staging table. In the preceding screenshot, the logic in the section marked in red is represented by an INSERT SQL statement pushed down to the database level.

How it works…
Under the hood, the Data_Transfer transform creates a subprocess that transfers the data to the specified location (file or table). Simply put, Data_Transfer is a target dataflow object in the middle of a dataflow. It has a lot of options similar to what other target table objects have; in other words, you can set up a bulk-loading mechanism, run Pre-Load Commands and Post-Load Commands, and so on.

The reason why I called Data_Transfer a pure optimization tool is that you can redesign any dataflow to do the same thing that Data_Transfer does without using it. All you have to do is simply split your dataflow in two (or three, for the dataflow in our example). Instead of forwarding your data into a Data_Transfer transform, you forward it to a normal target object and then, in the next dataflow, you use this object as a source.

Note
What Data_Transfer still does, which cannot be done easily when you are splitting dataflows, is automatically clean up the temporary data transfer tables.

It is critical to understand how push-down mechanisms work in Data Services to be able to use the Data_Transfer transform effectively. Putting it to use at the wrong place in a dataflow can decrease performance drastically.

Why we used a second Data_Transfer transform object
Our goal was to modify the dataflow in such a way as to get a full push-down SQL statement to be generated: INSERT INTO STAGE.FRIENDS SELECT <joined PERSON and FRIENDS datasets>.

As we remember from the previous recipe, there can be multiple reasons why a full push-down does not work. One of these reasons, which is causing trouble in our current example, is that the PERSON table resides in a different database, while our data transfer table, FRIENDS_FILE, and target table, FRIENDS, reside in the same STAGE database.

To make the full push-down work, we had to use a second Data_Transfer transform object to transfer data from the OLTP.PERSON table into a temporary table located in the STAGE database.
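Put together, the statement Data Services can now push down has roughly the shape below. This is an illustrative sketch only: the join column and the select list are assumptions for demonstration, since the recipe joins the file data to PERSON by the first-name field and propagates PERSONTYPE and LASTNAME.

   -- Illustrative shape of the generated full push-down (not verbatim DS output)
   INSERT INTO dbo.FRIENDS (FIRSTNAME, LASTNAME, PERSONTYPE)
   SELECT f.FIRSTNAME,
          p.LASTNAME,
          p.PERSONTYPE
   FROM   dbo.FRIENDS_FILE f   -- transfer table created by the first Data_Transfer
   JOIN   dbo.DT_PERSON p      -- transfer table created by the second Data_Transfer
          ON p.FIRSTNAME = f.FIRSTNAME;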

When to use the Data_Transfer transform
Use it whenever you encounter a situation where a dataflow has to perform a very "heavy" transformation (say, a GROUP BY operation) or join two very big datasets, and this operation is happening on the ETL box. In these cases, it is much quicker to transfer the required datasets to the database level so that the resource-intensive operation can be completed there by the database.

There's more…
One good example of a use case for the Data_Transfer transform is when you have to perform a GROUP BY operation in a Query transform right before inserting data into a target table object. By placing Data_Transfer right before the Query transform at the end of the dataflow, you can quickly insert the dataset processed by the dataflow logic before that Query transform, and then push down the INSERT and GROUP BY operations in a single SQL statement to the database level.
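As a hedged sketch of that scenario (the table and column names below are invented purely for illustration), the pushed-down statement would combine the insert and the aggregation in one query:

   -- Illustrative only: INSERT and GROUP BY pushed down together
   INSERT INTO dbo.SALES_SUMMARY (CUSTOMERID, TOTAL_AMOUNT)
   SELECT t.CUSTOMERID,
          SUM(t.AMOUNT)
   FROM   dbo.DT_SALES t   -- transfer table populated by the Data_Transfer transform
   GROUP BY t.CUSTOMERID;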

When you perform transformations on datasets which include millions of records, using the Data_Transfer transform can save you minutes, and sometimes hours, depending on your environment and the number of processed records.

Optimizing dataflow readers – lookup methods
There are different ways to perform the lookup of a record from another table in Data Services. The three most popular ones are: a table join with a Query transform, using the lookup_ext() function, and using the sql() function.

In this recipe, we will take a look at all these methods and discuss how they affect the performance of ETL code execution and their impact on the database used to source data from.

Getting ready
We will be using the same dataflow as in the first recipe, the one which populates the PERSON_DETAILS stage table from multiple OLTP tables.

How to do it…
We will perform a lookup for the PHONENUMBER column of a person from the OLTP table PERSONPHONE in three different ways.

Lookup with the Query transform join
1. Import the lookup table into a datastore and add the table object as a source in the dataflow where you need to perform the lookup.
2. Use the Query transform to join your main dataset with the lookup table using the BUSINESSENTITYID reference key column, which resides in both tables.

Lookup with the lookup_ext() function
1. Remove the PERSONPHONE source table from your dataflow and clear out the join conditions in the Lookup_Phone Query transform.
2. As you have seen in the recipes in previous chapters, the lookup_ext() function can be executed as a function call in the Query transform output columns list. The other option is to call the lookup_ext() function in the column mapping section. For example, say that we want to put an extra condition on when we want to perform a lookup for a specific value.

Instead of creating a new function call for looking up the PHONENUMBER column for all migrated records, let's put in the condition that we want to execute the lookup_ext() function only when the row has non-empty ADDRESSLINE1, CITY, and COUNTRY columns; otherwise, we want to use the default value 'UNKNOWN LOCATION'.

3. Insert the following lines in the Mapping section of the PHONENUMBER column inside the Lookup_Phone Query transform:

ifthenelse(
  (Get_Country.ADDRESSLINE1 IS NULL) OR
  (Get_Country.CITY IS NULL) OR
  (Get_Country.COUNTRY IS NULL),
  'UNKNOWN LOCATION',
  lookup_ext()
)

4. Now double-click on the lookup_ext() text to highlight only the lookup_ext function, and right-click on the highlighted area for the context menu.

5. From this context menu, select Modify Function Call to open the lookup_ext parameter configuration window. Configure it to perform a lookup for the PHONENUMBER field value from the PERSONPHONE table.

After closing the function configuration window, you can see the full code generated by Data Services for the lookup_ext() function in the Mapping section.


When selecting the output field, you can see all the source fields used in its Mapping section highlighted in the Schema In section on the left-hand side.

Lookup with the sql() function
1. Open the Lookup_Phone Query transform for editing in the workspace and clear out all code from the PHONENUMBER mapping section.
2. Put the following code in the Mapping section:

sql('OLTP', 'select PHONENUMBER from Person.PERSONPHONE where BUSINESSENTITYID = ' || Get_Country.BUSINESSENTITYID)

How it works…
Query transform joins
The advantages of this method are:

Code readability: It is very clear which source tables are used in the transformation when you open the dataflow in a workspace.
Push-down of the lookup to the database level: This can be achieved by including the lookup table in the same SELECT statement. Yes, as soon as you have placed the source table object in the dataflow and joined it properly with the other data sources using the Query transform, there is a chance that it will be pushed down as a single SQL SELECT statement, allowing the joining of source tables at the database level.
DS metadata report functionality and impact analysis.

The main disadvantage of this method comes naturally from its advantage. If a record from the main dataset references multiple records in the lookup table by the key column used, the output dataset will include multiple records with all these values. That is how standard SQL query joins work, and the Data Services Query transform works in the same way. This could potentially lead to duplicated records being inserted into a target table (duplicated by key columns but with different values in the lookup field, for example).

lookup_ext()
The opposite of a Query transform, this function hides the source lookup table object from the developer and from some of the Data Services reporting functionality. As you have seen, it can be executed as a function call or used in the mapping logic for a specific column.

This function's main advantage is that it will always return a single value from the lookup table. You can even specify the return policy used to determine the single value to return (MAX or MIN), with the ability to order the lookup table dataset by any column.

sql()
Similar to the lookup_ext() function in the presented example, it is rarely used that way, as lookup_ext() fetches rows from the lookup table more efficiently if all you want to do is extract values from the lookup table by referencing key columns.

At the same time, the sql() function makes it possible to implement very complex and flexible solutions, as it allows you to pass any SQL statement that can be executed on the database side. This can be the execution of stored procedures, the generation of sequence numbers, running analytical queries, and so on.
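
As an illustration, sql() is more commonly called from a script object than from a column mapping. A minimal sketch is shown here (the DWH datastore and the dbo.EMPLOYEE table come from the recipes in this book; the stored procedure name is hypothetical):

# Fetch a single value from the database into a variable.
$v_next_id = sql('DWH', 'select max(ID) + 1 from dbo.EMPLOYEE');
# Execute a statement that returns nothing, such as a stored procedure call.
sql('DWH', 'exec dbo.usp_rebuild_indexes');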

As a general rule, though, the usage of the sql() function in dataflow column mappings is not recommended. The main reason for this is performance, as you will see further on. Data Services has a rich set of instruments to perform the same task with a proper set of objects and ETL code design.


Performance review
Let's quickly review the dataflow execution times for each of the explained methods.

The first method: The lookup with the Query transform took 6.4 seconds.

The second method: The lookup with the lookup_ext() function took 6.6 seconds.

The third method: This used the sql() function and took 73.3 seconds.

The first two methods look similarly effective, but that is only because the number of rows and the size of the dataset used are very small. The lookup_ext() function allows the usage of different cache methods for the lookup dataset, which makes it possible to tune and configure it depending on the nature of your main data and that of the lookup data. It can also be executed as a separate OS process, increasing the effectiveness of fetching the lookup data from the database.

The third figure, for the sql() function, on the contrary, shows a perfect example of the extremely poor performance you get when the sql() function is used in column mappings.


Optimizing dataflow loaders – bulk-loading methods
By default, all records inside a dataflow coming to a target table object are sent as separate INSERT commands to the target table at the database level. If millions of records pass through the dataflow and the transformation happens on the ETL box without push-downs, the performance of sending millions of INSERT commands over the network back to the database for insertion can be extremely slow. That is why it is possible to configure alternative load methods on the target table object inside a dataflow. These types of loads are called bulk loads. Bulk-load methods are different in nature, but they all share the same principle and achieve the same goal: they avoid the execution of millions of INSERT statements, one for each migrated record, by providing alternative ways of inserting data.

The bulk-load methods executed by Data Services for inserting data into a target table depend completely on the type of the target database. For example, for an Oracle database, Data Services can implement bulk loading through files or through the Oracle API.

The bulk-loading mechanisms for inserting data into Netezza or Teradata are completely different. You will notice this straight away if you create different datastores connecting to different types of databases and compare the Bulk Loader Options tab of the target table objects from each of these datastores.

For detailed information about each bulk-load method available for each database, please refer to the official SAP documentation.


How to do it…
To see the difference between loading data in normal mode (row by row) and bulk loading, we have to generate quite a significant number of rows. To do this, take the dataflow from the previous recipe, Optimizing dataflow execution – the SQL transform, and replicate it to create another copy for use in this recipe. Name it DF_Bulk_Load.

Open the dataflow in the workspace window for editing.

1. Add a new Row_Generation transform from Local Object Library | Transforms | Platform as a source object and configure it to generate 50 rows, starting with row number 1.
2. The Row_Generation transform is used to multiply the number of rows currently being transformed by the dataflow logic. Previously, the number of rows returned by the Person_OLTP SQL transform was approximately 19,000. By performing a Cartesian join of these records to the 50 artificially generated records, we can get almost 1 million records inserted into the target PERSON_DETAILS table. To implement the Cartesian join, use the Query transform but without specifying any join conditions, leaving the section empty.
3. Your dataflow should look like this:
4. To test the current dataflow execution time, save and run the job which includes this dataflow. Your target table's Bulk Loader Options tab should be disabled, and on the Options tab, the Delete data from table before loading flag should be selected.
5. The execution time of the dataflow is 49 seconds, and as you can see, it took 42 seconds for Data Services to insert 939,900 records into the target table.


6. To enable bulk loading, open the target table configuration in the workspace for editing, go to the Bulk Loader Options tab, and check Bulk load. After that, set Mode to truncate and leave the other options at their default values.
7. Save and execute the job again.
8. The following screenshot shows that the total dataflow execution time was 27 seconds, and it took 20 seconds for Data Services to load the same number of records. That is two times faster than loading records in normal mode into the SQL Server database. Your times could be slightly different depending on the hardware you are using for your Data Services and database environments.


How it works…
The availability of the bulk-load methods is totally dependent on which database you use as a target. Data Services does not perform any magic; it simply utilizes the bulk-loading methods available in the database.

These methods are different for different databases, but the principle of bulk loading is usually as follows: Data Services sends the rows to the database host as quickly as possible, writing them into a local file. Then, Data Services uses the external table mechanism available in the database to present the file as a relational table. Finally, it executes a few UPDATE/INSERT commands to query this external table and insert the data into the target table specified as a target object in the Data Services dataflow.

Running one INSERT … SELECT FROM command is much faster than executing 1 million INSERT commands.
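
A minimal sketch of this pattern is shown here; the external table name and the column list are hypothetical placeholders, and the exact mechanism and syntax depend on the target database:

-- The staged file is exposed to the database as an external table, so the final
-- load is a single set-based statement instead of one INSERT per row.
INSERT INTO PERSON_DETAILS (BUSINESSENTITYID, FIRSTNAME, LASTNAME)
SELECT BUSINESSENTITYID, FIRSTNAME, LASTNAME
FROM EXT_PERSON_DETAILS;    -- hypothetical external table over the staged file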

Some databases perform these small insert operations quite effectively, while for others this can be a really big problem. In almost all cases, if we are talking about a significant number of records, the bulk-loading method will be the quicker way to insert data.

When to enable bulk loading?
You have probably noticed that as soon as you enable bulk loading in the target table configuration, the Options tab becomes grayed out. Unfortunately, by enabling bulk loading, you lose all the extra functionality available for loading data, such as autocorrect load. This happens because of the nature of the bulk-load operation: Data Services simply passes the data to the database for insertion and cannot perform the extra comparison operations which are available for row-by-row inserts.

The other reason for not using bulk loading is that it prevents full push-downs from occurring. Of course, in most cases push-down is the best possible option in terms of execution performance, so you would never think about enabling bulk loading if you have a full push-down working. For partial push-downs, when you only push down SELECT queries to get data onto the ETL box for transformation, bulk loading is perfectly valid. You still want to send records back to the database for insertion and want to do it as quickly as possible.

Most of the time, bulk loading does a perfect job when you are passing a big number of rows for insertion from the ETL box and do not utilize any of the extra loading options available in Data Services.

The best advice for deciding whether or not to enable bulk loading on your target table is to experiment and try different ways of inserting data. This is a decision which should take into account all parameters, such as environment configuration, the workload on the Data Services ETL box, the workload on the database, and, of course, the number of rows to be inserted into the target table.


Optimizing dataflow execution – performance options
We will review a few extra options available for different transforms and objects in Data Services which affect performance and, sometimes, the way ETL processes and transforms data.


Getting ready
For this recipe, use the dataflow from the recipe Optimizing dataflow readers – lookup methods in this chapter. Please refer to that recipe if you need to create or rebuild this dataflow.


How to do it…
Data Services performance-related configuration options can be put into the following categories:

Dataflow performance options
Source table performance options
Query transform performance options
Lookup function performance options
Target table performance options

In the following sections, we will review and explain all of them in detail.

Dataflow performance options
To access the dataflow performance options, right-click on a dataflow object and select Properties from the context menu.

The Degree of parallelism option replicates transform processes inside the dataflow according to the number specified. Data Services creates separate sub-dataflow processes and executes them in parallel. At the points in the dataflow where the processing cannot be parallelized, data is merged back together from the different sub-dataflow processes into the main dataflow process. If the source table used in the dataflow is partitioned and the value of the Degree of parallelism option is higher than 1, Data Services can use multiple reader processes to read the data from the same table. Each reader reads data from the corresponding partitions. Then, the data is merged, or continues to be processed in parallel if the next transform object allows parallelization.

For detailed information on how the Degree of parallelism option works, please refer to the official documentation, SAP Data Services: Performance Optimization Guide. You should be very careful with this parameter. The usage and value of Degree of parallelism should depend on the complexity of the dataflow and on the resources available on your Data Services ETL server, such as the number of CPUs and the amount of memory used.

If the Use database links option is configured at both the database and Data Services datastore levels, database links can help to produce push-down operations. Use this option to enable or disable database link usage inside a dataflow.

Cache type defines which type of cache will be used inside a dataflow for caching datasets. A Pageable cache is stored on the ETL server's physical disk, and In-Memory keeps the cached dataset in memory. If the dataflow processes very large datasets, it is recommended that you use a pageable cache so as not to run out of memory.

Source table performance options
Open your dataflow in the workspace and double-click on any source table object to open the table configuration window.

Array fetch size allows you to optimize the number of requests Data Services sends to fetch the source dataset onto the ETL box. The higher the number used, the fewer requests Data Services has to send to fetch the data. This setting should depend on the speed of your network: the faster your network is, the higher the number you can specify to move the data in bigger chunks. By decreasing the number of requests, you can potentially also decrease the CPU consumption on your ETL box.

Join rank specifies the "weight" of the table used in Query transforms when you join multiple tables. The higher the rank, the earlier the table will be joined to the other tables. If you have ever optimized SQL statements, you know that specifying big tables earlier in the join conditions can potentially decrease the execution time. This is because the number of records after the first join pair can be decreased dramatically through inner joins, for example. This makes the subsequent join pairs produce smaller datasets and run quicker. The same principle applies here in Data Services, but to specify the order of the join pairs, you use the rank option.

Cache can be set up if you want the source table to be cached on the ETL server. The type of cache used is determined by the dataflow Cache type option.

Query transform performance options
Open the Query transform in the workspace window:


Join rank offers the same options as described earlier and allows you to specify the order in which the tables are joined.

Cache is, again, the same as described earlier and defines whether the table will be cached on the ETL server.

lookup_ext() performance options
Right-click on the selected lookup_ext function in the column mapping section, or on the function call in the output schema of the Query transform, and select Modify Function Call in the context menu:

Cache spec defines the type of cache method used for the lookup table. NO_CACHE means that, for every row in the main dataset, a separate SELECT lookup query is generated, extracting the value from the database lookup table. When PRE_LOAD_CACHE is used, the lookup table is first pulled onto the ETL box and cached in memory or on the physical disk (depending on the dataflow Cache type option). DEMAND_LOAD_CACHE is a more complex method best used when you are looking up repetitive values; only then is it most efficient. Data Services caches only the values already extracted from the lookup table. If it encounters a new key value that does not exist in the cached table, it makes another request to the lookup table in the database to find it and then caches it too.

Run as a separate process can be encountered in many other transform and object configuration options. It is useful when the transform is performing high-intensity operations consuming a lot of CPU and memory resources. If this option is checked, Data Services creates separate sub-dataflow processes that perform this operation. Potentially, this option can help parallelize object execution within a dataflow and speed up processing and transformations significantly. By default, the OS creates a single process for a dataflow, and if it is not parallelized, all processing is done within this single OS process. Run as a separate process helps to create multiple processes, helping the main dataflow OS process to perform all extracts, joins, and calculations as fast as possible.

Target table performance options
Click on a target table to open its configuration options in the workspace window:


Rows per commit is similar to Array fetch size but defines how many rows are sent to the database within the same network packet. To decrease the number of packets with rows sent to the database for insert, you can increase this number.

Number of loaders helps to parallelize the loading processes. Enable partitions on the table objects on the Datastores tab if the tables are partitioned at the database level. If they are not partitioned, set the same number of loaders as the Degree of parallelism.


Chapter 9. Advanced Design Techniques
The topics we will cover in this chapter include:

Change Data Capture techniques
Automatic job recovery in Data Services
Simplifying ETL execution with system configurations
Transforming data with the Pivot transform


Introduction
This chapter will guide you through advanced ETL design methods. Most of them will utilize Data Services features and functionality already explained in the previous chapters. As you have probably noticed, there are many ways to do the same thing in Data Services. The methods and logic you apply to solve a specific problem often depend on environment characteristics and some other conditions, such as development resources and the extract requirements applied to the source systems. In contrast, some of the methods and techniques explained further on do not depend on all these factors and can be considered ETL development best practices.

In this chapter, we will discuss a very popular method of populating slowly changing dimensions in a data warehouse, which requires the use of a combination of Data Services transforms and dataflow design techniques.

We will also review the automatic recovery methods available in Data Services, which allow you to easily restart previously failed jobs without performing extra recovery steps for the various components of the ETL code and the underlying target data structures.

Another topic discussed in this chapter is the usage of system configurations in Data Services. This feature allows you to simplify your ETL development and makes it easy to run the same jobs against various source and target environments.

Finally, we will review one of the advanced Data Services transforms that enables you to implement the pivoting transformation method on the passing data, converting rows into columns and vice versa.


Change Data Capture techniques
Change Data Capture (CDC) is the method of developing ETL processes to propagate changes in the source system into the dimension tables of your data warehouse.


Getting ready
CDC is directly related to another DWH concept, Slowly Changing Dimensions (SCD): dimension tables whose data changes constantly throughout the life of the data warehouse.

A good example would be the Employee dimension table, which holds data on the employees in your company. As you can imagine, this table is in constant flux: new employees are hired, and some employees leave the company, change positions and roles, or even transfer between departments. All these changes have to be propagated to the Employee dimension table in the DWH from the source systems, which always store only the latest state of the Employee data. In the DWH, in most cases and for most of the dimension tables, you want to keep historical data to be able to derive the state of the Employee data at a specific point of time in the past. That is why SCD tables have extra fields to accommodate historical data and can be populated using various methods, depending on their type.

There are many different types of SCD tables, but we will quickly discuss only the three main ones, as the rest are just combinations of these three. We will refer to the SCD type numbers according to Ralph Kimball's methodology in brackets.

As an example, let's take the case of the Employee dimension table when one employee, John, gets transferred from marketing to finance.

No history SCD (Type 1)
Yes, a no-history SCD table is one that does not store historical data at all. Records are inserted (new records) and updated (changes). Take a look at the following example.

The original record for John looks like this:

ID NAME DEPARTMENT

1 John Marketing

Here's what the new record looks like after the changes are applied:

ID NAME DEPARTMENT

1 John Finance

This type of SCD does not keep historical records at all; as you can see, there is no information that John has ever worked in a different department.

Limited history SCD (Type 3)
A limited history table uses extra fields in the same record to keep the current value and a previous value, as shown here:

ID NAME DEP_PREV DEP_CUR EFFECTIVE_DATE


1 John Marketing Finance 27/02/2015

It is "limited" as you have to add extra columns for every new "historical state" of the row. In the preceding example, you can keep track of only the current and previous values of the record.

Unlimited history SCD (Type 2)
Unlimited history is possible if you create multiple records for each entity. Only one record represents the current value. One of the variations of an unlimited history SCD is shown in the following table:

KEY ID NAME DEPARTMENT START_DT END_DT CUR_FLAG

1 1 John Marketing 1582/01/01 27/02/2015 N

2 1 John Finance 27/02/2015 9999/12/31 Y

The ID is the natural key in the dimension table. For John, this is 1. This type of SCD requires the creation of a surrogate key to define the uniqueness of the record. The CUR_FLAG field defines the current record. The START_DT and END_DT columns show the period of time when the record was valid/current. Note that these date fields do not represent any business value such as employment start date or date of birth; they just show the start and end dates of the period when the record was valid (or current) and are only used to accommodate the preservation of historical records. When populating initial records for the first time in an SCD table, you may often want to use dates from the distant past and future, such as 1582/01/01 and 9999/12/31, called "low" and "high" date values. This allows users to run reports which retrieve more accurate historical information.

By using a low date in the START_DT field, we mark the record as an initial historical record in our dimension table. The same goes for using a high date in the END_DT column: such a record always has the CUR_FLAG flag set to Y and shows the latest (current) record in the history table.

Each time you make a change to the Employee table, in our case to the NAME or DEPARTMENT fields, you have to update the "current" record by changing the END_DT and CUR_FLAG field values to the date of change and N, respectively, and you also have to insert a new record with START_DT set to the date of change and CUR_FLAG set to Y.
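
Expressed in plain SQL, John's transfer would roughly amount to the following pair of statements. This is only a simplified sketch against the sample structure shown above; in the recipe itself, the equivalent operations are produced for you by the Table_Comparison and History_Preserving transforms:

-- Close the current version of the record...
UPDATE EMPLOYEE
SET END_DT = '2015-02-27', CUR_FLAG = 'N'
WHERE ID = 1 AND CUR_FLAG = 'Y';

-- ...and insert the new current version with a new surrogate key.
INSERT INTO EMPLOYEE (KEY, ID, NAME, DEPARTMENT, START_DT, END_DT, CUR_FLAG)
VALUES (2, 1, 'John', 'Finance', '2015-02-27', '9999-12-31', 'Y');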

In this recipe, we will build a dataflow that populates an SCD table of the unlimited history type (as shown in the Type 2 example). Data Services has a special transform object called History_Preserving, which allows the automatic update/insert of the changed and new history records.


How to do it…
To build the CDC process which will update our target SCD table in the data warehouse from a source OLTP table, we need two dataflows. The first will extract data from the source OLTP system into a staging table located in the STAGE database, and the second will use this STAGE table to compare the data in it with the target table contents and will produce the history records (in the form of INSERT and UPDATE SQL statements) to propagate the data changes into the target SCD table.

1. Create a new job and a new extract dataflow, DF_OLTP_Extract_STAGE_Employee, that extracts the Employee table from the HumanResources schema into a staging table, STAGE_EMPLOYEE.

For our future Employee SCD table, we will only be extracting the following list of fields from OLTP.EMPLOYEE:

Field             Description
BUSINESSENTITYID  Primary key for Employee records
NATIONALIDNUMBER  Unique national ID
LOGINID           Network login
ORGANIZATIONLEVEL The depth of the employee in the corporate hierarchy
JOBTITLE          Work title
BIRTHDATE         Date of birth
MARITALSTATUS     M = Married, S = Single
GENDER            M = Male, F = Female
HIREDATE          Employee hired on this date
SALARIEDFLAG      Job classification
VACATIONHOURS     Number of available vacation hours
SICKLEAVEHOURS    Number of available sick leave hours

Map only these fields to the output schema of the Extract Query transform.

2. Create a new dataflow, DF_STAGE_Load_DWH_Employee, and link the first extract dataflow to it in the same job.

3. Create an empty target SCD table, EMPLOYEE, by using the CREATE TABLE statement in SQL Server Management Studio when connected to the AdventureWorks_DWH database:

CREATE TABLE [dbo].[EMPLOYEE](
  [ID] [decimal](22, 0) NULL,
  [BUSINESSENTITYID] [int] NULL,
  [NATIONALIDNUMBER] [varchar](15) NULL,
  [LOGINID] [varchar](256) NULL,
  [ORGANIZATIONLEVEL] [int] NULL,
  [JOBTITLE] [varchar](50) NULL,
  [BIRTHDATE] [date] NULL,
  [MARITALSTATUS] [varchar](1) NULL,
  [GENDER] [varchar](1) NULL,
  [HIREDATE] [date] NULL,
  [SALARIEDFLAG] [int] NULL,
  [VACATIONHOURS] [int] NULL,
  [SICKLEAVEHOURS] [int] NULL,
  [START_DT] [date] NULL,
  [END_DT] [date] NULL,
  [CUR_FLAG] [varchar](1) NULL
) ON [PRIMARY]

4. Import the EMPLOYEE table created in the previous step into the DWH datastore.
5. Open the DF_STAGE_Load_DWH_Employee dataflow in the workspace window to edit it and add the required transformations, as shown in the following figure.

These steps explain the configuration of each of the dataflow objects we just used:

1. The Query transform is used to create an extra field, START_DT, of the date data type. It will be used by the History_Preserving transform to produce the start date of the history record in the target SCD table.


2. The Table_Comparison transform is used to compare the dataset from the STAGE_EMPLOYEE table to the target SCD table dataset, in order to produce rows of the INSERT type (to create records which exist in the source but not in the target, according to the specified key columns) and rows of the UPDATE type. Source records whose ID column exists in the target table will be used to provide new values for the non-key fields. The input primary key column we specify for Table_Comparison to determine whether the record exists in the comparison table is BUSINESSENTITYID. The rest of the source columns go into the Compare columns section, as we want to use all of them to determine whether any value in any of these fields has changed.

3. The History_Preserving transform works in tandem with Table_Comparison to produce "history" records, updating the additional START_DT, END_DT, and CUR_FLAG fields along with the rest of the non-key fields, or to create new history records for the INSERT type of rows defined by the previous Table_Comparison.


The Compare columns section should have the same list of comparison columns as in the previous Table_Comparison transform. You can also control which format will be used as a high date (9999.12.31) and which values will be used in the Current flag field.

4. The Key_Generation transform generates surrogate unique keys in the ID field for our history SCD table EMPLOYEE, as BUSINESSENTITYID will no longer represent the uniqueness of the record if multiple history rows are created for the same employee.

5. Save and execute the job initially to populate the target SCD table with the initial dataset. After running the job, if you check the contents of the target table, you will see that it represents the same dataset as in the OLTP.Employee table but with the extra start/end date columns populated.

Note
Note that, as this is the initial dataset, no history records have been created for any employee. Thus, the BUSINESSENTITYID column still has unique values in this dataset.

6. Let's generate some history records in our target SCD table. To do that, we have to make changes to the source OLTP table by executing the following statements in SQL Server Management Studio when connected to the AdventureWorks_OLTP database:

select * from HumanResources.Employee where BusinessEntityID in (1, 999);

insert into HumanResources.Employee
  (BusinessEntityID, NationalIDNumber, LoginID, OrganizationNode, JobTitle, BirthDate, MaritalStatus, Gender,
   HireDate, SalariedFlag, VacationHours, SickLeaveHours)
values
  (999, '999999999', 'domain\johnny', null, 'Engineer', '1982-01-01', 'S', 'M', SYSDATETIME(), 1, 99, 10);

update HumanResources.Employee set JobTitle = 'CEO'
where BusinessEntityID = 1;

7. Now run the job a second time and check the contents of the target SCD table, EMPLOYEE, for the employees with BUSINESSENTITYID set to 1 and 999.


How it works…
Another important thing we have to discuss, before we explain in detail how this CDC dataflow works, is the difference between the types of CDC architecture.

There are two basic types of CDC methods, or methods allowing you to populate SCD tables. They are usually called source-based CDC and target-based CDC. You can use either of them, or even both of them simultaneously, to populate any type of SCD table. They differ only in how the changes in the source data are determined.

So, imagine that you have a populated Employee DWH dimension table (which has not been updated for a couple of days) on one hand, and the source Employee OLTP table (which might or might not be different from the target DWH table's current snapshot of employee data) on the other.

Source-based ETL CDC
This method allows you to determine which employee records have had their values changed since the last time you updated the SCD dimension table in your data warehouse, just by looking at the source Employee table. For this to work, the source table should have MODIFY_DATE and CREATE_DATE fields in it, updated with the current date/time each time a record in the source Employee table gets updated or created (if it is a new employee record).

Another component required for source-based CDC is the date/time when the Employee table was last migrated to populate the DWH table (usually stored in an ETL log table and extracted into a variable, $v_last_update_date).

So, each time you perform an extraction of the source Employee table, you add a filtering condition, such as SELECT * FROM EMPLOYEE WHERE MODIFY_DATE >= $v_last_update_date OR CREATE_DATE >= $v_last_update_date. This allows you to extract significantly fewer records from the source system, increasing the ETL processing speed and decreasing your CPU, memory, and network resource consumption.

Then, in the dataflow that populates the target SCD table in the DWH, you determine whether each row is a new or updated record by checking the MODIFY_DATE and CREATE_DATE values. With the Map_Operation transform, change the record operation type to either INSERT or UPDATE to send the rows to the History_Preserving transform for history record generation.

Target-based ETL CDC
In target-based CDC, the whole source table is extracted and each extracted record is then compared with each target SCD table record. Data Services has an excellent transformation object, Table_Comparison, which performs this operation, producing INSERT/UPDATE/DELETE records and sending them to the History_Preserving transform for history record generation.

Needless to say, pure target-based CDC is a resource- and time-consuming method, the main advantage of which is the simplicity of implementation. So, why not mix them together to get the speed of source-based CDC, which extracts fewer records, and the simplicity of target-based CDC, which uses only two transforms, Table_Comparison and History_Preserving, to determine the rows for INSERT and UPDATE and to prepare the history rows which will be sent to the target SCD table?

In the steps of this recipe, we implemented a pure target-based CDC method. The following screenshot shows one possible way (in a very simplistic form) to update our target-based CDC to utilize the techniques of the source-based CDC method, in order to determine the dataset for extraction with only the changed data:

The initial script here uses the log table CDC_LOG to extract the date when the data was last successfully extracted and applied to the SCD target table.

The CDC_LOG table has only one field, EXTRACT_DATE, and always has only one record, showing when the CDC process was last executed. We extract this value from it before running our CDC dataflows and update it right after the successful execution of all CDC dataflows.

The final script updates the log table with the current time, so when the job is executed the next time, it will only extract records that have been modified since that date.
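
A minimal sketch of what these two script objects might contain is shown here, assuming the CDC_LOG table is reachable through a datastore named STAGE (adjust the datastore name to your environment):

# Initial script: read the last successful extraction date into a global variable.
$v_last_update_date = sql('STAGE', 'select max(EXTRACT_DATE) from CDC_LOG');

# Final script: record the time of the current run for the next execution.
$v_now = sysdate();
sql('STAGE', 'update CDC_LOG set EXTRACT_DATE = {$v_now}');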

There are many variations of source-based CDC method implementation. They all depend on how often data is extracted, whether there is a MODIFIED_DATE column on the source table, how intensively the source table is updated with new values, and so on.

The main idea here is to extract as few records as possible without losing the changes made to the source table.

Native CDC
Some databases, such as MS SQL Server and Oracle, have native CDC functionality, which can be enabled for specific tables. When any DML operations are performed on the table contents, the database updates internal CDC structures, logging when and how the table records were last updated. Data Services can utilize this native CDC functionality provided by the database. This configuration can be done at the datastore level by using the datastore options when you create a datastore object.

Using this functionality allows you to always select only the changed records from the source database tables.

We will not discuss the details of using native CDC in Data Services, but you can consider this task as good homework practice and try to create your own CDC dataflows. Just do not forget that CDC has to be enabled at the database level first, before you make any configuration changes on the Data Services side and start developing the ETL.


Automatic job recovery in Data Services
The recovery process usually kicks in when Data Services jobs fail. A failed job, in most cases, means that some part of it has completed successfully and some part has not. A job which has failed right at the very beginning is rarely a problem and is of hardly any concern for recovery, as all you have to do is start it again.

Complications arise when the job fails in the middle of an insert into a target table, for example. Cases like that require you to either consider deleting the already inserted records or even recovering a copy of the table from a backup using database recovery methods.

Recovery and error handling is an important part of robust ETL code. In this recipe, we will take a look at the methods used to develop ETL in Data Services and the functionality available in the software to make sure that the process of resuming failed processes goes as smoothly as possible.

The automatic job recovery feature available in Data Services does not fix problems with partially inserted data or missing keys (when an insert into a fact table cannot find the related key values in the referenced dimension tables because they have not been properly populated after the last job failure). Also, this feature does not protect you from poor ETL design or development errors when, for example, your ETL migration process does an automatic conversion of data between incompatible data types. In that case, it is your job to develop your ETL in such a way that you can cleanse the data if necessary and do manual conversions, making sure that you can either convert the value in the field between data types or set the row with this value aside to investigate or deal with it later.

The automatic job recovery feature simply tracks the execution statuses of all dataflow and workflow objects within a job, and if the job fails, it allows you to restart the job without the need to run the successfully completed processes again.

Let's see how it works.


Getting ready
We will use the job from the previous recipe. This job contains two dataflows: an extract of the Employee table from the OLTP source database into the staging area, and the load of the data from the staging table into the target data warehouse history table, Employee.

We have to emulate a failed process. To do that, we will drop the target dimension table populated by the second dataflow.

First of all, generate a CREATE TABLE statement from the table dbo.EMPLOYEE using SQL Server Management Studio. Do this by right-clicking on the table object and selecting Script Table As | CREATE To | New Query Editor Window from the context menu, so that you can later recreate a table with the same definition without any difficulties. Save this code on your physical drive for later use to recreate the table:

CREATE TABLE [dbo].[EMPLOYEE](
  [ID] [decimal](22, 0) NULL,
  [BUSINESSENTITYID] [int] NULL,
  [NATIONALIDNUMBER] [varchar](15) NULL,
  [LOGINID] [varchar](256) NULL,
  [ORGANIZATIONLEVEL] [int] NULL,
  [JOBTITLE] [varchar](50) NULL,
  [BIRTHDATE] [date] NULL,
  [MARITALSTATUS] [varchar](1) NULL,
  [GENDER] [varchar](1) NULL,
  [HIREDATE] [date] NULL,
  [SALARIEDFLAG] [int] NULL,
  [VACATIONHOURS] [int] NULL,
  [SICKLEAVEHOURS] [int] NULL,
  [START_DT] [date] NULL,
  [END_DT] [date] NULL,
  [CUR_FLAG] [varchar](1) NULL
) ON [PRIMARY]

Then, execute the following command to drop the table:

DROP TABLE [dbo].[EMPLOYEE]


How to do it…
1. Open Job_Employee in the workspace window and execute it.
2. In the Execution Properties window, check the Enable recovery option. This option will enable execution status logging for the workflow and dataflow objects within the job.
3. The first dataflow executes successfully, but the second one fails straight away with an error message from the Key_Generation transform, which sends the SQL statement SELECT max(ID) FROM dbo.EMPLOYEE in order to get the latest key value from the target table.
4. Now, bring back the missing table object by executing the previously saved CREATE TABLE command in SQL Server Management Studio.
5. Execute the job again, but this time select the Recover from last failed execution option in the Execution Properties window.
6. The trace log states that DF_OLTP_Extract_STAGE_Employee was successfully recovered from the previous job execution.


How it works…
The automatic recovery feature works only if you enable the flag in the job execution options window to turn on the object status logging mechanism. If you haven't enabled it before your job fails, you cannot use the automatic recovery feature.

A very important thing to do before running the job again in recovery mode is to check why the job failed. If the job failed in the middle of populating one of the tables (dimension or fact), you have to understand the impact of running the same load process again without first cleaning up the already inserted records.

In our recipe, we simulated the failure of the load dataflow, which populates the target dimension table. As it has the Table_Comparison and History_Preserving transforms, it is not a problem to execute it again with the same dataset without any preparatory steps. Records that have already been inserted will simply not be considered by Table_Comparison for either INSERT or UPDATE and will be ignored, so it is safe for us to just restart the job in recovery mode.

Note
Always consider the type of failure, the nature of your data, and how it is populated by your ETL before restarting the job in recovery mode, to prevent inserting duplicates into your target tables or referencing missing key values.

The workflow object can group several child objects placed inside it into a single recovery transactional unit by using the Recover as a unit option. This is useful when several of your dataflow objects work as a single unit in order to populate a specific target table by preparing data at a specific point in time. In that case, if some of these dataflows fail, you want to execute the whole sequence of dataflows from the beginning. Otherwise, Data Services will execute the job in the default recovery mode, skipping all previously successfully completed dataflows and workflows.

To use this ability, place both dataflow objects into a single workflow. Open the workflow properties and check the Recover as a unit option.

The workflow icon will be marked in the workspace window with a green arrow and a small black cross so that you can visually differentiate which parts of your code behave as a transactional unit during the recovery process.


Note
Note that script objects are not considered by recovery mode, as they are part of the parent workflow object. You should keep that in mind before rerunning the job in recovery mode.


There's more…
Of course, the best way to make your life easier is to try to prevent the necessity of job recovery in the first place. One of the techniques that can be implemented to prevent possible problems with data recovery and job rerun complications is putting extra code in a try-catch block. This code can be a set of scripts that performs a table clean-up followed by a "clean" failure, so the job can simply be rerun without extra considerations and preparatory steps, or it can even be an alternative workflow that processes the data with a different method compared to the original one that failed.

For example, if you use a dataflow that loads a flat file into a table, you can wrap it in a try-catch block. If it fails, execute another dataflow from the catch block to try to read the file again, but from a different location or using a different method.


Simplifying ETL execution with system configurations
Working with multiple source and target environments is very common. The development of ETL processes by accessing data directly from the production system happens very rarely. Most of the time, multiple copies of the source system database are created to provide a working environment for ETL developers.

Basically, the development environment is an exact copy of the production environment, with the only difference being that the development environment holds an old snapshot of the data, or test data in smaller volumes for quick test job execution.

So, what happens after you create a datastore object, import all the required tables from the database into it, and finish developing your ETL? You have to switch to the production environment.

Data Services provides a very convenient way of storing multiple datastore configurations in the same datastore object, so you do not need to edit the datastore object options each time you want to extract from either the production or the development database environment. Instead, you can create multiple configurations, each using different credentials and different database connection settings, and quickly switch between them when executing a job. This allows you to touch the datastore object settings only once instead of changing them each time you want to run your job against a different environment.


Getting ready
To implement the steps in this recipe, we will need to create a copy of the AdventureWorks_DWH database. Our sample database copy is named DWH_backup. Use any preferred SQL Server method to copy the contents of AdventureWorks_DWH into DWH_backup. The quickest way of performing this kind of copy is to back up the database using the standard SQL Server methods available in the database object context menu, and then restore this backup copy as a database with a new name.


How to do it…
There is no need to create a separate datastore object for DWH_backup or change the DWH datastore configuration options each time we want to extract from either AdventureWorks_DWH or DWH_backup. Let's just create two configurations for our DWH datastore.

1. Go to Local Object Library | Datastores.
2. Right-click on the DWH datastore and select Edit… from the context menu.
3. On the Edit Datastore DWH window, click on Advanced << to open the advanced configuration part, and then click on the Edit… button next to the Configurations: label.
4. In the top-left corner of the Configurations for Datastore DWH window, you can see four buttons that allow you to create a new configuration, duplicate the currently chosen one, and rename or delete configurations. Use them to rename the currently used configuration to DWH_Production and create a new configuration, DWH_Development.
5. Make the new DWH_Development configuration the default configuration by setting Default configuration to Yes. Note that this value changes automatically to No in the other configurations.
6. Change the Database name or SID or Service Name option setting for DWH_Development to DWH_backup to point this configuration to another database.


There is no need to change the other options, as they will be identical for both configurations.

7. Now let's create system configurations so that we can choose the configuration setup when we run the job, without the need to edit the datastore's Default configuration option. Go to Tools | System Configurations… and create two system configurations: Development and Production.
8. For the DWH record, set Development to DWH_Development and Production to DWH_Production.
9. Click on OK to save the changes.


How it works…
Using configurations enables you to quickly switch between environments without the need to modify connectivity and configuration settings inside a datastore object.

System configurations extend the usability of datastore configurations even further by allowing you to select the combination of environments right at job execution time.

Note
For the system configuration functionality to work, datastore configurations have to be created first.

Do you want to be able to extract from the production OLTP source but insert into the development DWH target within the same job, without changing the ETL code or datastore settings? Just create a new system configuration that includes the required combination of the various datastore configurations and execute the job with that system configuration specified.

Now, if you execute the Job_Employee job, just select the desired configuration in the job execution options:

Use the Browse… button to review all the system configurations created, if necessary.


Transforming data with the Pivot transform
The Pivot transform belongs to the Data Integrator group of transform objects, which are usually all about the generation or transformation (changing the structure) of data. Simply put, the Pivot transform allows you to convert columns into rows. The pivoting transformation increases the number of rows in the dataset, as for each column converted into a row, an extra row is created for every set of key (non-pivoted) column values. Converted columns are called pivot columns.


Getting ready
Run the following SQL statements against the AdventureWorks_OLTP database to create a source table and populate it with data:

create table Sales.AccountBalance (
  [AccountID] integer,
  [AccountNumber] integer,
  [Year] integer,
  [Q1] decimal(10, 2),
  [Q2] decimal(10, 2),
  [Q3] decimal(10, 2),
  [Q4] decimal(10, 2));

-- Row 1
insert into Sales.AccountBalance
  ([AccountID], [AccountNumber], [Year], [Q1], [Q2], [Q3], [Q4])
values (1, 100, 2015, 100.00, 150.00, 120.00, 300.00);

-- Row 2
insert into Sales.AccountBalance
  ([AccountID], [AccountNumber], [Year], [Q1], [Q2], [Q3], [Q4])
values (2, 100, 2015, 50.00, 350.00, 620.00, 180.00);

-- Row 3
insert into Sales.AccountBalance
  ([AccountID], [AccountNumber], [Year], [Q1], [Q2], [Q3], [Q4])
values (3, 200, 2015, 333.33, 440.00, 12.00, 105.50);

The source table should look like the following figure:

Do not forget to import it into the Data Services OLTP datastore.


How to do it…
1. Create a new dataflow and name it DF_OLTP_Pivot_STAGE_AccountBalance.
2. Open the dataflow in the workspace window to edit it and place the source table ACCOUNTBALANCE from the OLTP datastore created in the Getting ready section of this recipe.
3. Link the source table to the Extract Query transform, and propagate all source columns to the target schema.
4. Place a new Pivot transform object into the dataflow and link the Extract Query to it. The Pivot transform can be found in Local Object Library | Transforms | Data Integrator.
5. Open the Pivot transform in the workspace to edit it and configure its parameters according to the following screenshot:
6. Close the Pivot transform and link it to another Query transform named Prepare_to_Load.
7. Propagate all source columns to the target schema of the Prepare_to_Load transform, and finally link it to the target ACCOUNTBALANCE template table created in the DS_STAGE datastore and STAGE database.
8. Before executing the job, open the Prepare_to_Load Query transform in the workspace window, double-click on the PIVOT_SEQ column, and check Primary key to specify this additional column as a primary key column for the migrated dataset.

9. Save and run the job.
10. Open the dataflow again in the workspace window and import the target table by right-clicking on the target table and selecting Import table from the table context menu.
11. Open the target table in the workspace window to edit its properties, and select the Delete data from table before loading flag on the Options tab.
12. Your dataflow and Prepare_to_Load Query transform mapping should now look like the following screenshot:


How it works…
Pivot columns are the columns whose values will be merged into one column after the pivoting operation, which produces an extra row for each pivoted column. Non-pivot columns are the columns not affected by the pivot operation. As you can see, the pivoting operation reshapes the dataset, generating more rows. This is why ACCOUNTID no longer defines the uniqueness of the record and why we had to specify the extra key column PIVOT_SEQ.
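
For example, the first source row (AccountID 1, year 2015) produces four output rows, one per pivoted quarter column, roughly as follows. The names of the generated sequence, header, and data columns depend on how you fill in the Pivot transform options; QUARTER and AMOUNT are used here purely for illustration:

ACCOUNTID ACCOUNTNUMBER YEAR PIVOT_SEQ QUARTER AMOUNT
1         100           2015 1         Q1      100.00
1         100           2015 2         Q2      150.00
1         100           2015 3         Q3      120.00
1         100           2015 4         Q4      300.00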

You might ask: why pivot? Why not just use the data as is and perform the required operations on the data from columns Q1-Q4?

The answer in the given example is very simple. It is much more difficult to perform an aggregation when the amounts are spread across different columns. Instead of summarizing a single column with the sum(AMOUNT) function, we have to write the expression sum(Q1 + Q2 + Q3 + Q4) every time. And quarters are not the worst case yet: try to imagine the situation when the table has amounts stored in columns defining month periods, or when you have to filter by these time periods.
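
In SQL terms, the difference looks roughly like this (STAGE.ACCOUNTBALANCE stands for the pivoted staging table loaded by this dataflow, and AMOUNT is the illustrative data column name used above):

-- Against the original wide table, every aggregation has to spell out all quarter columns:
select AccountNumber, sum(Q1 + Q2 + Q3 + Q4) as YEAR_TOTAL
from Sales.AccountBalance
group by AccountNumber;

-- Against the pivoted table, a single column carries all the amounts:
select ACCOUNTNUMBER, sum(AMOUNT) as YEAR_TOTAL
from STAGE.ACCOUNTBALANCE
group by ACCOUNTNUMBER;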

Of course, the opposite cases exist as well, when storing data across multiple columns instead of just one is justified. In these cases, if your data structure is not like that, you can use the Reverse_Pivot transform, which does exactly the opposite thing: converting rows into columns. Look at the example of the Reverse_Pivot configuration given here:


Reverse pivoting, or the transformation of rows into columns, introduces another term: the pivot axis column. This is the column that holds the categories defining the different columns after the reverse pivot operation. It corresponds to the Header column option in the Pivot transform configuration.


Chapter 10. Developing Real-time Jobs
The recipes and topics that will be discussed in this chapter are as follows:

Working with nested structures
The XML_Map transform
The Hierarchy_Flattening transform
Configuring Access Server
Creating real-time jobs


Introduction
In all previous chapters, we have worked with batch-type job objects in Data Services. As we already know, a batch job in Data Services helps to organize ETL processes so that they can be started on demand or scheduled to be executed at a specific time, either once or regularly.

The main difference between a real-time job and a batch job is the way these two job objects are executed by the Data Services engine. The purpose of a real-time job is to process requests and provide responses. So, technically, a real-time job could be running for hours, days, or even weeks without actually processing any data. The Data Services engine actually executes the ETL code from within the real-time job object only when a new request comes from an external service. Data Services uses this request message as the data source, processes this data, and sends the processed data back to the external service in the form of a response message.

A new Data Services component called Access Server now comes into the frame. Access Server plays the role of a messenger servicing real-time jobs. It is Access Server that accepts the messages used as source data for real-time jobs and sends the response messages back.

In this chapter, we will also review the concepts of nested structures: how and when they are commonly used. The main reason for this is that real-time jobs often use XML technology to receive requests and send the responses back, and the XML format is often used to exchange nested data structures.

We will also see how to create and configure Access Server to be able to use the real-time job functionality and, finally, we will create a real-time job itself.


Working with nested structures
Earlier in this book, we worked solely with flat structures: rows extracted from database tables and inserted back into a database table, or exported to a flat text file. In this recipe, we will take a look at how to prepare a nested data structure inside a dataflow and then export it into an XML file, as XML is a simple and very convenient way to store nested data and is most commonly used for source and target objects in real-time jobs.


Getting ready
We will not need to have an XML file prepared for this recipe, as we are going to generate it automatically with the help of Data Services from datasets stored in our relational databases: OLTP and DWH.

We will construct a nested data structure of a job title list, where each record (job title) will have a reference to a list of employees who have the same job title in the OLTP system.

The following is a visual presentation of this nested data structure:
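
In outline, each job title row carries its own nested table of employee rows, roughly like this (an illustrative sketch; the actual column names are the ones we build in the steps below, and the job titles shown are just sample values):

JOBTITLE_ID  JOBTITLE    PERSON (nested schema)
1            Accountant  {FIRSTNAME, LASTNAME} - one row per employee with this title
2            Buyer       {FIRSTNAME, LASTNAME} - one row per employee with this title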

In a flat data structure, these would be two different tables, and we would have to have reference key columns in both tables linking them together in a parent-child relationship.

A nested data structure allows you to avoid reference keys completely. In other words, we do not really need a JobTitleID in order to link these two tables together. A list of employees will literally be stored in the same dataset, in one of the fields of the specific job title record.

We will source the list of job titles from the HumanResources.Employee table of our OLTP database. Person data, such as first name and last name, will be sourced from the Person.Person table, which is linked to the Employee table by the BusinessEntityID column.


How to do it…
1. Create a new dataflow, DF_OLTP_XML, and open it in the workspace window for editing.
2. Import, if necessary, the two tables Person.Person and HumanResources.Employee into the OLTP datastore.
3. Place both tables in the dataflow DF_OLTP_XML as source table objects.
4. Place the Get_Person Query transform inside the workspace of DF_OLTP_XML and link it to the Person table object. Propagate three columns (BUSINESSENTITYID, FIRSTNAME, LASTNAME) from the Person table to the output schema of the Query.
5. Create two Query transforms to get the data from the Employee table: Get_JobTitle_Person and Distinct_JobTitle.

Get_JobTitle_Person should select the dataset consisting of the two columns BUSINESSENTITYID and JOBTITLE.

Distinct_JobTitle should only select the JOBTITLE column.

6. In the Distinct_JobTitle Query Editor, tick the Distinct rows… checkbox on the SELECT tab and set up ascending sorting on the JOBTITLE column on the ORDER BY tab.
7. Create the Gen_JobTitle_ID Query transform and link Distinct_JobTitle to it. This Query transform will be used to generate new unique identifiers for the distinct values of job titles.
8. Finally, join all three Query transforms together using another Join Query and propagate four columns to the output schema: JOBTITLE_ID, JOBTITLE, FIRSTNAME, LASTNAME.


9. Now that we have merged our data from multiple tables into one dataset, let's see what is required to convert this flat dataset to a nested one.
10. To do that, we have to split the flat data again, separating job titles from employee data. Both resulting datasets should have a reference key column, which will be used to define the relationships between the records.

Create two Query transforms, Q_JobTitle and Q_Person, propagating JOBTITLE_ID in both Query objects:

11. The nesting of the data happens in the Query transform object that is used to join the previously split datasets. Create the JobTitle_Tree Query transform and link it to both Q_JobTitle and Q_Person.
12. Open the JobTitle_Tree Query Editor in the workspace window.
13. Drag and drop JOBTITLE_ID and JOBTITLE from the Q_JobTitle input schema to the output schema.
14. Drag and drop the whole Q_Person input schema to the output schema. That will place the Q_Person table schema at the same level as the JOBTITLE_ID and JOBTITLE columns. Q_Person is now a nested segment inside the JobTitle_Tree schema.
15. Now, we can switch between output schemas by double-clicking on either JobTitle_Tree or Q_Person, or you can right-click on the schema name and select Make current… from the context menu. That is necessary if you want to change settings on the Query transform tabs: Mapping, SELECT, FROM, WHERE, and so on. Those tabs are not shared by all nested output schemas, and only the "current" output schema values are displayed.
16. Make the JobTitle_Tree output schema current and select the FROM tab. Make sure that only the Q_JobTitle checkbox is selected.

17. Now, make the Q_Person output schema current.
18. On the FROM tab, tick only the Q_Person checkbox.
19. On the WHERE tab, put the following filtering condition:

(Q_Person.JOBTITLE_ID = Q_JobTitle.JOBTITLE_ID)

20. Finally, we have to output our nested dataset into a proper target object which supports nested data. SQL Server does not support nested data, and that is why we will use an XML file as a target.
21. Select the Nested Schemas Template object from the right-side tool panel and place it as a target object linked to the last JobTitle_Tree Query transform.
22. Name the target object XML_target and open it in the workspace window for editing. Specify the following options:


23. Your dataflow should now look like the following figure:


How it works…
Data Services allows you to view the target data loaded by the last job run from the XML target object, in the same way as for target database table objects, as shown in the following screenshot:

If you open the XML_target.xml file created in the C:\AW\Files\ folder, you will see a common XML structure:
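
The exact contents depend on your data, but the general shape of the generated file is along these lines (a trimmed, hand-written illustration rather than actual output; the element names follow the output schema names used in the dataflow):

<JobTitle_Tree>
  <JOBTITLE_ID>1</JOBTITLE_ID>
  <JOBTITLE>...</JOBTITLE>
  <Q_Person>
    <FIRSTNAME>...</FIRSTNAME>
    <LASTNAME>...</LASTNAME>
  </Q_Person>
  <Q_Person>
    <FIRSTNAME>...</FIRSTNAME>
    <LASTNAME>...</LASTNAME>
  </Q_Person>
</JobTitle_Tree>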

XML is just a convenient example of an object that can store a nested data structure. Data Services has other target objects that can accept nested data, such as BAPI functions and IDoc objects, both used to extract/load data from and into SAP systems. These methods and concepts will be introduced in the next chapter.

Data Services also supports the JSON format as another source or target for nested data structures.

Nested data is often called hierarchical data, as it resembles a tree structure. If you imagine row fields to be leaves, then one of the leaves could be another tree (one row or multiple rows) stored inside a leaf section.

In other words, nested data simply means mapping a source table as a column in the output object structure inside a dataflow.

In the previous chapters, we worked only with flat table or file data, where datasets consisted of multiple rows and each row consisted of multiple fields, each of which could only have one value (decimal, character, date, and so on). Nested or hierarchical data allows you to reference another table inside a row field.


Note
Converting a flat dataset to a nested dataset normalizes it, as you do not have to duplicate the parent fields for every child set of rows.

You can see how a nested table segment is displayed among the other parent columns. To define whether a nested structure can have multiple records for every parent record, you can right-click on the nested table segment and select the Repeatable menu option. Unselecting this option will make the nested segment a one-record segment and will change the icon of the nested table segment.


There is more…
Data Services has full support for nested data structures. In the steps of this recipe, we used the good old Query transform to generate it. In the next recipe, we will demonstrate how the same task can be implemented with the help of a special Data Services transform: the XML_Map transform.


The XML_Map transform
In the first recipe of this chapter, Working with nested structures, we built the nested structure with the help of the most universal transform in Data Services: the Query transform. The Query transform has the power to define column mappings, filter data, join datasets together, and merge data into nested segments. In fact, many transforms that you have used before, such as History_Preserving, Table_Comparison, Pivot, and others, can be substituted with a set of Query transforms. Of course, those would be complex ETL solutions requiring more development time; they would be harder to maintain and read, and, most importantly, less efficient in terms of performance.

In this recipe, we will take a look at another transform, XML_Map, which does exactly the same task as performed in the previous recipe: it builds and transforms nested structures.

We will use the same source tables, PERSON.PERSON and HUMANRESOURCES.EMPLOYEE, to build a dataset of job titles with nested lists of employees.


Getting ready
We have everything we need for this recipe already: two source tables, PERSON.PERSON and HUMANRESOURCES.EMPLOYEE, imported in our OLTP datastore.


How to do it…
1. Create a new job and a new dataflow and open it in the workspace.
2. Place the two tables PERSON and EMPLOYEE from the OLTP datastore inside the dataflow as source tables.
3. Drag and drop the XML_Map transform from Local Object Library | Transforms | Platform into the dataflow workspace and link both source tables to it. When placing the transform in the workspace, choose the Normal mode option.
4. Left-click on XML_Map to open it in the workspace for editing.
5. First, build the parent data structure of job titles by mapping the JOBTITLE column from the EMPLOYEE source schema to the output XML_Map schema.
6. On the Iteration Rule tab, double-click on the iteration rule field and select the EMPLOYEE input schema.
7. On the DISTINCT tab, drag and drop the EMPLOYEE.JOBTITLE source column into the Distinct columns field.
8. On the ORDER BY tab, specify Ascending sorting by the EMPLOYEE.JOBTITLE source field, as shown in the following screenshot:
9. Now, add a nested dataset containing personal information. Drag the PERSON input schema to the output and make sure that it is added on the same level as the previously propagated JOBTITLE column.
10. Double-click on the output PERSON schema to make it current, or use Make current from the context menu by right-clicking on the output PERSON schema.
11. On the Iteration Rule tab, select the INNER JOIN iteration rule and add both source input schemas underneath it.


12. On the same Iteration Rule tab, in the On field, specify the join condition: PERSON.BUSINESSENTITYID = EMPLOYEE.BUSINESSENTITYID
13. On the WHERE tab, specify the join condition between the parent and nested datasets in the output schema: EMPLOYEE.JOBTITLE = XML_Map.JOBTITLE
14. Close the XML_Map editor and link XML_Map to a Query transform object called Gen_JobTitle_ID, in which we will generate an ID column for the parent job title dataset.


Add the JOBTITLE_ID output column, as shown in the preceding screenshot, and put the mapping expression gen_row_num() for it on the Mapping tab.

15. After the Query transform, add the Nested Schemas Template object as a target object. Configure it as an XML type with the filename C:\AW\Files\XML_map.xml.


How it works…
The XML_Map transform properties are very similar to the Query transform properties, with a few exceptions where XML_Map has some extra functionality that can be used to build nested data structures.

What makes the XML_Map transform a really powerful tool is the ability to join any source input datasets (it does not matter if they come from flat data sources or nested data structures) and iterate on the combined dataset, producing the required output results.

There are multiple types of join operations available:

* (cross-join operation): This produces a Cartesian product of the joined datasets. In SQL language, it is a normal INNER JOIN without the specified ON clause.
|| (parallel-join operation): This is a non-standard SQL operation that basically concatenates the corresponding records from two joined datasets. See the example in the following figure:


INNER JOIN (standard SQL operation): This is where you can specify the join condition in the On field.
LEFT OUTER JOIN (standard SQL operation): This is where you can specify the join condition in the On field.

In the previous steps of the recipe, we produced one hierarchical dataset with the help of XML_Map, which, in fact, has two datasets in it: a parent dataset of distinct job titles sourced from the EMPLOYEE table and a nested dataset of the employees' personal information that belongs to the specific job title.

If we sourced personal information from only the PERSON table, we would not be able to specify which personal information (FIRSTNAME and LASTNAME) belongs to which job title.

By providing a joined dataset for personal information to iterate on, we could define the dependency for our nested structure by using the expression EMPLOYEE.JOBTITLE = XML_Map.JOBTITLE on the WHERE tab. This could be roughly translated as: build a dataset from the source tables that contains the fields JOBTITLE, FIRSTNAME, and LASTNAME, and nest the records with the FIRSTNAME and LASTNAME fields inside the unique records of the output job title dataset by referencing the corresponding JOBTITLE column.

The final Query transform, which is used to generate an extra output column with a unique ID for the parent job title dataset, is quite simple. We have already produced an alphabetically sorted and unique list of job titles in our parent data structure, and all that is left is to generate sequential numbers for each record, which can easily be done with the help of the gen_row_num() function.
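If you want to cross-check this parent dataset outside of Data Services, a roughly equivalent standard SQL query can be run against the OLTP database. This is only a sketch of what the XML_Map DISTINCT and ORDER BY settings plus the gen_row_num() mapping produce, not something used by the dataflow itself:

-- Sketch: distinct, alphabetically sorted job titles with sequential IDs,
-- mimicking the parent dataset and the gen_row_num() output column.
select ROW_NUMBER() over (order by JOBTITLE) as JOBTITLE_ID, JOBTITLE
from (select distinct JOBTITLE from HUMANRESOURCES.EMPLOYEE) t
order by JOBTITLE;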

Note
Note how much more concise our ETL code has become with the use of the XML_Map transform as compared to the previous recipe, where we built the same hierarchical dataset by only using Query transform objects.


The Hierarchy_Flattening transform
Sometimes, hierarchical data is not represented by nested (hierarchical) data structures but is actually stored within a simple flat structure in normal database tables or flat files. The simplest form of hierarchical relationships in data can be presented as a table that has two fields: parent and child.

Look at the example of a folder hierarchy on disk (as shown in the following figure). The structure on the left is visually simple to read and understand. You can easily see what the root folder is and what the leaves are, and can easily highlight the specific branch you are interested in.

The table on the right is the simplest way to store hierarchical relationship data in a flat format. This structure is extremely hard to query with standard SQL. Some databases, like Oracle, have special SQL clauses that can help to query hierarchical data in order to analyze it and present it in an understandable and clear way. However, those hierarchical SQL statements can be quite complex, and the majority of other databases do not support them at all, leaving you with the necessity of writing stored procedures in order to parse this hierarchical data, answering even the simplest question, like selecting all "children" for a specific "parent".
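Oracle's CONNECT BY clause is one example of such a vendor-specific extension. The following is only an illustrative sketch (it assumes a hypothetical folder_hierarchy table, like the one in the figure, where root rows have a NULL parent), showing the kind of non-portable syntax needed just to walk the tree:

-- Oracle-only sketch: traverse the tree from the root folders down,
-- indenting each node according to its level in the hierarchy.
select lpad(' ', 2 * (level - 1)) || CHILD as node, level as node_level
from folder_hierarchy
start with PARENT is null
connect by prior CHILD = PARENT;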

In this recipe, we will review the method that is available in Data Services to convert data from that simple flat presentation of parent-child relationships into a more efficient and easy-to-use data structure that can be queried with standard SQL. This can be done with the Hierarchy_Flattening transform.


Getting ready
As we do not have a multi-level parent-child relationship table, we should artificially create one. Let's build the hierarchy of locations using our three source tables from the OLTP database: ADDRESS (to source cities from), STATEPROVINCE (to source states from), and COUNTRYREGION (to source countries from). All of them are from the same Person schema of the AdventureWorks_OLTP SQL Server database.

The resulting dataset will only have two columns, PARENT and CHILD, and each row in it will represent one link of the hierarchical dataset.

1. Create a new job and create a new dataflow in it named DF_Prepare_Hierarchy.
2. Open the dataflow in the workspace window for editing, and place three source tables in it from the OLTP datastore: ADDRESS, STATEPROVINCE, and COUNTRYREGION.
3. Create the Query transform State_City and join ADDRESS and STATEPROVINCE in it using the configuration settings, as shown in the following screenshot, propagating the STATEPROVINCE.NAME and ADDRESS.CITY source columns as the output PARENT and CHILD columns respectively:
4. Create the Query transform Country_State and join COUNTRYREGION and STATEPROVINCE in it using the configuration settings, as shown in the following screenshot, propagating the COUNTRYREGION.NAME and STATEPROVINCE.NAME source columns as the output PARENT and CHILD columns respectively:


5. Merge the outputs of both the State_City and Country_State transform objects with the Merge transform.
6. Link the Merge transform output to the Hierarchy Query transform and propagate both the PARENT and CHILD columns without making any other configuration changes to the Query transform.
7. Place the target template table at the end of the dataflow object sequence to forward the result data to. Name the target table LOCATIONS_HIERARCHY and create it in the DS_STAGE datastore.

After saving and executing the job, the LOCATIONS_HIERARCHY table will be created and populated with a three-level hierarchy of locations, which includes cities, states, and countries, as shown in the following screenshot:

Now, let's see how this dataset can be flattened with the Hierarchy_Flattening transform.


How to do it…
There are two different modes in which the Hierarchy_Flattening transform parses and restructures the source hierarchical data: horizontal and vertical. They produce different results, and we will build two different dataflows, one for each of them, in order to parse and flatten the source hierarchical data and compare the final result datasets.

Horizontal hierarchy flattening
The following are the steps to perform horizontal hierarchy flattening.
1. Create a new dataflow, DF_Hierarchy_Flattening_Horizontal, and link it to the existing DF_Prepare_Hierarchy in the same job. Open it in the workspace for editing.
2. Put the LOCATIONS_HIERARCHY template table from the DS_STAGE datastore in as a source table object.
3. Link the source table to the Hierarchy_Flattening transform object, which can be found in the Local Object Library | Transforms | Data Integrator section.
4. Open the Hierarchy_Flattening transform in the workspace window and choose the horizontal method of hierarchy flattening.
5. Specify the source PARENT and CHILD columns in the corresponding transform configuration settings:
6. Close the transform editor and link the Hierarchy_Flattening transform object to the target template table LOCATIONS_TREE_HORIZONTAL created in the DS_STAGE datastore.

Vertical hierarchy flattening
The following are the steps to perform vertical hierarchy flattening.


1. Create a new dataflow, DF_Hierarchy_Flattening_Vertical, and link it to the previously created DF_Hierarchy_Flattening_Horizontal dataflow in the same job. Open it in the workspace for editing.
2. Put the LOCATIONS_HIERARCHY template table from the DS_STAGE datastore in as a source table object.
3. Link the source table to the Hierarchy_Flattening transform object, which can be found in the Local Object Library | Transforms | Data Integrator section.
4. Open the Hierarchy_Flattening transform in the workspace window and choose the vertical method of hierarchy flattening.
5. Specify the source PARENT and CHILD columns in the corresponding transform configuration settings:
6. Close the transform editor and link the Hierarchy_Flattening transform object to the target template table LOCATIONS_TREE_VERTICAL created in the DS_STAGE datastore.
7. Save and close the dataflow tab in the workspace. Your job should have three dataflows now: the first prepares the hierarchical dataset, the second flattens this dataset horizontally, and the third flattens the dataset vertically. Both result datasets are inserted into two different tables: LOCATIONS_TREE_HORIZONTAL and LOCATIONS_TREE_VERTICAL.


How it works…
The horizontal flattening result table looks like the following:

You can now see why it is called "horizontal". All levels of the hierarchy are spread across different columns horizontally.

CURRENT_LEAF shows the name of the specific node, and LEAF_LEVEL shows in which column it can be found.

The convenience of this method is that you can see the full path to the node in one row, starting from the root node, in the LEVEL columns, where LEVEL0 shows the root node.

Vertical flattening looks a bit different:

ANCESTOR and DESCENDENT are basically the same PARENT and CHILD entities, but the output result set after hierarchy flattening has a lot more records, as extra records showing the dependency were created between two nodes even if they are not related directly.

The DEPTH column shows the distance between two related nodes, where 0 means this is the same node, 1 means that the nodes are related directly, and 2 means that there is another parent node between them.

The ROOT_FLAG column flags the root nodes, and the LEAF_FLAG column flags the end leaf nodes that do not have descendants.
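These columns make typical tree questions simple to express. For example, the following query (a sketch only, assuming Canada is one of the loaded root nodes) lists every descendant of a node regardless of how deep it sits in the hierarchy:

-- List all descendants of the Canada node at any depth;
-- DEPTH > 0 excludes the self-referencing row of the node itself.
select DESCENDENT, DEPTH
from dbo.LOCATIONS_TREE_VERTICAL
where ANCESTOR = 'Canada' and DEPTH > 0
order by DEPTH, DESCENDENT;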

As you can see from the steps of this recipe, the configuration of the Hierarchy_Flattening transform is extremely simple. All that is required from you is to specify the parent and child columns that store the relationships between the neighboring nodes of the hierarchy.

Extra parameters specific to each type of hierarchy flattening are explained as follows.

Maximum depth: This exists only for the horizontal method because this method uses new columns for new levels of the hierarchy, and Data Services needs you to specify how many extra columns you want to create in your result target table. Imagine the situation when your hierarchical dataset stores an extremely deep hierarchy (100 levels or more) and you do not know about this after having looked at the unflattened hierarchy representation with only parent and child fields. In that case, a table with a few hundred columns, one for each hierarchy level, may not be what you are looking for. So, this parameter allows you to control the flattening behavior of the transform.
Use maximum length paths: This parameter is specific to only the vertical method of hierarchy flattening. It affects only the value of the DEPTH field in the result output schema. It works only in situations when there are multiple paths from the descendent to its ancestor and they are of different lengths. Selecting this option will always pick the highest number for the DEPTH field out of these multiple paths.

Querying result tables
Now, let's try to query the result tables so that you can see how easy it is now to perform analysis on the data. You can run the following queries in SQL Server Management Studio when connected to the STAGE database.

Select all root nodes of the hierarchy:

select CURRENT_LEAF from dbo.LOCATIONS_TREE_HORIZONTAL where LEAF_LEVEL = 0 order by CURRENT_LEAF;

select ANCESTOR from dbo.LOCATIONS_TREE_VERTICAL where DEPTH = 0 and ROOT_FLAG = 1 order by ANCESTOR;

Both SQL statements produce the same result: a list of 13 root nodes (we know that those are countries).

Check if the "United States" node has a leaf node "Aurora" among its dependents:

select * from dbo.LOCATIONS_TREE_HORIZONTAL where LEVEL0 = 'United States' and CURRENT_LEAF = 'Aurora';

select * from dbo.LOCATIONS_TREE_VERTICAL where ANCESTOR = 'United States' and DESCENDENT = 'Aurora';

The result returned by the two queries looks different:


You can see that the horizontal view is more convenient if you want to see the full path to the leaf node from the top root node.

The vertical view is more convenient to use in SQL queries, as you do not have to figure out which column you have to use if you want to do a specific operation on a specific level of the hierarchy. The result columns of vertical hierarchy flattening are always the same and static, whereas horizontal hierarchy flattening produces a number of columns that depends on the depth of the flattened hierarchy.

The decision of what type of hierarchy flattening to use should be made after taking into account the type of SQL queries that will be used to query this flattened data.

Note
If you have experimented with the hierarchy flattening result datasets, you have probably noticed that some queries written against the "horizontal" and "vertical" result tables produce different results and are not exactly what is expected. That happens because our parent and child columns are text fields (names of the countries, regions, and cities), and they do not represent the uniqueness of every node. For example, there is a state "Ontario" that belongs to Canada and a city "Ontario" that belongs to the state of California. Data Services does not know that these two are different nodes and considers them to be the same node (as the name value matches). You should keep that in mind and use unique identifiers for the nodes in the parent and child fields for hierarchy flattening to produce valid and consistent results.
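Before flattening, you can check whether such ambiguous node names exist in the prepared parent-child table. The following query is a simple sketch (assuming the LOCATIONS_HIERARCHY template table lives in the same STAGE database as the result tables) that lists every child value appearing under more than one parent:

-- Sketch: find node names that occur under more than one parent,
-- such as 'Ontario', which is both a Canadian state and a Californian city.
select CHILD, count(distinct PARENT) as parent_count
from dbo.LOCATIONS_HIERARCHY
group by CHILD
having count(distinct PARENT) > 1;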


Configuring Access Server
Access Server is required for real-time jobs to work. In this recipe, we will go through the steps of creating and configuring the Access Server component that will be required for our next recipe, where we are going to create our first real-time job.


Getting ready
Access Server can be created and configured with the help of two Data Services tools: Data Services Server Manager and the Data Services Management Console.


How to do it…
1. Start SAP Data Services Server Manager.
2. Go to the Access Server tab.
3. Click on the Configuration Editor button.
4. On the Access Server Configuration Editor window, click on the Add button.
5. Fill in the Access Server configuration fields, as shown in the following screenshot:
6. Do not forget to enable Access Server by ticking the corresponding option.
7. Click on OK to close and save the changes.
8. Start the SAP Data Services Management Console in your browser and log in.
9. Go to the Administrator | Management section.
10. Click on the Add button to add the previously created Access Server.
11. Specify the hostname and Access Server communication port, and click on Apply to add the Access Server.


How it works…
Access Server is a standard Data Services component that serves as a message broker: it accepts requests and messages from external systems, forwards them to Data Services real-time services for processing, and then passes the response back to the external system.

In other words, this is the key component required in order to feed real-time jobs with source data and get output data from them.

We will create a real-time job in the next recipe and explain the design process of real-time jobs in detail. In the meantime, you should only know that the main source and target objects of real-time jobs are messages (most commonly in an XML structure) and that Access Server is responsible for delivering those messages.

With the preceding steps, the Access Server service was created and enabled in the Data Services environment and is now ready to accept requests from external systems.


Creating real-time jobs
In this recipe, we will create a real-time job and emulate requests from an external system using the SoapUI testing tool in order to get the response with processed data back. We will go through all the steps required to configure the components needed for real-time jobs to work.


Getting ready
In this section, we will install the open source SoapUI tool and create a new project that will be used to send and receive SOAP messages (an XML-based format) to and from Data Services.

Installing SoapUI
You can download and install SoapUI from http://www.soapui.org/.

The installation process is very straightforward. All you have to do is follow the instructions on the screen.

After the installation is complete, start SoapUI. Use the SOAP button in the top toolbar menu to create a new SOAP project. Specify the project name and initial WSDL address, as shown in the following screenshot:

The initial WSDL address can be obtained from Data Services. To get it, log in to the Data Services Management Console, go to the Administrator section, choose Web Services, and click on the View WSDL button at the bottom of the main window.

In the newly opened window, select and copy the top URL address and paste it in the New SOAP Project configuration window, as shown in the following screenshot:


At this point, we have made the initial configuration and can proceed with actually creating real-time jobs on the Data Services side.


How to do it…
Now, we have an "external" system in place and configured to send us request messages. Remember that the Data Services component responsible for accepting these messages and sending them back is Access Server, and it was already configured by us in the previous recipe. Now, we need the last and most important component to be created and configured: the Data Services real-time job, which will process these SOAP messages and return the required result.

The goal of our real-time job will be to provide the full names of the location codes for a specific city and the postal code of the city.

1. Go to the Local Object Library | Jobs section, right-click on Real-Time Jobs, and choose New from the context menu.
2. Any real-time job is created with two default mandatory objects that define the borders of the real-time job processing section: RT_Process_begins and Step_ends.
3. Create two scripts, Init_Script and Final_Script, and place them correspondingly before and after the real-time job processing section.
4. Inside the real-time job processing section, create a dataflow and name it DF_RT_Lookup_Geography, as shown in the following figure:
5. Now, open the dataflow DF_RT_Lookup_Geography for editing in the main workspace window. First, we have to create file formats for our request and response messages.

6. Create a request file in your C:\AW\Files folder named RT_request.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Request">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="CITY" type="xsd:string"/>
        <xsd:element name="STATEPROVINCECODE" type="xsd:string"/>
        <xsd:element name="COUNTRYREGIONCODE" type="xsd:string"/>
        <xsd:element name="LANGUAGE" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>


7. Create a response file in your C:\AW\Files folder named RT_response.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Response">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="CITY" type="xsd:string"/>
        <xsd:element name="POSTALCODE" type="xsd:string"/>
        <xsd:element name="STATEPROVINCENAME" type="xsd:string"/>
        <xsd:element name="COUNTRYREGIONNAME" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

8. To create a request message file format, open Local Object Library | Formats, right-click on Nested Schemas, and choose New | XML Schema… from the context menu. Specify the following settings in the opened Import XML Schema Format window:
9. To create a response message file format, open Local Object Library | Formats, right-click on Nested Schemas, and choose New | XML Schema… from the context menu. Specify the following settings in the opened Import XML Schema Format window:


10. Import RT_Geography_request as a source object into the dataflow DF_RT_Lookup_Geography and link it to the Request Query transform, propagating all columns to the output schema. Choose the Make Message Source option when importing the object into the dataflow.
11. Import RT_Geography_response as a target object into the dataflow DF_RT_Lookup_Geography. Choose the Make Message Target option when importing the object into the dataflow.
12. Import the DIMGEOGRAPHY table object from the DWH datastore and join it with the Request Query transform using the Lookup_DimGeography Query transform. Configure the mapping settings according to the following screenshots:


13. Go to the FROM tab and configure the join conditions for a LEFT OUTER JOIN between the Request Query transform and the DIMGEOGRAPHY source table:
14. Link the Lookup_DimGeography Query transform to the target RT_Geography_response XML schema object.
15. Your dataflow should look like the following figure:
16. Save and validate the job.


How it works…
The dataflow we created in our real-time job accepts XML messages (requests) as input and produces XML messages (responses) as output.

We use the DIMGEOGRAPHY table from our data warehouse to fetch the postal code, full state/province name, and country name in either French, English, or Spanish, depending on which city and language code were received in the request message.

Basically, our real-time job serves as a lookup mechanism against the data warehouse data.
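Conceptually, for every incoming request the dataflow performs something similar to the following lookup query. This is a hedged sketch only: the city, state, and country code values are made-up request values, and which country name column is returned depends on the LANGUAGE field of the request:

-- Sketch of the lookup performed for one request message
-- (the request values 'Paris', '75', and 'FR' are purely illustrative).
select City, PostalCode, StateProvinceName, FrenchCountryRegionName
from dbo.DIMGEOGRAPHY
where City = 'Paris'
  and StateProvinceCode = '75'
  and CountryRegionCode = 'FR';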

Let's publish our real-time job as a web service and do our first test run to see how the message exchange mechanism works.

1. Open the Data Services Management Console | Administrator | Real-Time | <YourServerName>:4000 | Real-Time Services | Real-Time Service Configuration tab.
2. Click on Add to add the real-time service Lookup_Geography; use the following settings to configure it, and click on Apply when you have finished:
3. Go to the next Real-Time Services Status tab and start the service just created by selecting it and clicking on the Start button:


The icon of the real-time service should become green.

1. Go to the Administrator | Web Services | Web Services Configuration tab.
2. Select Add Real-Time Services in the combobox below and click on the Apply button on the right.
3. Select Lookup_Geography from the list and click on Add:
4. The Lookup_Geography real-time service should appear on the Web Services Status tab, as shown in the following screenshot:


5. We have successfully published our real-time job as a real-time web service. Now, open SoapUI and make sure that you can see the Lookup_Geography web service. To do that, start the SoapUI tool and expand the DSProject | Real-time_Services tab in the project tree panel.
6. Right-click on the Lookup_Geography item and choose New request from the context menu.
7. Expand Lookup_Geography and double-click on the Geography_request item.
8. You will see that a new window opens on the right showing two panels: request and response.
9. Fill in values in all the fields of the request XML structure and click on the green triangle button to submit a request. The response is received and displayed in the right panel. As you can see, it has the data from the DIMGEOGRAPHY table, which resides in the data warehouse:

Note
One of the most popular uses of real-time jobs is cleansing data through web service requests. A real-time job receives a specific value and passes it through the Data Quality transforms available in Data Services in order to cleanse it, and then returns the result in the response message.


Chapter 11. Working with SAP Applications


Introduction
This chapter is dedicated to the topic of reading and loading data from SAP systems, with the example of an SAP ERP system. Data Services provides quick and convenient methods of obtaining information from SAP applications. As this is quite a vast topic to discuss, there will be only one recipe but, nevertheless, it should cover the main aspects of extracting data from and loading data into SAP ERP.


Loading data into SAP ERP
We will not discuss the topic of configuring the SAP system to communicate with the Data Services environment, as it would require another few chapters on the subject and, most importantly, it is not the purpose of this book. All this information can be found in the detailed SAP documentation available at http://help.sap.com.

We presume that you have exactly the same Data Services and staging environments configured and created as in the previous chapters of this book, and have also installed and configured an SAP ERP system that can communicate with the Data Services job server.

In this recipe, we will go through the steps of loading information into the SAP ERP system by using Data Services. In one of our preparation processes, we will be generating the data records for insertion right in the dataflow, whereas usually you would have the data ready to be loaded in the staging area, extracted from another system.

We will also review the main SAP transactions involved in the process of manually creating data objects, monitoring the loading process on the SAP side, and the transaction which might be used for the post-load validation of loaded data.

We will be using the example of creating/loading batch data, which is related to material data in SAP ERP. First, we will create the specific material required for the batch data to be loaded. Then, we will create a test batch manually to see how it is done on the SAP side, and then we will develop ETL code in Data Services, which will prepare the batch record and send it to the SAP side.


Getting ready
The first thing we have to do is to create the material for which we will be creating batches in SAP.
1. Log in to the SAP ERP system and run the transaction MM01 to create the new material.
2. Specify Material as RAWMAT01, the Material Type, and the Industry sector:
3. Select the following views for the new material: Basic data 1, Basic data 2, Classification, and General Plant Data \ Storage 1:


4. On the next window, specify the Organization Levels:
5. On the next screen, define the Base Unit of Measure and the material description:


6. On the Sales: general/plant tab, tick the Batch management checkbox to define the material as batch managed:
7. Finally, on the Classification tab, classify the material as a raw material of class type 023:


Click on the Continue (Enter) button in the top-left corner to save and create the new material.

Now, we can manually create the first batch object for our new material so that we can later compare it to the batch object that will be generated and inserted automatically by the Data Services job.

8. Run the transaction MSC1N to create a new batch, and specify the material number and batch name that you would like to create:
9. Click on Continue, and on the next screen, fill in the values for the following fields: the Date of Manufacture of the batch, the Last Goods Receipt date, and Ctry of Origin:


10. Click on Continue to save and create the new batch, 20151009.

The last preparation step we have to complete is the configuration of a partner profile in our SAP ERP so that the system can accept the IDoc messages containing the batch data that will be sent to SAP ERP from Data Services.

11. Run the transaction WE20 to configure the partner profile.
12. On the Partner profiles window, select the Partner Type LS section and select the client you are currently using:


Make sure that your Partn. status is Active on the Classification tab and that you have BATMAS specified in the Inbound parmtrs list. If not, then click on the Create inbound parameter button under the Inbound parmtrs tab and define the BATMAS inbound parameter:


Now, everything is ready on the SAP ERP side, and all we have to do is create the Data Services job that will generate and send the data to the SAP ERP system for insertion.


How to do it…
1. Start Data Services Designer and go to Local Object Library | Datastores.
2. Right-click on the empty space of the Datastores tab and choose New from the context menu in order to create a new datastore object.
3. Create a new datastore, SAP_ERP, by specifying the datastore type SAP Applications and the database server name along with your SAP credentials.
4. Click on the Advanced button and specify the additional settings required for setting up the connection to the SAP ERP system, such as Client number and System number. See the following screenshot for the full list of configuration settings:

Click on OK to create the datastore object.

5. Import the following objects into your datastore by right-clicking on the required section for the object you want to import and choosing the Import By Name… option from the context menu:


The IDoc object BATMAS03 will be used as a target object to transfer batch data to the SAP system.

The MARA and MCH1 tables will be used as source objects to extract data from the SAP system for pre-load and post-load validation purposes.

6. Create a new job containing four linked dataflow objects, as shown in the following screenshot:
7. Open the first DF_SAP_MARA dataflow in the workspace window for editing and specify the MARA table object imported in the SAP_ERP datastore as a source and the new SAP_MARA template table in the STAGE database as a target. Propagate all columns from the source MARA table to SAP_MARA using the Query transform. Run the job once and import the target table object:
8. Open DF_Prepare_Batch_Data in the workspace window for editing.
9. Add the Row_Generation transform as a source. Set it up to generate only one record with the row number starting at 1.
10. Link it to the Create_Batch_Record Query transform, which will be used to define the fields of the created record. Use the following screenshot as a reference for column names and mappings:


11. Add another Query transform named Validate_Material, link Create_Batch_Record to it, and propagate all columns from the input schema to the output schema.
12. Add an extra column as a new function call of the lookup_ext function and configure it as shown in the following screenshot, looking up the MATNR field from the SAP_MARA table by the MATERIAL field value from the input schema:
13. Add the Validation transform, forking the dataset into three categories (Rule, Pass, and Fail) and sending the outputs to three target tables: BATCH, BATCH_REJECT, and BATCH_REJECT_RULE, as shown in the following screenshot:


14. Open the Validation transform in the workspace window for editing and add a new validation rule:
15. The Validation transform editor should look as shown in the following screenshot:


Close the dataflow and save the job.

16. Open the third dataflow, DF_Batch_IDOC_Load, in the workspace window for editing.
17. Build the structure of the dataflow, as shown in the following screenshot. The steps to configure each of the dataflow components are provided further on.
18. The Row_Generation transform should be configured to generate one record. Use the following table to define the output schema mappings in the EDI_DC40 Query transform. The following table has records only for the mandatory columns of the EDI_DC40 IDoc segment. Populate the rest of them with NULL values.


Column name   Data type     Mapping expression
TABNAM        varchar(10)   'EDI_DC40'
MANDT         varchar(3)    '100'
DOCREL        varchar(4)    '740'
DIRECT        varchar(1)    '2'
IDOCTYP       varchar(30)   'BATMAS03'
MESTYP        varchar(30)   'BATMAS'
SNDPOR        varchar(10)   'TRFC'
SNDPRT        varchar(2)    'LS'
SNDPRN        varchar(10)   'SBECLNT100'
CREDAT        date          sysdate()
CRETIM        time          systime()
ARCKEY        varchar(70)   '1'

Note
Please keep in mind that some of the values in the mapping expressions for this specific segment, EDI_DC40, are specific to your own SAP environment. Among them are MANDT and SNDPRN, which should be obtained from your SAP administrator.

To obtain the full list of columns required for the specific segment, refer to the BATMAS03 object structure itself.

19. Open the E1BATMAS Query transform in the workspace window for editing and define the following mappings for the output schema columns:

Column name   Data type     Mapping expression
MATERIAL      varchar(18)   BATCH.MATERIAL
BATCH         varchar(10)   BATCH.BATCH_NUMBER
ROW_ID        int           BATCH.ROW_ID

20. Open the E1BPBATCHATT Query transform in the workspace window for editing and define the following mappings for the output schema columns:


Column name   Data type     Mapping expression
LASTGRDATE    date          to_date(BATCH.GOODS_RECEIPT_DATE, 'YYYYMMDD')
COUNTRYORI    varchar(3)    BATCH.COUNTRY_OF_ORIGIN
PROD_DATE     date          to_date(BATCH.DATE_OF_MANUFACTURE, 'YYYYMMDD')
ROW_ID        int           BATCH.ROW_ID

21. Open the E1BPBATCHATTX Query transform in the workspace window for editing and define the following mappings for the output schema columns:

Column name   Data type     Mapping expression
LASTGRDATE    varchar(1)    'X'
COUNTRYORI    varchar(1)    'X'
PROD_DATE     varchar(1)    'X'
ROW_ID        int           BATCH.ROW_ID

22. Open the E1BPBATCHCTRL Query transform in the workspace window for editing and define the following mappings for the output schema columns:

Column name   Data type     Mapping expression
DOCLASSIFY    varchar(1)    'X'
ROW_ID        int           BATCH.ROW_ID

23. Open the IDOC_Nested_Schema Query transform in the workspace window for editing.
24. Drag and drop the EDI_DC40 and E1BATMAS segments from the input schema into the output schema of the IDOC_Nested_Schema Query transform.
25. Double-click on the output schema IDOC_Nested_Schema to make its status "current", open the FROM tab, and select only the E1BATMAS input schema. Mark the EDI_DC40 segment in the output nested schema as repeatable (the full table icon). If the segment schema is created as repeatable by default, then do not change it. Mark the E1BATMAS output schema segment as non-repeatable. To do that, make it current by double-clicking on it, and then right-click on it, unselecting the Repeatable option from the context menu. Note the difference between the output schema icons for EDI_DC40 and E1BATMAS as repeatable and non-repeatable segments.


26. Double-click on the first EDI_DC40 output segment to make its status "current". Open the FROM tab and select only the EDI_DC40 input schema:
27. Double-click on the second E1BATMAS output segment to make it current. Open the FROM tab and select only the EDI_DC40 input schema, in the same way as for the previous EDI_DC40 output schema. Also, delete the ROW_ID column from the output schema and drag and drop the rest of the input schemas, E1BPBATCHATT, E1BPBATCHATTX, and E1BPBATCHCTRL, inside the E1BATMAS output schema, creating the nesting structure:


28. Double-click on the nested E1BPBATCHATT output schema to make it current. Delete the ROW_ID column from the output schema. On the FROM tab, select the E1BPBATCHATT input schema. On the WHERE tab, specify the filtering condition: (E1BPBATCHATT.ROW_ID = E1BATMAS.ROW_ID).
29. Perform the same step for the next output segment. Double-click on the nested E1BPBATCHATTX output schema to make it current. Delete the ROW_ID column from the output schema. On the FROM tab, select the E1BPBATCHATTX input schema. On the WHERE tab, specify the filtering condition: (E1BPBATCHATTX.ROW_ID = E1BATMAS.ROW_ID).
30. Perform the same step for the next output segment. Double-click on the nested E1BPBATCHCTRL output schema to make it current. Delete the ROW_ID column from the output schema. On the FROM tab, select the E1BPBATCHCTRL input schema. On the WHERE tab, specify the filtering condition: (E1BPBATCHCTRL.ROW_ID = E1BATMAS.ROW_ID).
31. The target object BATMAS03 imported into the SAP_ERP datastore should be configured using the values shown in the following screenshot. Open the BATMAS03 target object in the dataflow in the main workspace for editing to configure it.

Close the dataflow object. Save and validate the job to make sure that you have not made any syntax errors in your dataflow design.

32. Open the last dataflow, DF_SAP_MCH1, for editing in the workspace window.
33. Add the MCH1 table from the SAP_ERP datastore as a source object.
34. Propagate all the columns from the MCH1 table to the output schema using the linked Query transform.
35. Add a new template table, SAP_MCH1, from the STAGE datastore as a target table object.
36. Save, validate, and run the job.


How it works…
The preceding steps show the common process of loading data into an SAP system using the IDoc mechanism. The load process usually consists of a few steps:

Extract master data from the SAP system to make sure that we are referencing the correct objects existing in the target system
The process of building/preparing the dataset for the load
The process of loading the data into SAP
The post-validation process of extracting the data loaded in SAP back into the staging area for validation

Let's review all these processes, each built in the form of a dataflow, in more detail.

The first dataflow, DF_SAP_MARA, extracts material data from SAP ERP for validation purposes, to make sure that we do not try to create a batch for a material that does not exist in the target SAP system.

The second dataflow, DF_Prepare_Batch_Data, prepares the batch record to be loaded in SAP. As you can see from the output schema mapping of one of the Query transforms, we prepare the batch 2015100901 to be created for material RAWMAT01. As you might remember, we have already manually created batch 20151009. The rest of the mappings show that we have also populated the Ctry of origin, Last Goods Receipt, and Date of Manufacture fields.

The third dataflow, DF_Batch_IDOC_Load, transforms the prepared batch record into the nested format of an IDoc message and sends this IDoc message to SAP. Later in this section, we will take a look at how you can monitor the process of receiving and loading IDocs on the SAP side.

Finally, the fourth dataflow, DF_SAP_MCH1, extracts the SAP table MCH1, which contains information about batches created in SAP, for post-load validation purposes. That allows us to see which batches were actually loaded in SAP and run SQL queries in our staging area to validate field values.

IDoc
IDoc is a format and transfer mechanism that SAP systems use to exchange data. Data Services utilizes this mechanism in order to send information to and receive information from SAP systems. IDocs that the SAP system receives are called inbound, and IDocs sent by SAP are called outbound. You saw that transaction WE20 was used to configure inbound IDoc parameters so that SAP could successfully accept the BATMAS IDoc messages sent to it from Data Services.

The BATMAS IDoc used to load batch data has a nested structure, and that is why we had to nest multiple datasets with the help of the Query transform. We used the artificial ID key ROW_ID to link all the nested segments together.

Keep in mind that Data Services does not load data into SAP tables directly itself. All Data Services does is prepare the data in the IDoc format so that it can be received by SAP and loaded into SAP tables using internal mechanisms/programs.

Monitoring IDoc load on the SAP side
Data Services sends IDoc messages to SAP synchronously. An IDoc message is received by SAP and then processed. Only after that does Data Services send the next IDoc message. Sometimes, this process can take quite a long time. All you will see in the trace log on the Data Services side is one record indicating that the dataflow loading the data is still running.

To see what is going on on the SAP side (how many IDocs fail and how many of them are processed successfully by SAP), you can use transaction BD87:

By expanding the BATMAS section and double-clicking on the actual IDoc record that you are interested in, you can see the data in the IDoc nested segments:

Other useful information available on this screen includes:

The status of the IDoc (processed successfully or failed)
Error messages (if failed)
Data records stored in the IDoc message (the E1BATMAS, E1BPBATCHATT, E1BPBATCHATTX, and E1BPBATCHCTRL segments)

As you can see, the EDI_DC40 segment is not visible, as it is the IDoc header itself. The information we have provided in this segment is available in the Short Technical Information panel and defines the behavior of IDoc processing.

By clicking on the Refresh button on the Status Monitor for ALE Messages screen, you can see in real time how the IDocs received by SAP are processed.

Post-load validation of loaded data
We know that one of the tables in SAP where batch master data is stored is MCH1. Knowing which physical tables are actually populated with data when you enter data manually via transactional screens, or load data coming from external systems via the IDoc mechanism, is useful, as you can always extract the contents of these tables to perform post-validation tasks.

To view our newly created batch 2015100901, we can use transaction MSC3N (Display Batch):

Or, we can see the contents of the MCH1 table directly using the SE16 transaction (Data Browser):


You can see both batches here: the one created manually and the one loaded with the help of Data Services.

Do you remember that we developed a dataflow to extract the MCH1 table to validate loaded data? Let's check the actual records extracted right after the loading process completed by browsing the contents of the SAP_MCH1 table in our staging area:

The CHARG column in the MCH1 table stores the batch number values.
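As a quick post-load check, you can also query the staging copy of MCH1 directly. The following is just a sketch; MATNR is the standard material number column of MCH1, and the two batch values are the ones created in this recipe:

-- Sketch: confirm that both the manually created batch and the batch
-- loaded through the IDoc are present in the extracted staging table.
select MATNR, CHARG
from dbo.SAP_MCH1
where CHARG in ('20151009', '2015100901');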

Tip
As technical names in SAP tables can be quite difficult to understand, you can use transaction SE11 to see the descriptions of the columns for the specific table.


There is more…
We have just scratched the surface of one of the possible methods of reading/loading data from the SAP system.

There are many other methods that can be used to communicate with SAP systems: ABAP dataflows, BAPI calls, direct RFC calls, Open Hub Tables, and many others.

Choosing between these methods usually depends on the type of tasks that have to be implemented, the amount of transferred data, and the type of SAP environment used.


Chapter 12. Introduction to Information Steward
In this chapter, we will see the following recipes:

Exploring Data Insight capabilities
Performing Metadata Management tasks
Working with the Metapedia functionality
Creating a custom cleansing package with Cleansing Package Builder


Introduction
SAP Information Steward is a separate product that is installed alongside SAP Data Services and SAP Business Intelligence and provides additional capabilities for business and IT users to analyze data quality and create cleansing packages that can enhance the data cleansing processes run by Data Services.

To cover all the functionalities of Information Steward, we would have to write another book. In this chapter, we will explore the main functions of Information Steward that have proved themselves to be the most valuable to users of the product.

All these activities relate to specific areas within the SAP Information Steward application.

Note
Log in to the SAP Information Steward application at http://localhost:8080/BOE/InfoStewardApp.

On the main page, you can see five tabs that represent the four main areas of the Information Steward product functionality, as shown in the following screenshot:


Exploring Data Insight capabilities
The Data Insight tab is the first tab, and it enables you to profile the data available from different sources, build validation rules for the data, and design a scorecard in order to see a visual representation of the quality of your data.


Getting ready
Before we log in to the SAP Information Steward application, we have to create a couple of Information Steward objects in the standard Central Management Console (CMC). The goal of this preparation step is to define the sources of data that Information Steward can connect to in order to perform data quality and analysis tasks. You can define some data sources, like a flat file, directly in the Information Steward application, but some of them should first be created as connections in the CMC Information Steward area.

1. Log in to CMC at http://localhost:8080/BOE/CMC.
2. Go to the Information Steward section.
3. Click on Connections and click on the Create connection button in the top menu.
4. Fill in all the required fields, as shown in the following screenshot, in order to create a connection object to the AdventureWorks_DWH SQL Server database:
5. Click on the Test Connection button to validate the information entered, and then click on the Save button to save the connection and exit the Create Connection screen.
6. The dwh_profile connection should appear in the list of connections that can be used in Information Steward.


7. Finally, let's create a new Data Insight project called Geography. To do that, go to the Data Insight section and click on the Create a Data Insight project button.


How to do it…
Before you start with the following steps, first log in to SAP Information Steward at http://localhost:8080/BOE/InfoStewardApp.

The common sequence of actions performed on the Data Insight tab in Information Steward includes:

Creating a connection object
Profiling the data
Viewing profiling results
Creating a validation rule
Creating a scorecard

Creating a connection object
The following steps are required to specify the source of our data for our Data Insight analysis.
1. Go to Data Insight | Geography Project.
2. Select the Workspace Home tab and click on the Add | Tables… button in the top-left corner.
3. In the opened window, select the dwh_profile connection object, then expand it, select the dbo.DimGeography table, and then click on the Add to Project button, as shown in the following screenshot:

Profiling the data
Profiling, or gathering various kinds of information about the data, can be used for data analysis.

1. To profile the data in the added DimGeography table, you can use various profiling options. Let's collect uniqueness profiling data. On the Workspace Home tab, select the DimGeography table in the dwh_profile connection and click on the Profile | Uniqueness button in the Profile Results toolbar menu.
2. In the Define Tasks: Uniqueness window, specify which columns you want to gather uniqueness profile information for. Select City and CountryRegionCode and click on the Save and Run Now button, as shown in the following screenshot:
3. To gather column profiling information, select the DimGeography table and click on the Profile | Columns button in the toolbar menu of the Workspace Home | Profile Results tab. Specify a name for the column profiling task, Geography_column_profiling, and select all profiling options: Simple, Median & Distribution, and Word Distribution. Then, click on the Save and Run Now button to create and execute the column profiling task.
4. Select the Tasks section on the left-side panel to see both of the profiling tasks created in the previous steps. You can run them at any time from this tab to refresh the profiling data according to the parameters specified.


Viewing profiling results
The following steps show you how to view the previously gathered profiling results.
1. To see the data profile results, go to the Workspace Home | Profile Results tab.
2. Expand the table you are interested in to see its columns and select it.
3. Click on the Refresh | Profile Results button in the toolbar menu.
4. Then, by clicking on the field or specific number you are interested in, you can see the detailed result for this field in the extra windows on the right-hand side of the screen and at the bottom, as shown in the following screenshot:
5. To see the results of the uniqueness profile information collected, select the Advanced view mode under the Profile Results tab.
6. In the opened window, click on the green icon in the Uniqueness column and select the key combination you have gathered information on. In our case, we have gathered uniqueness profiling information for two columns of the DimGeography table, City and CountryRegionCode, as shown in the following screenshot:


By hovering your cursor over the red zone showing the percentage of non-unique records for the selected combination of columns, you can see detailed information, such as the percentage of non-unique rows and the number of non-unique rows. In our case, it is 22.08% and 151. By clicking on the red zone, you can display the non-unique rows at the bottom of the screen.

So far, we have gathered two types of profiling information: column profile data and uniqueness profile data for the DimGeography table located in our data warehouse.

Creating a validation rule
Now, let's see how you can create a validation rule in Information Steward and display the result of applying it to the dataset in a graphical form by using scorecards.
1. On the Workspace Home | Profile Results tab, you can find a yellow icon in the Advisor column against the dbo.DimGeography table, as shown in the following screenshot:


2. Click on the yellow icon shown in the preceding screenshot to launch Data Validation Advisor:
3. We are not going to accept the validation rule suggested by Data Validation Advisor and will create our own custom validation rule.

Our custom rule will check whether the DimGeography table record has translated values in both of the columns FrenchCountryRegionName and SpanishCountryRegionName. To create a new rule, open the second vertical tab, Rules, which is next to the Workspace Home tab, and click on the New button in the toolbar menu to create a new rule.

4. Fill in all the configuration fields of the new French_Spanish_CountryRegionName rule, as shown in the following screenshot:


We have created two parameters, $French_translation and $Spanish_translation, of the varchar data type. Each parameter checks the value in one of the two columns, and on the Definition tab, we have specified the condition to be applied to the values.
5. Click on the Submit for Approval button. The rule will be sent to the Tasks tab for approval by the category of users specified in the Approver field on the Rule Editor window.
6. The rule can be approved from the My Worklist section, as shown in the following screenshot:
7. Go to the Workspace Home | Rule Results tab and click on the Bind to Rule button.
8. Bind the rule parameters to the dwh_profile.dbo.DimGeography fields, as shown in the following screenshot, and click on the Save and Close button:


9. Click on Refresh | Rule Results to see the results of applying the rule to the columns of the table specified, as shown in the following screenshot:

The left side of the screen shows the rule scores for the specified fields and the number of records that passed/failed the rule. In our example, 55 rows do not have either a French or Spanish translation in the FrenchCountryRegionName and SpanishCountryRegionName fields.

You can see the actual records in the right-side panel.

10. You can see the rule result on the Rules tab directly. All you need to do is select the rule and click on the Bind button. The rule result appears on the right side of the screen, as shown in the following screenshot:


Creating a scorecard
Scorecards are a convenient way to visualize and present historical information about validation rule results.
1. A scorecard can be created on the Scorecard Setup tab. This is a very straightforward process where you first specify the Key Data Domain and Quality Dimension, then the rule you want to include in the scorecard output, and, finally, perform rule binding to link the rule to the actual dataset, as shown in the following screenshot:
2. To view the scorecard results, go to Workspace Home and select the Scorecard view mode instead of Workspace in the combobox located in the top-right corner, as shown in the following screenshot:


How it works…
Now that we have created our connection object, gathered the profiling data, applied the validation rule, and even created a scorecard to see its results, let's look at the various aspects of the steps performed in more detail.

Profiling
As you can see, working in Information Steward is a very intuitive process.

As mentioned earlier, the Data Insight section of Information Steward is all about understanding your data, which is possible with the profiling capabilities of IS. In the majority of cases, profiling your data is the first step before starting any data quality related work. In the following section, we will review the types of profiling data available in the Profile Results section.

The value section of the profiling data shows the actual border and median values from the dataset for a specific field. String Length profiling values provide information about the size of the values. The completeness section helps you to see any gaps in the data. Distribution can be extremely useful to understand the cardinality of specific fields in your dataset. For example, seeing the number 7 in the Distribution | Value field of the profiling result data against the CountryRegionCode field, we know that we have only seven different values in that field. Clicking on that number shows us those values and their distribution in the right-hand side panel.
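If you ever want to cross-check a Distribution figure against the source data, an ordinary SQL query gives the same cardinality information. This is just a sketch run against the data warehouse table:

-- Sketch: reproduce the Distribution profiling result for CountryRegionCode;
-- the number of returned rows should match the value 7 shown by Information Steward.
select CountryRegionCode, count(*) as row_count
from dbo.DimGeography
group by CountryRegionCode
order by row_count desc;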

Rules
Rules allow you to analyze the data according to custom conditions. Rules are created with general rule parameters so that you can apply the same rule to different datasets, if necessary. Linking the rule to a specific dataset is called binding. It is the process of linking rule parameters to actual table fields.

Rules are usually defined by business users to understand how data complies with specific business requirements.
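A validation rule is essentially a reusable condition. Our French_Spanish_CountryRegionName rule could be cross-checked with a query like the following; this is a sketch only, and the exact null/empty handling of the rule definition shown in the screenshot is an assumption:

-- Sketch: count the records that would fail the rule because either
-- the French or the Spanish country name translation is missing.
select count(*) as failed_rows
from dbo.DimGeography
where FrenchCountryRegionName is null or FrenchCountryRegionName = ''
   or SpanishCountryRegionName is null or SpanishCountryRegionName = '';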

Information Steward offers a Data Validation Advisor feature that proposes preconfigured rules depending on the profiling results of your data.

Scorecards
Scorecards allow you to group your rules and help you to see trends in the data scores calculated by specific rules.


There is more…
There is much more to the Data Insight functionality than presented in this recipe. We have just scratched the surface of the basic functions available in this area of Information Steward.

It is possible to specify file formats directly in Information Steward in order to source data from flat files and from Excel spreadsheets.

Another great thing about Information Steward Data Insight is that it allows you to build data views that are based on multiple sources of information.

The intuitive and well-documented interface allows you to easily experiment and play with your data on your own. This is always a very fascinating process that does not require any deep technical knowledge of the underlying product.


Performing Metadata Management tasks
The second tool available in Information Steward after Data Insight is Metadata Management.

The Metadata Management tool is used to collect metadata information from various systems in order to get a comprehensive view of it and to analyze the relationships between metadata objects.

In this recipe, we will take a look at an example of using Metadata Management on our Data Services repository, which stores the ETL code developed for the recipes of this book.


Getting ready
As with Data Insight, we first have to establish connectivity to the Data Services repository. This is usually an administration task that can be done in CMC in the Information Steward | Metadata Management section. Click on Create an integrator source and fill in all the required fields, as shown in the following screenshot, to define a connection to the Data Services repository for the Metadata Management tool:

After creating an integrator source object, you have to run it by using the Run Now option in the object's context menu. That operation will perform the collection of metadata, or information about all the objects in the Data Services repository. Remember that any changes made to the repository after this operation will not be propagated to the collected Metadata Management snapshot, so you would need to either run it manually or schedule it to run regularly according to your requirements.

The following screenshot shows you how to use the Run Now option:


The Last Run column shows you when the integrator source data was last updated.

To see the history of runs, just select History from the integrator source object's context menu, as shown in the following screenshot:

This screen can show you how long it took to collect metadata information from the repository and even provides access to the database log of the metadata collection process, which can be used for troubleshooting any potential problems.


How to do it…
Now that we have defined the connection to our Data Services repository and collected the metadata snapshot using this connection in CMC, we can launch the Information Steward application to use the Metadata Management functionality.
1. Log in to Information Steward and go to the Metadata Management section, as shown in the following screenshot:
2. Click on the Data_Services_Repository source in the Data Integration category and, on the opened screen, look for the DimGeography table using the Search field. The Search Results section at the very bottom shows you all the possible matches, so all you have to do is select the object you need: the table from the STAGE database under the Transform schema:
3. To see the impact the table has on other objects in the ETL repository, click on the View Impact button. You should see something like the following screenshot:


You can see that the DimGeography table is used as a source to populate the other DimGeography tables (from the AdventureWorks_DWH and DWH_backup databases).

4. Click on the LINEAGE section in the same window to see the source objects for the DimGeography table of the STAGE database Transform schema, as shown in the following screenshot:

You can see that the data came from three tables: ADDRESS, COUNTRYREGION, and STATEPROVINCE.

5. By switching to Columns Mapping View, you can see the lineage information at the column level, as shown in the following screenshot:


6. Close this window to go back to the main Metadata Management working area. Now, let's define a relationship between two tables from the Data Services repository that are not directly related to each other in the ETL code: STAGE.Transform.DIMGEOGRAPHY and STAGE.Transform.DIMSALESTERRITORY. To do that, you have to select each table in the Search results section at the bottom and click on the Add to Object Tray button.
7. When both tables are added into the Object Tray, click on the Object Tray (2) link at the top of the screen (to the right of the Search field).
8. In the opened window, select both objects, as shown in the following screenshot:
9. Click on Establish Relationship and configure the desired relationship between these two objects, as shown in the following screenshot:


10. Now, if you click on the View Related To button, you can see that the relationship information appears on the screen, as shown in the following screenshot:
11. To export the information from this screen into an Excel spreadsheet, click on the Export the tabular view to an Excel file button in the top-right corner.
12. Choose the Open with Microsoft Excel option, as shown in the following screenshot:


13. The generated Excel spreadsheet could be sent to other business users, used in further analysis, or simply used as a piece of documentation for the ETL metadata.


How it works…
Metadata Management can link information provided by multiple sources in order to perform lineage and impact analysis on objects. In our example, we used only the Data Services repository, but multiple sources, such as Business Intelligence metadata, are often imported along with source database objects and the Data Services metadata. That allows you to see the full picture of what is happening to a specific dataset: from its extraction from the database, through which ETL transformations are applied to it and which target table the transformed data is loaded to, and, finally, which BI universes and BI reports use it.

On top of that, you can create custom user-defined relationships between objects that are not related to each other either directly or indirectly.


Working with the Metapedia functionality
Think of Metapedia as Wikipedia for your data. Metapedia is used to build a hierarchy of business terms and descriptions for your data, group them into categories, and even associate actual technical objects, such as pieces of ETL code and database tables, with these terms.

In this recipe, we will create a small glossary of business terms in Information Steward and learn how it can be distributed outside of the system to be updated by business users and imported back into Information Steward.


How to do it…
1. Log in to Information Steward and go to the Metapedia section.
2. Click on the New Category button to create a new category, Geography, as shown in the following screenshot:

Specify the keywords to be associated with the category for an easy search and click on the Save button to create the category.

3. Choose All Terms and click on the New button to create a new term, Postcode, as shown in the following screenshot:

Click on Save to create it and close the window.

4. Now, select the created term in the list of terms and click on Category Actions | Add to Category.


5. On the opened category list screen, select the Geography category and click on OK, as shown in the following screenshot:

6. Click on the Export Metapedia to MS Excel file button and select the All Terms option.

7. In the prompt window, select Export term description in plain text format.
8. Save the file on the disk. Now, let's perform some modifications to the file as if we are business users who have been told to create a glossary of terms and categories using this Excel spreadsheet.

9. Add the new terms on the Business Terms tab of the spreadsheet, as shown in the following screenshot:

10. Add the new categories on the Business Categories tab of the spreadsheet, as shown in the following screenshot:

11. Go back to Information Steward | Metapedia and click on Import Metapedia from MS Excel file. Specify the file modified in the previous step, as shown in the following screenshot:


Note that importing information from this spreadsheet will automatically approve all terms and will change their statuses from Editing to Approved.

12. To associate a term with actual technical objects, double-click on the specific term and click on the Actions | Associate with objects button on the term editor screen. Select the objects you want to associate with the term one by one by clicking on the Associate with term button. Click on Done after you have finished.

13. We have associated two objects, the CITY table and the $p_City parameter, from our Data Services repository with the term City, as shown in the following screenshot:


How it works…
The main function of Metapedia is to provide a glossary to browse and understand the data, presented and categorized in clear business terms. In other words, the purpose of Metapedia is to provide a clear translation of technical terms into terms that can be understood by the business.

It is a simple but very efficient solution. In this recipe, we demonstrated how a simple glossary can be created in Information Steward Metapedia, exported into a spreadsheet for distribution, and imported back with updated information.

This is very useful if you need to gather this kind of information from users who do not have the knowledge of, or access to, Information Steward to create terms and categories directly in the system.


Creating a custom cleansing package with Cleansing Package Builder
In Chapter 7, Validating and Cleansing Data (see the recipe Data Quality transforms – cleansing your data), we already used the default cleansing package PERSON_FIRM available in Data Services for data cleansing tasks.

In this recipe, we will create a new cleansing package from scratch with the help of Information Steward and publish it so that it can be used in Data Services transforms.

Our new custom cleansing package will be used to determine the type of street used in the address field of the Address table from the OLTP database.


Getting ready
The Information Steward Cleansing Package Builder tool requires a sample flat file with data that is used to define cleansing rules. The following steps describe how to prepare such a flat file with sample data.

As we are going to use our custom cleansing package to cleanse the OLTP.Address table data, we will generate our sample dataset from the same table.

1. Launch Data Services Designer and log in to the local repository.
2. Go to Local Object Library | Formats | Flat Files.
3. Right-click on the Flat Files section and create a new flat file format, PB_sample, as shown in the following screenshot:

4. Create a new job and a new dataflow. Inside the dataflow, put the OLTP.ADDRESS table as a source table.

5. Link the source table to a Query transform and propagate only the ADDRESSLINE1 column to the output schema.

6. Link the output of the Query transform object to the target file based on the PB_sample file format created earlier.


7. Save and run the job. The PB_sample.txt file should appear in the C:\AW\Files\ folder.
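What the PB_sample dataflow does is, in effect, a single-column extract. As a rough sketch only (it assumes you can run SQL directly against the OLTP copy of the ADDRESS table, and the TOP ... ORDER BY NEWID() variant is a SQL Server-specific way to pre-sample a large table), the equivalent query looks like this:

-- Sketch: the extract the PB_sample dataflow performs; save the result as
-- plain text, one value per line, to C:\AW\Files\PB_sample.txt
SELECT ADDRESSLINE1
FROM   ADDRESS;

-- Optional pre-sampling for very large tables (SQL Server syntax):
SELECT TOP 3000 ADDRESSLINE1
FROM   ADDRESS
ORDER BY NEWID();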


How to do it…
Now, after we have created a sample file, we can finally start the Information Steward application and use Cleansing Package Builder to create our new custom cleansing package.

1. Launch the Information Steward application and go to the Cleansing Package Builder area.
2. Click on New Cleansing Package | Custom Cleansing Package and specify the package name and sample data file in the first step of the package builder:

3. Step 2 of the package builder contains information which helps to parse the sample data correctly:


4. At step 3 of the package builder, you should define the number of records taken from the sample file to be used in the package design process. The maximum number of rows is 3,000. Specify the random mechanism of obtaining rows from the sample file and set the number of rows to get to 3,000.


5. Step 4 defines the parsing strategy:

6. At step 5, you can choose a category name and assign suggested attributes to it if you want to. In our example, none of the suggested attributes matches our category STREET_TYPE, so we do not tick any of them:

7. At step 6, we create attributes for our STREET_CATEGORY category and categorize the values found in the sample file against the attributes. The Standard Forms column defines the standardized form of the parsed value and the Variations column defines what variations will be standardized to the value specified in the Standard Forms window. Please see an example of the configuration for the DRIVE_ATTR attribute:


8. Another example is the STREET_ATTR attribute:

You can see how we have assigned values to the STREET standard form that are visually and syntactically very different, such as Strase and Rue.

9. After step 6, you might think you have created your package and that the job is done. This is almost true. We have just passed the basic cleansing package builder wizard steps in order to create the canvas for our new package. The real work starts when you double-click on the package in the Cleansing Package Builder area and the package editor opens. It has two main editing modes: Design and Advanced. We are not going to work with the Advanced design mode as it would take another book to cover all the aspects of fine-tuning your cleansing package in this mode.

10. In the meantime, you have probably noticed that our custom package was created with the lock icon:

11. Information Steward needs some time to finish its background processes of the package creation, so you have to wait for a couple of minutes until the icon changes to a different one:

12. Now the package is ready to be published. Select the package on the left and click on the Publish button in the toolbar menu. The clock icon on the package in the right-side panel means that Information Steward is still performing background operations in order to publish the package and make it available for usage in Data Services:

13. When the package publication is finished, the icon changes again:


14. You can continue fine-tuning your package by entering the package Design mode. This mode shows you the result of your actions immediately in the table at the bottom:


Howitworks…Let’sseehowthecleansingpackagewecreatedcouldactuallybeusedinDataServicestoperformdatacleansingtasks.

1. Start Data Services Designer.
2. Create a new job and a new dataflow.
3. Import the OLTP.ADDRESS table as a source table object.
4. Link the source table to the Query transform and propagate only the ADDRESSLINE1 column to the output schema, as we are going to perform cleansing only on this column.

5. Link the Query transform object to the Data_Cleanse transform, which can be found in Local Object Library | Transforms | Data Quality | Data_Cleanse.

6. Open the imported Data_Cleanse object for editing in the main workspace window and go to the first tab, Input.

7. Map the input ADDRESSLINE1 field to the MULTILINE1 transform input field name:

8. Go to the second tab, Options, and configure the following options, specifying our newly created Address_Custom package as the cleansing package:


9. Finally, open the third tab, Output, and define the following output columns that will be produced by the Data_Cleanse transform:

10. Close the Data_Cleanse transform object and link it to a new template table, ADDRESS_CLEANSE_STREET_TYPE, created in the DS_STAGE datastore.

11. Your dataflow should look like the one in the following figure:

After you have saved and run the job, you can see that the cleansing package has "categorized" the data: populated columns have been created for each attribute of STREET_CATEGORY:

How well a cleansing package does its job depends solely on your ability to define rules and configure it to accommodate all the possible scenarios that can be seen in your data.

For example, "Circle" has not been categorized as we simply did not define any rule regarding the "Circle" value.

This is one of the simplest cases of a cleansing task, but it should give you an idea of the Information Steward capabilities in this area.
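A quick way to review the result without browsing the table in Designer is a direct query against the target template table. The sketch below assumes the output field holding the standardized street type is called STREET_TYPE_STD_FORM; substitute whichever output column names you actually defined on the Output tab:

-- Sketch: list address lines the custom package failed to categorize
SELECT ADDRESSLINE1
FROM   ADDRESS_CLEANSE_STREET_TYPE
WHERE  STREET_TYPE_STD_FORM IS NULL;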


There is more…
Open a cleansing package by double-clicking on it and go to the Advanced mode to see how many options exist for creating and tuning cleansing rules and algorithms. You can define new rules and change the already created ones to make your cleansing process behave differently. The complexity of a cleansing package is restricted only by your imagination and by the complexity of the cleansing process requirements it has to accommodate.


Index

A

Access Server
  configuring / Configuring Access Server, How to do it…
administrative tasks
  Repository Manager, using / How to do it…
  Server Manager, using / How to do it…
  CMC, used for registering new repository / How to do it…
  License Manager, using / How to do it…
aggregate functions
  using / Using aggregate functions, How to do it…, How it works…
audit reporting
  about / Building an external ETL audit and audit reporting, How to do it…, How it works…
Autocorrect load option
  about / Exploring the Autocorrect load option, How to do it…, How it works…

B

blob data type / How it works…
bulk-load
  about / Optimizing dataflow loaders – bulk-loading methods
bulk-loading methods
  about / Optimizing dataflow loaders – bulk-loading methods, How to do it…, How it works…
  enabling / When to enable bulk loading?
bypassing feature
  using / Using the bypassing feature, How to do it…, How it works…

C

Case transform
  used, for splitting dataflow / Splitting the flow of data with the Case transform, How to do it…, How it works…
Central Configuration Management
  about / How to do it…
Central Management Console
  about / Introduction, How to do it…
Central Management Console (CMC) / Getting ready
Central Object Library
  objects, assigning to and from / Adding objects to and from the Central Object Library
central repository
  ETL code, migrating through / Migrating ETL code through the central repository, Getting ready
  objects, comparing between Local and Central / Comparing objects between the Local and Central repositories
Change Data Capture (CDC)
  about / Change Data Capture techniques
  No history SCD (Type 1) / No history SCD (Type 1)
  limited history SCD (Type 3) / Limited history SCD (Type 3)
  unlimited history SCD (Type 2) / Unlimited history SCD (Type 2)
  process, building / How to do it…, How it works…
  source-based ETL CDC / Source-based ETL CDC
  target-based ETL CDC / Target-based ETL CDC
  native / Native CDC
Cleansing Package Builder
  used, for creating custom cleansing package / Creating a custom cleansing package with Cleansing Package Builder, Getting ready, How to do it…, How it works…
client tools
  about / Introduction
  Designer tool / Introduction
  Repository Manager / Introduction
CodePlex
  URL / How to do it…
command line (cmd)
  about / How to do it…
conditional and while loop objects
  used, for controlling execution order / Using conditional and while loop objects to control the execution order, Getting ready, How to do it…, There is more…
connection object, Data Insight
  creating / Creating a connection object
continuous workflow
  using / Using a continuous workflow, How to do it…, How it works…, There is more…
conversion functions
  using / Using conversion functions, How to do it…, How it works…
custom functions
  creating / Creating custom functions, How to do it…, How it works…

D

data
  loading, into flat file / Loading data into a flat file, How to do it…, How it works…, There's more…
  loading, from flat file / Loading data from a flat file, How to do it…, How it works…, There's more…
  loading, from table to table / Loading data from table to table – lookups and joins, How to do it…, How it works…
  flow splitting, Case transform used / Splitting the flow of data with the Case transform, How to do it…, How it works…
  flow execution, monitoring / Monitoring and analyzing dataflow execution, Getting ready, How to do it…, How it works…
  flow execution, analyzing / Monitoring and analyzing dataflow execution, Getting ready, How to do it…, How it works…
  cleansing / Data Quality transforms – cleansing your data, How to do it…, How it works…, There's more…
  transforming, Pivot transform used / Transforming data with the Pivot transform, Getting ready, How to do it…, How it works…
  loading, into SAP ERP / Loading data into SAP ERP, Getting ready, How to do it…, How it works…
database environment
  preparing / Preparing a database environment, How it works…
database functions
  using / Using database functions
  key_generation() function / key_generation()
  total_rows() function / total_rows()
  sql() function / sql(), How it works…
dataflow audit
  enabling / Enabling dataflow audit, How to do it…, How it works…, There's more…
dataflow execution, optimizing
  SQL transform / Optimizing dataflow execution – the SQL transform, How to do it…, How it works…
  Data_Transfer transform / Optimizing dataflow execution – the Data_Transfer transform, How to do it…, How it works…
  Data_Transfer transform, usage / When to use Data_Transfer transform, There's more…
  performance options / Optimizing dataflow execution – performance options, How to do it…
  dataflow performance options / Dataflow performance options
  source table performance options / Source table performance options
  query transform performance options / Query transform performance options
  lookup_ext() performance options / lookup_ext() performance options
  target table performance options / Target table performance options
dataflow flow, optimizing
  push-down techniques / Optimizing dataflow execution – push-down techniques, How to do it…, How it works…
dataflow loaders, optimizing
  bulk-loading methods / Optimizing dataflow loaders – bulk-loading methods, How to do it…, How it works…
  bulk loading, enabling / When to enable bulk loading?
dataflow performance options
  about / Dataflow performance options
dataflow readers, optimizing
  lookup methods / Optimizing dataflow readers – lookup methods
  Query transform join, lookup with / Lookup with the Query transform join
  lookup_ext() function, lookup with / Lookup with the lookup_ext() function
  sql() function, lookup with / Lookup with the sql() function
  Query transform join, advantages / Query transform joins
  lookup_ext() function / lookup_ext()
  sql() function / sql()
  performance review / Performance review
Data Insight
  capabilities, exploring / Exploring Data Insight capabilities, Getting ready, How to do it…
  connection object, creating / Creating a connection object
  data, profiling / Profiling the data
  profiling results, viewing / Viewing profiling results
  validation rule, creating / Creating a validation rule
  scorecard, creating / Creating a scorecard, How it works…
  profiling / Profiling
  rules / Rules
  scorecards / Scorecards
Data Modification Language (DML) operation / Using the Map_Operation transform
Data Quality transforms
  about / Data Quality transforms – cleansing your data, How to do it…, How it works…
Data Services
  client tools / Introduction
  server-based components / Introduction
  installing / Installing and configuring Data Services, How to do it…, How it works…
  configuring / Installing and configuring Data Services, How to do it…, How it works…
  reference guide, URL / How to do it…
  auto documentation / Auto Documentation in Data Services, How to do it…, How it works…
  automatic job recovery / Automatic job recovery in Data Services, Getting ready, How to do it…, How it works…, There's more…
Data Services objects
  and parent-child relationships / Peeking inside the repository – parent-child relationships between Data Services objects, How it works…
  object types list, getting in Data Services repository / Get a list of object types and their codes in the Data Services repository
  DF_Transform_DimGeography dataflow information, displaying / Display information about the DF_Transform_DimGeography dataflow
  SalesTerritory table object information, displaying / Display information about the SalesTerritory table object
  script object content, displaying / See the contents of the script object
Data Services repository
  creating / Creating IPS and Data Services repositories, How to do it…, How it works…
  database, creating / How to do it…
  ODBC layer, configuring / How to do it…
data validation
  validation functions, creating / Creating validation functions, How to do it…, How it works…
  results, reporting / Reporting data validation results, How to do it…, How it works…
  regular expression support used / Using regular expression support to validate data, Getting ready, How to do it…, How it works…
Data_Transfer transform
  about / Optimizing dataflow execution – the Data_Transfer transform, How to do it…, How it works…
  usage / When to use Data_Transfer transform, There's more…
date functions
  using / Using date functions
  current date and time, generating / Generating current date and time
  parts, extracting from dates / Extracting parts from dates, How it works…, There's more…
Designer tool
  about / Understanding the Designer tool
  setting / How to do it…
  default options, setting / How to do it…
  ETL code, executing / Executing ETL code in Data Services
  ETL code, validating / Validating ETL code
  template tables, using / Template tables
  query transform / Query transform basics
  Hello World example / The Hello World example
Drop and re-create table option / There's more…
DS Management Console
  about / Introduction

E

ETL
  organizing / Projects and jobs – organizing ETL, How to do it…, How it works…
  projects / Projects and jobs – organizing ETL, How to do it…, How it works…
  hierarchical object view / Hierarchical object view
  history execution log files / History execution log files
  jobs, scheduling from Management Console / Executing/scheduling jobs from the Management Console
  jobs, executing from Management Console / Executing/scheduling jobs from the Management Console
ETL audit
  external ETL audit, building / Building an external ETL audit and audit reporting, How to do it…, How it works…
  built-in, using / Using built-in Data Services ETL audit and reporting functionality, How to do it…, How it works…
ETL code
  migrating, through central repository / Migrating ETL code through the central repository, Getting ready, How to do it…
  migrating, with export/import / Migrating ETL code with export/import, How to do it…
ETL execution
  simplifying, with system configurations / Simplifying ETL execution with system configurations, Getting ready, How to do it…, How it works…
ETL job
  dimension tables, populating / Use case example – populating dimension tables, How to do it…
  building / Use case example – populating dimension tables, How to do it…
  mapping, defining / Mapping
  dependencies, defining / Dependencies
  development / Development
  execution order / Execution order
  testing / Testing ETL
  test data, preparing to populate DimSalesTerritory / Preparing test data to populate DimSalesTerritory
  test data, preparing to populate DimGeography / Preparing test data to populate DimGeography
execution order
  controlling, by nesting workflows / Nesting workflows to control the execution order, How to do it, How it works…
  controlling, conditional and while loops objects used / Using conditional and while loop objects to control the execution order, Getting ready, How to do it…, How it works…, There is more…
export/import
  ETL code, migrating with / Migrating ETL code with export/import, Getting ready
  ATL files used / Import/Export using ATL files
  to local repository / Direct export to another local repository, How it works…
Extract-Transform-Load (ETL)
  about / Introduction
  advantages / Introduction

F

failures
  controlling / Controlling failures – try-catch objects, How to do it…, How it works…
flat file
  data, loading in / Loading data into a flat file, How to do it…, How it works…, There's more…
  data, loading from / Loading data from a flat file, How to do it…, How it works…
full pushdown / Getting ready

H

Hierarchy_Flattening transform
  about / The Hierarchy_Flattening transform, Getting ready
  horizontal hierarchy flattening, performing / Horizontal hierarchy flattening
  vertical hierarchy flattening / Vertical hierarchy flattening, How it works…
  result tables, querying / Querying result tables
horizontal hierarchy flattening
  about / Horizontal hierarchy flattening

I

IDoc
  about / IDoc
  load, monitoring on SAP side / Monitoring IDoc load on the SAP side
  loaded data, post-load validation / Post-load validation of loaded data
Information Platform Services (IPS)
  configuring / Installing and configuring Information Platform Services, How to do it…, How it works…
  installing / Installing and configuring Information Platform Services, How to do it…, How it works…
  / Getting ready
IPS repository
  creating / Creating IPS and Data Services repositories, How to do it…, How it works…

J

job execution
  debugging / Debugging job execution, How to do it…, How it works…
  monitoring / Monitoring job execution, How to do it…
job recovery, automatic
  in Data Services / Automatic job recovery in Data Services, How to do it…, How it works…, There's more…
join operations
  * - cross-join operation / How it works…
  || - parallel-join operation / How it works…
  INNER JOIN / How it works…
  LEFT OUTER JOIN / How it works…

K

key_generation() function / key_generation()

L

long data type / How it works…
lookup methods
  with Query transform join / Lookup with the Query transform join
  with lookup_ext() function / Lookup with the lookup_ext() function
  with sql() function / Lookup with the sql() function
lookup_ext() function
  lookup with / Lookup with the lookup_ext() function
  advantages / lookup_ext()
lookup_ext() performance options
  about / lookup_ext() performance options

M

Map_Operation transform
  using / Using the Map_Operation transform, How to do it…, How it works…
math functions
  using / Using math functions, How to do it…, There's more…
Metadata Management tasks
  performing / Performing Metadata Management tasks, Getting ready, How to do it…, How it works…
Metapedia
  working with / Working with the Metapedia functionality, How to do it…, How it works…
Microsoft SQL Server 2012
  URL / How to do it…
miscellaneous functions
  using / Using miscellaneous functions, How it works…

N

nested structures
  working with / Working with nested structures, How to do it…, How it works…, There is more…

O

object replication
  using / Using object replication, How it works…
OLTP datastore / How to do it…

P

parameters
  creating / Creating variables and parameters, How to do it…, How it works…
parent-child relationships
  between Data Services objects / Peeking inside the repository – parent-child relationships between Data Services objects, Getting ready
partial pushdown / Getting ready
performance options
  about / Optimizing dataflow execution – performance options, How to do it…
Pivot transform
  used, for transforming data / Transforming data with the Pivot transform, Getting ready, How to do it…, How it works…
profiling
  data / Profiling
profiling results, Data Insight
  viewing / Viewing profiling results
push-down operations
  about / Optimizing dataflow execution – push-down techniques, How it works…
  partial pushdown / Getting ready
  full pushdown / Getting ready

Q

Query transform join
  lookup with / Lookup with the Query transform join
query transform joins
  advantages / Query transform joins
Query transform performance options
  about / Query transform performance options

R

real-time jobs
  creating / Creating real-time jobs
  SoapUI, installing / Installing SoapUI, How to do it…, How it works…
regular expression support
  used, for validating data / Using regular expression support to validate data, Getting ready, How to do it…, How it works…
replication process
  about / How it works…
rules / Rules

S

SAP ERP
  data, loading into / Loading data into SAP ERP, Getting ready, How to do it…, How it works…
  URL / Loading data into SAP ERP
SAP Information Steward
  about / Introduction
  URL / Introduction
scorecard, Data Insight
  creating / Creating a scorecard, How it works…
scorecards / Scorecards
script
  creating / Creating a script, How to do it…, How it works…
  string functions, using / Using string functions in the script, How it works…
server-based components
  IPS Services / Introduction
  Job Server / Introduction
  access server / Introduction
  web application server / Introduction
services
  starting / Starting and stopping services, How to do it…, See also
  stopping / Starting and stopping services, How to do it…, See also
  web application server / How to do it…
  Data Services Job Server / How to do it…
  Information Platform Services / How to do it…
Slowly Changing Dimensions (SCD)
  about / Getting ready
SoapUI
  installing / Installing SoapUI, How to do it…, How it works…
  URL / Installing SoapUI
source data object
  creating / Creating a source data object, How to do it…, How it works…
source system database
  creating / Creating a source system database, There's more…
source table performance options
  about / Source table performance options
sql() function / sql(), How it works…
  lookup with / Lookup with the sql() function
  about / sql()
SQL transform
  about / Optimizing dataflow execution – the SQL transform, How to do it…, How it works…
staging area structures
  defining / Defining and creating staging area structures
  creating / Defining and creating staging area structures
  flat files / Flat files
  RDBMS tables / RDBMS tables, How it works…
string functions
  using / Using string functions, How to do it…
  using, in script / Using string functions in the script, How it works…
system configurations
  used, for simplifying ETL execution / Simplifying ETL execution with system configurations, How to do it…, How it works…

T

Table_Comparison transform
  using / Using the Table_Comparison transform, Getting ready, How to do it…, How it works…
target data object
  creating / Creating a target data object, How to do it…, There's more…
target data warehouse
  creating / Creating a target data warehouse, How it works…, There's more…
target table performance options
  about / Target table performance options
tasks
  administering / Administering tasks, How to do it…, See also
total_rows() function / total_rows()
try-catch objects
  about / Controlling failures – try-catch objects, How to do it…, How it works…

U

user access
  configuring / Configuring user access, How to do it…, How it works…

V

validation functions
  creating / Creating validation functions, How to do it…, How it works…
  using, with validation transform / Using validation functions with the Validation transform, How to do it…, How it works…
validation rule, Data Insight
  creating / Creating a validation rule
validation transform
  validation functions, using with / Using validation functions with the Validation transform, How to do it…, How it works…
variables
  creating / Creating variables and parameters, How to do it…, How it works…
vertical hierarchy flattening
  about / Vertical hierarchy flattening, How it works…

W

workflow object
  creating / Creating a workflow object, How to do it…, How it works…
workflows
  nesting, to control execution order / Nesting workflows to control the execution order, How to do it, How it works…

X

XML_Map transform
  about / The XML_Map transform, How to do it…, How it works…
