informatica basics

© Kanbay Incorporated - All Rights Reserved Informatica PowerCenter 7 Basics Training

Upload: ashwini-teegala

Post on 26-May-2017


Page 1: Informatica Basics

Informatica PowerCenter 7 Basics Training

Page 2: Informatica Basics


Course Objectives

» Understand how to use PowerCenter 7 components for development

» Be able to build basic ETL mappings

» Be able to create, run and monitor workflows

» Understand available options for loading target data

» Be able to troubleshoot most problems

Page 3: Informatica Basics

Chapter 1: Introduction and Product Overview

Page 4: Informatica Basics

PowerCenter 7 Architecture

[Architecture diagram. Components shown: Repository Manager, Designer, Workflow Manager, Workflow Monitor, Rep Server Administrative Console, Repository Server, Repository Agent, Repository, Sources, Targets. Connection labels: TCP/IP, Native.]

Not shown: Client ODBC connections for source and target metadata

Page 5: Informatica Basics


PowerCenter 7 Architecture

Page 6: Informatica Basics


PowerCenter 7 Architecture

» You can register multiple PowerCenter Servers to a repository.

» The PowerCenter Server moves data from sources to targets based on workflow and mapping metadata stored in a repository.

» The PowerCenter Server runs workflow tasks according to the conditional links connecting the tasks.

» When you have multiple PowerCenter Servers, you can assign a server to start a workflow or a session.

» This allows you to distribute the workload.

Server Grid:

» You can increase performance by using a server grid to balance the workload.

» A server grid is a server object that allows you to automate the distribution of sessions across multiple servers.

Page 7: Informatica Basics


PowerCenter 7 Architecture

» The PowerCenter Server can combine data from different platforms and source types.

» For example, you can join data from a flat file and an Oracle source.

» The PowerCenter Server can also load data to different platforms and target types.

» For example, you can load transformed data to both a flat file target and a Microsoft SQL Server database in the same session.

Page 8: Informatica Basics

PowerCenter 7 Server Connectivity

» The PowerCenter Server connects to the following Informatica platform components:
    » PowerCenter Client
    » Other PowerCenter Servers
    » Repository Server
    » Repository Agent
    » Source and target databases

» The PowerCenter Server is a repository client application.

» It connects to the Repository Server and Repository Agent to retrieve workflow and mapping metadata from the repository database.

» When the PowerCenter Server requests a repository connection from the Repository Server, the Repository Server starts and manages the Repository Agent.

» The Repository Server then re-directs the PowerCenter Server to connect directly to the Repository Agent.

» The Workflow Manager communicates directly with the PowerCenter Server over a TCP/IP connection.

Page 9: Informatica Basics


PowerCenter 7 Server Connectivity

» You create the connection by defining the port number in the Workflow Manager and the PowerCenter Server configuration.

» Use the Workflow Manager to register the PowerCenter Server in the repository.

» In a server grid, the Workflow Manager communicates directly with multiple PowerCenter Servers over TCP/IP connections.

» Each PowerCenter Server retrieves a server grid object from the repository, which it uses to connect to the other PowerCenter Servers in the grid.

» The PowerCenter Server maintains a database connection pool for stored procedures or lookup databases in a workflow.

» The PowerCenter Server allows an unlimited number of connections to lookup or stored procedure databases.

» If a database user does not have permission for the number of connections a session requires, the session fails.

» You can optionally set a parameter to limit the database connections.

Page 10: Informatica Basics


PowerCenter 7 Server Connectivity

» For a session, the PowerCenter Server holds the connection as long as it needs to read data from source tables or write data to target tables.

Page 11: Informatica Basics

Chapter 2: Designer Overview

Page 12: Informatica Basics

Designer Interface

» Designer Windows:
    » Navigator
    » Workspace
    » Status bar
    » Output
    » Overview
    » Instance Data
    » Target Data

Page 13: Informatica Basics

Designer Interface

» Designer Tools: The Designer provides the following tools:
    » Source Analyzer: Use to import or create source definitions for flat file, XML, COBOL, Application, and relational sources
    » Warehouse Designer: Use to import or create target definitions
    » Transformation Developer: Use to create reusable transformations
    » Mapplet Designer: Use to create mapplets
    » Mapping Designer: Use to create mappings

» Navigator: Use to connect to and work in multiple repositories and folders. You can also copy and delete objects and create shortcuts using the Navigator.

» Workspace: Use to view or edit sources, targets, mapplets, transformations, and mappings. You can work with a single tool at a time in the workspace. You can use the workspace in default or workbook format.

Page 14: Informatica Basics

Designer Interface

» Status bar: Displays the status of the operation you perform.

» Output: Provides details when you perform certain tasks, such as saving your work or validating a mapping. Right-click the Output window to access window options, such as printing output text, saving text to file, and changing the font size.

» Overview: An optional window to simplify viewing workbooks containing large mappings or a large number of objects. Outlines the visible area in the workspace and highlights selected objects in color. To open the Overview window, choose View-Overview Window.

» Instance Data: View transformation data while you run the Debugger to debug a mapping.

» Target Data: View target data while you run the Debugger to debug a mapping.

» You can view a list of open windows and switch from one window to another in the Designer.

Page 15: Informatica Basics

Designer Tasks

» The common tasks performed in each of the Designer tools:
    » Add a repository
    » Print the workspace
    » Open and close a folder
    » Create shortcuts
    » Check in and out repository objects
    » Search for repository objects
    » Enter descriptions for repository objects
    » Copy objects
    » Export and import repository objects
    » Work with multiple objects, ports, or columns
    » Rename ports
    » Use shortcut keys

Page 16: Informatica Basics

Chapter 3: Naming Conventions

Page 17: Informatica Basics

Naming Conventions

» It is good practice to follow naming conventions

» Conventions can be project specific, for example:
    » Workflow: wfl_ followed by workflow functionality
    » Session: s_ followed by mapping name
    » Mapping: m_ followed by mapping functionality
    » Source: Table/File name
    » Target: Table/File name
    » Ports:
        » Input & Output: Column names
        » Variable: v_ followed by functionality

Page 18: Informatica Basics

Naming Conventions - Transformations:

» Source Qualifier: sql_ (followed by source name)
» Stored Procedure: sp_ (followed by purpose of transformation)
» Sequence Generator: seq_
» Expression: exp_
» Joiner: jnr_
» Lookup: lkp_
» Filter: fil_
» Rank: rnk_
» Router: rtr_
» Update Strategy: upd_
» Aggregator: agg_
» Normalizer: nrm_
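The prefixes above can be applied mechanically. The following is a minimal Python sketch and not part of PowerCenter: the `PREFIXES` table and the `object_name` helper are hypothetical names introduced for illustration, and the lower-casing and underscore rules are assumptions, since the slides only specify the prefixes.

```python
# Prefixes taken from the naming-convention slides; everything else is illustrative.
PREFIXES = {
    "workflow": "wfl_",
    "session": "s_",
    "mapping": "m_",
    "variable_port": "v_",
    "source_qualifier": "sql_",
    "stored_procedure": "sp_",
    "sequence_generator": "seq_",
    "expression": "exp_",
    "joiner": "jnr_",
    "lookup": "lkp_",
    "filter": "fil_",
    "rank": "rnk_",
    "router": "rtr_",
    "update_strategy": "upd_",
    "aggregator": "agg_",
    "normalizer": "nrm_",
}


def object_name(kind: str, label: str) -> str:
    """Return a conventional name for a repository object (hypothetical helper)."""
    # Assumption: labels are lower-cased and spaces become underscores.
    return PREFIXES[kind] + label.strip().lower().replace(" ", "_")


mapping_name = object_name("mapping", "load employees")
# Per the convention, a session is named s_ followed by the mapping name.
session_name = PREFIXES["session"] + mapping_name
print(mapping_name, session_name)
```

A project would normally adjust the table to its own agreed prefixes; the point is only that names are derived from the object type plus a short description.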

Page 19: Informatica Basics

Chapter 4: Working With Sources and Targets

Page 20: Informatica Basics

Design Process Overview

» Create Source definition(s)
» Create Target definition(s)
» Create a Mapping
» Create a Session Task
» Create a Workflow with Task components
» Run the Workflow and verify the results

Page 21: Informatica Basics

Methods of Analyzing Sources

» To extract data from a source, you must first define sources in the repository. You can import or create the following types of source definitions in the Source Analyzer:

» Relational database

» Flat file

» COBOL file

» XML object

Page 22: Informatica Basics

Working with Relational Sources

» You can add and maintain relational source definitions for tables, views, and synonyms:
    » Import source definitions. Import source definitions into the Source Analyzer.
    » Update source definitions. Update source definitions either manually, or by re-importing the definition.

Page 23: Informatica Basics

Importing Relational Source Definitions

» You can import relational source definitions from database tables, views, and synonyms.

» When you import a source definition, you import the following source metadata:
    » Source name
    » Database location
    » Column names
    » Datatypes
    » Key constraints

Note: When you import a source definition from a synonym, you might need to manually define the constraints in the definition.

» To import a source definition, you must be able to connect to the source database from the client machine using a properly configured ODBC data source or gateway. You may also require read permission on the database object.

» You can also manually define key relationships, which can be logical relationships created in the repository that do not exist in the database.

Page 24: Informatica Basics

Importing Relational Source Definitions

To import a source definition:

1. In Source Analyzer, choose Sources-Import from Database.

Page 25: Informatica Basics


Importing Relational Source Definitions

If no table names appear or if the table you want to import does not appear, click All.

Page 26: Informatica Basics


Importing Relational Source Definitions

6. Click OK.

Page 27: Informatica Basics


Importing Relational Source Definitions

7. Choose Repository-Save

Page 28: Informatica Basics

Creating Target Definitions

» You can create the following types of target definitions in the Warehouse Designer:
    » Relational. You can create a relational target for a particular database platform. Create a relational target definition when you want to use an external loader to the target database.
    » Flat File. You can create fixed-width and delimited flat file target definitions.
    » XML File. You can create an XML target definition to output data to an XML file.

Page 29: Informatica Basics

Importing a Relational Target Definition

» When you import a target definition from a relational table, the Designer imports the following target details:
    » Target name
    » Database location
    » Column names
    » Datatypes
    » Key constraints
    » Key relationships

Page 30: Informatica Basics

Automatic Target Creation

» Drag-and-drop a Source Definition into the Warehouse Designer Workspace

Page 31: Informatica Basics


Target Definition properties

Page 32: Informatica Basics


Target Definition properties

Page 33: Informatica Basics

Metadata Extensions

» Allow developers and partners to extend the metadata stored in the repository

» Accommodate the following metadata types:
    » User-defined: PowerCenter users can define and create their own metadata
    » Vendor-defined: Third-party application vendor-created metadata lists
        » For example, applications such as PowerConnect for Siebel can add information such as contacts and version

» Can be reusable or non-reusable

» Non-reusable metadata extensions can be promoted to reusable; this is not reversible

» Reusable extensions are associated with all repository objects of that object type. A non-reusable extension is associated with a single repository object

» Administrator or Super User privileges are required for managing reusable metadata extensions

Page 34: Informatica Basics

Data Previewer

» Preview data in:
    » Relational Sources
    » Flat File Sources
    » Relational Targets
    » Flat File Targets

» The Data Preview option is available in:
    » Source Analyzer
    » Warehouse Designer
    » Mapping Designer
    » Mapplet Designer

Page 35: Informatica Basics

Data Previewer: Source Analyzer

In the Source Analyzer, open the Source drop-down menu, then choose Preview Data

Page 36: Informatica Basics

Data Previewer: Source Analyzer

You can also right-click to preview data

Page 37: Informatica Basics

Chapter 5: Mappings Overview

Page 38: Informatica Basics

Overview

» “A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation.”

» Mappings represent the data flow between sources and targets.

» When the PowerCenter Server runs a session, it uses the instructions configured in the mapping to read, transform, and write data.

» Every mapping must contain the following components:
    » Source instance: Describes the characteristics of a source table or file.
    » Transformation: Modifies data before writing it to targets. Use different transformation objects to perform different functions.
    » Target instance: Defines the target table or file.
    » Links: Connect sources, targets, and transformations so the PowerCenter Server can move the data as it transforms it.

Note:

» A mapping can also contain one or more mapplets. A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings.

Page 39: Informatica Basics


Sample Mapping

Page 40: Informatica Basics

Developing a Mapping [1/2]

» When you develop a mapping, use the following procedure as a guideline:

1. Verify that all source, target, and reusable objects are created. Create source and target definitions. If you want to use mapplets, you must create them also. You can create reusable transformations in the Transformation Developer, or you can create them while you develop a mapping.

2. Create the mapping. You can create a mapping by dragging a source, target, mapplet, or reusable transformation into the Mapping Designer workspace, or you can choose Mapping-Create from the menu.

3. Add sources and targets. Add sources and targets to the mapping.

Page 41: Informatica Basics

Developing a Mapping [2/2]

4. Add transformations and transformation logic. Add transformations to the mapping and build transformation logic into the transformation properties.

5. Connect the mapping. Connect the mapping objects to create a flow of data from sources to targets, through mapplets and transformations that add, remove, or modify data along this flow.

6. Validate the mapping. Validate the mapping to identify connection or transformation errors.

7. Save the mapping. When you save the mapping, the Designer validates it, identifying any errors. The Designer displays validation messages in the Output window. A mapping with errors is invalid, and you cannot run a session against it until you validate it.

Page 42: Informatica Basics

Transformation Concepts

» “A Transformation is a repository object that generates, modifies, or passes data.”

» The Designer provides a set of transformations that perform specific functions.

» Transformations can be active or passive.

» Transformations can be connected to the data flow, or they can be unconnected.

» An unconnected transformation is not connected to other transformations in the mapping.

» An unconnected transformation is called within another transformation, and returns a value to that transformation.

» Transformations in a mapping represent the operations the PowerCenter Server performs on the data.

» Data passes into and out of transformations through ports that you link in a mapping or mapplet.

Page 43: Informatica Basics

Transformation Concepts

» Perform the following tasks to incorporate a transformation into a mapping:

1. Create the transformation.

2. Configure the transformation.

3. Link the transformation to other transformations and target definitions.

» You can create transformations using the following Designer tools:
    » Mapping Designer:
        » Create transformations that connect sources to targets.
        » Transformations in a mapping cannot be used in other mappings unless you configure them to be reusable.
    » Transformation Developer:
        » Create individual transformations, called reusable transformations, that you can use in multiple mappings.
    » Mapplet Designer:
        » Create and configure a set of transformations, called mapplets, that you can use in multiple mappings.

Page 44: Informatica Basics

Chapter 6: Getting Help

Page 45: Informatica Basics


Startup Pages – 1/3

Page 46: Informatica Basics


Startup Pages – 2/3

Page 47: Informatica Basics


Startup Pages – 3/3

Page 48: Informatica Basics

Navigating The Online Documentation

» Informatica provides a comprehensive help manual for designers

» The entire manual can be accessed from the Help menu on the main menu bar. It offers the standard contents, index, and search views, with an option to save pages as favorites.

» Informatica also provides context-sensitive help: the Help button on any window opens the help page related to that window

Page 49: Informatica Basics

Chapter 7: Source Qualifier Transformation

Page 50: Informatica Basics

Source Qualifier Transformation

» Active Transformation
» Connected
» Ports
    » All Input/Output
» Usage
    » Modify SQL statements
    » User defined Join
    » Source Filter
    » Sorted ports
    » Select Distinct
    » Pre/Post SQL
    » Convert Data Types
    » Relational sources ONLY

Page 51: Informatica Basics

Source Qualifier Transformation

» Represents the source record set queried by the server.

» Mandatory in mappings using relational or flat file sources

Page 52: Informatica Basics

Default Query

» For relational sources, the PowerCenter Server generates a query for each Source Qualifier transformation when it runs a session.

» The default query is a SELECT statement that lists each source column used in the mapping. Thus, the PowerCenter Server reads only the columns that are connected to another transformation.

» For example, suppose the source definition has many columns but only three of them are connected to another transformation.

» In this case, the PowerCenter Server generates a default query that selects only those three columns:

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME FROM CUSTOMERS
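To make the rule concrete, here is a small Python sketch of how such a query could be assembled from the list of connected ports. It only mimics the shape of the statement shown on the slide; `default_query` is a hypothetical helper, not a PowerCenter API, and the real server may format its SQL differently.

```python
from typing import Sequence


def default_query(table: str, connected_columns: Sequence[str]) -> str:
    """Build a SELECT that lists only the connected columns (illustrative helper)."""
    select_list = ", ".join(f"{table}.{column}" for column in connected_columns)
    return f"SELECT {select_list} FROM {table}"


# Only three of the source columns are linked to a downstream transformation.
query = default_query("CUSTOMERS", ["CUSTOMER_ID", "COMPANY", "FIRST_NAME"])
print(query)
```

Connecting fewer ports therefore narrows the SELECT list, which reduces the data read from the source.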

Page 53: Informatica Basics

Joining Multiple sources

» You can use one Source Qualifier transformation to join data from multiple relational tables.

» These tables must be accessible from the same instance or database server.

» When a mapping uses related relational sources, you can join both sources in one Source Qualifier transformation.

» The default join is an inner equi-join (WHERE Src1.col_nm = Src2.col_nm) if the relationship between the tables is defined in the Source Analyzer.

» This can increase performance when source tables are indexed.

» Tip: Use the Joiner transformation for heterogeneous sources and to join flat files.
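The effect of an inner equi-join can be seen with any relational database. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the source database; the `customers` and `orders` tables and their rows are invented for illustration, and the query is hand-written in the `WHERE Src1.col = Src2.col` style described above rather than generated by PowerCenter.

```python
import sqlite3

# In-memory database standing in for a single source database instance.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, company TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme');
    INSERT INTO customers VALUES (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 250.0);
    INSERT INTO orders VALUES (11, 1, 75.5);
    INSERT INTO orders VALUES (12, 3, 40.0);
    """
)

# Inner equi-join: only rows whose key exists in both tables are returned.
joined = conn.execute(
    """
    SELECT customers.customer_id, customers.company, orders.order_id
    FROM customers, orders
    WHERE customers.customer_id = orders.customer_id
    ORDER BY orders.order_id
    """
).fetchall()
conn.close()
print(joined)
```

Customer 2 has no orders and order 12 has no matching customer, so neither appears in the result; that is the behaviour to expect from an inner join.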

Page 54: Informatica Basics


Joining Multiple sources

Page 55: Informatica Basics

Pre-SQL and Post-SQL Rules

» Pre & Post SQL statements are run against the source database
» Can use any command that is valid for the database type; no nested comments
» Can use Mapping Parameters and Variables
» Use a semi-colon to separate multiple statements
» Informatica server ignores semi-colons within single quotes, double quotes or within /* */
» To use a semi-colon outside of quotes or comments, 'escape' it with a backslash (\)
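The separator rules can be read as a small parsing routine. The Python sketch below is one possible interpretation, not the server's actual parser: `split_statements` is a hypothetical function, and dropping the backslash from an escaped semi-colon is an assumption made for this example.

```python
def split_statements(sql):
    """Split pre/post SQL text on semi-colons, following the rules listed above."""
    statements = []
    current = []
    quote = None        # the quote character we are inside, if any
    in_comment = False  # True while inside /* ... */
    i = 0
    while i < len(sql):
        ch = sql[i]
        two = sql[i:i + 2]
        if in_comment:
            if two == "*/":
                current.append(two)
                in_comment = False
                i += 2
                continue
            current.append(ch)
        elif quote:
            current.append(ch)
            if ch == quote:
                quote = None
        elif two == "/*":
            current.append(two)
            in_comment = True
            i += 2
            continue
        elif two == "\\;":
            # Escaped semi-colon: keep it as part of the statement (assumed behaviour).
            current.append(";")
            i += 2
            continue
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == ";":
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    statements.append("".join(current).strip())
    # Drop empty fragments such as the one after a trailing semi-colon.
    return [statement for statement in statements if statement]


parts = split_statements("DELETE FROM t WHERE c = 'a;b'; /* x; y */ UPDATE t SET c = 1;")
print(parts)
```

Semi-colons inside the quoted literal and inside the comment do not split the text, so the input above yields two statements.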

Page 56: Informatica Basics

Chapter 8: Hands-on Exercises - I

Page 57: Informatica Basics

Lab 1 - Setting Connections

» This is a demonstration that participants follow along with.

» This lab covers connections for the Informatica Client tools and other necessary configuration.

Page 58: Informatica Basics


Lab 2 - Creating Source Definitions

» Connect to tdbu01 Database using the SOURCE_INFA_TRN connection

» Import the Source Table Employee

Page 59: Informatica Basics


Lab 3 - Creating Target Definitions

» Connect to tdbu02 Database using the TARGET_INFA_TRN connection

» Import the Target Table Employee

Page 60: Informatica Basics

Lab 4 – Simple Mapping

» Create a mapping using Employees as the source and Employees as the target instance

» No other transformations are required.

» When you run the mapping, select a flat file as the target instead of a relational target, and use the pipe character as the delimiter

» Ensure the target file name is user specific (e.g. Student01 should use file_name01)

Page 61: Informatica Basics

Chapter 9: Transformation Objects (1)

Page 62: Informatica Basics

Active Vs Passive Transformation

Active:
    » Number of rows input may not equal number of rows output
    » Can operate on groups of data rows
    » May not be re-linked into another data stream (except into a sorted join where both flows arise from the same source qualifier)
    » e.g. Aggregator, Filter, Joiner, Rank, Normalizer, Source Qualifier, Update Strategy, Custom

Passive:
    » Number of rows input always equals number of rows output
    » Operates on one row at a time
    » May be re-linked into another data stream
    » e.g. Expression, Lookup, External Procedure, Sequence Generator, Stored Procedure
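The row-count distinction can be shown with two plain Python functions. This is an analogy rather than PowerCenter code: `passive_upper_name` stands in for an Expression-style transformation and `active_total_by_dept` for an Aggregator-style one, and the sample rows are invented.

```python
rows = [
    {"NAME": "ann", "DEPT": "A", "SALARY": 25000},
    {"NAME": "bob", "DEPT": "A", "SALARY": 40000},
    {"NAME": "cy", "DEPT": "B", "SALARY": 90000},
]


def passive_upper_name(rows):
    """Passive: works on one row at a time, so rows in always equals rows out."""
    return [dict(row, NAME=row["NAME"].upper()) for row in rows]


def active_total_by_dept(rows):
    """Active: works on groups of rows, so the output row count can differ."""
    totals = {}
    for row in rows:
        totals[row["DEPT"]] = totals.get(row["DEPT"], 0) + row["SALARY"]
    return [{"DEPT": dept, "TOTAL_SALARY": total} for dept, total in totals.items()]


passed = passive_upper_name(rows)
aggregated = active_total_by_dept(rows)
print(len(rows), len(passed), len(aggregated))
```

Three rows go into each function; the passive one returns three, while the active one returns one row per department.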

Page 63: Informatica Basics

Transformation Types Explained [1/5]

» Aggregator: Active/Connected. Performs aggregate calculations

» Application Source Qualifier: Active/Connected. Reads ERP object sources

» Custom: [Active or Passive]/Connected. Calls a procedure in a shared library or DLL.

» External Procedure: Passive/[Connected or Unconnected]. Calls a procedure in a shared library / the COM layer of Windows.

» Expression: Passive/Connected. Performs row-level calculations

Page 64: Informatica Basics

Transformation Types Explained [2/5]

» Filter: Active/Connected. Drops rows conditionally

» Input: Passive/Connected. Defines mapplet input rows. Available in the Mapplet Designer

» Joiner: Active/Connected. Joins heterogeneous sources

» Lookup: Passive/[Connected or Unconnected]. Looks up values and passes them to other objects

» Normalizer: Active/Connected. Reorganizes records from VSAM, Relational and Flat file sources

Page 65: Informatica Basics

Transformation Types Explained [3/5]

» Output: Passive/Connected. Defines mapplet output rows. Available in the Mapplet Designer

» Rank: Active/Connected. Limits records to the top or bottom of a range

» Router: Active/Connected. Splits rows conditionally

» Sequence Generator: Passive/Connected. Generates unique ID values

» Sorter: Active/Connected. Sorts data

Page 66: Informatica Basics

Transformation Types Explained [4/5]

» Source Qualifier: Active/Connected. Reads data from Flat file & Relational Sources

» Stored Procedure: Passive/[Connected or Unconnected]. Calls a database stored procedure

» Transaction Control: Active/Connected. Defines commit and rollback transactions.

» Union: Active/Connected. Merges data from different databases or flat file systems.

» Update Strategy: Active/Connected. Determines whether to insert, delete, update, or reject rows.

Page 67: Informatica Basics

Transformation Types Explained [5/5]

» XML Generator: Active/Connected. Reads data from one or more input ports & outputs XML through a single output port.

» XML Parser: Active/Connected. Reads XML from one input port and outputs data to one or more output ports.

» XML Source Qualifier: Active/Connected. Represents the rows that the PowerCenter Server reads from an XML source when it runs a session

Page 68: Informatica Basics

Transformation Views

A transformation has three views:
    » Iconized
    » Normal
    » Edit

Iconized: shows the transformation in relation to the rest of the mapping

Page 69: Informatica Basics


Transformation Views

Normal: shows the flow of data through the transformation

Edit: shows the transformation ports and the properties; allows editing

Page 70: Informatica Basics

Data Flow Rules

» Each Source Qualifier starts a single data stream (a data flow)

» Transformations can send rows to more than one transformation (split one data flow into multiple pipelines)

» Two or more data flows can converge only if they originate from a common active transformation.

Page 71: Informatica Basics

Ports & Expressions

» Ports are present in each transformation and are used to propagate the field values from the source to the target via the transformations.

» Ports are basically of 3 types:
    » Input
    » Output
    » Variable

» Port evaluation follows a top-down approach

» An expression is a calculation or conditional statement added to a transformation.

» An expression can be composed of ports, functions, operators, variables, literals, return values & constants.

Page 72: Informatica Basics

Ports - Evaluation

» Best practice recommends the following approach for port evaluation

» Input Ports:
    » Should be evaluated first
    » There is no evaluation ordering among input ports (as they do not depend on any other ports)

» Variable Ports:
    » Should be evaluated after all input ports are evaluated (as variable ports can reference any input port)
    » Variable ports can reference other variable ports also but not any output ports.
    » Ordering of variables is also very important as they can reference each other’s values.

Page 73: Informatica Basics

Ports - Evaluation

» Output Ports:
    » Should be evaluated last
    » They can reference any input port or any variable port.
    » There is no ordered evaluation of output ports (as they cannot reference each other)

Page 74: Informatica Basics

Using Variable Ports

» Also known as local variables. Used for temporary storage

» Used to simplify complex expressions
    » E.g. create and store a depreciation formula to be referenced more than once

» Used in another variable port or output port expression

» A variable port cannot also be an input or output port.

» Available in the Expression, Aggregator and Rank transformations.

» Variable ports are NOT visible in Normal view, only in Edit view

Page 75: Informatica Basics

Using Variable Ports

» The scope of variable ports is limited to a single transformation.

» Variable ports are initialized to either ‘zero’ (for numeric values) or ‘empty string’ (for character & date variables) when the Mapping logic is processed.

» They are not initialized to ‘Null’

» Variable ports can remember values across rows (useful for comparing values) & they retain their values until the next evaluation of the variable expression.

» Thus we can effectively use the order of variable ports to do procedural computation.
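The interaction between evaluation order and remembered values is easier to see in procedural code. The Python sketch below imitates an Expression transformation with one input port (`DEPT`), two variable ports (`v_is_new_dept`, `v_prev_dept`) and one output port (`o_NEW_DEPT`); all of these names are invented for illustration, and the loop is an analogy for how the ports are evaluated per row, not PowerCenter code.

```python
def flag_department_changes(rows):
    """Mark the first row of each run of identical DEPT values."""
    # Variable ports start as empty string / zero rather than NULL.
    v_prev_dept = ""
    v_is_new_dept = 0
    output = []
    for row in rows:
        dept = row["DEPT"]  # input port: evaluated first
        # Variable ports are evaluated top-down, so this comparison still
        # sees the value that v_prev_dept kept from the previous row.
        v_is_new_dept = 1 if dept != v_prev_dept else 0
        v_prev_dept = dept
        # Output port: evaluated last, may reference input and variable ports.
        output.append({"DEPT": dept, "o_NEW_DEPT": v_is_new_dept})
    return output


flags = flag_department_changes([{"DEPT": "A"}, {"DEPT": "A"}, {"DEPT": "B"}])
print([row["o_NEW_DEPT"] for row in flags])
```

If the two variable assignments were swapped, the comparison would always see the current row's value and the flag would never be set, which is why the ordering of variable ports matters.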

Page 76: Informatica Basics

Default Values – Two Usages

» For Input and I/O ports
    » Used to replace null values

» For Output ports
    » Used to handle transformation calculation errors (not-null handling)
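The two usages can be sketched in Python as follows. Both helper functions are hypothetical and only illustrate the idea: `None` stands in for a null value, and a Python exception stands in for a transformation calculation error.

```python
def input_with_default(value, default):
    """Input / I-O port usage: substitute the default when the incoming value is null."""
    return default if value is None else value


def output_with_default(expression, row, default):
    """Output port usage: substitute the default when the calculation fails."""
    try:
        return expression(row)
    except (ArithmeticError, TypeError, ValueError):
        return default


# A null quantity is replaced by 0 on the way in...
row = {"PRICE": 100.0, "QTY": input_with_default(None, 0)}
# ...and the resulting division error is replaced by -1.0 on the way out.
unit_price = output_with_default(lambda r: r["PRICE"] / r["QTY"], row, -1.0)
print(row["QTY"], unit_price)
```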

Page 77: Informatica Basics

Expressions

» Expressions can be entered at the port level (field level) or at the transformation level (row level)

» Expressions can be used in the following transformations:
    » Expression: Output Port Level
    » Aggregator: Output Port Level
    » Rank: Output Port Level
    » Filter: Transformation Level
    » Router: Transformation Level
    » Update Strategy: Transformation Level
    » Transaction Control: Transformation Level

Page 78: Informatica Basics


Expression Transformation

Page 79: Informatica Basics

Expression Transformation

» Passive Transformation
» Connected
» Ports
    » Mixed
    » Variables allowed
» Create expression in output or variable port
» Used to perform the majority of data manipulation

Page 80: Informatica Basics


Expression Transformation

Perform calculations using non-aggregate functions (row level)

Page 81: Informatica Basics

Expression Editor

» An expression formula is a calculation or conditional statement for a specific port in a transformation

» Performs calculation based on ports, functions, operators, variables, constants, and return values from other transformations

Page 82: Informatica Basics


Expression Editor

Page 83: Informatica Basics

Expression Validation

» The Validate or “OK” button in the Expression Editor will:
    » Parse the current expression
        » Remote port searching (resolves references to ports in other transformations)
    » Parse default values
    » Check spelling, correct number of arguments in functions, other syntactical errors

Page 84: Informatica Basics

Informatica Functions

» Character Functions
» Conversion Functions
» Date Functions
» Numerical Functions
» Scientific Functions
» Test Functions
» Special Functions

Page 85: Informatica Basics

Informatica Functions

Character Functions
    » Used to manipulate character data
    » InitCap returns the string value with the first letter in upper case followed by lower case

Conversion Functions
    » Used to convert data types

Page 86: Informatica Basics

Informatica Functions

Date Functions
    » Used to round, truncate, or compare dates; extract one part of the date; or perform arithmetic on a date
    » To pass a string to a date function, first use TO_DATE() to convert it to a date/time data type

Numerical Functions
    » Used to perform mathematical operations on numeric data

Page 87: Informatica Basics

Informatica Functions

Scientific Functions
    » Used to calculate geometric values of numeric data

Test Functions
    » Used to test if a lookup result is null
    » Used to validate data
        » ISNULL()
        » IS_DATE()
        » IS_NUMBER()
        » IS_SPACES()

Page 88: Informatica Basics

Informatica Functions

Special Functions

» Used to handle specific conditions within a session; search for certain values; test conditional statements

» IIF(Condition,True,False)
» ERROR()
» ABORT()
» DECODE()
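For readers more familiar with general-purpose languages, IIF and DECODE behave roughly like the two Python functions below. These are analogies written for this note, not Informatica code. `None` stands in for NULL, and one difference is flagged in a comment: Python evaluates both branch arguments before calling `iif`.

```python
def iif(condition, true_value, false_value=None):
    """Rough analogue of IIF(condition, true, false)."""
    # Note: Python evaluates both value arguments eagerly before this call.
    return true_value if condition else false_value


def decode(value, *args):
    """Rough analogue of DECODE(value, search1, result1, ..., [default])."""
    args = list(args)
    # An odd number of remaining arguments means the last one is the default.
    default = args.pop() if len(args) % 2 == 1 else None
    for search, result in zip(args[0::2], args[1::2]):
        if value == search:
            return result  # first matching search value wins
    return default


grade = iif(72 >= 50, "PASS", "FAIL")
label = decode(2, 1, "one", 2, "two", "other")
print(grade, label)
```

A nested chain of IIF calls that tests one value against several constants can usually be written more readably as a single DECODE.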

Page 89: Informatica Basics

Informatica New Functions Explained

» METAPHONE
    » Encodes string values. You can specify the length of the string that you want to encode.
    » METAPHONE encodes characters of the English language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
    » METAPHONE encodes characters according to the following list of rules:
        » Skips vowels (A, E, I, O, and U) unless one of them is the first character of the input string.
        » METAPHONE(‘CAR’) returns ‘KR’, METAPHONE(‘Lamb’) returns ‘LM’ and METAPHONE(‘AAR’) returns ‘AR’.
    » Syntax: METAPHONE( string [,length] )
    » Return Value: String/NULL

Note: METAPHONE returns NULL if the value passed is NULL, is an empty string, or does not contain any letter of the English alphabet.

Page 90: Informatica Basics

Informatica New Functions Explained

» SOUNDEX
    » Encodes a string value into a four-character string.
    » SOUNDEX works for characters in the English alphabet (A-Z). It uses the first character of the input string as the first character in the return value and encodes the remaining three unique consonants as numbers.
    » SOUNDEX encodes characters according to the following list of rules:
        » Uses the first character in string as the first character in the return value, and encodes it in uppercase. For example, both SOUNDEX(‘John’) and SOUNDEX(‘john’) return ‘J500’.
        » Encodes the first three unique consonants following the first character in string and ignores the rest. For example, both SOUNDEX(‘JohnRB’) and SOUNDEX(‘JohnRBCD’) return ‘J561’.
        » Assigns a single code to consonants that sound alike.
    » Syntax: SOUNDEX( string )
    » Return Value: String/NULL
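The SOUNDEX rules can be tried out with a short Python sketch. It follows the classic Soundex letter groups and reproduces the examples on the slide, but it is a simplified approximation written for illustration; `soundex_sketch` is a hypothetical name, and edge cases may differ from the Informatica function.

```python
# Letter-to-digit groups of the classic Soundex scheme:
# consonants that sound alike share one code.
GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"), ("L", "4"), ("MN", "5"), ("R", "6"))
CODES = {letter: digit for letters, digit in GROUPS for letter in letters}


def soundex_sketch(text):
    """Return a four-character code, or None when there is no A-Z letter to encode."""
    letters = [ch for ch in (text or "").upper() if "A" <= ch <= "Z"]
    if not letters:
        return None
    result = letters[0]  # first character is kept, in uppercase
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")  # vowels and other uncoded letters give ""
        if code and code != previous:
            result += code
            if len(result) == 4:
                break
        previous = code
    return result.ljust(4, "0")  # pad short codes with zeros


print(soundex_sketch("John"), soundex_sketch("JohnRBCD"))
```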

Page 91: Informatica Basics

Informatica Data Types

Native Data Types:
    » Specific to the source and target database types
    » Display in source and target tables within Mapping Designer

Transformation Data Types:
    » PowerCenter internal data types based on ANSI SQL-92
    » Display in transformations within Mapping Designer

Note:
    a) Transformation data types allow mix-n-match of source and target database types
    b) When connecting ports, native and transformation data types must be either compatible or explicitly converted

Page 92: Informatica Basics

Data type Conversions

» Implicit Type Conversions:
    » All numeric data can be converted to all other numeric datatypes (e.g. integer, double and decimal)
    » All numeric data types can be converted to string, and vice-versa
    » Date can be converted only to date and string, and vice-versa
    » Raw (binary) can only be linked to raw
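The four rules amount to a small compatibility check. The Python sketch below encodes them directly; the datatype names (`integer`, `double`, `decimal`, `string`, `date`, `raw`) and the function `implicitly_convertible` are simplifications chosen for this example and do not cover the full list of PowerCenter transformation datatypes.

```python
NUMERIC = {"integer", "double", "decimal"}


def family(datatype):
    """Collapse the numeric types into one family; other types stand alone."""
    return "numeric" if datatype in NUMERIC else datatype


def implicitly_convertible(source, target):
    """Apply the implicit-conversion rules listed above (illustrative only)."""
    a, b = family(source), family(target)
    if a == b:
        return True   # numeric to numeric, date to date, raw to raw, ...
    if "raw" in (a, b):
        return False  # raw can only be linked to raw
    return "string" in (a, b)  # numeric and date both convert to and from string


print(implicitly_convertible("integer", "decimal"), implicitly_convertible("date", "integer"))
```

Anything the check rejects, such as date to integer, needs an explicit conversion function in an expression.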

Page 93: Informatica Basics

93

Connect Validation

» Examples of invalid connections in a Mapping:
    » Connecting ports with incompatible data types
    » Connecting output ports to a Source
    » Connecting a Source to anything but a Source Qualifier or Normalizer Transformation
    » Connecting an output port to an output port or an input port to another input port

Page 94: Informatica Basics

94

Mapping Validation

» Mappings must
    » Be valid for a session to run
    » Be end-to-end complete and contain valid expressions
    » Pass all data flow rules

» Mappings are always validated when saved; can be validated without saving

» Output window will always display reason for invalidity

Page 95: Informatica Basics


Filter Transformation

Page 96: Informatica Basics

96

Filter Transformation

» Active Transformation

» Connected

» Ports
    » All Input/Output

» Usage
    » Filter rows from mapping/mapplet pipeline

Page 97: Informatica Basics

97

Filter Transformation

Drops rows conditionally

Use of logical operators makes the filter very effective

(e.g. SALARY > 30000 AND SALARY < 100000)
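The row-dropping behaviour of a filter condition can be pictured with a few lines of Python. This is a sketch only: the `filter_rows` helper and the employee rows are hypothetical, and only the SALARY condition comes from the slide.

```python
def filter_rows(rows, condition):
    """Keep only the rows for which the filter condition is true."""
    return [row for row in rows if condition(row)]


# Hypothetical employee rows used for illustration.
employees = [
    {"NAME": "Ann", "SALARY": 25000},
    {"NAME": "Bob", "SALARY": 45000},
    {"NAME": "Cid", "SALARY": 120000},
]

# Same logic as: SALARY > 30000 AND SALARY < 100000
kept = filter_rows(
    employees, lambda r: r["SALARY"] > 30000 and r["SALARY"] < 100000
)
```

Because rows that fail the condition never leave the transformation, the number of output rows can differ from the number of input rows, which is why the Filter is an active transformation.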

Page 98: Informatica Basics

98

Filter Transformation in a Mapping

Page 99: Informatica Basics


Router Transformation

Page 100: Informatica Basics

100

Router Transformation

Rows sent to multiple filter conditions

» Active Transformation

» Connected

» Ports
    » All input/output
    » Specify filter conditions for each Group

» Used to link source data in one pass to multiple filter conditions

Page 101: Informatica Basics

101

Router Groups

» Input group (always one)

» User-defined groups

» Each group has one condition

» All group conditions are evaluated for each row

» One row can pass multiple conditions

» Unlinked group outputs are ignored

» Default group (always one) can capture rows that fail all Group conditions
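The group rules above can be sketched in Python. The `route` helper, the group names and the sample customers are assumptions made for illustration; the Country conditions mirror the ones used later in Lab 7.

```python
def route(rows, groups):
    """Send each row to every group whose condition it passes.

    Rows that fail all conditions go to the DEFAULT group.
    """
    outputs = {name: [] for name in groups}
    outputs["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):  # all group conditions are evaluated
                outputs[name].append(row)
                matched = True
        if not matched:
            outputs["DEFAULT"].append(row)
    return outputs


# Hypothetical customer rows and group names.
customers = [
    {"ID": 1, "Country": "USA"},
    {"ID": 2, "Country": "Germany"},
    {"ID": 3, "Country": "France"},
]
routed = route(customers, {
    "USA": lambda r: r["Country"] == "USA",
    "GERMANY": lambda r: r["Country"] == "Germany",
})
```

The source rows are read once and every condition is checked against each row, which is the difference from chaining several Filter transformations off the same source.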

Page 102: Informatica Basics

102

Router Group Filter Conditions

Page 103: Informatica Basics

103

Using Router in a mapping

Page 104: Informatica Basics


Workflows - I
Chapter 10

Page 105: Informatica Basics

105

Workflow Manager Tools

Workflow Designer

» Maps the execution order and dependencies of Sessions, Tasks & Worklets, for the Informatica Server

Task Developer

» Create Session, Shell Command and Email Tasks

» Tasks created in the Task Developer are reusable

Worklet Designer

» Creates objects that represent a set of tasks

» Worklet objects are reusable

Page 106: Informatica Basics

106

Workflow Manager Interface

e.g. The simplest Workflow

Page 107: Informatica Basics

107

Workflow - Overview

» A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data.

» The PowerCenter Server runs workflow tasks according to the conditional links connecting the tasks.

» You can run a task by placing it in a workflow.

» Workflow Manager is used to develop and manage workflows.

» Workflow Monitor is used to monitor workflows and stop the PowerCenter Server.

» When a workflow starts, the PowerCenter Server retrieves mapping, workflow, and session metadata from the repository to extract data from the source, transform it, and load it into the target.

» It also runs the tasks in the workflow.

» You can run as many sessions in a workflow as you need.

» You can run the Session tasks sequentially or concurrently, depending on your needs.

Page 108: Informatica Basics

108

Session Overview

» A session is a set of instructions that tells the PowerCenter Server how and when to move data from sources to targets.

» A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation.

» A session is a type of task, similar to other tasks available in the Workflow Manager.

» In the Workflow Manager, you configure a session by creating a Session task.

» To run a session, you must first create a workflow to contain the Session task.

Page 109: Informatica Basics

109

Designing & Developing Workflows

Create a new Workflow in Workflow Designer

» Specify Workflow name, and select a Server

Customize Workflow Properties

» Workflow log displays

» Set and customize workflow-specific schedule

» Metadata Extensions provide for additional user data

Building Workflow Components

» Add sessions and other Tasks to the workflow

» Connect all Workflow components with Links

» Save the workflow

» Start the workflow

Page 110: Informatica Basics

110

Workflow Designer - Links

» Required to connect Workflow Tasks

» Can be used to create branches in a Workflow

» All links are executed unless a link condition evaluates to false

» Links connecting the tasks in a workflow are not allowed to form a closed loop
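The "no closed loop" rule means the tasks and links of a workflow must form a directed graph without cycles. A minimal Python sketch of such a check follows; the `has_closed_loop` helper and the task names are hypothetical, and this is not how the Workflow Manager itself validates links.

```python
def has_closed_loop(links):
    """Return True if the (from_task, to_task) links contain a cycle."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    VISITING, DONE = 1, 2
    state = {}

    def visit(task):
        if state.get(task) == VISITING:
            return True  # reached a task that is still on the current path
        if state.get(task) == DONE:
            return False
        state[task] = VISITING
        if any(visit(nxt) for nxt in graph[task]):
            return True
        state[task] = DONE
        return False

    return any(visit(task) for task in graph)


# Hypothetical task names: a Start task followed by two sessions.
valid_links = [("Start", "s_load_customers"), ("s_load_customers", "s_load_orders")]
looped_links = valid_links + [("s_load_orders", "s_load_customers")]
```

Branches are fine under this rule, since one task may link to several others; only a path that leads back to a task already on it is rejected.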

Page 111: Informatica Basics


Session Task

Page 112: Informatica Basics

112

Session Task

» Server instructions to run the logic of ONE specific Mapping
    » E.g. source and target data location specifications, memory allocation, optional Mapping overrides, scheduling, processing and load instructions

» Becomes a component of a Workflow or Worklet

» If configured in the Task Developer, the Session Task is reusable

» When a session is to be created, valid mappings are displayed in the dialog box

Page 113: Informatica Basics

113

Session Task

» Session Task Tabs:
    » General
    » Properties
    » Config Object
    » Mapping
    » Components
    » Metadata Extensions

Page 114: Informatica Basics

114

Session Task

Page 115: Informatica Basics

115

Session Task

Page 116: Informatica Basics

116

Validating a Session

» The Workflow Manager validates a Session task when you save it.

» You can also manually validate Session tasks and session instances.

» Validate reusable Session tasks in the Task Developer.

» Validate non-reusable sessions and reusable session instances in the Workflow Designer.

Page 117: Informatica Basics


Workflow Monitor Overview
Chapter 11

Page 118: Informatica Basics

118

Monitor Workflows

» The Workflow Monitor is the tool for monitoring Workflows and Tasks

» Review details about a Workflow or Tasks in two views:
    » Gantt Chart view
    » Task view

» The Workflow Monitor displays Workflows that have been run at least once

Page 119: Informatica Basics

119

Gantt Chart View

Page 120: Informatica Basics

120

Task View

Page 121: Informatica Basics

121

Monitoring Workflows

» Perform operations in the Workflow Monitor
    » Restart: restart a Task, Workflow or Worklet
    » Stop: stop a Task, Workflow or Worklet
    » Abort: abort a Task, Workflow or Worklet
    » Resume: resume a suspended Workflow after a failed Task is corrected

» View Session and Workflow logs

» Abort has a 60 second timeout
    » If the Server has not completed processing and committing data during the timeout period, the threads and processes associated with the Session are killed.

Page 122: Informatica Basics

122

Monitoring Workflows -Task view

» Task View
    » Start, Stop, Abort, Resume Tasks, Workflows and Worklets

» Workflow Monitoring Filtering
    » Task view provides filtering
    » Monitoring filters can be set using drop-down menus

» Truncating Monitor Logs
    » The Repository Manager “Truncate Log” option clears the Workflow Monitor logs

Page 123: Informatica Basics


Hands on Exercises - II Chapter 12

Page 124: Informatica Basics

124

Lab 5 - Expression Transformation

» Create a mapping using the Employee flat file as source, DIM_EMPLOYEE as the target

» Concatenate First Name and Last Name to get Employee Name

» Ensure all leading and trailing spaces are removed for character columns

» Use NEXTVAL of Sequence Generator transformation to connect to Employee_wk

» Target load will be truncate/load.

» Do not connect geography_wk, region_nk, region_name and direct_report_wk

Page 125: Informatica Basics

125

Lab 6 - Filter Transformation

» Create copy of Lab 5 mapping for LAB 6

» Add a Filter to the mapping to filter out all records having Region as NULL, set audit_id = 0

» Target load will be truncate/load

Page 126: Informatica Basics

126

Lab 7 - Using Router Transformation

» Create a mapping using Customer as Source

» Add a Router to have groups by Customer, based on Country = ‘USA’, Country = ‘Germany’ and all others

» Load to 3 instances of the DIM_CUSTOMER table

Page 127: Informatica Basics


Using the Debugger Chapter 13

Page 128: Informatica Basics

128

Debugger Features

» Debugger is a wizard-driven tool
    » View source/target data
    » View transformation data
    » Set breakpoints and evaluate expressions
    » Initialize variables
    » Manually change variable values

» Debugger is Session Driven
    » Data can be loaded or discarded
    » Debug environment can be saved for later use

Page 129: Informatica Basics

129

Debugger Features

» You can debug a valid mapping to gain troubleshooting information about data and error conditions.

» To debug a mapping, you configure and run the Debugger from within the Mapping Designer.

» The Debugger uses a session to run the mapping on the PowerCenter Server.

» When you run the Debugger, it pauses at breakpoints and allows you to view and edit transformation output data.

» You might want to run the Debugger in the following situations:
    » Before you run a session
    » After you run a session

Page 130: Informatica Basics

130

Debugger Session Types

» You can select three different debugger session types when you configure the Debugger.

» The Debugger runs a workflow for each session type.

» You can choose from the following Debugger session types when you configure the Debugger:
    » Use an existing non-reusable session for the mapping
    » Use an existing reusable session for the mapping
    » Create a debug session instance for the mapping

Page 131: Informatica Basics

131

Debugger Session Types

Use an existing non-reusable session for the mapping:

» The Debugger uses existing source, target, and session configuration properties.

» When you run the Debugger, the PowerCenter Server runs the non-reusable session and the existing workflow.

» The Debugger does not suspend on error.

Use an existing reusable session for the mapping:

» The Debugger uses existing source, target, and session configuration properties.

» When you run the Debugger, the PowerCenter Server runs a debug instance of the reusable session and creates and runs a debug workflow for the session.

Create a debug session instance for the mapping:

» The Debugger Wizard allows you to configure source, target, and session configuration properties.

» When you run the Debugger, the PowerCenter Server runs a debug instance of the debug workflow and creates and runs a debug workflow for the session.

Page 132: Informatica Basics

132

Debug Process

1. Create breakpoints

» You create breakpoints in a mapping where you want the PowerCenter Server to evaluate data and error conditions.

2. Configure the Debugger

» Use the Debugger Wizard to configure the Debugger for the mapping.

» Select the session type the PowerCenter Server uses when it runs the Debugger.

» When you create a debug session, you configure a subset of session properties within the Debugger Wizard, such as source and target location.

» You can also choose to load or discard target data.

Page 133: Informatica Basics

133

Debug Process

3. Run the Debugger

» Run the Debugger from within the Mapping Designer.

» When you run the Debugger, the Designer connects to the PowerCenter Server.

» The PowerCenter Server initializes the Debugger and runs the debugging session and workflow.

» The PowerCenter Server reads the breakpoints and pauses the Debugger when the breakpoints evaluate to true.

4. Monitor the Debugger

» While you run the Debugger, you can monitor the target data, transformation and Mapplet output data, the debug log, and the session log.

» When you run the Debugger, the Designer displays the following windows:
    » Debug log. View messages from the Debugger.
    » Target window. View target data.
    » Instance window. View transformation data.

Page 134: Informatica Basics

134

Debug Process

5. Modify data and breakpoints

» When the Debugger pauses, you can modify data and see the effect on transformations, Mapplets, and targets as the data moves through the pipeline.

» You can also modify breakpoint information.

» The Designer saves mapping breakpoint and Debugger information in the workspace files.

» You can copy breakpoint information and the Debugger configuration to another mapping.

» If you want to run the Debugger from another PowerCenter Client machine, you can copy the breakpoint information and the Debugger configuration to the other PowerCenter Client machine.

Page 135: Informatica Basics

135

Debugger Interface

Page 136: Informatica Basics

136

Creating Breakpoints

» Use the Breakpoint Editor in the Mapping Designer to create breakpoint conditions in a mapping.

» You can create data or error breakpoints.

» When you run the Debugger, the PowerCenter Server pauses the Debugger when a breakpoint evaluates to true.

» A breakpoint can consist of an instance name, a breakpoint type, and a condition.

» When you enter breakpoints, set breakpoint parameters in the following order:

1. Select the instance name.
2. Select the breakpoint type.
3. Enter the condition.

Page 137: Informatica Basics

137

Breakpoints Editor

Page 138: Informatica Basics

138

Debugger Tips

» Server must be running before starting a Debug Session

» When the Debugger is started, a spinning icon displays. Spinning stops when Debugger Server is ready

» Flashing yellow/green arrow points to the current active Source Qualifier. Solid yellow arrow points to the current Transformation instance

» “Next Instance” is a single step at a time: one row moves from transformation to transformation

» “Step to Instance” examines one transformation at a time: one row after another passes through the same transformation

Page 139: Informatica Basics


Transformations in Depth - II Chapter 14

Page 140: Informatica Basics


Target Instances

Page 141: Informatica Basics

141

Target Instances

» A single mapping can have more than one instance of the same target

» The data would be loaded into the instances in bulk mode like a pipeline

» Usage of multiple instances of the same target for loading is dependent on the RDBMS in use. Multiple instances may not be used if the underlying database locks the entire table while inserting records

Page 142: Informatica Basics

142

Target Instances - example

Page 143: Informatica Basics


Joiner Transformation

Page 144: Informatica Basics

144

Joiner Transformation

» Active/Connected

» Ports
    » Input
    » Output
    » Master

Page 145: Informatica Basics

145

Join Types

Homogeneous Joins

» Joins that can be performed with a SQL SELECT statement

» Source Qualifier contains a SQL join

» Tables on same database server (or are synonyms)

» Database server does the join “work”

» Multiple homogeneous tables can be joined

Heterogeneous Joins

» Examples of joins that cannot be done with an SQL statement:
    » An Oracle table and a DB2 table
    » Two flat files
    » A flat file and a database table

Page 146: Informatica Basics

146

Heterogeneous Joins

Page 147: Informatica Basics

147

Joiner Properties

» Join Types:
    » “Normal” (inner)
    » Master Outer
    » Detail Outer
    » Full Outer

» Joiner can accept sorted data (configure the join condition to use the sort origin ports)

» Joiner Conditions & Nested Joins:
    » Multiple Join conditions are supported
    » Used to join three or more heterogeneous sources
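The four join types can be compared with a small Python sketch. It assumes the usual Joiner meaning of the names: Master Outer keeps all detail rows plus matching master rows, and Detail Outer keeps all master rows plus matching detail rows. The `join` helper and the sample rows are hypothetical.

```python
def join(master, detail, key, join_type="normal"):
    """Join two lists of dict rows on a single key column."""
    detail_keys = {d[key] for d in detail}
    result = []
    for d in detail:
        matches = [m for m in master if m[key] == d[key]]
        for m in matches:
            result.append({**m, **d})
        # Master Outer and Full Outer keep detail rows without a master match.
        if not matches and join_type in ("master_outer", "full_outer"):
            result.append(dict(d))
    # Detail Outer and Full Outer keep master rows without a detail match.
    if join_type in ("detail_outer", "full_outer"):
        for m in master:
            if m[key] not in detail_keys:
                result.append(dict(m))
    return result


# Hypothetical sources: a small master (regions) and a detail (employees).
regions = [{"id": 1, "region": "East"}, {"id": 2, "region": "West"}]
staff = [{"id": 1, "name": "Ann"}, {"id": 3, "name": "Bob"}]
```

With these rows, a normal join returns one row, either one-sided outer join returns two, and the full outer join returns three.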

Page 148: Informatica Basics

148

Mid-Mapping Join

» The Joiner does not accept input in the following situations:
    » Both input pipelines begin with the same Source Qualifier
    » Both input pipelines begin with the same Normalizer
    » Both input pipelines begin with the same Joiner
    » Either input pipeline contains an Update Strategy

Page 149: Informatica Basics


Aggregator Transformation

Page 150: Informatica Basics

150

Aggregator Transformation

» Active Transformation

» Connected

» Ports
    » Mixed
    » Variables allowed

» Group by allowed

» Used for standard aggregations

» Can also be used to get distinct records

Page 151: Informatica Basics

151

Aggregator Transformation

Performs aggregate calculations

Page 152: Informatica Basics

152

Aggregate Expressions

Aggregate functions are supported only in the Aggregator Transformation

Conditional Aggregate Expressions are supported

Ex: Conditional SUM format

SUM(value, condition)
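The conditional SUM format can be read as "add up the value only for rows where the condition is true". A Python sketch of that idea follows; the `conditional_sum` helper and the sample rows are hypothetical, and returning 0 when no row qualifies is a simplification made here.

```python
def conditional_sum(rows, value, condition):
    """Sum `value` over rows where it is non-null and `condition` holds."""
    return sum(
        row[value]
        for row in rows
        if row[value] is not None and condition(row)
    )


# Hypothetical order rows.
orders = [
    {"REGION": "East", "AMOUNT": 100},
    {"REGION": "West", "AMOUNT": 250},
    {"REGION": "East", "AMOUNT": None},
    {"REGION": "East", "AMOUNT": 40},
]

# Comparable in spirit to: SUM(AMOUNT, REGION = 'East')
east_total = conditional_sum(orders, "AMOUNT", lambda r: r["REGION"] == "East")
```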

Page 153: Informatica Basics

153

Aggregator Transformation

» Aggregate Functions
    » Return summary values for non-null data in selected ports
    » Used only in Aggregator Transformations
    » Used only in Output ports
    » Calculate a single value (and row) for all records in a group
    » Nested aggregate functions are allowed
    » Ex: AVG(), COUNT(), MAX(), SUM()
    » Conditional statements can be used with these functions

Page 154: Informatica Basics

154

Aggregate Properties

» Sorted Data (can be aggregated more efficiently)
    » The Aggregator can handle sorted or unsorted data
    » The Server will cache data from each group and release the cached data upon reaching the first record of the next group
    » Data must be sorted according to the order of the Aggregator “Group By” ports
    » Performance gain will depend upon varying factors

» Sorted Input property
    » Instructs the Aggregator to expect the data to be sorted

» Difference between Sorted and Unsorted Data
    » Unsorted Data: no rows are released from the Aggregator until all rows are aggregated
    » Sorted Data: each separate group (one row) is released as soon as the last row in the group is aggregated
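The difference between sorted and unsorted aggregation can be sketched in Python. This is an illustration of the idea only, not the server's caching logic; the helper names and rows are hypothetical.

```python
from itertools import groupby


def aggregate_sorted(rows, key, value):
    """Yield (group, total) as soon as each group ends.

    Assumes rows arrive sorted by `key`, like the Sorted Input property.
    """
    for group_key, group_rows in groupby(rows, key=lambda r: r[key]):
        yield group_key, sum(r[value] for r in group_rows)


def aggregate_unsorted(rows, key, value):
    """Hold a running total per group and release nothing until the end."""
    totals = {}
    for r in rows:
        totals[r[key]] = totals.get(r[key], 0) + r[value]
    return list(totals.items())


# Hypothetical rows, already sorted by DEPT.
sorted_rows = [
    {"DEPT": "A", "AMT": 1},
    {"DEPT": "A", "AMT": 2},
    {"DEPT": "B", "AMT": 5},
]
first_group = next(aggregate_sorted(iter(sorted_rows), "DEPT", "AMT"))
```

The sorted version only ever holds one group at a time, while the unsorted version must keep every group until the last row has been read. If the sorted version is fed unsorted rows it silently produces split groups, which is why the sort order has to match the Group By ports.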

Page 155: Informatica Basics


Lookup Transformation

Page 156: Informatica Basics

156

Lookup Transformation

» Passive Transformation

» Connected/Unconnected

» Ports
    » Mixed
    » “L” indicates Lookup port
    » “R” indicates port used as a return value

» Usage
    » Get related values
    » Verify if records exist or if data has changed
    » Multiple conditions are supported

» Lookup SQL override is allowed

Page 157: Informatica Basics

157

Lookup Transformation

Page 158: Informatica Basics

158

Lookup Transformation

Looks up values in a database table and provides data to other components in a mapping

Page 159: Informatica Basics

159

Lookup Properties

» Lookup conditions
    » Lookup Table Name
    » Lookup condition
    » Native Database connection Object name

Page 160: Informatica Basics

160

How a Lookup Transformation works

» For each mapping row, one or more port values are looked up in a database table

» If a match is found, one or more table values are returned to the mapping. If no match is found, default value is returned

Page 161: Informatica Basics

161

Lookup Caching

» Caching can significantly impact performance

Cached

» Lookup table data is cached locally on the server

» Mapping rows are looked up against the cache

» Only one SQL SELECT is needed

» Cache is indexed based on the order by clause

Uncached

» Each Mapping row needs one SQL SELECT

» If the data does not fit in the memory cache, the PowerCenter Server stores the overflow values in the cache files.

» When the session completes, the PowerCenter Server releases cache memory and deletes the cache files unless you configure the Lookup transformation to use a persistent cache.
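The "one SELECT versus one SELECT per row" trade-off can be made concrete with a Python sketch. The `LookupTable` class is a hypothetical stand-in for a database table that simply counts queries; it is not a PowerCenter or database API.

```python
class LookupTable:
    """Stand-in for a database table that counts SELECT statements."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.selects = 0

    def select_all(self):
        self.selects += 1
        return dict(self._rows)

    def select_by_key(self, key):
        self.selects += 1
        return self._rows.get(key)


def lookup_uncached(table, keys, default=None):
    """One SELECT per mapping row."""
    results = []
    for key in keys:
        value = table.select_by_key(key)
        results.append(default if value is None else value)
    return results


def lookup_cached(table, keys, default=None):
    """One SELECT to build the cache, then in-memory lookups."""
    cache = table.select_all()
    return [cache.get(key, default) for key in keys]
```

For four incoming rows the uncached version issues four SELECTs and the cached version issues one. The cached version pays for that by reading the whole lookup table up front, so it tends to help most when many mapping rows hit a comparatively small lookup table.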

Page 162: Informatica Basics

162

Lookup Caches

» When configuring a lookup cache, you can specify any of the following options:

» Persistent cache:
    » You can save the lookup cache files and reuse them the next time the PowerCenter Server processes a Lookup transformation configured to use the cache.
    » When Session completes, the persistent cache is stored on the server hard disk.
    » The next time Session runs, cached data is loaded fully or partially into RAM and reused.
    » A named persistent cache may be shared by different Sessions

» Recache from source:
    » If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.

Page 163: Informatica Basics

163

Lookup Caches

» Static cache:
    » You can configure a static, or read-only, cache for any lookup source.
    » By default, the PowerCenter Server creates a static cache.
    » It caches the lookup file or table and looks up values in the cache for each row that comes into the transformation.

» Dynamic cache:
    » If you want to cache the target table and insert new rows or update existing rows in the cache and the target, you can create a Lookup transformation to use a dynamic cache.
    » The PowerCenter Server dynamically inserts or updates data in the lookup cache and passes data to the target table.

» Shared cache:
    » You can share the lookup cache between multiple transformations.
    » You can share an unnamed cache between transformations in the same mapping.
    » You can share a named cache between transformations in the same or different mappings.

Page 164: Informatica Basics

164

Lookup Policy on Multiple Match

Options are

» Use first value

» Use last value

» Report error

Note: When Dynamic Cache is enabled Multiple match will report error.

Page 165: Informatica Basics


Unconnected Lookups

Page 166: Informatica Basics

166

Unconnected Lookup

» Will be physically “unconnected” from other transformations
    » There can be NO data flow arrows leading to or from an unconnected Lookup
    » Lookup function can be set within any transformation that supports expressions

Function in the Aggregator calls the unconnected Lookup

Page 167: Informatica Basics

167

Conditional Lookup Technique

Two requirements:

1. Must be Unconnected (or “function mode”) Lookup

2. Lookup function used within a conditional statement

» E.g. IIF(ISNULL(cust_id), :lkp.MYLOOKUP(order_no))
    » Conditional statement is evaluated for each row
    » Lookup function is called only under the pre-defined condition

Page 168: Informatica Basics

168

Conditional Lookup Advantage

» Data lookup is performed only for those rows which require it. Substantial performance can be gained

E.g. A Mapping will process 500,000 rows. For two percent of those rows (10,000) the item_id value is NULL. Item_id can be derived from the SKU_NUMB

IIF(ISNULL(item_id), :lkp.MYLOOKUP(sku_numb))

Net Savings = 490,000 lookups
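The savings figure follows from calling the lookup only when item_id is NULL. The Python sketch below counts lookup calls on a scaled-down data set; the helper name and rows are hypothetical, and keeping the existing item_id when it is present is an assumption about the intended result.

```python
def derive_item_ids(rows, lookup):
    """Call `lookup` only for rows whose item_id is missing."""
    calls = 0
    item_ids = []
    for row in rows:
        item_id = row["item_id"]
        if item_id is None:  # same role as ISNULL(item_id)
            item_id = lookup(row["sku_numb"])
            calls += 1
        item_ids.append(item_id)
    return item_ids, calls


# Scaled-down illustration: 100 rows, 2 percent with a NULL item_id.
rows = [{"item_id": i, "sku_numb": f"SKU{i}"} for i in range(100)]
rows[10]["item_id"] = None
rows[20]["item_id"] = None
sku_to_item = {f"SKU{i}": i for i in range(100)}  # hypothetical lookup data

item_ids, lookup_calls = derive_item_ids(rows, sku_to_item.get)

# The slide's arithmetic: 2 percent of 500,000 rows need the lookup.
total_rows = 500_000
rows_needing_lookup = total_rows * 2 // 100
lookups_saved = total_rows - rows_needing_lookup
```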

Page 169: Informatica Basics

169

Unconnected Lookup Functionality

» One Lookup port value (Return Port) may be returned for each Lookup

WARNING:

If the Return port is not defined, you may get unexpected results.

Page 170: Informatica Basics

170

Connected Vs Unconnected Lookups

Connected Lookup | Unconnected Lookup
Part of the mapping data flow | Separate from the mapping data flow
Returns multiple values (by linking output ports to another transformation) | Returns one value (by checking the Return port option for the output port that provides the return value)
Executed for every record passing through the transformation | Only executed when the lookup function is called
More visible, shows where the lookup values are used | Less visible, as the lookup is called from an expression within another transformation
Default values are used | Default values are ignored

Page 171: Informatica Basics


Update Strategy Transformation

Page 172: Informatica Basics

172

Update Strategy Transformation

» Active Transformation

» Connected

» Ports
    » All input/output

» Usage
    » To mark a record for insert/update/delete or rejection
    » IIF or DECODE logic determines how to handle the record

Page 173: Informatica Basics

173

Update Strategy Transformation

Specifies how each individual row will be used to update target tables (insert,update,delete, reject)

Page 174: Informatica Basics

174

Update Strategy expressions

» IIF(score > 69, DD_INSERT, DD_DELETE)

» Expression is evaluated for each row

» Rows are “tagged” according to the logic of the expression

» Appropriate SQL(DML) is submitted to the target database: insert, delete or update

» DD_REJECT means the row will not have SQL written for it.

» Target will not “see” the row

» “Rejected” rows may be forwarded through Mapping to a reject file

Operation | Constant | Numeric Value
INSERT | DD_INSERT | 0
UPDATE | DD_UPDATE | 1
DELETE | DD_DELETE | 2
REJECT | DD_REJECT | 3
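The tagging logic can be sketched in Python. The constants use the numeric values from the table above; the `flag_row` and `apply_rows` helpers and the dictionary that stands in for the target table are hypothetical.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row):
    """Same logic as IIF(score > 69, DD_INSERT, DD_DELETE)."""
    return DD_INSERT if row["score"] > 69 else DD_DELETE


def apply_rows(target, rows, flag):
    """Apply tagged rows to `target`, a dict keyed by row id.

    Returns the rejected rows, which never reach the target.
    """
    rejected = []
    for row in rows:
        action = flag(row)
        if action in (DD_INSERT, DD_UPDATE):
            target[row["id"]] = row
        elif action == DD_DELETE:
            target.pop(row["id"], None)
        else:
            rejected.append(row)
    return rejected


# Hypothetical target contents and incoming rows.
target = {2: {"id": 2, "score": 75}}
incoming = [{"id": 1, "score": 80}, {"id": 2, "score": 50}]
rejected = apply_rows(target, incoming, flag_row)
```

Here row 1 is tagged for insert and row 2 for delete, so the target ends up holding only row 1.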

Page 175: Informatica Basics


Sequence Generator Transformation

Page 176: Informatica Basics

176

Sequence Generator Transformation

» Generates unique keys for any port on a row

» Passive Transformation / Connected

» Ports
    » Two predefined output ports: NEXTVAL and CURRVAL
    » No input ports allowed

» Usage
    » Generate sequence numbers
    » Shareable across mappings

Page 177: Informatica Basics

177

Connecting CURRVAL and NEXTVAL Ports to a Target

Sequence Generator Transformation

Page 178: Informatica Basics

178

Sequence Generator Properties

» Properties
    » Start Value
    » End Value
    » Increment By
    » Number of Cached Values
    » Reset
    » Cycle

» Design tip: Set the Reset property and Increment By 1. Use in conjunction with a Lookup: look up max(value) from the target and add NEXTVAL to it to get the new ID.
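The properties and the design tip can be illustrated with a simplified Python sketch. The `SequenceGenerator` class is hypothetical and models only Start Value, End Value, Increment By and Cycle; it does not reproduce value caching or the exact PowerCenter semantics.

```python
class SequenceGenerator:
    """Simplified sequence with start, end, increment and optional cycling."""

    def __init__(self, start=1, end=100, increment=1, cycle=False):
        self.start = start
        self.end = end
        self.increment = increment
        self.cycle = cycle
        self._next = start

    def nextval(self):
        if self._next > self.end:
            if not self.cycle:
                raise OverflowError("sequence reached its end value")
            self._next = self.start  # wrap around when Cycle is set
        value = self._next
        self._next += self.increment
        return value


# Design tip: new ID = max(value) already in the target + NEXTVAL,
# with the sequence restarting from 1 on every run (Reset).
max_key_in_target = 500  # hypothetical result of the max(value) lookup
seq = SequenceGenerator(start=1, end=1_000_000)
new_ids = [max_key_in_target + seq.nextval() for _ in range(3)]
```

Because the sequence restarts at 1 each run and the offset comes from the target, the generated keys continue from whatever is already loaded.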

Page 179: Informatica Basics


Hands on Exercises - 3 Chapter 15

Page 180: Informatica Basics

180

Lab 8 – Using Debugger

» Execute a debugger session for Customer mapping created in Lab 7 with Load to Target as disabled

» Check usage of “Step To Instance” , “Next Instance” and “Continue”

» Add a Breakpoint for Country =‘USA’ for the Source Qualifier and Router Transformations

» Stop at rows where condition is satisfied to observe data

Page 181: Informatica Basics

181

Lab 9 – Sequence Generator

» Create copy of mapping created in Lab 5

» Use Sequence Generator to Populate Employee_wk

» Set Properties for Reset, Cycle, Range should be 1 to 100

Page 182: Informatica Basics

182

Lab 10 – Joiner Transformation

» Use Employee file created in Lab 4 as source.

» Add another source that is a combination of EmployeeTerritories, Territories, Region tables

» Join to the flat file source by doing an inner join on EmployeeID to get RegionID, RegionDescription, TerritoryID and TerritoryDescription from the db tables and other details from the flat file

» Avoid PhotoPath and Notes from the Flat file join

» Target is a flat file (New target definition required)

Page 183: Informatica Basics

183

Lab 11 – Aggregator Transformation

» Create a mapping with Sources as Orders, OrderDetails

» Target is Fact_Orders

» Aggregate at Order_ID level

» Formulae:
    lead_time_days = requireddate - orderdate
    internal_response_time_days = shippeddate - orderdate
    external_response_time_days = requireddate - shippeddate
    total_order_item_count = SUM(Quantity)
    total_order_discount_dollars = SUM((Quantity * UnitPrice) * Discount)
    total_order_dollars = SUM((Quantity * UnitPrice) - ((Quantity * UnitPrice) * Discount))

» DEFAULT to -1 for customer_wk, employee_wk, order_date_wk, required_date_wk, shipped_date_wk, ship_to_geography_wk, shipper_wk

Page 184: Informatica Basics

184

Lab 12 – Using Connected Lookup

» Use Mapping in Lab 11

» Add 2 Lookup transformation for DIM_EMPLOYEE and DIM_SHIPPER

» Populate using lookups with natural keys, default = -1

» employee_wk:

Orders.EmployeeID = DIM_EMPLOYEE.employee_nk

» shipper_wk:

Orders.ShipVia = Dim_Shipper.Shipper_nk

» Populate the other keys with Default = -1

Page 185: Informatica Basics

185

Lab 13 – Using Unconnected Lookup

» Use the Mapping in Lab 12

» Add two Unconnected Lookups, for DIM_CUSTOMER and DIM_CALENDER (Empty table)

» Add an Expression transformation between the aggregator and the target where the unconnected lookups can be called

Page 186: Informatica Basics

186

Lab 14 – Update Strategy

» Add Employee as Source. Add 2 instances of DIM_EMPLOYEE as Target

» Add a Lookup Transformation (LKP_Target) to get employee_wk from DIM_EMPLOYEE.

» Add Expression Transformation for trimming string columns and getting values from LKP_Target

» Add Router Transformation to separate the data flow for New and Existing Records

» Add 2 Update Strategy Transformations to Flag for Insert and Update

» Add a Sequence Generator for populating the employee_wk for insert rows.

» Add an Unconnected Lookup to retrieve the Max_value of employee_wk from the Target

» Add an Expression Transformation (EXP_MAX_SEQ, between the Update Strategy for insert and the Target instance for insert) to call the unconnected lookup

» Note: Run LAB 6 first, so that some rows are filtered, and then run this workflow

Page 187: Informatica Basics


Designer Features Chapter 16

Page 188: Informatica Basics

188

Arranging Workspace

Page 189: Informatica Basics

189

Propagating Changed Attributes

Page 190: Informatica Basics

190

Link Paths

Page 191: Informatica Basics

191

Exporting Objects to XML

Page 192: Informatica Basics

192

Importing Objects from XML

Page 193: Informatica Basics

193

Comparing Objects

Page 194: Informatica Basics

194

Documentation

» Informatica also provides a very descriptive collection of Documentation and Guides.

» The complete set of documentation for PowerCenter includes:
    » Data Profiling Guide
    » Designer Guide
    » Getting Started
    » Installation and Configuration Guide
    » PowerCenter Connect® for JMS® User and Administrator Guide
    » Repository Guide
    » Transformation Language Reference
    » Transformation Guide
    » Troubleshooting Guide
    » Web Services Provider Guide
    » Workflow Administration Guide
    » XML User Guide

Page 195: Informatica Basics

195

Versioning

» If you have the team-based development license, you can configure the repository to store multiple versions of objects

» During development, you can use the following change management features to create and manage multiple versions of objects in the repository:
    » Check out and check in versioned objects
    » Compare objects
    » Track changes to an object
    » Delete or purge a version

» You can also apply labels to versioned objects, run queries to search for objects in the repository, and include versioned objects in deployment groups

Page 196: Informatica Basics

196

Versioning

» A repository enabled for versioning can store multiple versions of the following objects:
    » Sources
    » Targets
    » Transformations
    » Mappings & Mapplets
    » Sessions & Tasks
    » Workflows & Worklets
    » Session configurations
    » Schedulers
    » Cubes
    » Dimensions

Page 197: Informatica Basics


Transformations in Depth - III Chapter 17

Page 198: Informatica Basics


Normalizer Transformation

Page 199: Informatica Basics

199

Normalizer Transformation

» Normalization is the process of organizing data.

» Normalizes Records from relational or VSAM sources

» Active Transformation

» Connected

» Ports
    » Input/Output or Output

» Usage
    » Required for VSAM source definitions
    » Normalize flat file or relational source definitions
    » Generate multiple records from one record

Page 200: Informatica Basics

200

Overview

» You primarily use the Normalizer transformation with COBOL sources, which are often stored in a de-normalized format.

» The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs.

» A Normalizer transformation can appear anywhere in a pipeline when you normalize a relational source.

» You break out repeated data within a record into separate records.

» You can also use the Normalizer transformation with relational sources to create multiple rows from a single row of data.
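Breaking repeated data out of one record into separate records can be pictured with a short Python sketch. The `normalize` helper, the column names and the `occurrence` counter column are hypothetical and only illustrate the idea of generating several rows from one.

```python
def normalize(row, id_column, repeated_columns, value_name):
    """Turn one row with repeating columns into one row per occurrence."""
    output = []
    for index, column in enumerate(repeated_columns, start=1):
        output.append({
            id_column: row[id_column],
            "occurrence": index,  # which repetition the value came from
            value_name: row[column],
        })
    return output


# Hypothetical denormalized record: one store with four quarterly sales columns.
record = {"STORE": "S1", "Q1": 10, "Q2": 20, "Q3": 30, "Q4": 40}
quarterly_rows = normalize(record, "STORE", ["Q1", "Q2", "Q3", "Q4"], "SALES")
```

One input record yields four output rows, which is why the Normalizer is an active transformation.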

Page 201: Informatica Basics

201

Overview

» Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source.

Page 202: Informatica Basics

202

Different Normalizer Transformations

» There are a number of differences between a VSAM Normalizer using COBOL sources and a pipeline Normalizer.

Attribute | VSAM Normalizer Transformation | Pipeline Normalizer Transformation
Connection | COBOL Source | Any Transformation
Port Creation | Automatically created based on the COBOL Source | Created manually
Transforms allowed before the Normalizer transformation | No | Yes
Transforms allowed after the Normalizer transformation | Yes | Yes
Reusable | No | Yes
Ports | Input/Output | Input/Output

Page 203: Informatica Basics


Pipeline Normalizer Transformation

Page 204: Informatica Basics


Sorter Transformation

Page 205: Informatica Basics


Sorter Transformation

» Active Transformation

» Connected

» Ports: Input/Output
  » Define one or more sort keys
  » Define sort order for each key

» Usage: Sort data in a mapping/mapplet pipeline, for example before an Aggregator

Page 206: Informatica Basics


Sorter Transformation

» Can sort data from relational tables or flat files

» Sort takes place on the Informatica Server machine

» Multiple sort keys are supported
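The idea of several sort keys, each with its own sort order, can be sketched in plain Python. This is only an analogy, not how the Sorter is implemented; the column names `REGION` and `SALES` and the helper `sort_rows` are hypothetical.

```python
from operator import itemgetter

rows = [
    {"REGION": "West", "SALES": 8000},
    {"REGION": "East", "SALES": 9000},
    {"REGION": "West", "SALES": 10000},
    {"REGION": "East", "SALES": 7000},
]

# (column, descending?) pairs, most significant sort key first.
sort_keys = [("REGION", False), ("SALES", True)]


def sort_rows(rows, sort_keys):
    """Sort by several keys, each with its own ascending/descending order."""
    ordered = list(rows)
    # Python's sort is stable, so apply the least significant key first.
    for column, descending in reversed(sort_keys):
        ordered.sort(key=itemgetter(column), reverse=descending)
    return ordered


sorted_rows = sort_rows(rows, sort_keys)
for row in sorted_rows:
    print(row)
```

Rows come out grouped by region in ascending order, with the highest sales first inside each region.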

Page 207: Informatica Basics


Sorter Transformation

» Sorter Properties: Cache size
  » Can be adjusted (default is 8 MB)
  » Server uses twice the cache listed
  » If the cache size is unavailable, the Session task will fail

Page 208: Informatica Basics


Rank Transformation

Page 209: Informatica Basics


Rank Transformation

» Filters the top or bottom range of records for selection.

» Active Transformation

» Connected

» Ports: Mixed
  » One pre-defined output port: RANKINDEX
  » Variable ports allowed
  » Group By allowed

» Usage: Select the top/bottom number of records

Page 210: Informatica Basics


Overview

» You can use a Rank transformation to:
  » Return the largest/smallest numeric value in a port or group.
  » Return the strings at the top/bottom of a session sort order.

Page 211: Informatica Basics


Overview

» The Rank transformation allows you to group information (like the Aggregator), create local variables, and write non-aggregate expressions.

» The Rank transformation differs from the transformation functions MAX and MIN, in that it allows you to select a group of top or bottom values, not just one value.

» You can connect ports from only one transformation to the Rank transformation.

» The Rank transformation includes input or input/output ports connected to another transformation in the mapping.

» It also includes variable ports and one rank port.

» Use the rank port to specify the column you want to rank.

Page 212: Informatica Basics


Rank Index

» The Designer automatically creates a RANKINDEX port for each Rank transformation.

» The PowerCenter Server uses the Rank Index port to store the ranking position for each row in a group.

» For example, if you create a Rank transformation that ranks the top five salespersons for each quarter, the rank index numbers the salespeople from 1 to 5. The first three rows might look like this:

RANKINDEX    SALES_PERSON    SALES
1            Sam             10,000
2            Mary            9,000
3            Alice           8,000

» The RANKINDEX is an output port only.

» You can pass the rank index to another transformation in the mapping or directly to a target.

Page 213: Informatica Basics


Rank Index

» If two rank values match, they receive the same value in the rank index and the transformation skips the next value.

» For example, if you want to see the top five retail stores in the country and two stores have the same sales, the return data might look similar to the following:

RANKINDEX    SALES    STORE
1            10000    Orange
1            10000    Brea
3            9000     Los Angeles
4            8000     Ventura
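The tie rule (equal values share a rank index and the next index is skipped) can be sketched in plain Python. This is an illustration only, not the PowerCenter Server's implementation; the function `rank_rows`, the extra store "Irvine", and the sales figures are hypothetical.

```python
def rank_rows(rows, rank_port, top_n, descending=True):
    """Keep the top_n rows by rank_port and attach a RANKINDEX to each."""
    ordered = sorted(rows, key=lambda r: r[rank_port], reverse=descending)[:top_n]
    ranked = []
    for position, row in enumerate(ordered, start=1):
        if ranked and ranked[-1][rank_port] == row[rank_port]:
            # Tie: reuse the previous index; the skipped index is never assigned.
            index = ranked[-1]["RANKINDEX"]
        else:
            index = position
        ranked.append({"RANKINDEX": index, **row})
    return ranked


stores = [
    {"STORE": "Orange", "SALES": 10000},
    {"STORE": "Brea", "SALES": 10000},
    {"STORE": "Los Angeles", "SALES": 9000},
    {"STORE": "Ventura", "SALES": 8000},
    {"STORE": "Irvine", "SALES": 7000},
]

ranked = rank_rows(stores, rank_port="SALES", top_n=4)
for row in ranked:
    print(row["RANKINDEX"], row["SALES"], row["STORE"])
```

The two tied stores both receive index 1, index 2 is skipped, and the next store receives index 3.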

Page 214: Informatica Basics


Transaction Control Transformation

Page 215: Informatica Basics


Overview

» PowerCenter allows you to control commit and rollback transactions based on a set of rows that pass through a Transaction Control transformation.

» A transaction is the set of rows bound by commit or rollback rows.

» You can define a transaction based on a varying number of input rows.

» You might want to define transactions based on a group of rows ordered on a common key, such as employee ID or order entry date.

Page 216: Informatica Basics


Overview

» In PowerCenter, you define transaction control at two levels:

Within a mapping:

» Within a mapping, you use the Transaction Control transformation to define a transaction (using an expression).

» Based on the return value of the expression, you can choose to commit, roll back, or continue without any transaction changes.

Within a session:

» When you configure a session, you configure it for user-defined commit.

» You can choose to commit or roll back a transaction if the PowerCenter Server fails to transform or write any row to the target.

» When you run the session, the PowerCenter Server evaluates the expression for each row that enters the transformation.

» When it evaluates a commit row, it commits all rows in the transaction to the target or targets.

» When the PowerCenter Server evaluates a rollback row, it rolls back all rows in the transaction from the target or targets.

Page 217: Informatica Basics


Transaction Control Properties

» Enter the transaction control expression in the Transaction Control Condition field.

» The transaction control expression uses the IIF function to test each row against the condition.

» Use the following syntax for the expression:

IIF (condition, value1, value2)

» The expression contains values that represent actions the PowerCenter Server performs based on the return value of the condition.

» The PowerCenter Server evaluates the condition on a row-by-row basis.

» The return value determines whether the PowerCenter Server commits, rolls back, or makes no transaction changes to the row.

» When the PowerCenter Server issues a commit or rollback based on the return value of the expression, it begins a new transaction.

Page 218: Informatica Basics


Transaction Control Properties

» Use the following built-in variables in the Expression Editor when you create a transaction control expression:
  » TC_CONTINUE_TRANSACTION (default value)
  » TC_COMMIT_BEFORE
  » TC_COMMIT_AFTER
  » TC_ROLLBACK_BEFORE
  » TC_ROLLBACK_AFTER
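To make the row-by-row evaluation concrete, the following Python sketch simulates a condition of the form IIF(key changed, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION) over rows ordered on a common key. It is a simplified model, not PowerCenter code: only two of the built-in variables are modelled, the helper names `iif` and `run_transaction_control` and the `ORDER_DATE` column are hypothetical, and it assumes the rows still open at the end of the data are committed.

```python
TC_CONTINUE_TRANSACTION = "TC_CONTINUE_TRANSACTION"
TC_COMMIT_BEFORE = "TC_COMMIT_BEFORE"


def iif(condition, value1, value2):
    """Python stand-in for the IIF (condition, value1, value2) syntax."""
    return value1 if condition else value2


def run_transaction_control(rows, key):
    """Group rows into committed transactions, one per run of equal key values."""
    committed = []   # each entry is the list of rows of one committed transaction
    open_rows = []   # rows of the transaction currently in progress
    previous = None
    for row in rows:
        key_changed = previous is not None and row[key] != previous
        action = iif(key_changed, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)
        if action == TC_COMMIT_BEFORE:
            # Commit the rows seen so far; the current row starts a new transaction.
            committed.append(open_rows)
            open_rows = []
        open_rows.append(row)
        previous = row[key]
    if open_rows:
        # Assumption: remaining rows are committed when the data ends.
        committed.append(open_rows)
    return committed


orders = [
    {"ORDER_ID": 1, "ORDER_DATE": "2005-01-01"},
    {"ORDER_ID": 2, "ORDER_DATE": "2005-01-01"},
    {"ORDER_ID": 3, "ORDER_DATE": "2005-01-02"},
    {"ORDER_ID": 4, "ORDER_DATE": "2005-01-03"},
    {"ORDER_ID": 5, "ORDER_DATE": "2005-01-03"},
]

transactions = run_transaction_control(orders, key="ORDER_DATE")
for number, transaction in enumerate(transactions, start=1):
    print(number, [row["ORDER_ID"] for row in transaction])
```

With TC_COMMIT_BEFORE the row that triggers the commit belongs to the new transaction, so each order date ends up in its own transaction.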

Page 219: Informatica Basics


Stored Procedure Transformation

Page 220: Informatica Basics


Overview

» A Stored Procedure transformation is an important tool for populating and maintaining databases.

» Database administrators create stored procedures to automate tasks that are too complicated for standard SQL statements.

» A stored procedure is a precompiled collection of Transact-SQL, PL/SQL, or other database procedural statements and optional flow control statements, similar to an executable script.

» Stored procedures are stored and run within the database.

» You can run a stored procedure with the EXECUTE SQL statement in a database client tool.

» Not all databases support stored procedures, and stored procedure syntax varies depending on the database.

» You might use stored procedures to do the following tasks:
  » Check the status of a target database before loading data into it.
  » Determine if enough space exists in a database.
  » Perform a specialized calculation.
  » Drop and recreate indexes.

Page 221: Informatica Basics


Overview

» Stored procedures also provide error handling and logging necessary for critical tasks.

» The stored procedure must exist in the database before creating a Stored Procedure transformation, and the stored procedure can exist in a source, target, or any database with a valid connection to the PowerCenter Server.

» You might use a stored procedure to perform a query or calculation that you would otherwise make part of a mapping.

» If you already have a well-tested stored procedure for calculating sales tax, you can perform that calculation through the stored procedure instead of recreating the same calculation in an Expression transformation.

Page 222: Informatica Basics


Input and Output Data

» There are three types of data that pass between the PowerCenter Server and the stored procedure:
  » Input/output parameters
  » Return values
  » Status codes

Input/Output Parameters:

» For many stored procedures, you provide a value and receive a value in return. These values are known as input and output parameters.

» The Stored Procedure transformation sends and receives input and output parameters using ports, variables, or by entering a value in an expression.

Return Values:

» Most databases provide a return value after running a stored procedure.

» The Stored Procedure transformation captures return values in a similar manner as input/output parameters, depending on the method used to capture the input/output parameters.

Page 223: Informatica Basics


Input and Output Data

Status Codes:

» Status codes provide error handling for the PowerCenter Server during a workflow.

» The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully.

» You cannot see this value.

» The PowerCenter Server uses it to determine whether to continue running the session or stop.

» You configure options in the Workflow Manager to continue or stop the session in the event of a stored procedure error.

Page 224: Informatica Basics


Connected and Unconnected

» Stored procedures run in either connected or unconnected mode.

» You can configure connected and unconnected Stored Procedure transformations in a mapping.

» The mode you use depends on what the stored procedure does and how you plan to use it in your session.

Page 225: Informatica Basics


Connected and Unconnected

» Connected
  » The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation.
  » All data entering the transformation through the input ports affects the stored procedure.
  » You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.

» Unconnected
  » The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping.
  » It either runs before or after the session, or is called by an expression in another transformation in the mapping.

Page 226: Informatica Basics


Comparison

If you want to                                                                                                                         Use this mode
Run a stored procedure before or after your session.                                                                                   Unconnected
Run a stored procedure once during your mapping, such as pre- or post-session.                                                         Unconnected
Run a stored procedure every time a row passes through the Stored Procedure transformation.                                            Connected or Unconnected
Run a stored procedure based on data that passes through the mapping, such as when a specific port does not contain a null value.      Unconnected
Pass parameters to the stored procedure and receive a single output parameter.                                                         Connected or Unconnected
Pass parameters to the stored procedure and receive multiple output parameters.                                                        Connected or Unconnected
Run nested stored procedures.                                                                                                          Unconnected
Call multiple times within a mapping.                                                                                                  Unconnected

Page 227: Informatica Basics


Specifying when the Stored Procedure Runs

» In the case of the unconnected stored procedure, the Expression transformation references the stored procedure, which means the stored procedure runs every time a row passes through the Expression transformation.

» If no transformation references the Stored Procedure transformation, you have the option to run the stored procedure once before or after the session.

» The following list describes the options for running a Stored Procedure transformation:

» Normal

» Pre-load of the Source

» Post-load of the Source

» Pre-load of the Target

» Post-load of the Target

Page 228: Informatica Basics


Specifying when the Stored Procedure Runs

Normal:

» The stored procedure runs where the transformation exists in the mapping on a row-by-row basis.

» This is useful for calling the stored procedure for each row of data that passes through the mapping, such as running a calculation against an input port.

» Connected stored procedures run only in normal mode.

Pre-load of the Source:

» Before the session retrieves data from the source, the stored procedure runs.

» This is useful for verifying the existence of tables or performing joins of data in a temporary table.

Page 229: Informatica Basics


Specifying when the Stored Procedure Runs

Post-load of the Source:

» After the session retrieves data from the source, the stored procedure runs.

» This is useful for removing temporary tables.

Pre-load of the Target:

» Before the session sends data to the target, the stored procedure runs.

» This is useful for verifying target tables or disk space on the target system.

Post-load of the Target:

» After the session sends data to the target, the stored procedure runs.

» This is useful for re-creating indexes on the database.

Page 230: Informatica Basics


Specifying when the Stored Procedure Runs

You can run several Stored Procedure transformations in different modes in the same mapping.

A pre-load source stored procedure can check table integrity, a normal stored procedure can populate the table, and a post-load stored procedure can rebuild indexes in the database.

However, you cannot run the same instance of a Stored Procedure transformation in both connected and unconnected mode in a mapping.

You must create different instances of the transformation.

Page 231: Informatica Basics


Stored Procedure Transformation 1

Connected

Page 232: Informatica Basics


Stored Procedure Transformation 2

Unconnected

Page 233: Informatica Basics


Calling Stored Procedure From Expression

Page 234: Informatica Basics


Workflows - II Chapter 18

Page 235: Informatica Basics


Additional Workflow Tasks

» Eight additional Tasks are available in the Workflow Designer:
  » Command
  » Email
  » Decision
  » Assignment
  » Timer
  » Control
  » Event Wait
  » Event Raise

Page 236: Informatica Basics


Reusable Tasks

» Three types of reusable tasks:
  » Session: Set of instructions to execute a specific Mapping
  » Command: Specific shell commands to run during any Workflow
  » Email: Sends email during the Workflow

» Use the Task Developer to create reusable tasks

» These tasks will then appear in the Navigator and can be dragged & dropped into any workflow

Page 237: Informatica Basics


Command Task

» Specify one or more Unix shell or DOS (Windows NT/2000) commands to run at a specific point in the Workflow

» Runs in the Informatica Server (Unix or Windows) environment

» Shell command status (successful completion or failure) is held in the pre-defined variable “$command_task_name.STATUS”

» Each Command Task shell command can execute before the Session begins or after the Informatica Server executes a Session

» Becomes a component of a Workflow (or Worklet)

Page 238: Informatica Basics


Command Task

» If configured in the Task Developer, the Command Task is reusable (optional)

» You can use a Command task in the following ways:
  » Standalone Command task
  » Pre- and post-session shell command

Page 239: Informatica Basics


Email Task

» Configure the Informatica Server to send email at any point in the Workflow

» Becomes a component in a Workflow (or Worklet)

» If configured in the Task Developer, the Email Task is reusable (optional)

Page 240: Informatica Basics


Non-reusable Tasks

» Six additional Tasks are available in the Workflow Designer:
  » Decision
  » Assignment
  » Timer
  » Control
  » Event Wait
  » Event Raise

Page 241: Informatica Basics


Decision Task

» Specifies a condition to be evaluated in the Workflow

» Use the Decision Task in branches of a Workflow

» Provides additional functionality over a Link

Page 242: Informatica Basics


Decision Task

» Example Workflow without a Decision Task

Page 243: Informatica Basics


Assignment Task

» Assigns a value to a Workflow variable

» Variables are defined in the Workflow object

Page 244: Informatica Basics


Timer Task

» Waits for a specified period of time to execute the next Task:
  » Absolute Time
  » Datetime variable
  » Relative Time

Page 245: Informatica Basics


Control Task

» Used to stop, abort, or fail the top-level workflow or the parent workflow based on an input link condition.

» A parent workflow or worklet is the workflow or worklet that contains the Control task.

Page 246: Informatica Basics


Event Wait Task

» Waits for a User-defined or a Pre-defined event to occur

» Once the event occurs, the Informatica Server completes the rest of the Workflow

» Used with the Event Raise Task

» Events can be a file watch (indicator file) or User-defined

» User-defined events are defined in the Workflow itself

Page 247: Informatica Basics


Event Raise Task

» Represents the location of a User-defined event

» The Event Raise Task triggers the User-defined event when the Informatica Server executes the Event Raise Task
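The pairing of Event Wait and Event Raise behaves much like a signal between two parallel branches. The Python sketch below is only an analogy built on the standard `threading.Event`; it does not use any PowerCenter API, and the names `load_complete`, `event_wait_branch`, and `event_raise_branch` are hypothetical.

```python
import threading

load_complete = threading.Event()  # stands in for a user-defined workflow event
log = []


def event_wait_branch():
    # Event Wait: block until the event is raised, then run the downstream work.
    if load_complete.wait(timeout=5):
        log.append("downstream task started")


def event_raise_branch():
    # Event Raise: the upstream work finishes first, then the event is triggered.
    log.append("upstream session finished")
    load_complete.set()


waiter = threading.Thread(target=event_wait_branch)
waiter.start()          # the waiting branch starts and blocks on the event
event_raise_branch()    # the other branch raises the event
waiter.join()
print(log)
```

The waiting branch cannot proceed until the other branch raises the event, which is the ordering guarantee the two tasks provide in a workflow.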

Page 248: Informatica Basics


Hands-On - III Chapter 19

Page 249: Informatica Basics


Lab 15 – Using Command Task

» Copy the workflow of Lab 4 for this lab.

» Add a Command task that copies the output file of the session task to another directory.

Page 250: Informatica Basics


Lab 16 – Using Email Task

» Copy the workflow of Lab 4 for this lab.

» Configure an Email task after the session to report successful completion.

Page 251: Informatica Basics


Lab 17 – Using Timer Task

» Copy the workflow of Lab 15 for this Lab.

» Include a Timer task after the session and configure it so that the command task runs after 1 minute.

Page 252: Informatica Basics


Conclusion

Thank You!

Page 253: Informatica Basics


Kanbay

WORLDWIDE HEADQUARTERS: 6400 SHAFER COURT | ROSEMONT, ILLINOIS USA 60018

TEL. 847.384.6100 | FAX 847.384.0500 | WWW.KANBAY.COM