a interview faq's -1

Upload: ypraju

Post on 30-May-2018


  • 8/14/2019 a Interview Faq's -1

    1/25

    INFORMATICA FAQS & SCENARIOS-1

    By PenchalaRaju.Yanamala

    1. What are mapping parameters and mapping variables?

    ANS: A mapping parameter represents a constant value that you can define before running a session. A mapping parameter retains the same value throughout the entire session. When you use a mapping parameter, you declare and use the parameter in a mapping, then define its value in a parameter file for the session. Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica Server saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session.

    2. How to create mapping parameters?

    Ans: We can create mapping parameters in the Informatica Designer (in the Mapping Designer, under Mappings > Parameters and Variables).

    3. What is use of mapping parameters? Mapping variable?

    Ans: See the answer to question 1 above.

    4. What is use of session parameters?

    Ans: Session parameters, like mapping parameters, represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:

    Database connections: Use this parameter to supply the database connection.

    Source file name: Use this parameter when you want to change the name or location of the session source file between session runs.

    Target file name: Use this parameter when you want to change the name or location of the session target file between session runs.

    Reject file name: Use this parameter when you want to change the name or location of the session reject files between session runs.

    5. What do you know about debugging?

    Ans: If a session fails, or the expected data does not arrive in the target table, we use the Debugger. With the Debugger we can find exactly where the fault occurs.

    Q. In a real-time scenario, where can we use mapping parameters and variables?

    Before using mapping parameters and mapping variables, we should declare them in the Mapping Designer.

    A mapping parameter cannot change until the session has completed, whereas a mapping variable can change during the session.

    Example:


    If we declare a mapping parameter, its value stays the same until the session completes; if we declare a mapping variable, its value can change during the session and between sessions. A mapping variable can be used, for example, in a Transaction Control transformation. If mapping parameters and variables are not defined, we cannot pass values into different mappings. We define mapping parameters and variables in the Mapping Designer.

    Thanks, Deepak. But does it take any null values or default values to run the mapping?

    What is the difference between a mapping parameter and a mapping variable in data warehousing? What are the types of variable available in data warehousing?

    Mapping parameter: Defines a constant value; it cannot change during the session.

    Mapping variable: Defines a value that can change throughout the session.

    A mapping parameter's value is normally passed in through the parameter file (.par file), so we can supply a different value each time the session runs, but the value stays fixed within a run. It is referenced in the mapping with the $$ prefix.

    A mapping variable, in contrast, can change at run time: the mapping can update it during the session.

    A mapping parameter is a static value that you define before running the session, and it retains the same value until the end of the session. Define the parameter in a parameter file (.par) and use it in a mapping or mapplet; when the PowerCenter Server runs the session, it reads the value from the parameter file and retains it throughout the session. When the session runs again, it reads the value from the file again.

    A mapping variable is dynamic: it can change at any time during the session. PowerCenter reads the initial value of the variable before the start of the session, changes its value through variable functions, and before ending the session saves the current value (the last value held by the variable). The next time the session runs, the variable starts with the value saved by the previous session.

    For example, if you have a count variable that is incremented every time a session runs, the variable will have a value of 3 at the start of its 4th run (assuming an initial value of 0). The variable value is not saved when the session fails, when it is configured for a test load, or when it runs in debug mode. There is also an option to discard session output so that the variable value is not saved at the end of the session; every run then starts from the initial value rather than the last saved value.
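The behaviour described above can be sketched in Python. This is only a toy model, not Informatica code: the `repository` dict, the `run_session` function, and the names `$$RunCount` and `$$LoadRegion` are hypothetical stand-ins for the repository, a session run, a mapping variable, and a mapping parameter.

```python
# Toy model of mapping parameter vs. mapping variable behaviour.
# All names here are illustrative assumptions, not Informatica APIs.

def run_session(repository, param_file, succeed=True, save_variables=True):
    """Simulate one session run; return the parameter and the variable's start value."""
    # A parameter is read from the parameter file and stays fixed for the run.
    region = param_file["$$LoadRegion"]

    # A variable starts from the last value saved in the repository,
    # or from its initial value (0 here) if nothing has been saved yet.
    start_value = repository.get("$$RunCount", 0)
    current = start_value + 1  # the mapping changes the variable during the run

    # The value is saved only after a successful run that keeps session output.
    if succeed and save_variables:
        repository["$$RunCount"] = current
    return region, start_value


repository = {}
param_file = {"$$LoadRegion": "EMEA"}

# Start-of-run values of the variable across four successful runs.
starts = [run_session(repository, param_file)[1] for _ in range(4)]
print(starts)
```

A failed run, or a run with `save_variables=False`, leaves the saved value untouched, so the next run starts from the same value again.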

    What are the session parameters?

    Session parameters, like mapping parameters, represent values you might want to change between sessions, such as database connections or source files.


    The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:

    Database connections

    Source file name: Use this parameter when you want to change the name or location of the session source file between session runs.

    Target file name: Use this parameter when you want to change the name or location of the session target file between session runs.

    Reject file name: Use this parameter when you want to change the name or location of the session reject files between session runs.

    1. Explain your complex mapping.

    Ans: One of my mappings was particularly complex. I used unconnected lookups, a stored procedure, Expression transformations, Router transformations, and a Joiner transformation with join conditions. The main complexity was maintaining a type 2 SCD in the mapping to keep the history of the data.

    I used an unconnected lookup to check whether the data already exists in the target table. I also wanted to call the lookup from several expressions, which is another reason I used an unconnected lookup. I used a Stored Procedure transformation because I wanted to calculate expressions for the records in the database itself. I used a Router transformation because I had multiple conditions based on the flag value I set. I used an Update Strategy transformation to handle inserts and updates for the data coming from the source. I also used a Joiner transformation because the logic required more than one source and I needed to join them.

    2. Current mapping.

    Ans: Raju, take the complex mapping, remove some of the transformations, and present that as your current mapping.

    3. Master and detail tables: which will you load first, and why?

    Ans: We load the master table first, because the detail table is loaded with reference to the master table. The primary key and foreign key relationship applies here.

    4. Session partitions.

    Ans: We use partitioning to increase performance. Partitioning is configured in the session properties. The partition types are round robin, hash key, pass through, and key range. We partition the data according to need. By default, the source qualifier and the target table have partition points.

    Round robin partitioning divides the data equally among the partitions. For example, if there are 150 records and we divide them into 3 partitions, each partition gets 50 records.

    Pass-through partitioning simply passes the data along unchanged; it does not redistribute rows among the partitions.


    Key range partitioning divides the data into ranges that we specify. For example, if there are 1000 records and 3 partitions, we can divide the data so that records 1 to 180 pass through the 1st partition, records 181 to 460 through the 2nd partition, and all remaining records through the 3rd partition.

    Hash key partitioning is used at Rank, Sorter, and unsorted Aggregator transformations to group the data.
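The three redistributing partition types can be sketched in Python. This only illustrates the row-routing idea, not how the Informatica Server implements it; the function names and the use of `crc32` as the hash are assumptions made for the example.

```python
from zlib import crc32


def round_robin(rows, n):
    """Deal rows out to n partitions in turn, so sizes differ by at most one."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def key_range(rows, key, upper_bounds):
    """Route each row by key value; upper_bounds are inclusive limits for
    every partition except the last, which takes all remaining rows."""
    parts = [[] for _ in range(len(upper_bounds) + 1)]
    for row in rows:
        for idx, bound in enumerate(upper_bounds):
            if row[key] <= bound:
                parts[idx].append(row)
                break
        else:
            parts[-1].append(row)
    return parts


def hash_key(rows, key, n):
    """Route rows with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[crc32(str(row[key]).encode()) % n].append(row)
    return parts


rows = [{"id": i, "dept": i % 7} for i in range(1, 1001)]
print([len(p) for p in round_robin(rows[:150], 3)])          # 150 records, 3 partitions
print([len(p) for p in key_range(rows, "id", [180, 460])])   # ranges from the example
```

Hash partitioning matters for grouping transformations because every row of a group must reach the same partition; `hash_key(rows, "dept", 3)` guarantees that for the `dept` column.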

    4. Daily activities.

    Ans: After I get to the office, I connect to the client machine and develop mappings according to the business logic, referring to the business documents. If I have any doubt about the logic, I ask my team lead or the business analyst. At the end of the day I report to my team lead on the work I completed that day.

    5. How will you delete duplicate rows in a flat file?

    Ans: By using the Distinct option in the Sorter transformation.

    6. How will you get duplicate records of flat files?

    Ans: source -> source qualifier -> expression transformation -> filter transformation -> target. In the expression transformation we create two variable ports and an output port, in this order:

    Old_Value (variable): New_Value
    New_Value (variable): EMPNO
    Flag (output): IIF(Old_Value = New_Value, 1, 0)

    Because ports are evaluated from top to bottom, Old_Value holds the EMPNO of the previous row when Flag is computed, so the rows need to arrive sorted on EMPNO. In the filter transformation we give the condition Flag = 1.
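The port logic for finding duplicates can be mimicked in Python. This is a sketch under the assumption that rows are sorted on EMPNO before the comparison; `flag_duplicates` and the sample employee numbers are made up for illustration.

```python
def flag_duplicates(empnos):
    """Return (empno, flag) pairs; flag is 1 when a row repeats the previous key."""
    new_value = None  # plays the role of the New_Value variable port
    flagged = []
    for empno in sorted(empnos):   # comparing with the previous row needs sorted input
        old_value = new_value      # Old_Value takes the previous row's New_Value
        new_value = empno          # New_Value takes the current row's EMPNO
        flag = 1 if old_value == new_value else 0
        flagged.append((empno, flag))
    return flagged


sample = [7369, 7499, 7369, 7521, 7499, 7369]
# Filter step: keep only rows where Flag = 1.
duplicates = [empno for empno, flag in flag_duplicates(sample) if flag == 1]
print(duplicates)
```

The first occurrence of each key gets flag 0, so only the repeated copies pass the filter.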

    7. How will you get particular records in a flat file?

    Ans: source -> source qualifier -> expression -> filter -> target. In the expression we create a variable port and an output port:

    Exp_Var (variable): Exp_Var + 1
    Exp_Out (output): Exp_Var

    In the filter we give a condition for the records we want. For example, if we want the 2nd, 7th, and 13th records, we give the condition Exp_Out = 2 OR Exp_Out = 7 OR Exp_Out = 13.
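The running-counter approach above translates directly into Python. This is an illustrative sketch only; `pick_rows` and the sample record names are hypothetical.

```python
def pick_rows(rows, wanted_positions):
    """Number the rows as they stream past and keep only the wanted positions."""
    exp_var = 0
    picked = []
    for row in rows:
        exp_var += 1                     # Exp_Var = Exp_Var + 1
        exp_out = exp_var                # Exp_Out = Exp_Var
        if exp_out in wanted_positions:  # filter condition on Exp_Out
            picked.append(row)
    return picked


records = [f"record-{n}" for n in range(1, 16)]
print(pick_rows(records, {2, 7, 13}))
```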

    8. How will you do scheduling?

    Ans: We do not do the scheduling ourselves. Tivoli is the tool used to schedule workflows.

    Q. What is the difference between PowerCenter and PowerMart?

    With PowerCenter, you receive all product functionality, including the ability to register multiple servers, share metadata across repositories, and partition data.

    A PowerCenter license lets you create a single repository that you can configure as a global repository, the core component of a data warehouse.


    PowerMart includes all features except distributed metadata, multiple registered servers, and data partitioning. Also, the various options available with PowerCenter (such as PowerCenter Integration Server for BW, PowerConnect for IBM DB2, PowerConnect for IBM MQSeries, PowerConnect for SAP R/3, PowerConnect for Siebel, and PowerConnect for PeopleSoft) are not available with PowerMart.

    Q. What is a repository?

    The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.

    We create and maintain the repository with the Repository Manager client tool. With the Repository Manager, we can also create folders to organize metadata and groups to organize users.

    Q. What are the different kinds of repository objects, and what do they contain?

    Repository objects displayed in the Navigator can include sources, targets, transformations, mappings, mapplets, shortcuts, sessions, batches, and session logs.

    Q. What is a metadata?

    Designing a data mart involves writing and storing a complex set of instructions. You need to know where to get data (sources), how to change it, and where to write the information (targets). PowerMart and PowerCenter call this set of instructions metadata. Each piece of metadata (for example, the description of a source table in an operational database) can contain comments about it.

    In summary, metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

    Q. What are folders?

    Folders let you organize your work in the repository, providing a way to separate different types of metadata or different projects into easily identifiable areas.

    Q. What is a Shared Folder?

    A shared folder is one whose contents are available to all other folders in the same repository. If we plan on using the same piece of metadata in several projects (for example, a description of the CUSTOMERS table that provides data for a variety of purposes), we might put that metadata in the shared folder.

    Q. What are mappings?

    A mapping specifies how to move and transform data from sources to targets. Mappings include source and target definitions and transformations.

    Transformations describe how the Informatica Server transforms data. Mappings can also include shortcuts, reusable transformations, and mapplets. Use the Mapping Designer tool in the Designer to create mappings.

    Q. What are mapplets?

    You can design a mapplet to contain sets of transformation logic to be reused in multiple mappings within a folder, a repository, or a domain. Rather than recreate the same set of transformations each time, you can create a mapplet containing the transformations, then add instances of the mapplet to individual mappings.

    Use the Mapplet Designer tool in the Designer to create mapplets.

    Q. What are Transformations?

    A transformation generates, modifies, or passes data through ports that you connect in a mapping or mapplet. When you build a mapping, you add transformations and configure them to handle data according to your business purpose. Use the Transformation Developer tool in the Designer to create transformations.

    Q. What are Reusable transformations?

    You can design a transformation to be reused in multiple mappings within a folder, a repository, or a domain. Rather than recreate the same transformation each time, you can make the transformation reusable, then add instances of the transformation to individual mappings. Use the Transformation Developer tool in the Designer to create reusable transformations.

    Q. What are Sessions and Batches?

    Sessions and batches store information about how and when the Informatica Server moves data through mappings. You create a session for each mapping you want to run. You can group several sessions together in a batch. Use the Server Manager to create sessions and batches.

    Q. What are Shortcuts?

    We can create shortcuts to objects in shared folders. Shortcuts provide the easiest way to reuse objects. We use a shortcut as if it were the actual object, and when we make a change to the original object, all shortcuts inherit the change.

    Shortcuts to folders in the same repository are known as local shortcuts.

    Shortcuts to the global repository are called global shortcuts.

    We use the Designer to create shortcuts.

    Q. What are Source definitions?

    Detailed descriptions of database objects (tables, views, synonyms), flat files, XML files, or Cobol files that provide source data. For example, a source definition might be the complete structure of the EMPLOYEES table, including the table name, column names and datatypes, and any constraints applied to these columns, such as NOT NULL or PRIMARY KEY. Use the Source Analyzer tool in the Designer to import and create source definitions.


    Q. What are Target definitions?

    Detailed descriptions for database objects, flat files, Cobol files, or XML files to receive transformed data. During a session, the Informatica Server writes the resulting data to session targets. Use the Warehouse Designer tool in the Designer to import or create target definitions.

    Q. What is Dynamic Data Store?

    The need to share data is just as pressing as the need to share metadata. Often, several data marts in the same organization need the same information. For example, several data marts may need to read the same product data from operational sources, perform the same profitability calculations, and format this information to make it easy to review.

    If each data mart reads, transforms, and writes this product data separately, the throughput for the entire organization is lower than it could be. A more efficient approach would be to read, transform, and write the data to one central data store shared by all data marts. Transformation is a processing-intensive task, so performing the profitability calculations once saves time.

    Therefore, this kind of dynamic data store (DDS) improves throughput at the level of the entire organization, including all data marts. To improve performance further, you might want to capture incremental changes to sources. For example, rather than reading all the product data each time you update the DDS, you can improve performance by capturing only the inserts, deletes, and updates that have occurred in the PRODUCTS table since the last time you updated the DDS.

    The DDS has one additional advantage beyond performance: when you move data into the DDS, you can format it in a standard fashion. For example, you can prune sensitive employee data that should not be stored in any data mart. Or you can display date and time values in a standard format. You can perform these and other data cleansing tasks when you move data into the DDS instead of performing them repeatedly in separate data marts.

    Q. What is a Global repository?

    The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts. Once created, you cannot change a global repository to a local repository. You can promote an existing local repository to a global repository.

    Q. What is Local Repository?

    Each local repository in the domain can connect to the global repository and use objects in its shared folders. A folder in a local repository can be copied to other local repositories while keeping all local and global shortcuts intact.

    Q. What are the different types of locks?

    There are five kinds of locks on repository objects:


    Read lock. Created when you open a repository object in a folder for which you do not have write permission. Also created when you open an object with an existing write lock.

    Write lock. Created when you create or edit a repository object in a folder for which you have write permission.

    Execute lock. Created when you start a session or batch, or when the Informatica Server starts a scheduled session or batch.

    Fetch lock. Created when the repository reads information about repository objects from the database.

    Save lock. Created when you save information to the repository.

    Q. After creating users and user groups, and granting different sets of privileges, I find that none of the repository users can perform certain tasks, even the Administrator.

    Repository privileges are limited by the database privileges granted to the database user who created the repository. If the database user (one of the default users created in the Administrators group) does not have full database privileges in the repository database, you need to edit the database user to allow all privileges in the database.

    Q. I created a new group and removed the Browse Repository privilege from the group. Why does every user in the group still have that privilege?

    Privileges granted to individual users take precedence over any group restrictions. Browse Repository is a default privilege granted to all new users and groups. Therefore, to remove the privilege from users in a group, you must remove the privilege from the group and from every user in the group.

    Q. I do not want a user group to create or edit sessions and batches, but I need them to access the Server Manager to stop the Informatica Server.

    To permit a user to access the Server Manager to stop the Informatica Server, you must grant them both the Create Sessions and Batches privilege and the Administer Server privilege. To restrict the user from creating or editing sessions and batches, you must restrict the user's write permissions on a folder level.

    Alternatively, the user can use pmcmd to stop the Informatica Server with the Administer Server privilege alone.

    Q. How does read permission affect the use of the command line program, pmcmd?

    To use pmcmd, you do not need to view a folder before starting a session or batch within the folder. Therefore, you do not need read permission to start sessions or batches with pmcmd. You must, however, know the exact name of the session or batch and the folder in which it exists.

    With pmcmd, you can start any session or batch in the repository if you have the Session Operator privilege or execute permission on the folder.


    Q. My privileges indicate I should be able to edit objects in the repository, but I cannot edit any metadata.

    You may be working in a folder with restrictive permissions. Check the folder permissions to see if you belong to a group whose privileges are restricted by the folder owner.

    Q. What is Event-Based Scheduling?

    When you use event-based scheduling, the Informatica Server starts a session when it locates the specified indicator file. To use event-based scheduling, you need a shell command, script, or batch file to create an indicator file when all sources are available. The file must be created or sent to a directory local to the Informatica Server. The file can be of any format recognized by the Informatica Server operating system. The Informatica Server deletes the indicator file once the session starts.
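A script that creates the indicator file once all sources are available might look like the following Python sketch. The file names (`orders.dat`, `customers.dat`, `load.ind`) and the function name are hypothetical, and a temporary directory stands in for the directory local to the Informatica Server.

```python
import tempfile
from pathlib import Path


def create_indicator_if_ready(source_files, indicator_path):
    """Create the indicator file only when every source file exists."""
    if all(Path(p).is_file() for p in source_files):
        Path(indicator_path).touch()
        return True
    return False


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    sources = [Path(workdir) / "orders.dat", Path(workdir) / "customers.dat"]
    indicator = Path(workdir) / "load.ind"

    sources[0].write_text("sample rows\n")
    before = create_indicator_if_ready(sources, indicator)  # one source still missing

    sources[1].write_text("sample rows\n")
    after = create_indicator_if_ready(sources, indicator)   # all sources present
    indicator_created = indicator.exists()

print(before, after, indicator_created)
```

In practice a scheduler or file-transfer job would call such a script after each source delivery, so the session starts only when the last source has arrived.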

    Use the following syntax to ping the Informatica Server on a UNIX system:

    pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}] [hostname:]portno

    Use the following syntax to start a session or batch on a UNIX system:

    pmcmd start {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno [folder_name:]{session_name | batch_name} [:pf=param_file] session_flag wait_flag

    Use the following syntax to stop a session or batch on a UNIX system:

    pmcmd stop {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno [folder_name:]{session_name | batch_name} session_flag

    Use the following syntax to stop the Informatica Server on a UNIX system:

    pmcmd stopserver {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno

    Q. What are the different types of Commit intervals?

    The different commit intervals are:

    Target-based commit. The Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.

    Source-based commit. The Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.

    Q. What are the tools provided by Designer?

    The Designer provides the following tools:


    Source Analyzer. Use to import or create source definitions for flat file, XML, Cobol, ERP, and relational sources.

    Warehouse Designer. Use to import or create target definitions.

    Transformation Developer. Use to create reusable transformations.

    Mapplet Designer. Use to create mapplets.

    Mapping Designer. Use to create mappings.

    Q. What is a transformation?

    A transformation is a repository object that generates, modifies, or passes data. You configure logic in a transformation that the Informatica Server uses to transform data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.

    Each transformation has rules for configuring and connecting in a mapping. For more information about working with a specific transformation, refer to the chapter in this book that discusses that particular transformation.

    You can create transformations to use once in a mapping, or you can create reusable transformations to use in multiple mappings.

    Q. What are the different types of Transformations? (Mascot)

    a) Aggregator transformation: The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums. The Aggregator transformation is unlike the Expression transformation, in that you can use the Aggregator transformation to perform calculations on groups. The Expression transformation permits you to perform calculations on a row-by-row basis only. (Mascot)

    b) Expression transformation: You can use the Expression transformation to calculate values in a single row before you write to the target. For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. You can use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

    c) Filter transformation: The Filter transformation provides the means for filtering rows in a mapping. You pass all the rows from a source transformation through the Filter transformation, and then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.

    d) Joiner transformation: While a Source Qualifier transformation can join data originating from a common source database, the Joiner transformation joins two related heterogeneous sources residing in different locations or file systems.


    e) Lookup transformation: Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Import a lookup definition from any relational database to which both the Informatica Client and Server can connect. You can use multiple Lookup transformations in a mapping.

    The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. Use the result of the lookup to pass to other transformations and the target.
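The contrast between row-level transformations (Expression, Filter) and the group-level Aggregator in the list above can be sketched in Python. The functions and the `employees` sample rows are illustrative assumptions, not Informatica objects.

```python
from collections import defaultdict

employees = [
    {"first": "Ann", "last": "Lee", "dept": 10, "salary": 1000},
    {"first": "Bob", "last": "Ray", "dept": 10, "salary": 3000},
    {"first": "Cy", "last": "Poe", "dept": 20, "salary": 2000},
]


def expression(row):
    """Expression-style: derive new values from a single row."""
    return {**row, "full_name": row["first"] + " " + row["last"]}


def filter_rows(rows, condition):
    """Filter-style: pass only the rows that meet the condition."""
    return [row for row in rows if condition(row)]


def aggregate(rows, group_key):
    """Aggregator-style: compute sums and averages per group of rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row["salary"])
    return {k: {"sum": sum(v), "avg": sum(v) / len(v)} for k, v in groups.items()}


named = [expression(row) for row in employees]
well_paid = filter_rows(named, lambda row: row["salary"] >= 2000)
by_dept = aggregate(employees, "dept")
print([row["full_name"] for row in well_paid])
print(by_dept)
```

The expression and filter steps never look beyond the current row, whereas the aggregate step must see every row of a group before it can emit a result.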

    Q. What is the difference between Aggregate and Expression Transformation? (Mascot)

    Q. What is Update Strategy?

    When we design our data warehouse, we need to decide what type of information to store in targets. As part of our target table design, we need to determine whether to maintain all the historic data or just the most recent changes.

    The model we choose constitutes our update strategy: how to handle changes to existing records.

    Update strategy flags a record for update, insert, delete, or reject. We use this transformation when we want to exert fine control over updates to a target, based on some condition we apply. For example, we might use the Update Strategy transformation to flag all customer records for update when the mailing address has changed, or flag all employee records for reject for people no longer working for the company.
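The flagging logic can be sketched in Python. The constants mirror the Informatica update strategy constants DD_INSERT (0), DD_UPDATE (1), DD_DELETE (2), and DD_REJECT (3); the `flag_row` function, the row layout, and the choice to reject unchanged rows are assumptions made for this example.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(source_row, target_by_id):
    """Decide which operation a source row should trigger on the target."""
    if not source_row["active"]:
        return DD_REJECT  # e.g. people no longer working for the company
    existing = target_by_id.get(source_row["id"])
    if existing is None:
        return DD_INSERT  # not in the target yet
    if existing["address"] != source_row["address"]:
        return DD_UPDATE  # mailing address has changed
    return DD_REJECT      # assumption: unchanged rows are skipped


target = {
    1: {"id": 1, "address": "Old St"},
    2: {"id": 2, "address": "Same Rd"},
}
source = [
    {"id": 1, "address": "New St", "active": True},
    {"id": 2, "address": "Same Rd", "active": True},
    {"id": 3, "address": "Fresh Ave", "active": True},
    {"id": 1, "address": "New St", "active": False},
]
flags = [flag_row(row, target) for row in source]
print(flags)
```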

    Q. Where do you define update strategy?

    We can set the update strategy at two different levels:

    Within a session. When you configure a session, you can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.

    Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.

    Q. What are the advantages of having the Update strategy at Session Level?

    Q. What is a lookup table? (KPIT Infotech, Pune)

    The lookup table can be a single table, or we can join multiple tables in the same database using a lookup query override. The Informatica Server queries the lookup table or an in-memory cache of the table for all incoming rows into the Lookup transformation.

    If our mapping includes heterogeneous joins, we can use any of the mapping sources or mapping targets as the lookup table.


    Q. What is a Lookup transformation and what are its uses?

    We use a Lookup transformation in our mapping to look up data in a relational table, view, or synonym.

    We can use the Lookup transformation for the following purposes:

    Get a related value. For example, our source table includes the employee ID, but we want to include the employee name in our target table to make our summary data easier to read.

    Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).

    Update slowly changing dimension tables. We can use a Lookup transformation to determine whether records already exist in the target.
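Two of these uses, getting a related value and checking whether a record already exists, can be sketched in Python with a dictionary standing in for the in-memory lookup cache. The function names and sample tables are hypothetical.

```python
def build_lookup_cache(lookup_rows, key):
    """Read the lookup table once and index it by the lookup key."""
    return {row[key]: row for row in lookup_rows}


def lookup(cache, key_value, return_field, default=None):
    """Return one field of the matching lookup row, or a default if none matches."""
    row = cache.get(key_value)
    return default if row is None else row[return_field]


employee_table = [
    {"emp_id": 101, "name": "Ann Lee"},
    {"emp_id": 102, "name": "Bob Ray"},
]
cache = build_lookup_cache(employee_table, "emp_id")

# Get a related value: add the employee name to each source row.
sales = [{"emp_id": 101, "amount": 50}, {"emp_id": 999, "amount": 20}]
enriched = [{**row, "name": lookup(cache, row["emp_id"], "name", "UNKNOWN")} for row in sales]

# Existence check, as used when maintaining slowly changing dimensions.
already_in_target = [row["emp_id"] in cache for row in sales]
print(enriched, already_in_target)
```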

    Q. What are connected and unconnected Lookup transformations?

    We can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or we can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.

    An unconnected Lookup transformation exists separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation.

    A common use for unconnected Lookup transformations is to update slowly changing dimension tables.

    Q. What is the difference between connected lookup and unconnected lookup?

    Differences between Connected and Unconnected Lookups:

    Connected Lookup                                   | Unconnected Lookup
    Receives input values directly from the pipeline.  | Receives input values from the result of a :LKP expression in another transformation.
    We can use a dynamic or static cache.              | We can use a static cache.
    Supports user-defined default values.              | Does not support user-defined default values.

    Q. What is Sequence Generator Transformation? (Mascot)

    The Sequence Generator transformation generates numeric values. We can use the Sequence Generator to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.


    The Sequence Generator transformation is a connected transformation. It contains two output ports that we can connect to one or more transformations.

    Q. What are the uses of a Sequence Generator transformation?

    We can perform the following tasks with a Sequence Generator transformation:

    Create keys
    Replace missing values
    Cycle through a sequential range of numbers

    Q. What are the advantages of Sequence generator? Is it necessary, if so why?

    We can make a Sequence Generator reusable, and use it in multiple mappings. We might reuse a Sequence Generator when we perform multiple loads to a single target.

    For example, if we have a large input file that we separate into three sessions running in parallel, we can use a Sequence Generator to generate primary key values. If we use different Sequence Generators, the Informatica Server might accidentally generate duplicate key values. Instead, we can use the same reusable Sequence Generator for all three sessions to provide a unique value for each target row.
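Why separate generators can collide while a shared one cannot is easy to see in a small Python sketch. Here `itertools.count` stands in for a Sequence Generator, and the three chunks are processed one after another rather than in parallel; the names are illustrative only.

```python
from itertools import count


def load(rows, sequence):
    """Assign the next sequence value to each row as its primary key."""
    return [(next(sequence), row) for row in rows]


# One large input split into three chunks, one per session.
chunks = [["a", "b"], ["c", "d"], ["e", "f"]]

# A different generator per session: each starts at 1, so keys repeat.
separate_keys = [key for chunk in chunks for key, _ in load(chunk, count(1))]

# The same generator reused by all sessions: every key is unique.
shared = count(1)
shared_keys = [key for chunk in chunks for key, _ in load(chunk, shared)]

print(separate_keys)
print(shared_keys)
```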

    Q. How is the Sequence Generator transformation different from other transformations?

    The Sequence Generator is unique among all transformations because we cannot add, edit, or delete its default ports (NEXTVAL and CURRVAL).

    Unlike other transformations, we cannot override the Sequence Generator transformation properties at the session level. This protects the integrity of the sequence values generated.

    (1).Joiner Transformation

    A:-*The Joiner transformation (joiner.T) can be used to join two sources coming from two different locations or from the same location.

    *To join two sources, there must be at least one matching port between them.

    *We must specify one source as the master and the other source as the detail.

    *When the data comes from homogeneous databases, we should join it with a join condition in the Source Qualifier transformation (SQT).

    *When the data comes from heterogeneous databases, we must use the Joiner transformation.

    *It is an active and connected transformation.

    EG:-For example, to join a flat file and a relational source, to join two flat files, or to join a relational source and an XML source.

    Joiner Transformation types

    1. Normal join
    2. Master outer join
    3. Detail outer join
    4. Full outer join

    1. Normal join:-A normal join discards all rows from both the master and detail sources that do not match based on the condition.

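    2. Master outer join:-A master outer join keeps all rows from the detail source and only the matching rows from the master source.

    3. Detail outer join:-A detail outer join keeps all rows from the master source and only the matching rows from the detail source.

    4. Full outer join:-A full outer join keeps all rows from both the master and detail sources.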
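
    The four join types listed above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not how the Joiner is implemented: the joiner function, the dept_id key, and the sample rows are all hypothetical, and unmatched rows simply lack the other side's columns instead of carrying NULLs.

```python
def joiner(master, detail, key, join_type="normal"):
    """Join two lists of dicts on `key` using one of the four Joiner join types."""
    master_keys = {m[key] for m in master}
    detail_keys = {d[key] for d in detail}

    # Rows that satisfy the join condition (an equi-join on `key`).
    result = [{**m, **d} for m in master for d in detail if m[key] == d[key]]

    if join_type in ("master_outer", "full_outer"):
        # Master outer: keep every detail row, even without a master match.
        result += [d for d in detail if d[key] not in master_keys]
    if join_type in ("detail_outer", "full_outer"):
        # Detail outer: keep every master row, even without a detail match.
        result += [m for m in master if m[key] not in detail_keys]
    return result


master = [{"dept_id": 10, "dept": "SALES"}, {"dept_id": 20, "dept": "HR"}]
detail = [{"dept_id": 10, "emp": "Ann"}, {"dept_id": 30, "emp": "Bob"}]

for join_type in ("normal", "master_outer", "detail_outer", "full_outer"):
    print(join_type, len(joiner(master, detail, "dept_id", join_type)))
```

    With this sample data the normal join returns one row, each outer join returns two, and the full outer join returns three.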

    Note :-(1). The Joiner transformation decreases performance.

    (2). Normal and master outer joins are faster than detail outer and full outer joins.

    (3). Normal join is the default join type in the Joiner transformation.

    (4). Non-equi joins are not supported by the Joiner transformation.

    (5). To improve performance, use the source with fewer records as the master.

    Scenarios

    Q) We have 10 sources. Using the Joiner T/R, how many joins should you use to join them?

    ANS: - The general formula is n-1, so 10-1=9.

    Q) We have an Oracle source; from this source we drag two tables, EMP and DEPT, and there is no common column between these two tables. How could you join them?

    ANS: No, we cannot join them without at least one common port.

    Q) Which join gives more performance and why?

    ANS: Normal join, because a normal join returns only the records that match the condition. A detail outer join returns the records that match the condition plus the remaining master table records, so performance decreases. Similarly, a master outer join returns the records that match the condition plus the remaining detail table records, so performance decreases.

    Q) Is normal join = equi join or not?

    ANS: Yes.

    Q) What is the use of sorted input option? In which T/R can you use it?

    ANS: Joiner, Aggregator and Source qualifier.

    Q) If you use a joiner T/R, what rules you should follow?

    ANS:

    Q)In joiner T/R how can you improve the performance?

    ANS: By using sorted input, normal join condition we will us

    2. Source Qualifier Transformation:-

    When we add a relational or flat file source to a mapping, we connect it to a Source Qualifier transformation. It represents the records the PowerCenter Server reads when it runs a session. It performs various tasks, such as:

    Overriding the default SQL query
    Filtering the records
    Joining data from two or more tables
    Sorting the ports
    Selecting distinct values
    Creating and dropping indexes by using pre-SQL and post-SQL
    Creating custom records
    Using mapping parameters and variables

    Note:-1. It is a default Transformation.

    2.

    4. Expression Transformation:-

    5. Stored Procedure Transformation

    A:-*It is a passive transformation, because it does not change the number of rows that pass through it from source to target.

    *It is an important tool for populating and maintaining databases.

    *A stored procedure is a precompiled collection of Transact-SQL, PL/SQL, or other database procedural statements, similar to an executable script.

    *Stored procedures are stored and run within the database.

    *Stored procedures allow user-defined variables, conditional statements, and other powerful programming features. They perform various tasks, such as:

    Check the status of a target database before loading data into it.
    Determine if enough space exists in a database.
    Perform a specialized calculation.
    Drop and recreate indexes.

    *It can be either a connected or an unconnected transformation. A connected transformation is connected to other transformations in the data flow, whereas an unconnected transformation is not connected to other transformations and is not part of the data flow.

    *Run a stored procedure every time a row passes through the transformation: use a connected or unconnected Stored Procedure transformation.

    *Pass parameters to the stored procedure and receive a single output parameter: use a connected or unconnected S.P.T.

    *Pass parameters to the stored procedure and receive multiple output parameters: use a connected or unconnected S.P.T.

    *Run a stored procedure before or after our session: use an unconnected S.P.T.

    *Run a stored procedure once during our mapping, such as pre- or post-session: use an unconnected S.P.T.

    *Call a stored procedure multiple times within a mapping: use an unconnected S.P.T.

    # Properties in Stored Procedure transformation:-

    1. Stored Procedure Name:-
    2. Connection Information:-
    3. Call Text:-
    4. Stored Procedure Type:-
    5. Execution Order:-

    Using Type 2 Slowly Changing Dimensions

    With Type 2 SCD, you always create another version of the dimension record and mark the existing version as history. To accommodate this, you need to create extra metadata for your dimension table, including an effective date column and an expiration date column. These columns are used to differentiate a current version from a historical version as follows:

    Effective date column stores the effective date of the version; also known as start date

    Expiration date column stores the expiration date of the version; also known as end date

    Expiration date value of the current version is always set to NULL or a default date value

    You also need to decide which columns you want to store historic data for when the values are to be changed. These columns are defined as trigger columns and should be described as part of your metadata.
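
    A minimal Python sketch of the Type 2 logic described above, assuming a dimension held as a list of dicts. The names apply_scd2, customer_id, city, and TRIGGER_COLUMNS are hypothetical and chosen only for illustration; a real mapping would do this with lookups and update strategies against the dimension table.

```python
from datetime import date

TRIGGER_COLUMNS = ("city",)  # hypothetical trigger column(s)


def apply_scd2(dimension, incoming, load_date):
    """Apply one incoming record to a Type 2 dimension held as a list of dicts."""
    # The current version is the one whose expiration date is still NULL (None).
    current = next(
        (row for row in dimension
         if row["customer_id"] == incoming["customer_id"]
         and row["expiration_date"] is None),
        None,
    )
    if current is not None:
        changed = any(current[c] != incoming[c] for c in TRIGGER_COLUMNS)
        if not changed:
            return dimension  # no trigger column changed, nothing to version
        # Mark the existing version as history.
        current["expiration_date"] = load_date
    # Insert the new current version with an open-ended expiration date.
    dimension.append({**incoming, "effective_date": load_date, "expiration_date": None})
    return dimension


dim = [{"customer_id": 1, "city": "Pune",
        "effective_date": date(2008, 1, 1), "expiration_date": None}]
apply_scd2(dim, {"customer_id": 1, "city": "Delhi"}, date(2008, 5, 30))

for row in dim:
    print(row)
```

    After the call, the Pune row carries an expiration date and the Delhi row is the current version.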

    (7)What is the effect of the OPTIONS statement ERRORS=1

    Stop on errors = 1: if you set this option to 1, the session stops after the first error row occurs. If it is 0, the session does not stop no matter how many errors occur.

    (8)What are push and pull ETL strategies?

    (9)How do you tell aggregator stage that input data is already sorted

    Can anyone please explain why and where exactly we use the lookup transformations?

    Lookups can be used for validation purpose.

    (10)What is a three tier data warehouse?

    A three-tier data warehouse contains three tiers: bottom tier, middle tier, and top tier.

    The bottom tier deals with retrieving related data or information from various information repositories by using SQL.
    The middle tier contains two types of servers: 1. ROLAP server 2. MOLAP server
    The top tier deals with presentation or visualization of the results.

    (11)What are the various methods of getting incremental records or delta records from the source systems?

    Getting incremental records from source systems to the target can be done by using incremental aggregation, or

    we can use a control table update and ipf files for capturing incremental or delta data from a source. The control table maintains details such as from which timestamp (previous) to which timestamp (current) we have taken the data.

    If the session takes data every day (daily run), then the delta covers one day. The previous timestamp is yesterday's date (P1) and the current timestamp is the time the job runs (C1). So in today's run we get one day of data.

    The next day when the job runs, C1 becomes P2 and today's run time becomes C2, so we do not miss any records added in the source system. This goes on. In the case of weekly runs, the delta covers one week.
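
    The control-table approach above can be sketched in Python as follows. This is a simplified illustration, not Informatica code: the extract_delta function, the updated_at column, and the last_extract_ts control field are hypothetical names, and the control table is a plain dict.

```python
from datetime import datetime


def extract_delta(source_rows, control, run_time):
    """Return rows changed in (previous, current] and advance the control table."""
    previous = control["last_extract_ts"]  # P: where the last run stopped
    current = run_time                     # C: time of this run
    delta = [r for r in source_rows if previous < r["updated_at"] <= current]
    control["last_extract_ts"] = current   # this run's C becomes the next run's P
    return delta


source = [
    {"id": 1, "updated_at": datetime(2008, 5, 28, 9, 0)},
    {"id": 2, "updated_at": datetime(2008, 5, 29, 14, 0)},
    {"id": 3, "updated_at": datetime(2008, 5, 30, 8, 0)},
]
control = {"last_extract_ts": datetime(2008, 5, 28, 23, 0)}

day1 = extract_delta(source, control, datetime(2008, 5, 29, 23, 0))
day2 = extract_delta(source, control, datetime(2008, 5, 30, 23, 0))
print([r["id"] for r in day1])  # [2]
print([r["id"] for r in day2])  # [3]
```

    Because each run starts exactly where the previous one ended, no record is picked up twice and none is skipped.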

    (12) What are the different Lookup methods used in Informatica?

    The Lookup transformation has mainly 2 types:

    1)connected lookup 2)unconnected lookup

    Connected lookup: 1)It receives the value directly from the pipeline

    2)It can use both dynamic and static caches

    3)It returns multiple values

    4)It supports user-defined default values

    (2)Unconnected lookup: 1)It receives the value from a :LKP expression

    2)It can use only a static cache

    3)It returns only a single value

    4)It does not support user-defined default values

    (19) Where do we use connected and unconnected lookups?

    If there is only one return port, we can go for an unconnected lookup.

    More than one return port is not possible with an unconnected lookup. If there is more than one return port, go for a connected lookup.

    A connected transformation is connected to other transformations or directly to the target table in the mapping. An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation. A connected lookup receives input values directly from the mapping pipeline, whereas an unconnected lookup receives values from a :LKP expression in another transformation. A connected lookup returns multiple columns from the same row, whereas an unconnected lookup has one return port and returns one column from each row. A connected lookup supports user-defined default values, whereas an unconnected lookup does not support user-defined default values.

    (13) What is a mapping, session, worklet, workflow, mapplet?

    A mapping represents dataflow from sources to targets.

    A mapplet creates or configures a set of transformations.

    A workflow is a set of instructions that tell the Informatica server how to execute the tasks.

    A worklet is an object that represents a set of tasks.

    A session is a set of instructions to move data from sources to targets.

    (14)How can we use mapping variables in Informatica? Where do we use them?

    After creating a variable, we can use it in any expression in a mapping or a mapplet. They can also be used in source qualifier filters, user-defined joins, or extract overrides, and in the expression editor of reusable transformations.

    Their values can change automatically between sessions.

    (15) What is Global and Local shortcut? Explain with advantages?

    Global shortcuts are across multiple repositories. Local shortcuts are across multiple folders in the same repository.

    *(16) In a real-time scenario, where is the update strategy transformation used? If we set DML operations in session properties, then what is the use of the update strategy transformation?

    Update logic can be defined at two levels:

    1. Mapping level.

    2. Session level.

    The importance of the Update Strategy transformation in both cases is as follows.

    In real time, if we want to update the existing record with the same source data, we can go for session-level update logic.

    If we want to apply a different set of rules for updating or inserting a record, even when that record already exists in the warehouse table, we can go for the mapping-level Update Strategy transformation.

    This is the case, for example, when we use a Router transformation to perform different activities.

    EX: If the employee 'X1234' is getting a bonus, then update the allowance to 10% less. If not, insert the record with the new bonus in the warehouse table.

    (17).Let's suppose we have some 10,000-odd records in the source system. When we load them into the target, how do we ensure that all 10,000 records loaded to the target don't contain any garbage values?

    How do we test it? We can't check every record, as the number of records is huge.

    You should have proper testing conditions in your ETL jobs for validating all the important columns before they are loaded into the target. Always have proper rejects to capture records containing garbage values.

    or

    Go into Workflow Monitor. After the status shows Succeeded, right-click the session and open its properties; there you can see the number of source rows, successful target rows, and rejected rows.
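
    The first approach above (validation conditions plus a reject stream) can be sketched in Python as follows. The rules, column names (emp_id, name, salary), and functions are hypothetical examples; real rules depend on the target table's columns.

```python
def validate(row):
    """Return a list of rule violations for one row (empty list means clean)."""
    errors = []
    if not str(row.get("emp_id", "")).isdigit():
        errors.append("emp_id is not numeric")
    name = row.get("name")
    if not name or not name.strip():
        errors.append("name is empty")
    salary = row.get("salary")
    if not isinstance(salary, (int, float)) or salary < 0:
        errors.append("salary is missing or negative")
    return errors


def split_rows(rows):
    """Route clean rows to the load list and bad rows, with reasons, to rejects."""
    loaded, rejected = [], []
    for row in rows:
        errors = validate(row)
        if errors:
            rejected.append((row, errors))
        else:
            loaded.append(row)
    return loaded, rejected


rows = [
    {"emp_id": "101", "name": "Ann", "salary": 5000},
    {"emp_id": "1x2", "name": "Bob", "salary": 4200},
    {"emp_id": "103", "name": "  ", "salary": -1},
]
loaded, rejected = split_rows(rows)
print(len(rows), len(loaded), len(rejected))  # 3 1 2
```

    The source count should always equal loaded plus rejected, which is the same reconciliation the Workflow Monitor row counts allow.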

    (18) What is Entity relation? How does it work with Datawarehousing ETL modeling?

    Ans: An entity is nothing but an object; it has characteristics. We call it an entity in terms of the logical view. The entity is called a table in terms of the physical view.

    The entity relationship is nothing but maintaining a primary key, foreign key relation between the tables for keeping the data and satisfying the normal form.

    There are 4 types of Entity Relationships.

    1.One-One,

    2.One-Many,

    3.Many-One,

    4.Many-Many.

    In data warehouse modeling, the entity relationship is nothing but a relationship between dimension and fact tables (i.e., primary key, foreign key relations between these tables).

    The fact table gets data from the dimension tables because it contains the primary keys of the dimension tables as foreign keys, for getting summarized data for each record.

    or

    An entity is nothing but an object; usually it is a table containing the attributes (columns). ETL is a tool, not a modeling technique. It is useful for transferring the data from the (heterogeneous) sources to the target (warehouse). Modeling is more useful for forming the facts and dimensions, which are used in the decision-making process using OLAP tools after forming the warehouse.

    (20) What are the various test procedures used to check whether the data is loaded in the backend, the performance of the mapping, and the quality of the data loaded in INFORMATICA?

    The best procedure is to take the help of the debugger, where we monitor each and every process of the mappings and how data is loading, based on conditional breaks.

    Or

    You can check the session log files for the total number of records added, the number of records updated, the number of rejected records, and the errors related to them. This is the answer the interviewer is expecting.

    (21) what is the metadata extension?

    Informatica Client applications can contain the following types of metadata extensions:

    Vendor-defined. Third-party application vendors create vendor-defined metadata extensions. You can view and change the values of vendor-defined metadata extensions, but you cannot create, delete, or redefine them.

    User-defined. You create user-defined metadata extensions using PowerCenter/PowerMart. You can create, edit, delete, and view user-defined metadata extensions. You can also change the values of user-defined extensions.

    (22)Can we look up a table from a source qualifier transformation, i.e. an unconnected lookup?

    I think we cannot look up through a source qualifier, as we use the source to look up the tables, so it is not possible.

    or

    You cannot look up from a source qualifier directly. However, you can override the SQL in the source qualifier to join with the lookup table to perform the lookup.

    (22)Can Informatica load heterogeneous targets from heterogeneous sources?

    Informatica can load heterogeneous targets from heterogeneous sources.

    Or: yes, but it is supported only in 7.1.

    What are parameter files? Where do we use them?

    Parameter files are used to supply static values for the execution of a task. The file can be modified or updated later to change a parameter. For example, if xyz="RAJAT" is defined in the parameter file, then wherever XYZ is used in the mapping, the value is automatically taken as RAJAT. If we want to change that, we can change it to any other varchar or int value in the file.

    This file is referenced in the session properties, which give the physical path on the server to read it from.

    (24) What are the modules in Power Mart?

    1. PowerMart Designer
    2. Server
    3. Server Manager
    4. Repository
    5. Repository Manager

    How can we use mapping variables in Informatica? Where do we use them?

    After creating a variable, we can use it in any expression in a mapping or a mapplet. They can also be used in source qualifier filters, user-defined joins, or extract overrides, and in the expression editor of reusable transformations. Their values can change automatically between sessions.

    http://www.geekinterview.com/question_details/249

    Techniques of Error Handling - Ignore, rejecting bad records to a flat file, loading the records and reviewing them (default values)

    Records can be rejected either at the database, due to a constraint or key violation, or by the Informatica server when writing data into the target table. We can find these rejected records in the badfiles folder, where a reject file is created for a session. We can check why a record has been rejected. This bad file contains a row indicator as its first column and a column indicator as its second column.

    The column indicators are of four types:
    D - valid data,
    O - overflowed data,
    N - null data,
    T - truncated data.
    Depending on these indicators, we can make changes to load the data successfully to the target.

    11. Cached Lookup and an Uncached Lookup

    (A) For a cached lookup, all the rows of the lookup table are put in the buffer, and these rows are compared with the incoming rows.

    Whereas for an uncached lookup, the lookup queries the lookup table for every input row and gets the rows.
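
    The difference between the two can be sketched in Python using an in-memory SQLite table as a stand-in for the lookup table. This only illustrates the access pattern and is not how the Informatica server builds its cache; the dept table and both functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [(10, "SALES"), (20, "HR")])

incoming = [10, 20, 10, 30]


def uncached_lookup(dept_id):
    """Uncached: query the lookup table once for every input row."""
    row = conn.execute(
        "SELECT dept_name FROM dept WHERE dept_id = ?", (dept_id,)
    ).fetchone()
    return row[0] if row else None


# Cached: read the whole lookup table into memory once, then compare in memory.
cache = dict(conn.execute("SELECT dept_id, dept_name FROM dept"))


def cached_lookup(dept_id):
    """Cached: answer from the in-memory copy without touching the database."""
    return cache.get(dept_id)


print([uncached_lookup(d) for d in incoming])
print([cached_lookup(d) for d in incoming])
```

    Both lists are identical; the cached version issues one query in total, while the uncached version issues one query per input row.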

    12 What is Assignment task in informatica? In what situation will this task be executed? Where does this task exist?

    The Assignment task allows you to assign a value to a user-defined workflow variable. To use an Assignment task in the workflow, first create and add the Assignment task to the workflow. Then configure the Assignment task to assign values or expressions to user-defined variables. After you assign a value to a variable using the Assignment task, the PowerCenter Server uses the assigned value for the variable during the remainder of the workflow.

    parameter file

    How to create a parameter file and how to use it in a mapping? Explain with an example.

    Place your parameter file, with data in it, in the server "srcfiles" directory. In the Mapping Designer window of PowerCenter Designer, click "Mapping" and then "Parameter and variable". Add all the parameters here one by one.

    The variables added above will now be available in your mapping with a "$$" prefix. Each variable in turn picks up its value from the parameter file. Do not forget to give the "parameter filename" in the "property" tab of the task in Workflow Manager.

    What is Target Update Override? What is the use?

    Target update override is like the source qualifier override. It is useful for updating the target without using the Update Strategy transformation.

    What are tracing levels in transformation?

    The tracing level in Informatica specifies the level of detail of the information recorded in the session log file while executing the workflow.

    4 types of tracing levels are supported:

    1. Normal: logs initialization and status information, a summary of the successful rows and target rows, and information about the rows skipped due to transformation errors.

    2. Terse: logs initialization information, error messages, and notification of rejected data. It records less detail than Normal.

    3. Verbose Initialization: in addition to Normal tracing, logs the location of the data cache files and index cache files that are created, and detailed transformation statistics for each and every transformation within the mapping.

    4. Verbose Data: in addition to Verbose Initialization, logs each and every record processed by the Informatica server.

    For better performance of mapping execution, the tracing level should be specified as Terse.

    Verbose Initialization and Verbose Data are used for debugging purposes.

    Why do we need lookup SQL override? Do we write SQL override in lookup with a special aim?

    Lookup override can be used to get some specific records (using filters in the WHERE clause) from the lookup table. The advantage is that the whole table need not be looked up.

    What is pre-session and post-session?

    Pre-session: the pre-session command executes before the session runs.
    Post-session: the post-session script runs after the session completes.

    What is a Shortcut and what is the difference between a Shortcut and a Reusable Transformation?

    A Reusable Transformation can only be used within the folder, but a shortcut can be used anywhere in the Repository and will point to the actual transformation.

    14.What are the locks with respect to mappings?

    How do you manually lock or unlock the mappings for changes?

    If your PowerCenter repository is version-enabled, then you need to check out and check in objects in order to modify them (including mappings). This is how you do version control and locking on objects.

    16. PMCMD performs the following tasks:
    1)start and stop batches and sessions
    2)recover sessions
    3)stop the Informatica server
    4)schedule the sessions by shell scripting
    5)schedule the sessions by using operating system schedule tools like CRON

    17.Explain Session Recovery Process?

    You have 3 steps in session recovery:

    If the Informatica server performs no commit, run the session again.
    If it performs at least one commit, perform recovery.
    If recovery is not possible, truncate the target table and run the session again.

    What are the Commit & Commit Intervals?

    The commit interval is the interval at which the Informatica server commits data to the target.

    18.How to run a workflow without using GUI, i.e. Workflow Manager, Workflow Monitor and pmcmd?

    pmcmd is not a GUI. It is a command you can use within a unix script to run the workflow.

    or

    Unless the job is scheduled, you cannot manually run a workflow without using a GUI.
