ab initio training - part 1

Upload: sanjay-nayak

Post on 04-Jun-2018

  • 8/13/2019 Ab Initio Training - Part 1

    1/105

    24 August 2007

    A Practical Introduction to

    Ab Initio Software:

    Part 1

    Course Structure

    Part 1: Basic Concepts and DML

    Part 2: Building Applications & Parallelism

    Part 3: Parallel Topics

    Database Connectivity (Optional)

    Intermediate

    Exercises

    Day 1

    Day 2

    Finger Exercises

    What Does Ab Initio Mean?

    Ab Initio is Latin for "From the Beginning."

    From the beginning our software was designed to support a complete range of business applications, from simple to the most complex. Crucial capabilities like parallelism and checkpointing can't be added after the fact.

    The Graphical Development Environment and a powerful set of components allow our customers to get valuable results from the beginning.

    Ab Initio's Focus

    Moving Data

    move small and large volumes of data in an efficient manner

    deal with the complexity associated with business data

    High Performance

    scalable solutions

    Better productivity

    Ab Initio Software

    Ab Initio software is a general-purpose data processing platform for mission-critical applications such as:

    Data warehousing

    Batch processing

    Click-stream analysis

    Real Time Applications

    Data movement

    Data transformation

    Parallel Computer Architecture

    Computers come in many shapes and sizes:

    Single-CPU, Multi-CPU

    Network of single-CPU nodes

    Network of multi-CPU nodes

    Multi-CPU machines are often called SMPs (for Symmetric MultiProcessors).

    Specially-built networks of machines are often called MPPs (for Massively Parallel Processors).

    A Multi-CPU Computer (SMP)

    A Network of Multi-CPU Nodes

    A Network of Networks

    Ab Initio Provides For:

    Distribution - a platform for applications to execute across a collection of processors within the confines of a single machine or across multiple machines.

    Reduced Run Time Complexity - the ability for applications to run in parallel on any combination of computers where the Ab Initio Co>Operating System is installed, from a single point of control.

    Applications of Ab Initio Software

    Processing just about any form and volume of data.

    Parallel sort/merge processing.

    Data transformation.

    Rehosting of corporate data.

    Parallel execution of existing applications.

    Applications of Ab Initio Software

    Front end of Data Warehouse:

    Transformation of disparate sources

    Aggregation and other preprocessing

    Referential integrity checking

    Database loading

    Back end of Data Warehouse:

    Extraction for external processing

    Aggregation and loading of Data Marts

    Ab Initio Product Architecture

    Layers shown in the architecture figure:

    Native Operating System (Unix, Windows, OS/390)

    The Ab Initio Co>Operating System

    Component Library

    3rd Party Components

    User-defined Components

    Development Environments: GDE, Shell

    User Applications

    Ab Initio EME

    Co>Operating System Services

    Parallel and distributed application execution

    Control

    Data Transport

    Transactional semantics at the application level.

    Checkpointing.

    Monitoring and debugging.

    Parallel file management.

    Metadata-driven components.

    The Graph Model

    The Graph Model: Naming the Pieces

    Datasets

    Components

    Flows

    The Graph Model: Some Details

    Ports

    Record format metadata

    Expression metadata

    Components

    Components may run on any computer running the Co>Operating System.

    Different components do different jobs.

    The particular work a component accomplishes depends upon its parameter settings.

    Some parameters are data transformations, that is, business rules applied to one or more inputs to produce a required output.

    Datasets

    A dataset is a source or destination of data. It can be a simple file, a database table, a SAS dataset, ...

    Datasets may reside on any machine running the Co>Operating System.

    Datasets may reside on other machines if connected by FTP or database middleware.

    Data is always described by record format metadata (termed DML).

    Dataset: Records and Fields

    A dataset is made up of records; a record consists of fields.

    Analogous database terms are rows and columns.

    0345John Smith
    0212Sam Spade
    0322Elvis Jones
    0492Sue West
    0121Mary Forth
    0221Bill Black

    Records

    Fields

    Sources of Record Format Metadata

    Record formats can be generated from:

    Database catalogs

    COBOL copybooks

    Other third-party products

    SAS datasets

    One can always resort to manual entry!

    A Sandbox Environment

    Setting up a standard working environment helps a development team work together.

    The Sandbox capability allows an application to be designed to be trivially portable.

    Managing the Sandbox contents is a project administration function.

    Sandbox Parameters

    Start the Ab Initio GDE

    Open mp/figure-01.mp

    Go to Project > Edit Sandbox...

    Environment: Quick Overview

    $AI_RUN - run directory

    $AI_DML - record format files

    $AI_XFR - transform files

    $AI_MP - graphs

    $AI_DB - database config files

    $AI_SERIAL - serial source data, other serial data

    $AI_MFS - Ab Initio multifile directory; in training it will also contain partition directories (more about this later!)

    $AI_LOG - a location to place logging files, etc.

    Environment: Overview

    We will make use of environment variables (shortcuts, parms) during class.

    The goal is to have a development environment which enables the migration of a graph or set of graphs to any other environment with absolutely no changes.
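
As a rough illustration of why parameterized paths make a graph portable, here is a small Python sketch (an emulation, not Ab Initio code). The directory values and the file name customers.dml are hypothetical; only the parameter names $AI_DML and $AI_SERIAL come from the slides.

```python
from string import Template

# Hypothetical sandbox parameter values for two environments (assumed paths).
DEV = {"AI_DML": "/home/dev/sand/dml", "AI_SERIAL": "/data/dev/serial"}
PROD = {"AI_DML": "/apps/prod/sand/dml", "AI_SERIAL": "/data/prod/serial"}


def resolve(path, params):
    """Expand $NAME references in a path using one environment's parameters."""
    return Template(path).substitute(params)


# The graph stores only the parameterized path, so the graph itself never changes.
graph_path = "$AI_DML/customers.dml"  # hypothetical file name
dev_path = resolve(graph_path, DEV)
prod_path = resolve(graph_path, PROD)
print(dev_path)
print(prod_path)
```

The same stored path resolves to a different physical location in each environment, which is the property the slide asks for.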

    Viewing Component Properties

    Double click on a component to bring up its Properties Page.

    Viewing Port Properties

    Click on the Ports Tab to view the Port(s) Properties.

    Record Format Metadata in Graphical Form

    0345John Smith
    0212Sam Spade
    0322Elvis Jones
    0492Sue West
    0121Mary Forth
    0221Bill Black

    Editing Types in GDE

    Field name Field type Field length

    Don't do a Save when exiting.

    The Record Format Metadata in text form

    record

    decimal(4) id;

    string(6) first_name;

    string(6) last_name;

    string(5) newfield;

    end
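
To make the idea of a fixed-length record format concrete, the following Python sketch (an emulation, not DML) slices a line into fields by width. It covers only the id, first_name, and last_name fields, since the sample data shown earlier has no newfield, and it assumes that string fields are padded with spaces to their declared length.

```python
from decimal import Decimal

# (name, kind, width), mirroring:
#   decimal(4) id; string(6) first_name; string(6) last_name;
RECORD_FORMAT = [
    ("id", "decimal", 4),
    ("first_name", "string", 6),
    ("last_name", "string", 6),
]


def parse_fixed(line, fmt=RECORD_FORMAT):
    """Cut a fixed-width line into named fields according to fmt."""
    record, pos = {}, 0
    for name, kind, width in fmt:
        raw = line[pos:pos + width]
        # Decimal fields become numbers; string fields keep their padding.
        record[name] = Decimal(raw) if kind == "decimal" else raw
        pos += width
    return record


rec = parse_fixed("0345John  Smith ")  # assumed space padding
print(rec)
```

Each field's position is determined entirely by the lengths of the fields before it, which is why the record format must describe the data exactly.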

    Field Names

    Names consist of letters, digits, and underscores:

    a-z, A-Z, 0-9, _

    Note: No spaces, hyphens, $s, #s, %s

    Case does matter! ABC and abc are different!

    Some words are reserved (record, end, date, ...)

    Field Type and Field Length

    There are several built-in types available via the drop-down menu. This course uses three types: string, decimal (for all numbers), and date.

    A date type requires a format specifier that is an exact representation of the date (e.g., "MM-DD-YYYY").

    A field length is either a number for fixed-length fields, or the delimiter that terminates the field for variable-length fields.

    What Data Can Be Described?

    There are both fixed-size and variable-length types.

    ASCII, EBCDIC, UNICODE character sets are supported.

    Supported types can represent strings, numbers, binary numbers, packed decimals, dates, ...

    Complex data formats can consist of nested records, vectors, ...

    Access to Field Characteristics

    Some aspects of field descriptions (e.g., date formats) must be accessed via the attribute pane.

    To see additional attributes, use the Attributes item on the Record Format Editor's View menu or use the Attributes button.

    More Record Format Editing

    View Attributes.

    Field Type drop-down

    Length can be delimiter string

    Date format goes here

    Text Record Format for Date Field

    record

    decimal(4) id;

    string(6) first_name;

    string(6) last_name;

    date("YYYY-DD-MM") newfield;

    end;

    Expressions in DML

    Computations are expressed in the algebraic syntax of C, Pascal, etc.

    Field names act as variables.

    Arithmetic operators: +, -, *, ...

    Comparison operators: >,

    Viewing Data (mp/figure-01.mp)

    1. Right click on dataset.

    2. Select View Data...

    The View Data Panel

    Evaluating Expressions from View Data

    Type in an expression...

    or use the expression editor

    Expression Editor

    Expression text

    Fields Functions Operators

    Exercise 1: Writing DML

    Open mp/ex1.mp

    The data file ex1.dat contains these lines:

    Smith,John,1992.02.23,2400

    Jones,Jane,1993.10.29,320

    Warren,Jake,1994.11.02,9045

    Use the Record Format Editor (New) to create a description of this data: lastname, firstname, pur_date, and amt. Then use View Data to verify the description is correct.

    Hint: Newline delimiters are written: \n

    Simple Components

    In these components the record format metadata does not change from input to output.

    The Filter by Expression Component

    For each record on the input port the select_expr parameter is evaluated. If select_expr evaluates true (non-zero), the input record is written to the out port exactly as the input was read.

    If the select_expr evaluates false (zero), the record is written to the deselect port.

    The out port must be connected downstream; it carries the records that meet the select_expr criteria.

    The deselect port may optionally be used.
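
The routing described above can be sketched in Python (an emulation, not the component itself). The function name filter_by_expression and the threshold 300 are illustrative choices; the ids come from the sample dataset on the earlier slides.

```python
def filter_by_expression(records, select_expr):
    """Route each record to 'out' if select_expr is true, else to 'deselect'."""
    out, deselect = [], []
    for rec in records:
        # Records pass through unchanged; only the destination port differs.
        (out if select_expr(rec) else deselect).append(rec)
    return out, deselect


records = [
    {"id": 345, "first_name": "John", "last_name": "Smith"},
    {"id": 212, "first_name": "Sam", "last_name": "Spade"},
    {"id": 322, "first_name": "Elvis", "last_name": "Jones"},
    {"id": 492, "first_name": "Sue", "last_name": "West"},
    {"id": 121, "first_name": "Mary", "last_name": "Forth"},
    {"id": 221, "first_name": "Bill", "last_name": "Black"},
]

out, deselect = filter_by_expression(records, lambda r: r["id"] > 300)
print([r["id"] for r in out])
print([r["id"] for r in deselect])
```

Every input record ends up on exactly one of the two ports, and input order is preserved on each.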

    Filter Data (Selection) (figure-02)

    1. Push Run button.

    2. View monitoring information.

    3. View output data.

    Expression Parameter

    Exercise 2: Data Filtering (Selection)

    Using example graph figure-02.mp, change the select expression parameter of the Filter by Expression component to select records with id greater than 215.

    Run the application and examine the resulting data.

    Keys

    A key identifies a single field or set of fields (a composite key) used to organize a dataset in some way.

    Single field: {id}

    Multiple field: {last_name; first_name}

    Modifiers: {id descending}

    Used for sorting, grouping, partitioning.

    (See the Data Manipulation Language Reference for more information on keys. Note: keys are also called collators.)
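
For readers who think in general-purpose languages, the three key forms above map roughly onto sort keys as in this Python sketch (an emulation, not DML). The records are the sample dataset from the earlier slides.

```python
from operator import itemgetter

records = [
    {"id": 345, "first_name": "John", "last_name": "Smith"},
    {"id": 212, "first_name": "Sam", "last_name": "Spade"},
    {"id": 322, "first_name": "Elvis", "last_name": "Jones"},
    {"id": 492, "first_name": "Sue", "last_name": "West"},
    {"id": 121, "first_name": "Mary", "last_name": "Forth"},
    {"id": 221, "first_name": "Bill", "last_name": "Black"},
]

# {id}: single-field key
by_id = sorted(records, key=itemgetter("id"))

# {last_name; first_name}: composite key, compared field by field
by_name = sorted(records, key=itemgetter("last_name", "first_name"))

# {id descending}: single-field key with a modifier
by_id_desc = sorted(records, key=itemgetter("id"), reverse=True)

print([r["id"] for r in by_id])
print([r["last_name"] for r in by_name])
print([r["id"] for r in by_id_desc])
```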

    The Sort Component

    Reads records from the input port, sorts them by key, and writes the result on the output port.

    Sorting (mp/figure-03.mp)

    Sorting - The Key Specifier Editor

    Exercise 3: Sorting

    Using example graph figure-03.mp, change the key parameter of the Sort component to sort the data by first_name.

    Run the application and examine the resulting data.

    More Complex Components

    In these components the record format metadata typically changes (goes through a transformation) from input to output.

    Data Transformation

    Input record: 0345,090263John,Smith;

    Output record: 1000345Smith 1963.09.02

    Operations shown in the figure: Drop, id+1000000, Reformat, Reorder

    Input record format:

    record
      decimal(",") id;
      date("MMDDYY") bday;
      string(",") first_name;
      string(";") last_name;
    end

    Output record format:

    record
      decimal(7) id;
      string(8) last_name;
      date("YYYY.MM.DD") bday;
    end
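
The slide's example can be traced step by step with a Python sketch (an emulation of the record formats and the transformation, not Ab Initio code). Two assumptions are made explicit: two-digit years are taken to be in the 1900s, and the string(8) last_name is padded with spaces, which the slide text does not show because whitespace was collapsed.

```python
def parse_input(line):
    """Parse one record in the delimited input format from the slide."""
    id_text, rest = line.split(",", 1)      # decimal(",") id
    bday, rest = rest[:6], rest[6:]         # date("MMDDYY") bday: fixed 6 characters
    first_name, rest = rest.split(",", 1)   # string(",") first_name
    last_name, _ = rest.split(";", 1)       # string(";") last_name
    return {"id": id_text, "bday": bday,
            "first_name": first_name, "last_name": last_name}


def reformat(rec):
    """Drop first_name, add 1000000 to id, and rewrite the date."""
    mm, dd, yy = rec["bday"][0:2], rec["bday"][2:4], rec["bday"][4:6]
    return {
        "id": int(rec["id"]) + 1000000,
        "last_name": rec["last_name"],
        "bday": f"19{yy}.{mm}.{dd}",  # assumption: all years are 19xx
    }


def format_output(out):
    """Write the fixed-width output: decimal(7), string(8), date."""
    return f"{out['id']:07d}{out['last_name']:<8}{out['bday']}"


line_out = format_output(reformat(parse_input("0345,090263John,Smith;")))
print(repr(line_out))
```

The field order in the output (id, last_name, bday) differs from the input order, which is the "Reorder" step in the figure.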

    Transformation Functions

    A transform function specifies the business rules used to create the output record.

    Each field of the output record must successfully be assigned a value. Partial output records are not allowed!

    The Transform Editor is used to create a transform function in a graphical manner.

    The Transform Function Editor

    Text DML: Transform Function Syntax

    Transform Functions look like:

    output-variables :: name ( input-variables ) =

    begin

    assignments;

    end;

    Assignments look like:

    output-variable.field :: expression;

    (See the Data Manipulation Language Reference for more information on transform functions.)

    The Transform Function in Text Format

    out :: reformat (in) =
    begin
      out.id :: in.id + 1000000;
      out.last_name :: string_concat("Mac", in.last_name);
    end;

    A Look Inside the Reformat Component

    Input record fields (in figure order): b, c, a. Output record fields (in figure order): x, z, y.

    out :: trans(in) =
    begin
      out.x :: in.b - 1;
      out.y :: in.a;
      out.z :: fn(in.c);
    end;

    1. A record arrives at the input port: 45 QF 9

    2. The transformation function is evaluated.

    3. Since every rule within the transform function is successful, a result record is issued: 44 RG 9

    4. The result record is written to the output port of the component.

    Exercise 4: Reformat Data

    Using graph figure-04.mp, change the record format metadata of the Simple-Out dataset to add a new field called name of type string(20).

    Add a business rule to the existing transform function to populate name by concatenating first_name and last_name using string_concat.

    Run the graph and examine the results.

    Then modify the transform to trim the spaces from the first name before concatenating with the last name, to get "John Smith" rather than "John  Smith".

    Data Aggregation

    Input:

    0345Smith Bristol 56
    0212Spade London 8
    0322Jones Compton 12
    0492West London 23
    0121Forth Bristol 7
    0221Black New York 42

    Output:

    Bristol 63
    Compton 12
    London 31
    New York 42

    Data Aggregation of Sorted/Grouped Input

    Input (grouped by city):

    0345Smith Bristol 56
    0121Forth Bristol 7
    0322Jones Compton 12
    0212Spade London 8
    0492West London 23
    0221Black New York 42

    Output:

    Bristol 63
    Compton 12
    London 31
    New York 42

    The Rollup Component (mp/figure-05.mp)

    By default, Rollup reads grouped (sorted) records from the input port, aggregates them as indicated by the key and transform parameters, and writes the resulting aggregate record on the out port.
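
The aggregation over grouped input can be sketched in Python (an emulation, not the Rollup component). It sums the amount per city using the sample records from the Data Aggregation slides; itertools.groupby, like the default Rollup behaviour described above, only combines records that are adjacent, so the input is sorted by the key first.

```python
from itertools import groupby
from operator import itemgetter

# (id, name, city, amount) from the Data Aggregation slides
records = [
    ("0345", "Smith", "Bristol", 56),
    ("0212", "Spade", "London", 8),
    ("0322", "Jones", "Compton", 12),
    ("0492", "West", "London", 23),
    ("0121", "Forth", "Bristol", 7),
    ("0221", "Black", "New York", 42),
]

city = itemgetter(2)


def rollup(grouped_records):
    """Emit one (city, total amount) record per run of equal keys."""
    return [(key, sum(rec[3] for rec in group))
            for key, group in groupby(grouped_records, key=city)]


totals = rollup(sorted(records, key=city))
print(totals)
```

If the input were not grouped, the same city could appear in more than one run and would produce more than one output record.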

    Built-in Functions for Rollup

    The following aggregation functions are predefined and are only available in the Rollup component:

    avg, count, first, last, max, min, product, sum

    Rollup Wizard

    Note the use of an aggregation function in the expression

    Exercise 6: Rollup Data

    Using example graph figure-05.mp, modify the transform function to count the number of records for the same city.

    Run the application and examine the results.

    Joining Data

    Input 0:

    0345Smith Bristol 56
    0212Spade London 8
    0322Jones Compton 12
    0492West London 23
    0121Forth Bristol 7
    0221Black New York 42

    Input 1:

    0322970402 1242.50
    0345970924 923.75
    0121961211 12392.00
    0492971123 234.12
    0666950616 2312.10

    Output:

    0345Bristol 561997/09/24
    0212London 81900/01/01
    0322Compton 121997/04/02
    0492London 231997/11/23
    0121Bristol 71996/12/11
    0221New York 421900/01/01

    Joining Sorted Data on the id field

    Input 0 (sorted by id):

    0121Forth Bristol 7
    0212Spade London 8
    0221Black New York 42
    0322Jones Compton 12
    0345Smith Bristol 56
    0492West London 23

    Input 1 (sorted by id):

    0121961211 12392.00
    0322970402 1242.50
    0345970924 923.75
    0492971123 234.12
    0666950616 2312.10

    Output:

    0121Bristol 71996/12/11
    0212London 81900/01/01
    ...

    Building the Output Record

    in0:

    record
      decimal(4) id;
      string(6) name;
      string(8) city;
      decimal(3) amount;
    end

    in1:

    record
      decimal(4) id;
      date("YYMMDD") dt;
      decimal(9.2) cost;
    end

    out:

    record
      decimal(4) id;
      string(8) city;
      decimal(3) amount;
      date("YYYY/MM/DD") dt;
    end

    What if the in1 record is missing?

    (The figure repeats the in0, in1, and out record formats from the previous slide, with in1.dt marked "???": when the in1 record is missing, there is no value to assign to out.dt.)

    Prioritized Assignment

    In DML, a missing value (say, if there is no in1 record) causes an assignment to fail.

    If an assignment for a left-hand side fails, the next-priority assignment is tried. There must be one successful assignment for each output field.

    out.dt :1: in1.dt;
    out.dt :2: "1900/01/01";

    In each rule the destination (out.dt) comes first, then the priority (:1:, :2:), then the source.
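
The fall-through behaviour of prioritized assignment can be emulated in Python as below. This is only a sketch of the idea: the helper name assign, the AssignmentFailed exception, and the use of None for a missing in1 record are all illustrative assumptions, not part of DML.

```python
class AssignmentFailed(Exception):
    """Raised when no rule for an output field succeeds."""


def assign(rules, in0, in1):
    """Try (priority, rule) pairs in priority order; return the first success."""
    for _priority, rule in sorted(rules, key=lambda pr: pr[0]):
        try:
            value = rule(in0, in1)
        except (KeyError, TypeError):
            continue  # missing record or field: this rule fails, try the next
        if value is not None:
            return value
    raise AssignmentFailed("no rule produced a value")


# out.dt :1: in1.dt;
# out.dt :2: "1900/01/01";
dt_rules = [
    (1, lambda in0, in1: in1["dt"]),
    (2, lambda in0, in1: "1900/01/01"),
]

with_match = assign(dt_rules, {"id": "0345"}, {"dt": "1997/09/24"})
without_match = assign(dt_rules, {"id": "0212"}, None)  # no in1 record
print(with_match, without_match)
```

If the priority 2 rule were removed, the second call would raise AssignmentFailed, mirroring the requirement that every output field must receive one successful assignment.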

    Assigning Priorities to Business Rules

    Resulting display when out.dt is selected

    The Join Component

    Join performs a join of inputs. By default, the inputs to join must be sorted and an inner join is computed.

    Note: The following slides and the on-line example assume the join-type parameter is set to Outer, and thus compute an outer join.
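
Why sorted inputs matter can be seen in a small merge-style join written in Python (an emulation, not the Join component). It reproduces the output of the Joining Data slides: every in0 record is kept, and the date defaults to 1900/01/01 when no in1 record has the same id. The sketch assumes unique ids per input and that all two-digit years are in the 1900s; unmatched in1 records (such as id 0666) are simply skipped here.

```python
DEFAULT_DT = "1900/01/01"

in0 = [  # sorted by id
    {"id": "0121", "city": "Bristol", "amount": 7},
    {"id": "0212", "city": "London", "amount": 8},
    {"id": "0221", "city": "New York", "amount": 42},
    {"id": "0322", "city": "Compton", "amount": 12},
    {"id": "0345", "city": "Bristol", "amount": 56},
    {"id": "0492", "city": "London", "amount": 23},
]
in1 = [  # sorted by id; dt is YYMMDD
    {"id": "0121", "dt": "961211"},
    {"id": "0322", "dt": "970402"},
    {"id": "0345", "dt": "970924"},
    {"id": "0492", "dt": "971123"},
    {"id": "0666", "dt": "950616"},
]


def to_long_date(yymmdd):
    # Assumption: every two-digit year belongs to the 1900s.
    return f"19{yymmdd[:2]}/{yymmdd[2:4]}/{yymmdd[4:]}"


def merge_join(left, right):
    """Join two id-sorted inputs in one pass, keeping every left record."""
    out, j = [], 0
    for rec0 in left:
        # Advance past right records whose key is smaller than the current key.
        while j < len(right) and right[j]["id"] < rec0["id"]:
            j += 1
        matched = j < len(right) and right[j]["id"] == rec0["id"]
        dt = to_long_date(right[j]["dt"]) if matched else DEFAULT_DT
        out.append((rec0["id"], rec0["city"], rec0["amount"], dt))
    return out


joined = merge_join(in0, in1)
for row in joined:
    print(row)
```

Because both inputs are sorted on the key, each is read only once and matching keys always meet at the same time.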

    Driving Key, max-core, Record - Required

    Joining (mp/figure-06.mp)

    A Look Inside the Join Component*

    *join-type = Full Outer join

    Fields in figure order: in0 has b, c, a; in1 has q, r, a; out has x, a, q. The inputs are aligned by key a.

    out :: join(in0, in1) =
    begin
      out.a :: in0.a;
      out.x :1: in1.r + 20;
      out.x :2: in0.b + 10;
      out.q :1: in1.q;
      out.q :2: "XX";
    end;

    First pair of records (in1: NY 4 G; in0: 234 42 G):

    1. Records arrive at the inputs of the Join.

    2. The input records are read into the Join component.

    3. The input key fields are compared.

    4. The aligned records are passed to the transformation function.

    5. A result record is emitted and written out, as long as all output fields have been successfully computed: 24 G NY

    Second pair of records (in1: IL 8 K; in0: 79 23 H):

    1. New records arrive at the inputs of the Join.

    2. Again, they are read into the Join component.

    3. The input key fields are compared.

    4. The aligned records are passed to the transformation function.

    5. The transformation engine evaluates based on the inputs.

    6. A result record is generated and written out, as all output fields are successfully computed: 89 H XX. The in1 record (IL 8 K) remains at the input.

    Exercise 7: Join Data

    Using example graph figure-06.mp, modify the transform function to join visits.dat and last-visits.dat so that no records are rejected.

    Run the application, and examine the results. The Unmatched Last Visits dataset should be empty.

    Exercise 8 (if time): Join Retaining All Fields

    Building upon the graph you created in Exercise 7, create a new output record format and transform function to join visits.dat and last-visits.dat according to the following rules:

    Retain all fields from each dataset.

    Supply defaults where necessary.

    Change the necessary parameters, run the application, and examine the results.

    Lookup Files

    DML provides a facility for looking up records in a dataset based on a key:

    lookup(file-name, key-expression)

    The data is read from a file into memory.

    The GDE provides a Lookup File component as a special dataset with no ports.

    Using lookup instead of Join

    Using Last-Visits as a lookup file

    Configuring a Lookup File

    1. Label used as name in lookup expression

    2. Browse for pathname

    3. Set record format

    4. Set the lookup key

    Using a lookup file in a Transform Function

    Input 0 record format:

    record
      decimal(4) id;
      string(6) name;
      string(8) city;
      decimal(3) amount;
    end

    Output record format:

    record
      decimal(4) id;
      string(8) city;
      decimal(3) amount;
      date("YYYY/MM/DD") dt;
    end

    Transform function:

    out :: lookup_info(in) =
    begin
      out.id :: in.id;
      out.city :: in.city;
      out.amount :: in.amount;
      out.dt :1: lookup("Last-Visits", in.id).dt;
      out.dt :2: "1900/01/01";
    end;
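
In general-purpose terms, a lookup file behaves much like an in-memory table keyed on the lookup key. The Python sketch below (an emulation, not DML) mirrors the transform above; the dictionary LAST_VISITS stands in for the Last-Visits lookup file, and returning None for a missing key is an assumption made for illustration.

```python
# Stand-in for the Last-Visits lookup file, loaded into memory and keyed on id.
LAST_VISITS = {
    "0121": {"dt": "1996/12/11"},
    "0322": {"dt": "1997/04/02"},
    "0345": {"dt": "1997/09/24"},
    "0492": {"dt": "1997/11/23"},
}


def lookup(table, key):
    """Return the record stored under key, or None when there is none."""
    return table.get(key)


def lookup_info(rec):
    """Build the output record, defaulting dt when the lookup finds nothing."""
    hit = lookup(LAST_VISITS, rec["id"])
    return {
        "id": rec["id"],
        "city": rec["city"],
        "amount": rec["amount"],
        # Priority 1: the looked-up date; priority 2: the default.
        "dt": hit["dt"] if hit is not None else "1900/01/01",
    }


found = lookup_info({"id": "0345", "name": "Smith", "city": "Bristol", "amount": 56})
missing = lookup_info({"id": "0212", "name": "Spade", "city": "London", "amount": 8})
print(found)
print(missing)
```

Unlike the merge-style join, this approach does not need the main input to be sorted, but the whole lookup table must fit in memory.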

    Exercise 9 (if time): Lookup

    Building upon the graph you created in Exercise 8, convert it to use a lookup file.

    Change the necessary parameters, run the application, and examine the results.

    The GDE Debugger

    The GDE has a built-in debugger.

    To enable the Debugger, select Debugger: Enable Debugger.

    The Debugger Toolbar buttons: Enable Debugger, Add Watcher File, Isolate Components, Remove All Watchers

    To add a Watcher File, select a flow and click Add Watcher File.

    To remove a Watcher File, click Remove All Watchers.

    To isolate a set of components, select the components to be isolated and click Isolate Components.

    Watcher Files will automatically be placed into the graph by the Debugger.

    Note that if the Watcher Files do not exist, the GDE builds them during the first run only and uses the existing Watchers on successive runs.

    Q & A

    Any Questions?

    Capgemini

    Worldwide Headquarters: 6400 Shafer Court, Rosemont, Illinois USA 60018

    Tel. 847.384.6100 Fax 847.384.0500 www.capgemini.com