Pervasive ETL Fundamental Exercises

Pervasive Integration Platform Fundamental End User Training

©2010 Pervasive Software Inc. All rights reserved. Design by Pervasive.

Pervasive is a registered trademark, and "Integrating the Interconnected World" is a trademark of

Pervasive Software Inc. Cosmos, Integration Architect, Process Designer, Map Designer, Structured

Schema Designer, Extract Schema Designer, Document Schema Designer, Content Extractor, CXL,

Pervasive Integration Engine, DJIS, Data Junction Integration Suite, Data

Junction Integration Engine, XML Junction, HIPAA Junction, and Integration Engineering are

trademarks of Pervasive Software Inc.

All names of databases, formats and corporations are trademarks or registered trademarks of their

respective companies.

This exercise scenario workbook is written for Pervasive’s Integration Platform software, version

9.x. (Dewberry)


Table of Contents

Foreword
The Pervasive Integration Platform
Architectural Overview of the Integration Platform
Design Tools
MetaData Tools
Production Tools
Inside a Simple Integration
Course Setup Instructions
Workspaces and Repositories
Workspaces and Repositories Defined
Repository Explorer
Repository Explorer - Defined
Splash Screen – Licensing and Version Information
Map Designer – Fundamentals of Transformation
Map Designer – The Foundation
Interface Familiarization
Basic Map
Connectors and Connections – Methods of Accessing Data
Factory Connections
Macro Definitions
User Defined Connections
Basic Transformation Features
Source Data Features – Sort
Source Data Features – Filter
Target Output Modes - Replace, Append, Clear and Append
Target Output Modes – Delete
Target Output Modes – Update
The Rapid Integration Flow Language (RIFL) Script Editor
RIFL Script - Functions
RIFL Script – Flow Control
Transformation Map Properties
Reject Connection Info
Event Handlers & Actions
Understanding Event Handlers
Source and Target Buffers – ClearMapPut Action
Event Sequence Issues
Using Action Parameters – Conditional Put
Using OnDataChange Events
Trapping Processing Errors with Events
Error and Exception Handling Review
Comprehensive Review
Metadata – Using the Schema Designers
Structured Schema Designer
No Metadata Available (ASCII Fixed)
External Metadata (Cobol Copybook)
Extract Schema Designer
Interface Fundamentals & CXL
Data Collection/Output Options
Extract Schema Designer: Extracting Variable Fixed Field Definitions
Process Designer for Data Integrator
Process Designer Fundamentals
Creating a Process
Parallel vs. Sequential Processing
Conditional Branching – The Step Result Wizard
FileList - Batch Processing Multiple Files
Pervasive Integration Engine
Syntax: Version Information
Options and Switches
Execute a Transformation
Using a "-Macro_File" Option
Executing a Process
Additional Sample Exercises – Integration Engine
Command Line Overrides – Source Connection
Ease of Use: Options File
Checklist – Integration Engine
Intermediate Mapping Techniques
Multiple Record Type Structures
Multiple Record Type – 1 One-to-Many
Multiple Record Type – 2 Many-to-One
User Defined Functions
Code Reuse – Save/Open a RIFL script Code Modules
Code Reuse - Code Modules
Lookup Wizards
Incore Table Lookup
Relational Database Management System (RDBMS) Mapping
Select Statements – SQL Passthrough
DJX in Select Statements – Dynamic Row sets
Multimode Introduction
Multimode – Data Normalization
Multimode Implementation with Upsert Action
Reference
Checklist – Starting Your Integration Project
Upgrading from 8.x to 9.x
Cosmos.ini Settings
Windows Default Installation Locations
Design Tool User Interfaces
Setting Properties
Reading a Log File
Examples of Complex Process Layouts
Additional Documentation Resources
Glossary
Appendix
Additional Exercises
Extract Schema Designer: Extracting Fixed Field Definitions
Integration Engine: Using the "-Set" Variable Option
Integration Engine: Scheduling Executions
Lookup Wizard: Flat File Lookup
Lookup Wizard: Dynamic SQL Lookup
RDBMS: Integration Querybuilder
Structured Schema Designer: Binary Data and Code Pages
Structured Schema Designer: Reuse Metadata (Reusing a Structured Schema)
Structured Schema Designer: Multiple Record Type Support in Structured Schema Designer
Structured Schema Designer: Conflict Resolution


Foreword

This course is designed to be presented in a classroom environment in which each student has access

to their own computer that has the Pervasive Integration Products installed as well as the

Fundamentals courseware. It could be used as a stand-alone tutorial course if the student is already

familiar with the interface of the Pervasive tools.

The Fundamentals course is not meant to be a comprehensive tutorial of all of our products. At the

end of this course it is our intention that a student will have a basic understanding of Map Designer,

Structured Schema Designer, Extract Schema Designer, Process Designer, and the Integration

Engine. The student should know how to use and how to expand their own knowledge of these

tools.

Further training can be obtained from Pervasive Training Services.

Any path mentioned in this document assumes a default installation of the Pervasive software and

the Fundamentals courseware. If the student installs differently, that will have to be taken into

account when doing exercises or following links.

We hope that the student enjoys this class and takes away everything needed. We welcome any

feedback.


The Pervasive Integration Platform

This section describes the integration stack from the user’s perspective.


Architectural Overview of the Integration Platform

This presentation depicts the architecture of the Integration Platform from the end-user’s

perspective. It briefly discusses all of the Integration tools and how they work together.

Integration General Overview.ppt


Design Tools

Data Integrator includes 6 tools used to create maps (transformations), schemas, profiles and

processes. Each of the tools is discussed below.

Map Designer

Map Designer is the heart of the integration product tool set. It transfers data among a wide variety

of data types. In Map Designer, to transfer data, the user designs and runs what is called a

“Transformation” or a “Map”. Each Transformation created contains all the information Map

Designer needs to transform data from an existing data file or table to a new Target data file or table,

including any modifications made on the data during the transformation.

Map Designer solves complex Transformation problems by allowing the user to:

transform data between applications

combine data from external Sources

change data types

add, delete, rearrange, split or concatenate fields

parse and select substrings; pad or truncate data fields

clean address fields and execute unlimited string and numerical manipulations

control log errors and events

define external table lookups

Map Designer creates two files (tf.xml and map.xml) that contain all the information necessary to

run a transformation. A transformation can be run from Map Designer, Process Designer or the

Integration Engine.

Map Designer is covered extensively in this course and is also explored in the Advanced and the

EDI/HIPAA courses.

Process Designer

Process Designer is a graphical data transformation management tool that can be used to arrange a

complete transformation project. Listed below are some of the Steps that a user can put into a

process:

Map Designer Transformation

SQL Command

Decision

RIFL Scripting

Command Line Application

SQL Server DTS Package

Sub-process

Validation


XSLT

Queue

Iterator

Aggregator

Invoker

Transformer

Once the user has organized these Steps in the order of execution, the entire workflow sequence can

be run as one unit. This workflow is saved as an .ip.xml file which can be run from the Process

Designer or from Integration Engine.

Process Designer processes can also be packaged using the Repository Manager. This packaging

gathers all of the files that are required by the process and puts them into a single DJAR file that can

then be run from the Integration Engine.

This courseware covers some basic functionality of the Process Designer. Both the Advanced and

the EDI/HIPAA courses cover the more advanced functionality of this tool.

Structured Schema Designer

The Structured Schema Designer provides a visual user interface for designing structural data files.

The resulting metadata is stored as Structured Schema files with an .ss.xml extension. The .ss.xml

files include schema, record recognition rules and record validation rule information.

The Data Parser is used to manually parse flat Binary, fixed-length ASCII, or record manager files.

The Data Parser defines Source record length, Source field sizes and data types, and Source data

properties. It also assigns Source field names, and defines Schemas with multiple record types.

Structured Schema Designer can be used to import and read schemas from outside sources such as

Cobol Copybooks, XML DTD’s, or Oracle DDL’s.

The ss.xml files that are created by Structured Schema Designer are used as input in Map Designer

as part of a source or target connection.

There are courseware and exercises on the Structured Schema Designer in this document.

Extract Schema Designer

The Extract Schema Designer is a parser tool that allows the user to visually select fields and records

from text files that are of an irregular format. Some examples are:

Printouts from programs captured as disk files

Reports of any size or dimension

ASCII or any type of EBCDIC text files

Spooled print files

Fixed length sequential files

Complex multi-line files

Downloaded text files (e.g., news retrieval, financial, real estate...)

HTML and other structured documents

Internet text downloads

E-mail header and body


On-line textual databases

CD-ROM textbases

Files with tagged data fields

Extract Schema Designer creates schemas that are stored as CXL files. These files are then used as

input in Map Designer as part of a source connection.

There are courseware and exercises on the Extract Schema Designer in this document.

Document Schema Designer

Document Schema Designer is a Java-based tool that allows you to build templates for E-document

files. You can custom-build schema subsets for specific EDI Trading Partner and TranType

scenarios. In addition, the Document Schema Designer is also very useful to those working with

HL7, HIPAA, SAP (IDoc), SWIFT and FIX data files.

You can develop schema files for all e-documents that are compatible with Map Designer. The

document schemas serve several useful purposes:

File Structure

Metadata Support

Parsing Capabilities

Validation Support

In an easy-to-use GUI interface, the user selects desired segments from the "template" document

schemas that are generated from the controlling standards documentation. The segments are saved in

a schema file that can be edited. The user may also add segments from a "master" segment library,

add loops/segments/composites/elements by hand, add discrimination rules for distinguishing

loops/segments of the same type at the same level, and use code tables for data validation.

The user can copy, paste and delete any part of the structure, including the segments, elements,

composites, loops, and fields (and their subordinate loops/segments/subcomponents).

The Document Schema Designer produces DS.XML document schema files that can be used as

input in Map Designer as part of a source or target connection. These files can also be used in a

Process as part of a Validation step.

This document does not have exercises or courseware on Document Schema Designer, though there

is a one-day course available from Pervasive Training Services.

Join Designer

Join Designer is an application that allows the user to join two or more single-record type data

sources prior to running a Map Designer Transformation on them. These sources do not have to be

of the same type. For example, an SQL database table could be joined with a simple ASCII text file.

The user first uses Source View Designer to create Source View Files that hold metadata about the

Sources. From these a Join View File is created, which contains the metadata needed by Map

Designer to treat the Source files as if they were a single Source. The user then supplies this Join


View File to Map Designer using "Join Engine" as the connection type. The original Source files and

the Source View Files must still be available in the locations specified in the Join View File.

When a join is saved, a Join View File (.join.xml) is created. This can be supplied to Map Designer

as a Source file or used to create further joins.

While a join is limited to two Source files, you can use another join as a Source, thus building nested

joins to any level of complexity.

This document does not have exercises or courseware on Join Designer. There are exercises in the

Advanced course available from Pervasive Training Services.


MetaData Tools

The design tools create artifacts that are XML files (except Extract Schema Designer). The Metadata

tools organize these files during development, and manipulate these files for use in production.

Repository Explorer

The Repository Explorer is the central location from which the user can launch all of the Designers,

including the Map Designer, Process Designer, Join Designer, Extract Schema Designer, Structured

Schema Designer, Source View Designer and Document Schema Designer.

The User can also open any Repository that has been created, and then open Transformations,

Processes or Schema files in that Repository list.

The Repository Explorer can also access the version control functionality of CVS or Microsoft Visual SourceSafe, and can check files in and out of repositories using commands within the Explorer itself.

There is courseware about the Repository Explorer in this document.

Repository Manager

Repository Manager is designed to facilitate the tasks of managing large numbers of Pervasive

design documents, contained in multiple repositories in multiple workspaces.

Repository Manager provides a single application to directly access any number of Pervasive design

documents, view their contents, make simple updates, bundle them into a package, and generate

reports.

The features of Repository Manager include:

Open and work with any number of defined Workspaces.

Browse the hierarchy of Workspaces, Repositories, Collections, and Documents.

Search for documents based on text strings, regular expressions, date ranges, Document

Types, and document-specific fields.

Make minor updates to documents.

Generate an impact analysis of proposed document modifications.

Import and export Documents and Collections.

Package Processes and related documents into a single entity (DJAR) that can be more

easily managed and transported.

View and print documents and Reports.

This document does not have exercises or courseware on Repository Manager, though there is an

exercise in the Advanced course available from Pervasive Training Services.


Production Tools

These are the tools that allow the user to automate their Transformations and Processes in their

production environment.

Integration Engine

Integration Engine is an embedded data Transformation engine used to deploy runtime data

replication, migration and Transformation jobs on Windows or Unix-based systems. Because

Integration Engine is a pure execution engine with no user interface components, it can perform

automatic, runtime data transformations quickly and easily, making it ideal for environments where

regular data transformations need to be scheduled and launched. Integration Engine supports the

following operating systems:

Windows 2000, Windows XP, Windows Server 2003, HPUX, Sun Solaris, IBM AIX, and Linux.

The Integration Engine has the capability to work with multiple threads if a multi-threaded license is

purchased.

There is courseware about the Integration Engine in this document.
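As a preview of the Integration Engine exercises later in this workbook, a command-line run points the engine at a design file and, optionally, a macro definition file. The line below is only a sketch: the executable name djengine reflects the product's Data Junction lineage, the -Macro_File switch is the option covered in the "Using a '-Macro_File' Option" exercise, and the file paths are purely illustrative. See the Options and Switches exercise for the authoritative syntax on your install.

    djengine -Macro_File "C:\Cosmos9_Work\macrodef.xml" "C:\Cosmos9_Work\Fundamentals\Development\MyMap.tf.xml"

A process is executed the same way by supplying its .ip.xml file (or a packaged DJAR) instead of a transformation file.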

Integration Server

Integration Server is actually an SDK that is installed by default when the integration platform is

installed. The core components of the Integration Server SDK are the Engine Controller, Engine

Instances (Managed Pool), and the Client API that accesses the Engine Controller through a proxy.

Server stability is maintained, scalability enhanced, and resources are spared through the use of a

control-managed pool of EngineExe objects. This allows the Integration Engine to be called as a

service.

This document does not have exercises or courseware on the Integration Server, though there is a

one-day course available from Pervasive Training Services that covers the Integration Server and the

Integration Manager.

Integration Manager

Through a browser-based interface, Integration Manager performs deployment, scheduling, on-going

monitoring, and real-time reporting on individual or groups of distributed Integration Engines. Since

all management is performed from a single administration point, Integration Manager improves

operational efficiency in the management of geographically distributed Integration Engines. With the

ability to remotely administer any number of integration points throughout the organization,

customers can build out their integration infrastructure as required, using a flexible and scalable

architecture designed for easy manageability. In other words, the Integration Manager allows the

user to schedule and deploy multiple packages (DJAR) amongst multiple Integration Servers across

an enterprise.


This document does not have exercises or courseware on the Integration Manager, though there is a

one-day course available from Pervasive Training Services that covers the Integration Server and the

Integration Manager.


Inside a Simple Integration


Course Setup Instructions

Installing the Software

When installing on a Windows system you may be required to log on as a local administrator for the

installation to succeed. Exit all programs before running the setup. Run the setup, and follow the

wizard instructions. Select one of the 2 following options for installation.

1. Design Studio – Installs the Designers, utilities, and the Integration Engine.

2. Integration Engine – Installs the Integration Engine and its utilities.

For the purposes of this course we will install the Design Studio. If you are taking this class on site with Pervasive, the software has already been installed on the training machines.

Launch the software by clicking on the Repository Explorer 9 icon on the desktop. You will be

prompted to load a valid license.

For more information on Windows Default Installation Locations see the Reference Section.

Licensing

A temporary license file will be provided to you by the training services manager. This temporary

license will allow you to utilize all of the capabilities of the Integration Platform for at least two

weeks. If you are receiving training on site with Pervasive software, the license may appear on your

desktop. The license file will have a “.slc” extension.

You may store your license in any directory that you wish. The default location for storing a license on a Windows machine is C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\License, and you can simply store the license in this directory. After you have determined where the license will reside, double-click the Repository Explorer 9 icon on your desktop to launch the software. Choose the option to Browse to a valid license file on disc, and browse to the location where you have stored the license.

Setup Course Directory Structure

Create a folder directly on the C Drive and name it Cosmos9_Work. We will use this directory to store all of the course materials. You should possess a zip file named Fundamentals9.zip. This zip file should be provided for you via download by the training services manager. If you are taking this class on site with Pervasive, the zip should be on your desktop. Unzip the contents of Fundamentals9.zip into the Cosmos9_Work directory that was just created. The resulting directory structure should be C:\Cosmos9_Work\Fundamentals. See the image below.


Configuring Database Connectivity for Hands-on Exercises

The exercises in the Solutions folder of the Fundamentals training bundle are built using an ODBC

connection. To set up, the student must establish an ODBC connection called “TrainingDB” to their

preferred database. Any relational non-production database can be used in the classroom. Be aware

that the login used for the connection must have sufficient permissions for creating and deleting

tables in the database.

There is an Access database provided in the training bundle. The Access database requires no

additional software. Follow the steps below to create the ODBC connection to the Access Database

in the training bundle.

While ODBC allows us to use a more flexible middleware connection to databases, please be

aware that you will generally have better performance and more functionality if you use the native

client interfaces instead of an ODBC driver.

1. From the Start Menu choose Programs › Administrative Tools › Data Sources (ODBC),

or from the Start Menu choose Control Panel › Administrative Tools › Data Sources

(ODBC).

2. In the ODBC Data Source Administrator create a new User DSN by clicking Add. See

image below.


3. Choose the Microsoft Access Driver (*.mdb).

4. Set the Data Source name to “TrainingDB”, and click the Select button.

5. Browse to the folder C:\Cosmos9_Work\Fundamentals\Data and select the TrainingDB.mdb

database.

6. Click OK.


Workspaces and Repositories


Workspaces and Repositories Defined

Workspaces

A workspace is a directory location on your system that allows you to organize your integration

designs. All workspaces will reside inside of a common Workspace Root Directory. Your default

workspace root directory is C:\Documents and Settings\username\Cosmos9_Work. Your default

workspace location is C:\Documents and Settings\username\Cosmos9_Work\Workspace1. Every

workspace must have at least one repository.

Repositories

Repositories are used to store the maps, processes, and schemas that make up your integration

designs. The repository is typically a folder in a workspace directory; however the repository folder

does not have to physically reside within the workspace folder to belong to the workspace. You may

have many repositories within a workspace. You are required to have at least one. A default

repository is created within the default workspace. There is more information about repositories

contained in the next section. Your default repository location is

C:\Documents and Settings\username\Cosmos9_Work\Workspace1\xmldb.


Repository Explorer


Repository Explorer - Defined

Repository Explorer is at the heart of the integration product design environment. In this central

location, you can launch all of the Design Tools. These tools include Map Designer, Process

Designer, Join Designer, Extract Schema Designer, Structured Schema Designer, Source View

Designer and Document Schema Designer.

Repository Explorer allows you to create and explore multiple Repositories for any given

Workspace. This functionality allows you to separate your metadata according to your specific

project specifications. The following are two scenarios that you might choose.

1. Create separate repositories for the Development, QA, and Production phases of your

project, and promote your specification files from one repository to the next as you advance

through each of these phases. You should choose this scenario when projects are defined to

belong to separate Workspaces. Projects should belong to separate workspaces when the

data that is being transformed for projects does not access the same input and output

directories, or the same databases.

2. Create separate repositories representing different projects altogether. You should choose this scenario when projects are designed to belong to the same Workspace. These projects should have common threads, i.e., the same input/output directories, or they access the same databases.

Change the Current Workspace Root Directory

Select File › Manage Workspaces (Ctrl+Alt W). Change the Workspaces Root Directory to the

Cosmos9_Work folder that was created on your C Drive. This will allow you to use a list of

Repositories and Macro definitions specific to your current Workspace.

Modify the Default Repository in the Current Workspace

Click on the Repositories button in the bottom right-hand corner of the Workspaces dialog box.

When you change the Root Directory, a default Workspace and Repository will be created. We are

going to modify the default for use during training. Change the name “xmldb” to “Fundamentals”

and navigate to the folder C:\Cosmos9_Work\Fundamentals by clicking the Find button.

We will use this Repository to store all of the XML schema and metadata for the training exercises.


Splash Screen – Licensing and Version Information

Description

Splash Screen - Shows the Splash Screen for Repository Explorer.

Credits - Gives a list of credits for third party software components used by the Product.

Version - Displays the following sections:

o License Name: Displays the PATH to the Product License file and the License file

name.

o Serial Number: Displays the Product serial number.

o Version: Displays the Product build version number.

o Subscription Ends: Displays the date the license file will expire.

o Users: Displays the number of users licensed for the Product.

o Single User License For:

Name: Name of the person licensed for the Product.

Company: Name of the company licensed for the Product.

Licensed Features - Displays all of the Connectors, Features and Products that are licensed

in the Product.

Support - Displays the Technical Support address, phone/fax number, and web address.


Map Designer – Fundamentals of Transformation


Map Designer – The Foundation

The Map Designer delivers the ease of an intuitive GUI for visually and directly mapping Source

data to Target structures while allowing the user to manipulate the data in virtually limitless ways.

The Map Designer tool enables the user to create the specifications for a transformation. A

transformation reads one or more source files record by record, applies to each record whatever

calculations, filters, checks, etc., are defined and then may write one or more records to one or more

target files.

The user employs a three-tab, graphical interface to describe the source(s), target(s) and processing

logic. Source “connectors” describe the source file(s) and target “connectors” describe the target

file(s).


Interface Familiarization

Objectives

The Map Designer icons offer you shortcuts when you are creating, modifying, and viewing maps.

Here is information pulled from the Help File about the icons and their descriptions.

Descriptions


Basic Map

Objectives

At the end of this lesson you should understand the Source and Target tabs and be able to use the

new Simple Map view to create a Transformation.

Keywords: Drag and Drop Mapping

Description

In this exercise we will follow the basic steps in the flow chart below and create a simple map.

Exercise

Define the Source:

1. Open Map Designer.

2. There are 3 tabs. The first tab is selected for defining the source.

3. Locate the textbox labeled Source Connection, click the down arrow. This will open the

Select Connection Dialog box pictured below. Notice there are three additional tabs.


Note: The first time you open this dialog, it will open on the Factory Connections tab. Afterwards it

will default to the Most Recently Used tab. We will discuss the User Defined Connection tab

in a future exercise.

4. Choose the ASCII (Delimited) connector and click OK.

5. Next to the textbox labeled Source File/URI, click the down arrow to select a file.

Browse to the Accounts.txt file in the C:\Cosmos9_Work\Fundamentals\Data folder.

6. In the ASCII (Delimited) Properties box on the right side of the Source tab, find the

Header property and set it to True. Then click the Apply button under the Properties

list.

Note: Any time you make a change in the source or target properties, you will have to

click Apply to save the changes.

7. Use the toolbar Icon to open the Source Data Browser. If you see data records, then you

have connected to the source. Close the Browser.

Define the Target:

8. Click on the Target Connection tab.

9. In steps 3-6 above, we chose a source connection. Create a Target connection similar to the way we created a Source Connection. This time choose ASCII (Fixed) as the connector type.

10. In the “Target File/URI” drop down browse to the

C:\Cosmos9_Work\Fundamentals\Data folder.

11. Type Accounts_Fixed.txt as the file name, and click Open.

Note: This file does not exist and will be created when we run the transformation.

Map the Fields:


12. Click on the Map Tab (Yellow Tab).

13. If you see two quadrants on this page, then you are set to the Map Fields view and you

will need to follow the next steps. If not, you can skip to step 16.

14. From the Menu click View › Preferences. Click the General tab. Check Always show

Map All view.

We will be working in the Map All view for the remainder of the course.

15. To return to the Simple Map View, simply click on the Simple Map View icon in the

toolbar.

16. To map the fields, drag the asterisk from the box labeled All Fields in the source, and

drop it under the Target field name header.

17. Notice that the target has been filled out with field names identical to the source, and

that the Target Field Expressions are filled out as well. Validate the Transformation

using the check mark icon on the toolbar.

18. If the map is valid click OK.

19. Save the Map as m_BasicMap.map.xml in the

C:\Cosmos9_Work\Fundamentals\Development folder.

20. Click the Run Map Icon to run the transformation.

21. Click the Target Data Browser and note your results.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: C:\Cosmos9_Work\Fundamentals\Data\Accounts.txt


Source Options: Header = True

Define the Target:

Target Connector: ASCII(Fixed)

Target Data: C:\Cosmos9_Work\Fundamentals\Data\Accounts_Fixed.txt

Target OutputMode: Replace

Target Field Expressions

R1.Account Number Records("R1").Fields("Account Number")

R1.Name Records("R1").Fields("Name")

R1.Company Records("R1").Fields("Company")

R1.Street Records("R1").Fields("Street")

R1.City Records("R1").Fields("City")

R1.State Records("R1").Fields("State")

R1.Zip Records("R1").Fields("Zip")

R1.Email Records("R1").Fields("Email")

R1.Birth Date Records("R1").Fields("Birth Date")

R1.Favorites Records("R1").Fields("Favorites")

R1.Standard Payment Records("R1").Fields("Standard Payment")

R1.Payments Records("R1").Fields("Payments")

R1.Balance Records("R1").Fields("Balance")

Define Events: Source R1 Events

Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1
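The Target Field Expressions above simply move each source field to its target field unchanged. An expression can also manipulate the value on the way through; as a preview of the RIFL Script lessons later in this course, a hypothetical variation on the State mapping might look like the following (Trim and UCase are standard VB-style RIFL string functions, but verify them against the RIFL reference for your version):

    UCase(Trim(Records("R1").Fields("State")))

This would write the State value trimmed of stray spaces and forced to upper case.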


Connectors and Connections – Methods of Accessing Data

The connectors are what the Integration Platform uses to read and write data in Map Designer and

the other design tools. They are an integral part of the software in that all of the low-level, complex

data access programming has been abstracted to a simple form for the user to complete by using

drop-down menus and pick lists.


Factory Connections

Objectives:

At the end of this lesson you will be able to find and use the appropriate data access Connector.

Keywords: Connectors List, Connection Menu, and Source

Connection tab

Description

Factory Connections contains a list of all of the Connectors available to you in Map Designer. Type

the first letter of a Connector name to jump to that Connector in the list (or the first one in the list

with that letter). For instance, suppose you want to choose Btrieve v7. Type "B", and BAF will appear. From

there, you can scroll down to Btrieve v7 and select it.

The Map Designer Connector Toolbar offers you shortcuts to this dialog.

Here are the icons and their descriptions:

New - Allows you to clear the Source tab and define a new source connection.

Open Source Connection – Allows you to open the Select Connection dialog to access the:


o Most Recently Used Tab

o Factory Connections Tab

o User Defined Connections Tab

Save Source Connection - Allows you to save the selected connector type, and any properties that you have defined for a source, as an .sc.xml file. The advantage of doing this is that you can reuse the Connection in any subsequent Map design in the future. This saved connection will become a User Defined connection. We will discuss User Defined Connections in the next topic.

Source Connector Properties - opens the Source Properties dialog box. These are the same

properties available via the Source Connection tab, and are dependent upon the Connector to which

you are connected. This icon will be active only when you are on the Map tab.


Macro Definitions

Objectives

At the end of this lesson you will be able to define and use Macros in connection strings.

Keywords: Macros, Macro Definition File, Workspace

Description

Macros are symbolic names assigned to text strings, and are usually used to represent file paths.

You should use macros as a tool to aid in the movement of integration files from one life cycle

environment to the next.

A macro definition file is an XML file that contains name value pairs. This file is named

macrodef.xml and resides in your Workspace directory. Each workspace will only read one

macrodef file. Therefore, the scope of macros contained in a single macrodef file is across a

workspace.

These macro names can be used throughout a map or process to provide connection information. For

example, a macro name can be substituted in the following connector options:

Server Name or IP Address

Database name

UserID

Password

File or table connection paths

We will create a new macro that we can use to represent the Data sub-directory for our Training

Repository. This will allow us to port the schema files more readily from one workstation to another

or deploy to servers for execution by Integration Engine.

Exercise

1. Select the menu item Tools › Define Macros. Notice there is already a macro that is set

to the default location of the current Workspace.

2. Click New.

3. Enter a Macro Name value as FUN_DATA.

4. Click the Macro Value drop-down button and navigate to our workspace and highlight

the C:\Cosmos9_Work\Fundamentals\Data folder, and click OK.

5. Add a back slash “\” to the end of the macro value.

6. Enter a description if you wish and click OK.

7. On the source connection tab, highlight the portion of the connection string you wish to

replace (e.g., C:\Cosmos9_Work\Fundamentals\Data\).

8. From the menu bar, select Tools › Paste Macro String.

9. Click on the row of the Macro you want to use (e.g., FUN_DATA).

Map Designer uses the syntax $(FUN_DATA) to represent the entire path to the Data

folder.
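For reference, the macrodef.xml produced by these steps is a small XML file of name/value pairs stored in your Workspace directory. The sketch below is purely illustrative — the element and attribute names are assumptions, so open the macrodef.xml on your own machine to see the exact structure your version writes:

    <!-- Hypothetical structure; actual element names may differ -->
    <macrodefinitions>
      <macro name="FUN_DATA" value="C:\Cosmos9_Work\Fundamentals\Data\" description="Fundamentals course data folder"/>
    </macrodefinitions>

Because every design in the workspace reads this one file, changing the FUN_DATA value here repoints all maps that use $(FUN_DATA) without editing them individually.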


Root Macro

If you will be selecting files from the same directory often you can set the “Root Macro” for

automatic substitution.

Click Tools › Define Macros. Highlight the Macro you want to use as the root directory and click

the Set as Root button.


Then set the automatic substitution switch in Map Designer › View › Preferences › Directory

Paths: Choose Substitute root MACRO. This step is optional, and is a design preference.


User Defined Connections

Objectives

At the end of this lesson you will be able to define and reuse a User Defined Connection.

Description

User Defined Connections are created by saving a Source or Target Connection along with any

property options. The Connections are saved as either a “.sc.xml” (Source) or “.tc.xml” (Target) file

in your Workspace/connections directory. User Defined Connections are reusable. You can create

as many as you would like.

Exercise

1. Reopen the Transformation built previously named m_BasicMap.map.xml and view the

Source Connection tab.

2. Use the Macro created in the last exercise for the path to the Accounts.txt file on the Source

Tab.

3. Using the Connector Toolbar to the right side of the Connection field, click the Save icon.

4. Save the source connection as Accounts_Delimited.sc.xml.

5. Close the current Map and open a new map design.

6. Select the Source Connection dropdown and click the User Defined Connections tab. Click

on the Connections folder and select the Accounts_Delimited.sc.xml connection.


Basic Transformation Features

This section describes certain features for manipulating data that are built into the Map Designer

such as sorting, filtering, updating or deleting data.


Source Data Features – Sort

Objectives

At the end of this lesson you should be able to apply a sorting function to your source data.

Keywords: Source Key and Sorting Dialog

Description

We view sorting in our transformations from two angles. First, it is often necessary that the target

file be in a certain order. While this doesn’t usually matter in database targets, it can be essential

when other file structures are being produced. Secondly, and perhaps more importantly,

transformations may be designed much more efficiently if we can rely on the source file being in a

certain sequence. Assume that you have a code of some sort in each source record and that you must

do a lookup or some complicated processing using that code. If the source file were in code

sequence, we could perform this logic only once for each code, and then save and use the results

until a new code was encountered.

At the outset, we realize that it is not possible to sort the target file itself. Transformations write

target records one at a time. However, it is quite possible to sort the source file before it is processed.

Doing so will achieve either of the requirements for sorting mentioned above. If the source file is

already in the sequence needed for the target, then writing the target one record at a time is no

problem. Also, we could sort the source file into a sequence that would enable us to minimize

processing time.

To sort the source file before processing, we simply use the Source Keys and Sorting dialog. In this

dialog we can specify the field(s) by which we want to sort the source file before processing. We can

even sort on a constructed or calculated value. We should realize, however, that when we use the

Source Data Browser to view the file, we will not see it in its sorted order, since the sorting is

performed once the transformation begins and is done dynamically, in memory. The original file is

not changed.

Sorting has its own overhead. Extremely large files can take a long time to sort. If this time becomes

a factor, then other strategies may need to be employed. But the benefits gained from having the

source file in sequence can be even greater. We will learn in later lessons how sorted source files are

a requirement for "on data change" processing, a processing strategy that can dramatically reduce

the execution time of a transformation.
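The Build option mentioned in step 4 of the exercise below accepts a RIFL expression as the key. A hypothetical composite key that sorts by State and then City could be written as follows (assuming RIFL's VB-style & string concatenation operator; adjust field names to your own source):

    Records("R1").Fields("State") & Records("R1").Fields("City")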

Exercise

1. Connect to the ASCII Delimited file, Accounts.txt as your source. Hint: Use the User

Defined Connection created in class in an earlier exercise.

2. Click the Source Keys and Sorting icon in the toolbar.

3. On the Sort Options Tab, click in the Key Expression box to see the down arrow.

Click on the down arrow.

4. Choose the State Field to use as a key. Note: You can choose Build if you want to build

a key using an expression to parse out or concatenate parts of different fields. Also, the

sort will default to ascending order. If you would prefer to sort in descending order,

select "Descending" from the dropdown list.


5. Create a target connection to an ASCII Delimited file called

AccountsSortedbyState.txt. This file doesn’t yet exist, so you’ll have to type in the file

name.

6. Set the header to true and click the Apply button.

7. Go to the Map Step by clicking on the Map All Tab.

8. Validate the Map.

Note: You may see a dialog box that looks like this. We will go into greater detail on the

“Default Event Handler” and Event Handlers in general later in this courseware.

9. Click OK to accept the Default Event Handler.

10. Save this Map as m_SourceDataFeatures_Sort.map.xml in the Development folder.

11. Run the Map.

12. Notice the results in the status bar.

13. Open the Target Data Browser and notice that the records are sorted by state.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Automatic Transformation Feature:

Sort Fields on Source Data: Fields("State")

type=Text

ascending=yes

length=2


Define the Target:

Target Connector: ASCII(Delimited)

Target Data: $(FUN_DATA)AccountsSortedByState.txt

Target Options: Header = True

Target OutputMode: Replace

Target Field Expressions

R1.Account Number Records("R1").Fields("Account Number")

R1.Name Records("R1").Fields("Name")

R1.Company Records("R1").Fields("Company")

R1.Street Records("R1").Fields("Street")

R1.City Records("R1").Fields("City")

R1.State Records("R1").Fields("State")

R1.Zip Records("R1").Fields("Zip")

R1.Email Records("R1").Fields("Email")

R1.Birth Date Records("R1").Fields("Birth Date")

R1.Favorites Records("R1").Fields("Favorites")

R1.Standard Payment Records("R1").Fields("Standard Payment")

R1.Payments Records("R1").Fields("Payments")

R1.Balance Records("R1").Fields("Balance")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Source Data Features – Filter

Objectives

At the end of this lesson you should be able to apply simple filters to your source data.

Keywords: Source Filter Window, Sample Size, and Target Filter

Window

Description

There are two ways to restrict the target file to contain only certain source records. The most flexible

way is to supply processing logic in the body of your transformation. Using this approach allows you

to implement any desired business rules for filtering data. For example, you might want to exclude

from the Target file any account records with invalid Zip codes.

The second way to restrict the number of Source records that are placed in the Target file is to use a

filter. You can do almost anything in a filter that you can do in processing logic; however, filters are usually easier to establish, change, and remove.

For example, you may be testing a new Transformation against a file with more than a million

records. You have a complex calculation that needs to work properly, but you won’t be able to tell if

it is working until you look at the very first record in the Target file. With a filter, you do not have to process all the records just to see the result for that first record.

Filters are available for this type of situation. A source or target filter is a simple criterion that

determines whether a source record is to be processed or if a target record is to be written. The user

has the option of using one of four methods to test each source or target record to see if it should be

processed or written. You may (1) process/write only the first N records, (2) process/write all

records from record number X to record number Y, (3) process/write every Nth record or (4) supply

an expression which, if evaluated to “True”, causes the record to be processed/written. All of these

options are controlled through the Source Filters and Target Filters dialogs.

The user can use either type of filter or even both types at once. Using both types in the same

transformation, however, requires some thought. If your objective is to obtain a target file with 100

records, you can use either a source or target filter. You will get the result you want, but only if you

do not bypass any records in your own processing logic. As another example, if you filter a 5000-

record source file to process only the first 1000 records, and then also supply a target filter to write

every 10th record to the target, you will only get 100 target records, not 500. The target filter will be

applied to those source records that make it through the source filter.

As in sorting, filtering is performed dynamically when the transformation runs. Therefore source

filter results are not shown when the Source Data Browser is used.
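As an illustration, the invalid-Zip example mentioned earlier could also be handled with a source filter expression instead of processing logic. A minimal sketch, assuming the R1 layout from these exercises and treating a “valid” Zip simply as one that is five characters long (the Len and Trim functions are covered in the RIFL lessons later in this courseware):

    ' process only records whose Zip field contains exactly five characters
    Len(Trim(Records("R1").Fields("Zip"))) == 5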

Exercise

1. Connect to the ASCII Delimited file, Accounts.txt as your source.

2. Click the Source Filters icon in the toolbar.

Note the radio buttons at the bottom of the window under “Define Source Sample”. We can choose a range of records, or we can choose to process every Nth record from the source. (The behavior is that you always get the first record, then every Nth record after it: 1, N+1, 2N+1, 3N+1, and so on.)


3. In this exercise we will keep only the Account records from the state of Texas. We will use

the Source Record Filtering Expressions box. This allows us to use the RIFL Scripting

Language (see The RIFL Script Editor section) to write an expression that will evaluate

to True or False. We will process the records that cause the expression to evaluate to

True. The expression to use is: Records("R1").Fields("State") == "TX" .

4. Create a target connection to an ASCII Delimited file called AccountsinTX.txt. This

file does not exist, so type in the file name.

5. Set the header property to true and click the Apply button.

6. Go to the Map Step. Drag all Source Fields to the Target.

7. Validate the Map.

Note: You may see the dialog box pictured below. We will go into greater detail about

the Default Event Handler and Event Handlers in general later in this course.

8. Click “OK” to accept the Default Event Handler.

9. Save this Map as m_SourceDataFeatures_Filter.map.xml in the Development folder.

10. Run the Map, and notice the Results Status Bar.

11. Open the Target Data Browser and notice that the target data set only contains records

from Texas.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Automatic Transformation Feature:

Filter Expression: Records("R1").Fields("State") == "TX"


Define the Target:

Target Connector: ASCII(Delimited)

Target Data: $(FUN_DATA)AccountsinTX.txt

Target Options: Header = True

Target OutputMode: Replace

Target Field Expressions

R1.Account Number Records("R1").Fields("Account Number")

R1.Name Records("R1").Fields("Name")

R1.Company Records("R1").Fields("Company")

R1.Street Records("R1").Fields("Street")

R1.City Records("R1").Fields("City")

R1.State Records("R1").Fields("State")

R1.Zip Records("R1").Fields("Zip")

R1.Email Records("R1").Fields("Email")

R1.Birth Date Records("R1").Fields("Birth Date")

R1.Favorites Records("R1").Fields("Favorites")

R1.Standard Payment Records("R1").Fields("Standard Payment")

R1.Payments Records("R1").Fields("Payments")

R1.Balance Records("R1").Fields("Balance")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Target Output Modes - Replace, Append, Clear and Append

Objectives

At the end of this lesson you should be able to understand and implement each of the target output

modes: Replace, Append, and Clear and Append.

Keywords: Output Mode, Replace Mode, Append Mode, Clear and

Append Mode, and Schema Mismatch

Description

The target output mode Replace is used in two situations. In the first, the file or table does not yet

exist, and in this case Map Designer creates it using the layout you have specified on the Map tab. In

the second situation, where the file or table already exists, the replace mode deletes the file (or drops

the table) first, and then recreates it using the layout you have specified on the Map tab.

The target output mode Append adds additional rows to a target file or table that already exists. If

you are working with flat files as your targets, then the only available output modes are Replace and

Append.

The output modes are different when a database is the target. Database tables can have indexes and

constraints built into them and there is a critical difference between Replace and Clear and Append.

That difference is that Replace mode effectively drops the table and then recreates it, whereas the

Clear and Append mode truncates the table only. When you drop a table, you also drop any indexes

or constraints that the table might have, while truncation preserves them. You can use Clear and

Append as an output mode even if the table does not exist, and the table will be created

automatically.

Usually when mapping to a database, one will choose an existing table from a dropdown of the

tables that the database contains. As soon as you choose a table the Target Output Mode will change

to Append and the structure of the table will be defined on the Map Tab. You can then change the

Output Mode to Clear and Append and map your target fields on the Map Tab.

Exercise

1. Connect to the ASCII Delimited file, Accounts.txt as your source.

2. Create a target connection to the TrainingDB database that we have set up previously.

The table is called tblAccounts. Note that when we connect to this table, the output mode is automatically set to Append because the table already exists. Let’s change the output mode to Replace.

the output mode to Replace.

3. Go to the Map Step.

Note: In this case we already have target fields defined. This metadata (Field names, Field

lengths, and Data types) is defined by the database. Notice also that some fields are mapped

and some are not. The Simple Map view does an automatic Match by Name that pulls in

field names that are exact matches from source to target. We will have to do the rest by

hand.

4. For the AccountNumber field we click inside the target field expression, and then click

the down arrow.

5. We can then choose Account Number. (Note the space in the source field name that is not there in the target field name; that is why Match by Name failed.)


6. Now we do the same for each of the remaining fields. Look at the charts below for

specific mapping if needed.

7. Alternatively, we could have right clicked in the AccountNumber Target Field

Expression and chosen Match by Position. In this case, we would have mapped all of

our source fields into the target fields correctly. However, it is not always the case that

field names will be in perfect position order between the source and target.

8. Run the map by clicking the Run button.

9. Accept the Default Event Handler if necessary.

10. Notice the results in the Target Data Browser. Note the number of records in the table.

11. Now let’s go back to the Target Connection Tab and set the Output Mode to Append.

12. Click the Run button.

13. Notice the results in the Target Data Browser. Note the number of records in the table.

14. Now change the Output Mode to Clear File/Table contents and Append.

15. Run the map and note the results.

16. Save this map as m_OutputModes_Clear_Append.map.xml.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: none

Target OutputMode: Clear File/Table Contents and Append

Target Field Expressions

R1.AccountNumber Fields("Account Number")

R1.Name Fields("Name")


R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")

R1.BirthDate Fields("Birth Date")

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("Standard Payment")

R1.LastPayment Fields("Payments")

R1.Balance Fields("Balance")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Target Output Modes – Delete

Objectives

At the end of this lesson you should be able to understand and implement the target output mode:

Delete.

Keywords: Output Mode, Delete

Description

The Delete mode is only available when your target is a relational database or an ODBC data source. When you select Delete From File/Table, Map Designer will search the target data for a match in a key field or fields that you have defined. Therefore, when you select this output mode, you must also define a key using the Target Keys/Index window.

When you want to delete specific records from an existing table, you should use the target output

mode Delete. Using the Delete mode requires that at least one field in the existing target contains

values that match those in one field in the source file.

Since the target table already exists, as soon as you specify it and set the output mode to Delete on

the Target Tab, you will find the target file’s fields listed on the Map Tab. Map Designer assumes

that the first target field is the key field. If there are additional key fields, highlight them, right-click

in the highlight, and choose the Set as Action Key option. (If you need to remove a key, simply

highlight the field, right-click in the highlight and choose the Unset Action Key option.) Next, you

must map values to each of the fields that have action keys. Finally, use the Target Keys and Output

Mode Options button to specify whether you want to delete all matching records from the target or

just the first one found.

When the ClearMapPut Action is triggered, the contents of the key field(s) in the target buffer are

compared to all records in the target file, and either the first match or all matches are deleted.

Exercise

1. Connect to the ASCII Delimited file, InactiveAccounts.txt as your source.

2. Set the Header Property to True and click Apply as we have done previously.

3. Create a target connection to the TrainingDB database that we have set up previously.

Connect to tblAccounts. Note that when we connected to this table, because it already

existed, our output mode was automatically set to Append. Let’s change it to Delete.

4. Note that the Target Field “AccountNumber” was automatically set as the key field. Map

the Source Field “Account Number” to the Target Field “AccountNumber”.

5. Validate the map, and accept the Default Event Handler.

6. Click the Run button.

7. Notice the results in the Target Data Browser. Note the number of records in the table.

8. Be aware that you will only see results the first time you run the Map. This is because

we will remove the matching records the first time and they will no longer exist. You

will need to load the original source records into the target table before you run the

Delete Mode map a second time. Assuming that you correctly ran the previous Map in

Clear and Append mode, you can run it again to prime the table.

9. Save this map as m_OutputModes_Delete.map.xml.


Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)InactiveAccounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: none

Target OutputMode: Delete

Target Field Expressions

R1.AccountNumber Fields("Account Number")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Target Output Modes – Update

Objectives

At the end of this lesson you should be able to explain exactly what the Target Output Mode

“Update” does. You should also be able to write a transformation that will update specific records in

an existing file or table.

Keywords: Output Mode, Update File and Schema Mismatch

Description

The Update mode is only available when your target is a relational database or an ODBC Data

Source. When you select Update File/Table, Map Designer will search the target data for a match

with a key field or fields that you have defined. You must also define a key using the Target

Keys/Index window.

Update Mode is similar in operation to Delete Mode. The user must indicate which of the current

target fields will be used as the key to identify records to be updated. Each of these key fields must

have a mapping expression. You must also determine whether the target table may contain records

with duplicate keys. You may wish to update all of them or just the first one found.

When Map Designer finds a matching record, the options set in the Target Keys and Output Mode

Options dialog control whether and how an update is performed. You may update just the first

matching record found or all of them (if the target contains records with duplicate keys). For each of

those options, you may decide to insert new records (ones that don’t match any record in the target)

or not. Finally, you can ignore matching records and simply insert those that don’t match any record

currently in the Target file.

Finally, and most importantly, you must specify in your design which fields will be updated, and

with what new values. Your options are to update each Target field with the current value in the

Target Field Expression (even if that result is null) or to just update the fields that actually have

Target Field Expressions. This is faster if the number of fields you need to update is a small subset

of all the fields in the Target file.

Mapping plays a much more critical role in Update Mode than in Delete Mode where no mapping

other than the key fields has any meaning. In Update, you can choose to update all the fields or just

the ones with expressions. You should be careful, however, when choosing the Update All Fields

option. Although you may want to do this, it is not the common practice, so you will have to click

the radio button in the Target Keys, Indexes and Options dialog box that is marked Allow null

values to overwrite data in target fields. When you do, fields that don’t have expressions won’t simply be left alone; they will be cleared.

Exercise

From within the Transformation Map Designer:

1. Connect to the ASCII Delimited file, AccountsUpdate.txt as your source.

2. Set Header to true and apply as we have done previously.

3. Create a target connection to the TrainingDB database that we have set up previously.

The table is called tblAccounts. Note that when we connected to this table, because it

already existed, our output mode was automatically set to “Append”. Let’s set it to

“Update”.


4. Go to the Map Step.

Note: In this case the target fields are already defined. This metadata (Field names, Field

lengths, Datatypes) is coming from the description of the table in the database.

5. Click inside the target field expression for the Account Number field, and then click the

down arrow.

6. Choose Fields(“Account Number”). Note the space in the source field name that is not there in the target field name; that is why Match by Name would fail.

7. Choose the corresponding source field names for each Target Field Expression as we

have just done. Look at the charts below for specific mapping if needed.

8. Alternatively we could have right clicked in the “AccountNumber” Target Field

Expression and chosen Match by Position. In this case, we would have mapped all of

our source fields into the target fields correctly. However, it will not always be the case that source fields and target fields will be in the same position.

9. Notice that the Target Field “AccountNumber” was automatically set as the key field.

10. Open the Target Keys, Indexes and Options dialog box. Note all the options that are

possible using Update Mode. In this case the defaults “Update all matching records and

insert non-matching records” and “Update only mapped fields” are sufficient, although “Update All Fields” would give us the same results since we have mapped all fields.

11. Click the Run button.

12. Accept the Default Event Handler.

13. Notice the results in the Target Data Browser. Note the number of records in the table.

14. When we run this map we will be updating the records, so unless you restore the table to

its original contents before you run the map again, you won’t see any change. You can

just run the map we created for the Clear and Append Mode exercise and then run the

Delete mode map before re-running this map.

15. Save this map as m_OutputModes_Update.map.xml.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)AccountsUpdate.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts


Target Options: none

Target OutputMode: Update

Target Field Expressions

R1.AccountNumber Fields("Account Number")

R1.Name Fields("Name")

R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")

R1.BirthDate Fields("Birth Date")

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("Standard Payment")

R1.LastPayment Fields("Payments")

R1.Balance Fields("Balance")

Define Events: Source R1 Event Handlers

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


The Rapid Integration Flow Language (RIFL) Script Editor

The RIFL Script Editor is the location where you can write your own scripts (expressions) to include

with your Transformations.

This Editor includes a list of all of the functions available to you in the Rapid Integration Flow

Language (RIFL). In addition, it gives you the syntax for each function. Examples for each function

are included in the help files. The RIFL Script Editor lets you use point-and-click and drag-and-drop, with very little typing, to create accurate and valid RIFL scripts that manipulate and validate data during transformations.


RIFL Script - Functions

Objectives

At the end of this lesson you should be able to write one-line RIFL scripts that are simple calls to pre-defined functions such as NamePart or DateValMask, or that concatenate two source fields into a single target field. Local variables, line continuation and comments will also be

discussed.

Keywords: Function Builder, Len, Trim, NamePart, Datevalmask,

comments, continuation character.

Description

In this exercise we will manipulate our data as we run the Transformation. To do this we will use

RIFL in the Target Field Expressions of the fields that we want to manipulate. We’ll work with both

the Field Mapping Wizard and the RIFL Script Editor. We’ll be working with the name and the date

field.
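Before starting the exercise, here are two examples of the kind of one-line expression this lesson describes, using the Len and Trim functions named in the keywords above. These are illustrative sketches only and are not part of the exercise:

    Trim(Records("R1").Fields("Company"))   ' writes the Company value with leading and trailing spaces removed
    Len(Records("R1").Fields("Zip"))        ' writes the number of characters in the Zip field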

Exercise

In Map Designer:

1. Connect to the ASCII Delimited file, Accounts.txt as the source using the user defined

connection Accounts_Delimited.sc.xml.

2. Create a target connection to the TrainingDB database using the ODBC 3.x connector. The table

is tblAccounts. Set the Outputmode to Clear File/Table contents and Append.

3. Go to the Map Step.

4. Map all fields correspondingly except for the “Name” and the “BirthDate” fields.

The first field that we’ll work with is the “Birth Date” field. In our source, the birth date field has string data that appears as “11/12/1975”. Most databases will not accept a string value into a date or datetime field, so we will have to convert the date using the RIFL function DateValMask in the Target Field Expression.

5. Double-click in the Birth Date field’s Target Field Expression, or select the drop down and

choose Build Expression.

6. The RIFL Script Editor for the Birth Date field will open. A list of built-in functions appears in the lower right-hand pane.

7. Find the function DateValMask and select it. In the windows below you will see a description of the function and its parameters.


8. Double Click on the function to add it to the Script Editor. The function will appear along with

its parameters. Use the next steps to define the parameters.

9. In the Script, highlight the parameter DateString. In the lower left pane click on Source R1.

Then in the lower right pane double click Birth Date.

10. Highlight Mask and type "mm/dd/yyyy".

Masks are used in many RIFL functions. In order to know what values to use for masks, look in

the Help files for the topic Picture Mask.

11. The next field we will manipulate is the “Name” target field. The source data names are in the

format, First Middle Last. A sample from the first record is George P Schell. We would like the

Name Field in the target to have the format, Last, First Middle Initial. Example: Schell, George

P.

12. In the RIFL Script Toolbar, click on the Show Expression Tree icon.


13. Select the “Name” field for the Target.

14. Delete the Fields(“Name”) value or any other value in the Editor pane so that it’s blank.

15. In the lower right pane select the NamePart function.

16. Double click the NamePart function to add it to the Scripting Editor.

17. In the editor window select the “Mask” parameter. Type in "l" (a lowercase L in double quotes).

18. Select the “Name” parameter. Pull in the source field “Name” as we did above for the “Birth

Date”. (See step 9.)

19. The script that we have created will return only the last name. We will have to parse the other

parts of the name and use the concatenation icon to create the full name in the desired

format.

20. Use the concatenation operator to add a comma and whitespace to the name format. Write the

following script:

NamePart("l", Records("R1").Fields("Name")) & ", " & _
NamePart("f", Records("R1").Fields("Name")) & " " & _
NamePart("mi", Records("R1").Fields("Name"))

Note: Logically this script is a single expression. We use a space followed by the underscore character as a line continuation, which allows us to continue the script on the next line. This makes the script easier to read.

21. Validate the Script Syntax by selecting the Validation Icon.

Troubleshooting Tips:


Verify that the Script is written as it appears above.

Make sure there aren’t any trailing spaces after the continuation characters (underscores)

22. Click OK in the RIFL Script Editor and save this map as m_RIFLScript_Functions.map.xml

in the Development folder.

23. Run the Map and note the results.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:

R1.AccountNumber Fields("Account Number")

R1.Name NamePart("l", Records("R1").Fields("Name")) & ", " & _

NamePart("f", Records("R1").Fields("Name")) & " " & _

NamePart("mi", Records("R1").Fields("Name"))

R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")


R1.BirthDate DateValMask(Fields("Birth Date"), "mm/dd/yyyy")

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("Standard Payment")

R1.LastPayment Fields("Payments")

R1.Balance Fields("Balance")

Define Events: Source R1 Event Handlers

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


RIFL Script – Flow Control

Objectives

At the end of this lesson you should be able to use multi-line RIFL scripts that utilize one of the

flow-control structures like an If-Then-Else statement.

Keywords: Flow Control, If then Else, Discard, IsDate, and

DateValMask Functions, Editor Properties

Description

Flow Control is the management of data flow. As used in RIFL, it is the management of where and/or how a particular piece of source data is mapped into the target. The most commonly used flow-control construct is the If Then Else statement:

If this statement about my data is true then

Execute this statement.

Else

Execute this statement.

End if

This exercise will evaluate the dates in the source file to determine if they are valid. If the date for a

record is valid then we will write the record to the target. If the date is not valid, we will write a

message to the log file and discard the record so it isn’t written to the target.

Exercise

1. Create a new Map and connect to the source and target listed below.

2. On the Map tab, map all fields as before except for the “Birth Date” field.

3. Open the RIFL Script Editor in the “Birth Date” field.

4. In the lower left pane of the RIFL Script Editor, above ‹All Functions›, click Flow

Control. In the lower right pane, double click If…Then…Else.


Notice that the RIFL Script Editor puts the syntax for the If Then Else statement into the editor window. We will replace condition with an expression that evaluates to true or false. The first statement block becomes the actions that take place if the expression is true, and the second statement block becomes the actions that take place if it is false.

5. Enter the following script, replacing what is in the editor.

Dim d

d = Records("R1").Fields("Birth Date")

If IsDate(d) then

DateValMask(d, "mm/dd/yyyy")

Else

Logmessage("Warn", "Account Number " & Records("R1").Fields("Account Number")

& _

" has an invalid date: " & d)

Discard()

End if


Line 01 declares a local variable d that will be available to us only in this script.

Line 02 sets d to the value contained in the Birth Date field in the source.

Line 04 uses the IsDate function to determine if the string can be converted to a valid date.

Line 05 converts the date for use in the target using the DateValMask function.

Lines 07 and 08 use the LogMessage function. The first parameter of a LogMessage function is

always either “Info”, “Warn”, “Error”, or “Debug”. The second parameter is the string written to the

log file. In this case it is a combination of literal strings and data contained in the source record.

Note the continuation character at the end of line 7.

Line 09 uses the Discard function which causes the source record not to be written to the target.

6. Click the Validate icon. We should see “Expression contains no syntax errors” at

the bottom of the RIFL Script Editor. Click OK.

7. Validate the map and save it as m_RIFLScript_FlowControl.map.xml.

8. Run the Map and note the results in the target. There should be only 201 records in the target.

9. Click on the Transformation Log icon. Note the results of the LogMessage

function.

Map Summary:


Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:

R1.AccountNumber Fields("Account Number")

R1.Name Fields("Name")

R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")

R1.BirthDate
Dim d
d = Records("R1").Fields("Birth Date")
If IsDate(d) Then
    DateValMask(d, "mm/dd/yyyy")
Else
    Logmessage("Warn", "Account Number " & Records("R1").Fields("Account Number") & _
        " has an invalid date: " & d)
    Discard()
End If

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("Standard Payment")


R1.LastPayment Fields("Payments")

R1.Balance Fields("Balance")

Define Events: Source R1 Event Handlers

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Transformation Map Properties

The property-sheet toolbar button accesses the Properties dialog for all global settings.

Using the Transformation and Map Properties dialog affects many areas of the

Transformation Map. These areas include the log file settings, runtime execution properties, error

handling and definitions of external code-modules.


Reject Connection Info

Objectives

Create an additional target file that contains rejected records.

Keywords: Reject function, Reject Connect Info, Connection

String, and OnReject Event Handler.

Description

The Reject Connect Info dialog (pictured below) allows you to specify a reject file. You must

specify a connection string to define where the rejected records will be written. You can manually type

the connection string, or you can use the buttons to Build New Connection String, Build

Connection String from Source, Build Connection String from Target or Clear Reject.
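Compared with the previous lesson, only the Else branch of the Birth Date expression changes: Reject() replaces Discard(), so a failing record is written to the reject connection instead of simply being dropped. Roughly (see also the Map Summary at the end of this lesson):

    Dim d
    d = Records("R1").Fields("Birth Date")
    If IsDate(d) Then
        DateValMask(d, "mm/dd/yyyy")
    Else
        Logmessage("Warn", "Account Number " & Records("R1").Fields("Account Number") & _
            " has an invalid date: " & d)
        Reject()   ' send this record to the reject connection
    End If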

Exercise

1. Using the previous Map, change the Discard() function call to a Reject() function call.

2. Go to the Map Properties dialog and click Build Connection String from Source.

3. Change the file name in the connect string to “BadDateRejects.txt”.

4. Using the Target Event Handler OnReject, add a ClearMapPut Record action.

5. Change the target name parameter from “Target” to “Reject”.

6. Save the map as m_RIFL_RejectConnectInfo.map.xml.

7. Execute the map by clicking the Run icon.


8. Note the results in the Target Data Browser.

9. Navigate to the reject file BadDateRejects.txt. You should see all records that contain

invalid dates.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:

R1.AccountNumber Fields("Account Number")

R1.Name Fields("Name")

R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")

R1.BirthDate
Dim d
d = Records("R1").Fields("Birth Date")
If IsDate(d) Then
    DateValMask(d, "mm/dd/yyyy")
Else
    Logmessage("Warn", "Account Number " & Records("R1").Fields("Account Number") & _
        " has an invalid date: " & d)
    Reject()
End If

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("Standard Payment")

R1.LastPayment Fields("Payments")

R1.Balance Fields("Balance")

Define Events: Source R1 Event Handlers

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1

Define Events: Target R1 Event Handlers

Event Name Event Actions Event Parameters

OnReject ClearMapPut Record target name Reject

record layout R1


Event Handlers & Actions

The event handling capabilities in the Map Designer are designed to allow tremendous flexibility in

the handling of data. Actions can be triggered at virtually any point in the Transformation process.

Messages can be logged, expressions executed, possible errors can be traced, normal data

manipulation and memory clearing can be done, and the Transformation itself can be terminated.

You have complete control over when these Actions occur, what Actions occur, and how many

Actions occur.


Understanding Event Handlers

Objectives

At the end of this lesson you should understand the relationship between an event and an Event

Handler, between an Event Handler and Event Actions and between Event Actions and Event Action

Parameters.

Keywords: Event Action, Event Handlers, Event Precedence,

Default Event Handler, ClearMapPut Action, Execute Action

Event Concepts

An Event is a point in time in the life of a transformation, similar to an event in your lifetime. As in

your own life, some events only occur once (e.g., you graduate from high school) while other events

occur repeatedly (e.g., you have a birthday). In a transformation, two events are at the start and end

of the transformation, each of these events only occurs once. Another event might be when a source

record is read, which will probably occur many times. A transformation may be thought of as a long

sequence of events. Some events occur one time while others occur many times. And other groups of

events may repeat over and over.

As part of your transformation design process, you may choose to perform one or more tasks when

one or more of these events occur. Your transformation will use at least one event and that event will

perform at least one task. The tasks that events perform are called “Actions.” There is a wide range

of actions available for each event. When you decide to perform an action, you have the ability to

control just how that action is performed. These control specifications are called “Action

Parameters.”

As a simple example, you might decide to use the event that occurs every time a source record is

read (the AfterEveryRecord event) and you might decide to perform the action that causes a target

record to be written (the ClearMapPut action). But you might have multiple target record layouts

from which to choose, so you might supply an action parameter for the action to specify the target

record layout you wish to write to. The diagram below demonstrates a logical flow of an event, its

actions, and the action parameters.

Event A source record has been read from the source file.

Event Action Transform the data and write a record to the target file.

Action Parameter Fire this action only if Fields(“Status”) of the source record == “Active”.

Using an Event

The first task is to choose an event to use. Events are grouped in a number of places. There are

events that apply to the transformation as a whole (e.g., BeforeTransformation). These can be found

in the Transformation and Map Properties dialog. Next, there are source record events that apply to

each specific source record type (e.g., AfterEveryRecord). These can be found in the source

hierarchy on the Map Tab under each record type’s heading. Next, there are source record events that

apply to each and every source record that is read, and these can be found under the “General Event

Handlers” heading in the source hierarchy. Finally, there are two groups of target record events: one

group that applies to target records of a specific type and one that applies to each and every target


record (no matter what type). These are found in the target hierarchy under headings like those for

the source record events.

The Default Event Handler

A transformation must have at least one event, and that event must have at least one action. To

ensure that your transformations meet this requirement, Map Designer will define an event and an

action if you do not create any events. The event that it creates is the AfterEveryRecord event for the

source file, and the action that it supplies is the ClearMapPut action for the target file. This event and

its associated action are collectively referred to as the “Default Event Handler.” This Default Event

Handler will automatically read every source record and clear the target buffer, execute all of the

mapping expressions and then write the target buffer contents to the target file for each source

record.

When Map Designer supplies this default event handler, you are informed via an on-screen message

box. However, Map Designer supplies the default event handler ONLY if you do not create any

event handlers. If you do, then Map Designer WILL NOT ADD the default event handler. Map

Designer will, however, warn you when you are about to run a transformation that has no event

action that will cause a target record to be written.

Commonly Used Events

Some events are very basic and are used frequently. Most of these events will be discussed and used

in the exercises in this course module. You should be aware of these events and when they occur.

BeforeTransformation: This is the first event that occurs in any transformation, and is very

useful for all the housekeeping and set-up tasks that you may wish to perform.

After Transformation: This is the last event that occurs before a transformation ends, and

it is very useful for accessing final totals and other values, and performing housekeeping and

clean-up tasks.

Specific AfterEveryRecord: The word “specific” refers to an event that is tied to a

particular source or target record type. This event occurs whenever a source record of a

specific type is read, and is the ideal place to perform the action you want to do using the

values from each source record.

Specific AfterFirstRecord: This event only occurs when the first record of a specific type

is read, and it is the ideal event in which to perform housekeeping and set-up tasks that relate

to a single record type.

General AfterFirstRecord: The word “General” refers to an event that is not tied to a

particular source or target record type. This particular event occurs only when the first

record is read from the source file and is again a great place to perform general

housekeeping and set-up tasks that relate to all record types.

General AfterEveryRecord: This event occurs whenever a source record is read from the source file, no matter what type it may be. It is the best place to put common tasks, those that will apply to all source records.

Commonly Used Actions

There are many actions that you can perform when a particular event occurs. Some actions are used

very often and are common to many events. The two most common actions are:


ClearMapPut: This action is three actions in one. The first action clears the target buffer

(for the record type specified in its “Layout” parameter). Next, it executes all the mapping

expressions that you have supplied for each field in the target buffer, in effect filling the

target buffer fields with the data specified by the Target Field Expressions. Finally, it writes

the contents of the buffer to the target file. A visual representation of these actions is

pictured in the next topic.

Execute: This action executes a script created with the RIFL Script Editor. The scripts you

write and execute perform the work of your transformation.
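As a sketch of the kind of script an Execute action might run, assume a global variable named varCounter has been defined in the Map Properties dialog (the variable name here is illustrative only):

    ' count the records processed and note progress in the log
    varCounter = varCounter + 1
    LogMessage("Info", "Processed " & varCounter & " source records so far")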

For additional documentation of using Event Handlers, read the Event Management

Guide: http://docs.pervasive.com/products/integration/download/events.pdf.


Source and Target Buffers – ClearMapPut Action

I. Initial Buffer State before the transformation begins.

II. State of Buffers after n number or records are processed.


III. State of Buffers after a new Source record is read and the data is stored in the Source Buffer.

IV. The first action of the ClearMapPut causes the Target Buffer to be cleared.


V. The second action of the ClearMapPut causes the data in the Source Buffer to be mapped

to the Target Buffer.

VI. The third action of the ClearMapPut causes the data to be written to the Target File or Table.


Event Sequence Issues

Objectives

At the end of this lesson you should understand how to define Events, and have a general

understanding of the rules governing the sequence in which Events occur in a typical transformation.

Keywords: Event Precedence, Null Connector, Global variables

Description

There are many Events available in a transformation. Choosing the appropriate events depends on

the sequence in which they are activated. There is an Event Precedence Framework that dictates the

sequence in which events will occur based on the Map instructions provided. The Events that are

activated depend on the Events you have chosen to utilize, and the data in the source file.

In order to derive the general rules of the Event Precedence Framework, you can perform your own

tests. You can create a transformation that uses the Null source connector and that produces a target

file with two fields: (1) The record number from the source; (2) The contents of a global variable.

Then, choose the events you are interested in testing. For each one, set the global variable equal to

the name of the event, and then write a target record. When you examine the target file, you will see

the order in which the events were activated.

This exercise will introduce the use of global variables. Using the Global Variables option,

you can specify scalar variables, internal objects, or ActiveX objects at the Private or Public level in

your Transformations. Global variables are defined in the Map Properties dialog.
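The pattern used in the test below is simply to set the variable in an Execute action and then read it in a Target Field Expression. A minimal sketch, assuming a Variant variable named eventName as in the Map Summary that follows:

    ' Execute expression attached to an event, for example BeforeTransformation
    eventName = "Before Transformation"

    ' Target Field Expression for the EventName field
    eventName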


Exercise

1. Create a map based on the specifications given below. Save the Map as

m_Events_SequenceTest.map.xml.

2. Run the map and observe the results.

Most of our exercises make some attempt to mimic a real world situation in a simplified

fashion. This exercise, however, is pure classroom.

Map Summary:

Define the Source:

Source Connector: Null

Source Options: Record count = 5

Define the Target:

Target Connector: ASCII(Delimited)

Target Data: $(FUN_DATA)EventNames.txt

Target Options: Header = True

Target OutputMode: Replace

Variables:

Name Type Public Value

eventName Variant no

Define Events: Transformation Events

Event Name Event Actions Event Parameters

Before Transformation Execute Expression:

eventName = "Before Transformation"

ClearMapPut Record target name Target

record layout R1


After Transformation Execute Expression:

eventName = "After Transformation"

ClearMapPut Record target name Target

record layout R1

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord Execute Expression:

eventName = "R1 AfterEveryRecord"

ClearMapPut Record target name Target

record layout R1

Define Events: Source General Events

Event Name Event Actions Event Parameters

AfterEveryRecord Execute Expression:

eventName = "General AfterEveryRecord"

ClearMapPut Record target name Target

record layout R1

BeforeFirstRecord Execute Expression:

eventName = "General BeforeFirstRecord"

ClearMapPut Record target name Target

record layout R1

OnEOF Execute Expression:

eventName = "General OnEOF"

ClearMapPut Record target name Target

record layout R1

Note: The following target fields should be created manually through the user interface.


Target Schema: Record R1

Name Type Length Description

RecordNumber Text 16

EventName Text 30

Target Field Expressions

R1.RecordNumber Fields("Record Number")

R1.EventName eventName


Using Action Parameters – Conditional Put

Objectives

At the end of this lesson you should be able to open the action list for an event, choose an action and

add it to the list, supply mandatory and optional parameters for the action and place the action in the

correct sequence within the action list. For those actions that allow it, you should also know how to

make the execution of an action “conditional.”

Keywords: AfterEveryRecord event, and ClearMapPut action

Description

Actions are controlled by setting Action Parameters. For example, changing the Target Record

layout parameter determines what expressions are used and what kind of target record will be

written.

Many actions can be performed conditionally. These actions will have a Count and/or Counter

Variable parameters. The Count parameter accepts any expression, the result of which must be a

numeric value. When this value is zero, the action is not performed; when it is one, the action is

performed. When the value is greater than one, the action is performed repeatedly based on the value returned. By default the Count Parameter has a value of 1. The Counter Variable parameter provides an index for the current repetition count.

This exercise will use the Count Parameter to write an expression that checks for invalid birthdates

in the source file. By returning a 0 when we find a record with an invalid date, ClearMapPut will not

fire and the source record will not appear in the target. By returning a 1 when the date is valid, the

ClearMapPut will fire once, and we will see the record in the target.

In order to become more familiar with the use of global variables, we will increment a variable that

is keeping track of the number of invalid dates we have. At the end of the Map we will display a

message box to see how many birthdates are invalid.

Exercise

1. Create our map based on the specifications given below.

2. Save the map as m_Events_ConditionalPut.map.xml.

3. Run the map and observe the results.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True


Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Variables:

Name Type Public Value

varBadDates Variant no 0

Target Field Expressions

R1.AccountNumber Records("R1").Fields("Account Number")

R1.Name Records("R1").Fields("Name")

R1.Company Records("R1").Fields("Company")

R1.Street Records("R1").Fields("Street")

R1.City Records("R1").Fields("City")

R1.State Records("R1").Fields("State")

R1.Zip Records("R1").Fields("Zip")

R1.Email Records("R1").Fields("Email")

R1.BirthDate DateValMask(Records("R1").Fields("Birth Date"),"mm/dd/yyyy")

R1.Favorites Records("R1").Fields("Favorites")

R1.StandardPayment Records("R1").Fields("Standard Payment")

R1.LastPayment Records("R1").Fields("Payments")

R1.Balance Records("R1").Fields("Balance")


Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1

count

Dim d

d = Records("R1").Fields("Birth Date")

' Use flow control to test for a valid date

If IsDate(d) Then

' Enable the Put action by returning 1

1

Else

' Invalid date, log a message

Logmessage("Error", "Account number: " & Records("R1").Fields("Account Number") & _

" has an invalid date: " & d)

' Increment counter

varBadDates = varBadDates + 1

' Suppress the Put action by setting to zero

0

End If


Using OnDataChange Events

Objectives

At the end of this lesson you should be able to use an OnDataChange event to execute certain actions

whenever the value of a field by which the input file is in order changes. You should also be able to

manipulate the first and last data change events to achieve your desired results.

Keywords: OnDataChange, and Record Type Event Handlers

Description

When source files are sorted by one or more data items, Map Designer gives you the ability to

monitor the value of one of those sort keys as source records are put into the source buffer. When the

value changes from one record to the next, you have the ability take whatever actions you wish. This

is accomplished by suing an OnDataChange Event and defining actions for that event.

Processing of this type is very common. There are three situations in which it is often used.

To produce summary information in the target file.

To optimize transformations performing lookups.

When the target has a hierarchical structure such as an XML file.

When using an OnDataChange Event, first specify the source field by which the source file is sorted.

The transformation will then monitor the position that field occupies in the source buffer. Whenever

a new source record is placed in the source buffer, the transformation will compare the value of that

field in the new record to the value of the field in the previous record. When the values are different,

the OnDataChange Event Handler will execute the list of actions specified.

For all of this to work, the source file should be in order by the value(s) being monitored. If it is not,

you can either (1) physically sort it prior to its input into the transformation or (2) allow the

transformation to dynamically sort it. For flat files, using the Source Keys and Sorting dialog will

perform this dynamic sort. For an SQL source, you can use the “Order By” clause in your SQL

query.

It is true that sorting the data at the beginning of your transformation increases execution time, but

the reductions in execution time that are possible with the OnDataChange strategy will usually far

outweigh the overhead of the sort itself. This is particularly true when IO-intensive operations, such

as lookups, are involved.

You can monitor up to five different data items in a single transformation. There is also an event that

is activated whenever any monitored field changes and an event that is only activated when all

monitored fields change at the same time.

This exercise builds a map that sorts our Accounts.txt file by state. Our target file will have one

record for every state in the source file. Each record will have three fields: the state, the number of accounts in that state, and the total balance of all accounts in that state.

Exercise

1. Create our map based on the specifications given below. Save the Map as

m_Events_OnDataChange.map.xml.

2. Run the map and observe the result.


Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Automatic Transformation Feature:

Sort Fields on Source Data: Fields("State")

type=Text

ascending=yes

length=2

Define the Target:

Target Connector: Excel 2000 or Excel XP

Target Data: File: $(FUN_DATA)AccountSummariesByState.xls

Sheet: Sheet1

Target Options: Header Record Row: 1

Target OutputMode: Replace

Target Schema

Field Name Type Length Description

State Text 16

Number_of_Accounts Text 16

Total_Balance_of_Accounts Text 16

Total 48

Variables:

Name Type Public Value


varState Variant no

varCounter Variant no 0

varBalance Variant no 0

Target Field Expressions

R1.State varState

R1.Number_of_Accounts varCounter

R1.Total_Balance_of_Accounts varBalance

Define Events: Source R1 Event Handlers

Event Name Event Actions Event Parameters

AfterEveryRecord Execute Expression:

' Set the state value for the current record since it will change
' when we are ready to write the data to the target

varState = Records("R1").Fields("State")

' Increment the counter

varCounter = varCounter + 1

' Accumulate the balance

varBalance = varBalance + Records("R1").Fields("Balance")

There are also two special situations that should be considered. When the very first record is

placed in the buffer the value of the field being monitored will have changed since the source buffer

is always filled with null values at the start of a transformation. So the OnDataChange Event will fire

after the first record is read. Similarly, when the source buffer is cleared after the last record has

been processed, the value of the field being monitored will change from some real value to a null

value, and again the OnDataChange Event will be fired. However, these situations may or may not

be useful in any given transformation. Therefore, you have the option of suppressing one or the

other, or both of them. This is controlled in the Data Change Event Management Options.


Define Events: Source R1 OnDataChangeEvent

Monitor: Records("R1").Fields("State")

Management: Suppress first ODC event, Fire Extra ODC event at EOF

Event Name Event Actions Event Parameters

OnDataChange1 ClearMapPut Record target name Target

record layout R1

Execute Expression:

' Reset the variables for the records belonging to the next state
varCounter = 0
varBalance = 0

After reviewing the target data, you may notice that the decimal precision is not formatted correctly. The precision is lost because the Balance field, which is stored as text, is converted to a numeric value, the values are summed, and the result is converted back to text. To fix the precision, change the Source data type of the Balance field to Decimal and set the number of decimal places to 2.


Trapping Processing Errors with Events

Objectives

At the end of this lesson you should be able to use the OnError event handler to trap and handle

processing errors yourself, including using file management functions to record information in a file

independent of the Source, Target or Reject connections.

Keywords: Error Message Reference Chart, Error and Event

Preferences, Chr, MacroExpand, FileAppend, and File Functions

Description

In order to handle errors, we need to be aware of the types of errors that can exist, the options we

have in dealing with them, the list of specific error codes that we may wish to deal with and then

some strategies for dealing with the records that may cause these errors.

In general, there are three types of errors that we may be concerned with. “Fatal” errors are those

that will cause a map to terminate. “General” errors are those that are not fatal but may affect the

transformation process. An example might be a read error for a specific source record. “Warnings”

are not necessarily errors, and include data truncation, field name changes, loss of precision, and so

on. In the Error Logging Preferences, we can set these errors (and other messages) to be logged or

not. We also have some control over when the transformation terminates.

In other cases, we may wish to deal with certain errors, and it would be useful to know what all the

errors are and how they can be identified. There are too many to list here, but you can find a

complete list of the errors and error codes in the Help System in the series of pages entitled “Errors”.

To give you flexibility in handling these errors, a number of individual events are tied to specific errors; when one of those errors occurs, Map Designer checks the matching event handler to see whether you have supplied actions to be performed. If so, they are executed. If not, whatever would normally happen for that error (e.g., the transformation aborts) will happen.

When we wish, we can use the specialized error event handlers, such as the OnTruncateError Event.

The transformation will automatically take care of identifying the error and will transfer control to

that event handler instead of aborting the transformation. In the event handler, we can perform any

tasks we wish to deal with the error. Once we have done so, we can either allow the transformation

to terminate or cause the transformation to pick up where it left off by using the Resume action.
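For instance (a sketch only, not part of the exercise below; the message text is illustrative and the functions used appear elsewhere in this workbook), an OnTruncateError Event Handler could simply log the offending account and then continue:

Event Name Event Actions Event Parameters

OnTruncateError Execute Expression:

' Note the account that caused the truncation, then carry on
LogMessage("ERROR", "Truncation while processing account " & Records("R1").Fields("Account Number"))

Resume none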

In this exercise, our source will be the Accounts.txt file. We will create a target file to show how

many months each customer will take to pay off their balance if they continue to make payments of a

certain amount. We will derive this value by dividing the “Balance” field by the “Payments” field.

However, there will be a problem if we have a customer that has a 0 value in the payment field since

we will be attempting to divide by zero. In this case Map Designer will throw an error. We can catch

this error using the error handling event handlers.

Another goal of this exercise is to write an additional file that contains the values of the payments

and balances that cause the error. We will do this with the FileAppend function as well as other file

manipulation functions.

Exercise

1. Create our map based on the specifications given below. Save the map as

m_Events_OnError_Event.map.xml.

2. Run the map and observe the results.


Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ASCII(Delimited)

Target Data: File: $(FUN_DATA)PaymentsRemaining.txt

Target Options: Header = True

Target OutputMode: Replace

Target Fields:

Name Type Length Description

Account Number Text 9

Payments Text 7

Balance Text 6

Payments Remaining Text 16

Target Field Expressions

R1.Account Number Records("R1").Fields("Account Number")

R1.Payments Records("R1").Fields("Payments")

R1.Balance Records("R1").Fields("Balance")

R1.Payments Remaining Dim pmt, bal

pmt = Records("R1").Fields("Payments")

bal = Records("R1").Fields("Balance")

If Int(bal/pmt) == bal/pmt then

bal/pmt

Else


Int(bal/pmt) + 1

End if

Variables

Name Type Public Value

flagFirstTime Variant no 0

errorFile Variant no

Define Events: Source General Events

Event Name Event Actions Event Parameters

BeforeFirstRecord Execute Expression:

' Set the value of the file variable
errorFile = MacroExpand("$(FUN_DATA)DivideByZero.txt")

' Remove any error file left over from a previous run
If FileExists(errorFile) Then
    FileDelete(errorFile)
End If

Note: This example shows the MacroExpand, FileExists, and FileDelete functions, though a similar result could be achieved by using:

FileWrite(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)

' where sep = "|" and crlf = Chr(13) & Chr(10)

This would replace any existing file with a file that contains only the header, and it would also make the flagFirstTime variable unnecessary.
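Putting that suggestion together, a sketch of the alternative BeforeFirstRecord expression (using only functions already shown in this exercise) might look like this:

Dim sep, crlf
sep = "|"
crlf = Chr(13) & Chr(10)
errorFile = MacroExpand("$(FUN_DATA)DivideByZero.txt")
' FileWrite replaces the file, so the header is written exactly once per run
FileWrite(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)

With this version, the OnError handler below would only need its final FileAppend call.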

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Define Events: Target General Events

Event Name Event Actions Event Parameters

OnError Execute Expression:

Dim sep, crlf

sep = "|"

crlf = Chr(13) & Chr(10)

If flagFirstTime == 0 Then
    ' Write the header for the error file
    FileAppend(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)
    ' Set the flag to 1 so the header will not be written next time
    flagFirstTime = 1
End If

' Append the values that caused the error
FileAppend(errorFile, Records("R1").Fields("Account Number") & sep & _
    Records("R1").Fields("Payments") & sep & _
    Records("R1").Fields("Balance") & crlf)

Resume none

Do not forget the Resume Action. The Resume Action is what causes the map to continue

processing the remaining records after the error is handled.

3. Observe the DivideByZero.txt file that is created in the Data folder.


Error and Exception Handling Review

Types of Errors and Log Messages

Errors occur during data integration at design time and run time.

Design-Time Errors

Design-time errors occur while you are using an application such as Map Designer and

Process Designer.

Run-Time Errors

Run-time errors occur during the execution of a transformation or process. Because the

designers can run transformations and processes through Integration Engine, run-time errors

can also be displayed in the designer interfaces.

Run-time errors are generated from the following places:

Designers

RIFL scripts

Integration Engine command line console

SDK code


Error Log Messages

All errors originate from the Integration Engine, but the log to which they are written

depends on the interface being used. For instance, if you are using Map Designer, then errors

are logged to an error file. If you are using Integration Engine, then error messages are

displayed in the console and written to a log file. See the following topics for more

information on error logging.

Map Designer

Errors that occur in Map Designer are displayed in a dialog box and logged to a

TransformMap.log file.

Process Designer

Errors that occur in Process Designer are logged to a process log file named by the designer.

Integration Engine

In Integration Engine, the last error message generated while loading, changing, or running a transformation or process is written to the command line console. All error messages are logged to a log file. The default name of the log file depends upon which interface you are calling. For instance, if you are running a transformation on the command line, errors are logged to the TransformMap.log file; if you are running a process, errors are logged to a process log file.

RIFL

The Rapid Integration Flow Language (RIFL) includes functions and statements that return

information about errors to the error log files.

For instance, you can use the LogMessage Function to write messages to an error log file,

and the On Error GoTo Statement to trap run-time errors.
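A minimal RIFL sketch of that pattern (the label name and message text are illustrative; a fuller version appears in the FileList exercise later in this workbook):

On Error GoTo ErrorHandler
' ... the work of the script goes here ...
Return

ErrorHandler:
' Write the error details to the log, then stop
LogMessage("ERROR", "Err.Number = " & Err.Number & " Err.Description = " & Err.Description)
Terminate()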


Comprehensive Review

To test our knowledge and review the introductory module for the Cosmos Integration Essentials courses, we will design a Map to load data from the Accounts.txt file into a target database table.

Basic Map specifications:

Source Connector: ASCII (Delimited)

Source File: Accounts.txt

Header property: True

Target Connector: ODBC 3.x

Data Source Name: TrainingDB

Table: tblIllini

Output Mode: Replace Table

Exercise

1. Map the four target fields with the appropriate data from the source.

2. Use the appropriate Event and Action that will write all source records to the target.

Hint: This is also the Default Event Handler.

3. In the Target BirthDate Field, use an appropriate Date/Time function to convert the

formatted date strings into a real date-time data type.

4. Test for invalid dates using the IsDate function, and reject the invalid records to an

ASCII Delimited file named Reject_Accounts.txt.

5. Reject all records from the state of Illinois (IL) into the Reject_Accounts.txt file as well.

Hint: You will have to use a Target Event to write the rejected record to the file.

6. Aggregate the Balances from all rejected records using a global variable.

7. Report the aggregated balance (total balance) in the log file using the LogMessage

function.
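If you get stuck on steps 4 through 7, the sketch below shows one possible shape for the logic. It is not the solution map: the variable name varRejectedBalance and the source field names are illustrative, and the actual record writing is still done with event actions as in earlier exercises.

' AfterEveryRecord (sketch): decide where the record should go
If IsDate(Records("R1").Fields("BirthDate")) And Records("R1").Fields("State") <> "IL" Then
    ' Valid record: follow this with a ClearMapPut Record action to the Target
Else
    ' Invalid date or Illinois record: route it to the Reject connection instead,
    ' and accumulate its balance in a global variable
    varRejectedBalance = varRejectedBalance + Records("R1").Fields("Balance")
End If

Once all records have been processed, the total can be reported with LogMessage, for example:

LogMessage("INFO", "Total balance of rejected records: " & varRejectedBalance)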

The solution to this review is in the Solutions folder. It is named

m_Comprehensive_Review.map.xml. Open it and look only if you get stuck. It should be noted that

the solution map shows only one way to complete this exercise. There are several.



Metadata – Using the Schema Designers


Structured Schema Designer

The Structured Schema Designer provides a visual user interface for defining the structure of data files. The resulting metadata is stored in Structured Schema files with an “.ss.xml” extension. The “.ss.xml” files include schema, record recognition rule, and record validation rule information.

In the Structured Schema Designer, you can create or modify schemas that can be accessed in the

Map Designer to provide structure for Source or Target files.

You can also use the Data Parser to manually parse Binary, fixed-length ASCII, or any other files

that do not have internal metadata. The Data Parser defines the Source record length, Source field sizes and data types, and Source data properties; assigns Source field names; and defines schemas with multiple record types.


No Metadata Available (ASCII Fixed)

Objectives

At the end of this lesson you should be able to define and create a structured schema.

Keywords: Data Parser, Modify a Schema, and Hex Browser

Description

The first step will be to tell the Structured Schema what type of file is being defined, so pick the

appropriate connector first.

You can change the name of the default record type from R1 to something more meaningful for your

own task. This is done by choosing Record Types in the hierarchy and overtyping the existing

default name.

To enter the field information for your record layout, select the Fields entry in the navigation tree

and enter the first field name. Then tab through the grid to enter a description (if desired), select a data type from the dropdown, and enter the field length.

This method assumes you have documentation that describes the structure of the file.

If you do not already know the structure of the file, you can use the visual parser to make educated

guesses until the file is parsed correctly.

When you’re done, you can browse the data to ensure that your definitions were accurate and then

save the structured schema using a name of your choosing.

Exercise

1. Start a New Structured Schema design and choose the ASCII Fixed connector.

2. Click the Visual Parser toolbar button (Red Knife).

3. Navigate to the file named Payments.txt.

4. Click in the current row (blue highlight) between the fields and name the fields by

overtyping in the Field Name drop down list.

5. Save the Structured Schema as s_Payments.ss.xml.

Record Layouts

Record R1

Name Type Length

AccountNumber Text 9

PaymentDate Text 8

Amount Text 10

Total 27


External Metadata (Cobol Copybook)

Objectives

At the end of this lesson you should be able to use a Cobol Copybook to create a new schema.

Keywords: Copybook

Description

You can quickly import the structure from an external definition file.

If you do this from a Map Design session, it will not create a Structured Schema for reuse later.

If you do this from a Structured Schema Design session, you will have the “ss.xml” file for reuse.

If you decide to use a Cobol Copybook to create a new schema, you will define the schema in the Enter External Connection Info window.

External Connector Section

This displays the connector you chose from the Connection pull-down. You cannot change the connector from this pull-down; instead, make the change from the Connections entry in the toolbar options pull-down.

Layout/Record Name

When you choose the External File to duplicate, the data will populate the Layout/Record Name

section. You will need to select layouts required for the schema by clicking each item in the Add to

Layouts column.

There are two buttons to aid in selecting Layout/Record items:

Select all

Click Select all to choose all of the Layout/Record items.

Unselect all

If you need to make a change to the Layout/Record Name section, click Unselect all to start over.

Exercise

1. Start a New Structured Schema design session and choose the Binary connector.

2. Using the drop-down menu in the upper right hand of the window, choose Cobol 01.

3. Navigate to the file named Accounts_Cobol.cbk.

4. Click the Layout/Record Name(s) you want to import.

5. Click OK.

6. Review the structure in the grid view.

7. Save the Structured Schema as s_CobolCopyBook_Accounts.ss.xml.


Record Layouts

Record ACCOUNT_INFO

Name Type Length

ACCTNUM Display 9

NAME Display 21

COMPANY Display 31

STREET Display 35

CITY Display 16

STATE Display 2

POSTCODE Display 10

EMAIL Display 25

BIRTHDATE Display 10

FAVORITES Display 11

STDPAYT Display sign leading 6

LASTPAYT Display sign leading 6

BALANCE Display sign leading 6



Extract Schema Designer

The Extract Schema Designer is a tool that can read complex text files of many kinds. The amount of computer data grows vastly each year, and much of it is provided in raw text formats. Some examples of the many sources handled by the Extract Schema Designer follow:

Printouts from programs captured as disk files

Reports of any size or dimension

ASCII or any type of EBCDIC text files

Spooled print files

Fixed length sequential files

Complex multi-line files

Downloaded text files (e.g., news retrieval, financial, real estate...)

HTML and other structured documents

Internet text downloads

E-mail header and body

On-line textual databases

CD-ROM textbases

Files with tagged data fields

XML

HL7

Swift

Extract Schema Designer does NOT use the XML repository that all of our other Design Tools use.

Extract Schema Designer saves extracts in two ways. The first is in a script file in Content Extractor

Language with a “.cxl” extension. This file is only useful as part of a Source Connection in Map

Designer. It cannot be imported into Extract Schema Designer to be edited. The second way that an

Extract is saved is in an Access Database. The default path and filename for this database is

C:\Program Files\Pervasive\Cosmos9\Common\extractor900.mdb. Extracts stored here can be

reopened and edited.

Content Extractor Language is very rich and expressive, and provides many advanced data

manipulation and formatting capabilities. CXL can be used to create or customize complex scripts

necessary for text files whose patterns and rules may be beyond the functionality of the user

interface supplied with the Extract Schema Designer. More information about this language is

available in the Content Extraction Language Help file under the SDK Help Files. The default path

and filename for this file is C:\Program Files\Pervasive\Cosmos9\Common\Help\SDKs\cxl_sdk.pdf.

Former users of Data Junction Content Extractor should be aware that the script files are no longer

called “DJP” files. They are known as “CXL” files now.


There are several legacy names that may be used in place of the default connector name of Extract

Schema Designer’s Connector. This list includes: Cambio, Content Extractor, Extractor, and Report

Reader.

There are also two connectors that have a pre-designed script included with the software that parse

statistical information from the log file automatically. These are Data Junction Log File and

Integration Log File.


Interface Fundamentals & CXL

Keywords: Extract Schema Designer Mechanics: Line Styles,

Fields, Accept Record, Automatic Parsing

Description

The first file that we will be parsing is Purchases_Phone.txt. We should take a look at it first in a

text viewer. Although it might be possible to use this report file as a direct input for a transformation,

we would have to define it as a multiple-record-type file. With so many record types and so much

processing involved with them, writing the transformation would be time consuming. So what we

will do is use the Extract Schema Designer to create an extract specification that will transform the

report file into a more familiar row/column format, and then use that formatted data as input to the

transformation that adds these purchases to the database table. We don't even need a two-step procedure, nor do we have to read the report file twice. Once the extract schema is defined, we can create a transformation, specify the report file as the Source, and apply the Extract Schema to it. The file will then be presented to the transformation in simple rows and columns, complete with headers.

Exercise

Start Extract Schema Designer.

1. From the Repository Explorer, select New Object › Extract Schema.

2. At the prompt, navigate to the file you will be working with, in this case,

Purchases_Phone.txt.

3. Choose OK to accept the Source Options defaults.

4. Highlight the word “Category” on one of the Category lines and right-click in the

highlight.

5. Select Define Line Style › New Line Style.

6. Verify that all defaults are acceptable and click Add. We’ve now defined a Line Style

for the Category field.

7. Highlight the Category code on one of the Category lines and right-click in the

highlight.

8. Select Define Data Field › New Data Field.

9. Change the field name to Category.

10. Verify that all other defaults are acceptable and click Add. We’ve now defined the

Category Data Field.

11. Highlight a ProductNumber and the rest of the spaces on the line and right-click in the

highlight.

12. Select Define Data Field › New Data Field.

13. Change the field name to ProductNumber.

14. Verify that all other defaults are acceptable and click Add.


15. Highlight a Quantity and all but one of the spaces between the actual digits of the

Quantity and the colon following the literal “Quantity” (if any).

16. Right-click in the highlight and select Define Data Field › New Data Field.

17. Change the field name to Quantity.

18. Verify that all other defaults are acceptable and click Add.

Now let’s ensure that Source Options will allow parsing:

19. Select Source › Options from the Menu bar.

20. On the Extract Design Choices tab, look in the Tag Separator dropdown to see if there

is a character sequence that matches the sequences used in your data to separate Line

Style “tags” from actual data. If there is, select it. If there is not, then automatic parsing

is not available. Also on this tab, ensure that the Trim Leading and Trailing Spaces

checkbox is selected.

21. On the Display Choices tab, ensure that the Pad Lines checkbox is selected.

22. Click OK to accept the selections.

Now let’s define the UnitCost Line Style and Data Field simultaneously.

23. Highlight an entire UnitCost line in the data and right-click in the highlight.

24. Select Define Data Field › Parse Tagged Data.

Note: When Line Styles and Fields are defined in this way, the default name for the Field is

exactly the same as that for the Line Style, so no change to the field name is usually

necessary. If a change is desired, however, point your cursor to the actual field data in the

display and double-click on the data. This will bring up the Field Definition dialog box and

you can change the name (or other characteristics) here.

Now we’ll define the TotalCost and ShipmentMethodCode Line Styles and Data Fields

simultaneously.

25. Highlight an entire TotalCost line and ShipmentMethodCode line in the data.

26. Right-click in the highlight and select Define Data Field › Parse Tagged Data.

The next thing is to define the Line Style that determines the end of a row of data for the

Extract File.

27. Locate the Line Style that contains the Field that will be the last column in each row in

the eventual extract file (in this case, ShipmentMethodCode).

28. Double-click on the Line Style name to bring up the Line Style Definition dialog.

29. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.

30. Click Update.

Test the Extract to ensure that your definitions are correct.

31. Click on the Browse Data Record button.

32. Choose OK to allow assignment of all Fields to the Extract File.

33. Examine the data to ensure that your Field definitions are correct.


34. Close the browser window.

35. Use the Parse Tagged Data functionality to define the Account Number, Purchase

Order Number and PODate fields.

36. Double-click on a Purchase Order Number to access the Field Definition dialog.

Note: The options at this dialog determine how the Extract Schema Designer will process the data in

this particular field from record to record. The use of these options makes a distinction between the

data fields and the contents of those fields. When the Extract Schema Designer is collecting data

fields, it collects all the fields that have been defined on lines of text whose line action is either

COLLECT Fields or ACCEPT Record and assembles those fields into a data record. The options at

this dialog determine how data within a data field is handled.

37. On the Data Collection/Output tab, ensure that Propagate Field Contents has been

selected.

38. Double-click on a PODate to access the Field Definition dialog.

39. On the Data Collection/Output tab, select Flush Field Contents.

40. Click Update.

41. Click on the Browse Data Record button.

42. Choose OK to allow assignment of all Fields to the Extract File.

43. Examine the data to see the effect of Propagate versus Flush.

44. Close the browser window.

45. Redefine the PODate field to propagate it as well.

46. Browse the data record again to ensure the data is being propagated.


Note: In this case we do want the data to propagate, but you will need to decide which

behavior you want for any situation.

We can specify an order for the columns in your Extract File (if desired).

47. Choose Field › Export Field Layout from the menu bar.

48. To reposition a column, left-click and drag a column name up or down in the list,

dropping it on top of another column name.

Note: When you drag “upward,” the column you are dragging will be placed before the

column on which you drop it. When you drag “downward,” the column you are dragging

will be placed after the column on which you drop it.

49. Put the six columns in the order they appear in the source file.

50. Click OK.

51. Exclude columns from the Extract File (if desired).

52. Select Record › Edit Accept Record from the menu bar.

53. Clear the check boxes for the columns that you do not wish to appear in the Extract File.

54. Click Update.

55. Save the Extract Schema Definition:

If the Extract Schema Definition has already been saved before, click the Save

Extract button to save it again under the same name. You may also choose File

› Save Extract to perform the same function.

If the Extract Schema Definition has not yet been saved, click the Save Extract

button. In the “Save Extract” dialog, supply the name Purchases_Phone.cxl and

verify the location where the Definition will be stored (changing it if necessary).

You may also choose File › Save Extract to perform the same function.

If the Extract Schema Definition has been saved before, but you have modified it

and want to save it as a different Definition, then choose File › Save Extract As. In

the “Save Extract” dialog, supply a name for the Definition and verify or supply the

save location.

56. Close the Extract Schema Designer.

57. Open Map Designer and establish a source connection based on the information below.

58. Open the Source Data Browser and note the results. Note that this source could now be

used in the same way that any other source would be in a transformation.

59. Close Map Designer without saving.


Data Collection/Output Options

Keywords: Data output properties: Flush or Propagate field

contents

Description

Most of the time, all you will want to extract from files such as our report file is the actual data that describes the business objects, in this case the purchases. But sometimes there will be other information in the file that you would also like to capture. For example, there may be “header” or “footer” information that you would like to have available in the transformation.

Schema Designer we can define header and/or footer information, and add it, as additional columns,

to the row/column file specification.

Exercise

1. From the Repository Explorer, select New Object › Extract Schema.

2. At the file selection prompt, click Cancel.

3. Double-click on the Purchases_Phone.cxl script to open it.

4. Choose File › Save Extract As and save the extract again as Purchases_Phone2.cxl.

5. Highlight the first slash in the ReportDate.

6. Right-click in the highlight and select Define Line Style › New Line Style.

7. Change the proposed name to ReportDate.

8. Choose Add.

9. Highlight the second slash in the ReportDate.

10. Right-click in the highlight and choose Define Line Style › Append Line Pattern.

11. Double-click on the ReportDate line style name to view the results.

Note: This Line Style definition will be sufficient so long as there is no other line of

information in the file that has slashes in positions 24 and 27 and which does not contain a

Report Date. If there were, we could use the same procedure to add the spaces in front of

and after the actual date. If that were still not sufficient, then we could use additional

techniques that we will learn in later exercises to make the Line Style definition a unique

one.

12. Highlight the Report Date.

13. Right-click in the highlight and select Define Data Field › New Data Field.

14. In the Field Definition dialog, change the name of the Field to ReportDate.

15. Click Add.

16. Use the Browse Data Record button to view the results.

17. Highlight the entire Order File Creator text line at the bottom of the file.


18. Right-click in the highlight and select Define Data Field › Parse Tagged Data.

19. Double-click on the Order_File_Creator Line Style to change its name (if desired).

20. Double-click on the actual email address to open the Field Definition dialog.

21. Change the Field Name to OrderFileCreatorEmailAddress.

22. Click Update.

23. Use the Browse Data Record button to view the results.

24. Close the browser then Double-click on the Order_File_Creator Line Style name to

open the Line Style Definition dialog.

25. On the Line Action tab, change the action to ACCEPT Record.

26. Click Update.

27. Choose Record › Edit Accept Record from the menu bar.

28. Choose Order_File_Creator for the Current Accept Record.

29. Select the OrderFileCreatorEmailAddress checkbox.

30. Choose ShipmentMethodCode for the Current Accept Record.

31. De-select the Order_File_Creator checkbox.

32. Click Update.

33. Use the Browse Data Record button to view the results.

34. Save the Extract Schema Design as Purchases_Phone2.cxl and close the Extract

Schema Designer.

Note: When an Extract Schema Design like this one is used as part of the Source specification for

a transformation, the transformation Map tab will look as if the input file had been defined to

have multiple record types. The email address will be in the last record read by the

transformation, of course. If your requirements dictate that the email address be available as

actual purchase records are processed, then you will have to use other techniques in a more

complex transformation.


Extract Schema Designer: Extracting Variable Fixed Field Definitions

Keywords: Extract Schema Designer: Multiple Fields per Line

Style (variable)

Description

The next file that we will be parsing is Purchases_Fax.txt. We can examine it in a text viewer.

Notice that this file has fields with variable lengths so that any given field may not occupy the same

column position as it did in the previous record. What we plan to do is use the Extract Schema

Designer to create an extract specification that will transform the report file into a more familiar

row/column format, and then use that formatted data as input to the transformation that adds these

purchases to the database table. As before, we don’t require multiple passes of the input file. We

will just create the extract schema and apply it to the input on the Source tab of our eventual

transformation.

Exercise

1. From the Repository Explorer, select New Object › Extract Schema.

2. At the prompt, navigate to the file you will be working with, in this case,

Purchases_Fax.txt.

3. In the Source Options dialog, choose OK to accept the defaults.

4. Highlight the literal Order Header and right-click in the highlight.

5. Select Define Line Style › Auto New Line Style › Action - Collect fields.

6. Highlight an Account Number and Right-click in the highlight.

7. Select Define Data Field › New Data Field.

8. Change the Field Name to AccountNumber.

9. For the Start Rule, choose Floating Tag.

10. Enter the tag “Account Number(“.

11. Use first tag starting at position 0.

12. For the End Rule, choose Floating Tag.

13. Enter the tag “)” (a single closing parenthesis).

14. Use first tag starting at position 0.

15. Choose Add.

16. Highlight a PO Number and right-click in the highlight.

17. Select Define Data Field › New Data Field.

18. Change the Field Name to PONumber.

19. For the Start Rule, select the first floating tag of “PO Number(“ starting at position 0.

20. For the End Rule, select the first floating tag of “)” starting at position 0.

21. Choose Add.


Note: When working with Floating Tags, the starting position for the End Rule is relative to

the beginning of the Field being defined, not the beginning of the record. So even though

the closing parenthesis for the PONumber is the second one from the beginning of the file, it

is only the first one from the beginning of the PONumber.

22. Highlight a PO Date, right-click and select Define Data Field › New Data Field.

23. Change the Field Name to PODate.

24. For the Start Rule, select the first floating tag of “PO Date: ” starting at position 0.

Please note that there is a space after the colon.

25. For the End Rule, choose End of Line.

26. Choose Add.

27. Highlight the literal Item and right-click in the highlight.

28. Select Define Line Style › Auto New Line Style › Action - Collect fields.

29. Highlight a Category and right-click in the highlight.

30. Select Define Data Field › New Data Field.

31. Change the Field Name to Category.

32. Choose Add.

33. Highlight a Product Number and right-click in the highlight.

34. Select Define Data Field › New Data Field.

35. Change the Field Name to ProductNumber.

36. For the Start Rule, select the first floating tag of “/” starting at position 0.

37. For the End Rule, select the first floating tag of “ ” (a single space) starting at position 0.

38. Choose Add.

39. Highlight a Quantity, right-click and select Define Data Field › New Data Field.

40. Change the Field Name to Quantity.

41. For the Start Rule, select the third floating tag of “ ” (a single space) starting at position 0.

42. For the End Rule, select the first floating tag of “/” starting at position 0.

43. Choose Add.

44. Highlight a Unit Cost, right-click and select Define Data Field › New Data Field.

45. Change the Field Name to UnitCost.

46. For the Start Rule, select the second floating tag of “/ “ starting at position 0.

47. For the End Rule, select the first floating tag of “/ ” starting at position 0.

48. Choose Add.

49. Highlight a Shipment Method Code, right-click and select Define Data Field › New Data Field.

50. Change the Field Name to ShipmentMethodCode.

51. For the Start Rule, select the third floating tag of “/ “ starting at position 0.


52. For the End Rule, choose End of Line.

53. Choose Add.

54. Locate the Line Style that contains the Field that will be the last column in each row in

the eventual extract file (in this case, Item).

55. Double-click on the Line Style name to bring up the Line Style Definition dialog.

56. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.

57. Click Update.

58. Click on the Browse Data Record button.

59. Choose OK to allow assignment of all Fields to the Extract File.

60. Examine the data to ensure that your Field definitions are correct.

61. Close the browser window.

62. Ensure that the Fields are in the order they appear in the input data.

63. Save the Extract Schema Design as Purchases_Fax.cxl.

64. Close the Extract Schema Designer.

65. Remember that this schema can be used as part of a source connection in Map Designer.


Process Designer for Data Integrator

Process Designer is a graphical data transformation management tool you can use to arrange your

complete transformation project. With Process Designer, you can organize Map Designer

Transformations with logical choices, SQL queries, global variables, Microsoft's DTS packages, and

any other applications necessary to complete your data transformation. Once you have organized

these Steps in the order of execution, you can run the entire workflow sequence as one unit.

IntegrationArchitect_ProcessDesigner.ppt


Process Designer Fundamentals

The heart of the integration product tool set is Map Designer. The main function of Map Designer is

to transform data from one format, layout, or application to another. Process Designer integrates the

Transformations created by Map Designer with any other applications or processes that need to be

done to complete an entire job.

In order to create a Process, first consider what is necessary to accomplish the complete

transformation of your data. You should form a general idea of the logical steps to reach your goal.

This includes the applications you will need, and what decisions must be made during the Process.

Once you have a good idea of what will be involved, open Process Designer (via the Start Menu or

Repository Explorer) and begin. Remember that Process Steps can be re-arranged, deleted, added, or

edited as you build your design.


Creating a Process

Objectives

At the end of this lesson you should be able to create a simple Process Design.

Keywords: Process Designer, Transformation Map, and Component

Description

Process Designer can be used from beginning to end to make your data transformation task simpler

and more streamlined. Map Designer is one of the applications that can be called from within

Process Designer. Process Designer allows you to create new Transformations, use existing

Transformations, or use a copy of an original transformation file; where the original transformation

file remains unchanged. Follow the steps below to create a simple process.

Exercise

1. Open Process Designer.

2. Add a Transformation step to the Process Design.

3. Right Click on the Transformation Map and choose Properties.

4. Click Browse and choose m_OutputModes_Clear_Append.map.xml from a previous

exercise or from the solutions folder.

Note: A Process Designer SQL Session is a particular method of connecting to the given

SQL application's API. We can use the same session in multiple steps or create new sessions

wherever needed. We must have at least one session if any connection to a relational

database is made during the process.


5. A SQL Session is created based upon the map's target connection. Accept the default

session for the target and click OK.

6. Name this step Load_Accounts.

7. Add another Transformation step to the Process Design.

8. Right Click on the Transformation Map and choose Properties.

9. Click New to open the Map Designer.

10. Create a new map that loads the ASCII Delimited file Category.txt into the

tblCategories table in the TrainingDB Database. Use the report below for specifications.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Category.txt

Source Options: Header = False

Define the Target:

Target Connector: ODBC 3.x


Target Data: Database: TrainingDB

Table: tblCategories

Target Options: None

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions

R1.Code Fields("Field1")

R1.Category Fields("Field2")

R1.ProductManager Fields("Field3")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPutRecord target name Target

record layout R1

11. Accept the default for the Transformation Step dialog.

12. Choose “Use an existing session for the target” in the Sessions Dialog.

13. Name step Load_Categories.

14. Create a new map that loads ShippingMethod.txt into the tblShippingMethod table in the

TrainingDB Database. Use the report below for specifications.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)ShippingMethod.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblShippingMethod

Target Options: None

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions

R1.Shipping Method Code Fields("Shipping Method Code")

R1.Shipping Method Description Fields("Shipping Method Description")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPutRecord target name Target

record layout R1

15. Accept the default for the Transformation Step dialog.

16. Choose “Use an existing session for the target” in the Sessions Dialog.

17. Name step Load_ShippingMethod.

18. Establish the Step Sequence as described below (Use the corresponding image if

necessary).

19. Start → Load_Accounts → Load_Categories → Load_ShippingMethod → Stop

20. Validate the Process Design.

21. Save the Process as p_Load_Tables.ip.xml.

22. Run the Process Design.

23. Examine the Target Tables.




Parallel vs. Sequential Processing

Objectives

At the end of this lesson you should be able to create a parallel process.

Keywords: Multi-threaded, Single-threaded

Description

The Integration Engine can execute single-threaded or multi-threaded processes, depending on your

license.

Process Designer now utilizes the power and speed of multithreading when running a Process. If you

own the multithreaded Integration Engine, you can allow the operating system to control the load

balancing across CPUs for more efficient processing. This works even when running multiple threads on a single processor.

If you set up the Maps within your Process to run in parallel, the Process Designer will launch each

Map in parallel on its own thread. There is no need to code anything within your Maps or Processes.

It is all done for you behind the scenes as long as you set the Max Concurrent threads property in the

Process Design.

Multithreading allows parallel execution of Process Designer Steps, where several transformation

steps in Process Designer can be simultaneously executed across multiple CPUs on a server.

Exercise

1. Open the p_Load_Tables process from the previous exercise or from the solutions

folder.

2. Run the process and check the log file for the length of time the process took to run.

3. Change the format of linking the steps in the process as pictured in the figure below.

4. Create a separate SQL Session for the target in each map.

5. Open the Process Properties Dialog and set Max Concurrent Execution Threads to 3.

6. Validate the process, and then save it as p_ParallelProcessing.ip.xml.

7. Run the process and check the log file for the length of time it took to run.


There is a limit to the number of execution threads that can be executed per license. The Max

number of execution threads allowed can be found on the Splash Screen. From the Toolbar choose

Help › About Process Designer › Licensed Features. Under the list of Features you will find the

feature Max allowed threads.


Conditional Branching – The Step Result Wizard

Objectives

At the end of this lesson you should be able to add a conditional expression to a Decision Step.

Keywords: Error Handling; Conditional Branching; Metadata

Execution Variables; Step Result Wizard; Boolean Expressions

Description

The Decision Step allows you to design a conditional expression that determines which workflow path the Process will follow. Generally, this is done with a Boolean expression. In this exercise we will use the Step Result Wizard to create a Boolean expression that determines the workflow path for the steps in the process built below.
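For reference, a Step Result expression is simply a Boolean RIFL expression evaluated against an earlier step. Two forms used in this course (the step names are those of the exercises) are:

Project("LoadAccounts_CheckDates").RejectRecordCount > 0

Project("Load_Accounts").ReturnCode == 0

The first is true when the named step rejected at least one record; the second is true when the named step finished with a return code of zero.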

Exercise

1. Open Process Designer.

2. Add a Transformation step to the Process Design.

3. Right Click on the Transformation Map and choose Properties.


4. Click Browse and choose m_Reject_Connect_Info.map.xml from a previous exercise

or from the solutions folder.

5. Accept the default for the Transformation Step and the Sessions dialog.

6. Name step LoadAccounts_CheckDates.

7. Add a Decision step to the Process Design.

8. Right-click on the Decision icon and select Properties.

9. Name the step Eval_RejectRecordCount.

10. Using the Step Result Wizard, create and add the following code:

project("LoadAccounts_CheckDates").RejectRecordCount > 0

11. Click OK to close.

12. Add a Scripting Step to the Process Design.

13. Right-click on the Scripting icon and select Properties.

14. Name the step NotificationBadDates.

15. Use the Build button to build an expression that will display “There are STILL invalid

dates!!" in a message box with a stop icon and an OK button and the title “Invalid Date

Warning”:

MsgBox("There are STILL invalid dates!!", 16, "Invalid Date Warning")

16. Click OK to close.

17. Link the steps as follows:

18. Start → LoadAccounts_CheckDates → Eval_RejectRecordCount → (False) Stop

19. Link the remaining steps as follows:

20. Eval_RejectRecordCount → (True) NotificationBadDates → Stop

21. Validate the Process Design.

22. Save your Process Design as p_ConditionalBranching_StepResultWizard.ip.xml

23. Run the Process and observe the results.


FileList - Batch Processing Multiple Files

Objectives

At the end of this lesson you should be able to use the FileList function to gather a list of file names and store them in an array variable.

Keywords: Change Source Action, “NUL:” connection string, File

List Function, Array Variables, DefineMacro, and Looping

Description

The FileList function builds a list of files of a user-specified type. It returns a 'Type Mismatch' error if the results parameter is not an array.

The FileList Function returns both file AND DIRECTORY names within a given directory. If you

want to work only with file names, you will need to test the returned names using the IsFile Function to

determine which files you want to use.

Note: You cannot use FileList to return a list of files via FTP.
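As a quick illustration of that filtering (a sketch only: it assumes RIFL's For...Next loop, and that names() has already been declared as a Variant array variable, like files() in the exercise below):

Dim i, inbox
inbox = MacroExpand("$(FUN_DATA)") & "Inbox\"
' FileList fills names() with everything in the directory, files and subdirectories alike
FileList(inbox & "*.*", names())
For i = 0 To UBound(names)
    If IsFile(inbox & names(i)) Then
        LogMessage("INFO", "Will process file: " & names(i))
    End If
Next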


Exercise

1. Create process variables as described below.

Variables

Name Type Public Value Description

files Variant - Array No Array. Elements contain names of files passed from the FileList function.

fileCounter Variant No -1 Counter. Index counter for the files() array.

filePath Variant No Variable. Stores the path of the Inbox directory where batch files are located.

currentFile Variant No Variable. Stores the name of the next file to be processed.

2. Add a Transformation step onto the Canvas.

3. Name the step LoadAccountsTable.

4. Click Browse to locate the m_OutputModes_Clear_Append.map.xml from a previous

exercise or from the solutions folder.

5. Accept the defaults in the Sessions dialog to Create a New Session for the target.

6. Add a scripting step as described below. Name the step BuildFileList:

Expression:

' Set directory for incoming files.

' Consider using lookup or user input for this value.

filePath = MacroExpand("$(FUN_DATA)") & "Inbox\"

' Gather list of file names. Use wildcards if needed.

FileList(filePath & "AddrChg*.*", files())

' Set array index counter (Zero based).

fileCounter = UBound(files)

7. Add a decision step as described below. Name the step GotFiles?:

Expression: fileCounter > -1


8. Add a scripting step as described below. Name the step Notification_NoFiles:

Expression: MsgBox("No Files to Process: Exiting")

9. Use the Line Builder to connect the steps created thus far.

10. Start → LoadAccountsTable → BuildFileList → GotFiles? → (False) Notification_NoFiles → Stop

11. Add a scripting step as described below. Name the step SetCurrentFile:

Option Explicit

' Trap runtime errors (e.g., Array Index Out of Bounds)

ON ERROR GOTO ErrorScript

' Set variable for the current file. Define Macro for use within the map.

currentFile = filePath & files(fileCounter)

DefineMacro("SOURCE_FILE", currentFile)

' Verification...

Dim f

f = Ubound(files) - fileCounter

MsgBox("Processing File: " & files(fileCounter) & ". File " & f + 1 & " of " &

Ubound(files)+1)

' Use the Return statement to exit this module

Return

' Error handler

ErrorScript:

' Get the error info and check variable values

LogMessage("ERROR","Err.Number = " & Err.Number & " " & _

"Err.Description = " & Err.Description & " " & "FileDirectory=" & filePath & " " & _

"fileCounter=" & fileCounter & " " & "CurrentFile=" & files(fileCounter))

Terminate()

12. Add a Transformation step and name the step UpdateAddresses.

13. Click New and build a map based on the specifications below:

Expression: MsgBox("No Files to Process: Exiting")


Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(SOURCE_FILE)

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblAccounts

Target Options: None

Target OutputMode: Update

Source Schema

Field Name Type Length Description

Account Number Text 9

New Street Text 34

Total 43

Target Field Expressions

R1.AccountNumber Records("R1").Fields("Account Number")

R1.Address Records("R1").Fields("New Street")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPutRecord target name Target

record layout R1


14. Save the map as m_UpdateAddresses.map.xml and close Map Designer.

15. Use the SQL session that was created in the first transformation step.

16. Add a decision step and name the Step SuccessCheck.

17. Use the Step Result Wizard to build the expression below:

Expression: Project("UpdateAddresses").ReturnCode == 0

18. Add a scripting step as described below. Name the step UpdateFileCounter:

' Decrement the file counter variable
Expression: fileCounter = fileCounter - 1

19. Add a scripting step as described below. Name the step Notification_UpdateFailure:

Expression: MsgBox("Update Address Map Failed")

20. Connect the remaining steps as in the screen shot above the exercise instructions.

21. Validate the process.

22. Save the process as p_FileListLoop.ip.xml.

23. Run the process and observe the results.



Pervasive Integration Engine

Pervasive Integration Engine™ is an embedded data Transformation engine used to deploy runtime

data replication, migration and Transformation jobs on Windows or UNIX-based platforms quickly

and easily without costly custom programming. It fills the need for a low-cost, universal data

transformation engine.

The Integration Engine is a 32-bit data transformation engine written in C++, containing the core

data driver modules that are the foundation for the transformation architecture. Because the

Integration Engine is a pure execution engine with no user interface components, it can perform

automatic, runtime data transformations quickly and easily, making it ideal for environments where

regular data transformations need to be scheduled and launched on Windows or UNIX-based

systems.

Maps and Processes can be scheduled through the command line or invoked through an API. APIs

are documented in the Integration Engine SDK.
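For example, a nightly job scheduled with the Windows Task Scheduler (or cron on a UNIX system) can simply invoke the engine with a command such as the one below; the switches are explained in the lessons that follow, and the paths are the ones used in this course:

djengine -pe -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml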


Syntax: Version Information

Objectives

This lesson shows how to retrieve version and licensing information from the engine via the

command line interface.

Keywords: djengine, Executable and Version Information

Exercise

1. Open a command window by typing cmd in the Windows Run dialog.

2. Use a cd command to navigate to the directory where the engine is installed.

3. The default directory is: C:\Program Files\Pervasive\Cosmos9\Common.

4. To get the current engine version information, type:

djengine -version


Options and Switches

Objectives

This lesson shows how to get the usage syntax and all options, or switches, available through the

command line interface of the Integration Engine.

Keywords: Syntax and Option Overrides

Exercise

View the different options and parameters available for executing transformations and processes by

using the “-?” switch.

To see all the available options, at the command prompt type: djengine -help



Execute a Transformation

Objectives

This lesson demonstrates how to execute a Transformation Map via the command line interface.

Keywords: Executing a Map

Description

At the command prompt type:

djengine MapName.tf.xml

Note: Be sure to use the file that has the extension “.tf.xml.” This is the transformation file. The

transformation file contains all of the connection information that the engine needs to connect to the

source and target. It also contains a link to the map file. If you provide the engine with the name of

the map file (the file with a “.map.xml” extension), you will receive errors.

Tip: You can browse to the file in Windows Explorer and drag and drop the file onto the command

line.

Add -verbose at the end of the command to get statistics printed to the console during runtime.

At the command prompt type (on a single line):

djengine C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTest.tf.xml -verbose


Using a “-Macro_File” Option

Objectives

At the end of this lesson you should be able to utilize a Macro Definition file for porting Maps and

Processes from one Integration Engine installation to another.

Keywords: Macro Definition, Macro Manager and Macro File

Description

There are two ways to use a Macro to define a connection on the command line.

The -Macro_File (-mf) switch specifies the path to the macrodef.xml file (by default located in the Workspace1 folder), which holds the values of all the macros that we have defined. If you have multiple macros defined, this may be your preferred method.

At the command prompt type (on a single line):

djengine -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose

The -Define_Macro command allows you to define individual Macros on the command line.

At the command prompt type (on a single line):

djengine -Define_Macro FUN_DATA=C:\Cosmos9_Work\Fundamentals\Data\ C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose


Executing a Process

Keywords: Using the Process Design Option

Command syntax is: djengine -process_execute filename (include the path). The abbreviated form of the switch, -pe, is used in the command below.

At the command prompt type (on a single line):

djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml

Notes

• We are using the -Macro_File switch because some of the Maps in the process use a Macro as part of the source connection.

• Every process should contain the -pe switch as the first switch. It should always be used, even if you notice that a process sometimes runs without it.

• The process name being called should always be the last item in the command.

• Any extra switches should be entered after the -pe switch and before the path to the process.


Additional Sample Exercises – Integration Engine


Command Line Overrides – Source Connection

Keywords: Dynamic Override for Source File

Let's substitute a different source file for the one defined in the Transformation, to show how overrides can be performed at execution time. The syntax of the command is:

djengine -Source_Connect_Info string (include path)

At the command prompt type (on a single line):

djengine -Source_Connect_Info C:\Cosmos9_Work\Fundamentals\Data\AccountsSmall.txt C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose

AccountsSmall.txt is a file that has the same format as Accounts.txt, but it only has 54 records.

Note that only 54 records were written. Note also that we did not need to define the Macro or the

path to the Macro File. The Macro in the map was only used in the source connection and we

defined a new source with a complete path. So the Macro was no longer relevant.


Ease of Use: Options File

Keywords: Using Text Editor for Command Line Options

Type the command from the previous exercise (leaving out the first word, djengine) in a text editor and save the file as Options.bas in the Cosmos root directory.

This is called an Optfile.

Note: If you did not save this file in the Cosmos root directory, you’ll have to include the path of the

file as well as the file name in the command.

At the command prompt type:

djengine @Options.bas

Including the djengine command in the batch file allows you to use the batch file with third-party scheduling tools.

Save the file Options.bas as “Options.bat”.

Include the djengine call in Options.bat so that the entire text of the file reads:

djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml

If you are using a Windows machine, use Windows Task Scheduler or the schtasks command to schedule this process.

Tip: You may choose to add a pause command at the bottom of the script so that the command

prompt will remain open, and you can verify that the process ran.
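Pulling the steps above together, the complete Options.bat would read as follows, with the optional pause at the end. The schtasks command after it is just one illustrative way to register the batch file to run daily; the task name, start time, and the assumed location of Options.bat are examples only, and the exact schtasks syntax can vary by Windows version:

rem Options.bat
djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml
pause

rem Run once from a command prompt to schedule the batch file:
schtasks /create /tn "FundamentalsNightlyLoad" /tr "C:\Cosmos9_Work\Options.bat" /sc daily /st 02:00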


Checklist – Integration Engine

Troubleshooting

• Review the Integration Engine Command Line Interface Error Messages in the Help Files as

well as the Error Code Reference

• Check the command line syntax

• Verify that the tf.xml is being used for executing maps

• Be sure the map or process file is specified last in the command

• Check spelling

• Verify the license has not expired (run “djengine -V” from the command line)

• Confirm the appropriate version is installed

• Does the process/map run from the Design Tools?

• Are your environment variables set up correctly (i.e. PATH, and connector-specific paths such as Oracle or Java paths)? Use the SET command to see a quick list of Windows environment variables.

• Try a backup or previous copy of your file

• Are you using the correct case? The following can be case sensitive

• Macro names

• Platforms (Unix)

• Switches (i.e. -V vs. -v)

• Does it run on one platform and not on another?

• Check your file path slashes

• Windows - back slashes: “\”

• Unix - forward slashes: “/”

Setting an Environment Variable in Windows

This setting allows the user to call the djengine command from any path and eliminates the need to include the full path to the command each time.

1. Right Click My Computer, and Choose the Properties option

2. Click on Advanced tab

3. Click the Environment Variables button

4. Under System variables, scroll down to Path

5. Double click Path

6. In Variable Value, insert the following path, followed by a semicolon, in front of the first path:


C:\Program Files\Pervasive\Cosmos9\Common;

Illustration: Setting Environment Variable in Windows
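To confirm the change, open a new command prompt (an already open window does not pick up the updated Path) and run the engine from any directory, for example:

cd C:\
djengine -version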

Engine Profiler

The Engine Profiler is a tool designed to fine-tune your Transformations and Processes. There is an excellent document that covers the functionality and use of the Engine Profiler in detail at C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF\engine_profiler.pdf


Intermediate Mapping Techniques

This section explores the capabilities of Transformation Map Designer in more detail.


Multiple Record Type Structures


Multiple Record Type – 1 One-to-Many

Objectives

At the end of this lesson you should be able to create a target file that has multiple record types from

a source file that has a single record type. You will also become more familiar with the

OnDataChange event.

Keywords: OnDataChange Event, Data Change Event, Parse and

Format functions

Description

There are two possible scenarios in creating a multiple-record-type target file from a single-record-

type source. In the simplest, you want to break down each source record into n-different target

records. You might want to take source fields 1-5 from the source and put them in target record “A”

and take source fields 6-10 and put them in target record “B.” To perform this task, you define your

target record types as you learned in an earlier lesson and then, in an AfterEveryRecord event, just perform two “ClearMapPut” actions, one for each target record type.

The second and more complex situation occurs when you don’t necessarily want to create both target

records “A” and “B” from each source record. Let’s assume that the source file contains customer

information in fields 1-5 and sales information in fields 6-10. We’ll assume that the source is in

order by field 1, the customer number. In this case, we only want to write a target record “A” when

we encounter a new customer, although we want to write a target record “B” for each source record.

In order to perform this task, we can use the OnDataChange Event. For our solution, we’ll set up the

event to monitor the customer number. Each time it changes (including the change from an empty

source buffer to the valid value from the very first record), we will write out a target record “A.”

We’ll use our AfterEveryRecord event to write out target record “B.”

Keep in mind that this transformation is assuming a single target file with multiple record types. This

is very similar to a target with a single database with two tables, and in this latter situation different

techniques would be used, though the events will be very similar.

In this exercise, the source file contains records that have employee demographic information and

vehicle lease information. If an employee has a lease to more than one car, there is one source record

for every car. Thus the employee demographic information is redundantly written in these records.

In our target file we will eliminate the redundancy by creating two distinct record types. We will

write one target record for each employee and we will write one “child” record of a different record

type for each vehicle.

For example our source data has the following format:

Employee1 Data, Auto1 Data

Employee1 Data, Auto2 Data

Employee1 Data, Auto3 Data

Employee2 Data, Auto1 Data

Employee2 Data, Auto2 Data

The resulting target file will have the following format:


Employee1 Data

Auto1 Data

Auto2 Data

Auto3 Data

Employee2 Data

Auto1 Data

Auto2 Data

In order to achieve this, you need to know which event handlers require a ClearMapPut

Action to write the target records. You can only make this decision by knowing what is contained in

the source buffer or buffers. (The Source Buffer is the internal object that stores the values that have

just been read in from a source record. There is one buffer for each source record type.)

As a general rule, you will create at least one write action (usually a ClearMapPut) for every record

type in the target.

Exercise

1. Create a map based on the specifications given below.

2. Save the map as m_One_to_Many.map.xml.

3. Run the map and observe the results.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Autos_Sorted.txt

Source Options: Header = True

Define the Target:

Target Connector: ASCII(Fixed)

Target Data: File: $(FUN_DATA)Autos_MultiRecType.txt

Target Options: None

Target OutputMode: Replace


Create 2 record types in the target through the Map Designer user interface. The layouts for both

record types are described below:

Record Employee

Name Type Length Description

RecordID Text 1

Initials Text 2

Phone Text 10

City Text 9

State Text 2

Total 24

Record Auto

Name Type Length Description

RecordID Text 1

Initials Text 2

Year Text 4

Make Text 10

Color Text 5

Total 22

Target Field Expressions Employee

Employee.RecordID "E"

Employee.Initials Records("R1").Fields("Initials")

Employee.Phone Records("R1").Fields("Phone")

Employee.City Records("R1").Fields("City")

Employee.State Records("R1").Fields("State")


Target Field Expressions Auto

Auto.RecordID "A"

Auto.Initials Records("R1").Fields("Initials")

Auto.Year Records("R1").Fields("Year")

Auto.Make Records("R1").Fields("Make")

Auto.Color Records("R1").Fields("Color")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout Auto

Define Events: Source R1 OnDataChangeEvent

Monitor: Records("R1").Fields("Initials")

Management: Fire first ODC event, Suppress Extra ODC event at EOF

Event Name Event Actions Event Parameters

OnDataChange1 ClearMapPut Record target name Target

record layout Employee

Note: It would be a good idea to create recognition rules for each target record type. It would also

be a good idea to save the schema that was created in the target through the Map Designer Interface.

The trainer can walk you through these steps before moving on to the next exercise.


Multiple Record Type – 2 Many-to-One

Objectives

At the end of this lesson you should be able to work with a multiple record type source file and

create a single record type target file. You’ll gain an understanding of the special nature of the

source buffer for these transformations, and the relationship of the various event handlers to their

individual record types.

Keywords: Multi to one record type, and multiple record

layouts

Description

When you specify a source file that contains multiple record types (either by applying an existing

multiple-record-type structured schema to it or by creating a new multiple-record-type structured

schema) you will find that the Source Hierarchy on the Map Tab will display the individual record

types and give you access to the individual fields for each record type.

To create a single-record-type target file, simply map the fields from their record types within the

Source to the target field list- just as you do in any other Map Design. The Map Designer will take

care of precisely identifying each field with its name and also the source record type it belongs to.

The key to mapping multiple-record-type source files is an understanding how the source buffer is

structured. When the source file specifies multiple record types, your transformation will

automatically set up a large source buffer that contains a section for each different record type. Each

section has a holder for each field defined in that record type. As your transformation reads the

source file, it uses the structured schema and the recognition rules to identify the record type for each

record. Once the record type is identified, it is placed into its proper section of the source buffer.

Another key to working with multiple-record-type source files is using the right event handler. Each

source record type has its own set of event handlers. For example, you may perform a set of actions

each time a record of a particular type is read, or after the first occurrence of each record type is read

and so on. (Using the General Event Handlers, you can also perform actions globally for all record

types.) If your target layout is going to contain fields from three different record types, you will not

want to attempt to write a target record until the source buffer sections for all three of those record

types have been filled.

If we assume that the source file always contains all three record types for each object (e.g.,

customer, account, sale), and if we know that the order is always 1-2-3, then we can simply use the

AfterEveryRecord Event for record type 3 to write a target record. The situation is a bit more

complex if some record types might not exist for a given object yet we want to write a target record

with the data we do have. Assuming the same order restriction, if record type 2 is the one that is

missing, then the problem is that data from a previous record type 2 may still be present in the source

buffer, and we may have to clear that section of the source buffer ourselves. But if record type 3 is

missing we have a different problem. The AfterEveryRecord Event will not be triggered and no

target record will be written. The trainer can discuss methods for solving these problems.

Similar problems exist if the sequence of records in the source file can change from object to object,

but again, these problems can be solved if we understand the operation of the source buffer, use the

right event handlers, and also manually clear sections of the source buffer when necessary.

In this exercise we’ll take the file that we created in the last exercise and change it back into the

format it had before.


Exercise

1. Create a map based on the specifications given below.

2. Save the map as m_Many_to_One.map.xml.

3. Run the map and observe the results.

Map Summary:

Define the Source:

Source Connector: ASCII(Fixed)

Source Data: File: $(FUN_DATA)Autos_MultiRecType.txt

Source Options: none

Source Schema: s_Autos_MultiRecType.ss.xml

Define the Target:

Target Connector: ASCII(Delimited)

Target Data: $(FUN_DATA)Autos_Combined.txt

Target Options: Header = True

Target OutputMode: Replace

Target R1 Record Layout

Name Type Length Description

Initials Text 2

Phone Text 10

City Text 9

State Text 2

Year Text 4

Make Text 10

Color Text 5

Total 42


Target Field Expressions

R1.Initials Records("Employee").Fields("Initials")

R1.Phone Records("Employee").Fields("Phone")

R1.City Records("Employee").Fields("City")

R1.State Records("Employee").Fields("State")

R1.Year Records("Auto").Fields("Year")

R1.Make Records("Auto").Fields("Make")

R1.Color Records("Auto").Fields("Color")

Define Events: Source Auto Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


User Defined Functions

The Rapid Integration Flow Language (RIFL) encompasses functions, statements and keywords that

are used in Source/Target Filters, Target Field Expressions, and Code Modules. You will recognize

VBScript and Visual Basic functions and some SQL Statements. Map Designer also employs many

unique functions, which were designed to help you get the most out of your data.

One of the powerful features of this language is the ability to abstract and reuse scripts in the form of

User-Defined Functions. These functions can be stored and edited in a text file (code module) in a

centralized location so that all of your Maps have access to them.
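As an illustration only (this function is not part of the training files), a user-defined function in a code module uses the VBScript-like syntax that RIFL shares. The name, argument, and logic below are invented for the example; the value of the last expression evaluated is what the call returns, the same idiom the count-parameter expressions later in this workbook rely on:

Function AreaCode(phone)
    ' Return the first three characters of a 10-digit phone string, otherwise an empty string
    If Len(phone) == 10 Then
        Left(phone, 3)
    Else
        ""
    End If
End Function

Once the file containing the function is added to a Map as a code module, a Target Field Expression can call it directly, for example AreaCode(Records("R1").Fields("Phone")).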


Code Reuse – Save/Open a RIFL script Code Modules

Objective

At the end of this lesson you should be able to save and reopen an extract script.

Keywords: RIFL Script Editor

Description

The first level of code reusability is simply to save a script to file. You will need to make any

necessary changes when you reopen it in a different map but the script is still intact.

Exercise

1. Simply open any RIFL Script in the Editor Window and click the Save button on the toolbar.

This saves a text file with a RIFL extension somewhere on your network.

2. To reuse the script, click the Open Folder toolbar button in another Script editor window.

You will need to manually change any parameters for use in the new Script window.

3. Next, we will show you how to make the functions more flexible by abstracting them into

User Defined Functions and storing them in Code Modules.


Code Reuse - Code Modules

Objectives

At the end of this lesson you should be able to call a user-defined function from a code module.

Keywords: User Defined Functions, Code Modules, and RIFL

Script Editor

Description

You may call user-defined functions from an external Code Module in Map Designer. Code modules

may be saved as text-only files with a RIFL (Rapid Integration and Flow Language) file extension.

Expressions may be written using the RIFL expression language, and saved with a RIFL extension.

External code modules can be moved to any other machine with the Map Designer or Integration

Engine without a problem. This will allow you to develop a user-defined "library" for use among

different members of your team.

Exercise

1. Create a map based on the specifications given below. Save the map as

m_CodeReuse.map.xml.

2. Run the map and observe the results.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: File: $(FUN_DATA)Accounts.txt

Source Options: Header = True


Define the Target:

Target Connector: ASCII(Delimited)

Target Data: $(FUN_DATA)ZipReport.txt

Target Options: Header = True

Target OutputMode: Replace

Target R1 Record Layout

Name Type Length Description

Account Number Text 9

Zip Text 10

ZipReport Text 25

Total 44

Define Code Modules:

Code Modules: $(FUN_DATA)Scripts\ZipCodeLogic.rifl

Target Field Expressions

R1.Account Number Records("R1").Fields("Account Number")

R1.Zip Records("R1").Fields("Zip")

R1.ZipReport zipTest(Records("R1").Fields("Zip"))

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Lookup Wizards

Lookup Wizards automate the process of creating lookups for your Transformations. You select the

data that needs to be looked up, browse to those files or tables to automatically build connection

strings, and select the key and returned fields. After using the Lookup Wizard, a reusable code

module is created in your workspace containing the functions you need for performing lookups. The

Code Module files generated by these wizards can then be reused in any Map you create.

There are three types of Lookup methodologies, and each has its advantages in certain situations.

They are:

1. Static Flat File Lookups are fast but not very portable or dynamic.

2. Dynamic SQL Lookups are portable and dynamic but not very fast.

3. Incore Table Lookups are extremely fast and can be made more dynamic with extra

RIFL code but they use core memory to store the data.


Incore Table Lookup

Keywords: Lookup Wizard, Incore Memory Table & Lookup, Count

& Counter Variable parameters, One-to-Many records (unrolling

occurrences), and referencing Target Field values

Description

An Incore memory table lookup can be utilized when speed is of the utmost importance. The

primary method of creating the incore table is through use of a DJImport object. The memory table

will then be accessed to perform the lookup. For the purposes of this exercise we will use the

Lookup Wizard to create a Code Module with the desired functions. These functions will create the

incore table, allow us to reference values in the table, and clear the table from memory when we are

finished using it.

Exercise

1. Create a map based on the specifications given below.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: File: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

tblFavoriteInfo

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Note: The following code module should be built through the Lookup Wizard. Follow the

instructions below to use the Lookup Wizard.

Define Code Modules:

Code Modules: $(FUN_DATA)Scripts\Categories.itable.rifl


2. From the Menu click Tools › Define Lookup Functions to open the Lookup Wizard.

3. Choose the Incore Table Lookup Wizard and click Next.

4. Create a new Incore Table Definition named Categories and click Next.

5. Click Build to build a new Connection String and then click Next.

6. Connect to the data source defined below for the lookup and click Next.

Define the Connection String:

Connector: Access 2000

File: C:\Cosmos9_Work\Fundamentals\Data\TrainingDB.mdb

Table: tblCategories

Properties: None

7. Choose the appropriate Key Field, and Fields that should be returned by the lookup.


8. Click Finish.

9. The Wizard will create several Incore Table lookup functions in a code module. Use the

following functions in the appropriate event handlers as described below.

Categories_Init() – Initializes the DJImport object, makes the connection to the data source as defined by the connection string, and builds the Incore table.

Categories_Category_Lookup(KeyValue, DefaultValue) – Creates the SQL call needed to retrieve a value from the Category field based on a Key value.

Categories_ProductManager_Lookup(KeyValue, DefaultValue) – Creates the SQL call needed to retrieve a value from the ProductManager field based on a Key value.

Categories_Clear() – Clears the Incore Table from memory.

Define Events: Transformation and Map Properties Events

Event Name Event Actions Event Parameters

BeforeTransformation Execute Expression: Categories_Init()

AfterTransformation Execute Expression: Categories_Clear()


Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord

ClearMapPut Record target name Target

record layout R1

count CharCount("|",Records("R1").Fields("Favorites")) + 1

counter variable cntr

Target Field Expressions

R1.FavoritesID Serial()

R1.Account Number Records("R1").Fields("Account Number")

R1.CategoryCode parse(cntr, Records("R1").Fields("Favorites"), "|")

R1.CategoryLiteral Categories_Category_Lookup _

(Targets(0).Records("R1").Fields("CategoryCode"), "NoMatches")

R1.ProductManager Categories_ProductManager_Lookup _

(Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")


Relational Database Management System (RDBMS) Mapping


Select Statements – SQL Passthrough

Keywords: SQL Select Statements

The Transformation Map Designer source connectors allow passing Select statements through to

a database server to obtain a row set. The resultant row set that is returned by the query then

becomes the source data for your Map.

Alternatively, you can use the SQL script that generates this source record set by using the SQL File

connection option and pointing to the matching SQL Script file in the Data folder.
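For example, the SQL Script file is just a plain-text file (the file name here is illustrative) containing the same statement used in this exercise:

TXAccounts.sql:
SELECT * FROM tblAccounts WHERE State = 'TX'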

This exercise creates a simple transformation that takes only the records from Texas and puts them

into our target.

Exercise

1. Create a map based on the specifications given below.

2. Save the map as m_SQL_Passthrough.map.xml.

3. Run the map and observe the results.

Map Summary:

Define the Source:

Source Connector: ODBC 3.x

Source Data: Database: TrainingDB

SQL Statement: SELECT * FROM tblAccounts WHERE State = 'TX'

Source Options: none

Define the Target:

Target Connector: ASCII(Delimited)

Target Data: $(FUN_DATA)TXAccounts.txt

Target Options: Header = True

Target OutputMode: Replace

Target Field Expressions

R1.AccountNumber Fields("AccountNumber")

R1.Name Fields("Name")


R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")

R1.BirthDate Fields("BirthDate")

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("StandardPayment")

R1.LastPayment Fields("LastPayment")

R1.Balance Fields("Balance")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


DJX in Select Statements – Dynamic Row sets

Keywords: Integration Query Builder, DJX Syntax, and Dynamic

Row Sets via User Interaction, InputBox

Description

DJX is used to escape into the RIFL expression language to design SQL statements dynamically.

This allows you to use variables and macros in SQL statements. This exercise will select records

from the tblAccounts table that are from a particular state. That state will be determined at runtime.

Exercise

1. Create a map based on the specifications given below.

2. Save the map as m_SQL_DynamicRowset.map.xml.

3. Run the map and observe the results.

Map Summary:

Define the Source:

Source Connector: ODBC 3.x

Source Data: Database: TrainingDB

SQL Statement: SELECT * FROM tblAccounts WHERE State = 'DJX(varState)'

Define the Target:

Target Connector: HTML

Target Data: $(FUN_DATA)AccountsByState.html

Target Options: index = false; mode = table; table border = true

Target OutputMode: Replace

Variables

Name Type Public Value

varState Variant no


Target Field Expressions

R1.AccountNumber Fields("AccountNumber")

R1.Name Fields("Name")

R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")

R1.Zip Fields("Zip")

R1.Email Fields("Email")

R1.BirthDate Fields("BirthDate")

R1.Favorites Fields("Favorites")

R1.StandardPayment Fields("StandardPayment")

R1.LastPayment Fields("LastPayment")

R1.Balance Fields("Balance")

Define Events: Transformation and Map Properties Events

Event Name Event Actions Event Parameters

BeforeTransformation

Execute Expression:

varState = InputBox("Enter the two letter code for the State:", "State Input", "TX")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Multimode Introduction

Keywords: Multimode Functionality, Insert Action, and Count

Parameter

Multimode is a functionality that allows us to write to more than one table in the same database

within the same Transformation. The use of the Multimode connector provides us with greater

capabilities when mapping to a database. Since we now have the option to map to multiple tables within a database, there isn’t a single output mode setting such as Replace, Append, Clear and Append, Update, or Delete; we may want to append records to one table, but delete records from another. Therefore, this functionality now exists as Actions that can be taken with specific record layouts and table names.

The Account Numbers in the Accounts.txt file all start with either “01” or “02”. The ones that start

with “01” are trading partners. We want to create a Transformation that will insert those records into

the tblTradingPartners table in the TrainingDB Database. The records that start with “02” are

individual customers, and we will insert them into the tblIndividuals table.

Exercise

1. Create a map based on the specifications given below.

2. Save the map as m_Multimode_Intro.map.xml.

3. Run the map and observe the result.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x Multimode

Target Data: Database: TrainingDB

Tables: tblIndividuals, tblTradingPartners

Target Options: none

In order to remove any previous data residing in these tables, we can use a SQL Statement

Action to write literal SQL deleting all records in these tables.


Define Events: Transformation Events

Event Name Event Actions Event Parameters

BeforeTransformation SQL Statement target name Target

statement

Delete from tblIndividuals;

Delete from tblTradingPartners

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapInsert Record target name Target

record layout tblIndividuals

count

If Left(Records("R1").Fields("Account Number"), 2) == "02" Then

1

Else

0

End if

ClearMapInsert Record target name Target

record layout tblTradingPartners

count

If Left(Records("R1").Fields("Account Number"), 2) == "01" Then

1

Else

0

End if

Target Field Expressions tblIndviduals

tblIndividuals.Account Number Records("R1").Fields("Account Number")

tblIndividuals.Name Records("R1").Fields("Name")

tblIndividuals.Street Records("R1").Fields("Street")

tblIndividuals.City Records("R1").Fields("City")


tblIndividuals.State Records("R1").Fields("State")

tblIndividuals.Zip Records("R1").Fields("Zip")

tblIndividuals.Email Records("R1").Fields("Email")

tblIndividuals.Birth Date DatevalMask(Records("R1").Fields("Birth Date"),

"mm/dd/yyyy")

tblIndividuals.Favorites Records("R1").Fields("Favorites")

tblIndividuals.Standard Payment Records("R1").Fields("Standard Payment")

tblIndividuals.Payments Records("R1").Fields("Payments")

tblIndividuals.Balance Records("R1").Fields("Balance")

Target Field Expressions tblTradingPartners

tblTradingPartners.Account Number Records("R1").Fields("Account Number")

tblTradingPartners.Name Records("R1").Fields("Name")

tblTradingPartners.Company Records("R1").Fields("Company")

tblTradingPartners.Street Records("R1").Fields("Street")

tblTradingPartners.City Records("R1").Fields("City")

tblTradingPartners.State Records("R1").Fields("State")

tblTradingPartners.Zip Records("R1").Fields("Zip")

tblTradingPartners.Email Records("R1").Fields("Email")

tblTradingPartners.Standard Payment Records("R1").Fields("Standard Payment")

tblTradingPartners.Payments Records("R1").Fields("Payments")

tblTradingPartners.Balance Records("R1").Fields("Balance")


Multimode – Data Normalization

Keywords: Comprehensive exercise, Create Unique Indexes

(Action Keys), Primary & Surrogate keys, On Error & On

Constraint Error event handling.

The Map Designer has a rich set of Event Handlers with predefined Actions that can make quick

work of complex mapping problems. In this exercise we will normalize data from Accounts.txt as

we load it directly to the target database. A single record will be written to three different target

tables and in the case of the “Favorites” column, we will write one-to-many records again. As we

map to the three different tables, we need to map foreign keys and generate primary keys so we will

be able to relate the data downstream. We can also “de-dupe” the data by placing unique indexes on

the load tables and checking constraints as we insert rows. Finally, we will utilize more of the target

Event Handlers to catch exception records. However, in this case we will not use the Reject

Connection Info functionality. We will insert exception records to a reject table in the target database

and add our own text for the reject reason.

Exercise

1. Create a map based on the specifications given below.

2. Run the map and observe the result.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x Multimode

Target Data: Database: TrainingDB

Tables: tblEntity, tblFavorites, tblPayments, tblRejects

Target Options: none

Variables

Name Type Public Value

rejectReason Variant no "NoReason"


Don’t forget to set Action Keys!

Target Field Expressions: tblEntity

tblEntity.Account Number

Records("R1").Fields("Account Number")

tblEntity.Name Records("R1").Fields("Name")

tblEntity.Company Records("R1").Fields("Company")

tblEntity.Street Records("R1").Fields("Street")

tblEntity.City Records("R1").Fields("City")

tblEntity.State Records("R1").Fields("State")

tblEntity.Zip Records("R1").Fields("Zip")

tblEntity.Email Records("R1").Fields("Email")

tblEntity.Birth Date DateValMask(Records("R1").Fields("Birth Date"),

"mm/dd/yyyy")

Target Field Expressions: tblFavorites

tblFavorites.Account Number

Records("R1").Fields("Account Number")

tblFavorites.FavoritesID Serial(0) ' Starts at 1 each execution. Consider using a lookup to

get Max Value first.

tblFavorites.Favorites Parse(cntFavorites, Records("R1").Fields("Favorites"), "|")

Target Field Expressions: tblPayments

tblPayments.Account Number

Records("R1").Fields("Account Number")

tblPayments.PaymentID Serial(0) 'Starts at one each execution. Consider using lookup

for Max Value

tblPayments.Payments Records("R1").Fields("Payments")

tblPayments.Balance Records("R1").Fields("Balance")

Target Field Expressions: tblRejects


tblRejects.Account Number Records("R1").Fields("Account Number")

tblRejects.RejectID Serial(0) 'Starts at one each execution. Consider using lookup for

Max Value

tblRejects.RejectReason rejectReason

tblRejects.Name Records("R1").Fields("Name")

tblRejects.Company Records("R1").Fields("Company")

tblRejects.Street Records("R1").Fields("Street")

tblRejects.City Records("R1").Fields("City")

tblRejects.State Records("R1").Fields("State")

tblRejects.Zip Records("R1").Fields("Zip")

tblRejects.Email Records("R1").Fields("Email")

tblRejects.Birth Date Records("R1").Fields("Birth Date")

tblRejects.Favorites Records("R1").Fields("Favorites")

tblRejects.Standard Payment

Records("R1").Fields("Standard Payment")

tblRejects.Payments Records("R1").Fields("Payments")

tblRejects.Balance Records("R1").Fields("Balance")

Define Events: Transformation Events

Event Name Event Actions Event Parameters

BeforeTransformation DropTable target name Target

table name tblEntity

DropTable target name Target

table name tblFavorites

DropTable target name Target

table name tblPayments

DropTable target name Target

table name tblRejects

CreateTable target name Target

record layout tblEntity

table name tblEntity


CreateTable target name Target

record layout tblFavorites

table name tblFavorites

CreateTable target name Target

record layout tblPayments

table name tblPayments

CreateTable target name Target

record layout tblRejects

table name tblRejects

CreateIndex target name Target

record layout tblEntity

table name tblEntity

index name idxEntity

unique True

CreateIndex target name Target

record layout tblFavorites

table name tblFavorites

index name idxFavorites

unique True

CreateIndex target name Target

record layout tblPayments

table name tblPayments

index name idxPayments

unique False

CreateIndex target name Target

record layout tblRejects

table name tblRejects

index name idxRejects

unique False

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapInsert Record target name Target

record layout tblEntity


table name tblEntity

ClearMapInsert Record target name Target

record layout tblFavorites

table name tblFavorites

count:

CharCount("|", Records("R1").Fields("Favorites")) + 1

counter variable: cntFavorites

ClearMapInsert Record target name Target

record layout tblPayments

table name tblPayments

Define Events: Target Events

Event Name Event Actions Event Parameters

OnConstraintError Execute Expression:

rejectReason = "General-OnConstraintErr"

ClearMapInsert Record target name Target

record layout tblRejects

table name tblRejects

Resume none

OnError Execute Expression:

rejectReason = "General-OnError"

ClearMapInsert Record target name Target

record layout tblRejects

table name tblRejects

Resume none

Do not forget the Resume Action. The Resume Action is what causes the map to continue

processing the remaining records after the error is handled.


Multimode Implementation with Upsert Action

Objectives

At the end of this lesson you should be able to use the Upsert Action.

Keywords: Multimode, Change Source, Upsert

Description

The Upsert Action is used by Multimode Connectors only. This Action updates records where there

is a key match and Inserts records where there is not.

This Map uses Multimode to load two tables and then a Change Source Action to load a second file

into the same tables. It utilizes the Upsert Action to either Insert or Update the records into the target

tables.

Exercise

1. Create a map based on the specifications given below.

2. Save the map as m_Multimode_Upsert.map.xml.

3. Run the map and observe the result.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x Multimode

Target Data: Database: TrainingDB

Tables: tblIndividuals, tblTradingPartners

Target Options: None

Variables

Name Type Public Value

varChngSrc Variant no 0


Define Events: Transformation Events

Event Name Event Actions Event Parameters

BeforeTransformation SQL Statement target name Target

statement

Delete from tblIndividuals;

Delete from tblTradingPartners

Target Field Expressions tblIndviduals

tblIndividuals.Account Number Records("R1").Fields("Account Number")

tblIndividuals.Name Records("R1").Fields("Name")

tblIndividuals.Street Records("R1").Fields("Street")

tblIndividuals.City Records("R1").Fields("City")

tblIndividuals.State Records("R1").Fields("State")

tblIndividuals.Zip Records("R1").Fields("Zip")

tblIndividuals.Email Records("R1").Fields("Email")

tblIndividuals.Birth Date DatevalMask(Records("R1").Fields("Birth Date"),

"mm/dd/yyyy")

tblIndividuals.Favorites Records("R1").Fields("Favorites")

tblIndividuals.Standard Payment Records("R1").Fields("Standard Payment")

tblIndividuals.Payments Records("R1").Fields("Payments")

tblIndividuals.Balance Records("R1").Fields("Balance")

Target Field Expressions tblTradingPartners

tblTradingPartners.Account Number

Records("R1").Fields("Account Number")

tblTradingPartners.Name Records("R1").Fields("Name")

tblTradingPartners.Company Records("R1").Fields("Company")

tblTradingPartners.Street Records("R1").Fields("Street")

tblTradingPartners.City Records("R1").Fields("City")


tblTradingPartners.State Records("R1").Fields("State")

tblTradingPartners.Zip Records("R1").Fields("Zip")

tblTradingPartners.Email Records("R1").Fields("Email")

tblTradingPartners.Standard Payment

Records("R1").Fields("Standard Payment")

tblTradingPartners.Payments Records("R1").Fields("Payments")

tblTradingPartners.Balance Records("R1").Fields("Balance")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMap target name = Target

record layout = tblIndividuals

count

If Left(Records("R1").Fields("Account Number"), 2) == "02" Then

1

Else

0

End if

Upsert Record target name Target

record layout tblIndividuals

table name tblIndividuals

count

If Left(Records("R1").Fields("Account Number"), 2) == "02" Then

1

Else

0

End if

ClearMap target name Target

record layout tblTradingPartners

count

If Left(Records("R1").Fields("Account Number"), 2) == "01" Then

1


Else

0

End if

Upsert Record target name Target

record layout tblTradingPartners

table name tblTradingPartners

count

If Left(Records("R1").Fields("Account Number"), 2) == "01" Then

1

Else

0

End if

Consider creating variables to use as flags for the count parameters, as sketched below. The variables can be set in an Execute action in the AfterEveryRecord event. This method keeps the logic for writing to each table in one location, which is easier to maintain in the long run.
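A minimal sketch of that suggestion, assuming two additional map variables named isIndividual and isTradingPartner (names invented for this example): the AfterEveryRecord event starts with an Execute action whose expression sets both flags, and each ClearMap/Upsert action then uses the matching variable as its count parameter.

' Execute action expression, evaluated once per source record;
' Account Numbers start with either "01" (trading partners) or "02" (individuals)
If Left(Records("R1").Fields("Account Number"), 2) == "02" Then
  isIndividual = 1
  isTradingPartner = 0
Else
  isIndividual = 0
  isTradingPartner = 1
End If

With that in place, the count parameter for the tblIndividuals actions is simply isIndividual, and for the tblTradingPartners actions it is isTradingPartner.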

Define Events: Source Events

Event Name Event Actions Event Parameters

OnEOF ChangeSource If varChngSrc == 0 Then

varChngSrc = 1

"+File=$(FUN_DATA)AccountsUpdate.txt"

End if


Reference


Checklist – Starting Your Integration Project

Scoping Your Project

Before you start working with the integration tools, it is worthwhile to do some initial planning and

preparation. When scoping a new data integration project, review the following checklist.

Preparing for the Initial Project

Integration Type
• Is your project primarily a migration, extract, transform and load (ETL), application integration, business to business transactions (B2B), data profiling, or other?

Data Objects
• What is the number of transformations, processes, source files or tables, and target files or tables that you plan to use?
• Will this number vary between transformations and processes, or will it always be the same?

Direction
• Is the direction of your data connection one-way or bidirectional?

Volume
• What is the volume of your data?
• How many records do you plan to process?
• Find a unit of measure to count.

Frequency
• Do you want to run your processes and transformations for a single impromptu purpose, in scheduled engine runs, or continuously in real time?

Project Planning – Integration Design

Connectivity options
• What source or target connector or component do you plan to use?

Server addresses, User IDs, Passwords
• Gather information on server addresses, and get user IDs and passwords ready.

Shared Data Objects
• Do you plan to use any code modules, RIFL scripts, SQL queries, or statements that were used in a previously designed transformation or process?

Shared Transformations or Processes
• Are there any existing transformations or processes that you can use in the project?

Data
• Make a list of any data files, tables, or entities that you plan to use.
• Will the data require special handling, such as encoding or Unicode?

Results Management
• Do you need to build notifications of results or events, such as e-mail notification, custom log files, or data archival?


• Gather e-mail information and confirm checkpoints that require special logging.

Identify Platform and Software Needs
• What operating system platforms do you plan to use? Is client software needed for connectivity?
• Do you need special expertise to set up and configure the software, which would require a database administrator or Pervasive professional services?

Naming Conventions for Specification Files, Variables, and Objects

Invoke Integration
• How will you call your maps and processes (batch, or real-time with Integration Server)?

Integration Design
• Performance (Lookups, Parallel and Multithreaded)
• How To (per record: whether to filter or to use multiple record types; error handling)

Reference:

See Best Practices: http://docs.pervasive.com/products/integration/download/best_practices.pdf


Upgrading from 8.x to 9.x

You cannot install 9.x in the same folder as 8.x. You may install 9.x in another folder on the same

machine, or on another machine.

After installing 9.x follow the steps below to bring 8.x files forward to 9.x:

1. In the 8.x installation, back up all of the repositories in the Cosmos_Work directory or any other directory where files related to the repositories may exist.

2. In the 8.x installation, back up the InstallDir\Common800\extractor800.mdb.

3. After installing version 9.x you will be prompted to accept a Cosmos9_Work directory as the

9.x workspace root directory location. Even if this directory does not exist, it will be created

for you. Do one of the following:

a. Accept this default location and copy all of your files from the 8.x “Cosmos_Work”

folder to the new 9.x “Cosmos9_Work” folder.

b. Change the workspace location to the 8.x workspace location. Choosing this option

will allow you to use both versions to run the same files in a single workspace.

4. If you are using Extract Schema Designer or Extract Scripts also do one of the following:

a. Run the 9.x version of Extract Schema Designer. Choose File › Change Database

to point to InstallDir\Common800\extractor800.mdb instead of InstallDir\Common\extractor900.mdb.

b. Change the name of the extractor800.mdb to extractor900.mdb, and use it to replace

the extractor900.mdb in the InstallDir\Common folder.


Cosmos.ini Settings

The cosmos.ini file contains the startup information required to launch the integration products. This

file is available in InstallDir\Common, where InstallDir is the installation directory for the

integration tool set. For more information, see the next page “Windows Default Installation

Locations”, and see the “Installation Locations” topic in the release notes.


Windows Default Installation Locations

The following tables provide a brief description of what is stored in each default installation folder on Windows XP/Server 2003 and on Windows Vista/Server 2008. The Pervasive and Cosmos9 folder names may be overridden by setting option values in the setup.ini file prior to installation.

Table 1-1 Windows XP/Server 2003 Default Installation Locations

cosmos.ini
    C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common

Integration Platform Designers and Other Executables
    C:\Program Files\Pervasive\Cosmos9\Common

Integration Server
    C:\Program Files\Pervasive\Cosmos9\IntegrationServer
    C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\IntegrationServer

Repository Manager
    C:\Program Files\Pervasive\Cosmos9\RepositoryManager

Component SDK
    C:\Program Files\Pervasive\Cosmos9\Common\ComponentSDK

Target and Source Connectors
    C:\Program Files\Pervasive\Cosmos9\Common\connections

Product Documentation (PDFs and Help)
    C:\Program Files\Pervasive\Cosmos9\Common\Help
    C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF

SDK Documentation
    C:\Program Files\Pervasive\Cosmos9\Common\Help\SDKs

License Files
    C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\License

Components (Plug Ins)
    C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\Plug-Ins

Samples .msi file
    C:\Program Files\Pervasive\Cosmos9\Common\Samples

SDKs for Content eXtraction Language and Engine SDK
    C:\Program Files\Pervasive\Cosmos9\Common\SDKs

Table 1-2 Windows Vista and Windows Server 2008 Default Installation Locations

cosmos.ini
    C:\ProgramData\Pervasive\Cosmos9\Common

Integration Platform Designers and Other Executables
    C:\Program Files\Pervasive\Cosmos9\Common

Integration Server
    C:\Program Files\Pervasive\Cosmos9\IntegrationServer
    C:\ProgramData\Pervasive\Cosmos9\IntegrationServer

Repository Manager
    C:\Program Files\Pervasive\Cosmos9\RepositoryManager

Component SDK
    C:\Program Files\Pervasive\Cosmos9\Common\ComponentSDK

Target and Source Connectors
    C:\Program Files\Pervasive\Cosmos9\Common\connections

Product Documentation (PDFs and Help)
    C:\Program Files\Pervasive\Cosmos9\Common\Help
    C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF

SDK Documentation
    C:\Program Files\Pervasive\Cosmos9\Common

License Files
    C:\ProgramData\Pervasive\Cosmos9\Common\License

Components (Plug Ins)
    C:\ProgramData\Pervasive\Cosmos9\Common\Plug-Ins

Samples .msi file
    C:\Program Files\Pervasive\Cosmos9\Common\Samples

SDKs for Content eXtraction Language and Engine SDK
    C:\Program Files\Pervasive\Cosmos9\Common\SDKs


Design Tool User Interfaces

Map Designer


Process Designer


Setting Properties

Setting Map Properties

To set the Map Tab to show the Navigation Tree with Events, from the Menu click View ›

Preferences. Click the General tab. Check Always show Map All view.

Setting RIFL Script Properties

To set the editor to show a line number for each line of the scripts, on the menu bar choose View ›

Editor Properties. Click on the Misc tab. In the lower left see “Line numbering”. In the “Style”

dropdown choose “Decimal”. Change the “Start at” to 1. Click OK.


Reading a Log File

The Log File Browser allows you to view the contents of log files from Map Designer and Process

Designer.

To view the error and event log:

1. In the main toolbar, click the View › TransformationMap.log icon.

2. The Log File Browser displays the contents of the log file. Each of the designers creates

their own log file. For instance, the Map Designer creates a TransformMap.log, and Process

Designer creates a process log.

The following screenshot shows a TransformMap.log file generated by the Map Designer user

interface:

Note: The Log File Browser displays a maximum of 32,000 lines. If your log file is very long, you

will be able to see only the last 32,000 lines of it in the browser. If you want to see the rest, open it in

a text editor, such as WordPad.

3. Click Search to display the Find Text dialog box. It allows you to search the error and event

log file for a particular string of text.

4. Click Clear Log to delete the log file.

To change names of log files:

You can set the name of the .log file in three places.

• In Map Designer, open Transformation and Map Properties, click Error Logging, and type a new log file name.

• In Process Designer, select File › Process Properties, click the Logging tab, type a name for the log, and click OK.

• For the Engine, add the following switch to the djengine command line:

-logfile newlogfilename
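For example, one way to combine this switch with the transformation used in the Integration Engine lessons (the log file name and location here are arbitrary):

djengine -logfile C:\Cosmos9_Work\Workspace1\EngineTest.log C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTest.tf.xml -verbose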


Transformation Log Codes

The transformation log file displays the following information:

Date         Time       Error Type   Internal Code   Direction Code   Source
08/25/2006   14:08:10   1            0               O                Global

Error Type

1 – Informative

2 – Warning

4 – General Error

8 – Fatal Error

16 – Debug Message

Internal Code

This code is related to 255xx codes.

Direction Code

I – Import

C – Component

E – Export

M – Message component

O – Other

U – Unknown component

Source

The source of the log message can be global, the name of a connector, name of a component,

or some other indication of the origin of the message


Examples of Complex Process Layouts


Additional Documentation Resources

Best Practices:

http://docs.pervasive.com/products/integration/download/best_practices.pdf

Product Documentation:

http://docs.pervasive.com/products/integration/di/wwhelp/wwhimpl/js/html/wwhelp.htm#href=contact/contact.html

Integration Support:

http://www.pervasiveintegration.com/support/Pages/submit_a_support_ticket.aspx

Integration Forums:

http://cs.pervasive.com/forums/16.aspx

Documentation and Downloadable Samples:

http://www.pervasiveintegration.com/support/documentation/Pages/documentation_and_samples.aspx

Event Management Guide:

http://docs.pervasive.com/products/integration/download/events.pdf

Product Updates and Connectivity Packs:

http://www.pervasiveintegration.com/support/Pages/product_downloads.aspx

Integration Manager Pages:

http://www.pervasiveintegration.com/products/Pages/integration_manager.aspx


Glossary

Glossary of Integration Product Terminology


A

Action One of the options in Event Handlers (Map tab, upper left quadrant in Map Designer). For example,

ClearMapPut Record is the default Action automatically set when you do not override the option.

Some other Actions in the drop down list include: Execute, MapPut Record, Map, Put Record, Insert

Record, Clear, and ClearInitialize.

Array In programming, a series of objects, all of which are the same size and type. Each object in an array

is called an array element. For example, you could have an array of integers or an array of characters

or an array of anything that has a defined data type. The important characteristics of an array are:

• Each element has the same data type (although they may have different values).
• The entire array is stored contiguously in memory (that is, there are no gaps between elements).

Arrays can have more than one dimension. A one-dimensional array is called a vector; a two-

dimensional array is called a matrix.

Arithmetic operators The +, -, *, /, and ( ) are operators used to construct arithmetic expressions.

ASCII The most common format for text files. In an ASCII file, each alphabetic, numeric, or special

character is represented with a 7-bit number (a string of seven 0s or 1s), with 128 characters defined.

Unix and older DOS-based operating systems use ASCII for text files. Newer Windows systems use

an encoding standard called Unicode. IBM System 390 servers use a proprietary 8-bit code called

EBCDIC. Transformation programs allow different operating systems to change a file from one

encoding standard to another. The American National Standards Institute (ANSI) oversees ASCII

Standards.

B

Binary File A computer file that contains machine-readable information that must be read by an application; the

characters use all 8 bits of each byte.

Boolean logic The type of an expression with two possible values, True and False. Also, a variable of Boolean type

or a function with Boolean arguments or result. The most common Boolean functions are And,

Or and Not. In computer operation with binary values, Boolean logic can be used to describe

electromagnetically charged memory locations or circuit states that are either charged (1 or true) or

not charged (0 or false). The computer can use an AND gate or an OR gate operation to obtain a

result that can be used for further processing.

C

Comma-delimited A data format in which each piece of data is separated by a comma. This is a popular format for

transferring data from one application to another, because most database systems are able to import

and export comma-delimited data.

Concatenate


To merge the records from two or more files into a single file. Also, to add a string of data to other

data that already exists in a field. In Map Designer, you can concatenate fields into a single field by

using an expression.
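As a minimal sketch (assuming a source record layout R1 that includes Name and Company fields, as in the Accounts data used elsewhere in this workbook), a target field expression that concatenates two source fields with the & operator might look like this:

Records("R1").Fields("Name") & ", " & Records("R1").Fields("Company")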

Connection String A list of key = value pairs. The keywords are either names of connection information fields, or
Connector property names. The key=value pairs are separated by a semicolon (;).

Connector Name of the type of connection at the Source or Target tab. ASCII (Delimited), MySQL, and Oracle
9i are some examples of Connectors. In early versions of Map Designer, the term for connector was spoke.

Constraint An object used to place rules on data in a relational database. Constraints are used to control the

allowed data in a column, are created at the column level, and are used to enforce referential

integrity (parent and child table relationships).

Conversion Called a transformation in more recent versions of Map Designer, and the basic unit for all data

transfer and manipulation. A conversion (transformation) is one set of source, target, and

mapping specifications. When these specifications are set, the data transformation process can be

run.

D

Data (1) Distinct pieces of information, usually formatted in a special way. Software is divided into two

general categories: data and programs. Programs are collections of instructions for manipulating

data.

(2) The term data is often used to distinguish binary machine-readable information from textual

human-readable information. For example, some applications make a distinction between data files

(files that contain binary data) and text files (files that contain “ASCII” data).

(3) In database management systems, data files are the files that store the database information,

whereas other files, such as index files and data dictionaries, store administrative information,

known as “Metadata”.

Data integrity Refers to the validity of data. Data integrity can be compromised in a number of ways:

• Human errors when data is entered
• Errors that occur when data is transmitted from one computer to another
• Software bugs or viruses
• Hardware malfunctions, such as disk crashes
• Natural disasters, such as fires and floods

There are many ways to minimize these threats to data integrity. These include:

• Backing up data regularly
• Controlling access to data via security mechanisms
• Designing user interfaces that prevent the input of invalid data (such as Input Boxes for user input)
• Using error detection and correction software when transmitting data (error trapping, reject tables)

Data structure


A Schema (Map tab in Map Designer). In previous versions of Map Designer, data structures were

also called Record Layouts. A data structure is the arrangement of fields in a record within a

particular data file, either source or target. This includes field length, record length, field data types,

and other field properties such as Decimal and Scale.

Data type The classification of data a field can contain. Some data types include text, numeric, datetime, float,

packed decimal, Boolean, and 16-bit binary.

Database An organized collection of information, stored systematically in tables or files.

Default What the integration product automatically does in the absence of an overriding command. For

example, if no After Every Record events are selected in Map Designer, the ClearMapPut Record

Action is automatically invoked when a transformation is run.

Delimited ASCII data ASCII data has fields that are separated by some character, often a comma. Field entries frequently

begin and end with double quotation marks ("), and records are often separated by a carriage return-

line feed (CR-LF). Records and fields are not usually a fixed length.

Delimiter A character or combination of characters used to separate one item or set of data from another. For

example, in comma-delimited records, a comma is used to separate each field of data. In the Map

Designer ASCII Delimited connector, the source and target Property default setting is comma-

delimited.

Design time Activities performed when designing a transformation or process. It includes specifying source and

target connection information, reading and applying metadata, specifying transformation events,

options, execution paths, errors, defining mapping expressions, and exception handling.

Discriminator A discriminator is the data within a file that indicates record type.

DJAR Data Junction Archive (DJAR) is a package that contains processes and dependents of the processes

such as Maps, Functions, Executables, etc.

DJImport Object An internal object designed to provide a generic interface to Map Designer Connectors. It is used to

read data to be utilized as a source.

E

EBCDIC An IBM code for representing characters as numbers. Although it is common on large IBM

computers, most other computers, including PCs and Macintoshes, use ASCII codes.

Expression


An Expression (also called a “Script”) is a combination of operators, literal values, field names,
statements, variables, and functions. Expressions are used to perform calculations, enter a specific value,
concatenate data, or otherwise modify data in a particular field.

Expression Builder Now called RIFL Script Editor, this is the functional area of Map Designer where you can write

your own scripts to include with your transformations. RIFL Script Editor includes a list of all of the

functions and operators available to you in RIFL (Rapid Integration Flow Language).

F

Field A labeled or unlabeled column of information in a data file or table; a field contains the same kind of

information for each record in the data file or table.

File format A format for encoding information in a file. Each different type of file has a different file format.

The file format specifies first whether the file is binary or ASCII, and second, how the information is

organized.

Filter A set of criteria applied to a range of records. In the Map Designer, both the source and target filters

sift through data and return a subset of records specified in the filter options. The number of records

processed can also be specified in these filters.

Fixed ASCII file An ASCII data file that has fixed field and record sizes, but no delimiter (except possibly a record

separator).

Fixed length Having a set length that never varies. In database systems, a field can have a fixed or a variable

length. A variable-length field is one whose length can be different in each record, depending on

what data is stored in the field.

The terms fixed length and variable length can also refer to the entire record. A fixed-length record

is one in which every field has a fixed length. A variable-length record has at least one variable-

length field.

Flow Control Management of data flow between computers, devices, or network nodes to maintain efficient use of

data.

Function A small section of a program designed to perform a specific task. Many functions return a value

based on the results of a calculation or other operation. Some functions operate as a procedure and

return no value. In Map Designer, functions can be used to map and manipulate data. A list of

available functions is in the RIFL Script Editor interface. For a list of functions, see “All Functions”

in the Help Files.

G

GUI


(Graphical User Interface) A graphics-based user interface that incorporates icons, menus, and a

mouse. The interface has become the standard way users interact with a computer. In a client-server

environment, the GUI resides in your client machine.

H

Header Information that appears at the beginning of a data file, but is not a part of the actual data.

I

Integration In reference to data, the combining or movement of data from different sources to provide end

users with a unified view of this data. The data movement may also involve transforming the data

through computations, or modifying the data format.

K

Key In database management systems, a key is a field that you use to sort data. It can also be called a key

field, sort key, index, or key word. Most database management systems allow you to have more than

one key so that you can sort records in different ways. One of the keys is designated the primary key,

and must hold a unique value for each record. A key field that identifies records in a different table is

called a foreign key.

L

Lookup Table An array or matrix of data that contains values that can be searched.

M

Mask A pattern of tokens used to accept or reject patterns in another set of data. For example, a date mask

that looks for two numbers followed by a slash followed by two more numbers, another slash and

two more numbers (##/##/##) can be used to match a string of source data. When the specified

pattern appears in both the mask and the data, the source data will be written to the target.

Metadata Data about data. Metadata describes how and when and by whom a particular set of data was

collected, and how the data is formatted. Metadata is essential for understanding information stored

in data warehouses.

Multimode Specific connector types that have been designated to allow writes to multiple tables. When a user

selects one of these connector types, the “Output” mode will automatically be "Multiple Output

Mode". This cannot be changed to regular output mode. SQL Script and ODBC 3.x are two of the

Multimode Connectors available.

N


Null A value that indicates missing or unknown data in a field. Null characters are placeholders with a

hex value 00. These values can be entered in fields for which information is unknown and can be

used in expressions. Some fields, such as those identified with primary keys, cannot contain Null

values.

O

Object A mechanism that binds data to methods that operate on it. In object-oriented programming, an

object is a self-contained entity that consists of both data and procedures to manipulate the data.

ODBC (Open Data Base Connectivity) A database programming interface introduced by Microsoft in 1992

that provides a common language for applications to access databases on a network. ODBC is made

up of the function calls programmers write into their applications and the ODBC drivers themselves.

For client/server database systems such as Oracle and SQL Server, the ODBC driver provides links

to their database engines to access the database. For desktop database systems such as dBASE and

FoxPro, the ODBC drivers actually manipulate the data. ODBC supports SQL and non-SQL

databases. Although the application always uses SQL to communicate with ODBC, ODBC will

communicate with non-SQL databases in its native language. Map Designer supports ODBC 2.x,

ODBC 3.x, ODBC 3.5 and ODBC 3.x multimode connectivity.

OLE OLE (Object Linking and Embedding) is a compound document technology and part of Microsoft

ActiveX technologies. A compound document can contain visual and information objects of all

kinds.

Each object is an independent program entity that can interact with a user and also communicate

with other objects. OLE utilizes the Component Object Model (COM) and its distributed version,

(DCOM). An OLE object is also, by default, a component (or COM object).

OnEOF A source schema event (upper left, Map tab in Map Designer). Executed when the end of the file

(EOF) is reached.

Operator A symbol that represents an operation to be performed on a value or values. For example, the +

operator represents addition, and the * operator represents multiplication.

Output A mode which represents the transfer of data from the source to the target (Map tab in Map

Designer). Some selections include: Replace File/Table, Append to File/Table, Update File/Table

and Clear/ Append. Connectors that write to multiple tables use the Multimode Output mode.

R

RDBMS


Relational Database Management System. RDBMS includes a wide variety of SQL and relational

database systems, such as SQL Server and Oracle. Data is stored in multiple tables, many of which

are linked by the use of primary key fields.

Record (1) In database management systems, a complete set of information. Records are composed of fields,

each of which contains one item of information. A set of records constitutes a file. For example, a

personnel file might contain records that have three fields: a name field, an address field, and a

phone number field.

(2) Some programming languages allow you to define a special data structure called a record.

Generally, a record is a combination of other data objects. For example, a record might contain three

integers, a floating-point number, and a character string.

Record layout The term for a data structure used in Map Designer. The alternative term is schema. The

arrangement of fields in a record in a particular data file, either source or target. This includes field

length, record length, field data types, and other field properties such as decimal and scale.

Record number A unique number that identifies each record in a data file or table.

Record type A set of field options within the source and target schemas (Map tab in Map Designer). These

options include layout name, length, lock, schema origin, and description.

Regular expression A string of characters that defines a set of rules for matching character strings found in fields.

Relative path An implied path. When a command is expressed that references files, the current working directory

is the implied, or relative, path if the full path is not explicitly stated.

Repository A physical location on your local system and on the network. It stores maps, connections, structured

schemas and join view files.

RIFL Rapid Integration Flow Language (RIFL) is a custom expression language for the integration

products. RIFL includes functions, statements, operators, events, scripts, and objects unique to the

integration platform. Some RIFL functions are similar, but not the same as, Visual Basic. RIFL

scripts can be run on both Windows and Unix systems. Use the .rifl extension for script files.

Run Time The events that occur during transformation and process execution. These include connecting to data

sources and targets, reading and writing data, compiling and evaluating expressions, transformation

events, and exception processing.

S

Scale A Field Property Value option (Map tab in Map Designer). Designates where a decimal is

positioned in a number.


Schema The term for a data structure (Map tab in Map Designer). The arrangement of fields in a record in a

particular data file, either source or target. This includes field length, record length, field data types,

and other field properties such as decimal and scale. You can create and modify schemas in

Document Schema Designer and in Structured Schema Designer. These schemas can then be

validated in Process Designer, and used as structural metadata in Map Designer.

Scope In programming, the visibility of variables within a program. For example, whether or not one

function can use a variable created in another function.

Script A Script or Expression is a grammatically correct combination of operators, literal values, field

names, variables and functions used to perform calculations, enter a specific value, concatenate data,

or otherwise modify data in a particular field.

Server The application that responds to the calling application or client in a DDE or OLE conversation. The

server usually sends data to the client.

SQL Structured Query Language (abbreviated SQL and commonly pronounced "sequel") is the standard

language for storing and manipulating data in relational databases.

Statement A descriptive phrase that generates one or more instructions in the computer.

String An alphanumeric value or an expression consisting of alphanumeric characters.

Syntax Grammar, structure, or order of elements in a language statement.

Syntax Error An error caused by an incorrectly expressed statement written in the RIFL Script Editor or in a

transformation event in Map Designer.

T

Table (1) In programming, a collection of adjacent fields. Also called an array. A table contains data that is

either constant within the program or is called when the program runs.

(2) In a relational database, the same as a file; a collection of records. A structure made up of rows

(records) and columns (fields) that contain information. A table is the primary object used to store

data. When data is queried and accessed for modification, it is usually found in a table.

Transformation Called a conversion in previous versions of Map Designer, a transformation is the basic unit for all

data transfer and manipulation. A transformation is one set of source connection, target connection,

mapping, event, and property specifications. When these specifications are set, the data

transformation process can be run.


Truncate To remove leading or trailing digits or characters from an item of data without regard to the accuracy

of the remaining characters. Truncation occurs when data is converted into a new record with

smaller field lengths than the original.

U

Unicode A character encoding scheme that uses two bytes to represent every character, regardless of whether

it’s an ASCII character. This scheme is capable of encoding all known characters and is used as a

worldwide character-encoding standard.

V

Validation A process that ensures that the user has provided sufficient information in the design phase. In

Process Designer, for example, it verifies that the Steps and links have certain fundamental

requirements.

Variable (Public, Global, Dim) A named storage location that can be modified during program execution. Each variable has a name

that uniquely identifies it within its level of scope. A Public variable can be used throughout a

project, while a Global variable can be used throughout a transformation. Dim variables are specific

to a module or an expression.

View A virtual table that looks like and acts like a table in a relational database. A view is defined based

on the structure and data of a table. A view can be queried and sometimes updated.

W

Where Clause The part of a SQL statement that specifies which records to retrieve. In the Map Designer, the

statement is an option in source properties in several SQL database applications, such as Access,

Oracle, and SQL Server.
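For example (the state value is illustrative), in a statement such as SELECT * FROM srcAccounts WHERE State = 'TX', the Where Clause restricts the rows returned to accounts in Texas.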

Workspace

A collection of Repositories. Each Workspace directory contains a macro definitions file

called "macrodef.xml".


Appendix

This section contains additional exercises and information that may be of use.


Additional Exercises


Extract Schema Designer: Extracting Fixed Field Definitions

Keywords: Extract Schema Designer: Multiple Fields per Line

Style (fixed)

Description

The next file that we will be parsing is Purchases_Mail.txt. We should take a look at it in a text

viewer. Although it might be possible to use this report file as a direct input for a transformation, we

would have to define it as a multiple-record-type file. Although there are fewer record types than

with the phone purchases we dealt with earlier, there are still enough that when combined with the

extra processing logic involved, the job would become tedious. So, again, what we plan to do is use

the Extract Schema Designer to create an extract specification that will transform the report file into

a more familiar row/column format, and then use that formatted data as input to the transformation

that adds these purchases to the database table. As before, we don’t require multiple passes of the

input file. We will just create the extract schema and apply it to the input on the Source tab of our

eventual transformation.

Exercise

1. From the Repository Explorer, select New Object › Extract Schema.

2. At the prompt, navigate to the file you will be working with, in this case,

Purchases_Mail.txt.

3. In the Source Options dialog, on the Extract Design Choices tab, set the Tag

Separator to “Colon:Space(: )” Also on this tab, ensure that the Trim Leading and

Trailing Spaces checkbox is selected.

4. On the Display Choices tab, ensure that the Pad Lines checkbox is selected.

5. Choose OK to accept the selections.

6. Highlight the entire Account Number line in the data.

7. Right-click in the highlight and select Define Data Field › Parse Tagged Data.

8. Highlight the label Purchase Order Number.

9. Right-click in the highlight.

10. Select Define Line Style › New Line Style.

11. Change the Line Style Name to PONumber.

12. Choose Add.

13. Highlight the Purchase Order Number tag and the data following it.

14. Right-click in the highlight.

15. Select Define Data Field › Parse Tagged Data.

16. Define the PO_Date Field using the same technique

17. Define the Category Line Style and the three Fields on it using the same technique.

18. Define the Unit Cost Line Style and the three Fields on it using the same technique.


19. Define the Line Style that determines the end of a row of data for the Extract File.

20. Locate the Line Style that contains the Field that will be the last column in each row in

the eventual extract file (in this case, Unit_Cost).

21. Double-click on the Line Style name to bring up the Line Style Definition dialog.

22. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.

23. Choose Update.

24. Click on the Browse Data Record button.

25. Choose OK to allow assignment of all Fields to the Extract File.

26. Examine the data to ensure that your Field definitions are correct.

27. Close the browser window.

28. Ensure that the Fields are in the order they appear in the input data.

29. Save the Extract Schema Design as Purchases_Mail.cxl.

30. Close the Extract Schema Designer.

31. Remember that this schema can be used as part of a source connection in Map Designer.


Integration Engine: Using the “-Set” Variable Option

Objectives

At the end of this lesson you should be able to give a variable a value from the command line.

Keywords: -Set

Description

In the Solutions\MapDesigner_TransformationFundamentals folder there is a transformation that has

a msgbox that displays the value of a variable.

First let’s run the map without changing the value of the variable.

At the command prompt type:

djengine C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\m_EngineTestwithVar.tf.xml

Click OK on the MsgBox pop up.

Note that without the -Verbose option the only command line indication that the Map ran
correctly is a single line: "Return Code: 0".
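To see more detail (a sketch based on the same command; the exact output will vary), add the -Verbose option:

djengine -Verbose C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\m_EngineTestwithVar.tf.xml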

Now let’s change the value of the variable.

For a string with a single word, type at the command prompt:

djengine -se myVar=\"NewValue\" C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml

Click OK on the MsgBox pop up.


For a string with multiple words, type at the command prompt:

djengine -se "myVar=\"New Value\"" C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml

Click OK on the MsgBox pop up.

Additional notes:

Aside from normal command line quoting/escaping sequences for the given operating system, what

is to the right of the equals sign will be used verbatim in an expression to set the variable.

On windows, the only command line quote character is the double quote, and it is escaped using a

backslash. By using -se gblsStartDate='07-09-1976' you are causing the expression gblsStartDate

= '07-09-1976' to be executed, which of course does nothing since the single quote indicates the start

of a comment.

By using -se gblsStartDate=07-09-1976 you are causing the expression gblsStartDate = (07 - 09) - 1976 to be executed.
If you use -se gblsStartDate="07-09-1976" you will get the same results as above (as if the quotes weren't present).

However, if you use -se gblsStartDate=\"07-09-1976\" the expression gblsStartDate = "07-09-1976" will be executed, which is what you want.

Note that this also means you can do something like -se gblsStartDate=now() and have

gblsStartDate = now() executed.


Integration Engine: Scheduling Executions

Keywords: Scheduler

Pervasive’s Integration Manager product provides scheduling capabilities, but many users may just

want to use schedulers they already have at hand.

There are many schedulers available that can be used to call the DJEngine.exe command and execute

a process or map. Some are third-party tools and some are native to the operating systems

themselves.

For example, you can schedule a batch file containing DJEngine commands using the following:

Windows: Schtasks (command-line only); Task Scheduler

Unix: Cron
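As a minimal sketch (the task name, schedule time, and script paths are illustrative, and both examples assume the djengine command has been wrapped in a batch or shell script):

Windows: schtasks /create /tn "NightlyIntegration" /tr "C:\Cosmos_Work\run_nightly.bat" /sc daily /st 02:00

Unix: 0 2 * * * /home/etl/run_nightly.sh (a crontab entry that runs the script every day at 2:00 AM)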


Lookup Wizard: Flat File Lookup

Keywords: Lookup Wizard, Count & Counter Variable parameters,

One-to-Many records (unrolling occurrences), and referencing

Target Field values

Description

Flat File Lookups allow us to look up data from a file that is not our source. We reference this data

with a key value that does come from the source; the lookup returns matching data, or a default value if no
matches are found. The Lookup Function Wizard allows us to build these customized functions and

store them in a code module.

We will also be unrolling a data field that contains multiple values. The Favorites categories are all

stored in one field with a pipe delimiter separating them. We will create a unique target record for

each of the values stored in a single source record. The Count and Counter Variable parameters of

the ClearMapPut action can be used to parse this field and unroll the records dynamically.

Exercise

1. Create a Map based on the specifications below.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: File: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

tblFavoriteInfo

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Note: The following code module should be built through the Lookup Wizard. The steps

for creating the code module are specified below.


Define Code Modules:

Code Modules: $(FUN_DATA)Scripts\Categories.flatfile.rifl

2. From the Menu click Tools › Define Lookup Functions to open the Lookup Wizard.

3. Choose the Flat File Lookup Wizard and click Next.

4. Create a new Flat File Definition named Categories and click Next.

5. Specify the Lookup File as C:\Cosmos9_Work\Fundamentals\Data\Category.txt.

6. Click Next.

7. Choose the appropriate Key Field, and Fields that should be returned by the lookup.


8. Click Finish.

The Wizard will create the Flat File Lookup Functions in a code module. Use the

functions in the appropriate event handlers as described below.

Categories_Field2_Lookup(KeyValue, DefaultValue)

Categories_Field3_Lookup(KeyValue, DefaultValue)

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord

ClearMapPut Record target name = Target

record layout = R1

count = CharCount("|",Records("R1").Fields("Favorites")) + 1

counter variable = cntr

Target Field Expressions

R1.FavoritesID Serial()

R1.Account Number Records("R1").Fields("Account Number")

R1.CategoryCode parse(cntr, Records("R1").Fields("Favorites"), "|")

R1.CategoryLiteral Categories_Field2_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoMatches")

R1.ProductManager Categories_Field3_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")


Lookup Wizard: Dynamic SQL Lookup

Keywords: Lookup Wizard, Dynamic SQL Lookup, Count & Counter

Variable parameters, One-to-Many records (unrolling

occurrences), and referencing Target Field values

Description

Dynamic SQL Lookups allow us to look up values from other sources when that source is a

relational table or view. Again we will use the Lookup Function Wizard to create User Defined

Functions that are stored in a code module.

Exercise

1. Create a Map based on the specifications below.

Map Summary:

Define the Source:

Source Connector: ASCII(Delimited)

Source Data: File: $(FUN_DATA)Accounts.txt

Source Options: Header = True

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

tblFavoriteInfo

Target Options: none

Target OutputMode: Clear File/Table contents and Append

Note: The following code module should be built through the Lookup Wizard. The steps

for creating the code module are specified below.

Define Code Modules:

Code Modules: $(FUN_DATA)Scripts\Categories.dynsql.rifl


2. From the Menu click Tools › Define Lookup Functions to open the Lookup Wizard.

3. Choose the Dynamic SQL Lookup and click Next.

4. Create a new Dynamic SQL Definition named Categories and click Next.

5. Create a name for the DJImport Object that will make the connection to the table or file.

Choose the name category and click Next.

6. Click Build to build a new Connection String and then click Next.

7. Connect to the data source defined below for the lookup and click Next.

Define the Connection String:

Connector: Access 2000

File: C:\Cosmos9_Work\Fundamentals\Data\TrainingDB.mdb

Table: tblCategories

Properties: none

8. Choose the appropriate Key Field, and Fields that should be returned by the lookup.


9. Click Finish.

The Wizard will create the following Dynamic SQL functions in a code module. Use

the functions in the appropriate event handlers as described below.

Categories_Init() – Initializes the DJImport object and makes the connection to the data source as defined by the connection string.

Categories_Category_Lookup(KeyValue, DefaultValue) – Creates the SQL call needed to retrieve a value from the Category field based on a Key value.

Categories_ProductManager_Lookup(KeyValue, DefaultValue) – Creates the SQL call needed to retrieve a value from the ProductManager field based on a Key value.

Categories_Terminate() – Terminates the connection to the data source by destroying the DJImport Object.

Define Events: Transformation and Map Properties Events

Event Name Event Actions Event Parameters

BeforeTransformation Execute Expression:

Categories_Init()

AfterTransformation Execute Expression:

Categories_Terminate()


Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord

ClearMapPut Record target name = Target

record layout = R1

count = CharCount("|",Records("R1").Fields("Favorites")) + 1

counter variable = cntr

Target Field Expressions

R1.FavoritesID Serial()

R1.Account Number Records("R1").Fields("Account Number")

R1.CategoryCode parse(cntr, Records("R1").Fields("Favorites"), "|")

R1.CategoryLiteral Categories_Category_Lookup _

(Targets(0).Records("R1").Fields("CategoryCode"), "NoMatches")

R1.ProductManager Categories_ProductManager_Lookup _

(Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")


RDBMS: Integration Querybuilder

Objectives

At the end of this lesson you should be able to extract data from one or more tables in the same

database by using a SQL Passthrough statement.

Keywords: Integration Query Builder, SQL Passthrough

Statements

Description

The Transformation Map designer source connectors allow for passing Select statements through to

a database server to obtain a row set. The resultant row set that is returned by the query then

becomes the source data for your Map.

Use the Integration Query Builder to generate the source record set. Alternatively, you can use the

SQL script that generates this source record set by using the SQL File connection option and

pointing to the matching SQL Script file in the Scripts folder.

When you choose an RDBMS source connector, there are three choices for selecting Source Data.

You can point directly to a table or view, pass a SQL statement through, or point to a SQL script

file that contains a SQL statement.

We will construct a SQL statement using the query builder.

Exercise

Once you have connected to a data source, (described below) your connection is displayed in the

upper-right pane. You can set up and save as many data source connections as you need. Integration

Querybuilder stores all connections you create unless you explicitly delete them.

1. Double-click the connection you want to use. The DB Browser in the lower-right pane will

display the database.

2. Click the database icon to display the icons for tables, views and procedures for this

database. Clicking on these will display their contents. Click on the individual tables to list

their columns, or right-click and select Get Details from the shortcut menu to see the SQL

representation of column values such as length, data types and whether they are used as

primary or secondary keys.

3. To create a query, select New Query from the Query menu. A new query icon will be

opened beneath the connection icon in the upper-right pane. You can rename this now or

later by right-clicking on the icon in Integration Querybuilder.

4. Drag the tables and views you want to use into the upper-left pane. This is called the

Relations pane. As you drag tables into this pane, you will see that SELECT... FROM

statements are created in the SQL pane. If tables are already linked in the database, these

links will be displayed, although these can be changed or removed for the purpose of this

particular query.

If you are using a table more than once, the second and further copies will be renamed. For example,

if you already have a Customer table in the Relations pane and you drag across another copy, it will

be automatically renamed Customer1.


The Select statement that is generated becomes part of the connection string and it is passed through

to the database server.

We can now map this data into any target type and format we desire.

The following is information taken from reports generated by Repository Manager from the

RDBMS_SelectStatements transformation in the Solutions folder:

Source (ODBC 3.x)

Database TrainingDB

SQLStatement

SELECT

srcAccounts.[Account Number],

srcAccounts.Name,

srcAccounts.Company,

srcAccounts.Street,

srcAccounts.City,

srcAccounts.State,

srcAccounts.Zip,

srcPurchases.PONumber,

srcPurchases.Category,


srcPurchases.ProductNumber,

srcPurchases.ShipmentMethodCode

FROM

(srcAccounts

RIGHT JOIN srcPurchases ON

srcAccounts.[Account Number] = srcPurchases.AccountNumber)

ORDER BY

srcPurchases.ShipmentMethodCode,

srcAccounts.City

Target (ASCII (Delimited))

location $(FUN_DATA)Purchases_SQLSelect.txt

TargetOptions

header True

Outputmode Replace

Source R1 Events

AfterEveryRecord ClearMapPut Record

target name Target

record layout R1

Map Expressions

R1.Account Number Fields("Account Number")

R1.Name Fields("Name")

R1.Company Fields("Company")

R1.Street Fields("Street")

R1.City Fields("City")

R1.State Fields("State")


R1.Zip Fields("Zip")

R1.PONumber Fields("PONumber")

R1.Category Fields("Category")

R1.ProductNumber Fields("ProductNumber")

R1.ShipmentMethodCode Fields("ShipmentMethodCode")


Structured Schema Designer: Binary Data and Code Pages

Objectives

At the end of this lesson you should be able to create a Structured Schema for a Binary, EBCDIC

File.

Keywords: Binary, EBCDIC

Description

Creating a Structured Schema for a Binary, EBCDIC File

When working with binary files, we will usually need to tell the Structured Schema Designer that the

file should be displayed and accessed using a coding structure other than ANSI. The most common

binary coding structure is EBCDIC. To change this property, we will work with the SSD connection

specification and specifically its Property Sheet. We can change the “Code Page” property to match

the coding structure of the file we are working with.

Another issue with binary files is that the records are often some arbitrary length (e.g., 500 bytes)

even though the logical records might be longer or shorter than that. As a result, when the data is

displayed in the Visual Parser, it does not appear as if the data is structured. There is no automatic

solution to this problem, but you can adjust the record length that the Visual Parser will use until you

see the data lining up properly. Then you can parse normally, and the SSD will “remember” the

record length you have set, and break the file apart properly when you use the schema in a Map

Design.

Exercise

1. Start a New Structured Schema Design

2. Click the Visual Parser button (red knife)

3. Change the Code Page property to 37 US EBCDIC (click the Apply button!)

4. Navigate to the file named Accounts_Binary.bin.

5. Determine the record length by looking for patterns in the file

6. Overtype the Length and press the Enter key (try 180 first; what happens? See the note after the record layout below)

7. After you have the columns lined up, parse the fields, select data types and field

properties until you have defined the structure.

8. Save the Structured Schema as s_BinaryDataCodePages.ss.xml for reuse.

Record Layouts

Record R1

Name Type Length

AccountNumber Text 9


Name Text 21

Company Text 31

Address Text 35

City Text 16

State Text 2

ZipCode Text 10

Email Text 25

BirthDate Date 4

Favorites Text 11

StandardPayment Packed decimal 6

Payments Packed decimal 7

Balance Packed decimal 6

Total 183
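A note on step 6 above: the field lengths in this layout add up to 183 bytes, so 183 is the record length the Visual Parser needs. If you try 180, each displayed record begins 3 bytes earlier than it should, the shift grows by another 3 bytes with every row, and the columns never line up; once you enter 183 the records align and you can parse the fields.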


Structured Schema Designer: Reuse Metadata (Reusing a Structured Schema)

Objectives

At the end of this lesson you will know the steps involved in applying a pre-developed Structured

Schema to a new file that is supposed to follow the structure defined in that schema.

Keywords: Structured Schema

Description

This example transforms a Binary file into an ASCII Delimited file.

When you activate the Structured Schema Designer from the Source Tab or Target Tab, and have

saved the schema, it is automatically attached to the current Transformation. If you wish to use a

pre-defined schema, both the Source and Target Tabs have a dropdown from which an existing

schema can be selected.

As soon as the schema is attached, the Source or Target information (hierarchy and field list) will be

filled in on the Map Tab. You may change field names, lengths and data types, but only if you first

“unlock” the schema.

Exercise

1. Start a New Map design session and choose the Binary connector.

2. Select the Structured Schema named s_BinaryDataCodePages.ss.xml.

3. Select the file named Accounts_Binary.bin.

4. Change the source property Code Page to 37 US EBCDIC (click APPLY button!).

5. Browse the file to confirm the structure has been applied.

6. If desired, you can complete the map based on the specifications below. The lesson,

though is intended to demonstrate that a file can be parsed in Structured Schema

Designer, and used as input for Map Designer. In this exercise we use it as a source

connection. Structured Schemas can also be used as part of a target connection.

Map Summary:

Define the Source:

Source Connector: Binary

Source Data: $(FUN_DATA)Accounts_Binary.bin

Source Schema: s_BinaryDataCodePages.ss.xml

Source Options: codepage = 0037 US (EBCDIC)


Define the Target:

Target Connector: ASCII(Delimited)

Target Data: File: $(FUN_DATA)AccountsOut.txt

Target Options: Header = True

Target OutputMode: Replace

Target Field Expressions

R1.AccountNumber Records("R1").Fields("AccountNumber")

R1.Name Records("R1").Fields("Name")

R1.Company Records("R1").Fields("Company")

R1.Address Records("R1").Fields("Address")

R1.City Records("R1").Fields("City")

R1.State Records("R1").Fields("State")

R1.ZipCode Records("R1").Fields("ZipCode")

R1.Email Records("R1").Fields("Email")

R1.BirthDate Records("R1").Fields("BirthDate")

R1.Favorites Records("R1").Fields("Favorites")

R1.StandardPayment Records("R1").Fields("StandardPayment")

R1.Payments Records("R1").Fields("Payments")

R1.Balance Records("R1").Fields("Balance")

Define Events: Source R1 Events

Event Name Event Actions Event Parameters

AfterEveryRecord ClearMapPut Record target name Target

record layout R1


Structured Schema Designer: Multiple Record Type Support in Structured Schema Designer

Objectives

At the end of this lesson you should be able to discuss the differences between files that have

multiple record types and those that don’t. You should be able to describe the tasks that will have to

be performed to work with source files that have multiple record types. You should also be able to

describe the actions you will have to take should you wish to create a target file with multiple record

types.

Keywords: Record Types, Record Layouts, Discriminator, and

Recognition Rules

Description

Files can be grouped into two main classifications relative to the records they contain. The first

classification is comprised of those files all of whose records are of the same type. This means that

each record will contain the same fields, in the same order and with the same properties. The second

classification is comprised of those files that contain records that have different formats. One record

might contain ten fields while another might contain only six or perhaps twelve. One record type

might describe a Customer while another describes a payment he made on his account. Certainly

these two records would be different.

The critical issue for record type files is not the definition of the records themselves. These can be

defined in the Structured Schema Designer with the Visual Parser (by parsing one of them, adding

another, parsing it, adding another, and so on). They can also be defined using the grid interface

within the SSD (where you simply enter record type names and then enter the field lists for each).

You might also be able to import the record layouts, perhaps from a COBOL copybook or some

other readable file.

The critical issue is how the Map Designer will be able to distinguish one record from another. For

any application to be able to work with a file of this type there must be some way to tell the records

apart. There should be one common field in each record type, the value of which must identify the

record type itself. If this were not true, no software application would be able to deal with the file-

Map Designer included. This field is called the “discriminator” field as it enables us to discriminate

between record types.

Once the discriminator field has been identified, the remaining task is to define the values that it can

have and associate these values with individual record types. For example, if the value of the field

were “CUS,” we might know we have a Customer record type. Or if the value of the field were

“PAY,” we might know we are dealing with a record that describes a payment on an account. These

types of rules are called “recognition rules,” and we must define at least one such rule for each

record type. Rules might not be so simple, but fortunately the Structured Schema Designer can work

with very complex ones.

To create a structured schema for a source file that contains multiple record types, there are three

possible strategies you can follow. The strategies you choose depends on what information you

already have available describing the file. The three strategies are:

1. You have record layout definitions available in a file:

Import the record layout definition file into the SSD.

Use the ALL Record Type Rules › Recognition dialog to define at least one rule for each

record type.


2. You have record layout definitions available in a printed document:

Select the connector type in the SSD.

Use the Grid layout to define each record type and its fields.

Use the “ALL Record Type Rules” › ”Recognition” dialog to define at least one rule for

each record type.

3. You have no definitions available- only the data file:

Activate the SSD Visual Parser for your file.

Name and parse each record type.

Find and select the discriminator field.

Use the “Recognition Rules” button to activate the Recognition Rules dialog and define at

least one rule for each record type.

The common element to these strategies is the definition of the “recognition rules.” These are

defined in the “Recognition Rules” dialog, which is activated from either the “ALL Record Type

Rules” › “Recognition” hierarchy item or the individual “R1 Rules” › “R1 Recognition” items on the

grid layout in the SSD.

First, you’ll identify the discriminator- the field whose contents will be used to tell the record types

apart. Next, you can use the Generate Rules button to automatically generate some skeleton rules for

each record type. Finally, you can add the actual value that the discriminator field will contain for

each record type (and adjust other properties of the rules as you wish). When you’re done, the

structured schema for the file can be saved.
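As an illustration only (the two lines below are hypothetical, but they follow the 29-byte Payment and CheckSum layouts defined in the exercise that follows), the first character of each record acts as the discriminator:

P1003001000715200600000012500
E07152006SUM0000125000001C001

A recognition rule that tests that field for the value P identifies the first line as a Payment record, and a rule that tests for E identifies the second as a CheckSum record.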

Scenario

A source file (Payments_MultiRecType.txt) contains multiple record types, and there is not any

information about the files records or its fields. We know that the file contains payment records

followed by a summary record. We also know that the payment records are supposed to contain an

account number, payment date and payment amount, and that the summary records will contain a

payment count and a payment total. However, we do not know where in the records each field

begins and ends. We need to define a structured schema for this file by visually determining where

each field for both record types starts and stops. We will use the parse data tool to accomplish the

task.

Exercise

1. Begin a new Map Design.

2. Point the source to the ASCII Fixed file Payments_MultiRecType.txt.

3. Browse the source file and determine whether multiple record types exist. Close the browser.

4. Click the “Build Schema...” button for the Structured Schema.

5. Click the Parse Data icon.

6. Rename the Record to Payment and parse a payment record according to the record

layout given below.


Record Payment

Name Type Length

RecordIndicator Text 1

AccountNumber Text 9

PaymentDate Text 8

Amount Text 11

Total 29

7. Click the Add Record button and name the new record type CheckSum.

8. Scroll down until you find the next different structured record (row 30).

9. Parse this record type with its fields as described below.

Record CheckSum

Name Type Length

RecordIndicator Text 1

EmptiedDate Text 8

Action Text 3

TotalAmount Text 9

PaymentCount Text 4

ClerkID Text 4

Total 29

10. Select the Payment record from the Record dropdown and ensure that the

RecordIndicator field is displayed in the “Field Name” box.

11. Check the Discriminator check box.

12. Click the “Recognition Rules...” button.

13. Click the “Generate Rules” button.

14. Define PaymentRule1 to be that the discriminator field equals P.

15. Define CheckSumRule1 to be that the discriminator field must be equal to E.

16. Return to the Structured Schema Designer dialog.

17. Save the structured schema as s_Payments_MultiRecType.ss.xml.

18. Close the Structured Schema Designer.


19. Browse the source file again and note how the structured schema information has been

applied to it. Look at both kinds of records and see how the browser changes.


Structured Schema Designer: Conflict Resolution

Objectives

At the end of this lesson you should be able to use a Structured Schema to set up a Map that uses one

source record type to verify the data in the other record type.

Keywords: Schema Mismatch Handling, Record Specific Event

Handlers, and Validation

Description

Our newly defined payment file structure allows us more robust data validation opportunities as we

load the Payments table because we have some checksum values on which we can evaluate data. The

additional record layout (Check Sum) in our payments file has data that allows us to evaluate

aggregated data with checksum values. We can make use of the record specific Event Handlers to

perform the evaluations at the appropriate time.

When creating a map with multiple record types, the Default Event Handler may not be set for
you automatically, nor will it be sufficient on its own. Therefore, you will need to
define the Event Handlers and Actions that are needed to perform the transformation.

Exercise

Build a map based on the specifications in the report below.

Map Summary:

Define the Source:

Source Connector: ASCII(Fixed)

Source Data: $(FUN_DATA)Payments_MultiRecType.txt

Source Schema: s_PaymentsMultiRecType.ss.xml

Source Options: None

Define the Target:

Target Connector: ODBC 3.x

Target Data: Database: TrainingDB

Table: tblPaymentsVerified

Target Options: None

Target OutputMode: Clear File/Table contents and Append


Target Field Expressions

R1.AccountNumber Records("Payment").Fields("AccountNumber")

R1.PaymentDate Datevalmask(Trim(Records("Payment").Fields("PaymentDate")), "mmddyyyy")

R1.PaymentAmount Records("Payment").Fields("Amount") / 100

Variables

Name Type Public Value

paymentCounter Variant no

paymentSubtotal Variant no

Define Events: Source Payment Events

Event Name Event Actions Event Parameters

AfterEveryRecord Execute Expression:

paymentSubtotal = paymentSubtotal + Records("Payment").Fields("Amount")

paymentCounter = paymentCounter + 1

ClearMapPut Record target name Target

record layout R1

Define Events: Source CheckSum Events

Event Name Event Actions Event Parameters

AfterEveryRecord Execute Expression:

'This code can be imported by the menu, File > Open Script File > ChecksumTest.rifl

' declare temp variables used for better readability
Dim crlf, realTotal, realCount
crlf = Chr(13) & Chr(10)
realTotal = Records("CheckSum").Fields("TotalAmount")
realCount = Records("CheckSum").Fields("PaymentCount")

' display current count and payment sub-total for each clerk
MsgBox("---New Checksum---" & crlf & _
  "PaymentCounter= " & paymentCounter & " : Should be = " & realCount & crlf & _
  "Paymt Amt= " & paymentSubtotal & " : Should be = " & realTotal)

' evaluate count and sub-total for inconsistencies
If paymentSubtotal <> Trim(realTotal) Then
  MsgBox("Total payment amount for this clerk does not match checksum amount!!!", 48)
End If

If paymentCounter <> Trim(realCount) Then
  MsgBox("Payment Count for this clerk does not match checksum amount!!!", 48)
End If

' reset global variables for next clerk
paymentCounter = 0
paymentSubtotal = 0