enterprise edition batch processing guide build 39 · document sciences corporation 4 table of...

39
xPression 3 Enterprise Edition Batch Processing Guide

Upload: others

Post on 25-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

xPression 3Enterprise Edition Batch Processing Guide

© 2001-2008 by EMC Document Sciences Corporation. All rights reserved. The copyright protection claimed includes all formats of copyrightable material and information governed by current or future statutory or judicial law. This includes, without limitations, any material generated by the software programs that display icons or other screen interfaces. You may not copy or transmit any part of this document in electronic or printed format without the express written permission of Document Sciences Corporation. xPression, CompuSet, and all other Document Sciences Corporation products mentioned in this publication are trademarks of Document Sciences Corporation. For complete copyright information, please see the file xPression Licensing Document.pdf located on your eBook Library CD.EMC Document Sciences Corporation, 5958 Priestly Drive, Carlsbad, CA 92008www.docscience.com

Document Sciences Corporation

Table of Contents

Table of Contents

Introduction .................................................................................................... 5

Boxes and Revision Bars ................................................................................................................................... 5Solution Support ................................................................................................................................................ 6

xPression Batch Processing ........................................................................ 7

How Does xPression Handle Batch Jobs? ........................................................................................................ 7Understanding Job Definitions .............................................................................................................. 7If You Are Manually Creating Your Job Definition ................................................................................. 8CompuSet and xPublish Job Definitions ............................................................................................... 8

An In-Depth Look at xPression Batch ................................................................................................................ 9How xPression Batch Fits into Your Environment ................................................................................ 9Important Information About Your Data .............................................................................................. 10Oracle Databases and Batch .............................................................................................................. 11

Multi-Threading in xPression ........................................................................................................................... 11An In-Depth Look at Multi-Threading in xPression Batch ................................................................... 11

xPression Batch and Sub Documents ............................................................................................................. 16File Naming Recommendations for CompuSet Batch Jobs ............................................................................. 16Debugging Your Batch Environment ............................................................................................................... 17

Command Line Batch Processing ............................................................. 18

Command Line Parameters ............................................................................................................................. 19Overriding Data Sources .................................................................................................................................. 21

Data Override with xRevise Custom Batch ......................................................................................... 23Debugging Your Batch Environment ............................................................................................................... 23If You Are Manually Creating Your Job Definition ........................................................................................... 24Sample Batch Trigger File ............................................................................................................................... 24

Document Sciences Corporation

Table of Contents4

Previously Run Jobs ................................................................................... 26

Overview of a Previously Run Jobs Scenario ..................................................................................... 26Creating the Intermediate Output ........................................................................................................ 27Creating the Previously Run Jobs Job Step ....................................................................................... 28

Sample Batch Trigger File .......................................................................... 30

Batch Event Monitor .................................................................................... 32

Implementing the Event Notification Monitor ................................................................................................... 32Event Notification Sequence ............................................................................................................................ 33LifecycleListenerFactory Abstract Class .......................................................................................................... 34

BatchLifecycleListener ........................................................................................................................ 34LifecycleListenerFactory ..................................................................................................................... 34Sample LifecycleListenerFactory Abstract Class ................................................................................ 34

BatchLifecycleListener Interface ...................................................................................................................... 34Sample BatchLifecycleListener Interface ............................................................................................ 35

BatchInfo Class ................................................................................................................................................ 36Attributes ............................................................................................................................................. 36

Methods .............................................................................................................................................. 36BatchError Class .............................................................................................................................................. 36

Attributes ............................................................................................................................................. 36Methods .............................................................................................................................................. 37

DocInfo Class .................................................................................................................................................. 37Attributes ............................................................................................................................................. 37Methods .............................................................................................................................................. 37

EventListenerException Class ......................................................................................................................... 38Methods .............................................................................................................................................. 38

FactoryConfigurationException Class .............................................................................................................. 38Methods .............................................................................................................................................. 38

Document Sciences Corporation

This guide introduces you to the batch process in xPression Enterprise Edition. We will provide an overview of the batch process, show you how to run batch from the command line, and explain how previously run jobs work. Additionally, this guide provides a sample trigger file that you can review and instructions for creating a batch even monitor.

Boxes and Revision BarsThe following colored boxes alert you to special information in the documentation.

Revision bars help you locate new or changed information. Look for these revision bars in the right margin of each affected page.

Chapter 1

Introduction 1

Caution: The caution box warns you that a fatal error, unsatisfactory output, or loss of data may occur if you do not follow the directions carefully.

Tip: A tip offers suggestions to simplify a task or describes a useful shortcut. They may also describe an alternate way to use the techniques described in the text.

Note: A note offers information that emphasizes or supplements important points of the main text.

Document Sciences Corporation

Chapter 1 - Introduction6

Solution SupportFor more information or to solve a problem, contact Document Sciences Solution Support:

Telephone: (760) 602-1500

Fax: (760) 602-1515

World Wide Web: http://support.docscience.com

E-mail: [email protected]

Document Sciences Corporation

Running jobs in batch mode enables you to produce large volumes of personalized xPression documents in a single batch run. You can run batch jobs in three ways:

• Schedule batch jobs to run as unattended jobs

• Manually initiate the batch job from the command line

• Run the batch job directly from Job Management

How Does xPression Handle Batch Jobs?When running in batch mode, xPression produces one or more personalized documents from a set of customer data records. When you run a batch job, you are executing the Batch Runner command, passing along parameters and a job definition.

Understanding Job DefinitionsThe job definition is the key component of a batch job in xPression. It controls everything about your batch job except when and where it executes. Job definitions are XML files that provide the following information:

• Which documents to select

• How to select them

• Which data source to use

• How to process the documents

• How to customize batch reports and batch logs

Job definitions can be created through Job Management or composed manually using a text or XML editor.

Chapter 2

xPression Batch Processing 2

Document Sciences Corporation

Chapter 2 - xPression Batch Processing8

Job definitions generated by Job Management are stored in the content repository. Manually composed job definitions are stored on your file system as an XML file. To learn more about creating job definitions, see The Job Definition Page.

If You Are Manually Creating Your Job DefinitionIf you are manually creating your Job Definitions in XML and are using a DB2 Content Repository, you must use entity notation to represent the following symbols.

CompuSet and xPublish Job DefinitionsThe Job Management page handles job definitions for both CompuSet and xPublish documents. When creating a job definition you can identify the job as a CompuSet or xPublish job. Ensure you pay close attention to this designation as CompuSet and xPublish job definitions are not interchangeable, and only documents that match the job type will be available in Job Management.

All batch configuration work is done from The Job Management Page in xDashboard.

Symbol Notation Symbol Notation

> &gt; < &lt;

>= &gt;= <= &lt;=

<> &lt;&gt;

Document Sciences Corporation

Chapter 2 - xPression Batch Processing9

An In-Depth Look at xPression BatchNow let's take a more in-depth look at the entire Batch process.

The main job thread created by Batch Runner starts an xPression component that reads your customer data, and based on the instructions specified in the job definition, executes the job. It then sends the job record to the Task Queue. The customer data reader reads a block of customer data, which the batch component starts processing by launching a configurable number of parallel threads, each of which invokes an instance of the assembly engine. While the assembly engine threads are assembling personalized documents from the block of data read, xPression reads in the next block. All of these threads feed into a single thread to the Output Profile Controller, which in turn calls the composition engines, xPublish or CompuSet.

How xPression Batch Fits into Your EnvironmentA batch job definition contains your specifications for the customer data source, document source, publisher, and error log preferences. When processing a batch job, xPression merges this information with a previously defined output profile that directs your batch job to the correct output format and output device.

The left side of the diagram in Figure 1 on page 9 shows that the job definition and output profile are sent as input to the batch process.

Figure 1. This diagram shows the queues and threads that are created during batch job processing

Document Sciences Corporation

Chapter 2 - xPression Batch Processing10

xPression produces one or more output streams, each of which contains a set of personalized documents in a specific output format, targeted for a specific output device. The following table shows how output management and job definition elements control your output.

Important Information About Your DataYour data is defined in your data sources as everything between one delimiter node and the next. For example, one of the sample XML data sources supplied with xPression, AUTOPAY.xml, uses <Transaction> and </Transaction> to mark the start and end of data blocks.

xPression Batch will stop processing data for a single customer record set when the number of bytes read reaches the XMLCUSTOMERDATA_MAXRECORDLEN setting in the customerdata.properties file. To ensure that all the XML data is read for a single customer record set, make sure that you set the XMLCUSTOMERDATA_MAXRECORDLEN value to exceed the maximum byte count of the longest customer record set byte length.

xPression Batch sets a default memory size of 20 MB, which is used to store all of the items selected for inclusion in a customer’s document. This value is stored in the customerdata.properties file. In most cases, this will be enough memory to successfully assemble the output. If you need to increase the memory setting, change the XMLCUSTOMERDATA_MAXRECORDLEN value in the customerdata.properties file.

XMLCUSTOMERDATA_BUFSIZE = 5000XMLCUSTOMERDATA_MAXRECORDLEN=20000000

The XMLCUSTOMERDATA_BUFSIZE setting in the customerdata.properties file sets the upper limit for the number of bytes of memory used to process a customer record. If the XMLCUSTOMERDATA_ MAXRECORDLEN setting is smaller than the XMLCUSTOMERDATA_BUFSIZE setting, xPression will read in and process a complete customer record, and then begin to read the next record into the remaining memory. When the remaining memory is too small to successfully process the second customer record, xPression issues an error. To avoid partial processing of customer records, make sure that the XMLCUSTOMERDATA_BUFSIZE setting is always smaller than the XMLCUSTOMERDATA_ MAXRECORDLEN setting.

xPression Element What it Contributes to Your Output

Job Definition Selects customer data, document data, creates reports, and customizes the batch error log.

Output Definition/Format Definition

Determines the output format of the documents in the batch job. This definition is passed along as part of an output profile.

Distribution Definition Sends the documents to a specific output device for publishing. This definition is passed along as part of an output profile. xPression Batch does not support the Return to Caller distribution definition.

Document Sciences Corporation

Chapter 2 - xPression Batch Processing11

Oracle Databases and BatchIf your customer data is in an Oracle database and you encounter random dropped records when running a large batch job, you may need to adjust some parameters in the init.ora file. See your xPression installation documentation.

Multi-Threading in xPressionThe concept behind multi-threading is to enable xPression to process more than one customer record at a time. xPression’s batch capabilities were designed to be thread-safe, enabling xPression to open up several concurrent processing threads, significantly improving batch performance.

xPression sends these multiple threads to either xPublish or the CompuSet publishing engine. While xPression Batch and xPublish are thread-safe, the CompuSet publishing engine is not. CompuSet receives the multiple threads, but can only process them consecutively, not concurrently.

An In-Depth Look at Multi-Threading in xPression BatchxPression uses its main thread, the batch job reading thread, to launch an instance of the customer data reader. The customer data reader parses data according to the instructions contained in the job definition. After the customer data reader reads a block of data, xPression processes it by launching a configurable amount of parallel threads. Each thread invokes an instance of the assembly engine. While the assembly engine threads are

Figure 2. This figure illustrates a high-level view of multi-threading in xPression batch. The following section describes the batch process in more detail

Document Sciences Corporation

Chapter 2 - xPression Batch Processing12

assembling personalized documents for one block of customer data, the customer data reader reads in the next block.

For CompuSet, all of these threads feed into a single thread of the Output Processing Controller (OPC), which in turn calls the CompuSet composition engine. xPression uses a single thread because CompuSet is not thread safe. xPression can invoke only a single instance of it within a process. The multiple threads of the Assembly Engine EJB are synchronized through a queue. Since the Assembly Engine is faster than the Composition Engine, it is important to control the length of this queue so that it doesn't use too much memory and hang the application server. For xPublish, these threads feed into the multiple worker threads of the Streamer, which in turn calls xPublish. The Streamer performs the same functions for xPublish that the OPC performs for CompuSet.

Figure 3. This figure shows how the batch component moves customer data into the assembly engine.

Figure 4. This figure shows multi-threading through CompuSet and xPublish.

Document Sciences Corporation

Chapter 2 - xPression Batch Processing13

For both xPublish and CompuSet, the TaskQueue and the OPCQueue synchronize the work of BatchRunner, the customer data reader, the assembly engine, and the OPC or Streamer. Both are implemented using the same Java class called SyncQueue().

The TaskQueue is a data synchronization channel between Batch Runner's main thread (also known as the batch job reading thread) and the assembly engine threads. The OPCQueue is the data synchronization channel between the assembly engine threads and the OPC or Streamer. Both queues have two configurable thresholds: EnqueueThreshold and DequeueThreshold.

When the number of records in a queue exceeds the configuration value in the EnqueueThreshold, the component that is moving records into the queue must stop and wait until the number of records has fallen to the value in the

Figure 5. This figure shows TaskQueue and OPCQueue

Document Sciences Corporation

Chapter 2 - xPression Batch Processing14

DequeueThreshold. When the value falls to the level of DequeueThreshold xPression initiates the component that moves records into the queue to start moving records again.

You can set this threshold in your xDashboard job definition page.

Configuring Multi-Threading

The settings that control multi-threading reside in the BatchRunner.properties file, the xPressionPublish.properties file, and on the xAdmin Job Definition page. You should configure xPression Batch to run most efficiently on your system.

BatchRunner.properties is located in the xPression installation directory on your application server. It contains the following properties.

Figure 6. Moving records into the queue.

Parameter Definition

TaskQueue_DequeueThreshold The customer data reader reads in blocks of data to the TaskQueue. If too many records accumulate in the TaskQueue, the customer data reader stops reading new blocks of data until the number of records in the queue match this value.

When the number of records in the TaskQueue matches or is lower than this value, the customer data reader resumes reading new blocks of customer data records. Document Sciences recommends a value of 5. See Figure 5 on page 13.

TaskQueue_EnqueueThreshold The maximum number of customer job records that can be queued up for assembly. When the TaskQueue surpasses this value, the customer data reader ceases reading new blocks of data until the DequeueThreshold is met. Document Sciences recommends a value of 10000.

OPCQueue_DequeueThreshold This is not needed, but is included for completion. This value should always be zero.

OPCQueue_EnqueueThreshold The maximum number of assembled documents that can be queued for processing. It is important to set the value to prevent the OPC queue from growing too large in memory. Document Sciences recommends a value of 10.

Document Sciences Corporation

Chapter 2 - xPression Batch Processing15

The xAdmin Job Definition page contains performance parameters.

These parameters configure multi-threading in xPression.

There is one more setting that can affect multi-threading, it is the EnableSubstream setting in xPressionPublish.properties that activates or deactivates the multiple thread capabilities of the xPublish emitters. Once a document is composed by xPublish, it is sent to an emitter that creates the file in the proper format. When you disable the multiple thread capabilities of the xPression emitters, the emitter is only capable of processing one document at a time.

It is necessary to disable this setting when you wish to produce all of your customer records in a single output file.

EnableSubstream is disabled by default in the xPressionPublish.properties file located in your xPression installation directory on the application server. Disabling this setting should not produce a significant performance downgrade for typical xPublish documents, such as statements. Emitter performance is generally an issue for long text-based documents, such as policies and contracts.

Figure 7. Batch Performance parameters

Parameter Definition

Thread Pool Size Defines the number of worker threads available for each batch run. The Customer Data Reader and xPression Assembly components of the batch process use this setting to distribute customer records across parallel threads to improve performance.

Customer Record Buffer Number of customer records the main batch thread reads in at a time.

Job Level Indicates what type of information will be collected for your job.

Statistics - Collects only batch statistics, such as start time, end time, and publish type.

Statistics with errors - Collects all the statistics information and information about failed customer documents.

Statistics with details - Collects all the statistics information and customer document information for all documents.

Document Sciences Corporation

Chapter 2 - xPression Batch Processing16

If you are using multi-threading, are not producing a single print file, and need increased emitter performance, configure the following parameters in the xPressionPublish.properties file.

The recommendations provided above may not be ideal for your specific installation because of various factors, such as CPU speed, available memory, and I/O speed. You may need to adjust the configurations based on observed performance.

xPression Batch and Sub DocumentsWhen using data override with subdocuments, the customer data source for both master and subdocument must use the same schema.

File Naming Recommendations for CompuSet Batch JobsIf you are running a batch job that can conditionally execute one or more output streams based on a set of criteria, Document Sciences recommends creating unique names for your print files using counters or variables, or deleting each print file with a print script after it completes processing.

The reason for this suggestion is that the print files created by the batch job are not automatically deleted after each run. Remember that the DistributionController service processes any records that it finds in the T_PRINTINFO table in your content repository. xPression populates this table based on the output files it finds in the output file directory you specified. If the output files are not automatically deleted after each run, xPression will continue populating T_PRINTINFO with all of the files in that directory regardless if they were generated by the new batch run or not.

For example, if you ran a batch job that produced two files (file A and file B), both file A and file B would be saved to your output file directory. If you then ran the batch job a second time and it produced only one file (file A), xPression would overwrite the existing file A but not delete the existing file B. xPression would then process both file A and the unneeded file B for the second batch job.

Parameter Definition

EnableSubStream This value can be set to true or false. If set to true, the xPublish emitters are capable of processing multiple streams of documents concurrently.

SubstreamNumber This parameter is only used if EnableSubstream=true. This value defines the number of parallel substreams you want your emitter to use. This value must be set to an integer.

Document Sciences Corporation

Chapter 2 - xPression Batch Processing17

You can avoid this problem by giving your output files unique names or by deleting the output files after each batch run through a print script.

Debugging Your Batch EnvironmentIf you are experiencing problems while processing your jobs with xPression Batch, your first step should be to discover the source of the error.

Often, the easiest way to accomplish this is to check for errors in the log file specified on the Job Log tab of your job definition. This log file should contain a list of errors encountered by the batch process.

For more information about the problems your batch run encountered, see the main log file, xPression.log. As is shown in Working with the LogConfiguration Files, you can set up this log file to produce three levels of details: INFO, ERROR, and DEBUG.

Document Sciences Corporation

For Windows 2000 and Windows XP operating systems, Document Sciences provides BatchRunner as a batch file. For UNIX platforms, Document Sciences provides Batch Runner as a shell script.

To run xPression Batch directly from the command line, navigate to the xPression .ear directory in your application server installation directory and type the BatchRunner command. The BatchRunner command uses the following parameters:

• For Windows operating systems, type:

BatchRunner -j JobDefName -f JobDefLocation -q OutDataOverrideFile -o InDataOverrideFile -n BatchParameter -p OutputFilePath -ignoreDebug -d DiagnosticOutput -disableReporting -r JobRunIDOverride

• For UNIX operating systems, the command is case sensitive. Type:

$ BatchRunner.sh -j JobDefName -f JobDefLocation -q OutDataOverrideFile -o InDataOverrideFile -n BatchParameter -p OutputFilePath -ignoreDebug -d DiagnosticOutput -disableReporting -r JobRunIDOverride

For example, if you installed WebSphere to the C:\ directory, your BatchRunner command would look like this: C:\WebSphere\appServer\installedApps\xPression.ear\BatchRunner -j NightlyBatchRun

The default location of xPression Batch output is:

• For CompuSet: <xPressionHome>\CompuSet\Output\

• For xPublish: <xPressionHome>\Publish\Output\

Chapter 3

Command Line Batch Processing 3

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing19

Command Line ParametersThe following table contains descriptions of the BatchRunner parameters.

Parameter Definition

-j This parameter identifies the job definition you want to run. Use this parameter if you created your job definition in xDashboard. Follow the -j parameter with the name of the job definition. You cannot use this parameter if you are also using the -f parameter. If you created your job definition manually, see the instructions for the -f parameter.

NOTE: If your job definition name contains more than one word, you must enclose the name in double-quotes. For example, “Sample Job Def”.

-f (Optional) This parameter identifies the location of a manually created job definition. You should only use this parameter if you created your job definition manually.

Follow the -f parameter with the fully qualified name and location of the job definition file. Do not use this parameter if you created your job definition with xDashboard. You cannot use this parameter if you are also using the -j parameter.

-q (Optional) This parameter creates a list of the data sources used in the batch run. If you follow this parameter with an output file name, xPression will write the list to the file. If you do not supply a file name, xPression will echo the results to the user screen.

By specifying a filename, you can edit this file to change the data source designations and import it into the job definition using the -o parameter.

For more information, see Overriding Data Sources.

-o (Optional) This parameter enables you to override the data sources of a batch run. Typically, you will first create a data override file with the -q parameter. You can then edit this file to change the data source designations, and import it with the -o parameter.

Supply the name of the file you want to import after the -o parameter.

For more information, see Overriding Data Sources.

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing20

-n (Optional) The BatchParameter. This parameter enables you to add an identifier to your report file, log file, and print file names. This identifier helps you identify output files for a particular job. This is especially helpful when output from more than one job resides in the same directory.

Follow the -n parameter with text to identify the output files of a batch run.

For example:

-n 1stRun

If you have added the BatchParameter variable to your report file, log file, or print file names, xPression will take the identifier and apply it to those file names.

Anytime you use a BatchParameter in xDashboard, xPression appends the BatchParameter to the beginning of your CompuSet and emitter log file names. This helps you locate all the log files for a particular batch run.

-p (Optional) This parameter specifies the output file location for the print and log files. If used, it overrides and print, log, or report file paths you specified in xAdmin. For example:

-p C:\Test Jobs\Daily Run\

-ignoreDebug When running in DEBUG mode, xPression will warn you that your LogConfiguration file is set to DEBUG. When you run in DEBUG mode, xPression writes a lot of information to the xPression.log file. This can slow down your publishing time. xPression requires you to confirm running in DEBUG mode by typing "Y". To disable this warning, use the -ignore DEBUG parameter. This parameter is used by default when running batch through xAdmin.

-d The diagnostic output parameter. To use the diagnostic utility with xPression batch, simply add the -d command line parameter to the batch start command.

The -d parameter should be followed by an identification string. This string can consist of any characters that are valid for a filename.

For example:BatchRunner -j MyJobDefinition -d TestDiagnosticOutput

See The xPression Diagnostic Utility for more information.

-disableReporting xPression’s reporting capabilities can slow down your batch performance. Use this parameter to disable the xPression’s reporting capabilities for the current batch run. xPression will not capture any job run information when this parameter is used.

Parameter Definition

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing21

Overriding Data SourcesYou can override the designated data sources when running a batch job by making use of the -q and -o parameters. The -q parameter enables you to output a data file that lists the data sources for each step in the job. You can alter this file and use the -o parameter to input the altered data file to the batch runner.

-r Enables you to specify the job run ID as an input parameter to xPression during a batch run. If you do not specify the job run ID, xPression automatically creates a unique identifier.

If you want to query your batch jobs from the database in an automated fashion, you should use this parameter.

Please make sure the job run ID you choose is unique. xPression will report an error if you attempt to create a job run ID that already exists. The job run ID must contain only numbers and be no longer than 31 characters.

For example:

-r 142990045

The xPression-generated job run ID consists of the following values:<IP_address><object_hashcode><system_milliseconds><global_counter>

Figure 8. This figure shows the format of the output data file.

Parameter Definition

Trigger Data Source

Step Number

Step Name Assemble Data Source

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing22

You can edit this file to change the data source and trigger file designations. You cannot change the step number or step name. To override only the trigger file in step 1, change the output data file as shown here.

The following parameters make up the JobStepDatasourceOverride element:

<?xml version=”1.0” encoding=”UTF-8”?><JobDatasourceOverride>"<JobStepDatasourceOverride jobStepID="" jobStepName="" definedAssembleDatasource="" definedTriggerDatasource="" overridenAssembleDatasource="" overridenAssembleDataFile="" overridenTriggerDatasource=""" overridenTriggerDataFile="" /></JobDatasourceOverride>

Parameter Description

jobStepID The unique ID for the job step.

jobStepName The name of the job step. It has to match the job step name you defined in xDashboard for the job .

definedAssembleDatasource

The name of the defined assemble datasource. This data source is specified in xDashboard when you define the job.

definedTriggerDatasource

The name of the defined trigger data source.

overridenAssembleDatasource

The name of the overriden assemble data source.

overridenAssembleDataFile

The name of the overriden assemble data file. This is the XML file where the override data resides.

overridenTriggerDatasource

The name of the overriden trigger data source.

overridenTriggerDataFile

The name of the overriden trigger data file.

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing23

To import an altered output data file, use the -o parameter as follows:

• From your Windows command line, type BatchRunner -j Test -o OverrideTrigger.txt

• From your UNIX command line, type $ BatchRunner.sh -j Test -o OverrideTrigger.txt

Data Override with xRevise Custom BatchA new attribute, customerXMLFilePath, has been added to the JobDatasourceOverride element in xRevise Custom Batch. For example:

By pointing customerXMLFilePath to the customer data xml, xRevise custom batch will use these data to replace variables in the selected completed work item.

Debugging Your Batch EnvironmentIf you are experiencing problems while processing your jobs with xPression Batch, your first step should be to discover the source of the error.

Often, the easiest way to accomplish this is to check for errors in the log file specified on the Job Log tab of your job definition. This log file should contain a list of errors encountered by the batch process.

For more information about the problems your batch run encountered, see the main log file, xPression.log. As is shown in Working with the LogConfiguration Files, you can set up this log file to produce three levels of details: INFO, ERROR, and DEBUG.

<JobDatasourceOverride> <JobStepDatasourceOverride jobStepID="0" jobStepName="xxx"customerXMLFilePath="xxx"/></JobDatasourceOverride>

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing24

If You Are Manually Creating Your Job DefinitionIf you are manually creating your Job Definitions in XML and are using a DB2 xPression database, you must use entity notation to represent the following symbols.

Sample Batch Trigger FileAn xPression trigger file contains two elements, the primary key (PK) and the document name. In most cases, the PK identifies the customer record.

Please examine the following sample trigger file:

Symbol Notation Symbol Notation

> &gt; < &lt;

>= &gt;= <= &lt;=

<> &lt;&gt;

<CustomerData><TRIGGERDATA><AUTOPAY_KEY>1</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>5</AUTOPAY_KEY><BDT>Invoice</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>8</AUTOPAY_KEY><BDT>Invoice</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>45</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>70</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA></CustomerData>

Document Sciences Corporation

Chapter 3 - Command Line Batch Processing25

In this example, <AUTOPAY_KEY> is the primary key for the data source. When creating your trigger file, use the name for your primary key.

The <BDT> element identifies the document name.

You can only specify one primary key and one document name for each triggerdata record. If you want to publish multiple documents for the same customer record, you must create a separate triggerdata record for each document. For example:

<TRIGGERDATA><AUTOPAY_KEY>45</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA><TRIGGERDATA>

<AUTOPAY_KEY>45</AUTOPAY_KEY><BDT>Invoice</BDT>

</TRIGGERDATA>

Document Sciences Corporation

Previously Run Job is a CompuSet-only feature. Previously run jobs are jobs that were intercepted by xPression before the final output was created. This intermediate output is saved to a network location defined by the user. The purpose of saving the intermediate output is to merge it with another batch job through the use of a Previously Run Job step.

A Previously Run Job step performs the sole action of merging intermediate output into your current job definition. This chapter covers how to set up a complete Previously Run Job scenario.

Overview of a Previously Run Jobs ScenarioIn most cases, a Previously Run Jobs scenario is similar to the following setup:

Notice that this scenario uses three different job definitions.

The first two job definitions create separate pieces of intermediate output. These job definitions do not use a Previously Run Jobs step. The third job definition uses the Previously Run Jobs step to merge the intermediate output.

Chapter 4

Previously Run Jobs 4

Job Definition Description

Job 1 The first job definition is configured to save intermediate output to a network location.

Job 2 The second job definition is configured to save additional intermediate output to a network location.

Job 3 The third job definition contains the Previously Run Job step that merges together the intermediate output from Job 1 and Job 2.

Document Sciences Corporation

Chapter 4 - Previously Run Jobs27

Important Requirement

A Previously Run Jobs scenario contains one crucial requirement. Intermediate output can only be merged and published by the same output stream(s) that originally created it. This means that the output stream used in the output profile for your Previously Run Job definition will only produce output if that same stream was used to initially create the intermediate output.

For example, consider the following scenario:

Job 1 uses output profile A which contains two output streams, Stream X and Stream Y. This job creates output named CCF1.

Job 2 uses output profile B which contains two output streams, Stream Y and Stream Z. This job creates output named CCF2.

Job 3 attempts to merge and publish intermediate output from both jobs.

If Job 3 uses output profile A which contains Stream X and Stream Y, your final output will be as follows:

• Stream X from output profile A will produce all documents from CCF1 and no documents from CCF2

• Stream Y from output profile A will produce all documents from CCF1 and CCF2

Stream X does not produce documents for CCF2 because stream X was not used to create CCF2.

For this reason, Document Sciences recommends that any stream you want to output from intermediate output be included in the job that creates the intermediate output.

Creating the Intermediate OutputThis section describes how to setup a job definition that creates intermediate output as described in the table on page 26 (Job 1 and Job 2).

To create intermediate output, complete the following steps:

1. Create a new CompuSet job definition and supply a name for the job definition.

2. Select the output profile. It is important to remember the name of the profile you select. The job definition that will eventually merge together your intermediate output (Job 3 in the table on page 26) must reference an output profile that uses an output stream that was previously used in one of the job definitions that created the intermediate output (Jobs 1 and 2 from the table on page 26).

3. This job definition requires a job step to select your documents and customer records. Do not select Previously Run Jobs for this job step. Select Assemble or Queued Documents.

Document Sciences Corporation

Chapter 4 - Previously Run Jobs28

4. Select the Save to Intermediate Output Only checkbox and provide a path and filename for the intermediate output in the field indicated. All documents processed through this job definition are saved to the path you define in the Path to directory box.

5. Repeat these steps to create additional intermediate output jobs if necessary.

Creating the Previously Run Jobs Job StepThis section describes how to setup a Previously Run Job definition that merges intermediate output as described in the table on page 26 (Job 3).

To create your Previously Run Job definition, create a CompuSet job definition (as shown in Creating the Intermediate Output) that uses a valid output profile. This job definition must reference an output profiles that uses an output stream that was previously used in one of the output profiles from the job definition that created the intermediate output (Jobs 1 and 2 from the table on page 26).

For example, the following scenario is valid:

• Job1 creates intermediate output and uses an output profile that contains output stream A.

• Job2 creates intermediate output and uses an output profile that contains output stream B

• Job3 uses a Previously Run Job steps to merge together the intermediate output, and uses an output profile that contains output stream B.

The following scenario is also valid:

• Job1 creates intermediate output and uses an output profile that contains output stream A.

• Job2 creates intermediate output and uses an output profile that contains output stream A

• Job3 uses a Previously Run Job steps to merge together the intermediate output, and uses an output profile that contains output stream A.

Figure 9. Remember the path you specify here, you'll need to reference it from the Previously Run Jobs definition

Document Sciences Corporation

Chapter 4 - Previously Run Jobs29

The following scenario is invalid:

• Job1 creates intermediate output and uses an output profile that contains output stream A.

• Job2 creates intermediate output and uses an output profile that contains output stream B

• Job3 uses a Previously Run Job steps to merge together the intermediate output, and uses an output profile that contains output stream C.

To merge the intermediate output:

1. Click Add Step to create a new job step.

2. From the job step page, select Previously Run Jobs from the Type list.

Supply the fully qualified path and filename of the intermediate output you want to merge into the current job. The path you define here is the same path that you used when saving intermediate output from your previous job definitions on page 28.

3. Click Update. The job definition page shows the new job step.

Each job step can only contain a reference to one piece of intermediate output. To merge in another intermediate output file, create an additional job step using the same instructions.

4. Next, you must use an Assemble job step as the final job step for your Previously Run Jobs job definition. The Assemble job step provides a key for merging the intermediate output from the previous documents into the final document.

Figure 10. Supply path and filename to intermediate output

Document Sciences Corporation

An xPression trigger file contains two elements, the primary key (PK) and the document name. In most cases, the PK identifies the customer record.

Please examine the following sample trigger file:

In this example, <AUTOPAY_KEY> is the primary key for the data source. When creating your trigger file, use the name for your primary key.

The <BDT> element identifies the document name.

Chapter 5

Sample Batch Trigger File 5

<CustomerData><TRIGGERDATA><AUTOPAY_KEY>1</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>5</AUTOPAY_KEY><BDT>Invoice</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>8</AUTOPAY_KEY><BDT>Invoice</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>45</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA><TRIGGERDATA><AUTOPAY_KEY>70</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA></CustomerData>

Document Sciences Corporation

Chapter 5 - Sample Batch Trigger File31

You can only specify one primary key and one document name for each triggerdata record. If you want to publish multiple documents for the same customer record, you must create a separate triggerdata record for each document. For example:

<TRIGGERDATA><AUTOPAY_KEY>45</AUTOPAY_KEY><BDT>Insurance Letter</BDT>

</TRIGGERDATA><TRIGGERDATA>

<AUTOPAY_KEY>45</AUTOPAY_KEY><BDT>Invoice</BDT>

</TRIGGERDATA>

Document Sciences Corporation

xPression enables you to monitor your batch jobs and receive notification when an error is generated. The event monitor is provided through a JAVA abstract class named LifecycleListenerFactory and a JAVA interface named BatchLifecycleListener. To use the event monitor, you are required to create custom JAVA classes to extend the abstract class and implement the interface.

The following diagram shows the architecture of the event notification design.

Implementing the Event Notification MonitorTo implement the event notification monitor, you must complete the following steps:

1. Create a class to implement the BatchLifecycleListener Interface. This class should have a default constructor.

2. Create a class to extend the LifecycleListenerFactory Abstract Class. You must override the method named createBatchLifecycleListenerFor();

Chapter 6

Batch Event Monitor 6

Figure 11. The following diagram shows the event monitor architecture.

Document Sciences Corporation

Chapter 6 - Batch Event Monitor33

3. Edit the batchrunner.properties file located in your xPressionHome directory. Add the following property to this file:ListenerFactoryImpClass=<ClassName>

Where <ClassName> is the class you created to extend the LifecycleListenerFactory abstract class.

Event Notification SequenceWhen fully implemented, the event notification sequence will operate as follows:

1. Batch is executed.

2. xPression creates the BatchInfo object.

3. xPression creates the LifecycleListenerFactory as dictated by the “ListenerFactoryImpClass=Class Name” property in the batchrunner.properties file. You set this property using the name of the class you created to extend the LifecycleListenerFactory class.

4. xPression creates the BatchLifecycleListener instance.

5. xPression invokes the BatchLifecycleListener.onBatchStart() method.

6. If the batch error handling mechanism captures errors, xPression invokes the onFatalError() or onDocumentError() methods.

7. When the batch job completes, the onBatchDone() method is called.

Figure 12. Sequence Diagram

Document Sciences Corporation

Chapter 6 - Batch Event Monitor34

LifecycleListenerFactory Abstract ClassThis is the JAVA abstract class provided by Document Sciences. The class you create must be derived from this abstract class. This class consists of two methods: BatchLifecycleListener and LifecycleListenerFactory.

BatchLifecycleListenerpublic abstract BatchLifecycleListener

createBatchLifecycleListenerFor(BatchInfo batchInfo) throws EventListenerException

LifecycleListenerFactoryThis method acquires configuration information from the batchrunner.properties file and creates a LifecycleListenerFactory class that is derived from LifecycleListenerFactory. The derived class then implements the createBatchLifecycleListenerFor() method.

public static LifecycleListenerFactory

createInstance() throws FactoryConfigurationException

Sample LifecycleListenerFactory Abstract ClassThe following text is a sample of the LifeCycleListenerFactory abstract class.

BatchLifecycleListener InterfaceThe interface will provide four basic events.

+onBatchStart()+onBatchDone()+onFatalError(in error : BatchError)+onDocumentError(in info : DocInfo, in error : BatchError)

import com.dsc.xpression.batch.tracking.*;public class BatchLifecycleFactoryImp extends LifecycleListenerFactory{ public BatchLifecycleListener createBatchLifecycleListenerFor(BatchInfo info){ return new TestListenerImp(info); }}

Document Sciences Corporation

Chapter 6 - Batch Event Monitor35

If the onBatchStart(), onFatalError() and onDocumentError() methods fail, you can catch these exceptions for debugging purposes as shown in the following sample code.

Sample BatchLifecycleListener InterfaceThe following text is a sample of the BatchLifeCycleListener interface.

import com.dsc.xpression.batch.tracking.*;import com.dsc.xpression.batch.tracking.exception.*;public class TestListenerImp implements BatchLifecycleListener{ BatchInfo info; public TestListenerImp(BatchInfo info){ this.info = info; } public void onBatchStart() throws EventListenerException{ System.out.println("onBatchStart(). " + info.getJobName() + " Start time: " + info.getStartTime()); }

public void onBatchDone(){ System.out.println("onBatchDone(). " + " End Time: " + info.getEndTime()); }

public void onFatalError(BatchError error) throws EventListenerException{ System.out.println("onFatalError(). "); System.out.println("code = " + error.getErrorCode()); System.out.println("level = " + error.getErrorLevel()); System.out.println("message = " + error.getErrorMessage()); }

public void onDocumentError(DocInfo info, BatchError error) throws EventListenerException{ System.out.println("onDocumentError()."); System.out.println("code = " + error.getErrorCode()); System.out.println("level = " + error.getErrorLevel()); System.out.println("message = " + error.getErrorMessage()); System.out.println("customerkeys = " + info.getCustomerKeys()); }}

Document Sciences Corporation

Chapter 6 - Batch Event Monitor36

BatchInfo ClassThis class captures the batch information and provides “set” and “get” methods that enable you to access these attributes.

AttributesThe BatchInfo Class has the following attributes.

MethodsThe BatchInfo class has the following methods.

+setJobName(in jobName : String)+getJobName() : String+setStartTime(in startTime : Date)+getStartTime() : Date+setendTime(in endTime : Date)+getendTime() : Date

BatchError ClassThis class captures batch error information and provides “set” and “get” methods that enable you to access these attributes.

AttributesThe BatchError class has the following attributes.

Attribute : Type Definition

-jobName : String The name of the batch job.

-startTime : Date The start time of the batch job.

-endTime : Date The ending time of the batch job.

Attribute : Type Definition

-errorCode : String The error message code.

Document Sciences Corporation

Chapter 6 - Batch Event Monitor37

MethodsThe BatchError class has the following methods.

+setErrorCode(in code : String) : void+getErrorCode() : String+setErrorMessage(in message : String) : void+getErrorMessage() : String+setErrorLevel(in errorLevel : String) : void+getErrorLevel() : String

DocInfo ClassThis class captures the customerKey attribute.

AttributesThe DocInfo class has the following attributes.

MethodsThe DocInfo class has the following methods.

+setcustomerKeys : String+getcustomerKeys() : String

-errorMessage : String The error message.

-errorLevel : String The recorded error level.

Attribute : Type Definition

-customerKeys : String The customer keys included in the batch job.

Attribute : Type Definition

Document Sciences Corporation

Chapter 6 - Batch Event Monitor38

EventListenerException Class

MethodsThe EventListenerException class has the following methods.

+EventListenerException()+EventListenerException(in s: String)

FactoryConfigurationException Class

MethodsThe FactoryConfigurationException class has the following methods.

+FactoryConfigurationException()+FactoryConfigurationException(in s: String)

Document Sciences Corporation

© 2011 - 2013 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change

without

notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO

REPRESENTATIONS OR

WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND

SPECIFICALLY

DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC2, EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United State and other

countries.

All other trademarks used herein are the property of their respective owners.