DENODO ITPILOT 4.6 DEVELOPER GUIDE
Update Aug 16th 2011
NOTE: This document is confidential and is the property of Denodo Technologies (hereinafter Denodo). No part of the document may be copied, photographed, transmitted electronically, stored in a document management system or reproduced by any other means without prior written permission from Denodo.
Copyright 2011. This document may not be reproduced in total or in part without written permission from Denodo Technologies.
INDEX

PREFACE
    SCOPE
    WHO SHOULD USE THIS DOCUMENT
    SUMMARY OF CONTENTS
1 INTRODUCTION
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
    2.1 WEB SERVICE TYPES
    2.2 INVOKING SOAP WEB SERVICES
    2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
        2.3.1 HTML Output Configuration
    2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
3 ITPILOT DEVELOPMENT API
    3.1 CONNECTING TO THE SERVER
    3.2 OBTAINING WRAPPERS
    3.3 USING WRAPPERS
    3.4 PROCESSING QUERY RESULTS
        3.4.1 Canceling Queries
    3.5 EXAMPLE OF USE
4 CREATING CUSTOM ITPILOT FUNCTIONS
    4.1 NAMING CONVENTIONS AND ANNOTATIONS
    4.2 COMPOUND TYPES
    4.3 PAGE TYPE
    4.4 CUSTOM FUNCTION RETURN TYPE
    4.5 EXAMPLE
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
    5.1 INTRODUCTION
    5.2 REPRESENTATION FORMAT OF A WRAPPER
        5.2.1 Initialization of Searchable Parameters
        5.2.2 Main Function
        5.2.3 Generating the Output Structure
    5.3 PREDEFINED ITPILOT COMPONENT GUIDE
        5.3.1 Introduction
        5.3.2 Data Structures
        5.3.3 Common functions
        5.3.4 Add Record To List
        5.3.5 Condition
        5.3.6 Create List
        5.3.7 Create Persistent Browser
        5.3.8 Diff
        5.3.9 ExecuteJS
        5.3.10 Expression
        5.3.11 Extractor
        5.3.12 Fetch
        5.3.13 Filter
        5.3.14 Form Iterator
        5.3.15 Get Page
        5.3.16 Init
        5.3.17 Iterator
        5.3.18 JDBCExtractor
        5.3.19 Loop
        5.3.20 Next Interval Iterator
        5.3.21 Output
        5.3.22 Record Constructor
        5.3.23 Record Sequence or Extractor Sequence
        5.3.24 Release Persistent Browser
        5.3.25 Repeat
        5.3.26 Script
        5.3.27 Sequence
        5.3.28 Store File
        5.3.29 Thread
    5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
        5.4.1 Developing Custom Components
        5.4.2 Using Custom Components
    5.5 WRAPPER DEVELOPMENT
REFERENCES

FIGURES
    Figure 1 Example of query execution to a wrapper
    Figure 2 ITPilot Custom Function Sample
    Figure 3 ITPilot Wrapper Skeleton in JavaScript
    Figure 4 Using the ExecuteJS NSEQL command
    Figure 5 Using threads in the Iterator component
    Figure 6 Using the Loop function
    Figure 7 Using the Repeat function
    Figure 8 Using custom components from JavaScript
PREFACE
SCOPE
Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.
WHO SHOULD USE THIS DOCUMENT
This document is aimed at developers who want to learn how to build applications that make the best use of the advanced automation and Web data extraction functionality provided by Denodo ITPilot. Detailed information on installing and managing the system is provided in other manuals, which are referenced as the need arises.
SUMMARY OF CONTENTS
More specifically, this document:
• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a "wrapper", that frees client applications from the difficulties of accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.
This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. This manual also explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. First, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. Alternatively, the wrappers can be exposed through Web Services, as described in this section. A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is described in [USER]).
The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The types of Web Services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web Services, but the output is an HTML table containing the response data for the executed query. The table includes JavaScript code to sort the results by any field and/or paginate them. It is also possible to adjust the size of the table and its cells, and to modify its appearance using a CSS file.
The following sections describe how to query these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed with any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The samples/itpilot/itp-clients directory of the ITPilot distribution contains a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp each show an information screen for the respective Web Service version, listing the available operations.
Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file describing the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, the following URL format should be used:

httphostportserviceNamerestopNamexsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

httpacme9090testWSrestgetPRODUCTDATAxsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing 'rest' with 'html'.
Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 would be obtained with:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
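The invocation format above can be assembled programmatically. The following is a minimal sketch only: WsUrlBuilder and operationUrl are hypothetical names that are not part of ITPilot, and the sketch assumes that parameters are passed as an ordinary URL-encoded query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of ITPilot) that assembles operation URLs. */
final class WsUrlBuilder {

    /**
     * Builds http://host:port/serviceName/variant/opName[?name=value&...].
     * The variant argument is expected to be "rest" or "html".
     */
    static String operationUrl(String host, int port, String serviceName,
                               String variant, String opName,
                               Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(variant)
                .append('/').append(opName);
        if (!params.isEmpty()) {
            // Join name=value pairs with '&' and prefix the whole query with '?'
            StringJoiner query = new StringJoiner("&", "?", "");
            for (Map.Entry<String, String> p : params.entrySet()) {
                query.add(encode(p.getKey()) + "=" + encode(p.getValue()));
            }
            url.append(query.toString());
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("prod_id", "1");
        System.out.println(operationUrl("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
        System.out.println(operationUrl("acme", 9090, "testWS", "html",
                "getPRODUCTDATA", new LinkedHashMap<String, String>()));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, so the generated URL is stable across runs.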
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with certain additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults: If this parameter is specified with the value true, the table displays information on the number of results obtained by the wrapper.
• intervalsize: If this parameter is specified, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: Indicates the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.
• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table is adapted to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.
• cellheight: Maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: Specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: Specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be specified in the part of the URL that corresponds to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file found in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed) has three parameters for configuring the connection pool:

1. poolEnabled: enables or disables the connection pool. The possible values are "true" and "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
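To check the pool settings of a deployed service, the env-entry elements can be read with the standard Java XML parser. The sketch below is an illustration only: PoolSettingsReader is a hypothetical class, and the sample document is a cut-down stand-in for a real web.xml, which would contain many other elements and possibly a DOCTYPE.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that extracts env-entry name/value pairs from web.xml content. */
final class PoolSettingsReader {

    /** Returns env-entry-name -> env-entry-value for every env-entry element. */
    static Map<String, String> readEnvEntries(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> entries = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element entry = (Element) nodes.item(i);
                String name = entry.getElementsByTagName("env-entry-name")
                        .item(0).getTextContent().trim();
                String value = entry.getElementsByTagName("env-entry-value")
                        .item(0).getTextContent().trim();
                entries.put(name, value);
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read env-entry elements", e);
        }
    }

    public static void main(String[] args) {
        // Cut-down sample; a real web.xml has more content around these entries.
        String sample =
                "<web-app>"
              + "<env-entry><env-entry-name>poolEnabled</env-entry-name>"
              + "<env-entry-value>true</env-entry-value>"
              + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
              + "<env-entry><env-entry-name>poolMaxActive</env-entry-name>"
              + "<env-entry-value>30</env-entry-value>"
              + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
              + "</web-app>";
        Map<String, String> pool = readEnvEntries(sample);
        boolean enabled = Boolean.parseBoolean(pool.get("poolEnabled"));
        int maxActive = Integer.parseInt(pool.get("poolMaxActive"));
        System.out.println("enabled=" + enabled + ", maxActive=" + maxActive);
    }
}
```

Because all three entries are declared with type java.lang.String, the values arrive as text and have to be converted by the caller, as main does here.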
3 ITPILOT DEVELOPMENT API
Denodo ITPilot includes a Java API for developing applications that use the wrappers created with it. Among other functions, this API lets an application connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution. When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of these stages: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users in Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID to connect with and the associated password.
Note that, even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, using the predefined user 'admin' (with the initial password 'admin').
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper whose name is specified as a parameter.
• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as a parameter.
• void loadWrapper(String vql). Takes as input the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(). Returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". For the float, double and date data types, make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, for date data types, the date pattern if one was set.
For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, a wrapper schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions of the movie that are available (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The call to getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether it is a searchable field, that is, a field whose value is not obtained from the source and therefore does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

set, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
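Returning to the query parameters described at the start of this section: because every value is passed as a string, numeric values have to be rendered the way the wrapper expects. The sketch below is a minimal illustration under stated assumptions. It assumes that the internationalization configuration of the wrapper corresponds to a java.util.Locale, MAX_PRICE is a hypothetical parameter name, and QueryParamsSketch is not part of the ITPilot API.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper showing how typed values might be turned into query strings. */
final class QueryParamsSketch {

    /** Renders a double as text using the decimal conventions of the given locale. */
    static String formatDouble(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setGroupingUsed(false);        // no thousands separators
        nf.setMaximumFractionDigits(10);  // avoid the default rounding to 3 digits
        return nf.format(value);
    }

    public static void main(String[] args) {
        // Assumption: the wrapper is configured with German number conventions.
        Locale wrapperLocale = Locale.GERMANY;

        Map<String, String> queryParams = new HashMap<>();
        queryParams.put("DIRECTOR", "Woody Allen");
        // MAX_PRICE is a hypothetical double-type input parameter.
        queryParams.put("MAX_PRICE", formatDouble(12.5, wrapperLocale));

        System.out.println(queryParams);
    }
}
```

The resulting map has the shape that query(Map params) expects, with every value already a string.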
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).
The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be called before accessing each element to make sure that data is available.
The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator has two methods for this:
• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(). If an error has occurred, returns a textual description of it; otherwise, returns null. The custom error messages specified by the wrapper creator for the 'raise' error handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
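The access pattern described in this section (test hasNext() before every next(), then check for errors once the results are exhausted) can be sketched without the Denodo libraries. In the sketch below, ResultCursor is a stand-in interface that only mirrors the method names mentioned above. It is not the HTMLWrapperResultIterator class, and fake() exists purely to exercise the loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Stand-in for the iterator methods described above; NOT the Denodo class. */
interface ResultCursor<T> extends Iterator<T> {
    boolean checkErrors();
    String getErrorDescription();
}

final class ResultLoopSketch {

    /** Reads every available result, then fails if the query reported an error. */
    static <T> List<T> drain(ResultCursor<T> results) {
        List<T> rows = new ArrayList<>();
        while (results.hasNext()) {      // always test before calling next()
            rows.add(results.next());
        }
        if (results.checkErrors()) {     // errors are checked after the loop ends
            throw new IllegalStateException(results.getErrorDescription());
        }
        return rows;
    }

    /** In-memory cursor used only to exercise drain(); error may be null. */
    static <T> ResultCursor<T> fake(List<T> data, final String error) {
        final Iterator<T> it = data.iterator();
        return new ResultCursor<T>() {
            public boolean hasNext() { return it.hasNext(); }
            public T next() { return it.next(); }
            public void remove() { throw new UnsupportedOperationException(); }
            public boolean checkErrors() { return error != null; }
            public String getErrorDescription() { return error; }
        };
    }

    public static void main(String[] args) {
        System.out.println(drain(fake(Arrays.asList("row1", "row2"), null)));
    }
}
```

Turning a reported error into an exception is a design choice of this sketch; an application could equally log the description and keep the rows already received.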
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in section 3.3:

TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to standard output.
To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // DoubleVO is used below; it is assumed to live in the same package as SimpleVO
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                    new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params (all values are passed as strings)
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results; hasNext() must be checked before each next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field (double)
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                            (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server.");
            } finally {
            }
        }
    }
Figure 1 Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file and added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using name conventions, without any dependency on Denodo libraries, but we recommend developing them using Java annotations. The annotations, compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar is loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading the jar to the server or removing it.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

Java                   ITPilot
java.lang.Integer      int
java.lang.Long         long
java.lang.Float        float
java.lang.Double       double
java.lang.Boolean      boolean
java.lang.String       text
java.util.Calendar     date
byte[]                 binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + “ItpFunction”
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional. If it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.
• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 to find the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.
import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {
    private static final String STRING_FIELD = "string";

    // Function signature: SPLIT_SAMPLE(regexp, value)
    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // Each substring becomes a one-field record in the returned array
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}
Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}
function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}
function main() {
}
Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see note 1).
These functions mirror the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.
Note 1: Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still accepted for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function (getInit) is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init();, creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function (getOutputSchema) determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).
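The omission rule in the NOTE follows from ordinary positional argument passing in JavaScript. The sketch below is only an illustration: setText here is a hypothetical stand-in function, not the ITPilot implementation, that takes the documented parameter order (field, regexp, type) and applies the documented defaults. Modelling OPTIONAL and OBLIGATORY as plain strings is also an assumption.

```javascript
// Stand-ins for the ITPilot constants (assumed to be plain values here).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical model of setText(field, regexp, type); not ITPilot code.
// Omitted trailing arguments fall back to the documented defaults.
function setText(field, regexp, type) {
    return {
        field: field,
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    };
}

var a = setText("FIELD");                  // regexp "" and type OPTIONAL by default
var b = setText("FIELD", "", OPTIONAL);    // same as a, written out in full
var c = setText("FIELD", OBLIGATORY);      // OBLIGATORY lands in the regexp slot
var d = setText("FIELD", "", OBLIGATORY);  // every argument to the left is supplied
```

In this model, call c leaves the field optional and treats OBLIGATORY as a regular expression, which is why an argument in the middle of the list has to be written out whenever one to its right is needed.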
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression for the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression for the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
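As an illustration of how the Record_Structure functions combine, the following sketch builds a nested output schema of the kind returned by getOutputSchema(). The real Record_Structure object only exists inside ITPilot, so the snippet first defines a small stand-in with the same function names, purely so that it can run elsewhere; the stand-in only records field definitions and is an assumption, as are the constant values and the field names (TITLE, STOCK, COMMENTS, AUTHOR, BODY).

```javascript
// Stand-in constants and object (assumptions, not the ITPilot implementation).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields.push({ kind: "text", name: field, type: type || OPTIONAL });
};
Record_Structure.prototype.setInt = function (field, type) {
    this.fields.push({ kind: "int", name: field, type: type || OPTIONAL });
};
Record_Structure.prototype.setArray = function (name, structure, type) {
    this.fields.push({ kind: "array", name: name, of: structure, type: type || OPTIONAL });
};
Record_Structure.prototype.toString = function () {
    var parts = [];
    for (var i = 0; i < this.fields.length; i++) {
        parts.push(this.fields[i].name + ":" + this.fields[i].kind);
    }
    return this.name + "(" + parts.join(", ") + ")";
};

// Wrapper-side code: uses only the function names documented in this section.
function getOutputSchema() {
    var comment = new Record_Structure("COMMENT");
    comment.setText("AUTHOR");
    comment.setText("BODY");

    var out = new Record_Structure("OUT_REC");
    out.setText("TITLE", "", OBLIGATORY);   // mandatory text field, no regexp
    out.setInt("STOCK");                    // optional by default
    out.setArray("COMMENTS", comment);      // array of COMMENT records
    return out;
}

var schema = getOutputSchema();
```

Passing the inner Record_Structure object as the structure argument of setArray is one reading of the parameter description above; the string produced by toString() in this stand-in is not the format ITPilot uses.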
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error whose behavior is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the http browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return “null” when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
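The four error actions can be read as a small control-flow policy. The sketch below is a hypothetical model, not ITPilot code: runWithPolicy, its arguments and the immediate (delay-free) retry loop are assumptions made for illustration, and the model takes no position on what exactly ITPilot re-runs. Only the four action names come from the list above.

```javascript
// Action names from the list above, modelled as plain strings (assumption).
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical model: call `run` and apply `action` if it throws.
// `retries` is the number of extra attempts for the two retry actions.
function runWithPolicy(run, action, retries) {
    var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return run();
        } catch (e) {
            lastError = e;
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null;      // carry on with a null result
    }
    throw lastError;      // stop and report the error
}

// A unit of work that fails twice and then succeeds.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) { throw new Error("timeout"); }
    return "page";
}

var first = runWithPolicy(flaky, ON_ERROR_IGNORE, 0);  // one failed attempt, ignored
var second = runWithPolicy(flaky, ON_ERROR_RETRY, 3);  // fails once more, then succeeds
```

The difference between ON_ERROR_RETRY and ON_ERROR_RETRY_IGNORE only shows once the retries are exhausted: the first rethrows the last error, the second returns null.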
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, ..., ELEMENTN]”, determines the elements on which the condition is evaluated.
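The positional placeholders ($0, $1, ...) refer to positions in the array passed to exec. The sketch below shows that mapping with a hypothetical stand-in: the Condition function defined here is not the ITPilot component and understands only expressions of the exact form ($i <= $j), whereas the real component accepts the function set described in [GENER].

```javascript
// Hypothetical stand-in for illustration; supports only "($i <= $j)".
function Condition(expr) {
    var m = /^\(\s*\$(\d+)\s*<=\s*\$(\d+)\s*\)$/.exec(expr);
    if (!m) {
        throw new Error("unsupported expression in this sketch: " + expr);
    }
    this.left = parseInt(m[1], 10);   // index of the left-hand element
    this.right = parseInt(m[2], 10);  // index of the right-hand element
}

// exec receives the elements as an array; $n is the element at index n.
Condition.prototype.exec = function (elements) {
    return elements[this.left] <= elements[this.right];
};

var MyCondition = new Condition("($0 <= $1)");
var met = MyCondition.exec([3, 10]);      // $0 = 3, $1 = 10
var notMet = MyCondition.exec([10, 3]);   // $0 = 10, $1 = 3
```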
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them pointed out.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration set by the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; those that match a regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence","ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a character string, in the same way as a condition expression (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the http request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the http request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit navigation sequence that returns to the originating page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
bull reusingConnection if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session
o setBackPages(pages) this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
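The three back-navigation settings described above (syncWithPost, setBackSequence and setBackPages) are checked in a fixed order. The following is only a sketch of that order, not ITPilot code: chooseBackStrategy and its config object are hypothetical names, and the default of one page for Back() is an assumption.

```javascript
// Hypothetical helper, not part of the ITPilot API: it only mirrors the
// order of checks described for syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Default method: re-send the POST that originally reached the page.
        return { method: "POST_SYNC" };
    }
    if (config.backSequence) {
        // An explicit NSEQL back sequence was set with setBackSequence.
        return { method: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command.
    // Assumption: one page back when setBackPages was never called.
    return { method: "NSEQL_BACK", pages: config.backPages || 1 };
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

In this model setBackPages only matters in the last case, which matches the description of that function.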
5313 Filter
bull Object Filter
bull Description this carries out a filtering operation from a list of records returning those meeting a given condition
bull Functions
o Constructor(expr, auxiliaryRecords)
bull expr regular expression of the filtering operation for a list of records, which are described in the exec function
bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter
o exec(inputRecords, auxiliaryRecords) function receiving a list of records and returning the subset that satisfies the selection expression indicated in the constructor
bull inputRecords list of input records
bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Filter will return the list of filtered elements, except for the one that caused the error
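The behaviour described in the note can be modelled in a few lines. This is a minimal sketch, not the component's implementation: filterRecords is a hypothetical helper, the condition is a plain JavaScript predicate (the real Filter takes an expression), and the two error constants are redefined here only so that the sketch runs outside ITPilot.

```javascript
// Stand-in constants; in ITPilot these names are predefined.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical model of Filter.exec over a list of records.
function filterRecords(inputRecords, condition, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (condition(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // any other setting: stop and report the error
            }
            // ON_ERROR_IGNORE: skip only the record that caused the error
        }
    }
    return result;
}

// Illustrative data: the second record makes the condition fail with an error.
var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
function cheaperThan20(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing");
    }
    return record.PRICE < 20;
}

var kept = filterRecords(records, cheaperThan20, ON_ERROR_IGNORE);
```

With ON_ERROR_IGNORE the faulty record is dropped and the remaining records are still filtered; with ON_ERROR_RAISE the same call would throw.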
5314 Form Iterator
bull Object Form_Iterator
bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run
bull Functions
o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
bull findForm NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)
bull submitForm NSEQL program that submits the form (see [NSEQL] for further information on NSEQL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information
bull baseElements optional list of records that can be employed as variables to use in the different NSEQL browsing sequences used in this component
bull inputPage input page from which the selected form can be iteratively invoked
bull parallelIterator “true”: the component will execute its iterations in parallel
o selectMultiplePositions(field, position, positionsArray, clickedArray) indicates which positions are selected in a multiple selection field in the target form
bull field name of the multiple selection field
bull position position related to the field between those of the same name starting with position 0
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray) this indicates the values selected from a multiple selection field for the chosen form
bull field name of the multiple selection field
bull position position related to the field between those of the same name starting with position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field, position, positions) this indicates the values selected from a selection field for the chosen form
bull field name of the HTML selection field
bull position position occupied in the event of more than one field element with the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field, position, values, positions, equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equal boolean value that indicates whether the field values must exactly match those provided to the function (“true”) or may merely contain them (“false”)
o click(field, value, state) function that selects an element and runs a “click” event on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element
bull state when this function is run on Radio Buttons this parameter is not used When run on Checkboxes it indicates the status of the element
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field, position, values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o textarea(field, position, values) this indicates the values added to a text area
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag) the component launches the iteration in parallel
bull flag “true”: the iterations will be executed in parallel
o next(inputPage) this returns the page resulting from running a component iteration
bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run
o hasNext() function that determines whether there are more results. The function returns “true” if there is at least one more result or “false” if there is not
o close() function that closes the iterator
o syncWithPost(flag) this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method
bull flag “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
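The clickedArray constants used by selectMultiplePositions, selectMultipleTexts and click determine how many form combinations are generated: CLICKED_AND_NON_CLICKED_ELEMENT doubles the combinations for its element. The sketch below illustrates this outside ITPilot. The expandCombinations helper is hypothetical, the constants are redefined only so that it runs standalone, and combining several elements as a cartesian product is an assumption.

```javascript
// Stand-in constants; in ITPilot these names are predefined.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical helper: expands valuesArray/clickedArray into a list of
// combinations, each one an array of { value, clicked } pairs.
function expandCombinations(valuesArray, clickedArray) {
    var combinations = [[]];
    for (var i = 0; i < valuesArray.length; i++) {
        var states;
        if (clickedArray[i] === CLICKED_AND_NON_CLICKED_ELEMENT) {
            states = [true, false]; // one combination marked, one unmarked
        } else {
            states = [clickedArray[i] === CLICKED_ELEMENT];
        }
        var next = [];
        for (var c = 0; c < combinations.length; c++) {
            for (var s = 0; s < states.length; s++) {
                next.push(combinations[c].concat([
                    { value: valuesArray[i], clicked: states[s] }
                ]));
            }
        }
        combinations = next;
    }
    return combinations;
}

var combos = expandCombinations(
    ["red", "green", "blue"],
    [CLICKED_ELEMENT, CLICKED_AND_NON_CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
);
```

Here only "green" can take both states, so two combinations result; each one would correspond to one iteration of the form.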
5315 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool from a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain) executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5327)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie information stored in “cookies”
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5316 Init
bull Object Init
bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application
bull Functions
o Constructor(input, output)
bull input input record of the component. Optional; used only when custom components are created (see section 54). In standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)
o get(name) this returns the value of a record field created as a group of initialization parameters
bull name name of the record field
o setText(field, obl, fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field, obl, fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field, obl, fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field, obl, fixedValue) this creates a float-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field, obl, fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field, obl, fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field, obl, fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field, obl, fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field, format, obl, fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field. This parameter is optional, but if it is specified, date values must follow it; otherwise the wrapper may fail to run. This representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used. ITPilot provides different types of i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
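The three values of the obl parameter (OPTIONAL, OBLIGATORY and FIXED) can be illustrated with a small model of how an initialization record might be filled from the values of a query. This is a sketch under stated assumptions, not the Init implementation: buildInitRecord and the field descriptors are hypothetical, the constants are redefined so that the code runs outside ITPilot, and both the null value for an omitted optional field and the error raised for a missing obligatory field are assumptions.

```javascript
// Stand-in constants; in ITPilot these names are predefined.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical model: builds the initialization record from field
// descriptors and the values supplied in a wrapper query.
function buildInitRecord(fields, queryValues) {
    var record = {};
    for (var i = 0; i < fields.length; i++) {
        var field = fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(queryValues, field.name);
        if (field.obl === FIXED) {
            record[field.name] = field.fixedValue; // constant value always wins
        } else if (supplied) {
            record[field.name] = queryValues[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        } else {
            record[field.name] = null; // OPTIONAL and not supplied (assumption)
        }
    }
    return record;
}

// Illustrative field names, comparable to calling setText three times.
var fields = [
    { name: "QUERY", obl: OBLIGATORY },
    { name: "MAXPRICE", obl: OPTIONAL },
    { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
var initRecord = buildInitRecord(fields, { QUERY: "laptop" });
```

A query that omits QUERY would be rejected in this model, while MAXPRICE may be left out and COUNTRY never needs to be supplied.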
5317 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate. “true” is returned if there is at least one more result
o next() this returns the next iteration element The list is a sorted sequence of records
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5329
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
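For comparison, the sequential form of the same loop uses only hasNext() and next(). The sketch below runs outside ITPilot, so it defines ListIterator, a hypothetical stand-in that mimics the two Iterator functions described above; the record fields are illustrative.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    // Returns the current record and advances, preserving list order.
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // process one record per pass
}
```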
5318 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the component’s output record list. It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established up front, ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
bull query SQL query that returns the results required by the component
o exec(query, baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established up front, ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
o disablePool() disables the connection pool
o addDriverProperty(propname, propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5319 Loop
bull Description This allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE … DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 535. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6 Using the Loop function
5320 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description this iterates over a series of inter-related pages using one or several browsing sequences
bull Functions
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration
bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords, inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag) this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method
bull flag “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
5321 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this adds the given record to the wrapper output
bull record record to use
5322 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj, name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName, expression, errorAction) method for adding a new field to the record under construction
bull fieldName name of the field
bull expression field definition expression, e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error
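The per-field errorAction of add() can be sketched as follows. This is an illustration only, not the component's implementation: constructRecord is a hypothetical helper, expressions are written as plain JavaScript functions over the recordsObj list (the real component takes expression strings), the constants are redefined so that the code runs outside ITPilot, and leaving a failed field out of the record under ON_ERROR_IGNORE is an assumption.

```javascript
// Stand-in constants; in ITPilot these names are predefined.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical model of Record_Constructor: one definition per add() call.
function constructRecord(recordsObj, fieldDefinitions) {
    var output = {};
    for (var i = 0; i < fieldDefinitions.length; i++) {
        var def = fieldDefinitions[i];
        try {
            output[def.fieldName] = def.expression(recordsObj);
        } catch (e) {
            if (def.errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop the wrapper run
            }
            // ON_ERROR_IGNORE: continue; the field is left out (assumption)
        }
    }
    return output;
}

// Illustrative input records and field definitions.
var recordsObj = [{ PARAM1: "ITPilot", PRICE: "12.5" }, { CURRENCY: "EUR" }];
var built = constructRecord(recordsObj, [
    { fieldName: "NAME", errorAction: ON_ERROR_RAISE,
      expression: function (r) { return r[0].PARAM1; } },
    { fieldName: "AMOUNT", errorAction: ON_ERROR_RAISE,
      expression: function (r) { return parseFloat(r[0].PRICE); } },
    { fieldName: "REGION", errorAction: ON_ERROR_IGNORE,
      // r[2] does not exist, so evaluating this expression fails
      expression: function (r) { return r[2].NAME; } }
]);
```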
5323 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel
bull inputPage optional this allows for a homepage to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5324 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5325 Repeat
bull Description This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 535. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7 Using the Repeat function
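The Loop component (section 5319) evaluates its condition before each pass, whereas Repeat evaluates it after each pass, so the body of a Repeat always runs at least once. A minimal sketch of that difference follows. StubCondition is a hypothetical stand-in so that the code runs outside ITPilot; the real Condition object takes a condition expression (see section 535).

```javascript
// Hypothetical stand-in for Condition: exec() evaluates a plain function.
function StubCondition(test) {
    this.test = test;
}
StubCondition.prototype.exec = function (records) {
    return this.test(records);
};

// A condition that is never met.
var neverMet = new StubCondition(function () { return false; });

// Loop style: condition checked first, body skipped entirely.
var loopRuns = 0;
while (neverMet.exec([])) {
    loopRuns++;
}

// Repeat style: body runs once before the condition is checked.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (neverMet.exec([]));
```

After running this sketch, loopRuns is 0 and repeatRuns is 1.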
5326 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence, sequenceType, reusableConnection, inputPage)
bull sequence NSEQL browsing program (see [NSEQL])
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information
bull inputPage optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly
o exec(inputValues, inputPage) this runs the Sequence component, returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session are imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: This stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: This represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): makes the caller wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
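The queueing behaviour described for setMaxConcurrentThreads can be pictured with a small stand-alone sketch. The TaskPool class below is a hypothetical stand-in written in plain JavaScript with promises; it is not the ITPilot Thread object and says nothing about how ITPilot implements it. It only illustrates the idea that at most a fixed number of tasks run at once while later requests wait in a queue.

```javascript
// Illustrative stand-in, NOT the ITPilot Thread object: a small task pool
// that mimics the queueing behaviour described for setMaxConcurrentThreads.
class TaskPool {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent; // upper bound on parallel tasks
    this.running = 0;                   // tasks currently in progress
    this.queue = [];                    // tasks waiting for a free slot
    this.pending = [];                  // one promise per submitted task
  }

  // Analogue of execute(functionName, <list of arguments>): start or queue.
  execute(fn, ...args) {
    const p = new Promise((resolve, reject) => {
      this.queue.push({ fn, args, resolve, reject });
    });
    this.pending.push(p);
    this._drain();
    return p;
  }

  // Analogue of wait(): resolves once every submitted task has finished.
  wait() {
    return Promise.all(this.pending);
  }

  // Starts queued tasks while there is a free slot.
  _drain() {
    while (this.running < this.maxConcurrent && this.queue.length > 0) {
      const { fn, args, resolve, reject } = this.queue.shift();
      this.running++;
      Promise.resolve()
        .then(() => fn(...args))
        .then(resolve, reject)
        .finally(() => {
          this.running--;
          this._drain(); // a slot was freed: start the next queued task
        });
    }
  }
}
```

With a limit of two, submitting three tasks starts the first two immediately and leaves the third queued until one of them finishes; calling wait() afterwards gives a promise that settles when all three are done.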
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
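As a rough illustration of the naming scheme, the following sketch shows two of the functions for a hypothetical component called “shout” that returns its input in upper case. The component name, the body of the main function and the assumption that the input arrives as a plain value are all invented for this example. STRING_TYPE is declared locally, with the value listed in this section, only so that the sketch runs outside ITPilot. The getInputStructure and getOutputStructure functions are left out because this guide does not show what their bodies look like.

```javascript
// Value taken from the output type list in this section. It is declared
// here only so that the sketch is self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component:
// returns its input converted to upper case, or null if there is no input.
function shout_main(shout_input) {
  var shout_output = null;
  if (shout_input !== null && shout_input !== undefined) {
    shout_output = String(shout_input).toUpperCase();
  }
  return shout_output;
}

// Declares that the component returns a string value.
function shout_getOutputType() {
  return STRING_TYPE;
}
```

Because the output type is neither RECORD_TYPE nor LIST_TYPE, a component like this one would not need an output structure function.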
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(inputElement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8. Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing a VQL statement of the following form:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262, ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045, Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
FIGURES
Figure 1. Example of query execution to a wrapper
Figure 2. ITPilot Custom Function Sample
Figure 3. ITPilot Wrapper Skeleton in JavaScript
Figure 4. Using the ExecuteJS NSEQL command
Figure 5. Using threads in the Iterator component
Figure 6. Using the Loop function
Figure 7. Using the Repeat function
Figure 8. Using custom components from JavaScript
PREFACE
SCOPE
Denodo ITPilot enables easy access to, and extraction of data from, semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.
WHO SHOULD USE THIS DOCUMENT
This document is aimed at developers who want to learn how to develop applications that make the best use of the advanced automation and Web data extraction functionality provided by Denodo ITPilot. Detailed information on installing and managing the system is provided in other manuals, which are referenced as the need arises.
SUMMARY OF CONTENTS
More specifically, this document:
• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.
This manual describes the Java development API, which allows clients to be created that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. This manual also explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. First, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. The other option, described in this section, is to expose the wrappers through Web Services. A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).
The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The ITPilot execution server generates the Web Service as a war file that can be deployed in any J2EE application server. The types of Web services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.
The following sections describe the querying process for these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed with any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:
http://<domain>:<port>/<service_name>/services/<service_name>?wsdl
The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web service version, listing the available operations.
Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported web service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:
http://acme:9090/testWS/rest
http://acme:9090/testWS/html
For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web service, the following URL format should be used:
http://host:port/serviceName/rest/opName.xsd
Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:
http://acme:9090/testWS/rest/getPRODUCTDATA.xsd
The format used to invoke a specific operation in the REST version is the following:
http://host:port/serviceName/rest/opName?paramName1=value1&…&paramNamen=valuen
where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.
Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web service versions, respectively:
http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA
If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a value of 1 for this parameter would be obtained with:
http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
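The invocation format can be assembled programmatically. The helper below is a minimal sketch, assuming the layout http://host:port/serviceName/rest/opName?name=value&… for the REST version and the same layout with html in place of rest. The function name buildOperationUrl and its argument order are made up for this example and are not part of ITPilot. Percent-encoding the parameter names and values is an added precaution that this guide does not mention.

```javascript
// Builds the invocation URL for an operation of a published Web service.
// style is "rest" (XML output) or "html" (HTML table output).
// Hypothetical helper for illustration only; not an ITPilot function.
function buildOperationUrl(host, port, serviceName, style, opName, params) {
  if (style !== "rest" && style !== "html") {
    throw new Error("style must be 'rest' or 'html'");
  }
  var base = "http://" + host + ":" + port + "/" + serviceName +
             "/" + style + "/" + opName;
  // Encode every name and value so that characters such as '&' or spaces
  // inside a value do not break the query string.
  var query = Object.keys(params || {}).map(function (name) {
    return encodeURIComponent(name) + "=" + encodeURIComponent(params[name]);
  }).join("&");
  return query ? base + "?" + query : base;
}
```

For instance, buildOperationUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", { prod_id: 1 }) produces the REST form of the prod_id example given above, and passing no parameters yields the bare operation URL.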
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with certain additional parameters that configure the HTML table used to display the results of the queries. The additional parameters are as follows:
• shownumresults: If this parameter is given the value true, the table displays information on the number of results obtained by the wrapper.
• intervalsize: If this parameter is specified, the results obtained by the wrapper are displayed paginated. The value of the parameter is the number of results to be displayed in each interval.
• maxresults: Maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.
• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text unless the size indicated in this parameter is exceeded. In that case, line breaks are added to divide the text into lines.
• cellheight: Maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: Maximum width of the table, in pixels. If the size is exceeded, a scroll bar is added.
• height: Maximum height of the table, in pixels. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are “true” or “false”.

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports additional tasks such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance can be used to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is made and the associated password.
Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper whose name is specified as the parameter.
• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as the parameter.
• void loadWrapper(String vql): Takes as its input argument the VQL that defines a collection of wrappers, which are loaded into the execution server.
• String getVQL(): Returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string “325”. For float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper's Init component or, in the case of date data types, the date pattern if one was set.
For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require this, a wrapper's schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and again these fields may be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The call to getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether it is a searchable field that is not obtained from the source and so cannot be found in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

set through the API whether or not a wrapper should be maintained automatically by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
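The hierarchical result structure described in this section (the movie example with TITLE, DIRECTOR and EDITIONS) can be pictured as nested data. The sketch below models one result tuple as a plain JavaScript object; it is only a mental model, not the Java classes of the API. The sample values and the helper names isCompound and atomicFieldPaths are invented for the illustration.

```javascript
// Made-up sample data: one result tuple of a hypothetical movie wrapper.
// TITLE and DIRECTOR are atomic; EDITIONS is compound (an array of registers).
var movie = {
  TITLE: "Example Movie",
  DIRECTOR: "Jane Doe",
  EDITIONS: [
    { FORMAT: "DVD", PRICE: 14.99, DESCRIPTION: "Standard edition" },
    { FORMAT: "VHS", PRICE: 5.0, DESCRIPTION: "Director's cut" }
  ]
};

// In this model a field is compound when its value is an array of registers;
// any other value is treated as atomic.
function isCompound(value) {
  return Array.isArray(value);
}

// Walks a register recursively and lists the path of every atomic field.
function atomicFieldPaths(register, prefix) {
  var paths = [];
  Object.keys(register).forEach(function (name) {
    var path = prefix ? prefix + "." + name : name;
    var value = register[name];
    if (isCompound(value)) {
      value.forEach(function (inner, i) {
        paths = paths.concat(atomicFieldPaths(inner, path + "[" + i + "]"));
      });
    } else {
      paths.push(path);
    }
  });
  return paths;
}
```

Calling atomicFieldPaths(movie) lists the two top-level atomic fields followed by the three atomic fields of each edition, which mirrors how a compound field nests registers that are themselves made of fields.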
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time, through the network).
The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior described above, this method must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO, BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): when an error has occurred, returns a textual description of it; otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
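The atomic/compound value hierarchy described above lends itself to a recursive walk. The following is a minimal, self-contained sketch of that idea only: the Register class and the flatten method are hypothetical stand-ins invented for illustration, not the ITPilot ValueVO, ArrayVO or RegisterVO classes, and the sample prices are made-up values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in model (hypothetical): mirrors only the atomic/compound split described above. */
class ValueWalkSketch {

    /** Register: ordered map from field name to an atomic Java value or a List of Registers. */
    static final class Register {
        final Map<String, Object> fields = new LinkedHashMap<String, Object>();

        Register with(String name, Object value) {
            fields.put(name, value);
            return this;
        }
    }

    /** Flattens a register into "path=value" lines, recursing into compound fields. */
    static List<String> flatten(Register register, String prefix) {
        List<String> out = new ArrayList<String>();
        for (Map.Entry<String, Object> field : register.fields.entrySet()) {
            String path = prefix + field.getKey();
            Object value = field.getValue();
            if (value instanceof List) {
                // Compound field: an array of registers, each visited in turn.
                List<?> rows = (List<?>) value;
                for (int i = 0; i < rows.size(); i++) {
                    out.addAll(flatten((Register) rows.get(i), path + "[" + i + "]."));
                }
            } else {
                // Atomic field: emit its value directly.
                out.add(path + "=" + value);
            }
        }
        return out;
    }

    /** Sample data shaped like the TITLE / DIRECTOR / EDITIONS schema (values are invented). */
    static Register sampleMovie() {
        List<Register> editions = new ArrayList<Register>();
        editions.add(new Register().with("FORMAT", "DVD").with("PRICE", Double.valueOf(9.99)));
        editions.add(new Register().with("FORMAT", "VHS").with("PRICE", Double.valueOf(4.5)));
        return new Register()
                .with("TITLE", "Example Title")
                .with("DIRECTOR", "Woody Allen")
                .with("EDITIONS", editions);
    }

    public static void main(String[] args) {
        for (String line : flatten(sampleMovie(), "")) {
            System.out.println(line);
        }
    }
}
```

With the real API, the same shape would presumably be obtained by testing whether a ValueVO is a SimpleVO or an ArrayVO and recursing over the registers returned by getValues().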
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:

TITLE
DIRECTOR
EDITIONS
    FORMAT
    PRICE
    DESCRIPTION

where TITLE and DIRECTOR are optional search fields. A query is then issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;

import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
// Assumption: DoubleVO lives in the same package as SimpleVO, its superclass.
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results; hasNext() guards every call to next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        } finally {
        }
    }
}

Figure 1. Example of query execution on a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions, without dependencies on Denodo libraries, but we recommend developing them using Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading the jar to the server or removing it.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).
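As an illustration only, the table above can be restated as a lookup that also makes the note concrete: the boxed class java.lang.Double is listed, while the basic type double is not. The class ItpTypeTable and its method itpilotTypeOf are hypothetical names introduced for this sketch; they are not part of the ITPilot API.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: restates the Java-to-ITPilot table as a lookup. Not an ITPilot class. */
class ItpTypeTable {

    static final Map<Class<?>, String> JAVA_TO_ITPILOT = new LinkedHashMap<Class<?>, String>();

    static {
        JAVA_TO_ITPILOT.put(Integer.class, "int");
        JAVA_TO_ITPILOT.put(Long.class, "long");
        JAVA_TO_ITPILOT.put(Float.class, "float");
        JAVA_TO_ITPILOT.put(Double.class, "double");
        JAVA_TO_ITPILOT.put(Boolean.class, "boolean");
        JAVA_TO_ITPILOT.put(String.class, "text");
        JAVA_TO_ITPILOT.put(Calendar.class, "date");
        JAVA_TO_ITPILOT.put(byte[].class, "binary");
    }

    /** Returns the ITPilot type name for a Java type, or null if the table does not list it. */
    static String itpilotTypeOf(Class<?> javaType) {
        return JAVA_TO_ITPILOT.get(javaType);
    }

    public static void main(String[] args) {
        System.out.println("java.lang.Double -> " + itpilotTypeOf(Double.class));
        System.out.println("double (basic)   -> " + itpilotTypeOf(double.class));
    }
}
```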
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the result of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
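The naming conventions described earlier in this section need no Denodo classes, so they can be sketched in plain Java. The class below follows the Concat_SampleItpFunction pattern with a variable-arity execute method. The concatenation logic, the choice to skip null inputs, and the main method are assumptions made for illustration; the class is shown without the public modifier only so that the sketch compiles in any file (the guide's own example in section 4.5 declares its class public).

```java
/**
 * Sketch of a custom function defined by naming convention only.
 * Class name = <FunctionName> + "ItpFunction", so the function name is Concat_Sample.
 */
class Concat_SampleItpFunction {

    /**
     * Signature method: must be named "execute". The last (here, only) parameter
     * is an array, which gives the function arity n.
     * Assumption for this sketch: null inputs are skipped.
     */
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {
                joined.append(input);
            }
        }
        return joined.toString();
    }

    // Local smoke test, not required by the naming convention.
    public static void main(String[] args) {
        Concat_SampleItpFunction function = new Concat_SampleItpFunction();
        System.out.println(function.execute("IT", "Pilot"));
    }
}
```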
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on input values, or that return compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as its respective parameter in the execute method, or an equivalent one.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e.:

• If the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
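The pairing of an execution method with its return-type method can be sketched with the naming conventions of section 4.1, again in plain Java. In this hypothetical Upper_Sample function the return type is constant, so the additional method is optional under rule 1; it is included only to show the shape: same parameters as execute, returning java.lang.String.class because execute returns a String. The function itself and its upper-casing behavior are assumptions made for illustration.

```java
import java.util.Locale;

/** Sketch: naming-convention function Upper_Sample with an explicit return-type method. */
class Upper_SampleItpFunction {

    /** Execution method: upper-cases its input; a null input yields null (assumed behavior). */
    public String execute(String value) {
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }

    /**
     * Additional method: same parameter list as execute, named executeReturnType.
     * It reports the return type without doing the actual work of the function.
     */
    public Class<?> executeReturnType(String value) {
        return String.class;
    }

    // Local smoke test, not required by the naming convention.
    public static void main(String[] args) {
        Upper_SampleItpFunction function = new Upper_SampleItpFunction();
        System.out.println(function.execute("itpilot"));
        System.out.println(function.executeReturnType(null).getName());
    }
}
```

For a function returning CustomRecordValue or CustomArrayValue, the additional method would instead return a CustomRecordType or CustomArrayType, as the example in section 4.5 does.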
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {

        if (value == null || regex == null) {
            return null;
        }

        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
                new ArrayList<CustomRecordValue>(result.length);

        for (String string : result) {
            // One single-field record per substring; a fresh map per record
            // so that records do not share state.
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue =
                    CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // Array of records with a single text field named "string"
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2. ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3. ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist [1].

These functions mirror the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

[1] Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still accepted for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).

5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the catalog of section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another argument to its right has to be given. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that the character string must match. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that the character string must match. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when an error of a given type occurs. The onError function can be invoked several times with different errorId values.
  o errorId: the type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the result is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. E.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: the elements on which the condition is evaluated. It must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]".
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical and "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them pointed out.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted content.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether HTML tag attributes are taken into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; “false” means that they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: “true” means that the deleted content is shown. If the value is “false”, the configuration set by the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
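The role of the prefix, suffix and separator labels can be illustrated with a small stand-alone JavaScript sketch. It is not the ITPilot Diff component: the function name sketchDiff, the labels object, the whitespace tokenization and the longest-common-subsequence comparison are all assumptions made for illustration, and the bracket labels stand in for the default HTML tags, which this guide does not spell out.

```javascript
// Hypothetical stand-in for illustration; not the ITPilot Diff component.
// Compares two strings token by token and wraps added and deleted tokens
// in the configured prefix/suffix labels.
function sketchDiff(baseCode, finalCode, labels) {
  // Assumption: tokens are separated by whitespace.
  const a = baseCode.split(/\s+/).filter(Boolean);
  const b = finalCode.split(/\s+/).filter(Boolean);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, emitting common, added and deleted tokens.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
      j++;
    } else {
      out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
      i++;
    }
  }
  return out.join(labels.tokenSeparator);
}

// Illustrative labels only (the real defaults are HTML tags).
const exampleLabels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " "
};
console.log(sketchDiff("a b c", "a x c", exampleLabels)); // prints "a [+x+] [-b-] c"
```

In this sketch, two identical inputs produce a result with no labels at all, which corresponds to the case in which setNullWhenEquals lets the real component return “null”.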
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters (e.g., for a condition, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this component is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. This is “true” by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
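The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarized in a short stand-alone JavaScript sketch. The function chooseSyncStrategy and the shape of its settings object are hypothetical names introduced here, and the fallback of one page when no number of back pages has been set is an assumption; the sketch only mirrors the decision order described above and does not call ITPilot.

```javascript
// Hypothetical helper for illustration; it only models the decision order
// described for syncWithPost, setBackSequence and setBackPages.
function chooseSyncStrategy(settings) {
  if (settings.syncWithPost) {
    // Re-send the POST request that originally led to the page.
    return { method: "POST" };
  }
  if (settings.backSequence) {
    // An explicit NSEQL back sequence takes precedence over Back().
    return { method: "BACK_SEQUENCE", sequence: settings.backSequence };
  }
  // Fall back to the Back() NSEQL command.
  // Assumption: one page is used when no number of pages has been configured.
  return { method: "BACK", pages: settings.backPages || 1 };
}

console.log(chooseSyncStrategy({ syncWithPost: false, backPages: 2 }));
```

The same decision order is described for the Form Iterator and Next Interval Iterator components, which expose functions with the same names.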
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset complying with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
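The behavior described in the note can be pictured with a stand-alone JavaScript sketch. The function sketchFilter, its predicate argument and the "ignore"/"raise" policy strings are hypothetical stand-ins for the real component, its expression and its error handler constants; the sketch shows only that, under an ignore policy, the record that caused the error is dropped and the remaining records are still filtered.

```javascript
// Hypothetical stand-in for illustration; not the ITPilot Filter component.
function sketchFilter(inputRecords, predicate, errorPolicy) {
  var kept = [];
  inputRecords.forEach(function (record) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (e) {
      if (errorPolicy !== "ignore") {
        throw e; // "raise": abort the whole filtering operation
      }
      // "ignore": leave out only the record that caused the error
    }
  });
  return kept;
}

var records = [{ price: 5 }, { price: null }, { price: 20 }, { price: 1 }];
var isExpensive = function (record) {
  if (record.price === null) {
    throw new Error("price is missing");
  }
  return record.price > 3;
};
console.log(sketchFilter(records, isExpensive, "ignore").length); // prints 2
```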
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of the form fields in each iteration.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: “true” means that the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each value element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g., [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: “true” means that the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run, that is, when neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
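The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be sketched in stand-alone JavaScript. This guide states only that the constant generates two combinations for an element; that the combinations of several fields are merged as a cartesian product is an assumption of this sketch, and the function expandCombinations and the three string constants are hypothetical stand-ins defined locally rather than the ITPilot constants themselves.

```javascript
// Local stand-ins for the ITPilot constants (illustration only).
var CLICKED = "CLICKED_ELEMENT";
var NON_CLICKED = "NON_CLICKED_ELEMENT";
var BOTH = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Returns the marked/unmarked states that one setting stands for.
function statesFor(clickedSetting) {
  return clickedSetting === BOTH ? [true, false] : [clickedSetting === CLICKED];
}

// Hypothetical helper: expands { fieldName: setting } into the list of
// combinations to iterate over (assumed here to be a cartesian product).
function expandCombinations(settings) {
  var combos = [{}];
  Object.keys(settings).forEach(function (field) {
    var next = [];
    combos.forEach(function (combo) {
      statesFor(settings[field]).forEach(function (marked) {
        var copy = Object.assign({}, combo);
        copy[field] = marked;
        next.push(copy);
      });
    });
    combos = next;
  });
  return combos;
}

var combinations = expandCombinations({ newsletter: BOTH, terms: CLICKED, offers: BOTH });
console.log(combinations.length); // prints 4
```

Under this assumption, every field configured with both states doubles the number of iterations, which is worth keeping in mind when choosing a value for setMaxIterations.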
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage “cookies”.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optionally used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as part of the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. Specifying a format is optional, but once a format has been specified it becomes compulsory; otherwise, the wrapper may not run. This representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
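The three values of the obl parameter can be pictured with a stand-alone JavaScript sketch that builds an initialization record from a set of field definitions and the parameters of one query. The function resolveInitRecord and the shape of its arguments are hypothetical and do not belong to the Init object; representing a missing optional value as null and raising an error for a missing obligatory value are assumptions of this sketch.

```javascript
// Hypothetical model of OPTIONAL / OBLIGATORY / FIXED fields (illustration only).
function resolveInitRecord(fieldDefs, queryParams) {
  var record = {};
  fieldDefs.forEach(function (def) {
    var supplied = Object.prototype.hasOwnProperty.call(queryParams, def.field);
    if (def.obl === "FIXED") {
      // The constant value always wins, whatever the query supplies.
      record[def.field] = def.fixedValue;
    } else if (supplied) {
      record[def.field] = queryParams[def.field];
    } else if (def.obl === "OBLIGATORY") {
      throw new Error("Missing obligatory parameter: " + def.field);
    } else {
      // OPTIONAL field with no value in this query (assumed to be null).
      record[def.field] = null;
    }
  });
  return record;
}

var fieldDefs = [
  { field: "keyword", obl: "OBLIGATORY" },
  { field: "maxPrice", obl: "OPTIONAL" },
  { field: "country", obl: "FIXED", fixedValue: "ES" }
];
console.log(resolveInitRecord(fieldDefs, { keyword: "laptop" }));
```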
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
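For comparison with Figure 5, the sequential use of hasNext() and next() can be tried outside ITPilot with a stand-in object. The ListIterator constructor below is hypothetical and only imitates the two functions described in this section over a plain JavaScript array; it is not the ITPilot Iterator object.

```javascript
// Hypothetical stand-in that imitates hasNext()/next() over an array.
function ListIterator(list) {
  this.list = list;
  this.position = 0;
}
ListIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
  // Returns the current element and advances to the following one.
  return this.list[this.position++];
};

// Sequential counterpart of the loop in Figure 5: each record is
// processed in list order, one at a time.
var iterator = new ListIterator([{ title: "A" }, { title: "B" }, { title: "C" }]);
var processedTitles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  processedTitles.push(recordInstance.title);
}
console.log(processedTitles.join(",")); // prints "A,B,C"
```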
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: component unique identifier.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE … DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates through different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences that lead to the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 50
o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 51
5.3.21 Output
• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): this adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from already existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor component will return the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; this allows a starting page to be indicated.
  o exec(): this returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser identifier, or a page that identifies a browser, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
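A practical consequence of the do … while form is that the loop operations run at least once before the condition is evaluated. The sketch below shows this with a hypothetical stand-in for Condition (FakeCondition, defined here only so that the snippet runs outside ITPilot). It follows the shape of Figure 7 and is not ITPilot's implementation.

```javascript
// Hypothetical stand-in for ITPilot's Condition object: exec() evaluates a plain JS predicate.
function FakeCondition(predicate) {
    this.predicate = predicate;
}
FakeCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Runs the body once, then keeps repeating while exec() returns true.
function repeatPages(totalPages) {
    var page = 0;
    var processed = [];
    var repeat = new FakeCondition(function () { return page < totalPages; });
    do {
        processed.push("page-" + page);
        page++;
    } while (repeat.exec([]));
    return processed;
}
```

Even repeatPages(0) processes one page, because the condition is only checked after the first pass.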
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; this indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): this closes the connection with the running browser.
  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.
  o setBackSequence(back): this function optionally indicates an explicit browsing sequence for returning to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): this returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: this stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): this function determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): this causes the thread to wait until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): this launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
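As an illustration, the functions could be laid out as follows for a hypothetical component named upper, which would live in a file called upper.js. The component name, the shape of the input (a plain string) and the value returned by upper_getInputStructure() are assumptions made for this sketch. The guide elides the bodies of the structure functions, so [GENER] should be consulted for the real schema format. The type constant is declared locally only so that the snippet runs outside ITPilot.

```javascript
// Value taken from the list of output types; declared here only so the sketch runs standalone.
var STRING_TYPE = 13;

// Main function: receives the component input and returns the component output.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Input schema. The descriptor format below is a placeholder, not ITPilot's real one.
function upper_getInputStructure() {
    return { name: "TEXT", type: STRING_TYPE };
}

// Output type: one of the *_TYPE constants.
function upper_getOutputType() {
    return STRING_TYPE;
}

// upper_getOutputStructure() is omitted because the output type
// is neither RECORD_TYPE nor LIST_TYPE.
```

Keeping every function name prefixed with the component name matches the naming pattern described for mycustom.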
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
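The escaping step can be automated before the statement is sent to the server. The helper below is a sketch under assumptions that should be checked against [VQL]: that the delimiter is the double-quote character (the quote parameter lets you change it), that the escape character is a backslash, and that only occurrences of the delimiter need escaping. The function names are illustrative and are not part of any Denodo API.

```javascript
// Prefixes every occurrence of the delimiter with a backslash
// (backslash as the escape character is an assumption of this sketch).
function escapeForVql(jsCode, quote) {
    return jsCode.split(quote).join("\\" + quote);
}

// Builds the CREATE WRAPPER statement around the escaped script.
// quote is the delimiter character; the double-quote default is an assumption.
function buildCreateWrapper(name, jsCode, quote) {
    quote = quote || '"';
    return "CREATE WRAPPER ITP " + name + " MAINTENANCE FALSE " +
        quote + escapeForVql(jsCode, quote) + quote;
}
```

Making the delimiter a parameter keeps the helper usable if the server expects a different quote character.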
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
PREFACE
SCOPE
Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.
WHO SHOULD USE THIS DOCUMENT
This document is aimed at developers who want to gain an insight into how to develop applications that make the best use of the advanced automation and Web data extraction functionalities provided by Denodo ITPilot. The detailed information required to install and manage the system is provided in other manuals, to which reference will be made as the need arises.
SUMMARY OF CONTENTS
More specifically, this document:
• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.
This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. This manual also explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used from a Java application to access the wrappers, obtain their data structure and run queries on them; this API is described in section 3. Another option is to expose these wrappers through Web Services; this option is described in this section.
A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters, plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).
The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that it can be used by any external application. The types of Web Services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.
The following sections describe the querying process for these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or the .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

    http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains a sample client generated using Apache Axis in the samples/itpilot/itp-clients directory. The README file in this path contains detailed information on how to generate, compile and run the files comprising the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web Service version, listing the available operations.
Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service is testWS, the access URLs for the information page in the REST (XML output) and HTML versions are:

    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the .xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, the following URL format should be used:

    http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service is testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

    http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

    http://host:port/serviceName/rest/opName?paramName1=value1&…&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.
Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service is testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML versions respectively:

    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results when this parameter has the value 1 are obtained with:

    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
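The invocation URLs in this section can be assembled programmatically. The following sketch builds the URL for either version of the service. The function name and the use of encodeURIComponent to escape parameter names and values are choices made for this example; they are not prescribed by ITPilot.

```javascript
// Builds the URL for invoking an operation of an exported Web Service.
// version is "rest" (XML output) or "html" (HTML table output).
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName + "/" + version + "/" + opName;
    var query = params || {};
    var pairs = [];
    for (var key in query) {
        if (Object.prototype.hasOwnProperty.call(query, key)) {
            // Escape names and values so reserved characters do not break the query string.
            pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(query[key]));
        }
    }
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}
```

For instance, buildOperationUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", { prod_id: 1 }) produces the REST URL for the getPRODUCTDATABYPRODID example, and passing "html" as the version gives its HTML counterpart.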
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults: if this parameter is given the value true, the table displays information on the number of results obtained by the wrapper.
• intervalsize: if this parameter is indicated, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: this indicates the maximum number of results to be displayed. If the wrapper run returns more results than those indicated, all excess results are discarded.
• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table is adapted to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.
• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: this specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: this specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a maximum pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
When the Web Service operations have been exported, there are some parameters that can be used to configure the connection pool used by the Web Services to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: this parameter is used to enable or disable the connection pool. The possible values are “true” or “false”.

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Amongst other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Amongst other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing queries. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users in Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made and the associated password.

Note that, even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' will be accessed, using the predefined user 'admin' (with the initial password 'admin').
3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that, if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database will be obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): deletes the wrapper whose name is specified as parameter from the server.
• void loadWrapper(String vql): takes as input argument the VQL that defines a collection of wrappers and loads them in the execution server.
• String getVQL(): returns the VQL description of all the wrappers in the ITPilot execution server.
3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper belong to another type. For example, if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it, we must pass the "325" string. In the case of float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require this, a wrapper schema can be obtained using the method
HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may in turn be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether its value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods
void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
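Returning to the query parameters described at the start of this section: since every value is passed as a string, numeric values have to be rendered the way the wrapper expects. The sketch below assumes that the internationalization configuration of the wrapper behaves like a java.util.Locale, which is an assumption rather than something stated in this guide. The class QueryParamsSketch, the parameter name MAX_PRICE and the value 3.25 are illustrative only.

```java
import java.text.NumberFormat;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class QueryParamsSketch {

    // Renders a number as a string using the decimal separator of the given locale.
    // Assumption: the wrapper's i18n configuration maps to a java.util.Locale.
    static String asParam(double value, Locale wrapperLocale) {
        NumberFormat format = NumberFormat.getNumberInstance(wrapperLocale);
        format.setGroupingUsed(false);        // no thousands separators
        format.setMaximumFractionDigits(10);  // keep the decimals of typical values
        return format.format(value);
    }

    public static void main(String[] args) {
        // All values are strings, whatever type the wrapper parameter has.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("DIRECTOR", "Woody Allen");
        params.put("MAX_PRICE", asParam(3.25, Locale.US)); // hypothetical float parameter
        System.out.println(params);                         // {DIRECTOR=Woody Allen, MAX_PRICE=3.25}

        // The same value for a wrapper configured with a decimal-comma locale.
        System.out.println(asParam(3.25, Locale.GERMANY));  // 3,25
    }
}
```

A map built this way could then be handed to query(Map params); the important point is that the string matches the wrapper's configuration, not the locale of the client machine.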
3.4 PROCESSING QUERY RESULTS

The query method used to execute queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time, through the network).
The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be called before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data are available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• boolean checkErrors(): checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): if an error has occurred, returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
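The calling pattern described above (hasNext() before every next(), then an error check once iteration ends) can be tried out without a server. The class below is a hypothetical stand-in, not the Denodo HTMLWrapperResultIterator: it is backed by a queue that a producer thread fills, and it assumes that hasNext() waits until a row is available or the query has finished.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for an asynchronous result iterator.
class AsyncResultsSketch implements Iterator<String> {
    private static final String END = new String("END"); // unique end-of-results marker
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private volatile String error; // set by the producer if the query fails
    private String lookahead;      // row fetched by hasNext() and not yet consumed
    private boolean finished;

    // Producer side: called as rows arrive from the source.
    void publish(String row) { queue.add(row); }
    void complete() { queue.add(END); }
    void fail(String description) { error = description; queue.add(END); }

    // Waits until a row is available or the producer has finished.
    @Override
    public boolean hasNext() {
        if (lookahead != null) return true;
        if (finished) return false;
        try {
            String item = queue.take();
            if (item == END) { finished = true; return false; }
            lookahead = item;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = true;
            return false;
        }
    }

    // Returns the row made available by hasNext(); fails if none is ready.
    @Override
    public String next() {
        if (lookahead == null) throw new NoSuchElementException("call hasNext() first");
        String row = lookahead;
        lookahead = null;
        return row;
    }

    boolean checkErrors() { return error != null; }
    String getErrorDescription() { return error; }

    public static void main(String[] args) {
        AsyncResultsSketch results = new AsyncResultsSketch();
        new Thread(() -> {
            results.publish("Annie Hall");
            results.publish("Manhattan");
            results.fail("connection to source lost");
        }).start();

        while (results.hasNext()) {            // always check before next()
            System.out.println(results.next());
        }
        if (results.checkErrors()) {           // errors are checked after iteration
            System.out.println("Error: " + results.getErrorDescription());
        }
    }
}
```

Checking for errors only after the loop mirrors the order used in the example of section 3.5: rows that arrived before the failure are still processed.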
3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()
3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in section 3.3:

TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields.

Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

public class ITPilotExample {

    public static void main(String[] args) {
        try {
            // Connect to the server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get the wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare the query parameters (all values are passed as strings)
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute the query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate over the results; hasNext() must be called before each next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print the atomic fields TITLE and DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get the EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over the EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only non-text field: it is a double
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors once all the available results have been processed
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server: " + e.getMessage());
        }
    }
}
Figure 1: Example of query execution to a wrapper
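The navigation in Figure 1 is written for one known schema. For schemas of arbitrary depth the same walk can be expressed recursively. The sketch below illustrates the idea with plain Java collections standing in for the result classes (a register is a Map, a compound field is a List of Maps); it does not use the Denodo ValueVO classes, and the sample values are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-collection stand-in for hierarchical results; not the Denodo classes.
class HierarchyWalkSketch {

    // Appends one line per atomic field; compound fields are expanded recursively, indented.
    @SuppressWarnings("unchecked")
    static void render(Map<String, Object> register, String indent, StringBuilder out) {
        for (Map.Entry<String, Object> field : register.entrySet()) {
            Object value = field.getValue();
            if (value instanceof List) {
                // Compound field: an array of registers
                out.append(indent).append(field.getKey()).append(":\n");
                for (Map<String, Object> element : (List<Map<String, Object>>) value) {
                    render(element, indent + "  ", out);
                }
            } else {
                // Atomic field
                out.append(indent).append(field.getKey()).append(": ").append(value).append("\n");
            }
        }
    }

    // Hypothetical sample tuple following the Movies schema.
    static Map<String, Object> sampleMovie() {
        Map<String, Object> edition = new LinkedHashMap<>();
        edition.put("FORMAT", "DVD");
        edition.put("PRICE", 9.99);
        edition.put("DESCRIPTION", "Standard edition");

        List<Map<String, Object>> editions = new ArrayList<>();
        editions.add(edition);

        Map<String, Object> movie = new LinkedHashMap<>();
        movie.put("TITLE", "Annie Hall");
        movie.put("DIRECTOR", "Woody Allen");
        movie.put("EDITIONS", editions);
        return movie;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        render(sampleMovie(), "", out);
        System.out.print(out);
    }
}
```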
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file and added to ITPilot, that can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be developed using naming conventions alone, without any dependency on Denodo libraries, but we recommend using Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, i.e. different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:
Java                   ITPilot
java.lang.Integer      int
java.lang.Long         long
java.lang.Float        float
java.lang.Double       double
java.lang.Boolean      boolean
java.lang.String       text
java.util.Calendar     date
byte[]                 binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
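The table and the note can be captured in a few lines of Java. The helper below is a hypothetical illustration, not part of the Denodo libraries: it looks up the ITPilot type name for a Java parameter type using the table above, and rejects primitive types as the note requires.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

class TypeMappingSketch {

    // Java class -> ITPilot type name, copied from the equivalency table.
    static final Map<Class<?>, String> ITPILOT_TYPES = new LinkedHashMap<>();
    static {
        ITPILOT_TYPES.put(Integer.class, "int");
        ITPILOT_TYPES.put(Long.class, "long");
        ITPILOT_TYPES.put(Float.class, "float");
        ITPILOT_TYPES.put(Double.class, "double");
        ITPILOT_TYPES.put(Boolean.class, "boolean");
        ITPILOT_TYPES.put(String.class, "text");
        ITPILOT_TYPES.put(Calendar.class, "date");
        ITPILOT_TYPES.put(byte[].class, "binary");
    }

    // Returns the ITPilot name of a simple type; primitives and unknown types are rejected.
    static String itpilotType(Class<?> javaType) {
        if (javaType.isPrimitive()) {
            throw new IllegalArgumentException(
                    javaType + " is a primitive type; use its wrapper class instead");
        }
        String name = ITPILOT_TYPES.get(javaType);
        if (name == null) {
            throw new IllegalArgumentException(
                    javaType.getName() + " is not one of the simple types in the table");
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(itpilotType(Integer.class));  // int
        System.out.println(itpilotType(Calendar.class)); // date
        System.out.println(itpilotType(byte[].class));   // binary
        try {
            itpilotType(int.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Compound types and page values, described later in this chapter, are also valid parameter types; the sketch covers only the simple types listed in the table.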
4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional parameter, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation that carries the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
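As a point of comparison with the annotations, the naming conventions described at the start of this section need no Denodo classes at all. The following is a minimal sketch of the Concat_SampleItpFunction class mentioned above, with the variable-arity execute method; the body of the method (plain concatenation, skipping nulls) is an assumption made for illustration.

```java
// Naming-convention custom function: the class name minus "ItpFunction" gives
// the function name (Concat_Sample).
// Declared without "public" only so that this sketch compiles in any file;
// a deployed class would normally be public, as in Figure 2.
class Concat_SampleItpFunction {

    // Variable-arity signature: the last (here, only) parameter can be repeated n times.
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {   // skipping nulls is a choice of this sketch
                joined.append(input);
            }
        }
        return joined.toString();
    }

    // Only here to try the class out; not part of the convention.
    public static void main(String[] args) {
        Concat_SampleItpFunction concat = new Concat_SampleItpFunction();
        System.out.println(concat.execute("IT", "Pilot", " 4.6")); // ITPilot 4.6
    }
}
```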
4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or that return compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, i.e. their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained directly from the method).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
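The simplest pairing described above (execute returns a String, the additional method returns java.lang.String.class) looks as follows when written with the naming conventions of section 4.1. The class Upper_SampleItpFunction is hypothetical; for a fixed return type like this one the additional method is optional, and it is shown only to illustrate how the two signatures line up.

```java
import java.util.Locale;

// Hypothetical naming-convention function named Upper_Sample.
// Declared without "public" only so that this sketch compiles in any file.
class Upper_SampleItpFunction {

    // Function signature: one text argument, text result.
    public String execute(String value) {
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }

    // Same parameter list as execute (rules 2 and 3); returns the Java class
    // that corresponds to the String returned by execute.
    public Class<String> executeReturnType(String value) {
        return String.class;
    }

    // Only here to try the class out.
    public static void main(String[] args) {
        Upper_SampleItpFunction upper = new Upper_SampleItpFunction();
        System.out.println(upper.executeReturnType("movies")); // class java.lang.String
        System.out.println(upper.execute("movies"));           // MOVIES
    }
}
```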
4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor
    public CustomArrayValue split_sample(
            @CustomParam(name = "regexp") String regex,
            @CustomParam(name = "value") String value) {

        if (value == null || regex == null) {
            return null;
        }

        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
                new ArrayList<CustomRecordValue>(result.length);

        for (String string : result) {
            // One single-field record per substring; a fresh map is used for
            // each record so that records never share state
            LinkedHashMap<String, Object> fields = new LinkedHashMap<String, Object>(1);
            fields.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(fields);
            arrayValues.add(recordValue);
        }

        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (an array of records with one text field)
    // without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with edition 3 of the ECMA-262 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components, described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:

1. main() function: it is the only mandatory one and contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (see note 1).

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

Note 1: Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on, in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if an argument to its right has to be specified. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: this represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see section 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that governs the generation of the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the field is mandatory. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
The following functions are common to all or almost all components and are therefore described once in this first section. The catalog indicates the components that do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): informs the component of how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" when an error occurs, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return "null", but it is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
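The four errorAction values can be easier to compare side by side in a small model. The sketch below is plain JavaScript and is not ITPilot code: runWithPolicy is a hypothetical helper, the constants are ordinary strings that reuse the names above, and it assumes that ON_ERROR_RETRY raises the last error once the retries are exhausted (the text above does not state this explicitly).

```javascript
// Illustrative model only: this is NOT the ITPilot implementation.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
const ON_ERROR_RETRY = "ON_ERROR_RETRY";
const ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical helper: runs `task` and applies `action` when it throws.
// `retries` is the number of extra attempts made by the two RETRY actions.
function runWithPolicy(task, action, retries = 0) {
  const retrying =
    action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
  const attempts = retrying ? retries + 1 : 1;
  let lastError = null;
  for (let i = 0; i < attempts; i++) {
    try {
      return task();
    } catch (err) {
      lastError = err; // remember the failure and try again if allowed
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null; // the run continues and the component yields null
  }
  // ON_ERROR_RAISE, or ON_ERROR_RETRY after the last retry (assumption).
  throw lastError;
}

// A task that fails twice (for example, with a timeout) and then succeeds.
let calls = 0;
const flaky = () => {
  calls += 1;
  if (calls < 3) {
    throw new Error("TIMEOUT_ERROR");
  }
  return "page";
};
const result = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
```

With three retries available, the third attempt succeeds, so `result` holds the value returned by the task and `calls` ends at 3.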
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message is written to the log and 5 means that all message types are written to the log file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false", depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.
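To make the positional placeholders concrete, the following plain-JavaScript sketch models how $0 and $1 could be bound to the elements list. It is not the ITPilot Condition object: evaluateCondition is a hypothetical name, and the model only understands a single comparison between two placeholders, not the function library of Appendix A of [GENER].

```javascript
// Illustrative model only, not the ITPilot Condition component.
const OPERATORS = {
  "<=": (a, b) => a <= b,
  ">=": (a, b) => a >= b,
  "<": (a, b) => a < b,
  ">": (a, b) => a > b,
};

// Hypothetical helper: binds $n to elements[n] and applies the comparison.
function evaluateCondition(expr, elements) {
  const match = /^\(?\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
  if (match === null) {
    throw new Error("Unsupported expression: " + expr);
  }
  const left = elements[Number(match[1])];
  const right = elements[Number(match[3])];
  return OPERATORS[match[2]](left, right);
}

// $0 is the first element of the list and $1 the second.
const inOrder = evaluateCondition("($0 <= $1)", [3, 7]);
const outOfOrder = evaluateCondition("($0 <= $1)", [9, 7]);
```

Here `inOrder` is true because 3 is not greater than 7, and `outOfOrder` is false.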
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag that marks added content.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag that marks added content.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag that marks deleted content.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag that marks deleted content.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): the component will not take into account the HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false", they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
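The effect of setIgnoreTagAttributes, setCaseInsensitive and addIgnoredToken on a comparison can be pictured as a normalization pass over the HTML tokens of both pages. The sketch below is a plain-JavaScript model under stated assumptions, not the ITPilot Diff implementation: tokenize, normalize, samePage and the option names are hypothetical, JavaScript regular expressions stand in for the Perl ones, and the order in which the real component applies these steps is not documented here.

```javascript
// Illustrative model only, not the ITPilot Diff component.
// Splits HTML into tag tokens and text tokens.
function tokenize(html) {
  return html.match(/<[^>]*>|[^<]+/g) || [];
}

// `options` mirrors the setters described above; the names are hypothetical.
function normalize(html, options) {
  return tokenize(html)
    .map((token) => token.trim())
    .filter((token) => token.length > 0)
    .map((token) => {
      let out = token;
      if (options.ignoreTagAttributes) {
        // '<a href="x">' becomes '<a>'; end tags and text are untouched.
        out = out.replace(/^<\s*([A-Za-z][A-Za-z0-9]*)\b[^>]*>$/, "<$1>");
      }
      if (options.caseInsensitive) {
        out = out.toLowerCase();
      }
      return out;
    })
    .filter((token) => !options.ignoredTokens.some((re) => re.test(token)));
}

// Two pages are "identical" when their normalized token lists match.
function samePage(baseCode, finalCode, options) {
  const a = normalize(baseCode, options);
  const b = normalize(finalCode, options);
  return a.length === b.length && a.every((token, i) => token === b[i]);
}

const options = {
  ignoreTagAttributes: true,
  caseInsensitive: true,
  ignoredTokens: [/^\d{2}:\d{2}$/], // e.g. a clock that changes on each load
};
const basePage = '<DIV class="a"><b>News</b> 10:15</DIV>';
const finalPage = '<div class="b"><b>news</b> 10:47</div>';
const identical = samePage(basePage, finalPage, options);
```

With all three options active, the two sample pages normalize to the same token list, so `identical` is true; with the options disabled they differ in tag attributes, capitalization and the clock token.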
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    // Sequence component whose NSEQL program is a single ExecuteJS command
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    // Stop the wrapper on any of the five error types
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    // Up to 3 retries, waiting 3000 ms between them
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): sets the internationalization configuration used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it has not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further data extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
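The way syncWithPost, setBackSequence and setBackPages interact reads more clearly as a decision function. The following plain-JavaScript sketch only restates the prose above: chooseBackStrategy and its option names are hypothetical and are not part of the ITPilot API, and the fallback of one page when no count is given is an assumption.

```javascript
// Illustrative model only: a decision function that mirrors the prose.
// The option names are hypothetical and are not ITPilot properties.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-send the POST request that originally produced the page.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence takes precedence over Back().
    return { kind: "SEQUENCE", nseql: config.backSequence };
  }
  // Neither option applies: fall back to the Back() NSEQL command.
  // Defaulting to one page is an assumption of this sketch.
  return { kind: "BACK", pages: config.backPages || 1 };
}

const viaPost = chooseBackStrategy({ syncWithPost: true });
const viaSequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back sequence>",
});
const viaBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

In this model `viaPost.kind` is "POST", `viaSequence.kind` is "SEQUENCE" and `viaBack` asks for Back() over two pages.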
5.3.13 Filter
• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: if the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements, except for the one that caused the error.
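The behavior described in the note can be modeled in a few lines of plain JavaScript. This is a sketch, not the ITPilot Filter component: filterIgnoringErrors and isCheap are hypothetical names, and an ordinary predicate function stands in for the filter expression.

```javascript
// Illustrative model only, not the ITPilot Filter component.
function filterIgnoringErrors(inputRecords, predicate) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (err) {
      // ON_ERROR_IGNORE as described in the note: drop only the record
      // whose evaluation failed and keep processing the rest.
    }
  }
  return kept;
}

// Hypothetical predicate that fails on records without a numeric price.
const isCheap = (record) => {
  if (typeof record.price !== "number") {
    throw new Error("price is not numeric");
  }
  return record.price <= 20;
};

const records = [{ price: 10 }, { price: null }, { price: 25 }, { price: 15 }];
const cheap = filterIgnoringErrors(records, isCheap);
```

The record with a null price raises an error and is skipped, the record priced at 25 fails the condition, and `cheap` ends up holding the records priced at 10 and 15.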
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: this generates a run loop for a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equals: Boolean value that indicates whether each provided value must be identical to the field value ("true") or may simply be contained in it ("false").
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with "true", the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
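The iteration protocol (hasNext, next, close) and the way field values multiply into iterations can be pictured with a small stand-alone model. The sketch below is plain JavaScript and is not the ITPilot Form_Iterator object: FormCombinationIterator is a hypothetical class, its next() returns the combination of field values for one run instead of a page, and CLICKED_AND_NON_CLICKED_ELEMENT is modeled simply as a checkbox with two candidate states.

```javascript
// Illustrative model only, not the ITPilot Form_Iterator object.
class FormCombinationIterator {
  // fields: array of { name, values }, one entry per form field.
  constructor(fields, maxIterations = Infinity) {
    this.fields = fields;
    this.total = fields.reduce((n, f) => n * f.values.length, 1);
    this.limit = Math.min(this.total, maxIterations);
    this.index = 0;
    this.closed = false;
  }

  hasNext() {
    return !this.closed && this.index < this.limit;
  }

  next() {
    if (!this.hasNext()) {
      throw new Error("No more combinations");
    }
    // Decode the running index as a mixed-radix number, one digit per field.
    let rest = this.index++;
    const combination = {};
    for (let i = this.fields.length - 1; i >= 0; i--) {
      const { name, values } = this.fields[i];
      combination[name] = values[rest % values.length];
      rest = Math.floor(rest / values.length);
    }
    return combination;
  }

  close() {
    this.closed = true;
  }
}

const iterator = new FormCombinationIterator([
  { name: "city", values: ["Madrid", "Palo Alto"] },
  // Two states for one checkbox: marked and unmarked.
  { name: "onlyOpen", values: [true, false] },
]);
const combinations = [];
while (iterator.hasNext()) {
  combinations.push(iterator.next());
}
iterator.close();
```

Two values for one field and two states for the other give four iterations. Passing a second constructor argument caps the number of runs, in the spirit of setMaxIterations.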
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: this is responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as part of the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional but, once specified, it must be respected; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
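The effect of the three obl values can be pictured with a small stand-in that runs outside ITPilot. This is only a sketch and not the Init component itself: the constant values and the helpers defineField and checkQuery are hypothetical names introduced here to show how OPTIONAL, OBLIGATORY and FIXED fields would constrain the values supplied in a query.

```javascript
// Hypothetical stand-ins for the obl constants; the real values are defined by ITPilot.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Describes one initialization field (illustrative helper, not part of ITPilot).
function defineField(field, obl, fixedValue) {
    return { field: field, obl: obl || OPTIONAL, fixedValue: fixedValue };
}

// Applies the field definitions to the values supplied in a query.
// Returns the resulting record, or throws if an obligatory field has no value.
function checkQuery(fields, queryValues) {
    var record = {};
    fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.field] = f.fixedValue;            // constant value, query value ignored
        } else if (Object.prototype.hasOwnProperty.call(queryValues, f.field)) {
            record[f.field] = queryValues[f.field];    // value supplied in the query
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory field: " + f.field);
        } else {
            record[f.field] = null;                    // optional and not supplied
        }
    });
    return record;
}

var fields = [
    defineField("PRODUCT_ID", OBLIGATORY),
    defineField("MAX_PRICE"),              // defaults to OPTIONAL
    defineField("ACTIVE", FIXED, true)
];
var record = checkQuery(fields, { PRODUCT_ID: 42 });
```

In this sketch a query that omits PRODUCT_ID is rejected, while MAX_PRICE may be left out and ACTIVE always takes its fixed value.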
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5. Using threads in the Iterator component
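The iteration pattern of Figure 5 can be tried outside ITPilot with minimal stand-ins. In the sketch below, IteratorStub and ThreadStub are hypothetical replacements for the Iterator and Thread components (the stub thread simply runs each function immediately instead of concurrently), and processRecord is an illustrative function whose parameters match the arguments passed to execute.

```javascript
// Minimal stand-ins so the pattern runs outside ITPilot (hypothetical, not the real components).
function IteratorStub(list) { this.list = list; this.pos = 0; }
IteratorStub.prototype.hasNext = function () { return this.pos < this.list.length; };
IteratorStub.prototype.next = function () { return this.list[this.pos++]; };

// Runs each function immediately; the real Thread object runs them concurrently.
function ThreadStub() { this.calls = 0; }
ThreadStub.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.calls++;
    fn.apply(null, args);
};
ThreadStub.prototype.wait = function () { /* nothing pending in this sequential stand-in */ };

var results = [];
// Function executed once per record; its parameters match the arguments given to execute().
function processRecord(structureInstance, recordInstance) {
    results.push(structureInstance.prefix + recordInstance.NAME);
}

var iterator = new IteratorStub([{ NAME: "a" }, { NAME: "b" }, { NAME: "c" }]);
var structureInstance = { prefix: "item-" };
var thread = new ThreadStub();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    thread.execute(processRecord, structureInstance, recordInstance);
}
thread.wait(); // in ITPilot, this blocks until every launched execution has finished
```

With the real Thread object the order in which records are processed is not guaranteed, so the function run by execute should not depend on it.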
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6. Using the Loop function
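The structure of Figure 6 can be exercised outside ITPilot by replacing the Condition object with a stand-in. In this sketch, ConditionStub is hypothetical: it evaluates a plain JavaScript predicate instead of an ITPilot condition expression, and its onError method does nothing. The error constants are passed as strings only because the real constants are not available outside ITPilot.

```javascript
// Hypothetical stand-in for Condition: evaluates a JavaScript predicate
// instead of an ITPilot condition expression.
function ConditionStub(predicate) { this.predicate = predicate; }
ConditionStub.prototype.onError = function (errorType, action) { /* ignored in this stand-in */ };
ConditionStub.prototype.exec = function (inputs) { return Boolean(this.predicate(inputs)); };

var pageNumber = 1;
var visited = [];
var loop = new ConditionStub(function () { return pageNumber <= 3; });
loop.onError("RUNTIME_ERROR", "ON_ERROR_RAISE");
while (loop.exec([])) {          // the condition is checked before each pass
    visited.push(pageNumber);    // stands for <loop operations>
    pageNumber++;
}
```

Because the condition is evaluated before the body, the loop operations are skipped entirely when the condition is false on the first check.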
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of related pages, using either a single browsing sequence or several different ones.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
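The order in which syncWithPost, setBackSequence and setBackPages interact, as described above, can be summarized as a small decision function. This is a sketch of the documented precedence only, not ITPilot code: chooseBackStrategy, the shape of its config argument and the default of one back page are assumptions made for illustration.

```javascript
// Returns a description of how the component would get back to the source page.
// Illustrative only: the config object and its property names are hypothetical.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Default synchronization: re-issue the POST with which the page was reached.
        return { strategy: "POST" };
    }
    if (config.backSequence) {
        // An explicit back sequence was set with setBackSequence.
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Otherwise the NSEQL Back() command is run; setBackPages gives the number of pages.
    // The default of 1 page is an assumption of this sketch.
    return { strategy: "NSEQL_BACK", pages: config.backPages || 1 };
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>" });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

The same precedence applies to the Sequence component, which offers the same three methods.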
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given record to the wrapper output, to be used as a wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
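How a field expression refers to the input records can be sketched with a small resolver. The code below is not ITPilot's expression engine: it assumes that references have the form $<index>.<FIELD>, supports nothing else, and models records as plain objects. The names resolveReference and buildRecord are hypothetical, and skipping a failed field under ON_ERROR_IGNORE is an assumption of this sketch.

```javascript
// Hypothetical stand-ins for the error action constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a reference of the form $<index>.<FIELD> against a list of input records.
function resolveReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[Number(match[1])];
    if (record === undefined) {
        throw new Error("No input record at position " + match[1]);
    }
    return record[match[2]];
}

// Builds an output record from field definitions ({ fieldName, expression, errorAction }).
function buildRecord(recordsObj, fieldDefs) {
    var out = {};
    fieldDefs.forEach(function (def) {
        try {
            out[def.fieldName] = resolveReference(def.expression, recordsObj);
        } catch (e) {
            if (def.errorAction === ON_ERROR_RAISE) { throw e; }
            // ON_ERROR_IGNORE: skip this field and continue (assumption of this sketch).
        }
    });
    return out;
}

var inputs = [{ PARAM1: "books" }, { PRICE: 12.5 }];
var built = buildRecord(inputs, [
    { fieldName: "CATEGORY", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "PRICE", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { fieldName: "BROKEN", expression: "$5.X", errorAction: ON_ERROR_IGNORE }
]);
```

Here $0 is the first element of recordsObj and $1 the second; the third field refers to a record that does not exist and is dropped because its error action is ON_ERROR_IGNORE.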
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7. Using the Repeat function
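The practical difference between the Loop structure (Figure 6) and the Repeat structure (Figure 7) is when the condition is evaluated. The sketch below makes this visible with a hypothetical ConditionStub that wraps a plain JavaScript predicate; it is a stand-in for the Condition object, and countPasses is an illustrative helper.

```javascript
// Hypothetical stand-in for Condition: evaluates a JavaScript predicate.
function ConditionStub(predicate) { this.predicate = predicate; }
ConditionStub.prototype.exec = function (inputs) { return Boolean(this.predicate(inputs)); };

// Counts how many times the body runs in a pre-test (Loop) and a post-test (Repeat) structure.
function countPasses(condition) {
    var loopPasses = 0;
    var repeatPasses = 0;
    while (condition.exec([])) {      // Loop: condition checked before the body
        loopPasses++;
    }
    do {                              // Repeat: body runs once before the first check
        repeatPasses++;
    } while (condition.exec([]));
    return { loop: loopPasses, repeat: repeatPasses };
}

var neverTrue = new ConditionStub(function () { return false; });
var passes = countPasses(neverTrue);
```

With a condition that is false from the start, the Loop body never runs, whereas the Repeat body runs exactly once.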
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical wrapper generation interface, it becomes a JavaScript function that is invoked from the position it occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
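The retry settings of this component (setRetries and setRetryDelay) can be pictured with a generic retry helper. This is a sketch under stated assumptions, not the component's implementation: runWithRetries is a hypothetical function, "retries" is taken to mean additional attempts after the first failure, and the delay function is injected so that the example stays synchronous.

```javascript
// Calls action() until it succeeds or the retries are exhausted.
// retries: number of additional attempts after the first failure (assumption of this sketch).
// sleep: function that waits the given number of milliseconds; injected to keep the sketch testable.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) { throw e; }   // no retries left: propagate the failure
            attempt++;
            sleep(retryDelayMs);                   // wait before the next attempt
        }
    }
}

var waits = [];
var page = runWithRetries(function (attempt) {
    if (attempt < 2) { throw new Error("page not loaded"); }  // fails on the first two attempts
    return "page after " + attempt + " retries";
}, 3, 500, function (ms) { waits.push(ms); });
```

In the example the action succeeds on its third attempt, after two waits of 500 milliseconds each.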
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
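Put together, the four functions described above could look like the following skeleton for a component named “mycustom”. This is a minimal sketch, not a tested ITPilot component: records and structures are modelled as plain JavaScript objects, which is an assumption made so that the file runs on its own (the real record and structure objects are described in section 5.3.2), and the upper-casing logic is purely illustrative. The type codes are the ones listed above.

```javascript
// Output type codes, as listed above (only the ones used in this sketch).
var RECORD_TYPE = 3, STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Illustrative logic: upper-case the NAME field of the input record.
    mycustom_output = { NAME: String(mycustom_input.NAME).toUpperCase() };
    return mycustom_output;
}

// Input schema. A plain object is used as a placeholder for the real structure definition.
function mycustom_getInputStructure() {
    return { NAME: STRING_TYPE };
}

// Output type of the component.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Output structure; needed here because the output type is RECORD_TYPE.
function mycustom_getOutputStructure() {
    return { NAME: STRING_TYPE };
}

var sampleOutput = mycustom_main({ NAME: "denodo" });
```

All four function names share the component name as prefix, and the file would be saved with that same name and the .js extension.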
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8. Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
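The try… finally structure of Figure 8 guarantees that the scope is closed whether or not the component fails. The sketch below shows this with stand-ins that run outside ITPilot: SCOPE_STUB and CustomComponentStub are hypothetical replacements for SCOPE and CUSTOM_COMPONENT, and the component and type names are illustrative.

```javascript
var log = [];

// Hypothetical stand-in for SCOPE: only records the calls it receives.
var SCOPE_STUB = {
    create: function () { log.push("create"); },
    close: function () { log.push("close"); }
};

// Hypothetical stand-in for CUSTOM_COMPONENT.
function CustomComponentStub(type) { this.type = type; this.name = null; }
CustomComponentStub.prototype.setComponentName = function (name) { this.name = name; };
CustomComponentStub.prototype.exec = function (input) {
    if (input === null) { throw new Error(this.name + ": no input"); }
    log.push("exec");
    return { component: this.name, input: input };
};

// Same structure as Figure 8, wrapped in a function so it can be called twice.
function runCustom(input) {
    try {
        SCOPE_STUB.create();
        var mycustom = new CustomComponentStub("mycustom");
        mycustom.setComponentName("mycustom_1");
        return mycustom.exec(input);
    } finally {
        SCOPE_STUB.close();   // runs on success and on error
    }
}

var output = runCustom({ NAME: "abc" });
var failed = false;
try { runCustom(null); } catch (e) { failed = true; }
```

After both calls, the log shows that close was invoked once per create, including for the call in which exec raised an error.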
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward: the following VQL statement has to be written:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside the code, they must be escaped with the ‘\’ character.
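The escaping step mentioned in the note can be automated when the statement is built programmatically. The helper below is a sketch under two assumptions that should be checked against [VQL]: that the delimiter is a double quote and that the escape character is a backslash. The function names escapeForVql and createWrapperStatement, and the wrapper name, are hypothetical.

```javascript
// Escapes every occurrence of the delimiter with a backslash.
// Assumptions of this sketch: double-quote delimiter, backslash escape character.
function escapeForVql(jsCode, delimiter) {
    delimiter = delimiter || '"';
    return jsCode.split(delimiter).join("\\" + delimiter);
}

// Builds the CREATE WRAPPER statement around the escaped JavaScript code.
function createWrapperStatement(name, jsCode) {
    return 'CREATE WRAPPER ITP ' + name + ' "' + escapeForVql(jsCode) + '"';
}

var statement = createWrapperStatement("mywrapper", 'var s = "hello";');
```

Only the delimiter itself is escaped here; code that already contains backslashes may need additional handling.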
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
FIGURES
Figure 1. Example of query execution to a wrapper
Figure 2. ITPilot Custom Function Sample
Figure 3. ITPilot Wrapper Skeleton in JavaScript
Figure 4. Using the ExecuteJS NSEQL command
Figure 5. Using threads in the Iterator component
Figure 6. Using the Loop function
Figure 7. Using the Repeat function
Figure 8. Using custom components from JavaScript
PREFACE
SCOPE
Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.
WHO SHOULD USE THIS DOCUMENT
This document is aimed at developers who want to learn how to develop applications that make the best use of the advanced automation and Web data extraction functionalities provided by Denodo ITPilot. The detailed information required to install and manage the system is provided in other manuals, to which reference will be made as the need arises.
SUMMARY OF CONTENTS
More specifically, this document:
• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot. This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways Firstly the native ITPilot Java API can be used to access the wrappers obtain their data structure and run queries on them from a Java application Their description can be found in section 3 Another option is to expose these wrappers through Web Services This latter option is described in this section A Web Service containing the following operations can be generated for a particular wrapper
bull An operation containing all searchable and compulsory parameters
bull Optionally another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER])
The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server
21 WEB SERVICE TYPES
ITPilot allows one wrapper to be published as a Web Service to enable use by any external application The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server The types of Web services that ITPilot can publish are
bull SOAP [SOAP] Web Services bull REST-style Web Services that use HTTP directly as the transport protocol and return data encoded in XML bull HTML Web Services Similar to the REST-style Web services but the output consists of an HTML table
containing the response data for the query executed The table includes JavaScript code to sort the results by any field andor paginate the returned results It is also possible to adjust the size of the table and the cells and to modify its graphic appearance using a CSS file
The following section describes the querying process for these Web Services
22 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets SOAP12 [SOAP] and WSDL 11 [WSDL] standards such as the Apache Axis wsdl2java [AXIS] or NET Framework wsdl [DOTNET] tools The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL to the Web Service WSDL httpltdomaingtltportgtltservice_namegtservicesltservice_namegtwsdl ITPilot distribution in the samplesitpilotitp-clients directory contains a sample client generated using Apache Axis The README file residing in this path contains detailed information on how to generate compile and run the files comprising the client application
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services published by DataPort once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the web application each show an information screen for the corresponding Web Service version, listing the available operations.
Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:
http://acme:9090/testWS/rest
http://acme:9090/testWS/html
For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, use the following URL format:
http://host:port/serviceName/rest/opName.xsd
Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:
http://acme:9090/testWS/rest/getPRODUCTDATA.xsd
The format used to invoke a specific operation in the REST version is the following:
http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen
where n is the number of parameters of the operation. The format for the HTML version is the same, replacing 'rest' with 'html'.
Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions respectively:
http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA
If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 would be obtained with:
http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
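The invocation format described above can be assembled programmatically. The following is a minimal sketch, not part of the ITPilot distribution: the class ServiceUrlSketch and its method operationUrl are hypothetical names, and it assumes the usual query-string syntax ('?' before the first parameter, '&' between parameters) with URL-encoded names and values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the ITPilot API.
class ServiceUrlSketch {

    /**
     * Builds http://host:port/serviceName/version/opName?name1=value1&...
     * where version is "rest" or "html". Names and values are URL-encoded.
     */
    static String operationUrl(String host, int port, String serviceName,
                               String version, String opName,
                               Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?';                 // first parameter follows '?'
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';                  // later parameters follow '&'
        }
        return url.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("prod_id", "1");
        System.out.println(operationUrl("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, which makes the generated URL predictable. URLEncoder applies form encoding (spaces become '+'); whether the deployed service expects that or percent-encoding is not stated in this guide.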
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults: If this parameter is set to true, the table displays the number of results obtained by the wrapper.
• intervalsize: If this parameter is specified, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: The maximum number of results to be displayed. If the wrapper run returns more results than indicated, the excess results are discarded.
• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text unless the size indicated in this parameter is exceeded, in which case carriage returns are added to divide the text into lines.
• cellheight: Maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.
• width: The maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: The maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, the Web Service container is assumed to be running on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are "true" and "false".
    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
2. poolInitSize: defines the initial size of the connection pool.
    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.
    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
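Before redeploying a Web Service, it can be useful to check which pool settings its web.xml actually contains. The sketch below reads the env-entry elements described above with the standard JAXP DOM parser. The class PoolSettingsReader is a hypothetical helper, not part of ITPilot, and the inline web.xml fragment is a reduced illustration rather than the complete file that ITPilot generates.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the ITPilot distribution.
class PoolSettingsReader {

    /** Returns the env-entry name/value pairs found in the given web.xml content. */
    static Map<String, String> readEnvEntries(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> entries = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element entry = (Element) nodes.item(i);
                entries.put(childText(entry, "env-entry-name"),
                            childText(entry, "env-entry-value"));
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read web.xml content", e);
        }
    }

    // Text of the first child element with the given tag name.
    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String webXml = "<web-app>"
                + entry("poolEnabled", "false")
                + entry("poolInitSize", "0")
                + entry("poolMaxActive", "30")
                + "</web-app>";
        System.out.println(readEnvEntries(webXml));
    }

    // Builds one env-entry element in the layout shown in this section.
    private static String entry(String name, String value) {
        return "<env-entry><env-entry-name>" + name + "</env-entry-name>"
                + "<env-entry-value>" + value + "</env-entry-value>"
                + "<env-entry-type>java.lang.String</env-entry-type></env-entry>";
    }
}
```

All three values are declared as java.lang.String in web.xml, so the sketch returns them as strings and leaves any conversion to the caller.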
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports additional tasks such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them, and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID used for access and the associated password.
Note that even if Virtual DataPort is installed, the server can still be accessed using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, using the predefined user 'admin' (with the initial password 'admin').
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are returned.
• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper with the name specified as parameter.
• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql): Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(): Returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:
HTMLWrapperResultIterator query(Map params)
The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and the value 3.25 is to be assigned when invoking it, the string "3.25" must be passed. For the float, double and date data types, the values must be provided according to the internationalization configuration specified in the wrapper Init component or, for date data types, according to the date pattern if one was set.
For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, a wrapper schema can be obtained using the method:
HTMLWrapperMetaRegisterRawVO getSchema()
This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, which again may be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various available editions of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether its value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods
void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)
allow setting through the API whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
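The rule given earlier in this section, that every query value is passed as a string formatted according to the wrapper's internationalization settings, can be sketched with standard Java formatting classes. This is only an illustration: PRICE and RELEASE_DATE are hypothetical input parameters, QueryParamsSketch is a hypothetical class, and the locale and date pattern stand in for whatever the wrapper's Init component actually configures.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the ITPilot API.
class QueryParamsSketch {

    /**
     * Builds the name/value map for a query, rendering every value as a string.
     * wrapperLocale and datePattern are assumed to mirror the wrapper's own
     * internationalization configuration.
     */
    static Map<String, String> buildParams(double price, Date releaseDate,
                                           Locale wrapperLocale, String datePattern) {
        Map<String, String> params = new HashMap<>();

        // Numbers: the decimal separator follows the locale ("3.25" or "3,25").
        NumberFormat numbers = NumberFormat.getNumberInstance(wrapperLocale);
        numbers.setGroupingUsed(false);   // no thousands separators
        params.put("PRICE", numbers.format(price));

        // Dates: rendered with the pattern the wrapper is assumed to expect.
        SimpleDateFormat dates = new SimpleDateFormat(datePattern, wrapperLocale);
        params.put("RELEASE_DATE", dates.format(releaseDate));
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params =
                buildParams(3.25, new Date(), Locale.US, "yyyy-MM-dd");
        System.out.println(params);
    }
}
```

The resulting map is the kind of argument that query(Map params) expects. Building the strings in one place keeps locale handling out of the rest of the application.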
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).
The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method
com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)
where fieldname is the name of the desired field.
The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator has two methods for this:
• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): If an error has occurred, returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
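The consumption pattern described in this section (call hasNext() before every next(), then check for errors once iteration ends) can be sketched without the ITPilot libraries. In the sketch below, ResultSource is a hypothetical stand-in interface that exposes only the four methods named in this section, and fake() is an in-memory implementation used solely to exercise the loop; neither belongs to the ITPilot API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustration only: none of these types belong to the ITPilot API.
class ResultDrainSketch {

    /** Hypothetical stand-in exposing only the methods this section describes. */
    interface ResultSource extends Iterator<Object> {
        boolean checkErrors();
        String getErrorDescription();
    }

    /** Collects every result, calling hasNext() before each next(), then reports errors. */
    static List<Object> drain(ResultSource results) {
        List<Object> rows = new ArrayList<>();
        while (results.hasNext()) {       // always check before calling next()
            rows.add(results.next());
        }
        if (results.checkErrors()) {      // errors are checked after iteration
            throw new IllegalStateException(
                    "Query failed: " + results.getErrorDescription());
        }
        return rows;
    }

    /** In-memory fake used only to exercise drain(). */
    static ResultSource fake(List<?> rows, String error) {
        Iterator<?> it = rows.iterator();
        return new ResultSource() {
            public boolean hasNext() { return it.hasNext(); }
            public Object next() { return it.next(); }
            public boolean checkErrors() { return error != null; }
            public String getErrorDescription() { return error; }
        };
    }

    public static void main(String[] args) {
        System.out.println(drain(fake(Arrays.asList("row1", "row2"), null)));
    }
}
```

Turning a reported error into an exception is a design choice of this sketch; an application could equally log the description and keep the rows received so far.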
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in section 3.3:
TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}
where TITLE and DIRECTOR are optional search fields.
Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    // DoubleVO is assumed to live in the same package as SimpleVO.
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

    public class ITPilotExample {

        public static void main(String[] args) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params (all values are passed as strings)
                Map<String, String> queryParams = new HashMap<String, String>();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results; hasNext() must be called before each next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field (double)
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            }
        }
    }
Figure 1 Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes included in a Jar file that is added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations, although naming conventions can be used instead, which makes it possible to create custom functions without dependencies on Denodo libraries. The annotations and the compound types and values required to create custom functions are located in:
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar is loaded in the server.
• All custom functions stored in the same jar are added or removed together by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:
    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction"
Thus, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation carrying the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
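The naming-convention approach described in this section needs no Denodo classes, so it can be sketched in plain Java. The class below follows the stated pattern (class name ending in ItpFunction, a method named execute, and an array as the last parameter for arity n). The concatenation logic and the handling of null inputs are choices made for this sketch, and the class is left package-private only so that the example compiles in any file; a function packaged for deployment would presumably be declared public.

```java
// Interpreted, per the naming convention, as a function named Concat_Sample.
class Concat_SampleItpFunction {

    /**
     * Signature with arity n: the last (here, only) parameter is an array,
     * declared with varargs. Null inputs are skipped (a choice of this sketch).
     */
    public String execute(String... inputs) {
        if (inputs == null) {
            return "";
        }
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {
                joined.append(input);
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(new Concat_SampleItpFunction().execute("IT", "Pilot"));
    }
}
```

Because the return type is always a String, no executeReturnType method is needed here.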
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue. It contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is optional in most cases, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, i.e. the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
The return type of the additional method must correspond to the return type of the execute method:
• If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
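The pairing rules above can be sketched with the naming conventions of section 4.1, which require no Denodo classes. In this hypothetical function, execute returns a String, so executeReturnType takes the same parameter list and returns String.class. The function itself (Repeat_Sample) and the declared return type Class<?> are assumptions made for the sketch; this guide only states which value the additional method has to return.

```java
// Hypothetical function following the naming conventions of section 4.1.
class Repeat_SampleItpFunction {

    /** Repeats value the given number of times; parameters are boxed types, not primitives. */
    public String execute(String value, Integer times) {
        if (value == null || times == null) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < times; i++) {
            out.append(value);
        }
        return out.toString();
    }

    /** Same parameter list as execute; returns String.class because execute returns a String. */
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }

    public static void main(String[] args) {
        Repeat_SampleItpFunction function = new Repeat_SampleItpFunction();
        System.out.println(function.execute("ab", 3)
                + " : " + function.executeReturnType("ab", 3).getName());
    }
}
```

For a function like this one, whose return type never varies, the additional method is optional according to rule 1; it is shown only to illustrate how the two signatures line up.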
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits a string around matches of a given regular expression and returns the array of the resulting substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }

            String[] result = value.split(regex);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                // One map per record, so that records do not share state
                LinkedHashMap<String, Object> fields =
                        new LinkedHashMap<String, Object>(1);
                fields.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(fields);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (an array of one-field records)
        // without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }
Figure 2 ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA-262 Edition 3 standard [ECMA262]. The following sections assume basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", "", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
Each script can contain three functions, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: returns the set of searchable parameters.
3. getOutputSchema() function: returns the structure of the output objects, if they exist (1).
These functions mirror the component-based definition of the process: getInit() declares the input parameters defined in the Initialization component, and getOutputSchema() declares the output record exactly as it is received by the Output component.
(1) Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still accepted for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).
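The omission rule in the note above can be pictured with a small stand-in object. The sketch below is not ITPilot code: RecordStructureSketch and the string values given to OPTIONAL and OBLIGATORY are hypothetical, defined only so that the snippet runs outside ITPilot. It assumes that arguments are bound strictly by position, which is why a parameter cannot be skipped when one to its right is supplied.

```javascript
// Hypothetical stand-ins for the ITPilot constants (assumed values).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in for Record_Structure; only setText is sketched.
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

// setText(field, regexp, type): trailing arguments may be omitted,
// in which case the documented defaults ("" and optional) apply.
RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    this.fields.push({ field: field, regexp: regexp, type: type });
    return this;
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("A");                  // same as setText("A", "", OPTIONAL)
rs.setText("B", "", OBLIGATORY);  // regexp given explicitly so that type can be set
rs.setText("C", OBLIGATORY);      // positional binding: OBLIGATORY lands in regexp
```

In the last call the constant is taken as the regular expression and the field stays optional, which is the mistake the note warns about.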
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    ▪ name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    ▪ field: name of the new field.
    ▪ regexp (optional): regular expression that the character string must match. By default, if no constraint exists, its value is "".
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    ▪ field: name of the new field.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    ▪ field: name of the new field.
    ▪ regexp (optional): regular expression that the character string must match. By default, if no constraint exists, its value is "".
    ▪ format (optional): date format following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    ▪ record: record name.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    ▪ name: name of the array.
    ▪ structure: data structure that represents the record structure contained in the array.
    ▪ type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    ▪ listName: name of the list.
  o add(obj): adds an element to the list.
    ▪ obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components lack some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    ▪ RUNTIME_ERROR: error while the component is being run.
    ▪ CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    ▪ HTTP_ERROR: error produced by an HTTP error.
    ▪ TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    ▪ SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    ▪ ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    ▪ ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value return "null" when an error occurs, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return "null", but it is evaluated as "false" if they are used in a condition expression.
    ▪ ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configurable.
    ▪ ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
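The interaction between several onError calls and the four error actions can be sketched with a stand-in component. Everything below is hypothetical and exists only to make the snippet runnable outside ITPilot: ComponentSketch, the string values of the constants, and the convention that a failing task throws an object with an errorId property. The sketch also retries only the failing task rather than the whole wrapper, and raises error types that have not been configured; both are simplifying assumptions.

```javascript
// Hypothetical stand-ins for the ITPilot error constants (assumed values).
var RUNTIME_ERROR = "RUNTIME_ERROR";
var TIMEOUT_ERROR = "TIMEOUT_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical component: task(attempt) does the work and may throw
// an object of the form { errorId: ... }.
function ComponentSketch(task) {
    this.task = task;
    this.actions = {};   // errorId -> errorAction
    this.retries = 0;
}

// One call per error type, mirroring repeated onError invocations.
ComponentSketch.prototype.onError = function (errorId, errorAction) {
    this.actions[errorId] = errorAction;
};

ComponentSketch.prototype.setRetries = function (count) {
    this.retries = count;
};

ComponentSketch.prototype.exec = function () {
    var attempt = 0;
    while (true) {
        try {
            return this.task(attempt);
        } catch (e) {
            // Unconfigured error types are raised in this sketch (an assumption).
            var action = this.actions[e.errorId] || ON_ERROR_RAISE;
            if (action === ON_ERROR_IGNORE) { return null; }
            var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
            if (retrying && attempt < this.retries) { attempt++; continue; }
            if (action === ON_ERROR_RETRY_IGNORE) { return null; }
            throw e;
        }
    }
};

// Fails twice with a timeout, then succeeds on the third attempt.
var flaky = new ComponentSketch(function (attempt) {
    if (attempt < 2) { throw { errorId: TIMEOUT_ERROR }; }
    return "page";
});
flaky.onError(TIMEOUT_ERROR, ON_ERROR_RETRY);
flaky.onError(RUNTIME_ERROR, ON_ERROR_IGNORE);
flaky.setRetries(3);
var result = flaky.exec();
```

With ON_ERROR_RETRY_IGNORE instead, a task that keeps failing after the last retry would yield null rather than propagate the error.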
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to use when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace file and 5 means that all message types are written to it. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    ▪ record: record to be added to the list.
    ▪ list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    ▪ expr: defines the condition expression. It is expressed as a character string; for example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    ▪ elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.
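The positional placeholders ($0, $1, ...) can be illustrated with a stand-in that binds them to the elements passed to exec. ConditionSketch is hypothetical, and so is the decision to evaluate the resulting text as a JavaScript expression: real conditions are written with the ITPilot function set described in [GENER], which this sketch does not implement.

```javascript
// Hypothetical stand-in for the Condition component. It only binds the
// positional placeholders $0, $1, ... to the elements passed to exec.
function ConditionSketch(expr) {
    // "($0 <= $1)" becomes "(elements[0] <= elements[1])"
    var body = expr.replace(/\$(\d+)/g, "elements[$1]");
    // Evaluating the text as JavaScript is an assumption made for illustration.
    this.test = new Function("elements", "return (" + body + ");");
}

// Returns true or false, like the exec function described above.
ConditionSketch.prototype.exec = function (elements) {
    return this.test(elements) ? true : false;
};

var myCondition = new ConditionSketch("($0 <= $1)");
var withinLimit = myCondition.exec([3, 10]);   // first element <= second element
var overLimit = myCondition.exec([12, 10]);    // first element > second element
```

The placeholder index is simply the position of the value in the elements list, so reordering the list changes the meaning of the same expression.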
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    ▪ listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    ▪ additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    ▪ additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    ▪ deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    ▪ deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    ▪ tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    ▪ baseCode: character string with the source page content.
    ▪ finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    ▪ baseCode: character string with the source page content.
    ▪ finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    ▪ additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    ▪ additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    ▪ deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    ▪ deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    ▪ nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing both pages.
    ▪ simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    ▪ toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    ▪ mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration made with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    ▪ replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match a regular expression are discarded before the comparison starts.
    ▪ regexp: Perl [PERL] regular expression.
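How the prefix and suffix labels end up around added and deleted content can be sketched with a generic token comparison. The function below is not the ITPilot algorithm, which this guide does not describe: diffSketch is hypothetical, it compares whitespace-separated tokens instead of HTML tokens, it uses a standard longest-common-subsequence table, and the ins/del labels are illustrative values rather than the component defaults.

```javascript
// Hypothetical stand-in for the Diff component, working on
// whitespace-separated tokens instead of HTML tokens (an assumption).
function diffSketch(baseCode, finalCode, labels) {
    var a = baseCode.split(" ");
    var b = finalCode.split(" ");
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs[i] = [];
        for (j = 0; j <= b.length; j++) { lcs[i][j] = 0; }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = a[i] === b[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk both token lists, wrapping additions and deletions in their labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);
            i++;
            j++;
        } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
            j++;
        } else {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
            i++;
        }
    }
    return out.join(" ");
}

var marked = diffSketch("a b c", "a x c", {
    additionPrefix: "<ins>", additionSuffix: "</ins>",
    deletionPrefix: "<del>", deletionSuffix: "</del>"
});
// marked is "a <ins>x</ins> <del>b</del> c"
```

Unchanged tokens pass through untouched, so two identical inputs produce the input itself, which is the case setNullWhenEquals is meant to detect.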
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    ▪ expression: object that defines the expression. It is expressed as a character string, in the same way as a condition expression; for example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    ▪ exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    ▪ name: name of the Extractor component instance.
    ▪ page: page-type ITPilot structure from which data is to be extracted.
    ▪ specification: DEXTL data extraction specification (see [DEXTL]).
    ▪ structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type given in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    ▪ merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): sets the internationalization used by the process.
    ▪ i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    ▪ url: URL where the resource to be downloaded can be found (OPTIONAL).
    ▪ sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    ▪ reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    ▪ binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    ▪ page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    ▪ page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    ▪ encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    ▪ flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further data extraction operations will be executed.
    ▪ back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    ▪ reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    ▪ 0: default browser implementation
    ▪ 1: Internet Explorer browser implementation
    ▪ 2: Firefox browser implementation
    ▪ 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    ▪ expr: expression of the filtering operation, applied to the list of records passed to the exec function.
    ▪ auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    ▪ inputRecords: list of input records.
    ▪ auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
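The behaviour described in the note can be sketched as follows. filterSketch is a hypothetical stand-in, not the ITPilot component: a plain JavaScript predicate replaces the ITPilot filter expression, the ignoreErrors flag models ON_ERROR_IGNORE, and the record fields are invented for the example.

```javascript
// Hypothetical stand-in for Filter: predicate(record) replaces the ITPilot
// condition expression, and ignoreErrors models ON_ERROR_IGNORE.
function filterSketch(inputRecords, predicate, ignoreErrors) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) { kept.push(inputRecords[i]); }
        } catch (e) {
            if (!ignoreErrors) { throw e; }
            // ON_ERROR_IGNORE: drop only the record that caused the error.
        }
    }
    return kept;
}

// Invented sample records; the second one makes the predicate fail.
var records = [{ PRICE: "10" }, { PRICE: null }, { PRICE: "250" }, { PRICE: "7" }];

function cheaperThan100(record) {
    if (record.PRICE === null) { throw new Error("PRICE is null"); }
    return parseInt(record.PRICE, 10) < 100;
}

var cheap = filterSketch(records, cheaperThan100, true);
```

Here cheap holds the records priced 10 and 7: the record with the null price is skipped because it raised the error, and the record priced 250 is filtered out normally.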
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop for a specific form, where predetermined values for each of its fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    ▪ findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    ▪ submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    ▪ sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    ▪ reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    ▪ baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    ▪ inputPage: input page from which the selected form can be iteratively invoked.
    ▪ parallelIterator: with "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    ▪ field: name of the multiple selection field.
    ▪ position: position of the field among those with the same name, starting with position 0.
    ▪ positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    ▪ clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    ▪ field: name of the multiple selection field.
    ▪ position: position of the field among those with the same name, starting with position 0.
    ▪ valuesArray: list of values that must be selected in the field.
    ▪ positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    ▪ equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    ▪ clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    ▪ field: name of the HTML selection field.
    ▪ position: position of the field in the event of more than one field with the same name.
    ▪ positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    ▪ field: name of the HTML text field.
    ▪ position: position of the field in the event of several fields on the form with the same name.
    ▪ values: list of values that must be selected in the field.
    ▪ positions: list that indicates the position held by each element of values in the event of replicated values.
    ▪ equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    ▪ field: name of the HTML field on which the click is to be made.
    ▪ value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    ▪ state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    ▪ field: name of the HTML input field.
    ▪ position: position of the field in the event of several fields on the form with the same name.
    ▪ values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    ▪ field: name of the HTML text area field.
    ▪ position: position of the field in the event of several fields on the form with the same name.
    ▪ values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    ▪ count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    ▪ count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    ▪ mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    ▪ flag: with "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    ▪ inputPage: optional parameter that indicates a new starting page on which the new component iteration is run.
  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result and “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
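The back-navigation rules described above for syncWithPost, setBackSequence and setBackPages can be summarised as a small decision function. The sketch below is plain JavaScript written for illustration only: chooseBackStrategy and the shape of its settings argument are hypothetical and are not part of the ITPilot API, and the NSEQL string is a placeholder.

```javascript
// Illustrative only: mirrors the precedence described for syncWithPost,
// setBackSequence and setBackPages. Not part of the ITPilot API.
function chooseBackStrategy(settings) {
    // syncWithPost is described as the default synchronization method.
    var syncWithPost = settings.syncWithPost !== false;
    if (syncWithPost) {
        return { action: "POST" };
    }
    if (settings.backSequence) {
        return { action: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Neither mechanism is configured: fall back to the NSEQL Back() command,
    // going back the number of pages given to setBackPages.
    return { action: "NSEQL_BACK", pages: settings.backPages };
}

var byDefault = chooseBackStrategy({});
var explicit = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var fallback = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(byDefault.action, explicit.action, fallback.action);
```

The function only makes the order of the checks explicit; the actual navigation is carried out by the component.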
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL where the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information (“cookies”).
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may not run.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
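The three values of the obl parameter can be read as validation rules applied to each incoming query. The following sketch is plain JavaScript, not ITPilot code: resolveParameters, the declaration objects, the field names and the string constants standing in for OPTIONAL, OBLIGATORY and FIXED are all hypothetical, and the real Init component may report errors differently. It is also an assumption that a FIXED field takes precedence over any value supplied by the caller.

```javascript
// Stand-ins for the ITPilot constants; their real values are not given in this guide.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Hypothetical helper: applies the obl rules to the values of one query.
function resolveParameters(declarations, query) {
    var record = {};
    for (var i = 0; i < declarations.length; i++) {
        var d = declarations[i];
        var supplied = Object.prototype.hasOwnProperty.call(query, d.field);
        if (d.obl === FIXED) {
            record[d.field] = d.fixedValue;   // constant value (assumed to win over the query)
        } else if (supplied) {
            record[d.field] = query[d.field];
        } else if (d.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + d.field);
        } else {
            record[d.field] = null;           // OPTIONAL (the default): no value needed
        }
    }
    return record;
}

var declarations = [
    { field: "KEYWORDS", obl: OBLIGATORY },
    { field: "MAX_PRICE", obl: OPTIONAL },
    { field: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];

var resolved = resolveParameters(declarations, { KEYWORDS: "laptop" });
console.log(JSON.stringify(resolved));

var rejected = false;
try {
    resolveParameters(declarations, { MAX_PRICE: 500 });
} catch (e) {
    rejected = true;   // KEYWORDS is obligatory and was not supplied
}
console.log(rejected);
```

In a wrapper, the equivalent declarations would be made with setText, setInt and the other set functions listed above.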
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
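The rule given for the sequences parameter of the Next_Interval_Iterator constructor (a single sequence is reused on every iteration, several sequences are consumed one per iteration) can be sketched in plain JavaScript. sequenceForIteration is a hypothetical helper, not an ITPilot function, and the NSEQL strings are placeholders. What happens once a list of several sequences is exhausted is not described in this section, so the sketch simply returns null at that point.

```javascript
// Hypothetical helper: picks the browsing sequence for a zero-based iteration number.
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 1) {
        return sequences[0];                 // one sequence: reused in all iterations
    }
    // several sequences: one per iteration, in order
    return iteration < sequences.length ? sequences[iteration] : null;
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];

var singleThird = sequenceForIteration(single, 2);
var severalSecond = sequenceForIteration(several, 1);
var severalThird = sequenceForIteration(several, 2);
console.log(singleThird, severalSecond, severalThird);
```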
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; specifies a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
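Because Loop (section 5.3.19) is generated as a while loop and Repeat as a do… while loop, the two differ in when the condition is first evaluated: the body of a Repeat always runs at least once. The sketch below shows this in plain JavaScript. A plain function stands in for the Condition object and its exec() call (an assumption made so that the example runs outside ITPilot), and runLoop and runRepeat are hypothetical names.

```javascript
// Pre-test loop, as generated for the Loop component.
function runLoop(condition, body) {
    var runs = 0;
    while (condition()) {
        body();
        runs++;
    }
    return runs;
}

// Post-test loop, as generated for the Repeat component.
function runRepeat(condition, body) {
    var runs = 0;
    do {
        body();
        runs++;
    } while (condition());
    return runs;
}

function never() { return false; }
function noop() {}

var loopRuns = runLoop(never, noop);       // body is skipped entirely
var repeatRuns = runRepeat(never, noop);   // body runs once before the first test

var remaining = 3;
var counted = runLoop(
    function () { return remaining > 0; },
    function () { remaining--; }
);
console.log(loopRuns, repeatRuns, counted);
```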
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written directly in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is omitted, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
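Taken together, setRetries and setRetryDelay describe a retry loop. A minimal sketch in plain JavaScript follows. runWithRetries is a hypothetical helper rather than ITPilot code, the sleep function is supplied by the caller so that the example does not depend on any particular runtime, and it is assumed (the guide does not say) that the retry count excludes the first attempt.

```javascript
// Hypothetical helper: runs operation, retrying up to `retries` times on failure
// and calling sleep(retryDelayMs) before each retry.
function runWithRetries(operation, retries, retryDelayMs, sleep) {
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        if (attempt > 0) {
            sleep(retryDelayMs);   // wait between retries
        }
        try {
            return operation(attempt);
        } catch (e) {
            lastError = e;         // remember the failure and try again
        }
    }
    throw lastError;               // every attempt failed
}

// Demonstration with a fake operation that fails twice and then succeeds.
var waits = [];
function recordSleep(ms) { waits.push(ms); }

var calls = 0;
function flakyStore() {
    calls++;
    if (calls < 3) {
        throw new Error("temporary failure");
    }
    return "stored";
}

var result = runWithRetries(flakyStore, 3, 500, recordSleep);
console.log(result, calls, waits.length);
```

With three retries allowed, the operation succeeds on its third call after two waits of 500 milliseconds each; had it kept failing, the last error would be rethrown.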
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
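The queueing behaviour described for setMaxConcurrentThreads can be modelled in a few lines. What follows is a single-threaded simulation in plain JavaScript, not ITPilot's Thread object: BoundedRunner and its methods are hypothetical, and each task signals completion by calling the done callback it receives.

```javascript
// Hypothetical model of a runner that allows at most maxConcurrent tasks at once.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
    this.queue = [];
}

// Start the task now if a slot is free; otherwise queue it.
BoundedRunner.prototype.execute = function (task) {
    if (this.running < this.maxConcurrent) {
        this.start(task);
    } else {
        this.queue.push(task);
    }
};

BoundedRunner.prototype.start = function (task) {
    var self = this;
    var finished = false;
    this.running++;
    task(function done() {
        if (finished) { return; }      // ignore repeated calls
        finished = true;
        self.running--;
        if (self.queue.length > 0) {
            self.start(self.queue.shift());   // a slot is free: start the next queued task
        }
    });
};

var runner = new BoundedRunner(2);
var pending = [];                      // completion callbacks of tasks in progress
var started = [];
function makeTask(name) {
    return function (done) {
        started.push(name);
        pending.push(done);
    };
}

runner.execute(makeTask("a"));
runner.execute(makeTask("b"));
runner.execute(makeTask("c"));         // queued: two tasks are already running
var startedBefore = started.slice();
pending[0]();                          // "a" finishes, so "c" can start
var startedAfter = started.slice();
console.log(startedBefore.join(","), startedAfter.join(","));
```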
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• function mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• function mycustom_getInputStructure() { … }
  o This function defines the input schema.
• function mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• function mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
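Put together, a custom component file might look like the following sketch. Everything specific in it is an assumption made for illustration: the component name upper, the idea that the input arrives as a plain string, and the local STRING_TYPE variable, which stands in for the constant listed above so that the file also runs outside ITPilot. The body of the input-structure function is not shown in this guide, so it is left empty here, and the output-structure function is omitted because it is only needed for RECORD_TYPE or LIST_TYPE outputs.

```javascript
// Stand-in for the STRING_TYPE constant (value 13 in the list above).
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Input schema definition; its contents are not documented in this section.
function upper_getInputStructure() {
}

// Output type of the component.
function upper_getOutputType() {
    return STRING_TYPE;
}

console.log(upper_main("itpilot"), upper_getOutputType());
```

Following the naming rule above, such a file would be stored as upper.js in the custom components directory.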
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the following VQL statement has to be written:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside it, they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
ITPilot 46 Developer Guide
FIGURES
Figure 1 Example of query execution to a wrapper
Figure 2 ITPilot Custom Function Sample
Figure 3 ITPilot Wrapper Skeleton in JavaScript
Figure 4 Using the ExecuteJS NSEQL command
Figure 5 Using threads in the Iterator component
Figure 6 Using the Loop function
Figure 7 Using the Repeat function
Figure 8 Using custom components from JavaScript
PREFACE
SCOPE
Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.
WHO SHOULD USE THIS DOCUMENT
This document is aimed at developers who want to understand how to build applications that make the best use of the advanced automation and Web data extraction functionality provided by Denodo ITPilot. Detailed information on installing and managing the system is provided in other manuals, which are referenced as the need arises.
SUMMARY OF CONTENTS
More specifically, this document:
• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes its main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations.

In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. Another option is to expose these wrappers through Web Services. This latter option is described in this section.

A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is described in [USER]).

The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The types of Web Services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The samples/itpilot/itp-clients directory of the ITPilot distribution contains a sample client generated using Apache Axis. The README file in that directory contains detailed information on how to generate, compile and run the files comprising the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services published by ITPilot, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp each show an information screen for the respective Web Service version, listing the available operations.
Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html
For each operation, the input and output parameters are shown. For the REST version, a link to the xsd file describing the schema of the XML document returned by each operation is also shown. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, the following URL format should be used:

http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:
http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.

Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for the value 1 of this parameter would be obtained with:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
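The invocation format above can be assembled programmatically. The following is a minimal sketch, not part of the ITPilot distribution: the class RestUrlSketch and the method operationUrl are hypothetical names, and the sketch assumes the standard HTTP query-string separators (? before the first parameter, & between parameters) and UTF-8 form encoding of names and values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an ITPilot class): builds the invocation URL of an
// operation of an exported REST or HTML Web Service.
final class RestUrlSketch {

    // version is "rest" or "html"; params holds the operation's input parameters.
    static String operationUrl(String host, int port, String serviceName,
                               String version, String opName,
                               Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?'; // assumption: standard query-string syntax
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // Form-encodes a name or value so that reserved characters survive in a URL.
    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("prod_id", "1");
        System.out.println(operationUrl("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
    }
}
```

Passing "html" instead of "rest" as the version yields the URL of the HTML variant of the same operation.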
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with certain additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults. If this parameter is given the value true, the table displays the number of results obtained by the wrapper.
• intervalsize. If this parameter is given, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults. Indicates the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.
• cellwidth. Maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.
• cellheight. Maximum number of lines in a cell, after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width. Specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height. Specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, the Web Service container is assumed to run on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF directory of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are “true” or “false”.

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections reaches this value, new requests are queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API supports connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server and querying it. It also supports additional tasks such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance provides the list of wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made and the associated password.

Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql). Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(). Returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the “325” string. For the float, double and date data types, make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, for date data types, the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, the schema of a wrapper can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director’s cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:
• Its type, using the method java.lang.Class getType().
• Whether it is a searchable field (that is, a field whose value is not obtained from the source and that does not appear in the output schema), using the method boolean isSearchStatus().
• If it is searchable, whether it is mandatory or not (method boolean isMandatoryStatus()).
• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting via the API whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
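Returning to the query parameters described earlier in this section: since every value must be passed as a string that matches the wrapper's internationalization settings, typed values usually need formatting before they go into the parameter map. The sketch below illustrates one way to do this with the standard Java formatters. It is an illustration only: the class QueryParamsSketch, the parameter names MAX_PRICE and RELEASED_AFTER, and the idea that the wrapper's configuration corresponds to a java.util.Locale plus a SimpleDateFormat pattern [DATEFORMAT] are all assumptions, not part of the ITPilot API.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not an ITPilot class): turns typed values into the
// string values expected by the params map of a wrapper query.
final class QueryParamsSketch {

    static Map<String, String> buildParams(double maxPrice, Date releasedAfter,
                                           Locale wrapperLocale, String datePattern) {
        // Format the number with the decimal separator of the assumed wrapper locale.
        NumberFormat numbers = NumberFormat.getNumberInstance(wrapperLocale);
        numbers.setGroupingUsed(false); // no thousands separators
        // Format the date with the assumed date pattern of the wrapper.
        SimpleDateFormat dates = new SimpleDateFormat(datePattern, wrapperLocale);

        Map<String, String> params = new HashMap<String, String>();
        params.put("MAX_PRICE", numbers.format(maxPrice));         // hypothetical parameter
        params.put("RELEASED_AFTER", dates.format(releasedAfter)); // hypothetical parameter
        return params;
    }
}
```

With Locale.US the value 12.5 is formatted as "12.5", whereas with Locale.GERMANY it becomes "12,5"; the resulting map is what would be handed to the query method.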
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.
• String getErrorDescription(). If an error has occurred, returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the ‘raise error’ handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
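The consumption pattern described in this section (call hasNext() before every next(), because rows arrive while the loop is running) can be illustrated without an execution server. The sketch below is only a stand-in under stated assumptions: AsyncResults is a hypothetical class, not HTMLWrapperResultIterator, and it assumes that hasNext() waits until either a row or the end of the results is known. A background thread plays the role of the wrapper that delivers rows over time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncIteratorSketch {

    // Hypothetical stand-in for an asynchronous result iterator.
    static final class AsyncResults implements Iterator<String> {
        private static final Object END = new Object(); // end-of-results marker
        private final BlockingQueue<Object> queue = new LinkedBlockingQueue<Object>();
        private Object buffered; // row fetched by hasNext() and not yet consumed

        void publish(String row) { queue.add(row); } // called by the producer
        void finish() { queue.add(END); }            // producer has no more rows

        @Override
        public boolean hasNext() {
            if (buffered == null) {
                try {
                    buffered = queue.take(); // wait for a row or the end marker
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    buffered = END;
                }
            }
            return buffered != END;
        }

        @Override
        public String next() {
            // Like the behavior described above: fails unless hasNext() found a row.
            if (buffered == null || buffered == END) {
                throw new NoSuchElementException("call hasNext() first");
            }
            String row = (String) buffered;
            buffered = null;
            return row;
        }

        @Override
        public void remove() { throw new UnsupportedOperationException(); }
    }

    // Consumes rows produced concurrently and returns them in arrival order.
    static List<String> demo() {
        final AsyncResults results = new AsyncResults();
        new Thread(new Runnable() {
            public void run() {
                results.publish("first");
                results.publish("second");
                results.finish();
            }
        }).start();

        List<String> rows = new ArrayList<String>();
        while (results.hasNext()) { // always check before reading
            rows.add(results.next());
        }
        return rows;
    }
}
```

In a real client the same loop would be followed by the checkErrors() and getErrorDescription() calls listed above, since an error in the source also ends the iteration.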
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the ‘acme’ machine, port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
// DoubleVO is used below; it is assumed to live in the same package as SimpleVO.
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params (all values are passed as strings)
            Map<String, String> queryParams = new HashMap<String, String>();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results; hasNext() must be called before each next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ": ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only field that is not of type text
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server: " + e.getMessage());
        }
    }
}
Figure 1 Example of query execution to a wrapper
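Figure 1 navigates a schema whose depth is known in advance. When compound fields may be nested to an arbitrary depth, the same navigation is naturally written as a recursion. The sketch below shows that idea with plain Java collections standing in for the result classes: a Map plays the role of a register and a List the role of an array of registers. These stand-ins, the class NestedResultSketch and the sample values are illustrative assumptions; real code would use the ValueVO subclasses described in section 3.4.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: Map = register, List = array of registers.
final class NestedResultSketch {

    // Appends one line per atomic field; recurses into compound fields.
    static void render(Map<String, Object> register, String indent, StringBuilder out) {
        for (Map.Entry<String, Object> field : register.entrySet()) {
            Object value = field.getValue();
            if (value instanceof List) {
                // Compound field: an array of nested registers
                out.append(indent).append(field.getKey()).append(":\n");
                for (Object element : (List<?>) value) {
                    @SuppressWarnings("unchecked")
                    Map<String, Object> nested = (Map<String, Object>) element;
                    render(nested, indent + "  ", out);
                }
            } else {
                // Atomic field
                out.append(indent).append(field.getKey())
                   .append(": ").append(value).append('\n');
            }
        }
    }

    // Made-up sample following the TITLE / EDITIONS {FORMAT, PRICE} shape.
    static Map<String, Object> sample() {
        Map<String, Object> edition = new LinkedHashMap<String, Object>();
        edition.put("FORMAT", "DVD");
        edition.put("PRICE", Double.valueOf(9.99));

        List<Object> editions = new ArrayList<Object>();
        editions.add(edition);

        Map<String, Object> movie = new LinkedHashMap<String, Object>();
        movie.put("TITLE", "Example title");
        movie.put("EDITIONS", editions);
        return movie;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        render(sample(), "", out);
        System.out.print(out);
    }
}
```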
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions alone, without dependencies on Denodo libraries, but we recommend developing them with Java annotations. The annotations, compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar is loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading the jar to or removing it from the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures: different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
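As a minimal sketch of the rules above, the following class defines one function with a single signature whose last (and only) parameter can be repeated. It relies on the naming conventions that section 4.1 describes (class name ending in ItpFunction, method named execute) and therefore needs no Denodo library. The function body, including the decision to skip null arguments, is an assumption made for illustration.

```java
// Naming convention: <FunctionName> + "ItpFunction" gives a function named Concat_Sample.
public class Concat_SampleItpFunction {

    // One signature of the function. The parameter is an object type
    // (java.lang.String maps to text), and the varargs array gives it arity n.
    public String execute(String... inputs) {
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null arguments are skipped
                result.append(input);
            }
        }
        return result.toString();
    }

    // Local check only; ITPilot would call execute() itself once the jar is loaded.
    public static void main(String[] args) {
        System.out.println(new Concat_SampleItpFunction().execute("IT", "Pilot"));
    }
}
```

Running the class locally prints ITPilot, which is a quick way to exercise the logic before packaging it into a jar.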
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. If it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time-consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
  o name: name of the custom function.
  o type: in ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user-friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
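As a complement to the annotation-based approach, the naming conventions described at the start of this section can be sketched in plain Java. The following minimal sketch defines CONCAT_SAMPLE without any annotations, so it compiles with no Denodo jar on the classpath. The concatenation logic, the null handling and the ConcatSampleDemo driver class are assumptions made for illustration; only the class-name pattern, the execute and executeReturnType method names and the array-typed last parameter come from the conventions above. The classes are declared without the public modifier only so that the sketch fits in a single source file; the guide's own example uses a public class.

```java
// Sketch only: follows the <FunctionName> + "ItpFunction" naming convention.
class Concat_SampleItpFunction {

    // Signature with arity n: the last (and only) parameter is an array.
    // Skipping null inputs is an illustrative choice, not documented behavior.
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        if (inputs != null) {
            for (String input : inputs) {
                if (input != null) {
                    joined.append(input);
                }
            }
        }
        return joined.toString();
    }

    // Optional companion method with an equivalent signature: it reports the
    // return type (String maps to the ITPilot type text) without running execute.
    public Class<?> executeReturnType(String... inputs) {
        return String.class;
    }
}

// Hypothetical driver, used only to exercise the class outside ITPilot.
class ConcatSampleDemo {
    public static void main(String[] args) {
        Concat_SampleItpFunction function = new Concat_SampleItpFunction();
        System.out.println(function.execute("IT", "Pilot")); // prints "ITPilot"
    }
}
```

Inside ITPilot, a class like this would be packaged in a jar and uploaded to the server, where the name Concat_Sample would be derived from the class name as described above.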
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method to compute the return type without executing the function. This is generally optional (see rule 1 for the exception), but it provides better performance when executing the function is slower or more memory-intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or equivalent type as its respective parameter in the execute method.

The value returned by the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table "Equivalency between Java and ITPilot data types" at the beginning of section 4 to know the type that these return values will have in ITPilot.
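The rules above can be sketched with a function whose return type depends on its input values. This is an illustrative assumption rather than an example taken from Denodo: the Parse_Sample function, its parameters and its parsing logic are hypothetical, and the sketch relies on the naming conventions described earlier in this chapter so that it compiles without the Denodo jar. Because execute is declared as returning java.lang.Object, rule 1 makes the companion return-type method mandatory.

```java
// Hypothetical function PARSE_SAMPLE(text, boolean): returns a long when
// asNumber is true, otherwise the original text.
class Parse_SampleItpFunction {

    // Declared return type is Object because the actual type varies.
    // Long.valueOf throws NumberFormatException if the text is not numeric.
    public Object execute(String value, Boolean asNumber) {
        if (value == null) {
            return null;
        }
        if (Boolean.TRUE.equals(asNumber)) {
            return Long.valueOf(value.trim());
        }
        return value;
    }

    // Same number and types of parameters as execute (rules 2 and 3).
    // Returns the Java class of the value that execute would produce.
    public Class<?> executeReturnType(String value, Boolean asNumber) {
        return Boolean.TRUE.equals(asNumber) ? Long.class : String.class;
    }
}

// Hypothetical driver, used only to exercise the class outside ITPilot.
class ParseSampleDemo {
    public static void main(String[] args) {
        Parse_SampleItpFunction function = new Parse_SampleItpFunction();
        System.out.println(function.executeReturnType("42", Boolean.TRUE).getName()); // prints "java.lang.Long"
        System.out.println(function.execute("42", Boolean.TRUE)); // prints "42"
    }
}
```

Following the type table at the beginning of section 4, the Long.class result would correspond to the ITPilot type long and String.class to text.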
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        // Name of the single field of each record in the returned array
        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            // Wrap each substring in a one-field record and collect the records
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of records with one text field)
        // without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA-262 Edition 3 standard [ECMA262]. The following sections assume some basic prior knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this is used to return the structure of the output objects, if they exist.¹

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further in section 5.3 (the component catalog).

5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

• rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
• rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error caused when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" in case of error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false", they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization will be taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): sets whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the process internationalization.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to set the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit navigation sequence back to the originating page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot can go back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST synchronization is not in use.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
bull Object: Filter
bull Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
bull Functions:
o Constructor(expr, auxiliaryRecords)
bull expr: expression that defines the filtering operation applied to the list of records passed to the exec function.
bull auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that satisfies the selection expression given in the constructor.
bull inputRecords: list of input records.
bull auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
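The behaviour described in the NOTE can be pictured with a plain JavaScript sketch. It is not the ITPilot Filter object: filterRecords and hasPositivePrice are hypothetical names, the filter condition is a JavaScript function instead of an ITPilot expression, and the error action is passed as a string purely for illustration.

```javascript
// Hypothetical stand-in for the behaviour in the NOTE; not the ITPilot Filter object.
function filterRecords(inputRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== "ON_ERROR_IGNORE") {
                throw e; // any other setting propagates the error
            }
            // ON_ERROR_IGNORE: leave out only the record that failed and continue.
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];

// The predicate fails on the record that has no price.
function hasPositivePrice(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE has no value");
    }
    return record.PRICE > 0;
}

var filtered = filterRecords(records, hasPositivePrice, "ON_ERROR_IGNORE");
console.log(filtered.length); // 2: the failing record is left out
```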
5.3.14 Form Iterator
bull Object: Form_Iterator
bull Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each iteration.
bull Functions:
o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
bull findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
bull submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
bull baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.
bull inputPage: input page from which the selected form is invoked iteratively.
bull parallelIterator: if “true”, the component executes its iterations in parallel.
o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
bull field: name of the multiple selection field.
bull position: position of the field among those with the same name, starting at position 0.
bull positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
bull clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
bull field: name of the multiple selection field.
bull position: position of the field among those with the same name, starting at position 0.
bull valuesArray: list of values that must be selected in the field.
bull positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
bull equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).
bull clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
bull field: name of the HTML selection field.
bull position: position of the field when more than one field element has the same name.
bull positions: values of the elements over which the component must iterate.
o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
bull field: name of the HTML text field.
bull position: position of the field when several fields on the form have the same value.
bull values: list of values that must be selected in the field.
bull positions: list that indicates the position held for each value element in the event of replicated values.
bull equals: Boolean value that indicates whether the values provided must exactly match the field values (true) or need only be contained in them (false).
o click(field, value, state): function that selects an element and fires a “click” event on it.
bull field: name of the HTML field on which the click is to be made.
bull value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
bull state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o input(field, position, values): function that indicates the values added to an input field.
bull field: name of the HTML input field.
bull position: position of the field when several fields on the form have the same name.
bull values: list of values that must be selected in the field.
o textarea(field, position, values): indicates the values added to a text area.
bull field: name of the HTML input field.
bull position: position of the field when several fields on the form have the same name.
bull values: list of values that must be selected in the field.
o toList(): returns the list of NSEQL sequences used in each iteration.
o setMaxIterations(count): sets the maximum number of iterations that can be executed.
bull count: maximum number of iterations.
o setRetries(count): sets the number of retries in the event of failures.
bull count: number of retries.
o setRetryDelay(mseconds): sets the waiting time between retries.
bull mseconds: waiting time between retries, in milliseconds.
o setParallelIterator(flag): makes the component launch its iterations in parallel.
bull flag: if “true”, the iterations are executed in parallel.
o next(inputPage): returns the page that results from running one iteration of the component.
bull inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.
o hasNext(): function that determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.
o close(): function that closes the iterator.
o syncWithPost(flag): this function indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization method.
bull flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL program of the back sequence.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
bull 0: default browser implementation
bull 1: Internet Explorer browser implementation
bull 2: Firefox browser implementation
bull 3: Denodo HTTP browser implementation
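Because CLICKED_AND_NON_CLICKED_ELEMENT produces two combinations for an element, the number of iterations can grow quickly when several elements use it. The following plain JavaScript sketch enumerates the marked/unmarked combinations for a clickedArray. It is only an illustration: expandClickedArray is a hypothetical helper, the constant values are arbitrary placeholders, and it assumes that the combinations of different elements multiply with each other.

```javascript
// Constants named after the ones in the text; the values are arbitrary placeholders.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non-clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Hypothetical helper: lists every marked (true) / unmarked (false) combination.
function expandClickedArray(clickedArray) {
    var combinations = [[]];
    clickedArray.forEach(function (state) {
        var choices;
        if (state === CLICKED_ELEMENT) {
            choices = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            choices = [false];
        } else if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
            choices = [true, false];
        } else {
            throw new Error("Unknown state: " + state);
        }
        var next = [];
        combinations.forEach(function (partial) {
            choices.forEach(function (choice) {
                next.push(partial.concat([choice]));
            });
        });
        combinations = next;
    });
    return combinations;
}

var combos = expandClickedArray([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT
]);
console.log(combos.length); // 1 * 2 * 2 = 4
```

Under that assumption, setMaxIterations(count) is the natural way to bound the loop when many elements are set to generate both states.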
5.3.15 Get Page
bull Object: Get_Page
bull Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
bull Functions:
o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
bull browserUuid: browser id.
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
bull pageType: type of browser used to access the page:
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL: last URL the page comes from.
bull lastURLMethod: access method (GET, POST) of the URL the page comes from.
bull lastURLPostParameters: POST-method parameters of the URL the page comes from.
bull cookie: stored cookie information.
bull proxyUser: user name to access the proxy, if required.
bull proxyPassword: user password to access the proxy, if required.
bull proxyDomain: proxy domain, if required.
5.3.16 Init
bull Object: Init
bull Description: stores the structure of the input data, that is, the data the wrapper receives from the calling application.
bull Functions:
o Constructor(input, output)
bull input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
bull output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).
o get(name): returns the value of a field of the record created as the set of initialization parameters.
bull name: name of the record field.
o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
bull field: name of the field to create.
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
bull field: name of the field to create.
bull format: representation format of the date field. The format is optional, but once specified it becomes compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
bull obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setName(name): updates the component name.
bull name: new component name.
o setI18n(i18n): function that updates the i18n configuration of the process.
bull i18n: type of internationalization to use. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
o exec(): main function for running the component. It returns a record that represents the wrapper initialization parameters.
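The three values of the obl parameter can be illustrated with a small stand-in written in plain JavaScript. This is a sketch only: InitSketch is a hypothetical object, not the ITPilot Init component; it receives the query values through its constructor (an assumption made so that the example runs on its own), supports only setText, and stores null for an optional field that received no value.

```javascript
// Placeholder values for the constants named in the text.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in; the real Init object is provided by ITPilot.
function InitSketch(queryValues) {
    this.queryValues = queryValues || {};
    this.fields = [];
}

InitSketch.prototype.setText = function (field, obl, fixedValue) {
    // OPTIONAL is the documented default when obl is omitted.
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

InitSketch.prototype.exec = function () {
    var record = {};
    var values = this.queryValues;
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue; // constant, whatever the query says
        } else if (Object.prototype.hasOwnProperty.call(values, f.name)) {
            record[f.name] = values[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null; // optional and not supplied (sketch convention)
        }
    });
    return record;
};

var init = new InitSketch({ CITY: "Madrid" });
init.setText("CITY", OBLIGATORY);
init.setText("COUNTRY", FIXED, "ES");
init.setText("ZIP");
var record = init.exec();
console.log(record.CITY, record.COUNTRY, record.ZIP);
```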
5.3.17 Iterator
bull Object: Iterator
bull Description: component that iterates over a list of records, one by one.
bull Functions:
o Constructor(list)
bull list: list of records over which to iterate.
o hasNext(): determines whether there are more results over which to iterate. “true” is returned if there is at least one more result.
o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option available in the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
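For comparison, the sequential form of the same loop processes each record in turn. The sketch below uses a hypothetical IteratorSketch object with the same hasNext() and next() functions so that it can be run outside ITPilot; the field name TITLE is invented for the example.

```javascript
// Hypothetical stand-in exposing the same two functions as the Iterator component.
function IteratorSketch(list) {
    this.list = list;
    this.index = 0;
}

IteratorSketch.prototype.hasNext = function () {
    return this.index < this.list.length;
};

IteratorSketch.prototype.next = function () {
    // Returns the current record and advances to the following one.
    return this.list[this.index++];
};

var iterator = new IteratorSketch([{ TITLE: "a" }, { TITLE: "b" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // per-record processing goes here
}
console.log(titles.join(","));
```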
5.3.18 JDBCExtractor
bull Object: JDBCExtractor
bull Description: sends a query to any source available through JDBC and returns a list of records with the results obtained.
bull Functions:
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid: unique identifier of the component.
bull uri: connection URL to the database.
bull driver: driver class used to connect to the data source.
bull userName: user name.
bull password: user password.
bull structure: structure of the component’s output record list. It is defined as a record of values.
bull baseRecords: record list to be used.
bull maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.
bull initialPoolSize: initial number of browser pool connections. This number of idle connections is established, ready to be used.
bull checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
bull query: SQL query that returns the results required by the component.
o exec(query, baseRecords): executes the JDBCExtractor component.
bull query: SQL query that returns the results required by the component.
bull baseRecords: record list to be used.
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
bull maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.
bull initialPoolSize: initial number of browser pool connections. This number of idle connections is established, ready to be used.
bull pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
o disablePool(): disables the connection pool.
o addDriverProperty(propname, propvalue): adds a JDBC driver property.
bull propname: property name.
bull propvalue: property value.
5.3.19 Loop
bull Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object acting as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
bull Object: Next_Interval_Iterator
bull Description: allows iteration over a series of inter-related pages, reached through one or several browsing sequences.
bull Functions:
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
bull iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information.
bull inputPage: indicates the page from which the next browsing sequence is to be run.
o next(inputRecords, inputPage): returns the next iteration element.
bull inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
bull inputPage: indicates the page from which the next pages are to be accessed.
o close(): closes the iterator.
o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
bull count: number of retries.
o setRetryDelay(count): configures the interval between two retries.
bull count: interval in milliseconds.
o syncWithPost(flag): this function indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization function.
bull flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL program of the back sequence.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
bull 0: default browser implementation
bull 1: Internet Explorer browser implementation
bull 2: Firefox browser implementation
bull 3: Denodo HTTP browser implementation
5.3.21 Output
bull Object: Output
bull Description: places a record in the wrapper output.
bull Functions:
o Constructor(structure)
bull structure: parameter that indicates the component input record to be used as the wrapper result.
o add(record): adds, after construction, the component input record to be used as the wrapper result.
bull record: record to use.
5.3.22 Record Constructor
bull Object: Record_Constructor
bull Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
bull Functions:
o Constructor(recordsObj, name)
bull recordsObj: list of input elements. Each element of the list can be a record or a list of records.
bull name: name of the output record of the Record Constructor component.
o add(fieldName, expression, errorAction): adds a new field to the record under construction.
bull fieldName: name of the field.
bull expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
bull errorAction: action to take when the expression cannot be evaluated correctly. The possible values are:
bull ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
bull ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
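The role of errorAction when a field is added can be sketched in plain JavaScript. The code below is not the Record_Constructor component: resolveReference and buildRecord are hypothetical helpers, only plain field references are handled (assumed here to be written as “$<index>.<FIELD>”, while the real expression language is the one described in [GENER]), and leaving a failed field out of the record under ON_ERROR_IGNORE is a convention of this sketch rather than documented behaviour.

```javascript
// Placeholder values for the constants named in the text.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical: resolves only references of the assumed form "$<index>.<FIELD>".
function resolveReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[Number(match[1])];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot evaluate " + expression);
    }
    return record[match[2]];
}

// Hypothetical: builds one output record from a list of field definitions.
function buildRecord(recordsObj, fieldDefinitions) {
    var output = {};
    fieldDefinitions.forEach(function (def) {
        try {
            output[def.fieldName] = resolveReference(def.expression, recordsObj);
        } catch (e) {
            if (def.errorAction === ON_ERROR_RAISE) {
                throw e; // stop and report the source of the error
            }
            // ON_ERROR_IGNORE: continue; this sketch leaves the field out.
        }
    });
    return output;
}

var inputs = [{ PARAM1: "x" }, { OTHER: 2 }];
var built = buildRecord(inputs, [
    { fieldName: "A", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "C", expression: "$1.MISSING", errorAction: ON_ERROR_IGNORE }
]);
console.log(JSON.stringify(built));
```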
5.3.23 Record Sequence or Extractor Sequence
bull Object: Record_Sequence
bull Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
bull Functions:
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
bull sequences: ordered list of the NSEQL browsing sequences to be used by the component.
bull sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information. In general this value will be “true”, although it may not be a good option when the previous iterator runs in parallel.
bull inputPage: optional; indicates a starting page.
o exec(): returns a page object that represents the target page of the browsing sequences.
o All of the functions offered by the Sequence component.
5.3.24 Release Persistent Browser
bull Object: Release_Persistent_Browser
bull Description: accepts a browser id or a page as browser identifier and releases that specific browser.
bull Functions:
o Constructor(page)
bull page: page loaded in the browser that is going to be released.
o Constructor(browserUuid)
bull browserUuid: browser identifier.
o exec(): executes the component.
5.3.25 Repeat
bull Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object acting as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
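The practical difference between the while form used by the Loop component and the do… while form used by Repeat is the moment at which the condition is checked. The sketch below shows this with plain JavaScript functions in place of Condition objects; runLoop and runRepeat are hypothetical names introduced for the example.

```javascript
// while: the condition is checked before every pass, so the body may never run.
function runLoop(condition, body) {
    while (condition()) {
        body();
    }
}

// do...while: the body runs once before the condition is checked for the first time.
function runRepeat(condition, body) {
    do {
        body();
    } while (condition());
}

function alwaysFalse() {
    return false;
}

var loopRuns = 0;
runLoop(alwaysFalse, function () { loopRuns++; });

var repeatRuns = 0;
runRepeat(alwaysFalse, function () { repeatRuns++; });

console.log(loopRuns, repeatRuns); // 0 1
```

With a condition that is false from the start, the Loop-style body is skipped entirely while the Repeat-style body still executes once.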
5.3.26 Script
bull Description: this component allows part of the logic of an ITPilot wrapper to be written directly in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked at the position the component occupies in the process flow.
5.3.27 Sequence
bull Object: Sequence
bull Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
bull Functions:
o Constructor(sequence, sequenceType, reusableConnection, inputPage)
bull sequence: NSEQL browsing program (see [NSEQL]).
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
bull inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
bull inputValues: list of values that can be used as input parameters within the browsing sequence.
bull inputPage: optional parameter that describes the page from which the component browsing sequence is run.
o setRetries(count): sets the number of retries in the event of failures.
bull count: number of retries.
o setRetryDelay(mseconds): sets the waiting time between retries.
bull mseconds: waiting time between retries, in milliseconds.
o close(): closes the connection with the running browser.
o syncWithPost(flag): this function indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization function.
bull flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL program of the back sequence.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
bull pages: number of back pages.
o toString(): returns the NSEQL sequence (see [NSEQL]).
o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
bull 0: default browser implementation
bull 1: Internet Explorer browser implementation
bull 2: Firefox browser implementation
bull 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents given as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each record obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will run in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
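The queueing behavior described for setMaxConcurrentThreads can be pictured with a small Java analogy. The sketch below does not use any ITPilot API: it relies only on java.util.concurrent, and the names BoundedWorkers and runAll are assumptions chosen for illustration. A fixed-size pool plays the role of the concurrency limit, tasks submitted beyond that limit wait in the pool's queue, and awaitTermination stands in for wait().

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not the ITPilot Thread component.
final class BoundedWorkers {

    // Runs every task with at most maxConcurrent running at once and
    // returns the highest number of tasks observed running together.
    static int runAll(List<Runnable> tasks, int maxConcurrent) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (Runnable task : tasks) {
            // Comparable to execute(functionName, args): extra tasks are queued.
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    task.run();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Comparable to wait(): block until all launched executions finish.
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return peak.get();
    }
}
```

With a limit of 2 and six tasks, no more than two tasks run at any moment and the call returns only after all six have completed, which is the behavior the component description attributes to setMaxConcurrentThreads combined with wait().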
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just developed.
NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
PREFACE
SCOPE
Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.
WHO SHOULD USE THIS DOCUMENT
This document is aimed at developers who want to learn how to build applications that make the best use of the advanced automation and Web data extraction functionality provided by Denodo ITPilot. Detailed information on installing and managing the system is provided in other manuals, which are referenced as the need arises.
SUMMARY OF CONTENTS
More specifically, this document:
• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a "wrapper", that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.
This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes its main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. First, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application. This API is described in section 3. The other option, described in this section, is to expose the wrappers through Web Services. A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).
The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and its cells, and to modify its graphic appearance using a CSS file.
The following sections describe the querying process for these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed with any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or the .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

    http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files comprising the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp display an information screen for the respective Web Service version, listing the available operations.
Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service is testWS, the access URLs for the information page in the REST (XML output) and HTML versions are:

    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, the following URL format should be used:

    http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service is testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

    http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

    http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing 'rest' with 'html'.
Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service is testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML versions, respectively:

    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 are obtained with:

    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
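The invocation format described above can be assembled programmatically. The following is a minimal sketch that uses only the Java standard library. The class RestUrlBuilder and its build method are hypothetical names, not part of ITPilot, and encoding the parameter names and values with URLEncoder is an assumption about what the deployed service expects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the class and method names are not part of ITPilot.
final class RestUrlBuilder {

    // Builds http://host:port/serviceName/<version>/opName?name1=value1&...
    // where version is "rest" or "html".
    static String build(String host, int port, String serviceName, String version,
                        String opName, Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // Assumption: values are form-encoded as UTF-8.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("prod_id", "1");
        // Prints http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
        System.out.println(build("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
    }
}
```

Passing "html" instead of "rest" yields the URL of the HTML version of the same operation, and an empty parameter map yields the URL of an operation that requires no parameters.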
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults: if this parameter is given with the value true, the table displays the number of results obtained by the wrapper.
• intervalsize: if this parameter is given, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: maximum number of results to be displayed. If the wrapper run returns more results than indicated, the excess results are discarded.
• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size indicated in this parameter is exceeded. In that case, line breaks are added to divide the text into lines.
• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: maximum width of the table, in pixels. If the size is exceeded, a scroll bar is added.
• height: maximum height of the table, in pixels. If the size is exceeded, a scroll bar is added.
These parameters must be given in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web Service (either inside the .war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are "true" and "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections reaches this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
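To check which pool settings a deployed service is actually using, the env-entry elements can be read back from web.xml. The sketch below uses only the JDK's DOM parser. The class name PoolSettingsReader is hypothetical, and the sample document in main is a reduced fragment written for illustration, not the full web.xml that ITPilot generates.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of the ITPilot distribution.
final class PoolSettingsReader {

    // Returns env-entry-name -> env-entry-value for every env-entry element.
    static Map<String, String> read(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> settings = new LinkedHashMap<>();
            NodeList entries = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                settings.put(text(entry, "env-entry-name"), text(entry, "env-entry-value"));
            }
            return settings;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse web.xml content", e);
        }
    }

    // Text of the first child element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Reduced sample document, assumed for illustration.
        String sample = "<web-app>"
                + "<env-entry><env-entry-name>poolEnabled</env-entry-name>"
                + "<env-entry-value>true</env-entry-value>"
                + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
                + "<env-entry><env-entry-name>poolMaxActive</env-entry-name>"
                + "<env-entry-value>30</env-entry-value>"
                + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
                + "</web-app>";
        System.out.println(read(sample));
    }
}
```

All three parameters are declared with the type java.lang.String, so the values come back as text and numeric settings such as poolMaxActive have to be parsed by the caller.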
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways of connecting to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users in Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made and the associated password.
Note that, even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, using the predefined user 'admin' (with the initial password 'admin').
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql). Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(). Returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, the following method is used:

    HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and the value 325 is to be assigned to it, the string "325" must be passed. For float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper's Init component or, in the case of date data types, according to the date pattern if one was set. For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, the schema of a wrapper can be obtained using the method:

    HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In turn, each register is composed of several fields, and these fields may again be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available for the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether it is a searchable field that is not obtained from the source and therefore does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
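Because query values are always passed as strings and numeric values must follow the wrapper's internationalization configuration, a client typically formats numbers before putting them in the parameter map. The following is a minimal sketch using only java.text. The class QueryParams, the method asText and the MAX_PRICE parameter are hypothetical, and the locale actually required depends on the wrapper's Init component. In a real client the resulting map would be passed to HTMLWrapperProxy.query(Map).

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not part of the ITPilot API.
final class QueryParams {

    // Formats a number as text with the decimal separator of the given
    // locale and without grouping separators.
    static String asText(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);
        format.setMaximumFractionDigits(10);
        return format.format(value);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("DIRECTOR", "Woody Allen");
        // MAX_PRICE is an assumed parameter name used only for this sketch.
        params.put("MAX_PRICE", asText(1234.5, Locale.GERMANY));
        System.out.println(params);
    }
}
```

Here asText(1234.5, Locale.US) produces "1234.5" while asText(1234.5, Locale.GERMANY) produces "1234,5", so the same numeric value yields different strings depending on the locale the wrapper was configured with.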
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).
The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be called before accessing each element to make sure that data elements are available.
The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data are available at that moment, even if the wrapper still has results to return. Hence the need to use the method hasNext().
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator has two methods for this:
• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(). If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator cancels the execution of an ongoing query:

    void cancel()
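A minimal sketch of when cancel() might be called, assuming a client that only needs the first few rows of a result. The ResultIterator interface and the FakeResults class are hypothetical stand-ins that mirror only the hasNext(), next() and cancel() methods named in this guide, so that the block runs without the ITPilot libraries; with the real API, an HTMLWrapperResultIterator would take their place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in exposing only the methods this guide names for
// HTMLWrapperResultIterator; it is NOT the real Denodo class.
interface ResultIterator {
    boolean hasNext();
    Object next();
    void cancel();
}

// Hypothetical in-memory result source, used only to make the sketch runnable.
class FakeResults implements ResultIterator {
    private final int total;
    private int served = 0;
    boolean cancelled = false;

    FakeResults(int total) { this.total = total; }

    public boolean hasNext() { return !cancelled && served < total; }

    public Object next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        served++;
        return "row-" + served;
    }

    public void cancel() { cancelled = true; }
}

class CancelSketch {
    // Reads at most 'limit' rows, then cancels the query if rows remain.
    static List<Object> firstRows(ResultIterator results, int limit) {
        List<Object> rows = new ArrayList<Object>();
        while (rows.size() < limit && results.hasNext()) {
            rows.add(results.next());
        }
        if (results.hasNext()) {
            results.cancel(); // the remaining rows are not needed
        }
        return rows;
    }

    public static void main(String[] args) {
        FakeResults results = new FakeResults(10);
        List<Object> rows = firstRows(results, 3);
        System.out.println(rows.size() + " rows read, cancelled=" + results.cancelled);
    }
}
```

Cancelling as soon as the client has what it needs avoids leaving a query running on the server for rows that will never be read.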
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the ‘acme’ machine on port 9999. Next, a reference is obtained to the wrapper called “Movies”, whose schema is the one used as an example in the preceding section:

    TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();
                    Map edition = editionVO.getValues();

                    // FORMAT is read from the map returned by getValues()
                    SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only non-text field (double)
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\tDESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server ");
        } finally {
        }
    }
}

Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions (including custom ITPilot functions), is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be defined either with Java annotations or with naming conventions. Although the naming conventions make it possible to create custom functions without dependencies on Denodo libraries, the use of Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in

    $DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:
    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + “ItpFunction”

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
  o name: name of the custom function.
  o type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation holding the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
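The naming conventions can be illustrated with a short sketch that needs no Denodo libraries. It assumes only what this section states: a class name ending in “ItpFunction”, a method named execute, and an array as last parameter to obtain arity n. The concatenation logic and the decision to skip null inputs are illustrative choices, not part of the guide.

```java
// Naming-convention sketch: the class name ends in "ItpFunction" and the
// signature is a method called "execute", so no Denodo import is needed.
// For packaging in a Jar the class would be declared public in its own file.
class Concat_SampleItpFunction {

    // Signature with arity n: the last (here, only) parameter is an array.
    public String execute(String... inputs) {
        StringBuilder sb = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // skipping nulls is a choice of this sketch
                sb.append(input);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new Concat_SampleItpFunction().execute("IT", "Pilot"));
    }
}
```

Because the method takes java.lang.String objects rather than primitive types, it respects the restriction on parameter types noted at the beginning of section 4.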
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

The return type of the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return values will have in ITPilot.
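The rules above can be sketched with a naming-convention function whose execute method returns java.lang.Object, which is the case where the additional method is mandatory. PARSE_SAMPLE is a hypothetical function invented for this sketch. Declaring executeReturnType with the Java return type Class<?> is an assumption derived from the statement that the method “has to return java.lang.String.class”, and the sketch also assumes that the argument values are available when the return type is computed.

```java
// Hypothetical function PARSE_SAMPLE, defined with naming conventions only.
// execute returns java.lang.Object, so a companion executeReturnType method
// reports the result type without doing the parsing work.
class Parse_SampleItpFunction {

    // Returns an Integer when asInt is true, otherwise the text unchanged.
    public Object execute(Boolean asInt, String value) {
        if (asInt != null && asInt.booleanValue()) {
            return Integer.valueOf(value.trim());
        }
        return value;
    }

    // Same number and types of parameters as execute (rules 2 and 3).
    // Returning a Class object is an assumption of this sketch.
    public Class<?> executeReturnType(Boolean asInt, String value) {
        boolean wantsInt = asInt != null && asInt.booleanValue();
        return wantsInt ? Integer.class : String.class;
    }

    public static void main(String[] args) {
        Parse_SampleItpFunction f = new Parse_SampleItpFunction();
        System.out.println(f.executeReturnType(Boolean.TRUE, "42").getName());
        System.out.println(f.execute(Boolean.TRUE, "42"));
    }
}
```

The return-type method inspects only the flag, so it stays cheap even if parsing the value were expensive, which is the performance benefit described in this section.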
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    // Name of the single text field in each record of the returned array
    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {

        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);

        // Wrap each substring in a one-field record and add it to the array
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (array of one-field records) without splitting
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with edition 3 of the ECMA-262 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure
This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog of section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). However, rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections define them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “” (empty string).
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “” (empty string).
    • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components lack any of these “common” functions.

5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when a given type of error occurs. The onError function can be invoked several times with different errorId values.
  o errorId: the type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the http browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” when an error occurs, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured on the component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, written as a character string. E.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: determines the elements on which the condition is evaluated. It must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”.
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green-background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red-background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): sets whether the component takes the HTML tag attributes into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): sets whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): sets whether the deleted content is shown in the result page.
    • mergedDeletions: with “true”, the deleted content is shown. With “false”, the configuration set through setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
539 ExecuteJS
bull Description ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence This component is transformed into a Sequence command (see section 5327) that executes the ExecuteJS NSEQL command (see [NSEQL])
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is written as a character string; e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
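The positional notation ($0, $1, ...) can be read as an index into the list passed to exec. The sketch below only illustrates that binding step in plain JavaScript; bindPositions is a hypothetical helper, not an ITPilot function, and it substitutes values textually rather than evaluating the expression the way ITPilot does.

```javascript
// Illustrative only: shows how $0, $1, ... map to positions in the list
// passed to exec(). ITPilot's real expression evaluator is not reproduced.
function bindPositions(expression, inputs) {
    return expression.replace(/\$(\d+)/g, function (match, index) {
        var position = parseInt(index, 10);
        if (position >= inputs.length) {
            throw new Error("No input at position " + position);
        }
        return String(inputs[position]);
    });
}

// The first input takes the place of $0, the second the place of $1.
var bound = bindPositions("($0 <= $1)", [3, 5]);
```

Here the expression string from the constructor example is bound to the inputs 3 and 5, so $0 refers to 3 and $1 to 5.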
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification given in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): updates the internationalization of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): lets the user determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.
    • back: NSEQL program for the back sequence.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
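The order in which the page-state recovery options are considered (POST synchronization first, then an explicit back sequence, then the Back() command) can be summarized in a short decision function. This is a plain JavaScript sketch: chooseBackStrategy and the option object are hypothetical stand-ins for the state set through syncWithPost, setBackSequence and setBackPages, and the default of one page back is an assumption, not documented behavior.

```javascript
// Illustrative only: summarizes the fallback order described in the text.
// The option names are hypothetical, not part of the ITPilot API.
function chooseBackStrategy(options) {
    if (options.syncWithPost) {
        // Resend the POST parameters originally used to reach the page.
        return { kind: "POST" };
    }
    if (options.backSequence) {
        // An explicit NSEQL back sequence was configured.
        return { kind: "BACK_SEQUENCE", sequence: options.backSequence };
    }
    // Neither POST synchronization nor a back sequence: use Back().
    // Assumption: go back one page unless a page count was configured.
    return { kind: "BACK_COMMAND", pages: options.backPages || 1 };
}

var withPost = chooseBackStrategy({ syncWithPost: true });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

The same fallback order is described for the other components that expose syncWithPost, setBackSequence and setBackPages.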
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
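The behavior described in the note can be sketched in plain JavaScript as follows. This is not the ITPilot Filter object: filterRecords, the string constants standing in for ON_ERROR_RAISE and ON_ERROR_IGNORE, and the sample records are all assumptions used only to show a record being skipped when its evaluation fails.

```javascript
// Illustrative only: string stand-ins for the ITPilot error-handling constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Keeps the records for which the condition holds. When the condition
// throws, the record is skipped under ON_ERROR_IGNORE and the error is
// propagated otherwise.
function filterRecords(records, condition, errorAction) {
    var kept = [];
    records.forEach(function (record) {
        try {
            if (condition(record)) {
                kept.push(record);
            }
        } catch (error) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw error;
            }
            // Ignored: the record that caused the error is left out.
        }
    });
    return kept;
}

// Hypothetical condition that fails on records without a numeric PRICE.
function priceAboveOne(record) {
    if (typeof record.PRICE !== "number") {
        throw new Error("PRICE is not a number");
    }
    return record.PRICE > 1;
}

var sampleRecords = [{ PRICE: 5 }, { PRICE: null }, { PRICE: 20 }];
var filtered = filterRecords(sampleRecords, priceAboveOne, ON_ERROR_IGNORE);
```

In this sketch the second record causes an error and is left out, while the other two records are returned.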
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: allows a run loop to be generated for a specific form, where predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field element has the same name.
    • positions: values of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each element of values in the event of replicated values.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.
  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. With “false”, a new browser is opened and the data of the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
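The hasNext(), next(), close() and setMaxIterations functions suggest the usual iterator consumption loop. The sketch below shows that loop in plain JavaScript against a stub: StubFormIterator is a hypothetical stand-in that returns strings instead of pages and implements only the four functions named here, so it says nothing about how Form_Iterator itself submits the form.

```javascript
// Illustrative only: a stub exposing the four iterator functions named in
// the text. It walks over a fixed array instead of submitting a form.
function StubFormIterator(pages) {
    this.pages = pages;
    this.index = 0;
    this.maxIterations = pages.length;
    this.closed = false;
}
StubFormIterator.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};
StubFormIterator.prototype.hasNext = function () {
    var limit = Math.min(this.maxIterations, this.pages.length);
    return !this.closed && this.index < limit;
};
StubFormIterator.prototype.next = function () {
    return this.pages[this.index++];
};
StubFormIterator.prototype.close = function () {
    this.closed = true;
};

// Consumes the iterator and always closes it, even if an iteration fails.
function collectPages(iterator) {
    var visited = [];
    try {
        while (iterator.hasNext()) {
            visited.push(iterator.next());
        }
    } finally {
        iterator.close();
    }
    return visited;
}

var stub = new StubFormIterator(["result-1", "result-2", "result-3"]);
stub.setMaxIterations(2);
var visitedPages = collectPages(stub);
```

Closing the iterator in a finally block is a design choice of this sketch, so that the iterator is released even when processing one of the result pages raises an error.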
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool by means of a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in cookies.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
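The three values of the obl parameter can be illustrated with a small standalone sketch of how an input record might be assembled from a query. It is plain JavaScript, not the Init component: buildInputRecord, the field descriptors and the string stand-ins for OPTIONAL, OBLIGATORY and FIXED are hypothetical, and two behaviors are assumptions of the sketch (a FIXED field overrides any value sent in the query, and a missing optional field is stored as null).

```javascript
// Illustrative only: string stand-ins for the ITPilot constants.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Builds the record of query parameters from hypothetical field descriptors.
function buildInputRecord(fields, query) {
    var record = {};
    fields.forEach(function (field) {
        if (field.obl === FIXED) {
            // Assumption: the constant value wins over any query value.
            record[field.name] = field.fixedValue;
        } else if (Object.prototype.hasOwnProperty.call(query, field.name)) {
            record[field.name] = query[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        } else {
            // Assumption: an optional parameter without a value is null.
            record[field.name] = null;
        }
    });
    return record;
}

// Hypothetical field definitions, one per obl value.
var fieldDefinitions = [
    { name: "QUERY", obl: OBLIGATORY },
    { name: "LANG", obl: FIXED, fixedValue: "en" },
    { name: "PAGE", obl: OPTIONAL }
];
var inputRecord = buildInputRecord(fieldDefinitions, { QUERY: "denodo", LANG: "es" });
```

In this sketch a query that omits QUERY is rejected, while omitting PAGE is accepted.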
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate; “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option available in the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5. Using threads in the Iterator component
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript as a while loop with a Condition object used as the loop output condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6. Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration over different interrelated pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is given in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. With “false”, a new browser is opened and the data of the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
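The rule given for the sequences parameter (a single sequence is reused in every iteration, several sequences are used one per iteration) can be written as a short selection function. This is a plain JavaScript sketch: sequenceForIteration is a hypothetical helper, the sequence values are placeholders rather than real NSEQL, the iterations parameter is not modeled, and returning null once the list is exhausted is an assumption about how iteration ends.

```javascript
// Illustrative only: picks the browsing sequence for a given iteration
// following the rule described for the sequences parameter.
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 0) {
        throw new Error("At least one browsing sequence is required");
    }
    if (sequences.length === 1) {
        // A single sequence is tried in every iteration.
        return sequences[0];
    }
    if (iteration >= sequences.length) {
        // Assumption: no sequence left means there is no next interval.
        return null;
    }
    // Several sequences: one per iteration, in order.
    return sequences[iteration];
}

var single = sequenceForIteration(["<next page sequence>"], 4);
var second = sequenceForIteration(["<sequence A>", "<sequence B>"], 1);
var exhausted = sequenceForIteration(["<sequence A>", "<sequence B>"], 2);
```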
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to be run when the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
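The control flow of Figure 7 can be tried outside ITPilot with a minimal sketch such as the one below. It is only an illustration: the Condition object, RUNTIME_ERROR and ON_ERROR_RAISE defined here are hypothetical stand-ins (the real Condition takes an ITPilot expression, not a JavaScript function), and visitPages is an invented example function. What it shows is that the loop body always runs at least once and runs again each time exec returns true.

```javascript
// Stand-ins so the sketch runs outside ITPilot (hypothetical, not the real objects).
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Hypothetical Condition: wraps a plain JavaScript predicate.
function Condition(predicate) {
    this.predicate = predicate;
}
// Accepts the error settings but ignores them in this sketch.
Condition.prototype.onError = function (errorType, action) {};
// Evaluates the predicate; true means "run the loop body again".
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Visits pages 1..maxPages; the body runs at least once, as in any do/while loop.
function visitPages(maxPages) {
    var visited = [];
    var page = 0;
    var repeat = null;
    repeat = new Condition(function () { return page < maxPages; });
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        page = page + 1;
        visited.push(page);
    } while (repeat.exec([]));
    return visited;
}
```

Calling visitPages(3) visits pages 1, 2 and 3, while visitPages(0) still visits page 1, because the condition is only checked after the first pass.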
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no post navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input, in which case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each record obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
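As an illustration of how these functions fit together, the following is a minimal sketch of what a mycustom.js file might look like for a component that returns a string. Several things here are assumptions made for the example: the upper-casing logic is invented, STRING_TYPE is declared locally only so the sketch runs on its own (ITPilot may already provide the constant), and the input structure is left as a placeholder because its format is not described in this section. mycustom_getOutputStructure is omitted because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Output type constant, using the value listed in this section.
// Declared locally only so that the sketch is self-contained.
var STRING_TYPE = 13;

// Main function: receives the component input and returns the component output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Hypothetical logic: return the input text in upper case.
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Input schema. Placeholder only: the expected format is not shown in this section.
function mycustom_getInputStructure() {
    return null;
}

// Output type: this sketch declares a string result.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

The function name prefix must match the component name, so a component called mycustom exposes mycustom_main, mycustom_getInputStructure and mycustom_getOutputType.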
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
1 INTRODUCTION
Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties of accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.
This manual describes the Java development API, which allows creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. First, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application. This API is described in section 3. The other option, described in this section, is to expose the wrappers through Web Services. A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).
The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server. The types of Web services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.
The following sections describe the querying process for these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed with any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

    http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The samples/itpilot/itp-clients directory of the ITPilot distribution contains a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web service version, listing the available operations.
Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported web service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by a call to each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web service, the following URL format should be used:
httphostportserviceNamerestopNamexsd Example again if the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS the following URL will obtain the XML Schema of the data returned by the operation getPRODUCTDATA
httpacme9090testWSrestgetPRODUCTDATAxsd The format used to invoke a specific operation in the REST version is the following

    http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ by ‘html’.
Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web service versions, respectively:

    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 would be obtained with:

    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
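The invocation format above can be assembled programmatically. The following sketch is only an illustration: buildOperationUrl is a hypothetical helper, not part of ITPilot, and it assumes the conventional URL layout (host and port separated by a colon, path segments by slashes, a question mark before the query parameters and an ampersand between them). Parameter names and values are URL-encoded.

```javascript
// Hypothetical helper: builds the URL that invokes one operation of a published
// Web service. "version" is expected to be "rest" or "html".
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName + "/" + version + "/" + opName;
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Encode both sides so that special characters survive in the query string.
            pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(params[name]));
        }
    }
    // Operations without parameters are invoked with no query string at all.
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}
```

For instance, buildOperationUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", { prod_id: 1 }) yields the REST form of the last example, and passing an empty object for an operation without parameters yields the bare operation URL.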
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults: if this parameter is set to true, the table displays the number of results obtained by the wrapper.
• intervalsize: if this parameter is indicated, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: maximum number of results to be displayed. If the wrapper run returns more results than indicated, the excess results are discarded.
• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text unless the size indicated in this parameter is exceeded, in which case carriage returns are added to divide the text into lines.
• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file found in the WEB-INF path of the exported web service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed) has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are “true” or “false”.

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server and querying it. It also allows a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance allows obtaining a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution. When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case it is possible to specify any database created in the Virtual DataPort server in the connection, and to use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).
In this case the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is made and the associated password.
Note that even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as the parameter.
• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as the parameter.
• void loadWrapper(String vql): takes as its input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(): returns the VQL description of all wrappers in the ITPilot execution server.
33 USING WRAPPERS
Once a reference to a wrapper has been obtained (instance of the class HTMLWrapperProxy) various operations can be carried out on it through the methods of said class To execute a query to a wrapper we will use the method
HTMLWrapperResultIterator query(Map params) The query to be executed is represented as a map of pairs name of attributevalue The attribute names must match the names of the input parameters specified during the creation of the wrapper The values must be specified as character strings even when the input parameters expected by the wrapper belong to other type For example if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it we must pass the ldquo325rdquo string In the case of float double and date data types it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or in case of date data types the date pattern if it was set It is important to take into account that for the query to execute correctly a value must be specified for all the mandatory attributes See [GENER] for more information on the process of generating wrappers in ITPilot Although most of the applications will not require this a wrapper schema can be obtained using the method
HTMLWrapperMetaRegisterRawVO getSchema() This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of said schema The schema was defined during the generation of the wrapper (see [GENER]) The results returned by a wrapper follow a hierarchical structure Each output tuple contains a value for every attribute contained in the wrapper response Each attribute may be either atomic or compound The value of atomic attributes can be of any of the basic data types available in ITPilot int long float double text date
ITPilot 46 Developer Guide
ITPilot Development API 9
Boolean or blob The value of a compound attribute is always an array of registers In the same form each register will be composed of several fields and again these fields may be either atomic or compound For example a wrapper that returns data on movies may have a schema in which each result is comprised of the fields TITLE DIRECTOR and EDITIONS TITLE and DIRECTOR are atomic fields and EDITIONS is a compound field containing data on various editions available of the movie (DVD VHS directorrsquos cut etc) The value of EDITIONS is an array of registers where each register contains the fields FORMAT PRICE and DESCRIPTION all of which are atomic The invocation to getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO which represents the schema of a ldquohierarchicalrdquo register of the type described above See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO It is also possible to access the characteristics of the various atomic fields that comprise the schema Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO Specifically the following information can be obtained from an atomic field its type by using the method javalangClass getType() whether the value is obtained from the source or not (that is to know if it is a searchable field that can not be found in the output schema using the method boolean isSearchStatus()) and in that case whether it is mandatory or not (method boolean isMandatoryStatus()) Furthermore if they have been defined during the generation process it is also possible to obtain the regular expression (method javalangString getRegexp()) and the aliases defined for each field (method javautilList getTextValues()) Finally the methods
    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting through the API whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
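The hierarchical schema of the movies example described earlier in this section can be pictured as a small tree. The following is a minimal sketch under stated assumptions: SchemaField and MovieSchemaSketch are hypothetical stand-ins written for illustration only, and they do not reproduce the HTMLWrapperMetaRegisterRawVO or HTMLWrapperMetaSimpleRawVO classes, whose actual methods are documented in the Javadoc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a schema node; NOT part of the ITPilot API.
class SchemaField {
    final String name;
    final String type; // basic type name, or a label such as "array" for compound fields
    final List<SchemaField> children = new ArrayList<SchemaField>();

    SchemaField(String name, String type) {
        this.name = name;
        this.type = type;
    }

    // In this sketch a field counts as compound when it has child fields.
    boolean isCompound() {
        return !children.isEmpty();
    }

    SchemaField add(SchemaField child) {
        children.add(child);
        return this;
    }

    // Renders this field and, recursively, the fields nested below it.
    String describe(String indent) {
        StringBuilder sb = new StringBuilder(indent + name + " : " + type + "\n");
        for (SchemaField child : children) {
            sb.append(child.describe(indent + "  "));
        }
        return sb.toString();
    }
}

class MovieSchemaSketch {
    // Builds the example schema: TITLE, DIRECTOR and the compound field EDITIONS.
    static SchemaField movieSchema() {
        SchemaField editions = new SchemaField("EDITIONS", "array")
                .add(new SchemaField("FORMAT", "text"))
                .add(new SchemaField("PRICE", "double"))
                .add(new SchemaField("DESCRIPTION", "text"));
        return new SchemaField("MOVIE", "register")
                .add(new SchemaField("TITLE", "text"))
                .add(new SchemaField("DIRECTOR", "text"))
                .add(editions);
    }

    public static void main(String[] args) {
        System.out.print(movieSchema().describe(""));
    }
}
```

The root name MOVIE and the type labels "register" and "array" are invented for the sketch; only the field names and the atomic types come from the example in this guide.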
3.4 PROCESSING QUERY RESULTS
The query method, used to execute queries against a wrapper, returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network).
The method hasNext() checks whether there are still elements to return. Because access is asynchronous, this method must be called before accessing each element to make sure that data is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

    com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): if an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
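The calling pattern described in this section (test hasNext() before every next(), then check for errors once the results are exhausted) can be sketched with a stand-in class. This is a minimal sketch under stated assumptions: FakeResults and ResultLoopSketch are hypothetical names, the stand-in is synchronous, and it only mimics the method names mentioned above; it is not the HTMLWrapperResultIterator class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a result iterator; NOT the Denodo class.
class FakeResults implements Iterator<String> {
    private final Iterator<String> rows;
    private final String error; // null when the simulated query ended cleanly

    FakeResults(List<String> rows, String error) {
        this.rows = rows.iterator();
        this.error = error;
    }

    public boolean hasNext() {
        return rows.hasNext();
    }

    public String next() {
        if (!rows.hasNext()) {
            throw new NoSuchElementException("no result available");
        }
        return rows.next();
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    boolean checkErrors() {
        return error != null;
    }

    String getErrorDescription() {
        return error;
    }
}

class ResultLoopSketch {
    // Collects every available result, then appends the error description, if any.
    static List<String> drain(FakeResults results) {
        List<String> seen = new ArrayList<String>();
        while (results.hasNext()) { // always test before calling next()
            seen.add(results.next());
        }
        if (results.checkErrors()) {
            seen.add("ERROR: " + results.getErrorDescription());
        }
        return seen;
    }

    public static void main(String[] args) {
        FakeResults results = new FakeResults(Arrays.asList("row 1", "row 2"), "simulated timeout");
        for (String line : drain(results)) {
            System.out.println(line);
        }
    }
}
```

Checking for errors only after the loop mirrors the order used in the example of section 3.5; a real client may prefer to check earlier as well.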
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding sections:

    TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper, using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    // DoubleVO is assumed to live in the same package as SimpleVO
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

    public class ITPilotExample {

        public static void main(String[] args) {
            try {
                // Connect to the server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get the wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare the query parameters
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute the query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate over the results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print the atomic fields TITLE and DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get the EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over the EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();

                        SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\tDESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check for errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server.");
            } finally {
                // Nothing to release in this example
            }
        }
    }
Figure 1 Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by following naming conventions, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

    $DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class that contains all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
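The equivalency table above can be expressed as a simple lookup. The following minimal sketch assumes nothing beyond the table itself; the class ItpTypeMapping and the method itpilotTypeOf are hypothetical names used for illustration and are not part of any Denodo library.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the mapping is copied from the equivalency table.
class ItpTypeMapping {
    private static final Map<Class<?>, String> TYPES = new LinkedHashMap<Class<?>, String>();
    static {
        TYPES.put(Integer.class, "int");
        TYPES.put(Long.class, "long");
        TYPES.put(Float.class, "float");
        TYPES.put(Double.class, "double");
        TYPES.put(Boolean.class, "boolean");
        TYPES.put(String.class, "text");
        TYPES.put(Calendar.class, "date");
        TYPES.put(byte[].class, "binary");
    }

    // Returns the ITPilot type name, or null when the Java type is not in the table.
    // The lookup is exact: subclasses of a listed class are not matched.
    static String itpilotTypeOf(Class<?> javaType) {
        return TYPES.get(javaType);
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(Double.class)); // prints "double"
        System.out.println(itpilotTypeOf(int.class));    // prints "null": primitives are not listed
    }
}
```

The second call illustrates the note above: the primitive type int has no entry, whereas its wrapper class java.lang.Integer does.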
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the parameters of the Java method. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or a register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Its definition is optional: if it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the following parameters:
    • name: name of the custom function.
    • type: in ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation that provides the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
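The naming conventions described at the beginning of this section need no Denodo library, so they can be sketched in plain Java. The following is a minimal sketch, not a definitive implementation: the class name reuses the Concat_SampleItpFunction example from the text, while the decision to skip null inputs and the declared return type Class<?> of executeReturnType are assumptions made for illustration.

```java
// The "ItpFunction" suffix makes the function name Concat_Sample.
// Declared without "public" only so that this sketch compiles as a single file;
// a deployable class would normally be public, in its own source file.
class Concat_SampleItpFunction {

    // Signature with arity n: the last (here, the only) parameter is an array (varargs).
    public String execute(String... inputs) {
        StringBuilder sb = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null inputs are skipped
                sb.append(input);
            }
        }
        return sb.toString();
    }

    // Optional companion method: reports the return type without running execute.
    // The declared return type Class<?> is an assumption of this sketch.
    public Class<?> executeReturnType(String... inputs) {
        return String.class;
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction f = new Concat_SampleItpFunction();
        System.out.println(f.execute("IT", "Pilot"));        // prints "ITPilot"
        System.out.println(f.executeReturnType().getName()); // prints "java.lang.String"
    }
}
```

For a function that always returns text, executeReturnType adds little; it pays off when computing the type is much cheaper than running the function.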
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The additional method must have the same number of parameters as the execute method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the corresponding Java class; i.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to find out the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits a string around matches of a given regular expression and returns the array of the resulting substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor
        public CustomArrayValue split_sample(
                @CustomParam(name = "regexp") String regex,
                @CustomParam(name = "value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                // A fresh map per record, so that records never share state
                LinkedHashMap<String, Object> results =
                        new LinkedHashMap<String, Object>(1);
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (an array of one-field records) without running the split
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }
Figure 2 ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers in full by means of the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with edition 3 of the ECMA-262 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3:

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }
Figure 3 ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist [1].

These functions correspond to components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

[1] Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the business logic of the wrapper is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function determines the output structure of the wrapper, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog of section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

    rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
    rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections define them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the structure of the records contained in the array.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of an error of a given type. The onError function can be invoked several times with different errorId parameter values.
    o errorId: indicates the type of error whose behavior is being configured. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the value will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
        • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with the ON_ERROR_RETRY action, but continues with the wrapper execution if the error is still happening after the retries.
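The four error actions listed above can be modeled in a few lines. The following is a minimal sketch under stated assumptions, not ITPilot's implementation: ErrorAction, ErrorPolicySketch and the maxRetries argument are hypothetical names, the pause between retries is omitted, and a single Callable stands in for the unit of work that is rerun (the guide states that the retry actions rerun the wrapper).

```java
import java.util.concurrent.Callable;

// Hypothetical model of the error actions; the constant names follow the guide.
enum ErrorAction { ON_ERROR_RAISE, ON_ERROR_IGNORE, ON_ERROR_RETRY, ON_ERROR_RETRY_IGNORE }

class ErrorPolicySketch {

    // Runs the work under the given action. maxRetries is an assumed setting
    // that only matters for the two retry actions.
    static <T> T run(Callable<T> work, ErrorAction action, int maxRetries) throws Exception {
        boolean retrying = action == ErrorAction.ON_ERROR_RETRY
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        int attempts = retrying ? Math.max(0, maxRetries) + 1 : 1;
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return work.call();
            } catch (Exception e) {
                last = e; // remember the failure; loop again if attempts remain
            }
        }
        boolean ignoring = action == ErrorAction.ON_ERROR_IGNORE
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        if (ignoring) {
            return null; // continue the run; the failed work yields null
        }
        throw last; // stop the run, reporting the error
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        Callable<String> flaky = new Callable<String>() {
            public String call() throws Exception {
                if (++calls[0] < 3) {
                    throw new Exception("simulated connection error");
                }
                return "page loaded";
            }
        };
        // Fails twice, then succeeds on the third attempt (two retries allowed).
        System.out.println(run(flaky, ErrorAction.ON_ERROR_RETRY, 2));
    }
}
```

Returning null for the ignored case mirrors the general rule stated for ON_ERROR_IGNORE; the exceptions noted for FILTER and RECORD CONSTRUCTOR are not modeled.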
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log file and 5 means that all message types will be written to the log file. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used to separate the HTML page elements when the result page is generated, so that each of them can be identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical and "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component and returns a character string with the HTML content of the pages, with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): sets the start tag for added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): sets the end tag for added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): sets the start tag for deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): sets the end tag for deleted content.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether HTML tag attributes are ignored when comparing the pages.
    • simplifyTags: "true" means that HTML tag attributes are ignored; with "false" they are taken into account.
  o setCaseInsensitive(toLowerCase): determines whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" converts all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether deleted content is shown in the result page.
    • mergedDeletions: if "true", deleted content is shown. If "false", the values set with setDeletionPrefixLabel and setDeletionSuffixLabel are not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl regular expression (see [PERL]).
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl regular expression (see [PERL]).
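To make the role of the prefix and suffix labels more concrete, the following sketch marks added and deleted tokens between two strings. It is not the ITPilot Diff implementation: diffSketch, the labels object and its field names are hypothetical, tokens are split on whitespace, and the differences are found with a plain longest-common-subsequence table. The bracket labels are placeholders standing in for the HTML tags mentioned above.

```javascript
// Illustrative sketch only; not the ITPilot Diff implementation.
// Tokenizes on whitespace (an assumption) and marks differences
// found with a longest-common-subsequence (LCS) table.
function diffSketch(baseCode, finalCode, labels) {
    var a = baseCode.split(/\s+/);
    var b = finalCode.split(/\s+/);
    var i, j;

    // lcs[i][j] = length of the LCS of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs[i] = [];
        for (j = 0; j <= b.length; j++) {
            lcs[i][j] = 0;
        }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = (a[i] === b[j])
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    var out = [];
    var changed = false;
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);                                        // unchanged token
            i++;
            j++;
        } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.addPrefix + b[j] + labels.addSuffix);  // new content
            j++;
            changed = true;
        } else {
            out.push(labels.delPrefix + a[i] + labels.delSuffix);  // deleted content
            i++;
            changed = true;
        }
    }
    // Mirrors the idea behind setNullWhenEquals(true).
    return (!changed && labels.nullWhenEquals) ? null : out.join(" ");
}

var labels = { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]", nullWhenEquals: true };
var marked = diffSketch("a b c", "a x c", labels);
var same = diffSketch("a b", "a b", labels);
```

In this sketch, unchanged tokens are emitted as they are, while each inserted or removed token is wrapped in the corresponding pair of labels.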
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is translated into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, given as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page by applying a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification given in the constructor. It returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. The default is "true".
  o setI18n(i18n): sets the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides several internationalization configurations, such as ES_EURO, US_PST and GB. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page given as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component and returns the string-type or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method used to recover the page state. With POST synchronization, ITPilot sends a POST message to the page URL with the POST parameters that were originally used to access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method is used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further extraction operations are going to be executed.
    • back: NSEQL program for the back sequence.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused.
    • reusingConnection: if "true", the connection coming from previous components is reused; if "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, which happens when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
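The way syncWithPost, setBackSequence and setBackPages interact can be read as a decision order. The helper below is a hypothetical sketch of that order as described in this section; chooseBackStrategy, the settings object and its field names are assumptions made for the example, and the function only reports which mechanism would apply rather than driving a browser.

```javascript
// Illustrative only: models the order in which the text says ITPilot chooses
// how to recover the page state. The settings object and its field names
// are assumptions made for this sketch.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // Re-send the POST parameters originally used to reach the page.
        return { method: "POST" };
    }
    if (settings.backSequence) {
        // Run the explicit NSEQL back sequence set with setBackSequence.
        return { method: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Fall back to the NSEQL Back() command, going back the number of pages
    // set with setBackPages (1 is an assumed default for this sketch).
    return { method: "BACK_COMMAND", pages: settings.backPages || 1 };
}

var viaPost = chooseBackStrategy({ syncWithPost: true });
var viaSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back sequence>" });
var viaBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```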
5.3.13 Filter
• Object: Filter
• Description: filters a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: selection expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
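The effect of ON_ERROR_IGNORE described in the note can be sketched as follows. This is a stand-in, not the ITPilot Filter component: filterSketch is hypothetical, the condition is passed as a plain JavaScript function instead of an ITPilot expression string, and the ON_ERROR_* constants are redefined locally with arbitrary values.

```javascript
// Illustrative stand-in; not the ITPilot Filter component.
// ON_ERROR_* are local constants for this sketch, named after the handlers in the text.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

function filterSketch(inputRecords, predicate, errorMode) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorMode !== ON_ERROR_IGNORE) {
                throw e;    // propagate the error, as with ON_ERROR_RAISE
            }
            // ON_ERROR_IGNORE: skip only the record that failed
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];

function cheaperThan20(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing");
    }
    return record.PRICE < 20;
}

var kept = filterSketch(records, cheaperThan20, ON_ERROR_IGNORE);
```

With ON_ERROR_IGNORE the record whose evaluation fails is left out and the remaining records are still filtered; with ON_ERROR_RAISE the same input would abort the whole operation.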
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form; each run uses predetermined values for each of the fields involved.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple-selection field of the target form.
    • field: name of the multiple-selection field.
    • position: position of the field among those with the same name, starting at 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple-selection field of the chosen form.
    • field: name of the multiple-selection field.
    • position: position of the field among those with the same name, starting at 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or merely contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result and "false" otherwise.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether the page state is recovered by issuing a POST message to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run, which happens when neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
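The hasNext/next loop and the effect of CLICKED_AND_NON_CLICKED_ELEMENT can be pictured with a small stand-in. FormIteratorSketch below is hypothetical and not the ITPilot Form_Iterator: it assumes that one iteration is produced for every combination of the configured field values, its method signatures are simplified (the position arguments are omitted), it returns plain objects instead of pages, and the constants are redefined locally with arbitrary values.

```javascript
// Illustrative stand-in; not the ITPilot Form_Iterator. It assumes that one
// iteration is produced for every combination of the configured field values.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non_clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

function FormIteratorSketch() {
    this.fields = [];      // [{ name, options }]
    this.counters = [];    // current option index per field
    this.done = false;
    this.maxIterations = Infinity;
    this.produced = 0;
}

FormIteratorSketch.prototype.input = function (field, values) {
    this.fields.push({ name: field, options: values });
    this.counters.push(0);
};

FormIteratorSketch.prototype.click = function (field, state) {
    var options = (state === CLICKED_AND_NON_CLICKED_ELEMENT)
        ? [true, false]                    // two combinations: marked and unmarked
        : [state === CLICKED_ELEMENT];
    this.input(field, options);
};

FormIteratorSketch.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};

FormIteratorSketch.prototype.hasNext = function () {
    return !this.done && this.produced < this.maxIterations;
};

FormIteratorSketch.prototype.next = function () {
    var combination = {};
    var i;
    for (i = 0; i < this.fields.length; i++) {
        combination[this.fields[i].name] = this.fields[i].options[this.counters[i]];
    }
    // Advance the counters like an odometer, last field fastest.
    for (i = this.fields.length - 1; i >= 0; i--) {
        this.counters[i]++;
        if (this.counters[i] < this.fields[i].options.length) {
            break;
        }
        this.counters[i] = 0;
    }
    if (i < 0) {
        this.done = true;   // every counter wrapped around: all combinations used
    }
    this.produced++;
    return combination;
};

var iterator = new FormIteratorSketch();
iterator.input("city", ["Madrid", "Boston"]);
iterator.click("onlyOffers", CLICKED_AND_NON_CLICKED_ELEMENT);
var combinations = [];
while (iterator.hasNext()) {
    combinations.push(iterator.next());
}
```

Under these assumptions, two input values combined with a checkbox set to CLICKED_AND_NON_CLICKED_ELEMENT yield four iterations; setMaxIterations would cap that number.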
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identifier.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser identifier.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once it is specified the values must conform to it; otherwise the wrapper may fail to run.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides several i18n configurations, such as ES_EURO, US_PST and GB. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function, which runs the component and returns a record representing the wrapper initialization parameters.
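The three values of the obl parameter can be illustrated with a simplified stand-in. InitSketch below is hypothetical and not the ITPilot Init component: it only models setText, its exec takes the caller's query parameters as an explicit argument (the real exec() takes none and, according to the text, reads them from the JavaScript context), and the constants are redefined locally with arbitrary values.

```javascript
// Illustrative stand-in; not the ITPilot Init component. The constants reuse
// the names from the text, but their values here are arbitrary.
var OPTIONAL = "optional";
var OBLIGATORY = "obligatory";
var FIXED = "fixed";

function InitSketch() {
    this.fields = [];
}

// Mirrors setText(field, obl, fixedValue); obl defaults to OPTIONAL.
InitSketch.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// Builds the initialization record from the query parameters supplied by the caller.
InitSketch.prototype.exec = function (queryParameters) {
    var record = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;              // constant; caller input is ignored
        } else if (queryParameters.hasOwnProperty(f.name)) {
            record[f.name] = queryParameters[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;                      // optional and not supplied
        }
    }
    return record;
};

var init = new InitSketch();
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");                    // OPTIONAL by default
init.setText("LANGUAGE", FIXED, "en");
var initRecord = init.exec({ TITLE: "Dune" });
```

In this sketch a query that omits TITLE is rejected, AUTHOR may be left out, and LANGUAGE always carries its fixed value.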
5.3.17 Iterator
• Object: Iterator
• Description: iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
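For comparison with the parallel structure of Figure 5, the sequential use of hasNext and next can be sketched as below. IteratorSketch is a hypothetical stand-in written for this example, not the ITPilot Iterator; records are plain JavaScript objects.

```javascript
// Illustrative stand-in; not the ITPilot Iterator component.
function IteratorSketch(list) {
    this.list = list;
    this.position = 0;
}

IteratorSketch.prototype.hasNext = function () {
    return this.position < this.list.length;
};

IteratorSketch.prototype.next = function () {
    if (!this.hasNext()) {
        throw new Error("No more records");
    }
    return this.list[this.position++];
};

var iterator = new IteratorSketch([{ NAME: "first" }, { NAME: "second" }]);
var visited = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    visited.push(recordInstance.NAME);   // records come back in list order
}
```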
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available through JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
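The roles of maxPoolSize, initialPoolSize and the check query can be pictured with a toy pool. PoolSketch below is hypothetical and unrelated to the actual ITPilot or JDBC code: connections are plain objects, openConnection and runQuery are caller-supplied functions invented for the example, and "SELECT 1" is only a placeholder check query.

```javascript
// Illustrative stand-in for a connection pool; not ITPilot code. Connections
// are plain objects and runQuery is a caller-supplied function, both assumptions.
function PoolSketch(maxPoolSize, initialPoolSize, checkQuery, openConnection, runQuery) {
    this.maxPoolSize = maxPoolSize;
    this.checkQuery = checkQuery;
    this.openConnection = openConnection;
    this.runQuery = runQuery;
    this.idle = [];
    this.inUse = 0;
    for (var i = 0; i < initialPoolSize; i++) {
        this.idle.push(openConnection());     // idle connections kept ready for use
    }
}

PoolSketch.prototype.acquire = function () {
    while (this.idle.length > 0) {
        var cached = this.idle.pop();
        try {
            this.runQuery(cached, this.checkQuery);   // verify the cached connection
            this.inUse++;
            return cached;
        } catch (e) {
            // stale connection: drop it and try the next one
        }
    }
    if (this.inUse >= this.maxPoolSize) {
        throw new Error("Pool exhausted");
    }
    this.inUse++;
    return this.openConnection();
};

PoolSketch.prototype.release = function (connection) {
    this.inUse--;
    this.idle.push(connection);
};

var opened = 0;
function openConnection() {
    opened++;
    return { id: opened, alive: true };
}
function runQuery(connection, sql) {
    if (!connection.alive) {
        throw new Error("connection closed");
    }
    return [];
}

var pool = new PoolSketch(2, 1, "SELECT 1", openConnection, runQuery);
var first = pool.acquire();      // reuses the connection opened at start-up
first.alive = false;             // simulate the database dropping it
pool.release(first);
var second = pool.acquire();     // the stale connection is discarded, a new one is opened
```

The sketch shows why the check query should be cheap: it runs every time a cached connection is handed out.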
5.3.19 Loop
• Description: This component allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}

Figure 6. Using the Loop function
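The pattern in Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. The following is only a minimal sketch under stated assumptions: the Condition stub below wraps a plain JavaScript predicate (the real component evaluates an ITPilot condition expression, see section 5.3.5), the two constants are dummy values, and countIterations is a hypothetical helper name.

```javascript
// Stand-ins so the sketch runs outside ITPilot; inside a wrapper these
// names are provided by the ITPilot runtime.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Hypothetical stub: wraps a predicate instead of an ITPilot expression.
function Condition(predicate) {
    this.predicate = predicate;
    this.errorAction = null;
}
Condition.prototype.onError = function (errorType, action) {
    this.errorAction = action; // recorded only; the stub never raises
};
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// WHILE ... DO: the condition is checked before every pass,
// so the body may run zero times.
function countIterations(limit) {
    var done = 0;
    var loop = new Condition(function () { return done < limit; });
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        done++; // <loop operations> would go here
    }
    return done;
}
```

Because the condition is evaluated before each pass, a condition that is false from the start means the loop operations are never run.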
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this component iterates over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: this component places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this component constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression, e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: This component creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel with it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: This component allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript using a do ... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    ...
} while (repeat.exec([]));

Figure 7. Using the Repeat function
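One practical difference between Loop (section 5.3.19) and Repeat is the point at which the condition is evaluated. The plain JavaScript sketch below illustrates only that point; it uses ordinary functions in place of Condition objects, and the names runLoop and runRepeat are hypothetical.

```javascript
// Pre-checked condition (Loop, while): the body may run zero times.
function runLoop(condition, body) {
    var passes = 0;
    while (condition()) {
        body();
        passes++;
    }
    return passes;
}

// Post-checked condition (Repeat, do ... while): the body runs at least once.
function runRepeat(condition, body) {
    var passes = 0;
    do {
        body();
        passes++;
    } while (condition());
    return passes;
}

var never = function () { return false; };
var noop = function () {};
var loopPasses = runLoop(never, noop);     // 0 passes
var repeatPasses = runRepeat(never, noop); // 1 pass
```

With a Repeat component, the loop operations are therefore always executed once before the condition is first consulted.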
5.3.26 Script
• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: This component creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
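The interplay between syncWithPost, setBackSequence and setBackPages described above can be summarized as a decision order. The sketch below is only an illustration of that order, not ITPilot code: chooseBackStrategy and the shape of its config argument are hypothetical, and the strategy labels are made up for this example.

```javascript
// Returns which mechanism would be used to get back to the source page.
// config.syncWithPost: boolean, treated as true when absent because POST
//                      synchronization is described as the default
// config.backSequence: NSEQL back program, if one was set
// config.backPages:    number of pages for NSEQL Back(), if one was set
function chooseBackStrategy(config) {
    var usePost = config.syncWithPost !== false;
    if (usePost) {
        return { strategy: "post" };
    }
    if (config.backSequence) {
        return { strategy: "back-sequence", sequence: config.backSequence };
    }
    // No POST synchronization and no explicit back sequence:
    // fall back to the NSEQL Back() command.
    return { strategy: "nseql-back", pages: config.backPages };
}
```

For instance, chooseBackStrategy({ syncWithPost: false, backPages: 2 }) selects the NSEQL Back() fallback with two pages, since neither of the other two mechanisms applies.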
5.3.28 Store File
• Object: StoreFile
• Description: this component stores the contents passed as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this component represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
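As an illustration of the functions listed above, the following is a minimal sketch of what part of a mycustom.js file might look like for a component that returns a string. It rests on several assumptions: the input is treated as a plain string, the upper-casing logic is invented for the example, and STRING_TYPE is declared locally with the value from the list above only so that the sketch is self-contained. mycustom_getInputStructure is left out because the format of the structure it returns is not described in this section, and mycustom_getOutputStructure is not needed for a string output type.

```javascript
// Declared here only to keep the sketch self-contained (value taken from
// the list of output types above).
var STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component:
// receives the component input and returns its output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        // Example logic only: normalize the input text.
        mycustom_output = String(mycustom_input).trim().toUpperCase();
    }
    return mycustom_output;
}

// Declares the component output type. Since it is neither RECORD_TYPE
// nor LIST_TYPE, no output structure function is required.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```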
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8. Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: the VQL statement is written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the VQL escape character (see [VQL]).
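The escaping step mentioned in the note can be automated before the statement is assembled. The helper below is a hedged sketch only: escapeForVql is a hypothetical name, and both the delimiting quote character and the escape character are passed in as parameters because their actual values are assumptions here and should be checked against [VQL].

```javascript
// Prefixes every occurrence of quoteChar in jsCode with escapeChar.
// Both characters are parameters: their real values depend on the VQL
// syntax and are not fixed by this sketch.
function escapeForVql(jsCode, quoteChar, escapeChar) {
    var out = "";
    for (var i = 0; i < jsCode.length; i++) {
        var ch = jsCode.charAt(i);
        if (ch === quoteChar) {
            out += escapeChar;
        }
        out += ch;
    }
    return out;
}
```

The escaped text would then take the place of jscode in the CREATE WRAPPER ITP statement.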
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. The other option is to expose the wrappers through Web Services, which is described in this section. A Web Service containing the following operations can be generated for a particular wrapper:
• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters, plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).
The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.
2.1 WEB SERVICE TYPES
ITPilot allows a wrapper to be published as a Web Service so that it can be used by any external application. The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server. The types of Web services that ITPilot can publish are:
• SOAP [SOAP] Web Services.
• REST-style Web Services that use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.
The following sections describe the querying process for these Web Services.
2.2 INVOKING SOAP WEB SERVICES
The SOAP version of the published Web Services can be accessed using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.
2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web service version, listing the available operations.
Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported Web service is testWS, the access URLs for the information page in the REST (XML output) and HTML versions are:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the .xsd file describing the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web service, the following URL format should be used:

http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service is testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing 'rest' with 'html'.

Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service is testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for the value 1 of this parameter are obtained with:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
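The invocation format above lends itself to a small helper on the client side. The following is a minimal sketch, assuming standard query-string syntax (a "?" before the first parameter and "&" between parameters); buildOperationUrl is a hypothetical function name, not part of any ITPilot API, and percent-encoding of names and values is an addition of this example.

```javascript
// Builds the invocation URL for one operation of an exported Web service.
// version is "rest" or "html"; params maps parameter names to values.
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName +
              "/" + version + "/" + opName;
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Values are sent as text, so convert and percent-encode them.
            pairs.push(encodeURIComponent(name) + "=" +
                       encodeURIComponent(String(params[name])));
        }
    }
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}

var restUrl = buildOperationUrl("acme", 9090, "testWS", "rest",
                                "getPRODUCTDATABYPRODID", { prod_id: 1 });
```

Switching the version argument from "rest" to "html" yields the URL of the HTML version of the same operation.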
2.3.1 HTML Output Configuration
The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:
• shownumresults: If this parameter is given the value true, the table displays information on the number of results obtained by the wrapper.
• intervalsize: If this parameter is given, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: Maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.
• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.
• cellheight: Maximum number of lines in a cell, after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.
• width: Maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: Maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting the pagination interval size to 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine: httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
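The combined effect of maxresults and intervalsize can be pictured with a short sketch. This is an illustration only, not the code of the generated HTML service: the function name paginate is hypothetical, and it assumes that maxresults is applied first and that pagination then operates on the results that remain.

```javascript
// Applies maxresults first (excess results are discarded), then splits
// what is left into intervals of at most intervalsize results.
function paginate(results, maxresults, intervalsize) {
    var kept = (typeof maxresults === "number")
        ? results.slice(0, maxresults)
        : results.slice();
    if (typeof intervalsize !== "number" || intervalsize <= 0) {
        return [kept]; // no pagination requested: a single interval
    }
    var pages = [];
    for (var i = 0; i < kept.length; i += intervalsize) {
        pages.push(kept.slice(i, i + intervalsize));
    }
    return pages;
}

// 120 results with maxresults = 50 and intervalsize = 10
var sample = [];
for (var k = 0; k < 120; k++) {
    sample.push(k);
}
var pages = paginate(sample, 50, 10);
```

Under these assumptions, the example in the text (maxresults 50, intervalsize 10) would show five intervals of ten results each, whatever the number of results returned by the wrapper beyond 50.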
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web service (either inside the .war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are "true" or "false".

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server, and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution. When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them, and query processing. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways of connecting to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot. If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs. If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server in the connection to the server, and to use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID used for access, and the associated password. Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, and the predefined user 'admin' (with the initial password 'admin') is used to gain access.
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and for accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql): Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(): Returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query against a wrapper, use the method:
HTMLWrapperResultIterator query(Map params)
The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper belong to another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". For the float, double and date data types, make sure that the values are provided according to the internationalization configuration specified in the wrapper's Init component or, for date data types, the date pattern if one was set. For the query to execute correctly, a value must be specified for every mandatory attribute. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, the schema of a wrapper can be obtained using the method:
HTMLWrapperMetaRegisterRawVO getSchema()
This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether the value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods
void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)
allow setting via the API whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
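Returning to the query(Map params) method described earlier in this section: because every value must be passed as a character string formatted according to the wrapper's internationalization settings, it can help to centralize the formatting. The following is a minimal sketch using only standard Java classes. The parameter names MAX_PRICE and RELEASED_AFTER, the locale and the date pattern are assumptions made for illustration; they are not part of any real wrapper, and the resulting map would still have to be passed to query() on an HTMLWrapperProxy instance.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative helper, not part of the ITPilot API.
class QueryParamsSketch {

    // Formats a float using the decimal separator of the locale the wrapper is assumed to use.
    static String formatFloat(float value, Locale wrapperLocale) {
        NumberFormat format = NumberFormat.getNumberInstance(wrapperLocale);
        format.setGroupingUsed(false); // no thousands separator
        return format.format(value);
    }

    // Formats a date with the date pattern the wrapper is assumed to use.
    static String formatDate(Date value, String wrapperDatePattern, Locale wrapperLocale) {
        return new SimpleDateFormat(wrapperDatePattern, wrapperLocale).format(value);
    }

    // Builds the name/value map; every value is a String, whatever the parameter type.
    static Map<String, String> buildParams() {
        Map<String, String> params = new HashMap<String, String>();
        params.put("DIRECTOR", "Woody Allen");
        // Hypothetical float parameter
        params.put("MAX_PRICE", formatFloat(12.5f, Locale.US));
        // Hypothetical date parameter with an assumed pattern
        Date since = new GregorianCalendar(2011, Calendar.AUGUST, 16).getTime();
        params.put("RELEASED_AFTER", formatDate(since, "dd/MM/yyyy", Locale.US));
        return params;
    }
}
```

With a wrapper configured for a locale that uses a decimal comma, formatFloat would be called with that locale instead (for example Locale.GERMANY), so that the string matches what the Init component expects.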
3.4 PROCESSING QUERY RESULTS
The query method used to execute queries against a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time, through the network).
The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be called before accessing each element, to make sure that data elements are available.
The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO, BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): If an error has occurred, returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in section 3.3:
TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)
where TITLE and DIRECTOR are optional search fields. A query is then issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.
To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
// DoubleVO is used below; it is assumed to live in the same package as SimpleVO
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get Wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ": ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);
                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();
                    Map edition = editionVO.getValues();
                    // Fields of a register are read with getValue(name)
                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);
                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        } finally {
        }
    }
}
Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file and added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions alone, without dependencies on Denodo libraries, but developing them with Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in:
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a Jar contains one or more functions with name conflicts, nothing in that Jar will be loaded in the server.
• All custom functions stored in the same Jar are added or removed together, by uploading/removing the Jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:
Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.); the corresponding object types must be used instead.
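As an illustration of the type mapping in the table, the following sketch shows a function that takes a date and returns an int, written with object types only. The function name YEAR_SAMPLE and its behavior are assumptions made for this example. The class follows the "ItpFunction" naming convention described in section 4.1 and uses no Denodo libraries; how ITPilot handles null arguments has not been verified here, so the null check is a defensive choice rather than a documented requirement.

```java
import java.util.Calendar;

// Hypothetical custom function: the class name follows <FunctionName> + "ItpFunction".
// When packaged in a Jar, the class would be declared public in its own source file.
class Year_SampleItpFunction {

    // java.util.Calendar maps to the ITPilot type date, java.lang.Integer to int.
    // Parameters and return value are object types, never primitives.
    public Integer execute(Calendar date) {
        if (date == null) {
            return null; // defensive: propagate a missing value
        }
        return Integer.valueOf(date.get(Calendar.YEAR));
    }
}
```

Declaring the parameter as Calendar rather than a primitive or a String lets the signature be derived directly from the Java method, in line with the table above.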
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction"
This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the parameters of the Java method. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the Jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    o name: name of the custom function.
    o type: In ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional parameter, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation that carries the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
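The naming-convention approach described at the start of this section can be sketched without any Denodo dependency. The class below is a minimal illustration built from the Concat_SampleItpFunction name used in the text: a single execute method with a variable-length last parameter, plus the optional executeReturnType method. The concatenation behavior, the handling of null inputs and the exact form of executeReturnType for a variable-length signature are assumptions; the text only states that the method must have an equivalent signature and, for a String result, return java.lang.String.class.

```java
// Sketch of a convention-based custom function; no Denodo libraries are needed.
// When packaged in a Jar, the class would be declared public in its own source file.
class Concat_SampleItpFunction {

    // Signature with arity n: only the last parameter may repeat,
    // so it is declared as a Java varargs array.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null inputs are skipped
                result.append(input);
            }
        }
        return result.toString();
    }

    // Optional: reports the return type without running the function.
    // For a String result this is the java.lang.String class object.
    public Class<String> executeReturnType(String... inputs) {
        return String.class;
    }
}
```

Because the return type here never depends on the inputs, executeReturnType is not strictly needed; it is included only to show the shape of the optional method.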
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is optional in most cases, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
The return type of the additional method depends on what the execute method returns:
• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    // Function signature: splits value around matches of regex
    @CustomExecutor()
    public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // One register with a single text field per substring
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (array of registers) without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}
Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].
The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3:
function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}
Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).
These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.
(1) Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on, in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the business logic of the wrapper is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):
rs.setText(FIELD) is equivalent to rs.setText(FIELD, "", OPTIONAL)
rs.setText(FIELD, OBLIGATORY) is not valid. The following must be used: rs.setText(FIELD, "", OBLIGATORY)
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: Represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
    o toString(): transforms the record into a character string for display.
When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when a given type of error occurs. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is a connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, it is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the properties IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" when there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return "null", but it is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: retry the execution. The number of retries and the time between retries are configured on each component (for example, with the setRetries and setRetryDelay functions).
    • ON_ERROR_RETRY_IGNORE: retry the execution as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error persists after the retries.
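The four error actions above can be compared with a small, self-contained sketch. The constant names are taken from this section, but `runWithErrorAction` and `flakyStep` are hypothetical helpers written only for illustration; they are not part of the ITPilot API. The sketch retries a single step rather than a whole wrapper, and it assumes that ON_ERROR_RETRY raises the error once the retries are exhausted, which is how the description of ON_ERROR_RETRY_IGNORE reads.

```javascript
// Stand-in constants that mirror the names in this guide.
// This runner is an illustration, NOT the ITPilot implementation.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
const ON_ERROR_RETRY = "ON_ERROR_RETRY";
const ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs `step` and applies the chosen error action.
// `retries` plays the role of setRetries(count); the retry delay is omitted.
function runWithErrorAction(step, action, retries) {
  const retrying =
    action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
  const attempts = retrying ? 1 + retries : 1;
  let lastError = null;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return step(attempt);
    } catch (err) {
      lastError = err;
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null; // execution continues; the component yields null
  }
  throw lastError; // RAISE, or RETRY with every attempt exhausted
}

// Hypothetical step that fails on its first two attempts, then succeeds.
function flakyStep(attempt) {
  if (attempt < 2) {
    throw new Error("TIMEOUT_ERROR on attempt " + attempt);
  }
  return "page loaded";
}

const ignored = runWithErrorAction(flakyStep, ON_ERROR_IGNORE, 0);
const retried = runWithErrorAction(flakyStep, ON_ERROR_RETRY, 3);
const gaveUp = runWithErrorAction(flakyStep, ON_ERROR_RETRY_IGNORE, 1);
```

Here `ignored` and `gaveUp` are null, because the error is swallowed, while `retried` holds the result of the third attempt.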
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace file and 5 means that all message types are written to it. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: defines a condition. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: condition expression, written as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition described in the constructor against the input parameter elements and returns "true" or "false" depending on whether the condition is met.
    • elements: elements on which the condition is evaluated. This parameter must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]".
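The way positional placeholders such as $0 and $1 refer to the entries of the elements list can be illustrated with a minimal sketch. `ConditionSketch` is a hypothetical stand-in written for this example only: it accepts nothing but expressions of the form "($i OP $j)", whereas the real Condition component supports the functions of Appendix A of [GENER].

```javascript
// Hypothetical stand-in for illustration; NOT the ITPilot Condition component.
// It understands only expressions of the form "($i OP $j)".
class ConditionSketch {
  constructor(expr) {
    const match =
      /^\(\s*\$(\d+)\s*(<=|>=|<|>|==|!=)\s*\$(\d+)\s*\)$/.exec(expr);
    if (!match) {
      throw new Error("Unsupported expression: " + expr);
    }
    this.left = Number(match[1]);  // index of the left-hand element
    this.op = match[2];            // comparison operator
    this.right = Number(match[3]); // index of the right-hand element
  }

  // Applies the comparison to the elements at the two recorded positions.
  exec(elements) {
    const a = elements[this.left];
    const b = elements[this.right];
    switch (this.op) {
      case "<=": return a <= b;
      case ">=": return a >= b;
      case "<": return a < b;
      case ">": return a > b;
      case "==": return a === b;
      default: return a !== b; // "!="
    }
  }
}

const myCondition = new ConditionSketch("($0 <= $1)");
const met = myCondition.exec([3, 7]);    // first element <= second element
const notMet = myCondition.exec([9, 7]); // first element > second element
```

With these inputs `met` is true and `notMet` is false, matching the reading of the example expression given above.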
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used to separate the HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string with the HTML content of the pages in which the differences between them are marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added content.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added content.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted content.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted content.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard the HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; "false" means that they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.
    • mergedDeletions: if "true", the deleted content is shown. If "false", the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions are applied to the HTML tokens of the page; the tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
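The role of the prefix and suffix labels and of the token separator can be sketched with a generic token-level comparison. The function `diffTokens` below is a hypothetical illustration based on a longest-common-subsequence table; the guide does not state which algorithm the Diff component uses, and the bracket labels stand in for the HTML tags that the component uses by default.

```javascript
// Illustrative sketch only; NOT the algorithm of the ITPilot Diff component.
// Marks added and deleted tokens with the given labels.
function diffTokens(baseTokens, finalTokens, labels) {
  const n = baseTokens.length;
  const m = finalTokens.length;
  // lcs[i][j] = length of the longest common subsequence of
  // baseTokens[i..] and finalTokens[j..].
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = baseTokens[i] === finalTokens[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const out = [];
  let i = 0;
  let j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && baseTokens[i] === finalTokens[j]) {
      out.push(baseTokens[i]); // unchanged token
      i++;
      j++;
    } else if (j < m && (i === n || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
      j++;
    } else {
      out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
      i++;
    }
  }
  return out.join(labels.tokenSeparator);
}

// Hypothetical labels; the real defaults are HTML tags with coloured backgrounds.
const labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " ",
};
const marked = diffTokens(
  ["<p>", "old", "text", "</p>"],
  ["<p>", "new", "text", "</p>"],
  labels
);
```

For these inputs `marked` is the string `<p> [+new+] [-old-] text </p>`: unchanged tokens pass through, and each changed token is wrapped in the corresponding pair of labels.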
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    // Sequence component that wraps the ExecuteJS NSEQL command
    Execute_JavaScript_1 = new SEQUENCE("sequence:ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    // Raise every type of error
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    // Retry configuration: 3 retries, 3000 ms between retries
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, written as a character string in the same way as a condition expression. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value that results from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that is used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further data extraction operations will be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
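The order in which the three ways of returning to the source page are considered (syncWithPost, setBackSequence and setBackPages) can be summarised with a short decision sketch. `chooseResyncStrategy`, its configuration object and the strategy names are hypothetical and exist only to make the precedence explicit; they are not part of the ITPilot API.

```javascript
// Illustrative decision logic only; names and return values are hypothetical.
// Mirrors the precedence described for syncWithPost, setBackSequence
// and setBackPages.
function chooseResyncStrategy(config) {
  if (config.syncWithPost) {
    // Default method: re-send the POST that originally reached the page.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence was provided.
    return { kind: "BACK_SEQUENCE", nseql: config.backSequence };
  }
  // Fall back to the NSEQL Back() command, going back `backPages` pages.
  return { kind: "NSEQL_BACK", pages: config.backPages };
}

const byPost = chooseResyncStrategy({ syncWithPost: true });
const bySequence = chooseResyncStrategy({
  syncWithPost: false,
  backSequence: "<back sequence NSEQL program>",
});
const byBack = chooseResyncStrategy({ syncWithPost: false, backPages: 2 });
```

The three calls select the POST method, the explicit back sequence and the Back() command respectively.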
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: selection expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements, except for the one that caused the error.
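The behaviour described in the note can be pictured with the following sketch, in which a record that makes the selection expression fail is skipped while the remaining records are still filtered. `filterIgnoringErrors`, the sample records and the predicate are hypothetical; the real component evaluates an ITPilot expression rather than a JavaScript function.

```javascript
// Illustrative sketch only; NOT the ITPilot Filter component.
// Keeps the records that satisfy `predicate` and silently skips
// any record for which evaluating the predicate throws.
function filterIgnoringErrors(inputRecords, predicate) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (err) {
      // ON_ERROR_IGNORE reading: drop only the record that failed.
    }
  }
  return kept;
}

// Hypothetical records; the second one has no usable price.
const records = [{ price: 10 }, { price: null }, { price: 30 }, { price: 5 }];

// Hypothetical selection expression: price of at least 10.
function priceAtLeastTen(record) {
  if (record.price === null) {
    throw new Error("RUNTIME_ERROR: missing price");
  }
  return record.price >= 10;
}

const filtered = filterIgnoringErrors(records, priceAtLeastTen);
```

Here `filtered` contains the records priced 10 and 30: the record priced 5 fails the condition, and the record with no price is dropped because it caused the error.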
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop for a specific form, in which each run uses predetermined values for each of the form fields included.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each element of values in the event of replicated values.
    • equals: Boolean value that indicates whether the provided values must exactly match the field values or only be contained in them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be entered in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
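How a set of per-field values turns into a bounded sequence of iterations, consumed with hasNext() and next(), can be sketched as follows. `CombinationIteratorSketch` is a hypothetical stand-in: it assumes that the iterations are the Cartesian product of the values configured for each field, which this guide does not state explicitly, and it returns plain objects instead of pages. The two-valued newsletter field plays the role of a checkbox configured with CLICKED_AND_NON_CLICKED_ELEMENT.

```javascript
// Illustrative sketch only; NOT the ITPilot Form_Iterator component.
// ASSUMPTION: iterations are the Cartesian product of the field values.
class CombinationIteratorSketch {
  constructor(fieldValues, maxIterations) {
    this.names = Object.keys(fieldValues);
    this.values = fieldValues;
    const total = this.names.reduce(
      (acc, name) => acc * fieldValues[name].length,
      1
    );
    // maxIterations plays the role of setMaxIterations(count).
    this.limit = Math.min(total, maxIterations);
    this.index = 0;
  }

  hasNext() {
    return this.index < this.limit;
  }

  // Decodes the running index into one value per field (mixed-radix).
  next() {
    if (!this.hasNext()) {
      throw new Error("No more combinations");
    }
    let rest = this.index++;
    const combination = {};
    for (const name of this.names) {
      const options = this.values[name];
      combination[name] = options[rest % options.length];
      rest = Math.floor(rest / options.length);
    }
    return combination;
  }
}

// Hypothetical form: a select field and a checkbox tried both ways.
const formIterator = new CombinationIteratorSketch(
  { city: ["Madrid", "Lisbon"], newsletter: [true, false] },
  3
);
const combinations = [];
while (formIterator.hasNext()) {
  combinations.push(formIterator.next());
}
```

Four combinations exist, but the limit of 3 stops the loop after the third one, in the same way that a maximum number of iterations bounds the run loop.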
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the current state of the browser. The function can be executed with no parameters, for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL from which the page comes.
    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.
    • lastURLPostParameters: POST-method parameters of the URL from which the page comes.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once it is specified it becomes compulsory: if it is not followed, the wrapper may not run.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 44
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
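The three obl modes can be illustrated with a small sketch that checks the values of a query against a set of field definitions. It does not use the ITPilot runtime: validateQuery and the shape of the fields array are hypothetical names invented for this example, and only the mode names OPTIONAL, OBLIGATORY and FIXED come from the text above. How ITPilot itself treats a value supplied for a FIXED field is not stated in this guide; the sketch assumes the constant wins.

```javascript
// Hypothetical helper, not part of ITPilot: applies the obl rules described above.
function validateQuery(fields, query) {
    var result = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(query, f.name);
        if (f.obl === "FIXED") {
            // FIXED: the field always takes the constant given by fixedValue
            // (assumption: a value supplied in the query is ignored).
            result[f.name] = f.fixedValue;
        } else if (supplied) {
            result[f.name] = query[f.name];
        } else if (f.obl === "OBLIGATORY") {
            // OBLIGATORY: every query must supply a value.
            throw new Error("Missing obligatory parameter: " + f.name);
        }
        // OPTIONAL: the field may be left without a value.
    }
    return result;
}

// Field names below are illustrative only.
var fields = [
    { name: "KEYWORD", obl: "OBLIGATORY" },
    { name: "ONLY_NEW", obl: "OPTIONAL" },
    { name: "LANG", obl: "FIXED", fixedValue: "en" }
];

var params = validateQuery(fields, { KEYWORD: "denodo" });
console.log(JSON.stringify(params));
```

A query that omits KEYWORD raises an error in this sketch, while ONLY_NEW can be left out freely.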
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next element of the iteration. The list is an ordered sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
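For comparison, the plain sequential form of the same traversal can be sketched as follows. ListIterator is a hypothetical stand-in written for this example so that it runs outside ITPilot; it only mimics the documented interface (Constructor(list), hasNext(), next()) and is not the ITPilot implementation. The record contents are also illustrative.

```javascript
// Hypothetical stand-in that mimics the documented Iterator interface.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    // True while at least one element has not been returned yet.
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    // Returns the current element and advances to the following one.
    return this.list[this.position++];
};

var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
var iterator = new ListIterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
console.log(titles.join(", "));
```

Records are visited in list order, one per loop pass, which is the behaviour the thread-based variant distributes across concurrent executions.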
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object controlling when the loop ends. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
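The control flow of this pattern can be tried outside ITPilot with a small sketch. StubCondition is a hypothetical stand-in for the Condition component: its exec() evaluates an ordinary JavaScript predicate instead of an ITPilot condition expression, and the page counter is an invented example.

```javascript
// Hypothetical stand-in for the Condition component (not the ITPilot object).
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    // Returns true while the loop should keep running.
    return this.predicate(inputs);
};

var page = 1;
var visited = [];
var loop = new StubCondition(function () { return page <= 3; });

// The condition is checked before each pass, so the body may run zero times.
while (loop.exec([])) {
    visited.push(page);
    page++;
}
console.log(visited.join(","));
```

Because the condition is evaluated before the body, a condition that is false from the start skips the loop operations entirely.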
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of related pages, reached through one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next element of the iteration.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries when an error occurs while accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional. Indicates the starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object controlling when the loop ends. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
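The practical difference from a while loop is that the body of a do… while loop always runs at least once, because the condition is evaluated after each pass. The following sketch shows this outside ITPilot. StubCondition is a hypothetical stand-in for the Condition component that evaluates a plain JavaScript predicate; the counters are invented for the example.

```javascript
// Hypothetical stand-in for the Condition component (not the ITPilot object).
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Case 1: the body runs until the condition evaluated after it becomes false.
var attempts = 0;
var repeat = new StubCondition(function () { return attempts < 3; });
do {
    attempts++;
} while (repeat.exec([]));

// Case 2: the condition is false from the start, yet the body still runs once.
var runs = 0;
var never = new StubCondition(function () { return false; });
do {
    runs++;
} while (never.exec([]));

console.log(attempts + " " + runs);
```

This is the property to keep in mind when choosing between the Loop and Repeat components for a flow whose first pass must always happen.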
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical interface of the generation tool, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the records obtained in an extraction operation are to be processed concurrently.
• Functions:
  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is only necessary when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
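As a minimal sketch of such a file, consider a hypothetical component named “toupper” that returns its input in upper case. The component name, its behaviour and the object returned by toupper_getInputStructure are assumptions made for this example; in particular, this guide does not show the format ITPilot expects for the input schema, so that return value is only a placeholder. STRING_TYPE is declared locally so that the sketch runs outside ITPilot.

```javascript
// Value taken from the list of output types above; declared here only so the
// sketch can run outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical "toupper" custom component.
function toupper_main(toupper_input) {
    var toupper_output = null;
    if (toupper_input !== null && toupper_input !== undefined) {
        toupper_output = String(toupper_input).toUpperCase();
    }
    return toupper_output;
}

// Input schema. The real format is not shown in this guide, so the object
// returned here is a placeholder assumption.
function toupper_getInputStructure() {
    return { name: "text", type: STRING_TYPE };
}

// Output type: a plain string, so no toupper_getOutputStructure function is
// needed (it is only required for RECORD_TYPE or LIST_TYPE).
function toupper_getOutputType() {
    return STRING_TYPE;
}

console.log(toupper_main("itpilot"));   // prints "ITPILOT"
```

Every function carries the component name as prefix, which is how the main function and its companions are associated with the component.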
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
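The try… finally structure guarantees that SCOPE.close() runs even when the component fails. The sketch below illustrates this outside ITPilot. SCOPE and CUSTOM_COMPONENT are replaced here by hypothetical stand-ins that only record calls and upper-case a string; their behaviour, the runComponent helper and the names “toupper” and “toupper_1” are assumptions made for the example, not the ITPilot implementation.

```javascript
// Hypothetical stand-ins for the ITPilot objects SCOPE and CUSTOM_COMPONENT.
var log = [];
var SCOPE = {
    create: function () { log.push("create"); },
    close: function () { log.push("close"); }
};
function CUSTOM_COMPONENT(type) {
    this.type = type;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    if (input === null) {
        throw new Error(this.name + ": no input");
    }
    return String(input).toUpperCase();
};

// Same structure as Figure 8, wrapped in a helper so it can be called twice.
function runComponent(input) {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("toupper");
        mycustom.setComponentName("toupper_1");
        return mycustom.exec(input);
    } finally {
        // Runs on success and on failure alike.
        SCOPE.close();
    }
}

console.log(runComponent("abc"));
try {
    runComponent(null);
} catch (e) {
    console.log("failed: " + e.message);
}
console.log(log.join(","));
```

Both calls leave a matching create/close pair in the log, including the one that throws.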
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward: the following VQL statement simply has to be written:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported Web service is testWS, the access URLs for the information page in the REST (XML output) and HTML versions are:
    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html
For each operation, the input and output parameters are shown. For the REST version, a link to the xsd file describing the schema of the XML document returned by each operation is also shown. To access the XML Schema of the data returned by an operation of the REST version of the Web service, the following URL format should be used:
    httphostportserviceNamerestopNamexsd
Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service is testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:
    httpacme9090testWSrestgetPRODUCTDATAxsd
The format used to invoke a specific operation in the REST version is the following:
    http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen
where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.
Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service is testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML versions of the Web service, respectively:
    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA
If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for the value 1 of this parameter are obtained with:
    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
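A client can assemble these invocation URLs programmatically. The following sketch is one possible way to do it: buildOperationUrl is a hypothetical helper written for this example, not part of ITPilot, and it assumes the usual ‘?’ and ‘&’ query-string separators between the operation name and its parameters.

```javascript
// Hypothetical helper (not part of ITPilot): builds the invocation URL of an
// operation of an exported Web service. "version" is either "rest" or "html".
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName + "/" +
              version + "/" + opName;
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Encode names and values so special characters stay valid in a URL.
            pairs.push(encodeURIComponent(name) + "=" +
                       encodeURIComponent(params[name]));
        }
    }
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}

console.log(buildOperationUrl("acme", 9090, "testWS", "rest",
                              "getPRODUCTDATA", {}));
console.log(buildOperationUrl("acme", 9090, "testWS", "html",
                              "getPRODUCTDATABYPRODID", { prod_id: 1 }));
```

Encoding the parameter values matters as soon as they contain spaces, ampersands or other characters that are reserved in URLs.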
2.3.1 HTML Output Configuration
The HTML version of the published Web Services can be invoked with certain additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:
• shownumresults: if this parameter is given the value true, the table displays information on the number of results obtained by the wrapper.
• intervalsize: if this parameter is indicated, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.
• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.
• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
    httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:
    httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are “true” or “false”.
    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
2. poolInitSize: defines the initial size of the connection pool.
    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.
    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed on that server and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed on the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available on the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them, and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways of connecting to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made, and the associated password.
Note that, even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are returned.
• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql): Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(): Returns the VQL description of all the wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:
HTMLWrapperResultIterator query(Map params)
The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and the value 3.25 is to be assigned when invoking it, the string "3.25" must be passed. In the case of the float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, for date data types, according to the date pattern if one was set.
For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, the schema of a wrapper can be obtained using the method:
HTMLWrapperMetaRegisterRawVO getSchema()
This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In turn, each register is composed of several fields, which again may be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions of the movie that are available (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the various atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether the value is obtained from the source or not, that is, whether it is a searchable field that is not found in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods
void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)
allow setting, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
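Returning to the query(Map params) method described earlier in this section, the following minimal sketch shows one way to prepare the parameter map so that every value is a character string and numeric values follow a given locale. It uses only standard Java classes. The class name QueryParamsSketch, the parameter name MAX_PRICE, and the idea that the internationalization configuration of the wrapper Init component can be matched by a java.util.Locale are assumptions made for illustration; they are not part of the ITPilot API.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class QueryParamsSketch {

    // Formats a number as text using the decimal separator of the given locale.
    public static String formatNumber(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);        // no thousands separators
        format.setMaximumFractionDigits(10);  // do not round to the default 3 digits
        return format.format(value);
    }

    // Builds the name/value map; every value is passed as a String.
    public static Map<String, String> buildParams(Locale wrapperLocale) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("DIRECTOR", "Woody Allen");
        // MAX_PRICE is a hypothetical float-type input parameter.
        params.put("MAX_PRICE", formatNumber(3.25, wrapperLocale));
        return params;
    }

    public static void main(String[] args) {
        System.out.println(buildParams(Locale.US));
        System.out.println(buildParams(Locale.GERMANY));
    }
}
```

The resulting map would then be passed to query(Map params) on the HTMLWrapperProxy instance. Which locale to use depends on how the wrapper was configured.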
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time over the network).
The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding sections:
TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}
where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.
To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

public class ITPilotExample {

    public static void main(String[] args) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params (all values are passed as strings)
            Map<String, String> queryParams = new HashMap<String, String>();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results; hasNext() is checked before every next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only field that is not of type text
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server: " + e.getMessage());
        } finally {
        }
    }
}
Figure 1. Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be developed using naming conventions, without dependencies on Denodo libraries, but using Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in:
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a Jar contains one or more functions with name conflicts, nothing in that Jar is loaded in the server.
• All custom functions stored in the same Jar are added or removed together, by uploading or removing the Jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:
Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
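To illustrate the mapping and the note above, the following minimal sketch declares a function that works only with wrapper classes such as java.lang.Integer, never with primitive parameters. The class name Add_SampleItpFunction and its behavior are hypothetical. The ItpFunction suffix and the execute method name follow the naming conventions described in section 4.1, so the sketch needs no Denodo libraries. Returning null when an input is null is an assumption of this sketch.

```java
public class Add_SampleItpFunction {

    // Parameters and return value use java.lang.Integer (ITPilot type int),
    // not the primitive type int.
    public Integer execute(Integer a, Integer b) {
        if (a == null || b == null) {
            return null; // assumption: a null input yields a null result
        }
        return Integer.valueOf(a.intValue() + b.intValue());
    }

    public static void main(String[] args) {
        System.out.println(new Add_SampleItpFunction().execute(2, 3));
    }
}
```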
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction"
This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the Jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
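The naming conventions described at the start of this section can be sketched as follows. The class reuses the Concat_SampleItpFunction name from the text and declares a single execute method whose last (and only) parameter is an array, giving the function arity n. The method body, plain concatenation that skips null arguments, is an assumption made for illustration. No Denodo library is needed because the sketch relies on the naming conventions only.

```java
public class Concat_SampleItpFunction {

    // Signature with arity n: the last parameter is an array (varargs),
    // so the function accepts any number of text arguments.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null arguments are skipped
                result.append(input);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(new Concat_SampleItpFunction().execute("IT", "Pilot"));
    }
}
```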
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is generally optional (rule 1 below states the exceptions), but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, i.e. their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
If the execute method returns a basic Java type, the additional method has to return the corresponding Java class; e.g. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
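The rules above can be sketched with a function that relies on the naming conventions of section 4.1 and therefore needs no Denodo libraries. The class name Repeat_SampleItpFunction and its behavior are hypothetical. The execute method returns a String, so executeReturnType takes the same parameters and returns java.lang.String.class. Declaring the return type of executeReturnType as Class<?> is an assumption of this sketch.

```java
public class Repeat_SampleItpFunction {

    // Function signature: repeats 'text' the requested number of times.
    public String execute(String text, Integer times) {
        if (text == null || times == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < times.intValue(); i++) {
            result.append(text);
        }
        return result.toString();
    }

    // Same number and types of parameters as execute. Because execute
    // returns a String, this method returns java.lang.String.class.
    public Class<?> executeReturnType(String text, Integer times) {
        return String.class;
    }

    public static void main(String[] args) {
        Repeat_SampleItpFunction function = new Repeat_SampleItpFunction();
        System.out.println(function.execute("ab", 3));
        System.out.println(function.executeReturnType(null, null));
    }
}
```

Computing the return type here does not require running the loop, which is the kind of saving the additional method is meant to provide.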
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits strings around matches of a given regular expression and returns the array of these substrings.
import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor
    public CustomArrayValue split_sample(
            @CustomParam(name = "regexp") String regex,
            @CustomParam(name = "value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
                new ArrayList<CustomRecordValue>(result.length);
        for (String string : result) {
            // Build one single-field record per substring
            LinkedHashMap<String, Object> results =
                    new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue =
                    CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The return type is an array of records with a single text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}
Figure 2. ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own complete wrappers in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the third edition of the ECMA-262 standard [ECMA262].
The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how to interact, within a wrapper, with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3:
function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}
Figure 3. ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).
These functions are linked to the components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.
(1) Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed getOutputSchema. Backwards compatibility is maintained, but use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are explained in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the catalog in section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):
rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5321 Record Structure
bull Object Record_Structure
bull Description This represents a data structure that allows the definition of the structure of a specific record This is often used in the getOutputSchema() function of the wrapper (see 523)
bull Functions
o Constructor(name)
bull name name of the structure
o setText(field regexp type) creation of a new character string field in the record
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 20
bull field name of the new field
bull regexp (optional) regular expression of the character string generation By default if no constraint exists its value is ldquordquo
bull type (optional) defines whether the parameter is mandatory or not By default it is assumed that the field is optional
o setLink(field type) new Link-type field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setInt(field type) creation of a new Integer-type field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setBoolean(field type) creation of a new boolean-type field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setLong(field type) creation of a new Long-type field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setFloat(field type) this creates a new Float-type field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setDouble(field type) creation of a new Double-type field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setBlob(field type) creation of a new BLOB-type (Binary Large Object) field in the record
bull field name of the new field
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 21
o setDate(field regexp format type) creation of a new Date-type field in the record
bull field name of the new field
bull regexp (optional) regular expression of the character string generation By default if no constraint exists its value is ldquordquo
bull format (optional) date format following [DATEFORMAT] By default its value is d-MMM-yyyy Hh mm ss
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setRegister(record type) creation of a new Record-type field in the record
bull record record name
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o setArray(name structure type) creation of a new Array-type field in the record
bull name name of the array
bull structure data structure that represents the record structure contained in the array
bull type (optional) defines whether the parameter is mandatory or not By default the field is optional
o toString() This transforms the record into a string of characters for their representation
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR component (see section 5.3.22) must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
bull Object List
bull Functions
o setListName(listName) name of the list
bull listName name of the list
o add(obj) addition of an element to the list
bull obj element to add
o toArray() transforms the list into a JavaScript object array
5.3.3 Common functions
Some of these functions are common to all or almost all components and are therefore described in this first section. The component catalog notes which components lack some of the "common" functions.
5.3.3.1 onError function
bull onError(errorId errorAction) This informs the component of its behavior in the event of any type of error The onError function can be invoked several times with different errorId parameter values
o errorId This indicates the type of error for which the behavior is to be managed The possible values are
bull RUNTIME_ERROR error while the component is being run
bull CONNECTION_ERROR error that occurs when there is some kind of connection problem with the Web source
bull HTTP_ERROR: error produced by an HTTP error
bull TIMEOUT_ERROR: raised when the Web source takes too long to respond. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser
bull SEQUENCE_ERROR error produced when there is a problem with the sequence (the sequence is not correctly written or some command could not be run etc)
o errorAction action to be taken when the error indicated in the previous parameter arises The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" if there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return "null", but it is evaluated as "false" if they are used in a condition expression
bull ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each component
bull ON_ERROR_RETRY_IGNORE: rerun the wrapper as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error persists after the retries
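The four error actions can be compared with a small sketch. The runStep helper below is hypothetical and is not part of ITPilot; only the constant names come from this guide. For brevity it retries a single step rather than the whole wrapper, and it omits the delay between retries.

```javascript
// Hypothetical helper, not part of ITPilot: it mimics the four documented
// error actions for a single step. Only the constant names come from the guide.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
const ON_ERROR_RETRY = "ON_ERROR_RETRY";
const ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

function runStep(step, errorAction, retries) {
  const retrying =
    errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE;
  // One initial attempt, plus the configured retries when the action retries.
  const attempts = retrying ? retries + 1 : 1;
  let lastError = null;
  for (let i = 0; i < attempts; i++) {
    try {
      return step();
    } catch (err) {
      lastError = err;
    }
  }
  if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
    return null; // error swallowed; the run continues with a null result
  }
  throw lastError; // RAISE, or RETRY with all attempts exhausted
}

// A step that fails twice and then succeeds.
let calls = 0;
function flakyStep() {
  calls += 1;
  if (calls < 3) throw new Error("TIMEOUT_ERROR");
  return "page";
}

const retried = runStep(flakyStep, ON_ERROR_RETRY, 3);
const ignored = runStep(() => { throw new Error("HTTP_ERROR"); }, ON_ERROR_IGNORE, 0);
console.log(retried, calls, ignored);
```

In this sketch the flaky step succeeds on its third call, so the retry action returns its value, while the ignored failure yields null, in line with the general rule described for ON_ERROR_IGNORE.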
5.3.3.2 debugLevel function
bull debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following
o TRACE
o DEBUG
o INFO
o WARN
o ERROR
o FATAL
5.3.4 Add Record To List
bull Object Add_Object_To_List
bull Description adds a record to a list
bull Functions
o Constructor()
o exec(record list) executes the function
bull record record to be added to the list
bull list list to which the record is added
5.3.5 Condition
bull Object Condition
bull Description allows a condition to be defined Two output connections determine the process flow depending on whether the condition is met or not
bull Functions
o Constructor(expr)
bull expr: this parameter defines the condition expression. It is expressed as a string of characters; e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements
bull elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated
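The way the $0, $1, ... placeholders refer to positions in the elements list can be sketched outside ITPilot. The Condition class below is a hypothetical stand-in, not the ITPilot implementation: it accepts only numeric elements and plain comparison operators, and it does not model the functions of Appendix A of [GENER].

```javascript
// Hypothetical stand-in for the Condition component. It supports only
// numeric elements and plain comparison operators; the ITPilot functions
// of Appendix A of [GENER] are not modelled.
class Condition {
  constructor(expr) {
    this.expr = expr;
  }
  exec(elements) {
    // Replace each $n placeholder with the n-th element of the input list.
    const substituted = this.expr.replace(/\$(\d+)/g, (match, index) => {
      const value = elements[Number(index)];
      if (typeof value !== "number" || !Number.isFinite(value)) {
        throw new Error("Unsupported element for placeholder " + match);
      }
      return String(value);
    });
    // Only numbers, whitespace, parentheses and comparison characters may remain.
    if (!/^[\d\s.()<>=!eE+-]*$/.test(substituted)) {
      throw new Error("Unsupported expression: " + this.expr);
    }
    return Boolean(Function('"use strict"; return (' + substituted + ");")());
  }
}

const myCondition = new Condition("($0 <= $1)");
const met = myCondition.exec([3, 7]);
const notMet = myCondition.exec([9, 7]);
console.log(met, notMet);
```

The first call compares 3 with 7 and the second compares 9 with 7, so the same condition object can be reused with different element lists.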
5.3.6 Create List
bull Object Create_List
bull Description creates an empty list
bull Functions
o Constructor(listname) creates an empty list
bull listname name of the list of records to be created
o exec() runs the component
5.3.7 Create Persistent Browser
bull Object Create_Persistent_Browser
bull Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it
bull Functions
o Constructor() creates a persistent browser and returns its handler
o exec() executes the component
5.3.8 Diff
bull Object Diff
bull Description: the Diff component compares two pages and returns the differences between their retrieved HTML code
bull Functions
o Constructor(additionPrefixLabel additionSuffixLabel deletionPrefixLabel deletionSuffixLabel tokenSeparator)
bull additionPrefixLabel prefix to use when generating the result page for the new content (by default green background HTML tag)
bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)
bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)
bull deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default red background HTML end tag)
bull tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified
o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different
bull baseCode character string with the source page content
bull finalCode character string or page object with the target page content
o exec (baseCode finalCode) executes the Diff component returning a character string that represents the HTML content of those pages pointing out the differences between them
bull baseCode character string with the source page content
bull finalCode character string or page object with the target page content
o setAdditionPrefixLabel(additionPrefixLabel): sets the start label for added content
bull additionPrefixLabel: prefix to use when generating the result page for new content (by default green background HTML tag)
o setAdditionSuffixLabel(additionSuffixLabel): sets the end label for added content
bull additionSuffixLabel: suffix to use when generating the result page for the new content (by default green background HTML end tag)
o setDeletionPrefixLabel(deletionPrefixLabel): sets the start label for deleted content
bull deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default red background HTML tag)
o setDeletionSuffixLabel(deletionSuffixLabel): sets the end label for deleted content
bull deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default red background HTML end tag)
o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself
bull nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned
o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are taken into account when comparing both pages
bull simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored
o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages
bull toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is
o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page
bull mergedDeletions: "true" means that the deleted content is shown. If the value is "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account
o addTokenReplacement(replacement) allows the addition of a regular expression to a list These regular expressions can be applied on HTML tokens of the source pages before comparing them
bull replacement Perl [PERL] regular expression
o addIgnoredToken(regexp) allows the addition of a regular expression to the list These regular expressions can be applied on HTML tokens of the page Those that match the regular expression will be discarded before starting the comparison
bull regexp Perl [PERL] regular expression
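The role of the prefix and suffix labels and of the token separator can be illustrated with a token-level comparison. The diffTokens function below is a hypothetical sketch based on a longest-common-subsequence walk; it is not the algorithm used by the ITPilot Diff component, and the <ins> and <del> label values are assumptions chosen only for readability.

```javascript
// Hypothetical token-level diff, not the ITPilot Diff implementation.
// The label values are illustrative; the guide only says the defaults are
// green and red background HTML tags.
function diffTokens(baseCode, finalCode, labels, tokenSeparator) {
  const a = baseCode.split(tokenSeparator);
  const b = finalCode.split(tokenSeparator);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, labelling tokens that are not in the common part.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
      j++;
    } else {
      out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
      i++;
    }
  }
  return out.join(tokenSeparator);
}

const labels = {
  additionPrefix: "<ins>", additionSuffix: "</ins>",
  deletionPrefix: "<del>", deletionSuffix: "</del>",
};
const result = diffTokens("price 10 EUR", "price 12 EUR", labels, " ");
console.log(result);
```

Tokens shared by both inputs are emitted unchanged; a token present only in the target is wrapped in the addition labels and a token present only in the source is wrapped in the deletion labels.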
5.3.9 ExecuteJS
bull Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL])
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
bull Object Expression
bull Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value
bull Functions
o Constructor(expression)
bull expression: object that defines the expression. This object is expressed as a string of characters; e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor
bull exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression
5.3.11 Extractor
bull Object Extractor
bull Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL])
bull Functions
o Constructor(name page specification structure)
bull name name of the Extractor component instance
bull page page-type ITPilot structure from where data is to be extracted
bull specification DEXTL data extraction specification (see [DEXTL])
bull structure name of the record (previously created) that will be used to return the data extracted by the specification
o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor
o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information)
bull merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default
o setI18n(i18n): sets the internationalization configuration used by the process
bull i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot
5.3.12 Fetch
bull Object Fetch
bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format
bull Functions
o Constructor(url sequenceType reusableConnection binary page)
bull url URL where the resource to be downloaded can be found (OPTIONAL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull binary: "true" if the object to be downloaded is binary; "false" if it is in text format
bull page: optionally, the page from which the HTTP request is launched
o exec(page): runs the component, returning the string- or binary-type value obtained
bull page: optionally, the page from which the HTTP request is launched
o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send
bull encoding: MIME type of the information to send
o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method
bull flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command
o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the originating page, against which further extraction operations are going to be executed
bull back: NSEQL program of the back sequence
o setReusingConnection(reusingConnection): indicates whether connections will be reused or not
bull reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session
o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been defined
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
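The order in which syncWithPost, setBackSequence and setBackPages take effect can be summarized as a decision function. The chooseSynchronization helper below is hypothetical and is not ITPilot code; the settings object, the returned field names and the fallback of one page for Back() are assumptions made for this sketch.

```javascript
// Hypothetical decision helper, not ITPilot code: given the settings made with
// syncWithPost, setBackSequence and setBackPages, it reports how the page
// state would be recovered according to the description above.
function chooseSynchronization(settings) {
  if (settings.syncWithPost) {
    return { method: "POST", detail: "resend the original POST parameters to the page URL" };
  }
  if (settings.backSequence) {
    return { method: "BACK_SEQUENCE", detail: settings.backSequence };
  }
  // No POST synchronization and no back sequence: fall back to Back().
  // Defaulting to one page is an assumption of this sketch.
  const pages = settings.backPages === undefined ? 1 : settings.backPages;
  return { method: "BACK_COMMAND", detail: "Back() x " + pages };
}

const viaPost = chooseSynchronization({ syncWithPost: true });
const viaSequence = chooseSynchronization({
  syncWithPost: false,
  backSequence: "<NSEQL back program>", // placeholder, not real NSEQL
});
const viaBack = chooseSynchronization({ syncWithPost: false, backPages: 2 });
console.log(viaPost.method, viaSequence.method, viaBack.detail);
```

POST synchronization takes precedence, an explicit back sequence comes second, and the Back() command with the configured number of pages is the last resort.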
5.3.13 Filter
bull Object Filter
bull Description this carries out a filtering operation from a list of records returning those meeting a given condition
bull Functions
o Constructor(expr, auxiliaryRecords)
bull expr: expression of the filtering operation, applied to the list of records passed to the exec function
bull auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter
o exec(inputRecords, auxiliaryRecords): function receiving a list of records and returning the subset that satisfies the selection expression indicated in the constructor
bull inputRecords: list of input records
bull auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter
NOTE: if the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error
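The behavior described in the note can be sketched as follows. The filterIgnoringErrors function is a hypothetical stand-in, not the ITPilot Filter component: a plain JavaScript predicate takes the place of the ITPilot filter expression, and the record fields TITLE and PRICE are invented for the example.

```javascript
// Hypothetical stand-in for FILTER with the error action set to
// ON_ERROR_IGNORE: a record whose evaluation throws is dropped and the
// remaining records are still filtered. The predicate replaces the ITPilot
// filter expression, which is not modelled here.
function filterIgnoringErrors(inputRecords, predicate) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) kept.push(record);
    } catch (err) {
      // ON_ERROR_IGNORE: skip only the record that caused the error.
    }
  }
  return kept;
}

const records = [
  { TITLE: "A", PRICE: 5 },
  { TITLE: "B", PRICE: null }, // raises an error inside the predicate
  { TITLE: "C", PRICE: 20 },
  { TITLE: "D", PRICE: 8 },
];
const cheap = filterIgnoringErrors(records, (r) => {
  if (r.PRICE === null) throw new Error("RUNTIME_ERROR: missing PRICE");
  return r.PRICE < 10;
});
console.log(cheap.map((r) => r.TITLE).join(","));
```

Record B raises an error and is left out, record C fails the condition, and the remaining records are returned as usual.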
5.3.14 Form Iterator
bull Object Form_Iterator
bull Description: generates an execution loop over a specific form, in which predetermined values are used for each of its fields in each iteration
bull Functions
o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)
bull findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)
bull submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component
bull inputPage: input page from which the selected form can be iteratively invoked
bull parallelIterator: if "true", the component executes its iterations in parallel
o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form
bull field name of the multiple selection field
bull position position related to the field between those of the same name starting with position 0
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form
bull field name of the multiple selection field
bull position position related to the field between those of the same name starting with position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form
bull field name of the HTML selection field
bull position position occupied in the event of more than one field element with the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field in the event of several on the form with the same value
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equals: boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may merely contain them ("false")
o click(field, value, state): selects an element and runs a "click" event on it
bull field: name of the HTML field on which the click is to be made
bull value: when this function is run on radio buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element
bull state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field, position, values): indicates the values entered in an input field
bull field: name of the HTML input field
bull position: position of the field when several fields on the form have the same name
bull values: list of values to enter in the field
o textarea(field, position, values): indicates the values entered in a text area
bull field: name of the HTML text area field
bull position: position of the field when several fields on the form have the same name
bull values: list of values to enter in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count): sets the number of retries in the event of failures
bull count: number of retries
o setRetryDelay(mseconds): sets the waiting time between retries
bull mseconds: waiting time between retries, in milliseconds
o setParallelIterator(flag): determines whether the component launches its iterations in parallel
bull flag: if "true", the iterations are executed in parallel
o next(inputPage): returns the page resulting from running a component iteration
bull inputPage: optional parameter that indicates a new starting page on which the next component iteration is run
o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not
o close(): closes the iterator
o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back: NSEQL back program
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither a back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
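The way field values multiply into iterations, together with the hasNext, next and setMaxIterations pattern, can be sketched with a small stand-in. The FormCombinations class below is hypothetical and is not the ITPilot Form_Iterator: it does not browse any page, it ignores the position and value arguments, and the field names and the order in which combinations are produced are assumptions of this sketch.

```javascript
// Hypothetical stand-in, not the ITPilot Form_Iterator: it only enumerates
// the value combinations implied by input(...) and click(...) settings.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

class FormCombinations {
  constructor() {
    this.fields = []; // [{ field, options: [...] }]
    this.maxIterations = Infinity;
    this.cursor = 0;
  }
  input(field, position, values) {
    // position is accepted for symmetry with the guide but ignored here.
    this.fields.push({ field, options: values.slice() });
  }
  click(field, value, state) {
    // CLICKED_AND_NON_CLICKED_ELEMENT yields two combinations for the field.
    const options = state === CLICKED_AND_NON_CLICKED_ELEMENT
      ? [true, false]
      : [state === CLICKED_ELEMENT];
    this.fields.push({ field, options });
  }
  setMaxIterations(count) {
    this.maxIterations = count;
  }
  total() {
    const all = this.fields.reduce((n, f) => n * f.options.length, 1);
    return Math.min(all, this.maxIterations);
  }
  hasNext() {
    return this.cursor < this.total();
  }
  next() {
    if (!this.hasNext()) throw new Error("No more iterations");
    // Decode the cursor as a mixed-radix number, last field varying fastest.
    let rest = this.cursor++;
    const combination = {};
    for (let k = this.fields.length - 1; k >= 0; k--) {
      const { field, options } = this.fields[k];
      combination[field] = options[rest % options.length];
      rest = Math.floor(rest / options.length);
    }
    return combination;
  }
}

const it = new FormCombinations();
it.input("author", 0, ["Cervantes", "Borges"]);
it.click("inStock", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);
it.setMaxIterations(3);
const seen = [];
while (it.hasNext()) seen.push(it.next());
console.log(seen.length);
```

Two input values and one checkbox iterated in both states give four combinations; the limit set with setMaxIterations stops the loop after three of them.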
5.3.15 Get Page
bull Object Get_Page
bull Description: obtains an active browser from the browser pool, using a previously retrieved identification code
bull Functions
o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier
bull browserUuid: browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing by means of a Sequence object (see section 5.3.27)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL: last URL from which the page was reached
bull lastURLMethod: access method (GET, POST) of the URL from which the page was reached
bull lastURLPostParameters: POST-method parameters of the URL from which the page was reached
bull cookie: stored cookie information
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object Init
bull Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application
bull Functions
o Constructor(input output)
bull input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context
bull output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function)
o get(name): returns the value of a field of the record created to hold the initialization parameters
bull name name of the record field
o setText(field obl fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field obl fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field obl fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field obl fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory once specified; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: internationalization configuration to be used. ITPilot provides several i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
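The hasNext()/next() contract of the Iterator component can be tried out outside ITPilot. The following is a minimal sketch for a general JavaScript runtime such as Node.js, not ITPilot code: IteratorSketch is a hypothetical stand-in for the Iterator component, the TITLE field is an invented example, and the loop is the sequential counterpart of the structure in Figure 5.

```javascript
// Hypothetical stand-in for the ITPilot Iterator component (assumption:
// it only models the documented hasNext()/next() behaviour).
function IteratorSketch(list) {
  this.list = list;    // list of records to iterate over
  this.position = 0;   // index of the next record to return
}

// Returns true while at least one more record remains.
IteratorSketch.prototype.hasNext = function () {
  return this.position < this.list.length;
};

// Returns the next record, following the order of the list.
IteratorSketch.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error("no more records");
  }
  return this.list[this.position++];
};

// Sequential processing of every record; TITLE is an invented field name.
function collectTitles(records) {
  var iterator = new IteratorSketch(records);
  var titles = [];
  while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
  }
  return titles;
}

var titles = collectTitles([{ TITLE: "first" }, { TITLE: "second" }]);
console.log(titles.join(","));
```

In the parallel variant shown in Figure 5, the body of the loop is handed to a Thread object instead of being run inline.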
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object that controls when the loop ends. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iterating through different interrelated pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
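The order in which the way back to the source page is chosen (syncWithPost, then a back sequence set with setBackSequence, then the NSEQL Back() command) can be summarized as a small decision function. This is only an illustrative sketch for a general JavaScript runtime: chooseReturnStrategy, its config object, and the returned labels are hypothetical names, not part of the ITPilot API.

```javascript
// Hypothetical helper mirroring the documented fallback order.
// The config fields and the returned labels are illustrative only.
function chooseReturnStrategy(config) {
  if (config.syncWithPost) {
    return "POST";           // re-issue the POST the page was reached with
  }
  if (config.backSequence) {
    return "BACK_SEQUENCE";  // explicit sequence given to setBackSequence
  }
  return "NSEQL_BACK";       // otherwise the NSEQL Back() command is run
}

var strategies = [
  chooseReturnStrategy({ syncWithPost: true, backSequence: null }),
  chooseReturnStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" }),
  chooseReturnStrategy({ syncWithPost: false, backSequence: null })
];
console.log(strategies.join(","));
```

When the last branch applies, setBackPages determines how many pages are browsed back.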
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a homepage to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object that controls when the loop ends. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
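The practical difference between the while form used by Loop (section 5.3.19) and the do… while form used by Repeat is when the condition is first checked. The sketch below illustrates this in a general JavaScript runtime; makeCondition is a hypothetical stand-in that only imitates the exec() call of a Condition object, and the counter logic is invented for the example.

```javascript
// Hypothetical stand-in for a Condition object: exec() evaluates a predicate.
function makeCondition(predicate) {
  return { exec: function () { return predicate(); } };
}

// while form (as in Loop): the condition is checked before every pass.
function runLoop(condition, body) {
  var passes = 0;
  while (condition.exec()) {
    body();
    passes++;
  }
  return passes;
}

// do...while form (as in Repeat): the body runs once before the first check.
function runRepeat(condition, body) {
  var passes = 0;
  do {
    body();
    passes++;
  } while (condition.exec());
  return passes;
}

var remaining = 0;  // invented state: nothing left to process
var hasWork = makeCondition(function () { return remaining > 0; });
function consume() { remaining--; }

var loopPasses = runLoop(hasWork, consume);      // condition is false at the start
remaining = 0;
var repeatPasses = runRepeat(hasWork, consume);  // body still runs once
console.log(loopPasses, repeatPasses);           // prints: 0 1
```

With a condition that is false from the start, the while form runs no passes and the do… while form runs exactly one.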
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
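The queueing behaviour described for setMaxConcurrentThreads (a bounded number of parallel executions, with later requests waiting their turn) can be sketched in a general JavaScript runtime with Promises. This is an illustration of the idea only: BoundedRunner, its methods, and the delayed helper are hypothetical and do not reproduce how the ITPilot Thread object is implemented or called.

```javascript
// Hypothetical sketch, not ITPilot's Thread: runs at most maxConcurrent
// asynchronous tasks at once and queues the rest.
function BoundedRunner(maxConcurrent) {
  this.maxConcurrent = maxConcurrent;
  this.running = 0;   // tasks currently in progress
  this.peak = 0;      // highest number of tasks seen running together
  this.queue = [];    // submitted tasks waiting for a free slot
  this.pending = [];  // one promise per submitted task
}

// Submits a task (a function returning a Promise). It starts immediately
// if a slot is free; otherwise it waits in the queue.
BoundedRunner.prototype.execute = function (task) {
  var self = this;
  var done = new Promise(function (resolve, reject) {
    self.queue.push({ task: task, resolve: resolve, reject: reject });
  });
  this.pending.push(done);
  this.drain();
  return done;
};

// Starts queued tasks while free slots remain.
BoundedRunner.prototype.drain = function () {
  var self = this;
  while (this.running < this.maxConcurrent && this.queue.length > 0) {
    var entry = this.queue.shift();
    this.running++;
    this.peak = Math.max(this.peak, this.running);
    entry.task().then(entry.resolve, entry.reject).then(function () {
      self.running--;  // a slot has been freed
      self.drain();    // start the next queued task, if any
    });
  }
};

// Resolves once every submitted task has finished (similar in spirit to wait()).
BoundedRunner.prototype.wait = function () {
  return Promise.all(this.pending);
};

// Invented example task: resolves with the given value after 10 ms.
function delayed(value) {
  return function () {
    return new Promise(function (resolve) {
      setTimeout(function () { resolve(value); }, 10);
    });
  };
}

var runner = new BoundedRunner(2);
for (var i = 0; i < 5; i++) {
  runner.execute(delayed(i));
}
var finished = runner.wait().then(function (results) {
  console.log(results.join(","), "peak:", runner.peak);
  return results;
});
```

Five tasks are submitted, but no more than two run at any moment; the remaining three start as earlier ones finish.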
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
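As an illustration of the naming convention above, the following sketch outlines a custom component called "trimtitle". It is a sketch under stated assumptions rather than a working ITPilot component: the component name is invented, the input is assumed to be a plain string, the type codes are declared locally (copied from the list above) so that the code runs on its own, and needsOutputStructure is a hypothetical helper. The input structure function is omitted because its body depends on the data structure API, which is not covered here.

```javascript
// Type codes copied from the list in section 5.4.1; declared locally so
// that this sketch runs outside ITPilot.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of the hypothetical "trimtitle" component.
// Assumption: the input arrives as a plain JavaScript string.
function trimtitle_main(trimtitle_input) {
  var trimtitle_output = null;
  if (typeof trimtitle_input === "string") {
    trimtitle_output = trimtitle_input.trim();
  }
  return trimtitle_output;
}

// Declares the output type of the component.
function trimtitle_getOutputType() {
  return STRING_TYPE;
}

// Hypothetical helper: an output structure function is needed only for
// RECORD_TYPE and LIST_TYPE outputs.
function needsOutputStructure(typeCode) {
  return typeCode === RECORD_TYPE || typeCode === LIST_TYPE;
}

console.log(trimtitle_main("  Denodo  "));
console.log(needsOutputStructure(trimtitle_getOutputType()));
```

Because the declared output type is STRING_TYPE, this component would not need a trimtitle_getOutputStructure function.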
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives as input.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement simply has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
• intervalsize: if this parameter is indicated, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.
• maxresults: indicates the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.
• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table is adapted to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.
• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.
• width: specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.
• height: specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.
These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:
httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen
For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a maximum pagination interval size of 10. Once again, it is presumed that the Web service container runs on port 9090 of the acme machine:
httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10
2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
When the Web service operations have been exported, some parameters can be used to configure the connection pool that the Web services use to connect to the ITPilot server. The web.xml file, which can be found in the WEB-INF path of the exported Web service (either inside the war file generated by ITPilot or in the directory where the Web service has been deployed), has three parameters for configuring the connection pool:
1. poolEnabled: enables or disables the connection pool. The possible values are "true" or "false".
    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
2. poolInitSize: defines the initial size of the connection pool.
    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.
    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server, and querying it. It also allows a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them, and query processing. An exhaustive description of the API at programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, any database created in the Virtual DataPort server can be specified in the connection, and any user defined in it can be used. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions, and users of Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made, and the associated password.
Even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, and the predefined user 'admin' (with the initial password 'admin') is used to gain access.
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data about the execution server and for accessing the wrappers present in it:
• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are returned.
• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql). Takes as input the VQL that defines a collection of wrappers, which are loaded into the execution server.
• String getVQL(). Returns the VQL description of all the wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method
HTMLWrapperResultIterator query(Map params)
The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string “325”. For the float, double and date data types, make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, according to the date pattern if one was set.
For the query to execute correctly, a value must be specified for every mandatory attribute. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, the schema of a wrapper can be obtained using the method
HTMLWrapperMetaRegisterRawVO getSchema()
This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result is composed of the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director’s cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether the value is obtained from the source or not, that is, whether it is a searchable field that cannot be found in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods
void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)
make it possible to set, via the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
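The rule that every query value travels as a character string, formatted to match the wrapper's internationalization settings, can be illustrated with a minimal sketch. It uses only the standard Java library and builds the kind of map that query(Map params) expects. It assumes that the wrapper's internationalization configuration corresponds to a java.util.Locale plus a date pattern; the parameters MAX_PRICE and RELEASED_AFTER are hypothetical and are not part of the Movies example in this guide.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class QueryParamsSketch {

    /** Formats a number as text using the conventions of the given locale. */
    static String formatNumber(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false); // no thousands separators
        return format.format(value);
    }

    /** Formats a date as text using the given pattern. */
    static String formatDate(Date value, String pattern, Locale locale) {
        return new SimpleDateFormat(pattern, locale).format(value);
    }

    /** Builds the parameter map; every value is a String, whatever the parameter type. */
    static Map<String, String> buildParams(Locale locale, String datePattern) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("DIRECTOR", "Woody Allen");
        // MAX_PRICE and RELEASED_AFTER are hypothetical parameters used for illustration.
        params.put("MAX_PRICE", formatNumber(19.5, locale));
        Date date = new GregorianCalendar(2011, Calendar.AUGUST, 16).getTime();
        params.put("RELEASED_AFTER", formatDate(date, datePattern, locale));
        return params;
    }

    public static void main(String[] args) {
        // The same values rendered for two different locale configurations.
        System.out.println(buildParams(Locale.US, "yyyy-MM-dd"));
        System.out.println(buildParams(Locale.GERMANY, "dd.MM.yyyy"));
    }
}
```

The resulting map could then be passed to the query method of a wrapper. Which locale and pattern to use depends on how the wrapper's Init component was configured.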
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time, through the network).
The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element, to make sure that data is available.
The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method
com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)
where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.
• String getErrorDescription(). If an error has occurred, returns a textual description of it; otherwise it returns null. The custom error messages specified by the wrapper creator for the ‘raise error’ handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
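The consumption pattern described in this section (guard every next() with hasNext(), then check for errors once the iteration ends) can be sketched as follows. The interface ResultSource is a hypothetical stand-in that declares only the methods named above, so that the example runs without the Denodo libraries; it is not the real HTMLWrapperResultIterator. The sketch also assumes that hasNext() waits until a result is available or the query has finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the iterator methods discussed above; not the Denodo class. */
interface ResultSource<T> extends Iterator<T> {
    boolean checkErrors();
    String getErrorDescription();
}

class ResultDrainSketch {

    /** Reads every result, guarding each next() with hasNext(), then reports any error. */
    static <T> List<T> drain(ResultSource<T> results) {
        List<T> rows = new ArrayList<T>();
        while (results.hasNext()) {
            rows.add(results.next());
        }
        // Errors are checked after the iteration ends, as in the guide's own example.
        if (results.checkErrors()) {
            throw new IllegalStateException("Query failed: " + results.getErrorDescription());
        }
        return rows;
    }

    /** In-memory fake used only to exercise drain(); a null error means success. */
    static <T> ResultSource<T> fake(List<T> rows, final String error) {
        final Iterator<T> it = rows.iterator();
        return new ResultSource<T>() {
            public boolean hasNext() { return it.hasNext(); }
            public T next() { return it.next(); }
            public boolean checkErrors() { return error != null; }
            public String getErrorDescription() { return error; }
        };
    }

    public static void main(String[] args) {
        System.out.println(drain(fake(Arrays.asList("Annie Hall", "Manhattan"), null)));
    }
}
```

Throwing an exception discards the rows already read. An application that wants to keep partial results could return them together with the error description instead.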
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
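One situation where canceling is useful is when the application needs only the first few results. The sketch below reads up to a given number of results and then cancels the query. CancellableResults and FakeResults are hypothetical stand-ins that let the example run without the Denodo libraries; only the method names hasNext(), next() and cancel() are taken from this guide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the iterator methods used here; not the Denodo class. */
interface CancellableResults<T> extends Iterator<T> {
    void cancel();
}

/** In-memory fake that records whether cancel() was called. */
class FakeResults<T> implements CancellableResults<T> {
    private final Iterator<T> rows;
    boolean cancelled = false;

    FakeResults(List<T> rows) { this.rows = rows.iterator(); }

    public boolean hasNext() { return !cancelled && rows.hasNext(); }
    public T next() { return rows.next(); }
    public void cancel() { cancelled = true; }
}

class TakeFirstSketch {

    /** Reads at most 'limit' results; cancels the query once the limit is reached. */
    static <T> List<T> takeFirst(CancellableResults<T> results, int limit) {
        List<T> rows = new ArrayList<T>();
        while (rows.size() < limit && results.hasNext()) {
            rows.add(results.next());
        }
        if (rows.size() >= limit) {
            // More results may still be pending, so tell the server to stop working.
            results.cancel();
        }
        return rows;
    }

    public static void main(String[] args) {
        FakeResults<String> results = new FakeResults<String>(Arrays.asList("a", "b", "c"));
        System.out.println(takeFirst(results, 2) + " cancelled=" + results.cancelled);
    }
}
```

The sketch cancels without calling hasNext() again, so it never waits for a result that it is not going to use.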
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the ‘acme’ machine, on port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the same one used as an example in the preceding section:
TITLE
DIRECTOR
EDITIONS
    FORMAT
    PRICE
    DESCRIPTION
where TITLE and DIRECTOR are optional search fields.
Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and written to the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
// DoubleVO is used below but was not imported in the original listing;
// it is assumed here to live in the same package as SimpleVO.
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + " ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    // A field can be read from the map of values or with getValue(name)
                    Map edition = editionVO.getValues();
                    SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        }
    }
}
Figure 1. Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file that is added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be developed using either Java annotations or naming conventions. Although naming conventions make it possible to create custom functions without dependencies on Denodo libraries, the use of Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types, or both). For each signature, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:
Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.); the object types listed in the table must be used instead.
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + “ItpFunction”
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the parameters of the Java method. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array; for example, the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam. Parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
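Going back to the naming conventions described in this section, the following minimal sketch shows a function class that follows them and needs no Denodo library. The helper class NamingConventionSketch is hypothetical: it only illustrates, using standard Java reflection, how a function name and a signature description could be derived from such a class. It is not how ITPilot itself is implemented, and the rendered notation is merely modelled on the signature shown above.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Follows the convention <FunctionName> + "ItpFunction"; needs no Denodo library. */
class Concat_SampleItpFunction {
    public String execute(String valueA, String valueB) {
        return valueA + valueB;
    }
}

/** Hypothetical helper, for illustration only. */
class NamingConventionSketch {
    static final String SUFFIX = "ItpFunction";

    /** Java-to-ITPilot type names, copied from the equivalency table in section 4. */
    static final Map<Class<?>, String> ITP_TYPES = new HashMap<Class<?>, String>();
    static {
        ITP_TYPES.put(Integer.class, "int");
        ITP_TYPES.put(Long.class, "long");
        ITP_TYPES.put(Float.class, "float");
        ITP_TYPES.put(Double.class, "double");
        ITP_TYPES.put(Boolean.class, "boolean");
        ITP_TYPES.put(String.class, "text");
        ITP_TYPES.put(Calendar.class, "date");
        ITP_TYPES.put(byte[].class, "binary");
    }

    /** Returns the function name encoded in the class name, or null if it does not match. */
    static String functionName(Class<?> type) {
        String name = type.getSimpleName();
        if (!name.endsWith(SUFFIX) || name.length() == SUFFIX.length()) {
            return null;
        }
        return name.substring(0, name.length() - SUFFIX.length());
    }

    /** Renders every public execute method as NAME(arg1:type, arg2:type, ...). */
    static List<String> signatures(Class<?> type) {
        List<String> result = new ArrayList<String>();
        for (Method method : type.getMethods()) {
            if (!method.getName().equals("execute")) {
                continue;
            }
            StringBuilder sb = new StringBuilder(functionName(type).toUpperCase(Locale.ROOT));
            sb.append('(');
            Class<?>[] params = method.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append("arg").append(i + 1).append(':').append(ITP_TYPES.get(params[i]));
            }
            result.add(sb.append(')').toString());
        }
        Collections.sort(result); // reflection does not guarantee method order
        return result;
    }

    public static void main(String[] args) {
        System.out.println(functionName(Concat_SampleItpFunction.class));
        System.out.println(signatures(Concat_SampleItpFunction.class));
        System.out.println(new Concat_SampleItpFunction().execute("foo", "bar"));
    }
}
```

Because the function class is a plain Java object, it can be unit tested on its own before the jar is uploaded to the server.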
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
The value returned by the additional method depends on what the execute method returns:
• If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return values will have in ITPilot.
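For functions that return a basic Java type, the rules above can be checked mechanically before the jar is deployed. The following sketch does so with standard reflection. Repeat_SampleItpFunction is a hypothetical function written with the naming conventions, and ReturnTypeCheckSketch is a hypothetical test helper; neither is part of the Denodo libraries. The check does not cover compound return types, where the additional method returns a CustomRecordType or CustomArrayType instead of a Java class.

```java
import java.lang.reflect.Method;

/** Hypothetical function written with the naming conventions. */
class Repeat_SampleItpFunction {

    public String execute(String value, Integer times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(value);
        }
        return sb.toString();
    }

    /** Same parameters as execute; returns the Java class of the result. */
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }
}

/** Hypothetical test helper, for illustration only. */
class ReturnTypeCheckSketch {

    /**
     * Looks up execute and executeReturnType with the same parameter types
     * (rules 2 and 3), calls both with the given non-null arguments, and checks
     * that the result of execute is an instance of the declared class.
     */
    static boolean returnTypeAgrees(Object function, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        try {
            Method execute = function.getClass().getMethod("execute", types);
            Method returnType = function.getClass().getMethod("executeReturnType", types);
            Object declared = returnType.invoke(function, args);
            Object result = execute.invoke(function, args);
            return declared instanceof Class && ((Class<?>) declared).isInstance(result);
        } catch (ReflectiveOperationException e) {
            // A missing method here means the two signatures do not match.
            throw new IllegalStateException("execute and executeReturnType do not match", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(returnTypeAgrees(new Repeat_SampleItpFunction(), "ab", 3));
    }
}
```

Looking both methods up with the same array of parameter types is what enforces the requirement that the two signatures agree.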
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
            new ArrayList<CustomRecordValue>(result.length);
        for (String string : result) {
            // One map per record, so that records do not share state
            LinkedHashMap<String, Object> results =
                new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue =
                CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}
Figure 2. ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].
The following sections assume some basic prior knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}
Figure 3. ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script: one mandatory and two optional.
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).
These functions correspond to the definition of the process in terms of components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.
(1) Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), is the one responsible for creating a new parameter initialization object. This object is described further on, in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the business logic of the wrapper is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function determines the output structure of the wrapper, if there is one. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.
NOTE: Some of the parameters used in the functions described can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another argument to its right has to be given. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used instead: rs.setText("FIELD", "", OBLIGATORY).
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is “”.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is “”.
    • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): informs the component of how to behave when a given type of error occurs. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error caused when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return “null” when an error occurs, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured through their own parameters.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error still occurs after the retries.
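The four error actions can be pictured with a small stand-in. The sketch below is not ITPilot code: MockComponent and failTimes are hypothetical helpers, the constants are given string values only for illustration, and the sketch retries the component's own task rather than the whole wrapper. It also assumes that an error type with no registered action is raised.

```javascript
// Stand-in constants: the guide names these identifiers but not their values.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var TIMEOUT_ERROR = "TIMEOUT_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical component (not an ITPilot class): runs `task` under the registered policy.
function MockComponent(task) {
  this.task = task;
  this.actions = {};  // errorId -> errorAction
  this.retries = 0;
}
MockComponent.prototype.onError = function (errorId, errorAction) {
  this.actions[errorId] = errorAction;  // one policy per error type
};
MockComponent.prototype.setRetries = function (count) {
  this.retries = count;
};
MockComponent.prototype.exec = function () {
  var attempt = 0;
  while (true) {
    try {
      return this.task(attempt);
    } catch (e) {
      // Assumption: an error type with no registered action is raised.
      var action = this.actions[e.errorId] || ON_ERROR_RAISE;
      var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
      if (retrying && attempt < this.retries) {
        attempt++;
        continue;
      }
      if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null;  // ignored errors yield null
      }
      throw e;
    }
  }
};

// Hypothetical task that fails `failures` times with the given error type.
function failTimes(errorId, failures) {
  return function (attempt) {
    if (attempt < failures) {
      var e = new Error(errorId);
      e.errorId = errorId;
      throw e;
    }
    return "page";
  };
}

var fetchLike = new MockComponent(failTimes(TIMEOUT_ERROR, 2));
fetchLike.onError(TIMEOUT_ERROR, ON_ERROR_RETRY);
fetchLike.onError(RUNTIME_ERROR, ON_ERROR_IGNORE);
fetchLike.setRetries(3);
var result = fetchLike.exec();  // succeeds on the third attempt
```

Registering one action per error type, as onError allows, lets a component retry on timeouts while ignoring or raising other errors.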
5.3.3.2 debugLevel function
• debugLevel(level): indicates the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
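The positional placeholders $0, $1, … refer to the elements passed to exec, in order. A minimal sketch of that binding follows. MockCondition is a hypothetical stand-in, and it assumes the expression can be evaluated as plain JavaScript, which is not how ITPilot defines its expression language (see Appendix A of [GENER]).

```javascript
// Hypothetical stand-in for the Condition component, not the ITPilot implementation.
function MockCondition(expr) {
  // Rewrite each positional placeholder $n as an index into the elements array.
  var body = expr.replace(/\$(\d+)/g, "elements[$1]");
  // Assumption: the rewritten expression is valid JavaScript.
  this.test = new Function("elements", "return Boolean(" + body + ");");
}
MockCondition.prototype.exec = function (elements) {
  return this.test(elements);
};

var myCondition = new MockCondition("($0 <= $1)");
var firstCheck = myCondition.exec([3, 7]);   // 3 <= 7 holds
var secondCheck = myCondition.exec([9, 7]);  // 9 <= 7 does not hold
```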
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green-background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red-background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false”, they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization will be taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
    • mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
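How the prefix and suffix labels end up in the result page can be sketched with a token-level comparison. The code below is only an illustration under stated assumptions: MockDiff and lcsTable are hypothetical names, the comparison uses a textbook longest-common-subsequence table (the guide does not say which algorithm ITPilot uses), the separator is used both to split and to join tokens, and plain text markers replace the default HTML tags, whose exact form the guide does not give.

```javascript
// Hypothetical stand-in for Diff; parameters mirror the documented constructor.
function MockDiff(addPrefix, addSuffix, delPrefix, delSuffix, tokenSeparator) {
  this.addPrefix = addPrefix;
  this.addSuffix = addSuffix;
  this.delPrefix = delPrefix;
  this.delSuffix = delSuffix;
  this.sep = tokenSeparator;
}

// t[i][j] = length of the longest common subsequence of a[i..] and b[j..].
function lcsTable(a, b) {
  var t = [], i, j;
  for (i = 0; i <= a.length; i++) {
    t.push([]);
    for (j = 0; j <= b.length; j++) t[i].push(0);
  }
  for (i = a.length - 1; i >= 0; i--) {
    for (j = b.length - 1; j >= 0; j--) {
      t[i][j] = a[i] === b[j]
        ? t[i + 1][j + 1] + 1
        : Math.max(t[i + 1][j], t[i][j + 1]);
    }
  }
  return t;
}

// Returns true when both inputs are identical, as documented for diff().
MockDiff.prototype.diff = function (baseCode, finalCode) {
  return baseCode === finalCode;
};

// Returns the merged content with additions and deletions wrapped in labels.
MockDiff.prototype.exec = function (baseCode, finalCode) {
  var a = baseCode.split(this.sep), b = finalCode.split(this.sep);
  var t = lcsTable(a, b), out = [], i = 0, j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);  // unchanged token
      i++; j++;
    } else if (j < b.length && (i === a.length || t[i][j + 1] >= t[i + 1][j])) {
      out.push(this.addPrefix + b[j] + this.addSuffix);  // token only in the target page
      j++;
    } else {
      out.push(this.delPrefix + a[i] + this.delSuffix);  // token only in the source page
      i++;
    }
  }
  return out.join(this.sep);
};

var differ = new MockDiff("[+", "+]", "[-", "-]", " ");
var merged = differ.exec("<p> old text </p>", "<p> new text </p>");
// merged is "<p> [+new+] [-old-] text </p>"
```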
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or the use of functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. “true” if the pattern merge technique is to be applied, or “false” if not. This is “true” by default.
  o setI18n(i18n): function that updates the process internationalization.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence, defined by the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browse sequence back to the source page, against which more information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot can go back when a Back() NSEQL command is executed, if no back sequence has been defined and POST navigation has not been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
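The precedence between syncWithPost, setBackSequence and setBackPages can be summarized as a small decision function. This is a sketch only: chooseSyncStrategy, its config object and the returned labels are hypothetical, and the fallback of one page when no page count is given is an assumption that the guide does not state.

```javascript
// Hypothetical helper (not part of the ITPilot API): decides how the page state is recovered.
function chooseSyncStrategy(config) {
  if (config.syncWithPost) {
    // Default method: repeat the POST request that originally reached the page.
    return { method: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence takes precedence over Back().
    return { method: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Assumption: go back one page when setBackPages was never called.
  return { method: "BACK_COMMAND", pages: config.backPages || 1 };
}

var byPost = chooseSyncStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>" });
var byBackCommand = chooseSyncStrategy({ syncWithPost: false, backPages: 2 });
```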
5.3.13 Filter
• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: record list that participates in the filter condition, but whose records are not the ones to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subgroup complying with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition, but whose records are not the ones to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements, except for the one that caused the error.
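The behavior described in the note can be sketched as follows. filterIgnoringErrors and the sample predicate are hypothetical and written in plain JavaScript; they only illustrate that a record whose evaluation fails is dropped while the rest of the list is still filtered.

```javascript
// Hypothetical illustration of FILTER with ON_ERROR_IGNORE, not ITPilot code.
function filterIgnoringErrors(inputRecords, predicate) {
  var kept = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) kept.push(inputRecords[i]);
    } catch (e) {
      // The record that caused the error is left out; filtering continues.
    }
  }
  return kept;
}

// Sample predicate: fails on records whose price is not numeric.
function cheapEnough(record) {
  if (typeof record.price !== "number") throw new Error("price is not a number");
  return record.price <= 20;
}

var filtered = filterIgnoringErrors(
  [{ price: 5 }, { price: "n/a" }, { price: 20 }, { price: 35 }],
  cheapEnough
);
// filtered holds the records with price 5 and price 20
```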
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same value.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may be contained therein.
  o click(field, value, state): function that allows an element to be selected and a “click” event run on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): function that indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): indicates the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): the component launches the iterations in parallel.
    • flag: with “true”, the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): function that determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): function that closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
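The hasNext/next/close loop over form value combinations can be sketched with a stand-in. MockFormIterator is hypothetical: it returns the field values of each iteration instead of a page, the numeric values of the constants are invented for the example, and combining the fields as a Cartesian product is an assumption, since the guide only states that CLICKED_AND_NON_CLICKED_ELEMENT produces two combinations.

```javascript
// Stand-in constants: the guide names them; these values are assumed.
var NON_CLICKED_ELEMENT = 0;
var CLICKED_ELEMENT = 1;
var CLICKED_AND_NON_CLICKED_ELEMENT = 2;

// Hypothetical stand-in for Form_Iterator, not the ITPilot implementation.
function MockFormIterator() {
  this.fields = [];     // one entry per configured form field
  this.max = Infinity;  // limit set by setMaxIterations
  this.pos = 0;
  this.combos = null;   // built lazily on the first hasNext()
}
MockFormIterator.prototype.input = function (field, position, values) {
  this.fields.push({ name: field, options: values });
};
MockFormIterator.prototype.click = function (field, value, state) {
  var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [true, false]                 // two combinations: marked and unmarked
    : [state === CLICKED_ELEMENT];
  this.fields.push({ name: field, options: options });
};
MockFormIterator.prototype.setMaxIterations = function (count) {
  this.max = count;
};
// Assumption: iterations are the Cartesian product of all field options.
MockFormIterator.prototype.build = function () {
  var combos = [{}];
  this.fields.forEach(function (f) {
    var next = [];
    combos.forEach(function (c) {
      f.options.forEach(function (o) {
        var copy = {};
        for (var k in c) copy[k] = c[k];
        copy[f.name] = o;
        next.push(copy);
      });
    });
    combos = next;
  });
  return combos;
};
MockFormIterator.prototype.hasNext = function () {
  if (this.combos === null) this.combos = this.build();
  return this.pos < Math.min(this.combos.length, this.max);
};
MockFormIterator.prototype.next = function () {
  return this.hasNext() ? this.combos[this.pos++] : null;
};
MockFormIterator.prototype.close = function () {
  this.combos = [];
  this.pos = 0;
};

var formIterator = new MockFormIterator();
formIterator.input("city", 0, ["Madrid", "Paris"]);
formIterator.click("onlyOffers", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);
var runs = [];
while (formIterator.hasNext()) {
  runs.push(formIterator.next());  // one entry per form submission
}
formIterator.close();
```

With two text values and one checkbox in both states, the loop runs four times under the Cartesian-product assumption.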
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL where the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage “cookies”.
    • proxyUser: user name to access the Proxy, if required.
    • proxyPassword: user password to access the Proxy, if required.
    • proxyDomain: Proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory if completed; otherwise, the wrapper may not run. This representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 44
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 45
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates on a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): this determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): this returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option in the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
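The hasNext()/next() contract described above can also be followed sequentially, without the Thread object. The following is a minimal sketch, not ITPilot code: IteratorStandIn is a hypothetical stand-in written only so that the example runs in a plain JavaScript engine, and the records are assumed to be plain objects with a hypothetical TITLE field.

```javascript
// Hypothetical stand-in for the ITPilot Iterator component, so that the
// sketch can run outside ITPilot. It is NOT the ITPilot class itself.
function IteratorStandIn(list) {
    this.list = list;
    this.position = 0;
}

// Returns true if there is at least one more element to visit.
IteratorStandIn.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next element, following the order of the list.
IteratorStandIn.prototype.next = function () {
    var element = this.list[this.position];
    this.position = this.position + 1;
    return element;
};

// Sequential counterpart of Figure 5: each record is visited once, in list
// order. TITLE is an assumed field name used for illustration only.
function collectTitles(records) {
    var iterator = new IteratorStandIn(records);
    var titles = [];
    while (iterator.hasNext()) {
        var recordInstance = iterator.next();
        titles.push(recordInstance.TITLE);
    }
    return titles;
}
```

Inside a real wrapper, the Iterator object provided by ITPilot would take the place of the stand-in; only the loop shape is meant to carry over.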
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions allow a query to be sent to any source available via JDBC, and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this allows iteration through different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session's information.
    • inputPage: this indicates the page from which the next browsing sequence is to be made.
  o next(inputRecords, inputPage): this returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: this indicates the page from which the next pages are to be accessed.
  o close(): this closes the iterator.
  o setRetries(count): this configures the number of retries in the event of an error in accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): this configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with a setBackSequence method. If there is not, an NSEQL Back() method is run.
  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): this allows the component input record that is to be used as the wrapper result to be added subsequently.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
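The way a field reference picks a value out of the recordsObj list can be pictured with a small sketch. It assumes that a reference has the form $<index>.<FIELD> (an index into recordsObj followed by a field name) and uses plain arrays and objects in place of ITPilot records; resolveReference is a hypothetical helper, not a function provided by ITPilot, and it does not cover the expression functions of Appendix A of [GENER].

```javascript
// Hypothetical helper: resolves a reference such as "$0.PARAM1" against a
// list of plain objects standing in for the recordsObj input list.
function resolveReference(reference, recordsObj) {
    // Assumed syntax: "$", a zero-based record index, ".", a field name.
    var match = /^\$(\d+)\.(\w+)$/.exec(reference);
    if (match === null) {
        throw new Error("Unsupported reference: " + reference);
    }
    var index = parseInt(match[1], 10);
    if (index >= recordsObj.length) {
        throw new Error("No input record at position " + index);
    }
    return recordsObj[index][match[2]];
}
```

With two input records, "$0.PARAM1" reads PARAM1 from the first record and "$1.PARAM1" reads it from the second.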
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; this allows a homepage to be indicated.
  o exec(): this returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
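The practical difference between the while form used by Loop (section 5.3.19) and the do… while form used by Repeat is when the condition is first checked. The sketch below illustrates this with a hypothetical CountingCondition stand-in, whose exec() returns true for a fixed number of calls; it is not the ITPilot Condition object, and it only mirrors the exec([]) call shape shown in the figures.

```javascript
// Hypothetical stand-in for a Condition object: exec() returns true for the
// first `limit` calls and false afterwards. Not the ITPilot class.
function CountingCondition(limit) {
    this.limit = limit;
    this.calls = 0;
}
CountingCondition.prototype.exec = function (inputs) {
    this.calls = this.calls + 1;
    return this.calls <= this.limit;
};

// Loop-style (while): the condition is checked before the first pass, so the
// body may not run at all.
function runLoop(limit) {
    var loop = new CountingCondition(limit);
    var passes = 0;
    while (loop.exec([])) {
        passes = passes + 1;
    }
    return passes;
}

// Repeat-style (do… while): the body runs once before the first check, so it
// always runs at least once.
function runRepeat(limit) {
    var repeat = new CountingCondition(limit);
    var passes = 0;
    do {
        passes = passes + 1;
    } while (repeat.exec([]));
    return passes;
}
```

With a condition that is false from the start, runLoop(0) performs no passes while runRepeat(0) performs one, which is the case to keep in mind when choosing between the two components.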
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position the component occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; this indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): this sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): this closes the connection with the running browser.
  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether there is a back sequence defined with a setBackSequence method. If there is not, an NSEQL Back() command is run.
  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): this returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: this stores the contents entered as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): this function determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): this sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): this launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
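As an illustration of the naming scheme above, the following sketch shows what the main and output-type functions of a component might look like. The component name "trimtext" is hypothetical, the input is assumed to arrive as a plain string (the actual input representation is not described in this section), and STRING_TYPE is declared locally only so that the sketch runs on its own. The getInputStructure and getOutputStructure functions are omitted because their bodies are not documented here.

```javascript
// Output type code copied from the list above (STRING_TYPE = 13). Declared
// here only so that the sketch is self-contained.
var STRING_TYPE = 13;

// Main function of a hypothetical custom component named "trimtext".
// Assumption: the input can be treated as a plain string.
function trimtext_main(trimtext_input) {
    var trimtext_output = null;
    if (trimtext_input !== null && trimtext_input !== undefined) {
        // Remove leading and trailing whitespace.
        trimtext_output = String(trimtext_input).replace(/^\s+|\s+$/g, "");
    }
    return trimtext_output;
}

// Declares that the component returns a simple string value, so no output
// structure function is needed for this component.
function trimtext_getOutputType() {
    return STRING_TYPE;
}
```

Following the rule stated in this section, such a file would be saved as trimtext.js, so that the file name matches the component name.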
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
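The quoting rule in the NOTE can be applied mechanically before the script is embedded in the VQL statement. The sketch below is a hypothetical helper, not part of ITPilot. Because the exact delimiter and escape characters should be checked against [VQL], both are passed in as parameters; the usage example assumes a double quote as the delimiter and a backslash as the escape character.

```javascript
// Hypothetical helper: prefixes every occurrence of quoteChar in jscode with
// escapeChar. Both characters are supplied by the caller, because this
// sketch does not assume which ones the VQL syntax actually uses.
function escapeQuotes(jscode, quoteChar, escapeChar) {
    // split/join avoids having to escape regular-expression metacharacters.
    return jscode.split(quoteChar).join(escapeChar + quoteChar);
}

// Example under the stated assumptions (double-quote delimiter, backslash
// escape character).
var escaped = escapeQuotes('var s = "a";', '"', '\\');
```

Code that already contains the escape character may need further treatment; that case is outside the scope of this sketch.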
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds the value of this parameter, new requests are queued until a connection becomes free.
    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
3 ITPILOT DEVELOPMENT API
Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Amongst other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server, and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Amongst other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented through an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them, and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server in the connection to the server, and to use any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).
In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In this constructor, in addition to the machine and port on which the server is running, the name of the Virtual DataPort database to connect to must be specified, as well as the user ID with which access is to be made and the associated password.
Note that, even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, and the predefined user 'admin' (with the initial password 'admin') is used to gain access.
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database will be obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql): takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(): returns the VQL description of all the wrappers in the ITPilot execution server.
33 USING WRAPPERS
Once a reference to a wrapper has been obtained (instance of the class HTMLWrapperProxy) various operations can be carried out on it through the methods of said class To execute a query to a wrapper we will use the method
HTMLWrapperResultIterator query(Map params) The query to be executed is represented as a map of pairs name of attributevalue The attribute names must match the names of the input parameters specified during the creation of the wrapper The values must be specified as character strings even when the input parameters expected by the wrapper belong to other type For example if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it we must pass the ldquo325rdquo string In the case of float double and date data types it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or in case of date data types the date pattern if it was set It is important to take into account that for the query to execute correctly a value must be specified for all the mandatory attributes See [GENER] for more information on the process of generating wrappers in ITPilot Although most of the applications will not require this a wrapper schema can be obtained using the method
HTMLWrapperMetaRegisterRawVO getSchema() This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of said schema The schema was defined during the generation of the wrapper (see [GENER]) The results returned by a wrapper follow a hierarchical structure Each output tuple contains a value for every attribute contained in the wrapper response Each attribute may be either atomic or compound The value of atomic attributes can be of any of the basic data types available in ITPilot int long float double text date
ITPilot 46 Developer Guide
ITPilot Development API 9
Boolean or blob The value of a compound attribute is always an array of registers In the same form each register will be composed of several fields and again these fields may be either atomic or compound For example a wrapper that returns data on movies may have a schema in which each result is comprised of the fields TITLE DIRECTOR and EDITIONS TITLE and DIRECTOR are atomic fields and EDITIONS is a compound field containing data on various editions available of the movie (DVD VHS directorrsquos cut etc) The value of EDITIONS is an array of registers where each register contains the fields FORMAT PRICE and DESCRIPTION all of which are atomic The invocation to getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO which represents the schema of a ldquohierarchicalrdquo register of the type described above See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO It is also possible to access the characteristics of the various atomic fields that comprise the schema Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO Specifically the following information can be obtained from an atomic field its type by using the method javalangClass getType() whether the value is obtained from the source or not (that is to know if it is a searchable field that can not be found in the output schema using the method boolean isSearchStatus()) and in that case whether it is mandatory or not (method boolean isMandatoryStatus()) Furthermore if they have been defined during the generation process it is also possible to obtain the regular expression (method javalangString getRegexp()) and the aliases defined for each field (method javautilList getTextValues()) Finally the methods
    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
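The hierarchical schema that getSchema() describes can be pictured with a small, self-contained model. The following is only a sketch: the classes SchemaSketch and Field are hypothetical stand-ins written for this illustration and are not part of the ITPilot API (the real classes are HTMLWrapperMetaRegisterRawVO and HTMLWrapperMetaSimpleRawVO, whose methods are documented in the Javadoc). It builds the Movies schema used in this section and prints it as an outline.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in classes; they are NOT the ITPilot API.
public class SchemaSketch {

    /** A field is atomic (typeName set, no children) or compound (typeName null). */
    static final class Field {
        final String name;
        final String typeName;                       // e.g. "text", "double"; null if compound
        final List<Field> children = new ArrayList<>();

        Field(String name, String typeName) {
            this.name = name;
            this.typeName = typeName;
        }

        boolean isCompound() {
            return typeName == null;
        }

        Field add(Field child) {
            children.add(child);
            return this;
        }
    }

    /** Builds the Movies schema used as the running example in the text. */
    static Field moviesSchema() {
        Field editions = new Field("EDITIONS", null)
                .add(new Field("FORMAT", "text"))
                .add(new Field("PRICE", "double"))
                .add(new Field("DESCRIPTION", "text"));
        return new Field("MOVIES", null)
                .add(new Field("TITLE", "text"))
                .add(new Field("DIRECTOR", "text"))
                .add(editions);
    }

    /** Renders the schema as an indented outline, one field per line. */
    static String describe(Field field, String indent) {
        StringBuilder out = new StringBuilder();
        out.append(indent).append(field.name);
        out.append(field.isCompound() ? " (compound)" : " : " + field.typeName);
        out.append('\n');
        for (Field child : field.children) {
            out.append(describe(child, indent + "  "));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(moviesSchema(), ""));
    }
}
```

The nesting depth is unbounded in this model, which mirrors the statement that the fields of a register may themselves be compound.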
3.4 PROCESSING QUERY RESULTS
The query method used to execute queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time, through the network).
The method hasNext() checks whether there are still elements to return. Because access is asynchronous, this method must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the raise error handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
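The access pattern described above (call hasNext() before every next(), then check for errors once the loop ends) can be sketched in isolation. The interface ResultCursor and the helper cursorOf below are hypothetical stand-ins written for this illustration; they only mirror the four method names mentioned in the text and do not reproduce the streaming behavior of the real HTMLWrapperResultIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ResultLoopSketch {

    /** Hypothetical stand-in exposing the methods named in the text. */
    interface ResultCursor<T> extends Iterator<T> {
        boolean checkErrors();
        String getErrorDescription();
    }

    /** In-memory cursor used only to exercise the loop; a null error means "no error". */
    static <T> ResultCursor<T> cursorOf(List<T> rows, String error) {
        Iterator<T> it = rows.iterator();
        return new ResultCursor<T>() {
            public boolean hasNext() { return it.hasNext(); }
            public T next() { return it.next(); }
            public boolean checkErrors() { return error != null; }
            public String getErrorDescription() { return error; }
        };
    }

    /** Reads every row, then reports any error raised during execution. */
    static <T> List<T> drain(ResultCursor<T> cursor) {
        List<T> rows = new ArrayList<>();
        while (cursor.hasNext()) {          // always ask before calling next()
            rows.add(cursor.next());
        }
        if (cursor.checkErrors()) {         // errors are checked after the loop
            throw new IllegalStateException(cursor.getErrorDescription());
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(drain(cursorOf(Arrays.asList("Annie Hall", "Manhattan"), null)));
        try {
            drain(cursorOf(Arrays.asList("Zelig"), "error connecting to the data source"));
        } catch (IllegalStateException e) {
            System.out.println("Query failed: " + e.getMessage());
        }
    }
}
```

Turning the error into an exception is a design choice of this sketch; client code could equally log the description and keep the rows already received.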
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:

    TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    // DoubleVO is used below; it is assumed to live in the same package as SimpleVO
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field, so it is read as a DoubleVO
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            }
        }
    }

Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions (including custom ITPilot functions), is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions, without dependencies on Denodo libraries, but developing them with Java annotations is recommended. The annotations, and the compound types and values required to create custom functions, are located in

    $DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing all of its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types, or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used.

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic (primitive) types: int, long, double, etc.
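As an illustration of the equivalency table and of the note about primitive types, the mapping can be written as a lookup. This is a minimal sketch: the class TypeMappingSketch and the method itpilotTypeOf are hypothetical names invented for this example, and treating subclasses of java.util.Calendar as dates is an assumption of the sketch, not a documented rule.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

public class TypeMappingSketch {

    // Mirrors the equivalency table; the map itself is illustrative, not an ITPilot class.
    static final Map<Class<?>, String> JAVA_TO_ITPILOT = new LinkedHashMap<>();
    static {
        JAVA_TO_ITPILOT.put(Integer.class, "int");
        JAVA_TO_ITPILOT.put(Long.class, "long");
        JAVA_TO_ITPILOT.put(Float.class, "float");
        JAVA_TO_ITPILOT.put(Double.class, "double");
        JAVA_TO_ITPILOT.put(Boolean.class, "boolean");
        JAVA_TO_ITPILOT.put(String.class, "text");
        JAVA_TO_ITPILOT.put(Calendar.class, "date");
        JAVA_TO_ITPILOT.put(byte[].class, "binary");
    }

    /** Returns the ITPilot type name for a Java parameter type, rejecting primitives. */
    static String itpilotTypeOf(Class<?> javaType) {
        if (javaType.isPrimitive()) {
            throw new IllegalArgumentException(
                    "Primitive types are not allowed as parameters: " + javaType);
        }
        // Assumption: concrete Calendar subclasses (e.g. GregorianCalendar) count as dates.
        if (Calendar.class.isAssignableFrom(javaType)) {
            return "date";
        }
        String type = JAVA_TO_ITPILOT.get(javaType);
        if (type == null) {
            throw new IllegalArgumentException("No ITPilot equivalent for " + javaType.getName());
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(String.class));
        System.out.println(itpilotTypeOf(byte[].class));
    }
}
```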
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

Thus, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample. All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array; e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
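The naming conventions described earlier in this section can be illustrated with a class that has no dependency on Denodo libraries. This is a sketch under stated assumptions: the class name and the execute(String... inputs) declaration come from the text, while skipping null inputs and the exact parameter list of executeReturnType are choices made for this example only.

```java
// Recognized by name: <FunctionName> + "ItpFunction" gives the function Concat_Sample.
public class Concat_SampleItpFunction {

    /** Signature with arity n: the last (here, only) parameter is an array. */
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {        // assumption of this sketch: nulls are skipped
                result.append(input);
            }
        }
        return result.toString();
    }

    /** Optional: reports the return type without running execute. */
    public Class<?> executeReturnType(String... inputs) {
        return String.class;
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction function = new Concat_SampleItpFunction();
        System.out.println(function.execute("IT", "Pilot"));
    }
}
```

Because the return type is always text here, executeReturnType is not strictly needed; it is included only to show where such a method would sit.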
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method in order to compute the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the corresponding Java class; i.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                // Use a fresh map per record so that records do not share state
                LinkedHashMap<String, Object> results =
                        new LinkedHashMap<String, Object>(1);
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of records with one text field)
        // without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with edition 3 of the ECMA-262 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).

These functions mirror the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). However, rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used instead: rs.setText("FIELD", "", OBLIGATORY).
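The rule about omitting parameters (only trailing arguments can be left out) behaves like a chain of overloads in which each shorter form supplies the defaults for the arguments to its right. The Java sketch below models that rule for a setText(field, regexp, type) function; the class OptionalArgsSketch and the string constants are hypothetical, and the wrapper API itself is JavaScript, so this only illustrates the binding of positions, not the real implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class OptionalArgsSketch {

    static final String OPTIONAL = "OPTIONAL";
    static final String OBLIGATORY = "OBLIGATORY";

    /** Each entry records how the arguments were bound: field|regexp|type. */
    final List<String> fields = new ArrayList<>();

    // Shorter forms fill in defaults for the parameters on the right only.
    void setText(String field) {
        setText(field, "");
    }

    void setText(String field, String regexp) {
        setText(field, regexp, OPTIONAL);
    }

    void setText(String field, String regexp, String type) {
        fields.add(field + "|" + regexp + "|" + type);
    }

    public static void main(String[] args) {
        OptionalArgsSketch rs = new OptionalArgsSketch();
        rs.setText("FIELD");                  // regexp and type take their defaults
        rs.setText("FIELD", OBLIGATORY);      // mistake: OBLIGATORY lands in the regexp slot
        rs.setText("FIELD", "", OBLIGATORY);  // correct: regexp given explicitly
        for (String binding : rs.fields) {
            System.out.println(binding);
        }
    }
}
```

The second call shows why a middle parameter cannot be skipped: the value intended as the type is bound to the position of the regular expression.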
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression for the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression for the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string representation.

When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components, and are therefore described in this first section. The catalog indicates which components lack any of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: the type of error for which the behavior is being defined. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the execution environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error is still happening after the retries.
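The four error actions can be compared side by side with a small simulation. This Java sketch is not ITPilot code: the class ErrorActionSketch, the run helper, the maxRetries parameter and the flaky test step are all hypothetical. It also assumes that ON_ERROR_RETRY raises the last error once the retries are exhausted (the text only states this contrast implicitly) and it omits the waiting time between retries.

```java
import java.util.concurrent.Callable;

public class ErrorActionSketch {

    enum ErrorAction { ON_ERROR_RAISE, ON_ERROR_IGNORE, ON_ERROR_RETRY, ON_ERROR_RETRY_IGNORE }

    /**
     * Runs a step under the given action. Returns null when an error is ignored.
     * maxRetries is a hypothetical parameter; it is only used by the two retry actions.
     */
    static <T> T run(Callable<T> step, ErrorAction action, int maxRetries) throws Exception {
        boolean retry = action == ErrorAction.ON_ERROR_RETRY
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        boolean ignore = action == ErrorAction.ON_ERROR_IGNORE
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        int attempts = retry ? 1 + Math.max(0, maxRetries) : 1;
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return step.call();
            } catch (Exception e) {
                last = e;                 // remember the failure and try again, if allowed
            }
        }
        if (ignore) {
            return null;                  // continue the run with a null result
        }
        throw last;                       // stop the run, reporting the source of the error
    }

    /** Helper for the demo: fails the first `failures` calls, then succeeds. */
    static Callable<String> flaky(int failures) {
        int[] remaining = {failures};
        return () -> {
            if (remaining[0]-- > 0) {
                throw new IllegalStateException("connection problem");
            }
            return "page";
        };
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(flaky(2), ErrorAction.ON_ERROR_RETRY, 2));
        System.out.println(run(flaky(5), ErrorAction.ON_ERROR_RETRY_IGNORE, 2));
    }
}
```

In the demo, the first call succeeds on its third attempt, while the second exhausts its retries and yields null instead of stopping.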
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a character string; e.g. MyCondition = new Condition("($0 <= $1)") indicates that, in the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
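The positional placeholders ($0, $1, ...) in a condition expression refer to the elements passed to exec, in order. The Java sketch below illustrates only that binding, for a single comparison between two placeholders. The class ConditionSketch is hypothetical, the four comparison operators it accepts are an assumption of this example, and it does not cover the function library of Appendix A of [GENER].

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConditionSketch {

    // Accepts one comparison between two placeholders, with optional parentheses.
    private static final Pattern SIMPLE = Pattern.compile(
            "\\s*\\(?\\s*\\$(\\d+)\\s*(<=|>=|<|>)\\s*\\$(\\d+)\\s*\\)?\\s*");

    private final int left;
    private final int right;
    private final String op;

    ConditionSketch(String expr) {
        Matcher m = SIMPLE.matcher(expr);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported expression: " + expr);
        }
        left = Integer.parseInt(m.group(1));    // index of the left-hand element
        op = m.group(2);
        right = Integer.parseInt(m.group(3));   // index of the right-hand element
    }

    /** Applies the condition to the elements; $n refers to elements[n]. */
    boolean exec(double... elements) {
        double a = elements[left];
        double b = elements[right];
        switch (op) {
            case "<=": return a <= b;
            case ">=": return a >= b;
            case "<":  return a < b;
            default:   return a > b;            // ">" is the only operator left
        }
    }

    public static void main(String[] args) {
        ConditionSketch condition = new ConditionSketch("($0 <= $1)");
        System.out.println(condition.exec(1, 2));
        System.out.println(condition.exec(3, 2));
    }
}
```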
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used to separate HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted content.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether HTML tag attributes are ignored when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether deleted content is shown in the result page.
    • mergedDeletions: if "true", the deleted content is shown. If "false", the configuration set by the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match a regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
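The role of the four labels is easier to see on a toy comparison. The sketch below is an illustration only, not the algorithm used by the Diff component: markDifferences is a hypothetical function, tokens are split on whitespace rather than on HTML elements, the alignment uses a plain longest-common-subsequence table, and the bracket labels are arbitrary stand-ins for the HTML tags mentioned above.

```javascript
// Hypothetical sketch: whitespace tokenization and LCS alignment are assumptions,
// not the ITPilot Diff implementation.
function markDifferences(baseCode, finalCode, labels) {
  const base = baseCode.split(/\s+/).filter(Boolean);
  const target = finalCode.split(/\s+/).filter(Boolean);

  // lcs[i][j] = length of the longest common subsequence of base[i..] and target[j..]
  const lcs = Array.from({ length: base.length + 1 }, () =>
    new Array(target.length + 1).fill(0)
  );
  for (let i = base.length - 1; i >= 0; i--) {
    for (let j = target.length - 1; j >= 0; j--) {
      lcs[i][j] = base[i] === target[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  const deleted = (token) => labels.deletionPrefix + token + labels.deletionSuffix;
  const added = (token) => labels.additionPrefix + token + labels.additionSuffix;

  // Walk both token lists, wrapping tokens that appear on only one side.
  const result = [];
  let i = 0;
  let j = 0;
  while (i < base.length && j < target.length) {
    if (base[i] === target[j]) {
      result.push(base[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      result.push(deleted(base[i]));
      i++;
    } else {
      result.push(added(target[j]));
      j++;
    }
  }
  while (i < base.length) result.push(deleted(base[i++]));
  while (j < target.length) result.push(added(target[j++]));
  return result.join(" ");
}

const labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
};
console.log(markDifferences("Price 10 EUR today", "Price 12 EUR today", labels));
// Price [-10-] [+12+] EUR today
```

Replacing the bracket labels with opening and closing HTML tags gives the kind of highlighted result page that the prefix and suffix parameters describe.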
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification given in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. "true" if the pattern-merging technique is to be applied, "false" if not. The default is "true".
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): lets the user determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence for returning to the source page, against which further data extraction operations are going to be executed.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
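The way syncWithPost, setBackSequence and setBackPages interact, as described above, can be summarised as a small decision function. This is only a sketch of the documented order of precedence: chooseBackStrategy and the shape of its argument and result are hypothetical and do not belong to the ITPilot API.

```javascript
// Hypothetical helper written for this sketch; it is not part of the ITPilot API.
// The option names mirror syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(options) {
  if (options.syncWithPost) {
    // Re-send the POST parameters originally used to reach the page.
    return { kind: "POST" };
  }
  if (options.backSequence) {
    // Run the explicit NSEQL back sequence supplied by the developer.
    return { kind: "BACK_SEQUENCE", sequence: options.backSequence };
  }
  // Fall back to the NSEQL Back() command, going back the configured number of pages.
  return { kind: "BACK_COMMAND", pages: options.backPages };
}

console.log(chooseBackStrategy({ syncWithPost: true }).kind); // POST
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }).kind); // BACK_COMMAND
```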
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for the list of records passed to the exec function.
    • auxiliaryRecords: record list that takes part in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that takes part in the filter condition but does not contain the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
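The behaviour described in the note can be pictured with a plain JavaScript stand-in. In this sketch, filterRecords, the predicate function and the string constants are all hypothetical; the real component takes its condition as an expression string and its error policy through the component's error handler.

```javascript
// Stand-in constants; the real ITPilot constants are provided by the wrapper environment.
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Hypothetical helper: keeps the records for which condition(record) is true.
// With ON_ERROR_IGNORE, a record whose evaluation fails is skipped.
function filterRecords(inputRecords, condition, errorPolicy) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (condition(record)) {
        kept.push(record);
      }
    } catch (err) {
      if (errorPolicy !== ON_ERROR_IGNORE) {
        throw err; // any other policy propagates the error in this sketch
      }
    }
  }
  return kept;
}

const records = [{ price: 10 }, { price: null }, { price: 30 }];
const isCheap = (record) => {
  if (record.price === null) throw new Error("price missing");
  return record.price < 20;
};
// Only the first record is kept: the second fails and is ignored, the third is too expensive.
console.log(filterRecords(records, isCheap, ON_ERROR_IGNORE));
```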
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field element has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equal: Boolean value that indicates whether the provided values must exactly match the field values ("true") or only need to be contained in them ("false").
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL with the POST parameters that were used to reach it. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out on it.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
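To see how the configured field values might translate into iterations, the sketch below expands them into concrete combinations. It rests on two assumptions that the text above does not confirm: that the component iterates over the cross product of the values of all configured fields, and that setMaxIterations simply truncates that list. The constants are stand-ins and buildIterations and expandState are hypothetical helpers.

```javascript
// Stand-in constants; the actual values of the ITPilot constants are not given in this guide.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expands a checkbox state into the concrete marked/unmarked values it stands for.
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [true, false]
    : [state === CLICKED_ELEMENT];
}

// fields: array of { name, values }, one entry per configured form field.
// Returns every combination of values (assumed cross product), capped at maxIterations.
function buildIterations(fields, maxIterations) {
  let combinations = [{}];
  for (const field of fields) {
    const next = [];
    for (const combination of combinations) {
      for (const value of field.values) {
        next.push({ ...combination, [field.name]: value });
      }
    }
    combinations = next;
  }
  return combinations.slice(0, maxIterations);
}

const iterations = buildIterations(
  [
    { name: "city", values: ["Madrid", "Paris"] },
    { name: "newsletter", values: expandState(CLICKED_AND_NON_CLICKED_ELEMENT) },
  ],
  10
);
console.log(iterations.length); // 4
```

Here the checkbox configured with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of runs, which is the "two combinations" effect described for that constant.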
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identifier.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser identifier.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL from which the page comes.
    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.
    • lastURLPostParameters: POST-method parameters of the URL from which the page comes.
    • cookie: information stored in cookies.
    • proxyUser: user name for accessing the proxy, if required.
    • proxyPassword: user password for accessing the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and only used when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but if it is provided it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
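The effect of the three obl values can be sketched with a small stand-in that builds an initialization record from the parameters supplied by a caller. The helper buildInitRecord, the field descriptors and the string constants are hypothetical and only mimic the behaviour described above; in particular, leaving a missing OPTIONAL field out of the record is an assumption of this sketch.

```javascript
// Stand-in constants and helper; not the ITPilot Init implementation.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// fields: array of { name, obl, fixedValue }; query: values supplied by the caller.
function buildInitRecord(fields, query) {
  const record = {};
  for (const field of fields) {
    const obl = field.obl || OPTIONAL; // OPTIONAL is the documented default
    if (obl === FIXED) {
      // A FIXED field always takes its constant value.
      record[field.name] = field.fixedValue;
    } else if (Object.prototype.hasOwnProperty.call(query, field.name)) {
      record[field.name] = query[field.name];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    }
    // A missing OPTIONAL field is simply left out (assumption of this sketch).
  }
  return record;
}

const fields = [
  { name: "keyword", obl: OBLIGATORY },
  { name: "maxResults" },
  { name: "country", obl: FIXED, fixedValue: "ES" },
];
console.log(buildInitRecord(fields, { keyword: "denodo" }));
// { keyword: 'denodo', country: 'ES' }
```

Calling buildInitRecord(fields, {}) throws in this sketch, because the obligatory keyword parameter has no value.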
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option available in the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
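For comparison, the sequential form of the same loop can be tried outside ITPilot with a stand-in. ListIterator below is a hypothetical class that only mirrors the hasNext() and next() functions documented above; it is not the Iterator component itself.

```javascript
// Hypothetical stand-in exposing the same two functions as the Iterator component.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }

  // True while at least one record has not been returned yet.
  hasNext() {
    return this.position < this.list.length;
  }

  // Returns the next record, in list order.
  next() {
    if (!this.hasNext()) {
      throw new Error("No more records");
    }
    return this.list[this.position++];
  }
}

const iterator = new ListIterator([{ title: "A" }, { title: "B" }]);
const titles = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  titles.push(recordInstance.title);
}
console.log(titles.join(",")); // A,B
```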
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL of the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop's output (exit) condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
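The pattern in Figure 6 can be tried outside ITPilot with a stand-in for Condition. The sketch below is only an illustration: the Condition stub (which takes a plain JavaScript predicate instead of an ITPilot condition expression), the placeholder values of RUNTIME_ERROR and ON_ERROR_RAISE, and the pagesVisited counter are all assumptions made so that the snippet runs on its own.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot.
// The real Condition component (section 5.3.5) takes a condition
// expression, not a JavaScript predicate.
var RUNTIME_ERROR = "RUNTIME_ERROR";   // placeholder value
var ON_ERROR_RAISE = "ON_ERROR_RAISE"; // placeholder value

function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
    // The stub only records the handler; it never triggers it.
    this.errorType = errorType;
    this.action = action;
};
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// WHILE ... DO: the condition is evaluated before every pass,
// so the body may run zero times.
var pagesVisited = 0;
var loop = null;
loop = new Condition(function () { return pagesVisited < 3; });
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    pagesVisited = pagesVisited + 1; // loop operations go here
}
```

Because the condition is checked first, a predicate that is false from the start would leave pagesVisited at 0.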
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a set of inter-related pages, reached through one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
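Taken together, syncWithPost, setBackSequence and setBackPages describe a fallback order for returning to the source page. The function below merely restates that order as code to make it easier to follow; it is not part of the ITPilot API, and the config object, its field names and the returned labels are hypothetical.

```javascript
// Returns the mechanism that would be used to get back to the source page,
// following the order given in the syncWithPost description.
// config.syncWithPost : boolean, as passed to syncWithPost(flag)
// config.backSequence : NSEQL back program, or null if none was set
// config.backPages    : value passed to setBackPages(pages)
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-issue the POST request that produced the page.
        return { mechanism: "POST" };
    }
    if (config.backSequence) {
        // Run the explicit back sequence.
        return { mechanism: "BACK_SEQUENCE", program: config.backSequence };
    }
    // Neither is available: browse back with NSEQL Back().
    return { mechanism: "NSEQL_BACK", pages: config.backPages };
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: null, backPages: 1 });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "hypothetical back program", backPages: 1 });
var withBack = chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 });
```

The third call is the only one in which the backPages value matters, which matches the condition stated for setBackPages.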
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although this may not be a good option if the previous iterator runs in parallel to this component.
    • inputPage: optional; allows a start page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object used as the loop's output (exit) condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
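The practical difference from the Loop component is that the body of a do… while loop always runs at least once, because the Condition is evaluated after each pass. A minimal sketch of this, assuming a stand-in Condition that wraps a plain JavaScript predicate (the stub and the attempts counter are not part of ITPilot):

```javascript
// Hypothetical stand-in for the Condition component (section 5.3.5);
// the real component evaluates an ITPilot condition expression.
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
    // No-op in this stub.
};
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var attempts = 0;
var repeat = null;
// This predicate never asks for another pass.
repeat = new Condition(function () { return false; });
repeat.onError("RUNTIME_ERROR", "ON_ERROR_RAISE"); // placeholder values
do {
    attempts = attempts + 1; // loop operations go here
} while (repeat.exec([]));
```

With the same predicate, the while form used by the Loop component would not execute the body at all.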
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
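The retry settings exposed by setRetries and setRetryDelay can be read as a simple retry-with-delay loop. The helper below is a generic sketch of that idea and not ITPilot's implementation: runWithRetries, its parameters and the injected sleep function are hypothetical, and it assumes that the retry count is in addition to the first attempt.

```javascript
// Runs `action` once and, if it throws, up to `retries` more times.
// `sleep` receives the delay in milliseconds; it is injected so the
// sketch does not depend on any particular timer API.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e; // retries exhausted: propagate the last error
            }
            sleep(retryDelayMs);
            attempt = attempt + 1;
        }
    }
}

// Usage: the action fails twice and succeeds on the third attempt.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) {
            throw new Error("file not writable yet");
        }
        return "stored on attempt " + (attempt + 1);
    },
    3,   // comparable to setRetries(3)
    500, // comparable to setRetryDelay(500)
    function (ms) { waits.push(ms); } // records the delay instead of sleeping
);
```

Injecting the sleep function keeps the example deterministic; in a real wrapper the waiting is handled by the component itself.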
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
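The usual call order for this component is to configure the limit, call execute once per record, and then call wait before using the results. The sketch below shows only that order. Its Thread is a sequential stand-in that runs each call immediately rather than in parallel, and the way it is constructed (with a map from function names to functions) is an assumption of the stub, since no constructor is described here. The processRecord function and the sample records are likewise hypothetical.

```javascript
// Hypothetical sequential stand-in for the Thread component.
function Thread(functions) {
    this.functions = functions;
    this.maxConcurrent = 1;
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max; // recorded only; the stub never runs in parallel
};
Thread.prototype.execute = function (functionName) {
    // Remaining arguments are passed on to the named function.
    var args = Array.prototype.slice.call(arguments, 1);
    this.functions[functionName].apply(null, args);
};
Thread.prototype.wait = function () {
    // Nothing is pending in the sequential stub.
};

// Function run once per extracted record.
var titles = [];
function processRecord(record, suffix) {
    titles.push(record.TITLE + suffix);
}

var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var thread = new Thread({ processRecord: processRecord });
thread.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
    // The extra arguments must match the parameters of processRecord.
    thread.execute("processRecord", records[i], "!");
}
thread.wait(); // continue only after every execute() call has finished
```

With real concurrency the order of the collected titles would not be guaranteed, which is why results should only be read after wait() returns.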
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
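As an illustration of the naming convention, the following sketch outlines a hypothetical custom component called "upper". It assumes that the component input arrives as a single string value, which may not match how ITPilot actually passes inputs, and it defines STRING_TYPE locally so that the snippet runs on its own. The upper_getInputStructure function is left out because the body of that function is not shown above, and no upper_getOutputStructure function is needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Output type code taken from the list of possible values (STRING_TYPE = 13).
// Defined locally here only so the sketch is self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "upper" component: receives the
// component input and returns the component output.
function upper_main(upper_input) {
    var upper_output = null;
    // Assumption: the input is a single string value.
    upper_output = String(upper_input).toUpperCase();
    return upper_output;
}

// Declares the output type of the component.
function upper_getOutputType() {
    return STRING_TYPE;
}

var sampleOutput = upper_main("itpilot");
```

Following the convention described in this section, these functions would be saved in a file named after the component (upper.js in this hypothetical case).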
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
3 ITPILOT DEVELOPMENT API
Denodo ITPilot includes a Java API for developing applications that use the wrappers created with it. Among other functions, this API allows an application to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.
The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.
When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).
The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing queries. An exhaustive description of the API at programming level can be found in the Javadoc documentation [JDOC].
3.1 CONNECTING TO THE SERVER
There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.
If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.
If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case it is possible to specify, in the connection to the server, any database created in the Virtual DataPort server and to use any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users in Denodo Virtual DataPort).
In this case the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made and the associated password.
Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case a default database called 'itpilot' is accessed, and the predefined user 'admin' (with the initial password 'admin') is used to gain access.
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data about the execution server and for accessing the wrappers present in it:
• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are obtained.
• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as parameter.
• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as parameter.
• void loadWrapper(String vql): takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.
• String getVQL(): returns the VQL description of all the wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:
HTMLWrapperResultIterator query(Map params)
The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". In the case of the float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper's Init component or, in the case of date data types, according to the date pattern, if one was set.
For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.
Although most applications will not require it, the schema of a wrapper can be obtained with the method:
HTMLWrapperMetaRegisterRawVO getSchema()
This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).
The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented by instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether its value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods
void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)
allow setting through the API whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
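The hierarchical result structure described in this section, with atomic fields and compound fields that hold arrays of registers, can be pictured with plain objects. The following sketch is a language-neutral illustration written in JavaScript and is not the Java API: the object layout, the sample values and the listAtomicFields helper are all invented for the example.

```javascript
// One result of the movies schema: TITLE and DIRECTOR are atomic,
// EDITIONS is compound (an array of registers with atomic fields).
// All values are made-up sample data.
var movie = {
    TITLE: "Sample Movie",
    DIRECTOR: "Sample Director",
    EDITIONS: [
        { FORMAT: "DVD", PRICE: 9.5, DESCRIPTION: "standard edition" },
        { FORMAT: "VHS", PRICE: 4.0, DESCRIPTION: "director's cut" }
    ]
};

// Walks a register and lists the path of every atomic field in it.
function listAtomicFields(register, prefix) {
    var paths = [];
    for (var name in register) {
        if (!register.hasOwnProperty(name)) {
            continue;
        }
        var value = register[name];
        if (value instanceof Array) {
            // Compound field: recurse into each register of the array.
            for (var i = 0; i < value.length; i++) {
                paths = paths.concat(
                    listAtomicFields(value[i], prefix + name + "[" + i + "]."));
            }
        } else {
            // Atomic field.
            paths.push(prefix + name);
        }
    }
    return paths;
}

var fields = listAtomicFields(movie, "");
```

For this sample, fields contains two top-level atomic fields plus three atomic fields for each of the two registers under EDITIONS.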
34 PROCESSING QUERY RESULTS
The query method for executing queries to a wrapper returns as a result an instance of the class comdenodoitpilotclientHTMLWrapperResultIterator This class (which implements the interface javautilIterator) provides asynchronous access to the results of the query made Results being accessed in an asynchronous manner means that the server will return results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network)
ITPilot 46 Developer Guide
ITPilot Development API 10
The method hasNext() allows to check if there are still elements to return Due to the asynchronous behavior of this case this method must be used before accessing each element to make sure that data elements are available The method next() of HTMLWrapperResultIterator obtains the next result In this case each result is an instance of the class comdenodovdbvdbinterfaceclientprinterstandardStandardRowVO The value associated with each field will be obtained by invoking the method comdenodovdbvdbinterfacecommonclientResultvosentencesValueVO getValue (String fieldname) where fieldname is the name of the desired field The method next() will throw an exception of type NoSuchElementException if there are no available data at that moment even if the wrapper still has results to return Thus the necessity of using the method hasNext() As mentioned in the preceding section the value of a field can be atomic or compound If it is atomic the instance of ValueVO belongs to the subclass SimpleVO SimpleVO is an abstract class which subclasses are related to the basic types available in ITPilot TextVO IntVO LongVO FloatVO DoubleVO DateVO BooleanVO BlobVO The subclasses IntVO LongVO FloatVO DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values For example IntVO provides the method javalangInteger getInt() In the case of BlobVO the following method is provided javalangByte[] getBytes() In the case of DateVO this is the method long getTime() In addition the SimpleVO superclass provides a representation of the value as a character string accessible through the getValue() method See Javadoc documentation for detail [JDOC] If the value is compound the instance of ValueVO represents an array of registers (subclass ArrayVO) Using its method getValues() a list of the registers it contains can be obtained (instances of the subclass RegisterVO) See the Javadoc documentation to see more detailed information on the methods and 
properties of the class ValueVO and its subclasses Another important aspect of processing queries is dealing with any errors that may arise (eg error connecting to the data source) There are two methods for this of the class HTMLWrapperResultIterator
bull Boolean checkErrors() Allows you to check if an error has occurred during query execution Returns lsquotruersquo if an error has occurred and lsquofalsersquo if not
bull String getErrorDescription() Where errors have occurred this allows you to obtain a textual description of it Otherwise it returns null The custom error messages specified by the wrapper creator for the lsquoraise error handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator cancels the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the same one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS: {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. A query is then issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
// DoubleVO was not imported in the original listing; it is assumed here to
// live in the same package as SimpleVO.
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results; hasNext() is checked before every next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only non-text field (double)
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server.");
        } finally {
        }
    }
}

Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file and added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be defined with naming conventions, without any dependency on Denodo libraries, but we recommend defining them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading the jar to the server or removing it.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, plus one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
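The mapping in the table can be expressed as a small lookup, sketched below. ItpTypeMapping and itpilotTypeOf are hypothetical names used only for illustration; they are not part of the Denodo libraries. The sketch also rejects primitive types, in line with the note above.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that encodes the Java-to-ITPilot equivalency table above.
class ItpTypeMapping {
    private static final Map<Class<?>, String> JAVA_TO_ITPILOT =
            new LinkedHashMap<Class<?>, String>();

    static {
        JAVA_TO_ITPILOT.put(Integer.class, "int");
        JAVA_TO_ITPILOT.put(Long.class, "long");
        JAVA_TO_ITPILOT.put(Float.class, "float");
        JAVA_TO_ITPILOT.put(Double.class, "double");
        JAVA_TO_ITPILOT.put(Boolean.class, "boolean");
        JAVA_TO_ITPILOT.put(String.class, "text");
        JAVA_TO_ITPILOT.put(Calendar.class, "date");
        JAVA_TO_ITPILOT.put(byte[].class, "binary");
    }

    // Returns the ITPilot type name for a declared Java parameter type.
    static String itpilotTypeOf(Class<?> javaType) {
        if (javaType.isPrimitive()) {
            // Primitive parameters (int, long, double...) are not allowed.
            throw new IllegalArgumentException(
                    "Primitive type not allowed: " + javaType.getName());
        }
        String itpType = JAVA_TO_ITPILOT.get(javaType);
        if (itpType == null) {
            throw new IllegalArgumentException("No simple mapping for " + javaType.getName());
        }
        return itpType;
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(String.class));
        System.out.println(itpilotTypeOf(byte[].class));
    }
}
```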
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g., the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional. If it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
o name: name of the custom function.
o type: In ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional parameter, syntax, that specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation that provides the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
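The naming conventions described at the start of this section can be sketched with a class that has no dependency on Denodo libraries. The class below follows the <FunctionName> + "ItpFunction" pattern and declares a variable-arity execute method. NamingConvention.functionNameOf is a hypothetical helper that only illustrates how the function name is derived from the class name; it is not how ITPilot itself is implemented. In a real jar the function class would normally be declared public in its own source file; it is package-private here so that the sketch fits in a single file.

```java
// Custom function defined purely by naming convention: class name ends in "ItpFunction".
class Concat_SampleItpFunction {
    // Arity-n signature: only the last parameter may repeat, so it is declared as an array.
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {       // null inputs are skipped in this sketch
                joined.append(input);
            }
        }
        return joined.toString();
    }
}

// Hypothetical helper: strips the "ItpFunction" suffix to obtain the function name.
class NamingConvention {
    static final String SUFFIX = "ItpFunction";

    static String functionNameOf(Class<?> functionClass) {
        String simpleName = functionClass.getSimpleName();
        if (!simpleName.endsWith(SUFFIX) || simpleName.length() == SUFFIX.length()) {
            throw new IllegalArgumentException(
                    simpleName + " does not match <FunctionName>" + SUFFIX);
        }
        return simpleName.substring(0, simpleName.length() - SUFFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(functionNameOf(Concat_SampleItpFunction.class));
        System.out.println(new Concat_SampleItpFunction().execute("IT", "Pilot"));
    }
}
```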
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class; i.e., if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to find out the type that these return values will have in ITPilot.
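The pairing between an execute method and its return-type method can be sketched with the naming convention from section 4.1, again without Denodo dependencies. Repeat_SampleItpFunction is a hypothetical function, and declaring executeReturnType with return type Class<?> is an assumption based on the statement that it must return java.lang.String.class. ReturnTypeRules.sameParameters is an illustrative reflection check for rules 2 and 3; it is stricter than the rules require, because it demands identical parameter types rather than equivalent ones.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical custom function: repeats a text value a given number of times.
class Repeat_SampleItpFunction {
    public String execute(String value, Integer times) {
        if (value == null || times == null) {
            return null;
        }
        StringBuilder repeated = new StringBuilder();
        for (int i = 0; i < times; i++) {
            repeated.append(value);
        }
        return repeated.toString();
    }

    // Computes the return type without executing the function.
    // execute returns a String, so this method returns String.class.
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }
}

// Illustrative check, not part of ITPilot: both methods declare identical parameter lists.
class ReturnTypeRules {
    static boolean sameParameters(Class<?> functionClass) {
        Method execute = null;
        Method returnType = null;
        for (Method method : functionClass.getDeclaredMethods()) {
            if (method.getName().equals("execute")) {
                execute = method;
            } else if (method.getName().equals("executeReturnType")) {
                returnType = method;
            }
        }
        return execute != null && returnType != null
                && Arrays.equals(execute.getParameterTypes(), returnType.getParameterTypes());
    }

    public static void main(String[] args) {
        Repeat_SampleItpFunction function = new Repeat_SampleItpFunction();
        System.out.println(function.execute("ab", 3));
        System.out.println(function.executeReturnType("ab", 3).getName());
        System.out.println(sameParameters(Repeat_SampleItpFunction.class));
    }
}
```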
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits a string around matches of a given regular expression and returns the array of the resulting substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    // Function signature: splits 'value' around matches of 'regexp'
    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name = "regexp") String regex,
            @CustomParam(name = "value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // One record with a single text field per substring
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (array of records with one text field)
    // without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that creates wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one. It contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist.¹

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).
5.2.2 Main Function
The main() function is where the wrapper business logic is developed. It creates object instances, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if an argument to its right has to be specified. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). However, rs.setText("FIELD", OBLIGATORY) is not valid; rs.setText("FIELD", "", OBLIGATORY) must be used instead.
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
o Constructor(name)
  • name: name of the structure.
o setText(field, regexp, type): creates a new character string field in the record.
  • field: name of the new field.
  • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setLink(field, type): creates a new Link-type field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setInt(field, type): creates a new Integer-type field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setBoolean(field, type): creates a new boolean-type field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setLong(field, type): creates a new Long-type field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setFloat(field, type): creates a new Float-type field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setDouble(field, type): creates a new Double-type field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
  • field: name of the new field.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setDate(field, regexp, format, type): creates a new Date-type field in the record.
  • field: name of the new field.
  • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
  • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setRegister(record, type): creates a new Record-type field in the record.
  • record: record name.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o setArray(name, structure, type): creates a new Array-type field in the record.
  • name: name of the array.
  • structure: data structure that represents the record structure contained in the array.
  • type (optional): defines whether the field is mandatory or not. By default the field is optional.
o toString(): transforms the record into a character string for display.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component (explained in section 5.3.22) must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
o setListName(listName): sets the name of the list.
  • listName: name of the list.
o add(obj): adds an element to the list.
  • obj: element to add.
o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components, and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.
o errorId: the type of error for which the behavior is being defined. The possible values are:
  • RUNTIME_ERROR: error while the component is being run.
  • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
  • HTTP_ERROR: error produced by an HTTP error.
  • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
  • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
  • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
  • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the result is evaluated as "false" if it is used in a condition expression.
  • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
  • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
o TRACE
o DEBUG
o INFO
o WARN
o ERROR
o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
o Constructor()
o exec(record, list): executes the function.
  • record: record to be added to the list.
  • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
o Constructor(expr)
  • expr: defines the condition expression, expressed as a character string. E.g., MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides the set of functions defined in Appendix A of [GENER].
o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
  • elements: the elements on which the condition is evaluated. This parameter must have the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]".
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
o Constructor(listname): creates an empty list.
  • listname: name of the list of records to be created.
o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
o Constructor(): creates a persistent browser and returns its handler.
o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
  • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
  • baseCode: character string with the source page content.
  • finalCode: character string or page object with the target page content.
o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
  • baseCode: character string with the source page content.
  • finalCode: character string or page object with the target page content.
o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
  • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
  • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
  • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
  • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
  • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.
  • simplifyTags: "true" means that the HTML tag attributes are ignored. With "false" they are not ignored.
o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
  • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
  • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions are applied to the HTML tokens of the source pages before comparing them.
  • replacement: Perl [PERL] regular expression.
o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions are applied to the HTML tokens of the page; tokens that match a regular expression are discarded before the comparison starts.
  • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
bull Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence:ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
bull Object: Expression
bull Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
bull Functions:
o Constructor(expression)
bull expression: object that defines the expression. It is expressed as a character string (e.g., MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
bull exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
bull Object: Extractor
bull Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).
bull Functions:
o Constructor(name, page, specification, structure)
bull name: name of the Extractor component instance.
bull page: page-type ITPilot structure from which the data is to be extracted.
bull specification: DEXTL data extraction specification (see [DEXTL]).
bull structure: name of the (previously created) record that will be used to return the data extracted by the specification.
o exec(): main method of the extractor; it runs the specification indicated in the constructor. This function returns a list of records of the type given in the structure parameter of the constructor.
o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
bull merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. The default value is "true".
o setI18n(i18n): updates the internationalization configuration of the process.
bull i18n: type of internationalization to use. ITPilot provides different internationalization configurations, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
bull Object: Fetch
bull Description: obtains the contents of the URL or page passed as the input argument and returns them in binary or text format.
bull Functions:
o Constructor(url, sequenceType, reusableConnection, binary, page)
bull url: URL where the resource to be downloaded can be found (OPTIONAL).
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
bull binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
bull page: optionally, the page from which the HTTP request is launched.
o exec(page): runs the component and returns the string- or binary-type value obtained.
bull page: optionally, the page from which the HTTP request is launched.
o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
bull encoding: MIME type of the information to send.
o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
bull flag: "true" means that this synchronization method is used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further extraction operations are going to be executed.
bull back: NSEQL program of the back sequence.
o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
bull reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST synchronization is not used.
o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
bull 0: default browser implementation.
bull 1: Internet Explorer browser implementation.
bull 2: Firefox browser implementation.
bull 3: Denodo HTTP browser implementation.
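The order in which syncWithPost, setBackSequence and setBackPages are taken into account can be summarized with a short sketch in plain JavaScript. It is illustrative only: chooseBackStrategy, the shape of its config argument and the default of one page are assumptions made for this example, not part of the ITPilot API.

```javascript
// Hypothetical helper (not ITPilot code): reports which back-navigation
// strategy applies, following the order described for syncWithPost,
// setBackSequence and setBackPages.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-send the POST request that originally produced the page.
        return { strategy: "POST", detail: config.lastPostParameters };
    }
    if (config.backSequence) {
        // An explicit NSEQL back sequence was supplied.
        return { strategy: "BACK_SEQUENCE", detail: config.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command, repeated backPages
    // times (assumption for this sketch: one page when nothing is configured).
    return { strategy: "BACK_COMMAND", detail: config.backPages || 1 };
}

var withPost = chooseBackStrategy({ syncWithPost: true, lastPostParameters: "q=books" });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```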
5.3.13 Filter
bull Object: Filter
bull Description: filters a list of records, returning those that meet a given condition.
bull Functions:
o Constructor(expr, auxiliaryRecords)
bull expr: regular expression of the filtering operation applied to the list of records described in the exec function.
bull auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.
bull inputRecords: list of input records.
bull auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
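The behaviour described in the note can be sketched in plain JavaScript. The code below is a stand-in, not the ITPilot implementation: filterRecords, the predicate function and the string values given to the two constants are assumptions made for the example, and a JavaScript function takes the place of the filter expression.

```javascript
// Hypothetical stand-ins for the error-handling constants.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

// Keeps the records for which predicate(record, auxiliaryRecords) is true.
function filterRecords(inputRecords, auxiliaryRecords, predicate, errorAction) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i], auxiliaryRecords)) {
                kept.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the failure
            }
            // ON_ERROR_IGNORE: skip only the record that caused the error
        }
    }
    return kept;
}

// The auxiliary record supplies the limit; it is not itself filtered.
function cheaperThanLimit(record, aux) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing");
    }
    return record.PRICE <= aux[0].MAX;
}

var records = [{ PRICE: 5 }, { PRICE: null }, { PRICE: 20 }];
var limits = [{ MAX: 10 }];
var filtered = filterRecords(records, limits, cheaperThanLimit, ON_ERROR_IGNORE);
```

Here the second record raises an error and is skipped, the third fails the condition, and only the first record is returned.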
5.3.14 Form Iterator
bull Object: Form_Iterator
bull Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.
bull Functions:
o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
bull findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
bull submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
bull baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.
bull inputPage: input page from which the selected form is invoked iteratively.
bull parallelIterator: if "true", the component executes its iterations in parallel.
o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
bull field: name of the multiple selection field.
bull position: position of the field among those with the same name, starting at position 0.
bull positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
bull clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
bull field: name of the multiple selection field.
bull position: position of the field among those with the same name, starting at position 0.
bull valuesArray: list of values that must be selected in the field.
bull positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
bull equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or merely contained in it (equals = false).
bull clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
bull field: name of the HTML selection field.
bull position: position of the field if there is more than one field with the same name.
bull positions: values of the elements over which the component must iterate.
o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations over a text field.
bull field: name of the HTML text field.
bull position: position of the field if several fields on the form have the same name.
bull values: list of values that must be selected in the field.
bull positions: list that indicates the position held by each element of values in the event of replicated values.
bull equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
o click(field, value, state): selects an element and runs a "click" event on it.
bull field: name of the HTML field on which the click is to be made.
bull value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g., [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
bull state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o input(field position values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o textarea(field position values) this indicates the values added to a text area
bull field: name of the HTML text area field.
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag): establishes whether the component launches its iterations in parallel.
bull flag: if "true", the iterations are executed in parallel.
o next(inputPage): returns the page resulting from running one iteration of the component.
bull inputPage: optional parameter that indicates a new starting page on which the next iteration is run.
o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
o close(): closes the iterator.
o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL with the POST parameters that were used to reach it. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method is used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL back program.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): determines the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been defined explicitly and POST navigation has not been configured as the back method.
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
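The way the per-field value lists turn into iterations can be illustrated with a plain JavaScript stand-in. This sketch assumes that the component runs one iteration per combination of field values (a Cartesian product), which the text above does not state explicitly; FormCombinations and its field descriptors are hypothetical, and the iteration limit merely mimics the role of setMaxIterations.

```javascript
// Hypothetical stand-in (not ITPilot code): yields one combination of field
// values per iteration. Every field is assumed to have at least one value.
function FormCombinations(fields, maxIterations) {
    this.fields = fields;               // [{ name: "...", values: [...] }, ...]
    this.indexes = [];
    for (var i = 0; i < fields.length; i++) {
        this.indexes.push(0);
    }
    this.done = fields.length === 0;
    this.count = 0;
    this.maxIterations = maxIterations; // plays the role of setMaxIterations
}

FormCombinations.prototype.hasNext = function () {
    return !this.done && this.count < this.maxIterations;
};

FormCombinations.prototype.next = function () {
    var combination = {};
    for (var i = 0; i < this.fields.length; i++) {
        combination[this.fields[i].name] = this.fields[i].values[this.indexes[i]];
    }
    // Advance like an odometer: the last field changes fastest.
    var pos = this.fields.length - 1;
    while (pos >= 0) {
        this.indexes[pos]++;
        if (this.indexes[pos] < this.fields[pos].values.length) {
            break;
        }
        this.indexes[pos] = 0;
        pos--;
    }
    if (pos < 0) {
        this.done = true;
    }
    this.count++;
    return combination;
};

var combos = new FormCombinations([
    { name: "city", values: ["Madrid", "Paris"] },
    // A checkbox set to CLICKED_AND_NON_CLICKED_ELEMENT contributes both states.
    { name: "onlyOffers", values: [true, false] }
], 3);
var seen = [];
while (combos.hasNext()) {
    seen.push(combos.next());
}
```

Two fields with two values each would give four combinations; the limit of three stops the loop after (Madrid, true), (Madrid, false) and (Paris, true).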
5.3.15 Get Page
bull Object: Get_Page
bull Description: obtains an active browser from the browser pool using a previously retrieved identification code.
bull Functions:
o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
bull browserUuid: browser identifier.
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie: stored information ("cookies").
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object: Init
bull Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
bull Functions:
o Constructor(input, output)
bull input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
bull output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at run time (with the exec() function).
o get(name) this returns the value of a record field created as a group of initialization parameters
bull name name of the record field
o setText(field obl fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field obl fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field obl fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field obl fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field obl fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field format obl fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may not run.
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name): updates the component name.
bull name: new component name.
o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
bull i18n: type of internationalization to use. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
o exec(): main function that runs the component; it returns a record representing the wrapper initialization parameters.
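The three obligation values shared by the set functions above can be illustrated with a stand-in in plain JavaScript. The sketch rests on assumptions that the text does not confirm: buildInitRecord and the field descriptors are hypothetical, an optional parameter without a value is represented as null, and a FIXED field always keeps its fixedValue even when the query supplies another value.

```javascript
// Hypothetical stand-ins for the obligation constants.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Builds the initialization record from field definitions and query values.
function buildInitRecord(fieldDefs, queryValues) {
    var record = {};
    for (var i = 0; i < fieldDefs.length; i++) {
        var def = fieldDefs[i];
        if (def.obl === FIXED) {
            // Constant value; whatever the query supplies is not used.
            record[def.field] = def.fixedValue;
        } else if (Object.prototype.hasOwnProperty.call(queryValues, def.field)) {
            record[def.field] = queryValues[def.field];
        } else if (def.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + def.field);
        } else {
            // OPTIONAL and not supplied (assumption: stored as null).
            record[def.field] = null;
        }
    }
    return record;
}

var fieldDefs = [
    { field: "KEYWORD", obl: OBLIGATORY },
    { field: "MAX_PRICE", obl: OPTIONAL },
    { field: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
var initRecord = buildInitRecord(fieldDefs, { KEYWORD: "laptop", COUNTRY: "FR" });
```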
5.3.17 Iterator
bull Object: Iterator
bull Description: component that iterates over a list of records, one by one.
bull Functions:
o Constructor(list)
bull list: list of records over which to iterate.
o hasNext(): determines whether there are more results over which to iterate. Returns "true" if there is at least one more result.
o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
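For comparison, the sequential hasNext()/next() pattern can be sketched in plain JavaScript as follows. ListIterator is a hypothetical stand-in that only mimics the two functions described above; it is not the ITPilot Iterator object and does not involve the Thread object.

```javascript
// Hypothetical stand-in (not ITPilot code) for an iterator over a record list.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    // Returns the current record and advances to the following one.
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "first" }, { TITLE: "second" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // records arrive one by one, in list order
}
```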
5.3.18 JDBCExtractor
bull Object: JDBCExtractor
bull Description: sends a query to any source available through JDBC and returns a record list with the results obtained.
bull Functions:
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid: unique identifier of the component.
bull uri: connection URL of the database.
bull driver: driver class used to connect to the data source.
bull userName: user name.
bull password: user password.
bull structure: structure of the component's output record list. It is defined as a record of values.
bull baseRecords: record list to be used.
bull maxPoolSize: maximum number of connections that the pool can manage at the same time.
bull initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
bull checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
bull query: SQL query that returns the results required by the component.
o exec(query, baseRecords): executes the JDBCExtractor component.
bull query: SQL query that returns the results required by the component.
bull baseRecords: record list to be used.
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
bull maxPoolSize: maximum number of connections that the pool can manage at the same time.
bull initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
bull pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5.3.19 Loop
bull Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop output condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
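The same while pattern can be tried outside ITPilot with a stand-in condition object. In the sketch below, PageLimitCondition is hypothetical and only mimics the exec(inputs) call of a Condition object; the real component evaluates an ITPilot expression instead of a hard-coded comparison.

```javascript
// Hypothetical stand-in for a Condition object: exec(inputs) returns a Boolean.
function PageLimitCondition(maxPages) {
    this.maxPages = maxPages;
}
PageLimitCondition.prototype.exec = function (inputs) {
    // True while the first input is below the limit.
    return inputs[0] < this.maxPages;
};

var loop = new PageLimitCondition(3);
var pageNumber = 0;
var visited = [];
while (loop.exec([pageNumber])) {
    visited.push("page " + pageNumber); // the loop operations go here
    pageNumber++;
}
```

The condition is re-evaluated before every pass, so the body runs for page numbers 0, 1 and 2 and the loop ends as soon as the condition stops being met.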
5.3.20 Next Interval Iterator
bull Object: Next_Interval_Iterator
bull Description: iterates over different inter-related pages by means of one or several browsing sequences.
bull Functions:
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
bull iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, keeping the session's information.
bull inputPage: page from which the next browsing sequence is run.
o next(inputRecords, inputPage): returns the next iteration element.
bull inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.
bull inputPage: page from which the next pages are accessed.
o close(): closes the iterator.
o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
bull count: number of retries.
o setRetryDelay(count): sets the interval between two retries.
bull count: interval in milliseconds.
o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL with the POST parameters that were used to reach it. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method is used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL back program.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): determines the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been defined explicitly and POST navigation has not been configured as the back method.
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
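The rule given for the sequences parameter (a single sequence is reused in every iteration, several sequences are used one per iteration) can be written down as a small helper in plain JavaScript. The helper is hypothetical, the sequence strings are placeholders rather than real NSEQL programs, and returning null once the list is exhausted is an assumption made for this sketch.

```javascript
// Hypothetical helper (not ITPilot code): picks the browsing sequence for a
// given 0-based iteration, following the rule described for "sequences".
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 1) {
        return sequences[0];          // a single sequence is reused every time
    }
    if (iteration < sequences.length) {
        return sequences[iteration];  // otherwise one sequence per iteration
    }
    return null;                      // assumption: nothing left to run
}

var single = ["<NSEQL sequence to the next page>"];
var several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];
var plan = [];
for (var i = 0; i < 3; i++) {
    plan.push([sequenceForIteration(single, i), sequenceForIteration(several, i)]);
}
```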
5.3.21 Output
bull Object: Output
bull Description: places a record in the wrapper output.
bull Functions:
o Constructor(structure)
bull structure: indicates the component input record to be used as the wrapper result.
o add(record): adds the given record to the wrapper result.
bull record: record to use.
5.3.22 Record Constructor
bull Object: Record_Constructor
bull Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
bull Functions:
o Constructor(recordsObj, name)
bull recordsObj: list of input elements. Each element of the list can be a record or a list of records.
bull name: name of the output record of the Record Constructor component.
o add(fieldName, expression, errorAction): adds a new field to the record under construction.
bull fieldName: name of the field.
bull expression: expression that defines the field; e.g., "$0.PARAM1" indicates that the field will contain the PARAM1 field of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
bull errorAction: action to take if the expression cannot be evaluated correctly. The possible values are:
bull ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
bull ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although it may not be a good option when the previous iterator runs in parallel with this component.
    • inputPage: optional. Allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string or binary value that holds the contents to be stored. A page value is also supported as input, in which case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each record obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will run in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing a VQL statement of the following form:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
3.2 OBTAINING WRAPPERS
As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data about the execution server and for accessing the wrappers it contains:
• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are returned.
• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as the parameter.
• Collection getDatabaseNames(): can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.
• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as the parameter.
• void loadWrapper(String vql): takes as its input argument the VQL that defines a collection of wrappers, which are loaded into the execution server.
• String getVQL(): returns the VQL description of all wrappers in the ITPilot execution server.
3.3 USING WRAPPERS
Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

    HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". For the float, double and date data types, make sure that the values follow the internationalization configuration specified in the wrapper Init component or, for date data types, the date pattern if one was set. For the query to execute correctly, a value must be specified for every mandatory attribute. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, the schema of a wrapper can be obtained using the method:

    HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:
• its type (method java.lang.Class getType());
• whether the value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()) and, in that case, whether it is mandatory (method boolean isMandatoryStatus());
• if they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting via the API whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
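The rule stated earlier in this section, that every value passed to query(Map params) must be a character string formatted according to the wrapper's internationalization settings, can be sketched with standard JDK classes only. The parameter names PRICE and RELEASE_DATE, the locales and the date patterns below are assumptions chosen for illustration; in a real client they must match the input parameters, the Init component configuration and the date pattern of the wrapper being queried.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class QueryParamsSketch {

    // Builds a map suitable for query(Map params): every value is a String,
    // even when the wrapper expects a float or a date.
    static Map<String, String> buildParams(double price, Date releaseDate,
                                           Locale locale, String datePattern) {
        NumberFormat numberFormat = NumberFormat.getNumberInstance(locale);
        numberFormat.setGroupingUsed(false); // no thousands separators
        SimpleDateFormat dateFormat = new SimpleDateFormat(datePattern, locale);

        Map<String, String> params = new HashMap<String, String>();
        // PRICE and RELEASE_DATE are hypothetical parameter names.
        params.put("PRICE", numberFormat.format(price));
        params.put("RELEASE_DATE", dateFormat.format(releaseDate));
        return params;
    }

    public static void main(String[] args) {
        // Calendar months are zero-based: 7 is August.
        Date date = new GregorianCalendar(2011, 7, 16).getTime();
        System.out.println(buildParams(3.25, date, Locale.US, "yyyy-MM-dd"));
        System.out.println(buildParams(3.25, date, Locale.GERMANY, "dd.MM.yyyy"));
    }
}
```

With Locale.US the float is rendered as "3.25", while with Locale.GERMANY it becomes "3,25", which is why the locale used by the client has to agree with the one configured in the wrapper.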
3.4 PROCESSING QUERY RESULTS
The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).
The method hasNext() checks whether there are still elements to return. Because access is asynchronous, this method must be called before accessing each element, to make sure that a data element is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return. Hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): when an error has occurred, returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise' error handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

    void cancel()

3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:

    TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // DoubleVO is used below; it is assumed to live alongside SimpleVO.
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ": ");

                    // Get and print atomic fields: TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);
                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();
                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);
                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }

Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file and added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions (including custom ITPilot functions), is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by following naming conventions, but we recommend developing them with Java annotations. The annotations, compound types and values required to create custom functions are located in:

    $DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together by uploading the jar to, or removing it from, the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a single Java class that contains all of its implementation. The name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method that implements the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).
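As a quick illustration of the equivalency table and of the note about basic types, the mapping can be written down as a plain lookup in Java. This is only a sketch: the class ItpTypeMappingSketch and the method itpilotTypeOf are hypothetical names invented here and are not part of the Denodo libraries.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

class ItpTypeMappingSketch {

    // Java type -> ITPilot type name, copied from the equivalency table.
    static final Map<Class<?>, String> JAVA_TO_ITPILOT =
            new LinkedHashMap<Class<?>, String>();

    static {
        JAVA_TO_ITPILOT.put(Integer.class, "int");
        JAVA_TO_ITPILOT.put(Long.class, "long");
        JAVA_TO_ITPILOT.put(Float.class, "float");
        JAVA_TO_ITPILOT.put(Double.class, "double");
        JAVA_TO_ITPILOT.put(Boolean.class, "boolean");
        JAVA_TO_ITPILOT.put(String.class, "text");
        JAVA_TO_ITPILOT.put(Calendar.class, "date");
        JAVA_TO_ITPILOT.put(byte[].class, "binary");
    }

    // Returns the ITPilot type name for a declared parameter type, or null
    // when the Java type is not in the table (for example the basic type int).
    static String itpilotTypeOf(Class<?> javaType) {
        return JAVA_TO_ITPILOT.get(javaType);
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(Integer.class)); // prints int
        System.out.println(itpilotTypeOf(int.class));     // prints null
    }
}
```

The lookup makes the note concrete: java.lang.Integer has an ITPilot equivalent, whereas the basic type int does not appear in the table and so cannot be used as a parameter type.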
41 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow the definition of some custom functions without the need of Java annotations even if it is recommended to use them All the names used in the naming conventions are case sensitive To make a Java class to recognizable as a custom function without Java annotations its name must match the following pattern
bull ltFunctionNamegt + ldquoItpFunctionrdquo This way a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample All Java methods implementing the function signatures must have the name execute The signature associated with each method will be extracted from the Java method parameters For example a class named Concat_SampleItpFunction with a method execute(valueAString valueBString)String will generate the function signature CONCAT_SAMPLE(arg1text arg2text) To define a parameter with arity n in a custom function the last parameter has to be an array Eg the class Concat_SampleItpFunction with a method declared as public String execute(String hellip inputs) Custom functions which return type depends on the type of their input parameters or return an array or register can define an additional method with equivalent signature to the one of execute This additional method must be named executeReturnType The definition of this method is optional If it is not present the execute method will be called and the return type will be obtained from the results of the execution The advantage of defining the method executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function thus by providing this method the performance is improved Naming conventions only cover a subset of all the possible custom functions In order to prevent the limitations using naming conventions it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOMElibcontribdenodo-customjar These annotations are
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
  • name: name of the custom function.
  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
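The class-name rule for annotation-free custom functions can be sketched as follows. This is only an illustration of the pattern <FunctionName> + “ItpFunction”: the helper functionNameFromClass is hypothetical and not part of ITPilot, and it assumes that the part before the suffix must not be empty.

```javascript
// Hypothetical helper (not an ITPilot API): applies the naming rule
// <FunctionName> + "ItpFunction" to a Java class name.
var SUFFIX = "ItpFunction";

function functionNameFromClass(className) {
  var cut = className.length - SUFFIX.length;
  // The comparison is case sensitive, as the naming conventions require.
  // Assumption: a class named exactly "ItpFunction" does not qualify.
  if (cut <= 0 || className.substring(cut) !== SUFFIX) {
    return null;
  }
  return className.substring(0, cut);
}

var recognized = functionNameFromClass("Concat_SampleItpFunction");
var wrongCase = functionNameFromClass("Concat_Sampleitpfunction");
```

Here recognized holds the function name Concat_Sample, while wrongCase is null because the suffix does not match exactly.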
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is generally optional, but it provides better performance when executing the function is slower or more memory-intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
If the execute method returns a basic Java type, the additional method has to return the same basic Java class; i.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 to find out the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    // Function signature: returns one record per substring.
    @CustomExecutor()
    public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (an array of records with one text field)
    // without executing the function.
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own complete wrappers in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the third edition of the ECMA-262 standard [ECMA262]. The following sections assume some basic prior knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist.¹
These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.
¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable JavaScript object with a series of functions, which are described and explained below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another input argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):
rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
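The positional rule for optional parameters can be illustrated with a small stand-in. The function describeTextField and the string values given to OPTIONAL and OBLIGATORY are hypothetical; the sketch only mimics the argument order of setText(field, regexp, type) and the defaults stated in the catalog, and is not the ITPilot implementation.

```javascript
// Stand-in constants and function, for illustration only.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Mimics setText(field, regexp, type): arguments are positional, so `type`
// can only be supplied when `regexp` is supplied too.
function describeTextField(field, regexp, type) {
  return {
    field: field,
    regexp: regexp === undefined ? "" : regexp,   // default: no constraint
    type: type === undefined ? OPTIONAL : type    // default: optional field
  };
}

var shortForm = describeTextField("FIELD");
var longForm = describeTextField("FIELD", "", OPTIONAL);
var mandatory = describeTextField("FIELD", "", OBLIGATORY);

// Skipping regexp moves OBLIGATORY into the regexp slot, which is why the
// two-argument form cannot be used to declare a mandatory field.
var shifted = describeTextField("FIELD", OBLIGATORY);
```

In this sketch shortForm and longForm describe the same field, whereas shifted ends up with the text “OBLIGATORY” as its regular expression and remains optional.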
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is “”.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is “”.
    • format (optional): date format following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
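The three List functions can be exercised as in the following sketch. The List defined here is a hypothetical stand-in written only so that the snippet runs outside ITPilot; inside a wrapper the List object is provided by ITPilot, and its internals (including whether toArray() returns a copy) are not described in this guide.

```javascript
// Hypothetical stand-in for the ITPilot List object, for illustration only.
function List() {
  this.name = null;
  this.items = [];
}
List.prototype.setListName = function (listName) { this.name = listName; };
List.prototype.add = function (obj) { this.items.push(obj); };
// This stand-in returns a copy of its internal array.
List.prototype.toArray = function () { return this.items.slice(0); };

var results = new List();
results.setListName("RESULTS");
results.add({ TITLE: "first" });
results.add({ TITLE: "second" });

// toArray() makes the elements available to ordinary JavaScript loops.
var titles = [];
var records = results.toArray();
for (var i = 0; i < records.length; i++) {
  titles.push(records[i].TITLE);
}
```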
5.3.3 Common functions
Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components lack some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when a given type of error occurs. The onError function can be invoked several times with different errorId values.
  o errorId: the type of error whose behavior is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, the result is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured for each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
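The four error actions can be compared with a simplified model. Everything in this sketch is hypothetical: the constant values, the helper runWithPolicy and the idea of retrying a single step rather than rerunning the wrapper are simplifications, and it assumes that ON_ERROR_RETRY raises the error once the retries are exhausted (the behavior that ON_ERROR_RETRY_IGNORE is described as avoiding). Delays between retries are omitted.

```javascript
// Stand-in constants; in ITPilot these are predefined.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Runs `step` under the given action. `retries` is the number of extra
// attempts made by the two retry actions.
function runWithPolicy(step, action, retries) {
  var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
  var attempts = retrying ? retries + 1 : 1;
  var lastError = null;
  for (var i = 0; i < attempts; i++) {
    try {
      return step();
    } catch (e) {
      lastError = e;
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null; // execution continues and the step yields null
  }
  throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with no attempts left
}

// A step that fails twice before succeeding.
var calls = 0;
function flaky() {
  calls++;
  if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
  return "page";
}

var recovered = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("HTTP_ERROR"); }, ON_ERROR_IGNORE, 0);
```

In this model recovered receives the value returned by the third call to flaky, and ignored is null because the error was swallowed.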
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. E.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: the elements on which the condition is evaluated. This parameter must be in the format “[ELEMENT1, ELEMENT2, ..., ELEMENTN]”.
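How the placeholders $0, $1, ... bind to the elements passed to exec can be sketched with a stand-in. The Condition below is hypothetical and far more limited than the ITPilot component: it understands only expressions of the form ($i <op> $j) with the four ordering operators, and none of the functions of Appendix A of [GENER].

```javascript
// Hypothetical stand-in for Condition; it only parses "($i <op> $j)".
function Condition(expr) {
  var m = /^\(\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)$/.exec(expr);
  if (!m) { throw new Error("unsupported expression: " + expr); }
  this.left = parseInt(m[1], 10);
  this.op = m[2];
  this.right = parseInt(m[3], 10);
}

// $n refers to the element at position n of the array passed to exec().
Condition.prototype.exec = function (elements) {
  var a = elements[this.left];
  var b = elements[this.right];
  switch (this.op) {
    case "<=": return a <= b;
    case ">=": return a >= b;
    case "<": return a < b;
    default: return a > b;
  }
};

var MyCondition = new Condition("($0 <= $1)");
var withinLimit = MyCondition.exec([3, 10]);
var overLimit = MyCondition.exec([12, 10]);
```

With these inputs withinLimit is true (3 is not greater than 10) and overLimit is false.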
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red-background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard the HTML tag attributes when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
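The role of the prefix, suffix and separator labels can be illustrated with a simplified token comparison. This sketch is not the algorithm used by the Diff component, which is not described in this guide: markDifferences is a hypothetical function that applies a standard longest-common-subsequence comparison to two token arrays and wraps added and deleted tokens in the given labels. The bracket labels used in the example are placeholders for the HTML tags mentioned above.

```javascript
// Simplified stand-in: marks additions and deletions between two token
// arrays. Not the ITPilot implementation.
function markDifferences(baseTokens, finalTokens, labels) {
  var n = baseTokens.length, m = finalTokens.length, i, j;
  // lcs[i][j] = length of the longest common subsequence of the suffixes
  // baseTokens[i..] and finalTokens[j..].
  var lcs = [];
  for (i = 0; i <= n; i++) {
    lcs[i] = [];
    for (j = 0; j <= m; j++) { lcs[i][j] = 0; }
  }
  for (i = n - 1; i >= 0; i--) {
    for (j = m - 1; j >= 0; j--) {
      lcs[i][j] = baseTokens[i] === finalTokens[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  var out = [];
  i = 0;
  j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && baseTokens[i] === finalTokens[j]) {
      out.push(baseTokens[i]); // unchanged token
      i++;
      j++;
    } else if (j < m && (i === n || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(labels.addPrefix + finalTokens[j] + labels.addSuffix);
      j++;
    } else {
      out.push(labels.delPrefix + baseTokens[i] + labels.delSuffix);
      i++;
    }
  }
  return out.join(labels.separator);
}

var marked = markDifferences(
  ["<p>", "old", "price", "</p>"],
  ["<p>", "new", "price", "</p>"],
  { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]", separator: " " }
);
// marked is "<p> [+new+] [-old-] price </p>"
```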
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence:ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a character string. E.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page by running a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): sets the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the originating page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, which is the case when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
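The order in which the page recovery options of syncWithPost and setBackSequence are considered can be summarized in a few lines. The function recoveryStrategy and the labels it returns are hypothetical; the sketch only names the branch that, according to the descriptions above, would apply for a given configuration.

```javascript
// Hypothetical helper: names the page recovery strategy that applies.
function recoveryStrategy(syncWithPost, backSequence) {
  if (syncWithPost) {
    return "POST";          // re-send the POST parameters originally used
  }
  if (backSequence) {
    return "BACK_SEQUENCE"; // run the explicit NSEQL back sequence
  }
  return "BACK_COMMAND";    // fall back to the Back() NSEQL command
}

var byDefault = recoveryStrategy(true, null);
var withSequence = recoveryStrategy(false, "<NSEQL back sequence>");
var plainBack = recoveryStrategy(false, null);
```

In the last case, the number of pages to go back is the one set with setBackPages.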
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
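The behavior described in the NOTE can be pictured with a plain JavaScript loop. The function filterIgnoringErrors is a hypothetical stand-in, not the Filter component: it takes a JavaScript predicate instead of an ITPilot expression, and it leaves out any record whose evaluation raises an error while still processing the remaining ones.

```javascript
// Hypothetical stand-in: keeps the records for which `test` is true and
// drops any record whose evaluation raises an error.
function filterIgnoringErrors(inputRecords, test) {
  var kept = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (test(inputRecords[i])) {
        kept.push(inputRecords[i]);
      }
    } catch (e) {
      // the failing record is left out; the loop goes on with the rest
    }
  }
  return kept;
}

var records = [{ PRICE: "10" }, { PRICE: null }, { PRICE: "250" }];
var shortPrices = filterIgnoringErrors(records, function (r) {
  return r.PRICE.length <= 2; // raises an error for the null PRICE
});
```

Here shortPrices contains only the first record: the second one raises an error and is skipped, and the third does not meet the condition.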
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop for a specific form, where predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray) indicates the values selected in a multiple selection field of the chosen form
bull field name of the multiple selection field
bull position position of the field among those with the same name, starting at position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or merely contained in it (equals = false)
bull clickedArray list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field, position, positions) indicates the values selected in a selection field of the chosen form
bull field name of the HTML selection field
bull position position of the field when more than one field element has the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field, position, values, positions, equal) indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field in the event of several on the form with the same value
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equals boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them
o click(field, value, state) selects an element and runs a “click” event on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element
bull state when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field, position, values) indicates the values to be entered in an input field
bull field name of the HTML input field
bull position position of the field when several fields on the form have the same name
bull values list of values to be entered in the field
o textarea(field, position, values) indicates the values to be entered in a text area
bull field name of the HTML text area field
bull position position of the field when several fields on the form have the same name
bull values list of values to be entered in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) sets the number of retries in the event of failure
bull count number of retries
o setRetryDelay(mseconds) sets the waiting time between retries
bull mseconds waiting time between retries, in milliseconds
o setParallelIterator(flag) sets whether the component launches its iterations in parallel
bull flag if “true”, the iterations are executed in parallel
o next(inputPage) returns the page resulting from running one iteration of the component
bull inputPage optional parameter that indicates a new starting page on which the next iteration is run
o hasNext() determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not
o close() closes the iterator
o syncWithPost(flag) indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method
bull flag “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run
o setBackSequence(back) optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) indicates whether the connection will be reused or not
bull reusingConnection if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a POST navigation has been configured as the back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
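The following is a minimal sketch of how the Form Iterator functions are typically combined: values are registered for a field, the number of iterations is capped, and the result pages are consumed with hasNext() and next(). Because Form_Iterator only exists inside the ITPilot runtime, the snippet declares a small stand-in with the same function names so that it can be run in isolation. The behaviour of the stand-in (one iteration per value, pages represented as strings), the field name "q" and the NSEQL placeholders are assumptions for illustration, not product behaviour.

```javascript
// Hypothetical stand-in for the ITPilot Form_Iterator object, so that the
// sketch runs outside ITPilot. Only the function names and argument order
// come from the guide; the behaviour below is an assumption.
function Form_Iterator(findForm, submitForm, sequenceType, reusableConnection,
                       baseElements, inputPage, parallelIterator) {
    this.findForm = findForm;
    this.submitForm = submitForm;
    this.fields = [];          // values registered through input()
    this.maxIterations = null; // no limit until setMaxIterations() is called
    this.cursor = 0;
    this.closed = false;
}
Form_Iterator.prototype.input = function (field, position, values) {
    this.fields.push({ field: field, position: position, values: values });
};
Form_Iterator.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};
Form_Iterator.prototype.hasNext = function () {
    if (this.closed || this.fields.length === 0) { return false; }
    var total = this.fields[0].values.length;
    if (this.maxIterations !== null) { total = Math.min(total, this.maxIterations); }
    return this.cursor < total;
};
Form_Iterator.prototype.next = function (inputPage) {
    var f = this.fields[0];
    var value = f.values[this.cursor++];
    // The real component returns a Page object; a string is used here.
    return "page for " + f.field + "=" + value;
};
Form_Iterator.prototype.close = function () { this.closed = true; };

// Value 1 is the one listed for SEQUENCE_IEBROWSER in the Get Page component.
var SEQUENCE_IEBROWSER = 1;

// Usage: iterate a search form once per keyword, with at most two iterations.
var formIterator = new Form_Iterator(
    "<NSEQL program that finds the form>",
    "<NSEQL program that submits the form>",
    SEQUENCE_IEBROWSER, true, [], null, false);
formIterator.input("q", 0, ["denodo", "itpilot", "nseql"]);
formIterator.setMaxIterations(2);

var pages = [];
while (formIterator.hasNext()) {
    pages.push(formIterator.next());
}
formIterator.close();
```

In a real wrapper, each page returned by next() would normally be passed on to an extraction component before the next iteration is requested.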
5.3.15 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool, using a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identifier
bull browserUuid browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain) executes the component and returns a Page object with information about the browser’s current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL from which the page was reached
bull lastURLMethod access method (GET, POST) of the URL from which the page was reached
bull lastURLPostParameters POST-method parameters of the URL from which the page was reached
bull cookie information stored in “cookies”
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object Init
bull Description stores the structure of the input data, that is, the data that the wrapper receives from the calling application
bull Functions
o Constructor(input, output)
bull input input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process: if not specified, the record is generated at runtime (with the exec() function)
o get(name) returns the value of a field of the record created as a group of initialization parameters
bull name name of the record field
o setText(field, obl, fixedValue) creates a text-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setInt(field, obl, fixedValue) creates an integer-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setLong(field, obl, fixedValue) creates a long-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setFloat(field, obl, fixedValue) creates a float-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setDouble(field, obl, fixedValue) creates a double-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setBlob(field, obl, fixedValue) creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setBoolean(field, obl, fixedValue) creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setLink(field, obl, fixedValue) creates a URL-type field in the initialization record
bull field name of the field to create
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setDate(field, format, obl, fixedValue) creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field. This format is optional but becomes compulsory if completed; otherwise the wrapper may not be run. This representation format is defined in [DATEFORMAT]
bull obl indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value) the parameter is optional; it does not need a value in each wrapper query
bull OBLIGATORY the parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value, assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates the constant value assigned to the field
o setName(name) updates the component name
bull name new component name
o setI18n(i18n) updates the i18n configuration of the process
bull i18n type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component; returns a record representing the wrapper initialization parameters
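As an illustration of the Init functions, the sketch below declares one obligatory parameter, one optional parameter and one fixed parameter, and then calls exec(). It is not ITPilot code: Init is replaced by a small stand-in so that the snippet runs on its own. The constant values, the addField helper, the field names and the representation of the record as a plain JavaScript object are all assumptions. In a real wrapper the non-fixed values would come from the calling application; the stand-in simply leaves them as null.

```javascript
// Placeholder values: the real constants are provided by the ITPilot runtime.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in for the ITPilot Init object (assumption: records are
// modelled as plain JavaScript objects).
function Init(input, output) {
    this.fields = {};
}
// Internal helper of the stand-in; it is not part of the ITPilot API.
Init.prototype.addField = function (type, field, obl, fixedValue) {
    var obligation = (obl === undefined) ? OPTIONAL : obl; // OPTIONAL is the documented default
    this.fields[field] = {
        type: type,
        obl: obligation,
        value: (obligation === FIXED) ? fixedValue : null
    };
};
Init.prototype.setText = function (field, obl, fixedValue) {
    this.addField("text", field, obl, fixedValue);
};
Init.prototype.setInt = function (field, obl, fixedValue) {
    this.addField("int", field, obl, fixedValue);
};
Init.prototype.exec = function () {
    // Build the initialization record: one entry per declared field.
    var record = {};
    for (var name in this.fields) {
        if (Object.prototype.hasOwnProperty.call(this.fields, name)) {
            record[name] = this.fields[name].value;
        }
    }
    return record;
};

// Usage: declare the wrapper's query parameters and build the input record.
var init = new Init();
init.setText("KEYWORD", OBLIGATORY);   // must be supplied in every query
init.setInt("MAX_RESULTS");            // optional by default
init.setText("LANGUAGE", FIXED, "en"); // constant value
var inputRecord = init.exec();
```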
5.3.17 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() determines whether there are more results on which to iterate. “true” is returned if there is at least one more result
o next() returns the next iteration element. The list is a sorted sequence of records
The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
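For comparison, a sequential (non-parallel) iteration over a record list can be sketched as follows. The Iterator object is provided by ITPilot; the stand-in below only mimics the three documented functions so that the snippet can be run in isolation. Modelling records as plain JavaScript objects and the field name TITLE are assumptions.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object (assumption: the
// record list is modelled as a JavaScript array of plain objects).
function Iterator(list) {
    this.list = list;
    this.index = 0;
}
Iterator.prototype.hasNext = function () {
    return this.index < this.list.length;
};
Iterator.prototype.next = function () {
    return this.list[this.index++];
};

// Usage: visit the records one by one, in list order.
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
var iterator = new Iterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
```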
5.3.18 JDBCExtractor
bull Object JDBCExtractor
bull Description sends a query to any source available via JDBC and returns a record list with the results obtained
bull Functions
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the component’s output record list. It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established, ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist
bull query SQL query that returns the results required by the component
o exec(query, baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established, ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist
o disablePool() disables the connection pool
o addDriverProperty(propname, propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5.3.19 Loop
bull Description creates loops in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6 Using the Loop function
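To make the control flow of Figure 6 concrete, the sketch below runs the same while structure against a stand-in Condition object. The real Condition component evaluates an ITPilot expression; here a plain JavaScript function plays that role, which is an assumption made only so that the snippet is runnable. The constant values and the pageNumber counter are likewise placeholders.

```javascript
// Placeholder values: the real constants are provided by the ITPilot runtime.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Hypothetical stand-in for the Condition object: it wraps a JavaScript
// predicate instead of an ITPilot condition expression.
function Condition(predicate) {
    this.predicate = predicate;
    this.errorAction = null;
}
Condition.prototype.onError = function (errorType, action) {
    this.errorAction = action; // stored only; the stand-in never raises errors
};
Condition.prototype.exec = function (records) {
    return this.predicate(records);
};

// Usage: repeat the loop operations while fewer than three pages were visited.
var pageNumber = 0;
var visited = [];
var loop = new Condition(function () { return pageNumber < 3; });
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    visited.push(pageNumber);   // loop operations
    pageNumber = pageNumber + 1;
}
```

Because the condition is checked before each pass, the loop operations are skipped entirely when the condition is false on entry.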
5.3.20 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description iterates over different inter-related pages using one or several browsing sequences
bull Functions
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration
bull iterations indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information
bull inputPage indicates the page from which the next browsing sequence is to be run
o next(inputRecords, inputPage) returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage indicates the page from which the next pages are to be accessed
o close() closes the iterator
o setRetries(count) configures the number of retries in the event of an error when accessing the next page
bull count number of retries
o setRetryDelay(count) configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag) indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function
bull flag “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run
o setBackSequence(back) optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) indicates whether the connection will be reused or not
bull reusingConnection if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a POST navigation has been configured as the back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
5.3.21 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) adds the component input record that is to be used as the wrapper result
bull record record to use
5.3.22 Record Constructor
bull Object Record_Constructor
bull Description constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones
bull Functions
o Constructor(recordsObj, name)
bull recordsObj list of input elements. Each element of the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName, expression, errorAction) adds a new field to the record under construction
bull fieldName name of the field
bull expression field definition expression, e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run if the expression cannot be evaluated correctly. The possible values are:
bull ON_ERROR_RAISE stop the wrapper run, indicating the source of the error
bull ON_ERROR_IGNORE ignore the error and continue with the wrapper run
o exec() runs the Record Constructor component instance, returning an object that represents the record obtained
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements, except for the one that caused the error
5.3.23 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel
bull inputPage optional; indicates the starting page
o exec() returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5.3.24 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5.3.25 Repeat
bull Description creates loops in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7 Using the Repeat function
5.3.26 Script
bull Description allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its place within the process flow
5.3.27 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence, sequenceType, reusableConnection, inputPage)
bull sequence NSEQL browsing program (see [NSEQL])
bull sequenceType type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information
bull inputPage optional parameter that indicates the starting page. If omitted, the NSEQL program is run directly
o exec(inputValues, inputPage) runs the Sequence component, returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter that indicates the page from which the component browsing sequence is run
o setRetries(count) sets the number of retries in the event of failure
bull count number of retries
o setRetryDelay(mseconds) sets the waiting time between retries
bull mseconds waiting time between retries, in milliseconds
o close() closes the connection with the running browser
o syncWithPost(flag) indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function
bull flag “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run
o setBackSequence(back) optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) indicates whether the connection will be reused or not
bull reusingConnection if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a POST navigation has been configured as the back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
5.3.28 Store File
bull Object StoreFile
bull Description stores the contents passed as the input parameter in a file
bull Functions
o Constructor(content, file)
bull content string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) determines whether the output file name should be generated automatically when the input file is null or is a directory
bull generate indicates whether the file name should be generated automatically
o setRetries(count) sets the number of retries in the event of failure
bull count number of retries
o setRetryDelay(mseconds) sets the waiting time between retries
bull mseconds waiting time between retries, in milliseconds
5.3.29 Thread
bull Object Thread
bull Description represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() causes the thread to wait until all executions invoked with the execute function have finished
o execute(functionName, <list of arguments>) launches a thread that runs the given function
bull functionName name of the JavaScript function to be run
bull <list of arguments> list of arguments separated by commas, which must match the arguments of the JavaScript function
o setMaxConcurrentThreads(int) configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish
bull int maximum number
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper is simple: write a VQL statement of the following form:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just developed.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
Boolean or blob. The value of a compound attribute is always an array of registers. In turn, each register is composed of several fields, and these fields may again be either atomic or compound.
For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various available editions of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.
The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.
It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained from an atomic field: its type (method java.lang.Class getType()); whether its value is obtained from the source or not, that is, whether it is a searchable field that cannot be found in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).
Finally, the methods
    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)
allow setting, through the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
3.4 PROCESSING QUERY RESULTS
The query method, which executes queries against a wrapper, returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).
The method hasNext() checks whether there are still elements to return. Because of this asynchronous behavior, it must be called before accessing each element, to make sure that data elements are available.
The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method
    com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)
where fieldname is the name of the desired field.
The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): if an error has occurred, returns a textual description of it; otherwise, returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:
    TITLE
    DIRECTOR
    EDITIONS
        FORMAT
        PRICE
        DESCRIPTION
where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.
To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // DoubleVO is used below; it is assumed to be in the same package as SimpleVO
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);
                // Get Wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");
                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");
                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);
                // Iterate results; hasNext() is checked before every next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();
                    // Process each tuple
                    System.out.print(numOfTuples + ": ");
                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);
                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);
                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");
                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();
                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);
                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }
                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }
Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
The Denodo Platform includes Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions. See the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by using naming conventions, but we recommend developing them with Java annotations. The annotations and the compound types and values required to create custom functions are located in
    $DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:
    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary
Table: Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be basic (primitive) types: int, long, double, etc.
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.
For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction"
This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the parameters of the Java method. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array; for example, the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or a register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
  o name: name of the custom function.
  o type: in ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional parameter, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation that gives the parameter a name, which makes the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
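As a rough illustration of the naming-convention approach described at the start of this section, the following sketch declares the Concat_SampleItpFunction class with a single variable-arity signature. The class and method names follow the text above; the method body (plain concatenation, skipping null arguments) is an assumption made for illustration, and the class is left package-private only so that the snippet compiles on its own.

```java
// Sketch of a custom function defined by naming convention only
// (no Denodo libraries are needed). The "ItpFunction" suffix marks the
// class as a custom function named Concat_Sample.
class Concat_SampleItpFunction {

    // Every signature is implemented by a method named "execute".
    // The trailing varargs parameter gives the function arity n.
    // Assumption: the function concatenates its inputs and skips nulls.
    public String execute(String... inputs) {
        StringBuilder sb = new StringBuilder();
        if (inputs != null) {
            for (String input : inputs) {
                if (input != null) {
                    sb.append(input);
                }
            }
        }
        return sb.toString();
    }
}
```

Because only the last parameter can be repeated, a signature that also needs fixed arguments would place them before the varargs parameter.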
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, or which return compound types, can implement an additional method that computes the return type without executing the function. This is optional in most cases (see rule 1 below), but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
The return type of the additional method depends on what the execute method returns:
• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return values will have in ITPilot.
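The simplest case of these rules, a function whose execute method returns a String, might be sketched as follows using the naming conventions of section 4.1. The function name Upper_Sample and its behavior are hypothetical. Declaring the additional method with the return type Class<?> is also an assumption, since the text only states that the method has to return java.lang.String.class.

```java
// Hypothetical custom function UPPER_SAMPLE, defined by naming convention.
// Left package-private only so that the snippet compiles on its own.
class Upper_SampleItpFunction {

    // Function signature: one text parameter, text result.
    public String execute(String value) {
        return value == null ? null : value.toUpperCase(java.util.Locale.ROOT);
    }

    // Optional companion method: same parameter list as execute, but it
    // only reports the result type, so the function body need not run.
    // Assumption: Class<?> is an acceptable declared return type here.
    public Class<?> executeReturnType(String value) {
        return String.class;
    }
}
```

For a function returning a CustomArrayValue, the companion method would return a CustomArrayType instead, as in the annotated example of section 4.5.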
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.
    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Function signature: SPLIT_SAMPLE(regexp, value)
        @CustomExecutor()
        public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            // One single-field record per substring
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (an array of records with one text field)
        // without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }
Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }
    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }
    function main() {
    }
Figure 3: ITPilot Wrapper Skeleton in JavaScript
Each script can contain three functions, one mandatory and two optional:
1. main(): the only mandatory function; it contains the component implementation.
2. getInit(): returns the set of searchable parameters.
3. getOutputSchema(): returns the structure of the output objects, if they exist [1].
These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as the Output component receives it.
[1] Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the business logic of the wrapper is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the output structure of the wrapper, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog of section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object Record_Structure (see section 5.3.2.1):
    rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)
    rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression for the character string generation. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy Hh mm ss
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for display.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component explained in section 5.3.22 must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
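As an illustration of how these functions combine in getOutputSchema(), the following sketch builds a nested structure. Record_Structure is replaced here by a minimal stand-in so that the example runs outside ITPilot: only the call shapes follow the function list above, while the internals, the addField helper, the field names and the toString() layout are assumptions made for this example.

```javascript
// Stand-in constants; inside ITPilot these are predefined identifiers.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Minimal stand-in for Record_Structure (assumed internals, not Denodo's code).
function Record_Structure(name) {
  this.name = name;
  this.fields = [];
}
// addField is a hypothetical helper used only by this stand-in.
Record_Structure.prototype.addField = function (kind, field, type, extra) {
  this.fields.push({
    kind: kind,
    field: field,
    type: type === undefined ? OPTIONAL : type, // fields are optional by default
    extra: extra
  });
};
Record_Structure.prototype.setText = function (field, regexp, type) {
  this.addField("text", field, type, regexp === undefined ? "" : regexp);
};
Record_Structure.prototype.setInt = function (field, type) {
  this.addField("int", field, type);
};
Record_Structure.prototype.setArray = function (name, structure, type) {
  this.addField("array", name, type, structure);
};
Record_Structure.prototype.toString = function () {
  var parts = [];
  for (var i = 0; i < this.fields.length; i++) {
    parts.push(this.fields[i].field + ":" + this.fields[i].kind);
  }
  return this.name + "(" + parts.join(", ") + ")";
};

// Hypothetical output schema: a BOOK record containing an array of REVIEW records.
function getOutputSchema() {
  var review = new Record_Structure("REVIEW");
  review.setText("AUTHOR");
  review.setInt("STARS");

  var book = new Record_Structure("BOOK");
  book.setText("TITLE", "", OBLIGATORY); // regexp is passed so that type can be set
  book.setInt("PAGES");
  book.setArray("REVIEWS", review);
  return book;
}

var schema = getOutputSchema();
var text = schema.toString();
```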
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components lack some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.
  o errorId: the type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" if an error occurs, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
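The four error actions can be modelled as a small control-flow policy. The sketch below is only an illustration of the semantics described above: runWithPolicy is a hypothetical helper, not an ITPilot function, the retried unit is reduced to a single JavaScript function, no delay between retries is modelled, and the assumption is made that ON_ERROR_RETRY raises the error once the retries are exhausted.

```javascript
// Stand-in constants mirroring the documented action names.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical helper: runs `step` under one of the four actions.
// `retries` is the number of extra attempts allowed after the first failure.
function runWithPolicy(step, action, retries) {
  var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
  var attempts = retrying ? retries + 1 : 1;
  var lastError = null;
  for (var i = 0; i < attempts; i++) {
    try {
      return step(i); // success: hand back the value produced by the step
    } catch (e) {
      lastError = e;  // remember the failure; loop again if attempts remain
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null;      // the error is swallowed and the value becomes null
  }
  throw lastError;    // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted
}

// A step that fails on its first two attempts and then succeeds.
var calls = 0;
function flaky(attempt) {
  calls++;
  if (attempt < 2) { throw new Error("TIMEOUT_ERROR"); }
  return "page";
}
function alwaysFails() { throw new Error("HTTP_ERROR"); }

var recovered = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(alwaysFails, ON_ERROR_IGNORE, 0);
var gaveUp = runWithPolicy(alwaysFails, ON_ERROR_RETRY_IGNORE, 2);
```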
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition described in the constructor against the input parameter elements and returns "true" or "false" depending on whether the condition is met.
    • elements: the elements on which the condition is evaluated. This parameter must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]".
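To make the positional placeholders concrete, the sketch below evaluates expressions of the form "($i OP $j)" against an elements array. It is a deliberately tiny stand-in, not the ITPilot expression engine: evalCondition is a hypothetical name, only "<=" appears in the text above, and the other comparison operators are included for illustration only. The real expression language is the one described in Appendix A of [GENER].

```javascript
// Hypothetical evaluator for a single comparison between two placeholders.
// $0 refers to elements[0], $1 to elements[1], and so on.
function evalCondition(expr, elements) {
  var m = /^\(?\s*\$(\d+)\s*(<=|>=|==|!=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
  if (!m) {
    throw new Error("Unsupported expression: " + expr);
  }
  var left = elements[Number(m[1])];
  var right = elements[Number(m[3])];
  switch (m[2]) {
    case "<=": return left <= right;
    case ">=": return left >= right;
    case "==": return left === right;
    case "!=": return left !== right;
    case "<":  return left < right;
    default:   return left > right; // the only remaining operator is ">"
  }
}

var met = evalCondition("($0 <= $1)", [3, 5]);    // first element is smaller
var notMet = evalCondition("($0 <= $1)", [7, 5]); // first element is larger
```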
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether HTML tag attributes are taken into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
    • mergedDeletions: "true" means that the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
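The role of the prefix and suffix labels and of tokenSeparator can be illustrated with a generic token-level diff. The sketch below uses a standard longest-common-subsequence comparison; it is not Denodo's algorithm, diffTokens and the shape of its labels argument are hypothetical, and bracket markers are used in place of the HTML tags that the component applies by default.

```javascript
// Hypothetical token-level diff based on the longest common subsequence (LCS).
function diffTokens(baseCode, finalCode, labels) {
  var a = baseCode.split(labels.tokenSeparator);
  var b = finalCode.split(labels.tokenSeparator);

  // lcs[i][j] = length of the LCS of a[i..] and b[j..], filled from the end.
  var lcs = [];
  for (var i = a.length; i >= 0; i--) {
    lcs[i] = [];
    for (var j = b.length; j >= 0; j--) {
      if (i === a.length || j === b.length) {
        lcs[i][j] = 0;
      } else if (a[i] === b[j]) {
        lcs[i][j] = lcs[i + 1][j + 1] + 1;
      } else {
        lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
      }
    }
  }

  // Walk both token lists, wrapping added and deleted tokens in their labels.
  var out = [];
  var x = 0;
  var y = 0;
  while (x < a.length || y < b.length) {
    if (x < a.length && y < b.length && a[x] === b[y]) {
      out.push(a[x]); // token present in both pages
      x++;
      y++;
    } else if (y < b.length && (x === a.length || lcs[x][y + 1] >= lcs[x + 1][y])) {
      out.push(labels.additionPrefix + b[y] + labels.additionSuffix);
      y++;
    } else {
      out.push(labels.deletionPrefix + a[x] + labels.deletionSuffix);
      x++;
    }
  }
  return out.join(labels.tokenSeparator);
}

var labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " "
};
var marked = diffTokens("a b c", "a x c", labels);
var same = diffTokens("a b", "a b", labels);
```

With these inputs, the unchanged tokens pass through untouched and only the replaced token is wrapped, once as an addition and once as a deletion.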
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4. Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied or "false" if not. This is "true" by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url (optional): URL where the resource to be downloaded can be found.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page (optional): the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page (optional): the page from which the HTTP request is launched.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further data extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
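The interplay between syncWithPost, setBackSequence and setBackPages amounts to a three-step fallback. The sketch below restates that order as plain code; chooseRecovery, the config object and the returned strings are hypothetical labels for this example rather than ITPilot API values, and the default of one page for the Back() case is an assumption.

```javascript
// Hypothetical model of the fallback order described for page-state recovery.
function chooseRecovery(config) {
  // syncWithPost is the documented default, so only an explicit false disables it.
  if (config.syncWithPost !== false) {
    return "POST to the page URL with the original parameters";
  }
  // Next choice: an explicit back sequence, if one was configured.
  if (config.backSequence) {
    return "run back sequence: " + config.backSequence;
  }
  // Last resort: the NSEQL Back() command, repeated backPages times.
  var pages = config.backPages === undefined ? 1 : config.backPages; // assumed default
  return "Back() x " + pages;
}

var byDefault = chooseRecovery({});
var withSequence = chooseRecovery({ syncWithPost: false, backSequence: "<NSEQL program>" });
var withBack = chooseRecovery({ syncWithPost: false, backPages: 2 });
```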
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
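The behavior described in the note can be sketched as a filter loop that drops only the offending record. This is an illustration under stated assumptions: filterIgnoringErrors is a hypothetical helper, the filter expression is represented by an ordinary JavaScript predicate, and the PRICE and MAX_PRICE field names are invented for the example.

```javascript
// Hypothetical stand-in: `predicate` plays the role of the filter expression.
function filterIgnoringErrors(inputRecords, auxiliaryRecords, predicate) {
  var kept = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i], auxiliaryRecords)) {
        kept.push(inputRecords[i]);
      }
    } catch (e) {
      // ON_ERROR_IGNORE: skip only the record that caused the error.
    }
  }
  return kept;
}

// Keeps records whose PRICE does not exceed the limit held in the auxiliary record.
function cheapEnough(record, aux) {
  if (typeof record.PRICE !== "number") {
    throw new Error("RUNTIME_ERROR: PRICE is not numeric");
  }
  return record.PRICE <= aux[0].MAX_PRICE;
}

var input = [{ PRICE: 10 }, { PRICE: "n/a" }, { PRICE: 50 }, { PRICE: 20 }];
var result = filterIgnoringErrors(input, [{ MAX_PRICE: 25 }], cheapEnough);
```

Here the record with a non-numeric PRICE raises an error and is left out, the record priced at 50 is filtered out normally, and the remaining two are returned.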
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each iteration.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: "true" means that the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may merely contain them ("false").
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: "true" means that the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
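The hasNext()/next()/close() protocol and the effect of CLICKED_AND_NON_CLICKED_ELEMENT can be sketched with a small combination iterator. ComboIterator below is a hypothetical stand-in, not the Form_Iterator implementation: it assumes that the configured fields are combined as a Cartesian product, its next() returns the field values of the iteration rather than a page, and the field names are invented for the example.

```javascript
// Stand-in constants mirroring the documented names.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical iterator over every combination of the configured field values.
function ComboIterator() {
  this.fields = []; // one entry per configured field
  this.max = Infinity;
  this.index = 0;
  this.closed = false;
}
ComboIterator.prototype.input = function (field, position, values) {
  this.fields.push({ field: field, options: values });
};
ComboIterator.prototype.click = function (field, value, state) {
  var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT] // expands into two combinations
    : [state];
  this.fields.push({ field: field, options: options });
};
ComboIterator.prototype.setMaxIterations = function (count) {
  this.max = count;
};
ComboIterator.prototype.total = function () {
  var n = 1;
  for (var i = 0; i < this.fields.length; i++) {
    n *= this.fields[i].options.length;
  }
  return Math.min(n, this.max);
};
ComboIterator.prototype.hasNext = function () {
  return !this.closed && this.index < this.total();
};
ComboIterator.prototype.next = function () {
  // Decode the running index as a mixed-radix number; the last field varies fastest.
  var combo = {};
  var rest = this.index++;
  for (var i = this.fields.length - 1; i >= 0; i--) {
    var opts = this.fields[i].options;
    combo[this.fields[i].field] = opts[rest % opts.length];
    rest = Math.floor(rest / opts.length);
  }
  return combo;
};
ComboIterator.prototype.close = function () {
  this.closed = true;
};

var it = new ComboIterator();
it.input("city", 0, ["Madrid", "Paris"]);
it.click("newsletter", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);
var seen = [];
while (it.hasNext()) {
  var combo = it.next();
  seen.push(combo.city + "/" + combo.newsletter);
}
it.close();
```

Two input values combined with a checkbox in both states yield four iterations in this model; calling setMaxIterations with a smaller number would cut the loop short.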
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. This optional parameter is used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record will be generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record that holds the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 43
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is mandatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is mandatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is mandatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is mandatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record that represents the wrapper initialization parameters.
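The field-creation functions listed above can be tried out with a small stand-in object. The sketch below is not the ITPilot runtime: InitStub is a hypothetical class that mimics only the documented method names (setText, setInt, setDate, get, exec) so that the snippet runs under any plain JavaScript engine. Representing OPTIONAL, OBLIGATORY and FIXED as strings, the field names and the date pattern are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for the Init component; NOT the ITPilot runtime.
// The constants are modelled as strings here (assumption).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

function InitStub() {
    this.fields = {};
}

// Internal helper (assumption): stores one field definition in the record.
InitStub.prototype._set = function (type, field, obl, fixedValue, format) {
    this.fields[field] = {
        type: type,
        obl: obl || OPTIONAL,                      // OPTIONAL is the documented default
        value: obl === FIXED ? fixedValue : null,  // only FIXED fields carry a constant
        format: format || null
    };
};
InitStub.prototype.setText = function (field, obl, fixedValue) {
    this._set("text", field, obl, fixedValue);
};
InitStub.prototype.setInt = function (field, obl, fixedValue) {
    this._set("int", field, obl, fixedValue);
};
InitStub.prototype.setDate = function (field, format, obl, fixedValue) {
    this._set("date", field, obl, fixedValue, format);
};
InitStub.prototype.get = function (name) {
    return this.fields[name].value;
};
InitStub.prototype.exec = function () {
    return this.fields;  // the "initialization record" of this sketch
};

// Usage: one optional, one mandatory, one fixed and one date parameter.
var init = new InitStub();
init.setText("TITLE");                        // obl omitted, so OPTIONAL
init.setText("DIRECTOR", OBLIGATORY);
init.setInt("MAX_RESULTS", FIXED, 10);
init.setDate("RELEASED", "yyyy-MM-dd", OPTIONAL);
var initRecord = init.exec();
```

In a real wrapper the predefined Init object would take the place of InitStub; only the order of the arguments is taken from the function descriptions above.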
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
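For comparison with Figure 5, a purely sequential traversal needs only hasNext() and next(). The following is a minimal sketch under stated assumptions: IteratorStub is a hypothetical stand-in that mimics the two documented functions over a plain JavaScript array, and the record fields are invented for the example.

```javascript
// Hypothetical stand-in for the Iterator component; NOT the ITPilot runtime.
function IteratorStub(list) {
    this.list = list;
    this.position = 0;
}
IteratorStub.prototype.hasNext = function () {
    return this.position < this.list.length;
};
IteratorStub.prototype.next = function () {
    // Returns the records in list order, one per call.
    return this.list[this.position++];
};

// Usage: visit every record once, in order.
var records = [{ TITLE: "Manhattan" }, { TITLE: "Annie Hall" }];
var iterator = new IteratorStub(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
```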
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6 Using the Loop function
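The control flow of Figure 6 can be run outside ITPilot with a stand-in condition. This is only a sketch: ConditionStub is a hypothetical replacement for the Condition object that evaluates a plain JavaScript predicate instead of an ITPilot condition expression, and the page counter is invented for the example.

```javascript
// Hypothetical stand-in for the Condition component; NOT the ITPilot runtime.
function ConditionStub(predicate) {
    this.predicate = predicate;
}
ConditionStub.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// WHILE ... DO: the condition is tested before every pass, so the body
// may run zero times.
var pagesVisited = 0;
var loop = new ConditionStub(function () {
    return pagesVisited < 3;
});
while (loop.exec([])) {
    pagesVisited++;  // placeholder for the loop operations
}
```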
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iterating over several inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched that maintains the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next element of the iteration.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.
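The way a field expression refers to an input record can be illustrated with a few lines of plain JavaScript. The sketch below is not how ITPilot evaluates expressions: it assumes a `$<index>.<FIELD>` reference syntax, supports nothing else, and models ON_ERROR_IGNORE by simply leaving out the field that failed. The names resolveReference and buildRecord, and the sample data, are hypothetical.

```javascript
// Illustrative only; the real component evaluates full ITPilot expressions.
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves "$<index>.<FIELD>" against the list of input records.
function resolveReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[parseInt(match[1], 10)];
    if (!record || !(match[2] in record)) {
        throw new Error("Cannot evaluate " + expression);
    }
    return record[match[2]];
}

// Builds the output record field by field, honouring each errorAction.
function buildRecord(recordsObj, fieldDefs) {
    var output = {};
    for (var i = 0; i < fieldDefs.length; i++) {
        var def = fieldDefs[i];
        try {
            output[def.fieldName] = resolveReference(def.expression, recordsObj);
        } catch (e) {
            if (def.errorAction !== ON_ERROR_IGNORE) {
                throw e;  // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: skip this field and continue (assumption)
        }
    }
    return output;
}

// Usage: $0 is the first input record, $1 the second.
var inputs = [{ PARAM1: "Manhattan", YEAR: 1979 }, { PRICE: 9.99 }];
var record = buildRecord(inputs, [
    { fieldName: "TITLE", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "PRICE", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { fieldName: "MISSING", expression: "$1.NOPE", errorAction: ON_ERROR_IGNORE }
]);
```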
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched that maintains the session's information. In general this value will be "true", although it may not be a good option if the previous iterator runs in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript using a do ... while loop, with a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
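Because the condition in Figure 7 is tested after the body, the loop operations always run at least once. The sketch below shows that behaviour with a hypothetical ConditionStub that stands in for the Condition object and evaluates a plain JavaScript predicate; it is not the ITPilot runtime.

```javascript
// Hypothetical stand-in for the Condition component; NOT the ITPilot runtime.
function ConditionStub(predicate) {
    this.predicate = predicate;
}
ConditionStub.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// The condition is false from the start, yet the body still runs once,
// because do ... while evaluates it only after each pass.
var runs = 0;
var repeat = new ConditionStub(function () {
    return false;
});
do {
    runs++;  // placeholder for the loop operations
} while (repeat.exec([]));
```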
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): blocks until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
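The call pattern of the three Thread functions can be sketched as follows. ThreadStub is a hypothetical stand-in, not the ITPilot Thread object: it runs the queued functions one after another, in batches of at most the configured size, so it shows the order of calls (configure, execute per record, then wait) rather than real parallelism. The processRecord function and the record values are invented for the example.

```javascript
// Hypothetical stand-in for the Thread component; runs everything sequentially.
function ThreadStub() {
    this.maxConcurrent = 1;
    this.queue = [];
}
ThreadStub.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};
ThreadStub.prototype.execute = function (fn) {
    // Remaining arguments are passed on to the function when it runs.
    var args = Array.prototype.slice.call(arguments, 1);
    this.queue.push({ fn: fn, args: args });
};
ThreadStub.prototype.wait = function () {
    // Drain the queue in batches no larger than maxConcurrent.
    while (this.queue.length > 0) {
        var batch = this.queue.splice(0, this.maxConcurrent);
        for (var i = 0; i < batch.length; i++) {
            batch[i].fn.apply(null, batch[i].args);
        }
    }
};

// Usage: one execute() call per record, then wait() before using the results.
var processed = [];
function processRecord(prefix, recordInstance) {
    processed.push(prefix + recordInstance);
}

var thread = new ThreadStub();
thread.setMaxConcurrentThreads(2);
var pending = ["r1", "r2", "r3", "r4", "r5"];
for (var i = 0; i < pending.length; i++) {
    thread.execute(processRecord, "done:", pending[i]);
}
thread.wait();
```

Calling wait() before reading the results matters in the real component, since execute returns before the launched function has finished.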
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
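As an illustration of the naming scheme described above, here is a sketch of a hypothetical custom component called "upper" that upper-cases its input. Several points are assumptions: the type codes are declared locally (with the values listed above) so that the file runs on its own, the body of upper_getInputStructure is a placeholder because the schema definition calls are not described in this section, and no getOutputStructure function is provided because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Sketch of a custom component file, e.g. upper.js (hypothetical name).
// Type codes copied from the list above; declared here only so that the
// sketch is self-contained.
var RECORD_TYPE = 3, STRING_TYPE = 13;

// Main function: <component name>_main(<input>).
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Input schema definition. Placeholder only: the calls needed to describe
// the schema are not covered here, so nothing is declared.
function upper_getInputStructure() {
    return null;
}

// Output type: a simple string value, so no output structure is required.
function upper_getOutputType() {
    return STRING_TYPE;
}
```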
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward: the VQL statement only has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside the code they must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element, to make sure that data elements are available.
The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field.
The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return. Hence the need to call hasNext() first.
As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO, BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].
If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.
Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:
• Boolean checkErrors(): checks whether an error has occurred during query execution. It returns 'true' if an error has occurred and 'false' if not.
• String getErrorDescription(): if an error has occurred, returns a textual description of it; otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise' error handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:
    void cancel()
3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the same one used as an example in the preceding section:
    TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}
where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.
To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
// DoubleVO (used for the PRICE field below) must also be imported;
// its package is not given in this listing.

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ": ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only field of type double
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server ");
        } finally {
        }
    }
}
Figure 1. Example of query execution on a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. They are Java classes packaged in a Jar file that is added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations, although naming conventions can be used instead, and custom functions can even be created without any dependency on Denodo libraries. The annotations and the compound types and values required to create custom functions are located in
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types, or both). For each signature of the function, the class must define a Java method implementing the functionality of the function with those arguments, and one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:
Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
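The equivalences in the table can be restated as a small lookup. This is only an illustrative sketch: the class ItpilotTypeMapping and its method itpilotTypeOf are hypothetical names introduced here and are not part of the Denodo libraries. It also shows why a primitive such as int has no entry, in line with the note above.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative lookup of the Java-to-ITPilot type equivalences listed in the table. */
final class ItpilotTypeMapping {

    // Hypothetical helper, not part of the Denodo API: it only restates the table.
    private static final Map<Class<?>, String> TYPES = new LinkedHashMap<>();

    static {
        TYPES.put(Integer.class, "int");
        TYPES.put(Long.class, "long");
        TYPES.put(Float.class, "float");
        TYPES.put(Double.class, "double");
        TYPES.put(Boolean.class, "boolean");
        TYPES.put(String.class, "text");
        TYPES.put(Calendar.class, "date");
        TYPES.put(byte[].class, "binary");
    }

    /** Returns the ITPilot type name for a Java class, or null if the class is not in the table. */
    static String itpilotTypeOf(Class<?> javaType) {
        return TYPES.get(javaType);
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(Integer.class)); // prints "int"
        System.out.println(itpilotTypeOf(int.class));     // prints "null": primitives are not mapped
    }
}
```

The lookup is exact, so a subclass such as java.util.GregorianCalendar is not resolved by this sketch.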
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.
For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + “ItpFunction”
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or a register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature when it is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
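The naming convention described in this section can be sketched without any Denodo dependency. In the sketch below, Concat_SampleItpFunction follows the convention with an arity-n execute method, while NamingConventionDemo and its helper functionNameOf are hypothetical names used only to show how a function name could be derived from a class name; the way ITPilot itself performs this step is not documented here, and skipping null inputs is a choice made for the example.

```java
/** Hypothetical demo: derives a function name from a class that follows the naming convention. */
final class NamingConventionDemo {

    private static final String SUFFIX = "ItpFunction";

    /** Strips the "ItpFunction" suffix; returns null if the class does not follow the convention. */
    static String functionNameOf(Class<?> candidate) {
        String simpleName = candidate.getSimpleName();
        if (simpleName.endsWith(SUFFIX) && simpleName.length() > SUFFIX.length()) {
            return simpleName.substring(0, simpleName.length() - SUFFIX.length());
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(functionNameOf(Concat_SampleItpFunction.class));        // prints "Concat_Sample"
        System.out.println(new Concat_SampleItpFunction().execute("a", "b", "c")); // prints "abc"
    }
}

/** Class name = <FunctionName> + "ItpFunction", so no annotations are needed. */
class Concat_SampleItpFunction {

    /** Arity-n signature: the last (here, only) parameter is an array, written as varargs. */
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // null inputs are skipped in this sketch
                joined.append(input);
            }
        }
        return joined.toString();
    }
}
```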
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 to find out the type that these return values will have in ITPilot.
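As a minimal sketch of the pairing between an execution method and its return-type method for a simple type, the class below follows the naming convention of section 4.1. The function Repeat_Sample is a hypothetical example, and declaring the return-type method as returning Class<?> is an assumption made for illustration; the source only states that String.class has to be returned when execute returns a String.

```java
/** Hypothetical custom function following the naming convention: REPEAT_SAMPLE(text, int). */
class Repeat_SampleItpFunction {

    /** Execution method: repeats the value the given number of times. */
    public String execute(String value, Integer times) {
        if (value == null || times == null) {
            return null;
        }
        StringBuilder repeated = new StringBuilder();
        for (int i = 0; i < times; i++) {
            repeated.append(value);
        }
        return repeated.toString();
    }

    /**
     * Return-type method: same number and types of parameters as execute,
     * but it only reports the Java class of the result and does no work.
     */
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }

    public static void main(String[] args) {
        Repeat_SampleItpFunction function = new Repeat_SampleItpFunction();
        System.out.println(function.execute("ab", 3));                          // prints "ababab"
        System.out.println(function.executeReturnType("ab", 3).getName());      // prints "java.lang.String"
    }
}
```

Here the return type is constant, so the additional method is optional under rule 1; it becomes mandatory only when the result type varies with the inputs or is declared as java.lang.Object.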
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT (registered as SPLIT_SAMPLE), which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name = "regexp") String regex,
            @CustomParam(name = "value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        for (String string : result) {
            // A fresh single-field map per substring, so that records do not share state
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (an array of single-field records) without splitting anything
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}
Figure 2. ITPilot Custom Function Sample
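To make the shape of the value returned by the example easier to see, the sketch below reproduces the same logic with plain Java collections: a Map stands in for each record value and a List for the array value. SplitShapeDemo is a hypothetical class written for this illustration; it does not use or imitate the Denodo helper classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in showing the array-of-records shape produced by the SPLIT example. */
final class SplitShapeDemo {

    static final String STRING_FIELD = "string";

    /** Returns one single-field record per substring, or null if either argument is null. */
    static List<Map<String, Object>> split(String regex, String value) {
        if (value == null || regex == null) {
            return null;
        }
        List<Map<String, Object>> array = new ArrayList<>();
        for (String part : value.split(regex)) {
            Map<String, Object> record = new LinkedHashMap<>(); // one record per substring
            record.put(STRING_FIELD, part);
            array.add(record);
        }
        return array;
    }

    public static void main(String[] args) {
        System.out.println(split(",", "a,b,c")); // prints [{string=a}, {string=b}, {string=c}]
    }
}
```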
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic previous knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}
Figure 3 ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist.¹
These functions correspond to components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.
¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), is the one responsible for creating a new parameter initialization object. This object is described further on, in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):
rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
      • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
      • field: name of the new field.
      • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
      • field: name of the new field.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
      • field: name of the new field.
      • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
      • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
      • record: record name.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
      • name: name of the array.
      • structure: data structure that represents the record structure contained in the array.
      • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the case of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
      • listName: name of the list.
  o add(obj): adds an element to the list.
      • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. For the components that lack some of these “common” functions, the catalog says so.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when an error of a given type occurs. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error whose behavior is being configured. The possible values are:
      • RUNTIME_ERROR: error while the component is being run.
      • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
      • HTTP_ERROR: error produced by an HTTP error.
      • TIMEOUT_ERROR: error raised when the Web source takes too long to respond. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
      • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the case of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.
      • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configurable.
      • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error is still happening after the retries.
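The difference between the four error actions can be modelled in a few lines of plain Java. This is a simplified model written for illustration, not ITPilot code: the class ErrorActionDemo, the run helper and the retries argument are hypothetical, the delay between retries is left out, and retrying a single step (rather than the whole wrapper) and raising the last error once ON_ERROR_RETRY runs out of attempts are assumptions inferred from the descriptions above.

```java
import java.util.concurrent.Callable;

/** Simplified, hypothetical model of the four error actions applied to a single step. */
final class ErrorActionDemo {

    enum ErrorAction { ON_ERROR_RAISE, ON_ERROR_IGNORE, ON_ERROR_RETRY, ON_ERROR_RETRY_IGNORE }

    /** Runs the step under the given action; "ignore" is modelled as returning null. */
    static <T> T run(Callable<T> step, ErrorAction action, int retries) throws Exception {
        boolean retry = action == ErrorAction.ON_ERROR_RETRY
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        boolean ignore = action == ErrorAction.ON_ERROR_IGNORE
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        int attempts = retry ? Math.max(0, retries) + 1 : 1; // first attempt plus retries
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return step.call();
            } catch (Exception e) {
                last = e; // remember the error and try again if attempts remain
            }
        }
        if (ignore) {
            return null; // continue with a null result
        }
        throw last; // raise: stop and report the error
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        Callable<String> flaky = () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated failure");
            }
            return "ok";
        };
        String retried = run(flaky, ErrorAction.ON_ERROR_RETRY, 3);
        System.out.println(retried); // prints "ok" (third attempt succeeds)

        Callable<String> broken = () -> { throw new IllegalStateException("always fails"); };
        String ignored = run(broken, ErrorAction.ON_ERROR_RETRY_IGNORE, 2);
        System.out.println(ignored); // prints "null"
    }
}
```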
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
o TRACE
o DEBUG
o INFO
o WARN
o ERROR
o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
      • record: record to be added to the list.
      • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
      • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
      • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
      • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
      • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
      • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
      • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
      • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
      • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
      • baseCode: character string with the source page content.
      • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
      • baseCode: character string with the source page content.
      • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added content.
      • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added content.
      • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted content.
      • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted content.
      • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.
      • nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
      • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
      • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
      • mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied on HTML tokens of the source pages before comparing them.
      • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied on HTML tokens of the page. Those that match the regular expression will be discarded before starting the comparison.
      • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;

Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError("RUNTIME_ERROR", ON_ERROR_RAISE);
Execute_JavaScript_1.onError("CONNECTION_ERROR", ON_ERROR_RAISE);
Execute_JavaScript_1.onError("SEQUENCE_ERROR", ON_ERROR_RAISE);
Execute_JavaScript_1.onError("HTTP_ERROR", ON_ERROR_RAISE);
Execute_JavaScript_1.onError("TIMEOUT_ERROR", ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4. Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression, written as a string of characters. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides the set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
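The positional placeholders can be tried out outside ITPilot. The sketch below is not the ITPilot Expression component: evalPositional is a hypothetical helper that only mimics how $0, $1 and so on refer to the elements of the list passed to exec. It assumes plain numeric inputs and ordinary JavaScript operators rather than the functions of [GENER].

```javascript
// Hypothetical stand-in, NOT the ITPilot API: resolves $0, $1, ... against
// the list supplied at evaluation time, as exec(exprInput) is described to do.
function evalPositional(expression, inputs) {
  const resolved = expression.replace(/\$(\d+)/g, function (match, index) {
    const i = Number(index);
    if (i >= inputs.length) {
      throw new Error("No input for placeholder " + match);
    }
    return JSON.stringify(inputs[i]);
  });
  // new Function keeps the sketch short; the inputs are trusted here.
  return new Function("return (" + resolved + ");")();
}

const sum = evalPositional("$0 + $1", [2, 3]);
const ordered = evalPositional("$0 <= $1", [2, 3]);
console.log(sum, ordered);
```

The expression text is fixed when the helper is called, while the values arrive as a separate list, which is the same split the component makes between its constructor and exec.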
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): lets the user determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further extraction operations will be executed.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
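The order in which syncWithPost, setBackSequence and the Back() fallback are considered can be summarised in a few lines. This is a minimal sketch of the decision described above, not ITPilot code: chooseStateRecovery and its configuration object are hypothetical names, and the returned labels are placeholders.

```javascript
// Hypothetical helper, NOT part of ITPilot: mirrors the order of checks the
// text describes for recovering the state of a page.
function chooseStateRecovery(config) {
  // POST synchronization is described as the default method.
  const usePost = config.syncWithPost !== false;
  if (usePost) {
    return "POST";            // re-send the POST parameters used to reach the page
  }
  if (config.backSequence) {
    return "BACK_SEQUENCE";   // run the explicit NSEQL back sequence
  }
  return "BACK_COMMAND";      // fall back to the NSEQL Back() command
}

console.log(chooseStateRecovery({}));
console.log(chooseStateRecovery({ syncWithPost: false, backSequence: "<NSEQL back program>" }));
console.log(chooseStateRecovery({ syncWithPost: false }));
```

Only in the last case does the value given to setBackPages come into play, since it applies when the Back() command is the mechanism in use.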
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
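The effect described in the note can be illustrated with a small stand-in. The code below is not the ITPilot Filter: filterRecords, the "ignore" and "raise" labels and the PRICE field are hypothetical, and the condition is written as a JavaScript function instead of an ITPilot expression.

```javascript
// Hypothetical stand-in, NOT the ITPilot Filter: keeps the records for which
// the predicate holds; in "ignore" mode a record whose evaluation throws is
// skipped and the remaining records are still processed.
function filterRecords(records, predicate, onError) {
  const kept = [];
  for (const record of records) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (err) {
      if (onError !== "ignore") {
        throw err;
      }
      // "ignore": drop only the offending record and continue
    }
  }
  return kept;
}

const records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
const priceAbove = function (r) {
  if (r.PRICE === null) {
    throw new Error("PRICE is missing");
  }
  return r.PRICE > 5;
};
const result = filterRecords(records, priceAbove, "ignore");
console.log(result);
```

With "raise" instead of "ignore", the same call stops at the second record and propagates the error.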
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or merely contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same value.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
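To make the idea of "one combination of field values per iteration" concrete, the following sketch expands a set of candidate values into the list of runs. It is not ITPilot code: expandCombinations, the field names and the marker strings are hypothetical, and combining the fields as a Cartesian product is an assumption made for the illustration. The checkbox entry shows the two combinations that CLICKED_AND_NON_CLICKED_ELEMENT is described as producing.

```javascript
// Hypothetical sketch, NOT ITPilot code: expands the candidate values of each
// field into one combination per iteration. Taking the Cartesian product of
// the fields is an assumption of this example.
const CLICKED = "clicked";
const NON_CLICKED = "non-clicked";

function expandCombinations(fields) {
  let combos = [{}];
  for (const name of Object.keys(fields)) {
    const next = [];
    for (const combo of combos) {
      for (const value of fields[name]) {
        next.push(Object.assign({}, combo, { [name]: value }));
      }
    }
    combos = next;
  }
  return combos;
}

const combos = expandCombinations({
  city: ["Madrid", "Paris"],          // two values for an input field
  newsletter: [CLICKED, NON_CLICKED]  // checkbox tried both marked and unmarked
});

// Consumed one at a time, in the spirit of hasNext() and next().
let index = 0;
while (index < combos.length) {
  console.log(combos[index]);
  index += 1;
}
```

A limit such as the one set by setMaxIterations would simply stop this loop before the list is exhausted.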
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL from which the page comes.
    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.
    • lastURLPostParameters: POST-method parameters of the URL from which the page comes.
    • cookie: stored "cookies" information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but it becomes compulsory once the field is given a value; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
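The three values of obl can be read as a small validation rule applied when a query reaches the wrapper. The sketch below illustrates that rule under stated assumptions and is not the ITPilot Init component: buildInitRecord, the field definitions and the constants are local, hypothetical names. It also assumes that a FIXED field ignores any value supplied by the query and that an optional field left out of the query is stored as null.

```javascript
// Hypothetical stand-in, NOT the ITPilot Init component. The constants are
// redefined locally as plain strings for the purpose of the sketch.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

function buildInitRecord(fieldDefs, queryParams) {
  const record = {};
  for (const def of fieldDefs) {
    if (def.obl === FIXED) {
      record[def.field] = def.fixedValue;          // constant value (assumed to win)
    } else if (def.field in queryParams) {
      record[def.field] = queryParams[def.field];  // value supplied by the query
    } else if (def.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + def.field);
    } else {
      record[def.field] = null;                    // optional and not supplied (assumed)
    }
  }
  return record;
}

const defs = [
  { field: "QUERY", obl: OBLIGATORY },
  { field: "MAX_PRICE", obl: OPTIONAL },
  { field: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
const initRecord = buildInitRecord(defs, { QUERY: "laptops", COUNTRY: "FR" });
console.log(initRecord);
```

Calling buildInitRecord(defs, {}) fails because QUERY is obligatory, which is the case that OBLIGATORY is meant to catch.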
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
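The hasNext()/next() protocol itself can be exercised without ITPilot. The following stand-in is a sketch only: ListIterator and the ID field are hypothetical, and the loop runs sequentially rather than through a Thread object.

```javascript
// Hypothetical stand-in, NOT the ITPilot Iterator: exposes the same
// hasNext()/next() protocol over an ordered list of records.
function ListIterator(list) {
  this.list = list;
  this.position = 0;
}

ListIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};

ListIterator.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error("No more records");
  }
  const record = this.list[this.position];
  this.position += 1;
  return record;
};

const iterator = new ListIterator([{ ID: 1 }, { ID: 2 }, { ID: 3 }]);
const seen = [];
while (iterator.hasNext()) {
  seen.push(iterator.next().ID);   // records arrive in list order
}
console.log(seen);
```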
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE...DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides the set of functions described in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates through a series of inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
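The rule stated for the sequences parameter (a single sequence is reused in every iteration, several sequences are used one per iteration) can be written down as a short helper. This is a sketch, not ITPilot code: sequenceForIteration is a hypothetical name, the sequence labels are opaque placeholders rather than NSEQL programs, and returning null once the list is exhausted is an assumption.

```javascript
// Hypothetical helper, NOT ITPilot code: picks the browsing sequence for a
// given iteration following the rule stated for the sequences parameter.
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 0) {
    throw new Error("At least one sequence is required");
  }
  if (sequences.length === 1) {
    return sequences[0];          // single sequence: reused in all iterations
  }
  if (iteration >= sequences.length) {
    return null;                  // assumed: no sequence left, iteration ends
  }
  return sequences[iteration];    // several sequences: one per iteration
}

const single = ["next-page"];
const several = ["page-2", "page-3"];
console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2));
```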
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds a record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides the set of functions described in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR returns the list of filtered elements except for the one that caused the error.
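The add/exec pattern and the two error actions can be sketched as follows. This is not the ITPilot Record_Constructor: RecordBuilder is a hypothetical stand-in, the two constants are redefined locally as plain strings, field expressions are JavaScript functions rather than ITPilot expression strings, and leaving out a field whose expression fails under ON_ERROR_IGNORE is an assumption.

```javascript
// Hypothetical stand-in, NOT the ITPilot Record_Constructor. The constants
// are local strings defined only for this sketch.
const ON_ERROR_RAISE = "raise";
const ON_ERROR_IGNORE = "ignore";

function RecordBuilder(inputRecords) {
  this.inputs = inputRecords;
  this.fields = [];
}

RecordBuilder.prototype.add = function (fieldName, compute, errorAction) {
  this.fields.push({ fieldName: fieldName, compute: compute, errorAction: errorAction });
};

RecordBuilder.prototype.exec = function () {
  const record = {};
  for (const f of this.fields) {
    try {
      record[f.fieldName] = f.compute(this.inputs);
    } catch (err) {
      if (f.errorAction !== ON_ERROR_IGNORE) {
        throw err;                 // raise: stop and report the error
      }
      // ignore: the field is left out and construction continues (assumed)
    }
  }
  return record;
};

const builder = new RecordBuilder([{ PARAM1: "abc", PRICE: "n/a" }]);
builder.add("NAME", function (inputs) { return inputs[0].PARAM1; }, ON_ERROR_RAISE);
builder.add("PRICE", function (inputs) {
  const value = Number(inputs[0].PRICE);
  if (Number.isNaN(value)) {
    throw new Error("PRICE is not numeric");
  }
  return value;
}, ON_ERROR_IGNORE);
const built = builder.exec();
console.log(built);
```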
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser identifier, or a page that identifies a browser, and releases that specific browser.
• Functions:
  o Constructor(page)
      • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
      • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
      • sequence: NSEQL browsing program (see [NSEQL]).
      • sequenceType: type of pool to use. The possible values are:
          • SEQUENCE_IEBROWSER
          • SEQUENCE_HTTP_BROWSER
          • SEQUENCE_FTP
          • SEQUENCE_LOCAL
      • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
      • inputPage (optional): indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
      • inputValues: list of values that can be used as input parameters within the browsing sequence.
      • inputPage (optional): the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
      • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
      • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
      • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
      • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
      • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
      • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
      • 0: default browser implementation
      • 1: Internet Explorer browser implementation
      • 2: Firefox browser implementation
      • 3: Denodo HTTP browser implementation
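The interaction between syncWithPost, setBackSequence and setBackPages described in this section can be summarised as a small decision function. This is only a reading aid written in plain JavaScript: chooseBackStrategy and the settings object are hypothetical names, and the default of one page when setBackPages has not been called is an assumption, not a documented value.

```javascript
// Hypothetical helper for illustration; not an ITPilot function.
// settings.syncWithPost : value passed to syncWithPost(flag)
// settings.backSequence : NSEQL program passed to setBackSequence(back), or null
// settings.backPages    : value passed to setBackPages(pages), or undefined
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // Re-issue the POST with which the page was reached.
        return { kind: "POST" };
    }
    if (settings.backSequence) {
        // An explicit back sequence takes precedence over Back().
        return { kind: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Assumed default of one page when setBackPages was not called.
    var pages = settings.backPages === undefined ? 1 : settings.backPages;
    return { kind: "NSEQL_BACK", pages: pages };
}

var strategies = [
    chooseBackStrategy({ syncWithPost: true, backSequence: null }),
    chooseBackStrategy({ syncWithPost: false, backSequence: "<back_program>" }),
    chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 })
];
```

The three calls yield, in order, the POST strategy, the explicit back sequence, and an NSEQL Back() over two pages.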
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
      • content: string-type or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case the page content is stored.
      • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
      • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
      • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
      • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches an execution thread on the given function.
      • functionName: name of the JavaScript function to be run.
      • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
      • int: maximum number.
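The queuing behaviour of setMaxConcurrentThreads can be pictured with a small bookkeeping model. The following plain-JavaScript sketch does not start real threads and is not how ITPilot is implemented: BoundedRunner, finish and isIdle are hypothetical names, and completion of an execution is signalled by hand so that the example stays deterministic.

```javascript
// Bookkeeping model only: tracks which executions run and which ones wait.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];
    this.queued = [];
}

// Request an execution: it starts if a slot is free, otherwise it is queued.
BoundedRunner.prototype.execute = function (taskName) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(taskName);
    } else {
        this.queued.push(taskName);
    }
};

// Mark an execution as finished and start the oldest queued request, if any.
BoundedRunner.prototype.finish = function (taskName) {
    var index = this.running.indexOf(taskName);
    if (index === -1) {
        throw new Error(taskName + " is not running");
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

// A wait() call would return once this becomes true.
BoundedRunner.prototype.isIdle = function () {
    return this.running.length === 0 && this.queued.length === 0;
};

var runner = new BoundedRunner(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3"); // queued: two executions are already running
var queuedBefore = runner.queued.slice();
runner.finish("record1");  // frees a slot, so record3 starts
```

With a limit of two, the third request waits in the queue until one of the first two executions finishes.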
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
      • LIST_TYPE = 1
      • PAGE_TYPE = 2
      • RECORD_TYPE = 3
      • SIMPLE_TYPE = 4
      • ARRAY_TYPE = 5
      • BINARY_TYPE = 6
      • BOOLEAN_TYPE = 7
      • DATE_TYPE = 8
      • DOUBLE_TYPE = 9
      • FLOAT_TYPE = 10
      • INT_TYPE = 11
      • LONG_TYPE = 12
      • STRING_TYPE = 13
      • URL_TYPE = 14
      • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is required only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
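As an illustration of the file layout described in this section, the following sketch shows what a component file might contain. The component name "trimtitle" is hypothetical, and several points are assumptions rather than documented behaviour: that the input reaches the main function as an object with a TITLE property, and that the input structure is declared with a Record_Structure. The stand-in definitions at the top exist only so that the sketch runs outside ITPilot, where these names would already be predefined.

```javascript
// Stand-ins so the sketch runs outside ITPilot (assumed to be predefined there).
var STRING_TYPE = 13;           // value taken from the list of output types
var OBLIGATORY = "OBLIGATORY";  // placeholder value, assumed
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields.push({ field: field, regexp: regexp || "", type: type });
};

// Hypothetical component "trimtitle": trims whitespace around a TITLE value.
function trimtitle_getInputStructure() {
    var input = new Record_Structure("TRIMTITLE_IN");
    input.setText("TITLE", "", OBLIGATORY);
    return input;
}

function trimtitle_getOutputType() {
    return STRING_TYPE;
}

function trimtitle_main(trimtitle_input) {
    var trimtitle_output = null;
    if (trimtitle_input != null && trimtitle_input.TITLE !== undefined) {
        // String.prototype.trim is avoided to stay within JavaScript 1.5.
        trimtitle_output = String(trimtitle_input.TITLE).replace(/^\s+|\s+$/g, "");
    }
    return trimtitle_output;
}
```

Because the output type is STRING_TYPE, no trimtitle_getOutputStructure function is needed; it would only be required for RECORD_TYPE or LIST_TYPE outputs.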
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: write the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"

where jscode is the JavaScript code just developed.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
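The escaping rule in the note can be applied mechanically before the statement is assembled. The helper below is a minimal sketch, not part of ITPilot or of any VQL tooling: escapeForVql and buildCreateWrapperVql are hypothetical names, and the sketch assumes that the delimiting quotes are double quotes and that only those need escaping.

```javascript
// Hypothetical helpers for illustration only.

// Prefix every double quote in the JavaScript code with a backslash.
function escapeForVql(jsCode) {
    return jsCode.replace(/"/g, '\\"');
}

// Assemble the CREATE WRAPPER statement around the escaped code.
function buildCreateWrapperVql(name, jsCode) {
    return 'CREATE WRAPPER ITP ' + name + ' "' + escapeForVql(jsCode) + '"';
}

var statement = buildCreateWrapperVql(
    "movies",
    'function main() { var a = "x"; }'
);
```

The resulting statement keeps the wrapper code intact while every inner double quote is preceded by a backslash.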
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
3.4.1 Canceling Queries
The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

    void cancel()

3.5 EXAMPLE OF USE
This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme' on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the same one used as an example in the preceding section:

    {TITLE, DIRECTOR, EDITIONS: {FORMAT, PRICE, DESCRIPTION}}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and shown in the standard output.
To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.
    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // DoubleVO is used below but was not imported in the original listing;
    // it is assumed here to live in the same package as SimpleVO.
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);
                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();
                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);
                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                        System.out.println();
                    }
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }

Figure 1: Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file and added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions (including custom ITPilot functions), is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created with naming conventions alone, without dependencies on Denodo libraries, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in:

    $DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a Jar contains one or more functions with name conflicts, nothing in that Jar will be loaded in the server.
• All custom functions stored in the same Jar are added or removed together, by uploading the Jar to the server or removing it.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types, or both). For each signature of the function, the class must define a Java method that implements the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.
To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction"
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods that implement the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. For example, the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the Jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation that carries the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
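The naming convention described in this section is mechanical enough to be sketched in a few lines. The following plain-JavaScript helper is only an illustration of the rule, not code used by ITPilot: signatureFromClass is a hypothetical name, the type names come from the equivalency table of section 4, and the "name:type" notation of the resulting signature is assumed from the example given in the text.

```javascript
// Java-to-ITPilot type names, as listed in the equivalency table of section 4.
var JAVA_TO_ITPILOT = {
    "java.lang.Integer": "int",
    "java.lang.Long": "long",
    "java.lang.Float": "float",
    "java.lang.Double": "double",
    "java.lang.Boolean": "boolean",
    "java.lang.String": "text",
    "java.util.Calendar": "date",
    "byte[]": "binary"
};

// Hypothetical helper: derive the function name and signature that the naming
// convention would produce for a class and the parameter types of execute().
function signatureFromClass(className, javaParamTypes) {
    var suffix = "ItpFunction";
    if (className.length <= suffix.length ||
            className.slice(-suffix.length) !== suffix) {
        return null; // not recognized as a custom function by name alone
    }
    var functionName = className.slice(0, -suffix.length);
    var args = [];
    for (var i = 0; i < javaParamTypes.length; i++) {
        var itpType = JAVA_TO_ITPILOT[javaParamTypes[i]];
        if (itpType === undefined) {
            throw new Error("No simple-type mapping for " + javaParamTypes[i]);
        }
        args.push("arg" + (i + 1) + ":" + itpType);
    }
    return {
        name: functionName,
        signature: functionName.toUpperCase() + "(" + args.join(", ") + ")"
    };
}

var concat = signatureFromClass(
    "Concat_SampleItpFunction",
    ["java.lang.String", "java.lang.String"]
);
```

For the class of the example, concat.name is "Concat_Sample" and concat.signature is "CONCAT_SAMPLE(arg1:text, arg2:text)"; a class without the ItpFunction suffix yields null.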
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, or which return compound types, can implement an additional method that computes the return type without executing the function. This is optional in most cases, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as its respective parameter in the execute method, or an equivalent one.
The return value of the additional method depends on what the execute method returns:
• If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Function signature: splits 'value' around matches of 'regex'.
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name = "regexp") String regex,
                @CustomParam(name = "value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            // Wrap each substring in a one-field record and add it to the array.
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of one-field records) without executing the function.
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see note 1).
These functions correspond to components in the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.
Note 1: Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog of section 5.3.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object Record_Structure (see section 5.3.2.1):
• rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
• rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
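The reason a middle argument cannot be skipped is that JavaScript binds arguments by position. The following sketch uses a stand-in Record_Structure written only for this illustration (it is not the ITPilot object, and the OPTIONAL and OBLIGATORY values are placeholders) to show how trailing arguments receive their defaults and what happens when the regular expression argument is left out.

```javascript
// Stand-in definitions for illustration; in ITPilot these are predefined.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function Record_Structure(name) {
    this.name = name;
    this.fields = {};
}

// setText(field, regexp, type): regexp defaults to "", type to OPTIONAL.
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields[field] = {
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    };
};

var rs = new Record_Structure("OUT_REC");
rs.setText("A");                  // both trailing arguments omitted
rs.setText("B", "", OPTIONAL);    // same result, written out in full
rs.setText("C", OBLIGATORY);      // wrong: OBLIGATORY lands in the regexp slot
rs.setText("D", "", OBLIGATORY);  // correct way to make the field mandatory
```

In this stand-in, fields A and B end up identical, while field C silently receives OBLIGATORY as its regular expression and remains optional; field D is the mandatory field that was intended.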
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
      • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except for Text-, Integer-, Float- and Link-type fields, for which specific functions apply.
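As an illustration of how a nested output structure might be declared, the following sketch builds a record that contains an array of sub-records. RecordStructureSketch, its addField helper, the constant values and the text produced by toString() are all assumptions made for this example; they do not reproduce the ITPilot implementation.

```javascript
// Hypothetical constants; the real ITPilot values are not shown in this guide.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in that stores field definitions in declaration order.
class RecordStructureSketch {
  constructor(name) {
    this.name = name;
    this.fields = [];
  }
  // Internal helper of this sketch (not an ITPilot function).
  addField(name, kind, type, extra = {}) {
    this.fields.push({ name, kind, type, ...extra });
  }
  setText(field, regexp = "", type = OPTIONAL) {
    this.addField(field, "text", type, { regexp });
  }
  setInt(field, type = OPTIONAL) {
    this.addField(field, "int", type);
  }
  setArray(name, structure, type = OPTIONAL) {
    this.addField(name, "array", type, { structure });
  }
  // Assumed textual layout, chosen only for this example.
  toString() {
    const parts = this.fields.map((f) =>
      f.kind === "array"
        ? `${f.name}: array of ${f.structure.toString()}`
        : `${f.name}: ${f.kind}`
    );
    return `${this.name}(${parts.join(", ")})`;
  }
}

// Structure of the records held in the array field.
const author = new RecordStructureSketch("AUTHOR");
author.setText("NAME", "", OBLIGATORY);

// Top-level structure, as getOutputSchema() might describe it.
const book = new RecordStructureSketch("BOOK");
book.setText("TITLE", "", OBLIGATORY);
book.setInt("PAGES");
book.setArray("AUTHORS", author);

console.log(book.toString());
```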
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into an array of JavaScript objects.
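The three List functions can be pictured with a minimal stand-in. ListSketch is hypothetical and keeps its elements in a plain JavaScript array; returning a copy from toArray() is an assumption of this sketch, not documented ITPilot behavior.

```javascript
// Hypothetical stand-in for the ITPilot List object.
class ListSketch {
  constructor() {
    this.listName = null;
    this.items = [];
  }
  setListName(listName) {
    this.listName = listName;
  }
  add(obj) {
    this.items.push(obj);
  }
  // Returns a copy (assumption), so callers cannot modify the list by accident.
  toArray() {
    return this.items.slice();
  }
}

const results = new ListSketch();
results.setListName("RESULTS");
results.add({ TITLE: "A", PRICE: 10 });
results.add({ TITLE: "B", PRICE: 25 });

// Once converted, standard array functions can be applied.
const titles = results.toArray().map((record) => record.TITLE);
console.log(titles);
```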
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore presented in this first section. The catalog points out the components that do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when an error occurs. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is a connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" if there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return "null", but it is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configurable.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
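The four error actions can be compared with a stand-alone sketch. The constants and the runWithPolicy function below are hypothetical and only model the documented outcomes for a single task. Raising the error when ON_ERROR_RETRY runs out of retries is an assumption, inferred from the contrast with ON_ERROR_RETRY_IGNORE.

```javascript
// Hypothetical stand-ins for the ITPilot constants (assumed values).
const ON_ERROR_RAISE = "RAISE";
const ON_ERROR_IGNORE = "IGNORE";
const ON_ERROR_RETRY = "RETRY";
const ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Runs `task` under the given error action.
// `retries` is the number of extra attempts after the first failure.
function runWithPolicy(task, errorAction, retries = 0) {
  const retrying =
    errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE;
  const attempts = retrying ? retries + 1 : 1;
  let lastError = null;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return task(attempt);
    } catch (err) {
      lastError = err;
    }
  }
  if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
    return null; // the error is swallowed and the caller continues
  }
  throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted
}

// A task that fails on its first two attempts, and one that always fails.
const flaky = (attempt) => {
  if (attempt < 2) throw new Error("timeout");
  return "page";
};
const alwaysFails = () => {
  throw new Error("boom");
};

console.log(runWithPolicy(flaky, ON_ERROR_RETRY, 3));
console.log(runWithPolicy(alwaysFails, ON_ERROR_RETRY_IGNORE, 2));
```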
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: defines a condition. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: the elements on which the condition is evaluated. This parameter must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]".
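The positional placeholders $0 and $1 can be illustrated with a toy evaluator. ConditionSketch is a hypothetical stand-in that only understands a single comparison between two placeholders. The real expression language, with the functions listed in Appendix A of [GENER], is richer, and every operator other than <= is an assumption of this sketch.

```javascript
// Hypothetical stand-in: supports only "$i <op> $j", optionally in parentheses.
class ConditionSketch {
  constructor(expr) {
    const match = /^\(?\s*\$(\d+)\s*(<=|>=|==|!=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!match) {
      throw new Error("Unsupported expression: " + expr);
    }
    this.left = Number(match[1]);   // index of the first operand in `elements`
    this.operator = match[2];
    this.right = Number(match[3]);  // index of the second operand in `elements`
  }

  // `elements` is an array; $n refers to elements[n].
  exec(elements) {
    const a = elements[this.left];
    const b = elements[this.right];
    switch (this.operator) {
      case "<=": return a <= b;
      case ">=": return a >= b;
      case "<": return a < b;
      case ">": return a > b;
      case "==": return a === b;
      default: return a !== b; // "!="
    }
  }
}

const myCondition = new ConditionSketch("($0 <= $1)");
console.log(myCondition.exec([3, 5]));
console.log(myCondition.exec([7, 5]));
```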
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of the pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: if "true", the deleted content is shown. If "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
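To give an idea of what a labelled comparison can look like, the sketch below marks added and deleted tokens with configurable prefixes and suffixes. It is not the ITPilot algorithm: DiffSketch is a hypothetical stand-in that uses a standard longest-common-subsequence comparison, splits and joins on tokenSeparator, and uses <ins> and <del> tags as default labels instead of the colored-background tags mentioned above.

```javascript
// Builds the longest-common-subsequence length table for two token arrays.
function lcsTable(a, b) {
  const t = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      t[i][j] =
        a[i] === b[j]
          ? t[i + 1][j + 1] + 1
          : Math.max(t[i + 1][j], t[i][j + 1]);
    }
  }
  return t;
}

// Hypothetical stand-in for the Diff component (defaults are assumptions).
class DiffSketch {
  constructor(addPre = "<ins>", addSuf = "</ins>",
              delPre = "<del>", delSuf = "</del>", tokenSeparator = " ") {
    Object.assign(this, { addPre, addSuf, delPre, delSuf, tokenSeparator });
  }

  // "true" when both inputs are identical.
  diff(baseCode, finalCode) {
    return baseCode === finalCode;
  }

  // Returns the merged content with additions and deletions labelled.
  exec(baseCode, finalCode) {
    const a = baseCode.split(this.tokenSeparator);
    const b = finalCode.split(this.tokenSeparator);
    const t = lcsTable(a, b);
    const out = [];
    let i = 0;
    let j = 0;
    while (i < a.length || j < b.length) {
      if (i < a.length && j < b.length && a[i] === b[j]) {
        out.push(a[i]); // unchanged token
        i++;
        j++;
      } else if (j >= b.length || (i < a.length && t[i + 1][j] >= t[i][j + 1])) {
        out.push(this.delPre + a[i] + this.delSuf); // token only in the base page
        i++;
      } else {
        out.push(this.addPre + b[j] + this.addSuf); // token only in the final page
        j++;
      }
    }
    return out.join(this.tokenSeparator);
  }
}

const differ = new DiffSketch();
console.log(differ.exec("a b c", "a x c"));
```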
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method; it runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern-merging technique is to be applied, "false" if not. This is "true" by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
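The order in which the three ways of returning to the source page are considered (POST synchronization, explicit back sequence, Back() command) can be summarized as a small decision function. The function name, the shape of the configuration object, the returned labels and the default of one page for backPages are all assumptions of this sketch.

```javascript
// Hypothetical helper: decides how the source page state would be recovered.
function chooseBackStrategy(config = {}) {
  const { syncWithPost = true, backSequence = null, backPages = 1 } = config;
  if (syncWithPost) {
    // Default method: re-send the POST parameters used to reach the page.
    return { kind: "POST" };
  }
  if (backSequence) {
    // Explicit NSEQL back sequence set with setBackSequence(back).
    return { kind: "BACK_SEQUENCE", nseql: backSequence };
  }
  // Fallback: Back() NSEQL command, going back setBackPages(pages) pages.
  return { kind: "BACK_COMMAND", pages: backPages };
}

console.log(chooseBackStrategy());
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back sequence>" }));
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }));
```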
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
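The behavior described in the note can be sketched with a plain JavaScript function. filterRecords and its ignoreErrors flag are hypothetical: the flag stands in for having configured ON_ERROR_IGNORE, and the predicate stands in for the filter expression.

```javascript
// Hypothetical sketch of a filter that can skip records whose evaluation fails.
function filterRecords(inputRecords, predicate, ignoreErrors) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (err) {
      if (!ignoreErrors) {
        throw err; // comparable to ON_ERROR_RAISE
      }
      // Comparable to ON_ERROR_IGNORE: only the failing record is left out.
    }
  }
  return kept;
}

const records = [
  { TITLE: "A", TAGS: ["sale", "new"] },
  { TITLE: "B" },                      // no TAGS: evaluating the predicate throws
  { TITLE: "C", TAGS: ["sale"] },
  { TITLE: "D", TAGS: ["new"] },
];
const onSale = (record) => record.TAGS.includes("sale");

const filtered = filterRecords(records, onSale, true);
console.log(filtered.map((record) => record.TITLE));
```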
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
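The hasNext()/next()/close() loop and the effect of CLICKED_AND_NON_CLICKED_ELEMENT can be pictured with a stand-alone sketch. FormIteratorSketch is hypothetical: it assumes that the iterations are the combinations of all configured field values, it returns each combination as a plain object instead of a result page, and the constant values are invented for the example.

```javascript
// Hypothetical stand-ins for the ITPilot constants (assumed values).
const CLICKED_ELEMENT = "CLICKED";
const NON_CLICKED_ELEMENT = "NON_CLICKED";
const CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Hypothetical stand-in for Form_Iterator; no browser or NSEQL is involved.
class FormIteratorSketch {
  constructor() {
    this.fields = [];              // one { field, values } entry per form field
    this.maxIterations = Infinity;
    this.cursor = 0;
    this.closed = false;
  }
  input(field, position, values) {
    this.fields.push({ field, values });
  }
  click(field, value, state) {
    // "Both" expands into two alternatives for this field.
    const values =
      state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
        : [state];
    this.fields.push({ field, values });
  }
  setMaxIterations(count) {
    this.maxIterations = count;
  }
  total() {
    const combos = this.fields.reduce((n, f) => n * f.values.length, 1);
    return Math.min(combos, this.maxIterations);
  }
  hasNext() {
    return !this.closed && this.cursor < this.total();
  }
  next() {
    if (!this.hasNext()) {
      throw new Error("No more iterations");
    }
    // Decode the cursor as a mixed-radix number, one digit per field.
    let rest = this.cursor++;
    const combo = {};
    for (let i = this.fields.length - 1; i >= 0; i--) {
      const { field, values } = this.fields[i];
      combo[field] = values[rest % values.length];
      rest = Math.floor(rest / values.length);
    }
    return combo;
  }
  close() {
    this.closed = true;
  }
}

const iterator = new FormIteratorSketch();
iterator.input("city", 0, ["Madrid", "Paris"]);
iterator.click("onlyOffers", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);

const combos = [];
while (iterator.hasNext()) {
  combos.push(iterator.next());
}
iterator.close();
console.log(combos);
```

With two input values and a checkbox iterated in both states, the sketch produces four combinations. Calling setMaxIterations(count) before the loop would cap that number.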
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser ID.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored "cookies" information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run.
    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
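The effect of the three obl values can be sketched outside ITPilot with a small stand-in. InitSketch and its resolve() helper are hypothetical names invented for this illustration; they are not the ITPilot Init object, and the rule that a FIXED value overrides anything supplied in the query is an assumption.

```javascript
// Stand-in for illustration only: inside ITPilot the real Init object and the
// OPTIONAL / OBLIGATORY / FIXED constants are provided by the runtime.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitSketch {
  constructor() {
    this.fields = [];
  }

  // Mirrors the documented setText(field, obl, fixedValue) signature.
  setText(field, obl = OPTIONAL, fixedValue = undefined) {
    this.fields.push({ field, type: "text", obl, fixedValue });
  }

  // Mirrors the documented setInt(field, obl, fixedValue) signature.
  setInt(field, obl = OPTIONAL, fixedValue = undefined) {
    this.fields.push({ field, type: "int", obl, fixedValue });
  }

  // Hypothetical helper (not an ITPilot function): builds the parameter
  // record for one query according to each field's obl value.
  resolve(queryParams) {
    const record = {};
    for (const f of this.fields) {
      if (f.obl === FIXED) {
        record[f.field] = f.fixedValue; // assumption: the fixed value always wins
      } else if (Object.prototype.hasOwnProperty.call(queryParams, f.field)) {
        record[f.field] = queryParams[f.field];
      } else if (f.obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + f.field);
      } else {
        record[f.field] = null; // optional and not supplied
      }
    }
    return record;
  }
}

const init = new InitSketch();
init.setText("DIRECTOR", OBLIGATORY);
init.setInt("YEAR"); // obl defaults to OPTIONAL
init.setText("LANGUAGE", FIXED, "en");

const params = init.resolve({ DIRECTOR: "Woody Allen" });
console.log(params);
```

In a real wrapper only the set* calls and exec() would appear; the validation itself is carried out by ITPilot.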
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute("_functionIterator_1", structureInstance, recordInstance);
    }
Figure 5. Using threads in the Iterator component
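The hasNext()/next() protocol can be tried outside ITPilot with a minimal stand-in. IteratorSketch is a hypothetical class written for this example, and the sample records are made up; the real Iterator object is supplied by the ITPilot runtime.

```javascript
// Stand-in exposing the two documented methods; not the ITPilot implementation.
class IteratorSketch {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }

  // True while at least one more record remains.
  hasNext() {
    return this.position < this.list.length;
  }

  // Returns the next record, in list order.
  next() {
    return this.list[this.position++];
  }
}

// Made-up sample records.
const records = [{ TITLE: "Manhattan" }, { TITLE: "Zelig" }];
const iterator = new IteratorSketch(records);

const titles = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(", "));
```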
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6. Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages, using either a single browsing sequence or several different ones.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result after the component has been constructed.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7. Using the Repeat function
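The practical difference between the while structure of the Loop component (section 5.3.19) and the do… while structure of Repeat is that the Repeat body always runs at least once. The sketch below shows this with a stand-in: ConditionSketch is a hypothetical class that evaluates a plain JavaScript predicate, whereas the real Condition object evaluates an ITPilot expression.

```javascript
// Stand-in for a Condition object: exec() simply evaluates a predicate.
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(_inputs) {
    return this.predicate();
  }
}

// Loop pattern: the condition is checked before the body runs.
let loopRuns = 0;
const loopCondition = new ConditionSketch(() => loopRuns < 0); // false from the start
while (loopCondition.exec([])) {
  loopRuns++;
}

// Repeat pattern: the body runs once before the condition is checked.
let repeatRuns = 0;
const repeatCondition = new ConditionSketch(() => repeatRuns < 0); // also always false
do {
  repeatRuns++;
} while (repeatCondition.exec([]));

console.log("Loop body runs:", loopRuns, "Repeat body runs:", repeatRuns);
```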
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component’s browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
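As a minimal sketch of the functions listed above, the following file body defines a component named “mycustom” that returns a string. It makes several assumptions: the input is treated as a plain string, the STRING_TYPE constant is declared locally with the value given in the list (inside ITPilot it is assumed to be predefined), and mycustom_getInputStructure is left out because its body is not specified in this section. mycustom_getOutputStructure is not needed here, since the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Type code copied from the list of possible output types.
// Assumption: inside ITPilot this constant is already defined.
const STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component.
// Assumption: the input arrives as a plain string.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  // Illustrative transformation: trim whitespace and convert to upper case.
  mycustom_output = String(mycustom_input).trim().toUpperCase();
  return mycustom_output;
}

// Declares that the component returns a string value.
function mycustom_getOutputType() {
  return STRING_TYPE;
}

console.log(mycustom_getOutputType(), mycustom_main("  acme  "));
```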
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8. Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: the VQL statement is written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are used inside the code, they must be escaped with the ‘’ character.
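The quoting rule in the note can be sketched as a small helper that escapes the delimiter inside the script before embedding it in the statement. This is only an illustration: escapeQuotes and buildCreateWrapper are hypothetical names, and the quote and escape characters are passed in as parameters because this sketch does not assume which characters VQL actually uses. The single quote and backslash in the usage lines are placeholder values.

```javascript
// Prefixes every occurrence of quoteChar in jscode with escapeChar.
// Both characters are parameters: the concrete VQL characters are not assumed here.
function escapeQuotes(jscode, quoteChar, escapeChar) {
  return jscode.split(quoteChar).join(escapeChar + quoteChar);
}

// Assembles the statement in the form shown above, with the escaped
// script delimited by quoteChar. The optional MAINTENANCE clause is omitted.
function buildCreateWrapper(name, jscode, quoteChar, escapeChar) {
  const body = escapeQuotes(jscode, quoteChar, escapeChar);
  return "CREATE WRAPPER ITP " + name + " " + quoteChar + body + quoteChar;
}

// Placeholder delimiter (') and escape character (\), chosen for illustration only.
const statement = buildCreateWrapper("movies", "var d = 'x';", "'", "\\");
console.log(statement);
```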
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get Wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ": ");

                    // Get and print atomic fields: TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);
                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();
                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);
                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\tDESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }
Figure 1. Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. They are Java classes packaged in a Jar file and added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions only, without dependencies on Denodo libraries, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                    ITPilot
java.lang.Integer       int
java.lang.Long          long
java.lang.Float         float
java.lang.Double        double
java.lang.Boolean       boolean
java.lang.String        text
java.util.Calendar      date
byte[]                  binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be primitive Java types (int, long, double, etc.); the object types listed in the table must be used instead.
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction"
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional attribute, syntax, which specifies the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
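The naming-convention rule described earlier in this section (class name to function name, Java parameter types to ITPilot types, parameters named arg1, arg2, etc.) can be pictured with a small sketch. It is written in JavaScript purely as an illustration and is not part of ITPilot: deriveSignature, SUFFIX and TYPE_MAP are hypothetical names, the type equivalences are taken from the table in section 4, and the upper-casing of the function name is an assumption based on the CONCAT_SAMPLE example. Returning null for an unrecognized class or type is also just a choice of this sketch.

```javascript
// Hypothetical helper, NOT an ITPilot API: mimics how a function signature
// could be derived from a class that follows the naming convention.
var SUFFIX = "ItpFunction";

// Java type -> ITPilot type, copied from the equivalency table in section 4.
var TYPE_MAP = {
    "java.lang.Integer": "int",
    "java.lang.Long": "long",
    "java.lang.Float": "float",
    "java.lang.Double": "double",
    "java.lang.Boolean": "boolean",
    "java.lang.String": "text",
    "java.util.Calendar": "date",
    "byte[]": "binary"
};

function deriveSignature(className, javaParamTypes) {
    // The class name must end with the case-sensitive suffix "ItpFunction".
    var cut = className.length - SUFFIX.length;
    if (cut <= 0 || className.substring(cut) !== SUFFIX) {
        return null; // not recognized as a custom function by name
    }
    var args = [];
    for (var i = 0; i < javaParamTypes.length; i++) {
        var javaType = javaParamTypes[i];
        if (!Object.prototype.hasOwnProperty.call(TYPE_MAP, javaType)) {
            return null; // e.g. a primitive such as "int" is not in the table
        }
        // Without annotations the parameters are named arg1, arg2, ...
        args.push("arg" + (i + 1) + ":" + TYPE_MAP[javaType]);
    }
    // Upper-casing follows the CONCAT_SAMPLE example (assumption).
    return className.substring(0, cut).toUpperCase() + "(" + args.join(", ") + ")";
}

var signature = deriveSignature("Concat_SampleItpFunction",
                                ["java.lang.String", "java.lang.String"]);
```

With the class and parameter types of the example above, signature holds the same text as the signature quoted in this section.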
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method in order to compute the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or equivalent type as its respective parameter in the execute method.

The type returned by the additional method depends on what the execute method returns:
• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return parameters will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            // Build one record per substring and collect the records in the array
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            // The return type is an array of records with a single text field
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that creates wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot lets users write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA-262 standard, 3rd edition [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: it is the only mandatory one and contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).
5.2.2 Main Function
The main() function is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
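The rule about omitted parameters follows from the fact that JavaScript binds arguments by position. The minimal sketch below uses a hypothetical stand-in, setTextArgs, rather than the real setText function of Record_Structure, and plain strings as stand-ins for the OPTIONAL and OBLIGATORY constants. It only assumes the documented defaults (an empty regular expression and an optional field) and shows how an argument that skips the middle position ends up in the wrong parameter.

```javascript
// Stand-ins for illustration only; inside ITPilot these names are provided
// by the runtime and this sketch does not reproduce their real values.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical function with the same parameter order as setText(field, regexp, type).
function setTextArgs(field, regexp, type) {
    if (regexp === undefined) { regexp = ""; }       // documented default
    if (type === undefined) { type = OPTIONAL; }     // documented default
    return { field: field, regexp: regexp, type: type };
}

// Only trailing parameters are omitted: both defaults apply.
var defaults = setTextArgs("FIELD");

// The middle parameter is given explicitly so that the third one can be set.
var full = setTextArgs("FIELD", "", OBLIGATORY);

// Skipping the middle parameter: OBLIGATORY is taken as the regular expression.
var wrong = setTextArgs("FIELD", OBLIGATORY);
```

In the last call the field stays optional and the constant is stored as the regular expression, which is why the middle argument has to be written out whenever a later one is needed.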
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o toString(): transforms the record into a string of characters for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
    o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an http error.
        • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for http browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, or some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured on each component (for example, with setRetries and setRetryDelay, as used in Figure 4).
        • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error is still happening after the retries.
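The four error actions can be compared side by side with a small simulation. The sketch below is not ITPilot code: runWithPolicy is a hypothetical helper, the action constants are redefined here as plain strings so that the example runs on its own, and retries plays the role of the per-component retry count. It also assumes that ON_ERROR_RETRY raises the last error once the retries are exhausted, which the text above implies but does not state.

```javascript
// Stand-ins for illustration only; inside ITPilot these constants are
// provided by the runtime.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical helper: runs task() under the given error action.
function runWithPolicy(task, errorAction, retries) {
    var retrying = (errorAction === ON_ERROR_RETRY ||
                    errorAction === ON_ERROR_RETRY_IGNORE);
    var attempts = retrying ? retries + 1 : 1; // first run plus the retries
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return task();
        } catch (e) {
            lastError = e; // remember the error and try again, if allowed
        }
    }
    if (errorAction === ON_ERROR_IGNORE ||
        errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // the run continues with a "null" result
    }
    // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted (assumption).
    throw lastError;
}

// A task that fails twice and then succeeds.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) { throw new Error("temporary failure"); }
    return "ok";
}

var retried = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("x"); }, ON_ERROR_IGNORE, 0);
```

Here retried receives the value of the third attempt, while ignored is null, matching the "null" return described for ON_ERROR_IGNORE.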
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: defines the condition expression. It is expressed as a string of characters. E.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
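The way the placeholders in the condition expression refer to the elements passed to exec can be illustrated with a very small stand-in. The function below is hypothetical and is not ITPilot's expression engine: it understands only the single form "$i <= $j" used in the example above, and exists just to show that $0 is the first element of the array, $1 the second, and so on.

```javascript
// Hypothetical evaluator for illustration only. It accepts "$i <= $j",
// optionally wrapped in one pair of parentheses, and nothing else.
function evalLessOrEqual(expr, elements) {
    var m = /^\(?\s*\$(\d+)\s*<=\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!m) {
        throw new Error("Unsupported expression: " + expr);
    }
    // $n is replaced by the element at position n of the exec() argument.
    var left = elements[parseInt(m[1], 10)];
    var right = elements[parseInt(m[2], 10)];
    return left <= right;
}

var met = evalLessOrEqual("($0 <= $1)", [3, 5]);      // compares 3 with 5
var notMet = evalLessOrEqual("($0 <= $1)", [7, 5]);   // compares 7 with 5
```

Swapping the placeholders, as in "($1 <= $0)", swaps which element is taken from the array, without changing the array itself.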
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in terms of the retrieved HTML code.
• Functions:
    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
        • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
    o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
        • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
    o setIgnoreTagAttributes(simplifyTags): sets whether the component takes the HTML tag attributes into account when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false", they will not be ignored.
    o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
    o setShowRemovedContent(mergedDeletions): sets whether the deleted content is shown in the result page or not.
        • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.
    o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.
        • regexp: Perl [PERL] regular expression.
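To make the role of the prefix and suffix labels and of the token separator more concrete, the following sketch marks additions and deletions between two small HTML fragments. It is a stand-in and not the algorithm used by the Diff component: tokenize and markDiff are hypothetical names, the tokenization rule (one token per tag or per run of non-space text) is an assumption, and the comparison is a plain longest-common-subsequence walk over the tokens.

```javascript
// Stand-in sketch, NOT ITPilot's Diff implementation.
function tokenize(html) {
    // Assumption: a tag is one token; any other run of non-space text is one token.
    return html.match(/<[^>]+>|[^<\s]+/g) || [];
}

function markDiff(baseCode, finalCode, labels) {
    var a = tokenize(baseCode);
    var b = tokenize(finalCode);
    var n = a.length, m = b.length, i, j;

    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs[i] = [];
        for (j = 0; j <= m; j++) { lcs[i][j] = 0; }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = (a[i] === b[j])
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    // Walk both token lists, wrapping removed and added tokens in their labels.
    var out = [];
    i = 0; j = 0;
    while (i < n || j < m) {
        if (i < n && j < m && a[i] === b[j]) {
            out.push(a[i]); i++; j++;
        } else if (j >= m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix); i++;
        } else {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix); j++;
        }
    }
    return out.join(labels.tokenSeparator);
}

// Hypothetical labels chosen for readability of the example.
var labels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]",
    tokenSeparator: " "
};
var marked = markDiff("<p>old text</p>", "<p>new text</p>", labels);
```

In this sketch, unchanged tokens are copied as they are, while each removed or added token is wrapped in the corresponding pair of labels and all tokens are joined with the separator.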
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
    o Constructor(expression)
        • expression: object that defines the condition expression. This object is expressed as a string of characters. E.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists, that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
    o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. This is "true" by default.
    o setI18n(i18n): updates the internationalization used by the process.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
    o Constructor(url, sequenceType, reusableConnection, binary, page)
        • url: URL where the resource to be downloaded can be found (OPTIONAL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
        • page: optionally, the page from which the http request is launched can be indicated.
    o exec(page): runs the component, returning the string- or binary-type value obtained.
        • page: optionally, the page from which the http request is launched can be indicated.
    o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
        • encoding: MIME type of the information to send.
    o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
        • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence, defined with the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.
    o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the originating page, against which further information extraction operations are going to be executed.
        • back: back sequence NSEQL program.
    o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor POST navigation has been configured.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
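The interaction between syncWithPost, setBackSequence and setBackPages can be read as a decision order. The sketch below is plain JavaScript, not ITPilot code: chooseBackStrategy, its option names and the default of one page are assumptions made for illustration, and it only mirrors the rules stated in the function descriptions.

```javascript
// Illustrative only: chooseBackStrategy is a hypothetical helper, not part of the ITPilot API.
// It mirrors the order described for recovering the state of the source page.
function chooseBackStrategy(options) {
  if (options.syncWithPost) {
    // syncWithPost(true): resend the POST that was used to reach the page
    return { method: "POST", detail: "resend original POST parameters" };
  }
  if (options.backSequence) {
    // setBackSequence(back): run the explicit NSEQL back sequence
    return { method: "BACK_SEQUENCE", detail: options.backSequence };
  }
  // Neither is configured: fall back to the NSEQL Back() command, going back
  // the number of pages given to setBackPages (default of 1 is an assumption)
  var pages = options.backPages === undefined ? 1 : options.backPages;
  return { method: "BACK_COMMAND", detail: pages };
}

console.log(chooseBackStrategy({ syncWithPost: true }).method);
console.log(chooseBackStrategy({ backSequence: "<NSEQL back program>" }).method);
console.log(chooseBackStrategy({ backPages: 2 }).detail);
```

The same order applies to the other components that expose these functions (Form Iterator, Next Interval Iterator and Sequence).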
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
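The effect of ON_ERROR_IGNORE on a filtering operation can be sketched in plain JavaScript. This is not the Filter component itself: filterRecords is a hypothetical stand-in, the condition is an ordinary JavaScript function rather than an ITPilot expression, and the constant values are placeholders assumed for this sketch.

```javascript
// Illustrative only: filterRecords is a hypothetical stand-in for the Filter component.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // placeholder values, assumed for this sketch
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, condition, errorAction) {
  var result = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (condition(inputRecords[i])) {
        result.push(inputRecords[i]);
      }
    } catch (e) {
      if (errorAction !== ON_ERROR_IGNORE) {
        throw e; // any other setting: stop and report the error
      }
      // ON_ERROR_IGNORE: leave out the record that caused the error and continue
    }
  }
  return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
function hasValidPrice(record) {
  if (record.PRICE === null) {
    throw new Error("PRICE is missing");
  }
  return record.PRICE > 5;
}

var kept = filterRecords(records, hasValidPrice, ON_ERROR_IGNORE);
console.log(kept.length); // 2
```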
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each element of values in the event of replicated values.
    • equal: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be added to the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be added to the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the new iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the page state, a POST message is sent to the page URL with the POST parameters originally used to reach it. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST navigation has been configured.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
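The way CLICKED_AND_NON_CLICKED_ELEMENT multiplies the iterations can be sketched in plain JavaScript. This is only a sketch under stated assumptions: expandClickStates is a hypothetical helper, the constant values are placeholders, and combining several elements as a Cartesian product is an assumption that this guide does not confirm.

```javascript
// Illustrative only: the constant values and expandClickStates are assumptions,
// not the ITPilot implementation.
var CLICKED_ELEMENT = "CLICKED";
var NON_CLICKED_ELEMENT = "NON_CLICKED";
var CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Returns every combination of marked (true) / unmarked (false) states
// described by a clickedArray.
function expandClickStates(clickedArray) {
  var combos = [[]];
  clickedArray.forEach(function (state) {
    var choices = state === CLICKED_AND_NON_CLICKED_ELEMENT
      ? [true, false]                 // two combinations for this element
      : [state === CLICKED_ELEMENT];  // a single fixed state
    var next = [];
    combos.forEach(function (combo) {
      choices.forEach(function (choice) {
        next.push(combo.concat([choice]));
      });
    });
    combos = next;
  });
  return combos;
}

var combos = expandClickStates([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT
]);
console.log(JSON.stringify(combos)); // [[true,true,false],[true,false,false]]
```

Under this reading, each element flagged with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of form submissions, which is worth keeping in mind together with setMaxIterations.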
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL the page comes from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and is used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it is filled in it becomes compulsory; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): function that updates the process internationalization.
    • i18n: type of internationalization to use. ITPilot provides different internationalization configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
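The three values of the obl parameter can be illustrated with a small stand-alone sketch. It does not use the Init object: resolveParameters, the field descriptors and the constant values are hypothetical, and the rule that a FIXED field always keeps its fixedValue regardless of the query is an assumption.

```javascript
// Illustrative only: resolveParameters is hypothetical and the constant values
// are placeholders; this is not how ITPilot implements Init.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function resolveParameters(fields, query) {
  var record = {};
  fields.forEach(function (f) {
    var obl = f.obl || OPTIONAL; // OPTIONAL is the default value
    if (obl === FIXED) {
      record[f.name] = f.fixedValue; // constant value (assumed to ignore the query)
    } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
      record[f.name] = query[f.name];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + f.name);
    }
    // OPTIONAL and not supplied: the field is simply left without a value
  });
  return record;
}

var fields = [
  { name: "KEYWORD", obl: OBLIGATORY },
  { name: "MAX_RESULTS" },
  { name: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];

var params = resolveParameters(fields, { KEYWORD: "denodo" });
console.log(JSON.stringify(params)); // {"KEYWORD":"denodo","LANGUAGE":"en"}
```

In a real wrapper the equivalent declarations would be made with the setText, setInt and related functions, and the resulting record would be returned by exec().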
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
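The hasNext()/next() protocol can be tried outside ITPilot with a minimal stand-in. ListIterator below is a hypothetical class written only for this sketch; it imitates the two functions documented for Iterator and leaves out the Thread-based parallel execution.

```javascript
// Illustrative only: ListIterator is a hypothetical stand-in for the ITPilot
// Iterator component, reduced to the hasNext()/next() protocol.
function ListIterator(list) {
  this.list = list;
  this.index = 0;
}
ListIterator.prototype.hasNext = function () {
  return this.index < this.list.length;
};
ListIterator.prototype.next = function () {
  return this.list[this.index++];
};

var iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next(); // records are returned in list order
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(",")); // a,b,c
```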
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available through JDBC and returns a list of records with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6 Using the Loop function
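The control flow of the Loop component can be exercised in plain JavaScript with a stand-in for Condition. FakeCondition is hypothetical: it wraps an ordinary JavaScript function instead of an ITPilot condition expression and only provides the exec() call used by the loop.

```javascript
// Illustrative only: FakeCondition is a hypothetical stand-in for the ITPilot
// Condition object; it evaluates a plain JavaScript function.
function FakeCondition(test) {
  this.test = test;
}
FakeCondition.prototype.exec = function (records) {
  return this.test(records);
};

var page = 1;
var visited = [];
var loop = new FakeCondition(function () { return page <= 3; });

// WHILE... DO: the condition is checked before every pass
while (loop.exec([])) {
  visited.push(page);
  page++;
}
console.log(visited.join(",")); // 1,2,3
```

Because the condition is evaluated first, the body is never run when the condition is false from the start.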
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the page state, a POST message is sent to the page URL with the POST parameters originally used to reach it. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST navigation has been configured.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
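The rule given for the sequences parameter (a single sequence is used in every iteration, several sequences are used one per iteration) can be written down as a small helper. This is a sketch, not ITPilot code: sequenceForIteration is hypothetical and the strings stand in for real NSEQL programs.

```javascript
// Illustrative only: sequenceForIteration is a hypothetical helper that mirrors
// the rule described for the sequences parameter.
function sequenceForIteration(sequences, iterationIndex) {
  if (sequences.length === 1) {
    return sequences[0]; // a single sequence is reused in all iterations
  }
  // several sequences: one per iteration (undefined once they run out)
  return sequences[iterationIndex];
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];

console.log(sequenceForIteration(single, 4));  // <NSEQL next-page sequence>
console.log(sequenceForIteration(several, 1)); // <NSEQL sequence B>
```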
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
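The difference between the two errorAction values can be sketched outside ITPilot. Everything in the block is hypothetical: buildRecord stands in for the component, expressions are plain JavaScript functions rather than ITPilot expression strings, the constant values are placeholders, and leaving a failed field out of the record under ON_ERROR_IGNORE is an assumption.

```javascript
// Illustrative only: buildRecord is a hypothetical stand-in for Record Constructor.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // placeholder values, assumed for this sketch
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function buildRecord(inputs, fieldDefs) {
  var record = {};
  fieldDefs.forEach(function (def) {
    try {
      record[def.fieldName] = def.expression(inputs);
    } catch (e) {
      if (def.errorAction === ON_ERROR_RAISE) {
        // stop the run, indicating the source of the error
        throw new Error("Field " + def.fieldName + ": " + e.message);
      }
      // ON_ERROR_IGNORE: continue; the field is left out (assumption)
    }
  });
  return record;
}

var inputs = [{ PARAM1: "x" }]; // only one input record is available
function firstParam(recs) { return recs[0].PARAM1; }
function secondPrice(recs) { return recs[1].PRICE; } // fails: there is no second record

var tolerant = buildRecord(inputs, [
  { fieldName: "NAME", expression: firstParam, errorAction: ON_ERROR_RAISE },
  { fieldName: "PRICE", expression: secondPrice, errorAction: ON_ERROR_IGNORE }
]);
console.log(JSON.stringify(tolerant)); // {"NAME":"x"}
```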
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information. In general this value will be "true", although in some cases this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a Page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7 Using the Repeat function
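The do… while structure of Figure 7 can be tried in plain JavaScript with a stand-in for Condition. FakeCondition is hypothetical and evaluates an ordinary JavaScript function; the sketch only shows the control flow of the generated code, in particular that the loop body runs before the condition is evaluated for the first time.

```javascript
// Illustrative only: FakeCondition is a hypothetical stand-in for the ITPilot
// Condition object; it evaluates a plain JavaScript function.
function FakeCondition(test) {
  this.test = test;
}
FakeCondition.prototype.exec = function (records) {
  return this.test(records);
};

var attempts = 0;
var repeat = new FakeCondition(function () { return attempts < 3; });
do {
  attempts++; // loop operations
} while (repeat.exec([]));
console.log(attempts); // 3

// The body runs once even if the condition never holds
var runs = 0;
var never = new FakeCondition(function () { return false; });
do {
  runs++;
} while (never.exec([]));
console.log(runs); // 1
```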
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the wrapper generation graphic interface, it becomes a JavaScript function that is invoked from its position within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the browsing sequence of the component is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the page state, a POST message is sent to the page URL with the POST parameters originally used to reach it. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
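The order in which the Sequence component chooses how to return to the source page, as described for syncWithPost, setBackSequence and setBackPages, can be sketched as follows. This is only an illustration of that precedence: chooseBackStrategy and the shape of its settings argument are hypothetical and are not part of the ITPilot API, and the default of one page for Back() is an assumption.

```javascript
// Hypothetical helper, not part of the ITPilot API. It mirrors the precedence
// described in the text: POST synchronization first, then an explicit back
// sequence, and finally the NSEQL Back() command.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        return { kind: "POST" };
    }
    if (settings.backSequence) {
        return { kind: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Assumption: one page back when setBackPages has not been called.
    return { kind: "NSEQL_BACK", pages: settings.backPages || 1 };
}
```

For instance, a component configured with syncWithPost(false) and no back sequence would fall through to the NSEQL_BACK branch, where the value given to setBackPages applies.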
5.3.28 Store File
• Object: StoreFile
• Description: Stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: Represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): makes the thread wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
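A minimal sketch of such a file, for a hypothetical component named trimtitle, could look as follows. The component name, the TITLE field and the trimming logic are invented for illustration. The Record_Structure defined here is a stand-in so that the sketch runs outside ITPilot, and treating the input as a plain string is an assumption, since the way the input value is read is not described here. Because the output type is STRING_TYPE, no trimtitle_getOutputStructure function is needed.

```javascript
// Type code as listed for the getOutputType function.
var STRING_TYPE = 13;

// Hypothetical stand-in for ITPilot's Record_Structure (only setText is modelled).
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field) {
    this.fields.push(field);
};

// Main function: removes leading and trailing whitespace from the input.
function trimtitle_main(trimtitle_input) {
    var trimtitle_output = null;
    // Assumption: the input is handled here as a plain string.
    if (trimtitle_input !== null && trimtitle_input !== undefined) {
        trimtitle_output = String(trimtitle_input).replace(/^\s+|\s+$/g, "");
    }
    return trimtitle_output;
}

// Input schema: a single text field.
function trimtitle_getInputStructure() {
    var input = new Record_Structure("TRIMTITLE_INPUT");
    input.setText("TITLE");
    return input;
}

// Output type: a simple string, so no output structure function is required.
function trimtitle_getOutputType() {
    return STRING_TYPE;
}
```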
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been written.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
        SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
        String director = (String) directorVO.getValue();
        System.out.println("DIRECTOR " + director);

        // Get EDITIONS array
        ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

        // Iterate over EDITION registers
        int numEditions = 0;
        Iterator editions = editionsVO.getValues().iterator();
        while (editions.hasNext()) {
            numEditions++;
            System.out.println("EDITION " + numEditions);
            RegisterVO editionVO = (RegisterVO) editions.next();
            Map edition = editionVO.getValues();
            SimpleVO formatVO = (SimpleVO) editionVO.get("FORMAT");
            String format = (String) formatVO.getValue();
            System.out.println("\t FORMAT " + format);
            DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
            Double price = priceVO.getDouble();
            System.out.println("\t PRICE " + price);
            SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
            String description = (String) descriptionVO.getValue();
            System.out.println("\tDESCRIPTION " + description);
            System.out.println();
        }

        // Check errors
        if (results.checkErrors()) {
            System.out.println("Error " + results.getErrorDescription());
        }
    } catch (Exception e) {
        System.err.println("Error trying to access server ");
    } finally {
    }
Figure 1 Example of query execution to a wrapper
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions (including custom ITPilot functions), is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without any dependency on Denodo libraries by following naming conventions, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:
    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All names used in the naming conventions are case sensitive.
To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + “ItpFunction”
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g., the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation that gives the parameter a name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
The return type of the additional method depends on what the execute method returns:
• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 to find the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.
    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }
Figure 2 ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables wrapper programs for accessing semi-structured sources (web, Adobe PDF or Microsoft Word) to be created without any programming, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }
    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }
    function main() {
    }
Figure 3 ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one. It contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see footnote 1).
These functions correspond to components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.
Footnote 1: Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText(FIELD) is equivalent to rs.setText(FIELD, "", OPTIONAL). rs.setText(FIELD, OBLIGATORY) is not valid; the following must be used instead: rs.setText(FIELD, "", OBLIGATORY).
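The rule about omitting trailing parameters can be illustrated with a small sketch that runs outside ITPilot. The Record_Structure below is a stand-in written for this example (an assumption, not the ITPilot implementation): it only models setText(field, regexp, type) with the defaults described in the note, and the values of OPTIONAL and OBLIGATORY are placeholders.

```javascript
// Placeholder values; inside ITPilot these constants are provided by the environment.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in that models only the default handling of setText.
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        name: field,
        regexp: regexp === undefined ? "" : regexp,   // default: no constraint
        type: type === undefined ? OPTIONAL : type    // default: optional field
    });
};

var rs = new Record_Structure("OUT_REC");
rs.setText("TITLE");                     // regexp and type both take their defaults
rs.setText("YEAR", "", OPTIONAL);        // the same defaults, written out
rs.setText("DIRECTOR", "", OBLIGATORY);  // regexp must be given in order to set type
```

Arguments are positional, so a call such as rs.setText("DIRECTOR", OBLIGATORY) would place OBLIGATORY in the regexp position, which is why the note marks that form as not valid.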
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: Represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression that constrains the character string. By default, if no constraint exists, its value is “”.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression that constrains the character string. By default, if no constraint exists, its value is “”.
    • format: (optional) date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type: (optional) defines whether the field is mandatory or not. By default the field is optional.
  o toString(): converts the record into a character string for representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
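As a sketch of how setArray nests one structure inside another, the following getOutputSchema describes a record with a DIRECTOR field and an EDITIONS array, using the field names read by the Java client in Figure 1. The Record_Structure defined here is a stand-in so that the sketch runs outside ITPilot (an assumption; only the calls used are modelled), and the structure names FILM and EDITION are hypothetical.

```javascript
// Hypothetical stand-in for Record_Structure; only the calls used below are modelled.
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field) {
    this.fields.push({ name: field, kind: "text" });
};
Record_Structure.prototype.setDouble = function (field) {
    this.fields.push({ name: field, kind: "double" });
};
Record_Structure.prototype.setArray = function (name, structure) {
    this.fields.push({ name: name, kind: "array", structure: structure });
};

function getOutputSchema() {
    // Structure of each element of the EDITIONS array.
    var edition = new Record_Structure("EDITION");
    edition.setText("FORMAT");
    edition.setDouble("PRICE");
    edition.setText("DESCRIPTION");

    // Top-level output record.
    var film = new Record_Structure("FILM");
    film.setText("DIRECTOR");
    film.setArray("EDITIONS", edition);
    return film;
}
```

The inner structure is built first and then passed as the second argument of setArray, so each element of EDITIONS follows the EDITION structure.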
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): converts the list into a JavaScript object array.
533 Common functions
Some of these functions are common to all or almost all components and are therefore shown in this first section The catalog explains the components that do not contain some of the ldquocommonrdquo functions
5.3.3.1 onError function
• onError(errorId, errorAction): defines how the component behaves when an error of a given type occurs. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error raised while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is a connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" if an error occurs, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return "null", but it is evaluated as "false" when they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configurable parameters.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
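The four error actions can be read as a small retry policy. The following is a stand-alone sketch in plain JavaScript, not ITPilot code: runWithPolicy, its retries argument and the string values of the constants are hypothetical names introduced only to show how the actions differ. It also assumes that ON_ERROR_RETRY raises the error once the retries are exhausted, which the text above implies but does not state.

```javascript
// Hypothetical stand-ins for the ITPilot constants, so the sketch runs anywhere.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs `step` and applies `action` if it throws.
// `retries` is the number of additional attempts made by the retry actions.
function runWithPolicy(step, action, retries) {
    var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null; // continue the wrapper run with a "null" result
    }
    throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted (assumption)
}

// A step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
    return "page";
}
var retried = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("HTTP_ERROR"); }, ON_ERROR_IGNORE, 0);
```

In this sketch the flaky step succeeds on its third attempt under ON_ERROR_RETRY, while the always-failing step yields "null" under ON_ERROR_IGNORE instead of stopping the run.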
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a string of characters. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
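The positional placeholders can be illustrated outside ITPilot with a small stand-in. The sketch below is not the Condition component: evalCondition is a hypothetical helper that understands only a single comparison between two placeholders, and it serves just to show how $0 and $1 map onto the elements list passed to exec.

```javascript
// Hypothetical helper: evaluates an expression of the form "($i OP $j)"
// against a list of elements, where $i refers to elements[i].
function evalCondition(expr, elements) {
    var m = /^\(?\s*\$(\d+)\s*(<=|>=|<|>|==|!=)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!m) { throw new Error("Unsupported expression: " + expr); }
    var left = elements[Number(m[1])];
    var right = elements[Number(m[3])];
    switch (m[2]) {
        case "<=": return left <= right;
        case ">=": return left >= right;
        case "<": return left < right;
        case ">": return left > right;
        case "==": return left === right;
        default: return left !== right; // "!="
    }
}

var met = evalCondition("($0 <= $1)", [3, 7]);    // first element is 3, second is 7
var notMet = evalCondition("($0 <= $1)", [9, 7]); // first element is 9, second is 7
```

The same expression gives a different result for each elements list, which is why the expression is fixed in the constructor while the elements are supplied on each exec call.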
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true" the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; those that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
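The roles of the prefix and suffix labels and of the token separator can be seen in a small stand-alone sketch. This is not the algorithm used by the Diff component, which this guide does not describe: markDiff and tokenize are hypothetical helpers, the comparison is a plain longest-common-subsequence diff over tokens, and the ins/del labels are placeholders for the default green and red background tags.

```javascript
// Splits `text` on `sep` and drops empty tokens.
function tokenize(text, sep) {
    return text.split(sep).filter(function (t) { return t.length > 0; });
}

// Token-level diff based on a longest-common-subsequence (LCS) table.
function markDiff(baseCode, finalCode, labels) {
    var a = tokenize(baseCode, labels.tokenSeparator);
    var b = tokenize(finalCode, labels.tokenSeparator);
    var i, j;
    // lcs[i][j] = length of the LCS of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs.push(new Array(b.length + 1).fill(0));
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = a[i] === b[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]); // unchanged token
            i++; j++;
        } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix); // new content
            j++;
        } else {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix); // deleted content
            i++;
        }
    }
    return out.join(labels.tokenSeparator);
}

var labels = {
    additionPrefix: "<ins>", additionSuffix: "</ins>",
    deletionPrefix: "<del>", deletionSuffix: "</del>",
    tokenSeparator: " "
};
var marked = markDiff("a b c", "a x c", labels);
```

Here the token "x" is wrapped in the addition labels and "b" in the deletion labels, while the tokens shared by both inputs pass through unchanged.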
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters (e.g. the condition MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.
  o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter, "true" if the pattern merge technique is to be applied or "false" if not. This is "true" by default.
  o setI18n(i18n): updates the internationalization of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" means the object is binary; "false" means the object to be downloaded is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method used to recover the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it does not exist, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit navigation sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
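The order in which the Fetch synchronization settings are consulted (syncWithPost, then setBackSequence, then Back() with setBackPages) can be summarized in a short sketch. It is only an illustration of the precedence described in this section: chooseBackStrategy, the shape of its config argument and the default of one page for Back() are assumptions, not part of the ITPilot API.

```javascript
// Hypothetical helper: decides how to return to the source page.
// config.syncWithPost: boolean flag, as passed to syncWithPost(flag)
// config.backSequence: NSEQL back program, as passed to setBackSequence(back)
// config.backPages:    number of pages, as passed to setBackPages(pages)
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-send the POST parameters originally used to access the page.
        return { method: "POST" };
    }
    if (config.backSequence) {
        // Run the explicit NSEQL back sequence.
        return { method: "SEQUENCE", program: config.backSequence };
    }
    // Fall back to the NSEQL Back() command (default of 1 page is an assumption).
    return { method: "BACK", pages: config.backPages || 1 };
}

var byPost = chooseBackStrategy({ syncWithPost: true, backSequence: "Back();" });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "Back();" });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Note that a configured back sequence is ignored while POST synchronization is enabled, and the page count only matters when neither of the other two options applies.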
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
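The behavior described in the note can be pictured with a plain JavaScript stand-in. The sketch below does not use the Filter component: filterIgnoringErrors and the price field are hypothetical, and the predicate throws on purpose for one record to show that only that record is left out.

```javascript
// Hypothetical stand-in for a filter whose error handler is ON_ERROR_IGNORE:
// a record whose evaluation fails is skipped, the rest are filtered normally.
function filterIgnoringErrors(inputRecords, predicate) {
    var result = [];
    inputRecords.forEach(function (record) {
        try {
            if (predicate(record)) { result.push(record); }
        } catch (e) {
            // Error ignored: drop only the record that caused it.
        }
    });
    return result;
}

var records = [{ price: 10 }, { price: null }, { price: 30 }, { price: 5 }];
var cheap = filterIgnoringErrors(records, function (r) {
    if (r.price === null) { throw new Error("RUNTIME_ERROR"); }
    return r.price <= 10;
});
```

The record with the null price raises an error and is dropped, the record priced 30 is rejected by the condition, and the remaining two records are returned.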
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: allows a run loop to be generated for a specific form, where predetermined values for each of its fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with "true" the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each value element in the event of replicated values.
    • equals: Boolean value that indicates whether the values provided to the function must exactly match the field values ("true") or may be contained in them ("false").
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with "true" the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was originally accessed. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
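The way per-field value lists, CLICKED_AND_NON_CLICKED_ELEMENT and setMaxIterations interact with the hasNext/next protocol can be sketched with a stand-alone object. This is not the Form_Iterator component: FormCombinations, its simplified input and click signatures and the numeric values of the constants are hypothetical, and the assumption that iterations cover every combination of the configured field values is an interpretation of the description above.

```javascript
// Hypothetical values for the constants named in this section.
var CLICKED_ELEMENT = 1;
var NON_CLICKED_ELEMENT = 0;
var CLICKED_AND_NON_CLICKED_ELEMENT = 2;

// Stand-in that enumerates one combination of field values per iteration.
function FormCombinations(maxIterations) {
    this.fields = [];        // one entry per configured field: { name, options }
    this.max = maxIterations;
    this.index = 0;
}
FormCombinations.prototype.input = function (field, values) {
    this.fields.push({ name: field, options: values });
};
FormCombinations.prototype.click = function (field, state) {
    // "Both" yields two options (marked and unmarked), otherwise a single one.
    var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [true, false]
        : [state === CLICKED_ELEMENT];
    this.fields.push({ name: field, options: options });
};
FormCombinations.prototype.total = function () {
    var n = this.fields.reduce(function (acc, f) { return acc * f.options.length; }, 1);
    return Math.min(n, this.max);
};
FormCombinations.prototype.hasNext = function () {
    return this.index < this.total();
};
FormCombinations.prototype.next = function () {
    var rest = this.index++;
    var combo = {};
    // Decode the running index as a mixed-radix number, last field varying fastest.
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var f = this.fields[i];
        combo[f.name] = f.options[rest % f.options.length];
        rest = Math.floor(rest / f.options.length);
    }
    return combo;
};

var form = new FormCombinations(10);
form.input("city", ["Madrid", "Paris"]);
form.click("newsletter", CLICKED_AND_NON_CLICKED_ELEMENT);
var combos = [];
while (form.hasNext()) {
    combos.push(form.next());
}
```

Two city values combined with a checkbox configured as both marked and unmarked give four iterations; lowering the maximum number of iterations truncates that sequence.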
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, which is the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as part of the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
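The difference between OPTIONAL, OBLIGATORY and FIXED fields can be illustrated with a stand-alone sketch. It is not the Init component: InitSketch is a hypothetical stand-in, its exec receives the query values explicitly (whereas the real exec() takes no arguments and reads them from the wrapper context), and returning null for a missing optional field is an assumption.

```javascript
// Hypothetical values for the constants named in this section.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Stand-in for an initialization record with text fields only.
function InitSketch() {
    this.fields = [];
}
InitSketch.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};
// Builds the initialization record from the values supplied in a query.
InitSketch.prototype.exec = function (query) {
    var record = {};
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;     // constant, whatever the query says
        } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
            record[f.name] = query[f.name];    // value supplied by the caller
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;             // optional and not supplied (assumption)
        }
    });
    return record;
};

var init = new InitSketch();
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");                 // defaults to OPTIONAL
init.setText("LANGUAGE", FIXED, "en");
var params = init.exec({ TITLE: "Dune", LANGUAGE: "fr" });
```

In this sketch a query that omits TITLE is rejected, AUTHOR may be left out, and LANGUAGE always takes its fixed value even if the query supplies a different one.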
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5. Using threads in the Iterator component
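For comparison with the parallel form, the sequential use of hasNext() and next() can be sketched as follows. The Iterator defined here is only a stand-in written for this example so that it runs outside ITPilot (the real object is supplied by the ITPilot runtime), and the record fields are hypothetical.

```javascript
// Stand-in for the ITPilot Iterator object, written only for this sketch.
// It mimics the three documented functions: Constructor(list), hasNext(), next().
function Iterator(list) {
    this.list = list;
    this.position = 0;
}
Iterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
Iterator.prototype.next = function () {
    var element = this.list[this.position];
    this.position = this.position + 1;
    return element;
};

// Hypothetical list of records (plain objects stand in for ITPilot records).
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];

var iterator = new Iterator(records);
var visited = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    visited.push(recordInstance.TITLE); // records arrive in list order
}
```

In this form each record is fully processed before the next one is requested, which is the behaviour to expect when the parallel option is not used.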
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: this component creates loops in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6. Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages, reached through one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record that is to be used as a wrapper result.
    • record: record to use.
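The two functions above can be exercised with a minimal sketch. The Output object defined here is a stand-in written for this example (it only collects the records it receives), and the structure and record contents are hypothetical plain objects, not ITPilot types.

```javascript
// Stand-in for the ITPilot Output object, written only for this sketch.
// It mimics the documented functions Constructor(structure) and add(record).
function Output(structure) {
    this.structure = structure;
    this.results = [];
}
Output.prototype.add = function (record) {
    this.results.push(record);
};

// Hypothetical structure and records; plain objects stand in for ITPilot types.
var structure = { name: "OUT_REC", fields: ["ATTRIBUTE_1"] };
var output = new Output(structure);
output.add({ ATTRIBUTE_1: "value A" });
output.add({ ATTRIBUTE_1: "value B" });
```

Each call to add() contributes one record to the wrapper result, so a wrapper typically calls it once per extracted record.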
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: this component creates loops in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7. Using the Repeat function
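The structural difference between the Loop skeleton (section 5.3.19) and the Repeat skeleton is the moment at which the Condition is evaluated: the while form tests it before the first pass, the do… while form after it. The sketch below illustrates this with a stand-in Condition whose exec() simply calls a JavaScript function; the stand-in and the counters are assumptions made for the example, not the ITPilot implementation.

```javascript
// Stand-in for the ITPilot Condition object, written only for this sketch:
// exec() evaluates a plain JavaScript predicate instead of an ITPilot expression.
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var alwaysFalse = function () { return false; };

// Loop skeleton: the condition is checked before the first pass.
var loopPasses = 0;
var loop = new Condition(alwaysFalse);
while (loop.exec([])) {
    loopPasses = loopPasses + 1;
}

// Repeat skeleton: the body runs once before the condition is checked.
var repeatPasses = 0;
var repeat = new Condition(alwaysFalse);
do {
    repeatPasses = repeatPasses + 1;
} while (repeat.exec([]));
```

With a condition that is false from the start, the Loop body never runs while the Repeat body runs exactly once, which is the practical reason to choose one component over the other.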
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a run thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
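The execute/wait pattern can be sketched as follows. The Thread object defined here is a stand-in written for this example: it runs each function immediately instead of concurrently, so it only shows how the arguments passed to execute() line up with the parameters of the target function. The function processRecord and the record fields are hypothetical.

```javascript
// Stand-in for the ITPilot Thread object, written only for this sketch.
// It runs each function immediately instead of concurrently, so wait()
// has nothing left to wait for.
function Thread() {
    this.launched = 0;
}
Thread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.launched = this.launched + 1;
    fn.apply(null, args);
};
Thread.prototype.wait = function () {};

var titles = [];

// Hypothetical per-record function; its parameters match the arguments
// passed to execute() after the function itself.
function processRecord(prefix, record) {
    titles.push(prefix + record.TITLE);
}

var records = [{ TITLE: "a" }, { TITLE: "b" }];
var thread = new Thread();
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, "item-", records[i]);
}
thread.wait(); // in ITPilot, this stands by until all launched executions finish
```

In a real wrapper the executions may finish in any order, so code placed after wait() should not rely on the order in which results were produced.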
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
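As an illustration of the functions listed above, the following is a minimal sketch of a custom component file. The component name “mycustom” follows the naming used in this section; the upper-casing logic and the treatment of the input as a plain string are assumptions made for the example. STRING_TYPE is declared locally, with the value from the list above, only so that the sketch runs on its own. mycustom_getInputStructure is left out because its body depends on the record structure objects, and mycustom_getOutputStructure is left out because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Output type code copied from the list in this section. It is declared
// here only so the sketch runs alone; this is an assumption of the example.
var STRING_TYPE = 13;

// Main function of a hypothetical component named "mycustom".
// It upper-cases its input, which this sketch treats as a plain string.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        mycustom_output = String(mycustom_input).toUpperCase();
    }
    return mycustom_output;
}

// Declares that the component returns a string.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Called as mycustom_main("abc"), the sketch returns "ABC"; with a null input it returns null, mirroring the initial value of mycustom_output in the skeleton.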
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8. Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
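The try… finally structure matters because SCOPE.close() must run even if the component fails. The sketch below illustrates this with stand-ins for SCOPE and CUSTOM_COMPONENT written only for this example (the real objects are provided by the ITPilot runtime); the component type and name are hypothetical, and exec() is made to fail on purpose.

```javascript
// Stand-ins for the ITPilot SCOPE and CUSTOM_COMPONENT objects, written only
// for this sketch; the real ones are provided by the ITPilot runtime.
var SCOPE = {
    open: 0,
    create: function () { this.open = this.open + 1; },
    close: function () { this.open = this.open - 1; }
};
function CUSTOM_COMPONENT(type) {
    this.type = type;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    throw new Error("simulated failure in " + this.name);
};

var failure = null;
try {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("mycustom"); // hypothetical type
        mycustom.setComponentName("mycustom_1");         // hypothetical name
        mycustom.exec("input");
    } finally {
        SCOPE.close(); // runs even though exec() threw
    }
} catch (e) {
    failure = e.message;
}
```

After the block runs, the scope counter is back to zero although exec() raised an error, which is the guarantee the finally clause provides.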
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
4 CREATING CUSTOM ITPILOT FUNCTIONS
Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes included in a Jar file that is added to ITPilot, so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.
Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.
Each function must be in a different Java class, but it is possible to group them in a single Jar. We recommend developing custom functions using Java annotations, although it is also possible to do it using naming conventions. Although custom functions can be created without dependencies on Denodo libraries, the use of Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in
$DENODO_HOME/lib/contrib/denodo-custom.jar
These are the rules that every custom function must follow to work properly:
• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.
A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, i.e. different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, and one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).
When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:
Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary
Equivalency between Java and ITPilot data types
Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although the use of annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + “ItpFunction”
This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g., the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. The definition of this method is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
  • name: name of the custom function.
  • type: in ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
4.2 COMPOUND TYPES
Compound types and values in custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method to compute the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, i.e. the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.
If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e., if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return values will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.
import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="valuer") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // Wrap each substring in a one-field record and collect the records
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The return type is an array of records with a single text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}
Figure 2. ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that creates wrapper programs for semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot lets users write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA-262 3rd edition standard [ECMA262]. The following sections assume basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.
    // Returns the set of searchable (input) parameters.
    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    // Returns the structure of the output records.
    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    // Contains the component implementation.
    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
Each script can contain three functions, one mandatory and two optional:
1. main(): the only mandatory function; it contains the component implementation.
2. getInit(): returns the set of searchable parameters.
3. getOutputSchema(): returns the structure of the output objects, if there are any¹.
These functions mirror the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined exactly as the Output component receives it.
¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed getOutputSchema. The old name is still supported for backwards compatibility, but using the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. The function creates object instances, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if there is one. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable JavaScript object with a series of functions, which are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if an argument to its right has to be given a value. When a parameter is optional, its default value is indicated in the function description. For example, for the Record_Structure object (see section 5.3.2.1):
    rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
    rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
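The positional rule in the note above can be sketched with a stand-in object. RecordStructureSketch and the string values given to OPTIONAL and OBLIGATORY are hypothetical, written only so that the example runs outside ITPilot; they are not the ITPilot implementation.

```javascript
// Hypothetical stand-ins for the ITPilot constants (their real values are not documented here).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Minimal stand-in for Record_Structure that models only setText(field, regexp, type).
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    // Arguments are positional: omitted trailing arguments take their defaults.
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    this.fields.push({ field: field, regexp: regexp, type: type });
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("FIELD");                   // same as rs.setText("FIELD", "", OPTIONAL)
rs.setText("FIELD2", "", OBLIGATORY);  // regexp must be given in order to reach type
rs.setText("FIELD3", OBLIGATORY);      // wrong: OBLIGATORY lands in the regexp slot
```

In this stand-in the third call silently stores OBLIGATORY as the regular expression, which is why the guide treats that form as invalid.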
5.3.2 Data Structures
ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default (no constraint), its value is "".
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default (no constraint), its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy Hh mm ss
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component explained in section 5.3.22 must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
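As an illustration of how the three List functions fit together, the following sketch uses a hypothetical stand-in (ListSketch) that mimics only the documented signatures. Inside ITPilot the elements would be records rather than plain JavaScript objects.

```javascript
// Hypothetical stand-in for the ITPilot List object (documented functions only).
function ListSketch() {
    this.listName = null;
    this.items = [];
}
ListSketch.prototype.setListName = function (listName) { this.listName = listName; };
ListSketch.prototype.add = function (obj) { this.items.push(obj); };
ListSketch.prototype.toArray = function () {
    // Return a copy so that callers cannot modify the list's internal storage.
    return this.items.slice(0);
};

var results = new ListSketch();
results.setListName("RESULTS");
results.add({ TITLE: "first" });
results.add({ TITLE: "second" });

// toArray() makes the elements available to ordinary JavaScript loops.
var asArray = results.toArray();
var titles = [];
for (var i = 0; i < asArray.length; i++) {
    titles.push(asArray[i].TITLE);
}
```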
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.
  o errorId: indicates the type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured on the component (for example, with setRetries and setRetryDelay).
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
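The four error actions can be modelled with a small helper. This is a simplified sketch, not ITPilot code: runWithErrorAction is a hypothetical name, the constants are given string values only for the example, the delay between retries is left out, and it is assumed that ON_ERROR_RETRY raises the error once the retries are exhausted.

```javascript
// Hypothetical values; in ITPilot these names are predefined.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs `step` under the given error action. `retries` is the number of
// additional attempts made after the first failure (retry actions only).
function runWithErrorAction(step, errorAction, retries) {
    var retrying = (errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE);
    var attempts = 1 + (retrying ? retries : 0);
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // the error is swallowed and execution continues
    }
    throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with no retries left
}

// Example step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls < 3) { throw new Error("CONNECTION_ERROR"); }
    return "page";
}

var ignored = runWithErrorAction(function () { throw new Error("HTTP_ERROR"); }, ON_ERROR_IGNORE, 0);
var retried = runWithErrorAction(flakyStep, ON_ERROR_RETRY, 3);
```

The "null" returned under the ignore actions corresponds to the behavior described for components that have a return value.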
5.3.3.2 debugLevel function
• debugLevel(level): indicates the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false", depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.
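The way the $0, $1, ... placeholders bind to the elements passed to exec can be sketched as follows. ConditionSketch is a hypothetical stand-in: it only handles expressions that happen to be valid JavaScript once the placeholders are substituted, whereas ITPilot evaluates its own expression language (Appendix A of [GENER]).

```javascript
// Hypothetical stand-in for Condition: $n refers to the n-th element given to exec.
function ConditionSketch(expr) {
    // Rewrite each $n placeholder into an index on the elements array,
    // e.g. "($0 <= $1)" becomes "(elements[0] <= elements[1])".
    var body = "return (" + expr.replace(/\$(\d+)/g, "elements[$1]") + ") ? true : false;";
    this.evaluate = new Function("elements", body);
}

ConditionSketch.prototype.exec = function (elements) {
    return this.evaluate(elements);
};

var myCondition = new ConditionSketch("($0 <= $1)");
var met = myCondition.exec([3, 7]);     // compares 3 <= 7
var notMet = myCondition.exec([9, 7]);  // compares 9 <= 7
```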
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): sets the start tag that marks added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): sets the end tag that marks added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): sets the start tag that marks deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): sets the end tag that marks deleted content.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored. With "false", they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content is shown. With "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
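To make the roles of the prefix and suffix labels and of tokenSeparator concrete, the sketch below marks differences between two token sequences with a longest common subsequence (LCS) table. The algorithm ITPilot uses internally is not described in this guide, so this is a swapped-in textbook technique; diffTokens, the labels object and the bracket markers are hypothetical.

```javascript
// Marks additions and deletions between two pages, token by token.
// Tokens are obtained by splitting both inputs on tokenSeparator.
function diffTokens(baseCode, finalCode, labels, tokenSeparator) {
    var a = baseCode.split(tokenSeparator);
    var b = finalCode.split(tokenSeparator);
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs[i] = [];
        for (j = 0; j <= b.length; j++) { lcs[i][j] = 0; }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = (a[i] === b[j])
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk both sequences, emitting common tokens as they are and wrapping the rest.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);
            i++;
            j++;
        } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
            j++;
        } else {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
            i++;
        }
    }
    return out.join(tokenSeparator);
}

// Plain-text markers stand in for the default green/red background HTML tags.
var labels = { additionPrefix: "[+", additionSuffix: "+]", deletionPrefix: "[-", deletionSuffix: "-]" };
var marked = diffTokens("<p> old text </p>", "<p> new text </p>", labels, " ");
// marked is "<p> [+new+] [-old-] text </p>"
```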
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    // Raise every error type instead of ignoring or retrying it.
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page, generating a DEXTL program ([DEXTL]) for this purpose.
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that is used to return the data extracted by the specification.
  o exec(): main extractor method; runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which more information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST navigation has not been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
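The order in which syncWithPost, setBackSequence and setBackPages take effect can be summarized as a small decision function. This is only a sketch of the precedence described above: chooseBackNavigation, the settings object and the returned descriptors are hypothetical and do not correspond to ITPilot calls.

```javascript
// Decides how the component returns to the source page:
// 1. POST synchronization, if enabled;
// 2. otherwise an explicit back sequence, if one was defined;
// 3. otherwise the NSEQL Back() command, going back the configured number of pages.
function chooseBackNavigation(settings) {
    if (settings.syncWithPost) {
        return { method: "POST" };
    }
    if (settings.backSequence) {
        return { method: "SEQUENCE", nseql: settings.backSequence };
    }
    return { method: "BACK", pages: settings.backPages };
}

// "seq" is a placeholder for an NSEQL back program.
var viaPost = chooseBackNavigation({ syncWithPost: true, backSequence: "seq", backPages: 2 });
var viaSequence = chooseBackNavigation({ syncWithPost: false, backSequence: "seq", backPages: 2 });
var viaBack = chooseBackNavigation({ syncWithPost: false, backSequence: null, backPages: 2 });
```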
5.3.13 Filter
• Object: Filter
• Description: filters a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
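The ON_ERROR_IGNORE behavior in the note above can be sketched as follows. filterRecords and the predicate function are hypothetical; ITPilot's Filter takes an expression string rather than a JavaScript function.

```javascript
// Returns the records for which `predicate` is true. With ignoreErrors set,
// a record whose evaluation throws is left out and filtering continues;
// otherwise the error propagates to the caller.
function filterRecords(inputRecords, predicate, ignoreErrors) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) { kept.push(inputRecords[i]); }
        } catch (e) {
            if (!ignoreErrors) { throw e; }
        }
    }
    return kept;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }, { PRICE: 50 }];

function cheaperThan40(record) {
    if (record.PRICE === null) { throw new Error("RUNTIME_ERROR: PRICE is null"); }
    return record.PRICE < 40;
}

// Keeps the records priced 10 and 30; the record with the null price is skipped.
var filtered = filterRecords(records, cheaperThan40, true);
```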
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop over a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is iteratively invoked.
    • parallelIterator: with "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or just contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field in the event of more than one field with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function (true) or may just contain them (false).
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values to be entered in the field.
  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values to be entered in the field.
  o toList(): returns the list of the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 40
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. Specifying the format is optional but, once it is specified, it must be complied with; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
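The way the obl values interact with fixedValue can be sketched with a small stand-in for the Init component. This is only an illustration that runs outside ITPilot: the Init stub, its addField helper, the string values given to OPTIONAL, OBLIGATORY and FIXED, and the field names QUERY, MAX_RESULTS and LANGUAGE are all assumptions made for the example, not the real implementation.

```javascript
// Stand-in values for the obligation constants; in a real wrapper the
// ITPilot runtime provides them and their actual values may differ.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Minimal stub of the Init component (illustration only).
function Init() {
    this.fields = {};
}

// Hypothetical helper shared by the typed setters below.
Init.prototype.addField = function (type, field, obl, fixedValue) {
    var mode = obl || OPTIONAL;              // OPTIONAL is the default value
    this.fields[field] = {
        type: type,
        obl: mode,
        value: mode === FIXED ? fixedValue : null
    };
};
Init.prototype.setText = function (field, obl, fixedValue) {
    this.addField("text", field, obl, fixedValue);
};
Init.prototype.setInt = function (field, obl, fixedValue) {
    this.addField("int", field, obl, fixedValue);
};
Init.prototype.get = function (name) {
    return this.fields[name].value;
};
Init.prototype.exec = function () {
    return this.fields;                      // record of initialization parameters
};

var init = new Init();
init.setText("QUERY", OBLIGATORY);           // must be given in every query
init.setInt("MAX_RESULTS");                  // optional by default
init.setText("LANGUAGE", FIXED, "en");       // constant value
var params = init.exec();
```

In a real wrapper only the last five statements would be written; the component itself is supplied by ITPilot.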
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
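For comparison, the sequential (non-parallel) use of hasNext() and next() can be sketched as follows. The Iterator stub and the representation of records as plain JavaScript objects with a TITLE field are assumptions made so that the example runs on its own; in a wrapper the component and the record list come from ITPilot.

```javascript
// Minimal stand-in for the Iterator component (illustration only).
function Iterator(list) {
    this.list = list;
    this.position = 0;
}
Iterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
Iterator.prototype.next = function () {
    return this.list[this.position++];
};

// Hypothetical record list; real records are ITPilot record structures.
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];

var iterator = new Iterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();    // records arrive in list order
    titles.push(recordInstance.TITLE);
}
```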
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript as a while loop controlled by a Condition object. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates through a series of related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with setBackSequence. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results contained in a record. It allows sequences to be created that access other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a Page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript as a do ... while loop controlled by a Condition object. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
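The practical difference between the while structure generated for Loop and the do ... while structure generated for Repeat is that the latter always runs its body at least once. The following sketch illustrates this with a stand-in Condition object; note that the stub takes a JavaScript function as its argument, which is an assumption made for the example (the real Condition component takes a condition expression, see section 5.3.5).

```javascript
// Minimal stand-in for the Condition component (illustration only):
// exec() simply evaluates the JavaScript predicate it was built with.
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var counter = 5;
// The condition is already false when either loop starts.
var condition = new Condition(function () { return counter < 3; });

// Loop component: the condition is checked before the first pass.
var loopPasses = 0;
while (condition.exec([])) {
    loopPasses++;
    counter++;
}

// Repeat component: the body runs once before the condition is checked.
var repeatPasses = 0;
do {
    repeatPasses++;
    counter++;
} while (condition.exec([]));
```

With the condition false from the start, the while body never runs, whereas the do ... while body runs exactly once.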
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with setBackSequence. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: stores in a file the contents received as the input parameter.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
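The combined effect of setRetries and setRetryDelay, which several components share, can be pictured with the following sketch. It is not ITPilot code: runWithRetries, its parameters and the injected sleep function are hypothetical names, and the sketch assumes that the retry count refers to additional attempts made after the first failure, which this guide does not state explicitly.

```javascript
// Hypothetical helper: runs 'action', retrying up to 'retries' more times
// after a failure and waiting 'retryDelayMs' between attempts.
// 'sleep' is injected so that the sketch does not really block.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error;                 // retries exhausted: give up
            }
            attempt++;
            sleep(retryDelayMs);             // wait before the next attempt
        }
    }
}

var delays = [];
var result = runWithRetries(
    function (attempt) {
        // Simulated operation that fails on its first two attempts.
        if (attempt < 2) {
            throw new Error("transient failure");
        }
        return "stored on attempt " + attempt;
    },
    3,                                       // as in setRetries(3)
    500,                                     // as in setRetryDelay(500)
    function (ms) { delays.push(ms); }       // records the waits instead of sleeping
);
```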
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the processing that follows an extraction operation is carried out concurrently on each of the records obtained.
• Functions:
  o wait(): causes the thread to wait until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the indicated function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
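The queueing behaviour described for setMaxConcurrentThreads can be modelled with a few lines of plain JavaScript. The sketch below does not start real threads and is not the ITPilot Thread object: ThreadModel, its finish helper and the unlimited default are assumptions made purely to show how requests beyond the limit wait until a running execution ends.

```javascript
// Bookkeeping-only model of the queueing rule (no real threads are started).
function ThreadModel() {
    this.max = Infinity;     // assumed default for this sketch: no limit
    this.running = [];
    this.queued = [];
}
ThreadModel.prototype.setMaxConcurrentThreads = function (max) {
    this.max = max;
};
ThreadModel.prototype.execute = function (taskName) {
    if (this.running.length < this.max) {
        this.running.push(taskName);         // a slot is free: start now
    } else {
        this.queued.push(taskName);          // limit reached: wait in the queue
    }
};
// Hypothetical helper: marks one execution as finished and promotes the
// oldest queued request, if any.
ThreadModel.prototype.finish = function (taskName) {
    this.running = this.running.filter(function (name) {
        return name !== taskName;
    });
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var model = new ThreadModel();
model.setMaxConcurrentThreads(2);
model.execute("record1");
model.execute("record2");
model.execute("record3");                    // exceeds the limit, so it is queued
var queuedBefore = model.queued.slice();
model.finish("record1");                     // frees a slot for record3
var runningAfter = model.running.slice();
```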
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
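As an illustration of the naming pattern, the following sketch outlines a component called “shout” that returns its input in upper case. The component name, the assumption that the input arrives as a plain string, and the local declaration of STRING_TYPE (so that the sketch runs on its own) are all hypothetical. The shout_getInputStructure function is left out because this guide does not detail its body, and shout_getOutputStructure is not needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Type code taken from the list in this section; declared locally only so
// that the sketch runs outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: the input is delivered as a plain string.
function shout_main(shout_input) {
    var shout_output = null;
    shout_output = String(shout_input).toUpperCase();
    return shout_output;
}

// Declares that the component returns a string value.
function shout_getOutputType() {
    return STRING_TYPE;
}
```

Following the convention described above, such a component would be saved as a .js file named after the component.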
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward: it only requires the following VQL statement:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
4.1 NAMING CONVENTIONS AND ANNOTATIONS
The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:
• <FunctionName> + "ItpFunction". This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.
All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).
To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String ... inputs).
Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.
Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:
• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
  • name: name of the custom function.
  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional variable, syntax, used to specify the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
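The naming-convention route described in this section can be sketched without any Denodo dependency. The minimal example below assumes only what the text states: the class name ends in ItpFunction and the signature is a method named execute whose last parameter is an array. The concatenation logic, the handling of null inputs, and the choice to leave the class non-public (so that the snippet compiles in any file) are illustrative assumptions, not part of the Denodo API.

```java
// Illustrative sketch only: relies on the naming convention, not on Denodo annotations.
// Following the pattern <FunctionName> + "ItpFunction", this class would be
// interpreted as a function named Concat_Sample.
class Concat_SampleItpFunction {

    // Variable-arity signature: the last (here, only) parameter is an array.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // skipping null elements is an assumption of this sketch
                sb.append(input);
            }
        }
        return sb.toString();
    }
}
```

In plain Java, new Concat_SampleItpFunction().execute("a", "b") returns "ab"; how ITPilot presents the variable-arity signature to the user is not covered by this sketch.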
4.2 COMPOUND TYPES
Compound types and values in the custom functions are defined by the following Java classes:
• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method in order to compute the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or equivalent type as its respective parameter in the execute method.
If the execute method returns a basic Java type, the additional method has to return the same basic Java class; i.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return parameters will have in ITPilot.
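The pairing of an execution method with its return-type method can be sketched as follows, using the naming convention so that no Denodo classes are needed. Length_SampleItpFunction is a hypothetical name chosen for illustration. The only rules taken from the text are that both methods take the same number of parameters with the same types, and that an execution method returning a basic Java type is matched by a method returning the corresponding Java class. Declaring the result as Class<?> is an assumption of this sketch.

```java
// Illustrative sketch: hypothetical function returning the length of a string.
class Length_SampleItpFunction {

    // Execution method: performs the actual computation.
    public Integer execute(String value) {
        return value == null ? null : Integer.valueOf(value.length());
    }

    // Return-type method: same parameter list as execute, and it reports the
    // class of the result without running the function.
    public Class<?> executeReturnType(String value) {
        return Integer.class;
    }
}
```

Here the return type is constant, so under rule 1 the additional method would be optional; it is shown only to illustrate how the two signatures line up.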
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write complete wrappers of their own in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic prior knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: it is the only mandatory one and contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).
These functions correspond to components in the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.
(1) Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), is the one responsible for creating a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). However, rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: this represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creation of a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creation of a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creation of a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creation of a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creation of a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creation of a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creation of a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creation of a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creation of a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creation of a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creation of a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): addition of an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore shown in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): this informs the component of how it should behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error that occurs when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with the ON_ERROR_RETRY error type, but continue with the wrapper execution if the error is still happening after the retries.
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the added data starting tag.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the added data ending tag.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the deleted data starting tag.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the deleted data ending tag.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): sets whether the component takes the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false", they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization will be taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): sets whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>) SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter, "true" if the pattern merge technique is to be applied or "false" if not. This is "true" by default.
  o setI18n(i18n): function that updates the process internationalization.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): this function lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence, defined by the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): this function lets the user optionally set an explicit navigation sequence back to the originating page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): this function determines the number of pages ITPilot can go back when a Back() NSEQL command is being executed, if no back sequence has been defined and the navigation has not been defined as a POST navigation.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
bull Object: Filter
bull Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
bull Functions
o Constructor(expr, auxiliaryRecords)
bull expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
bull auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.
bull inputRecords: list of input records.
bull auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
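The ON_ERROR_IGNORE behavior described in the note can be sketched in plain JavaScript. This is only an illustration of the semantics, not the ITPilot Filter object: filterRecords, the predicate function and the constant values below are hypothetical stand-ins, and the real component evaluates an ITPilot expression rather than a JavaScript function.

```javascript
// Plain-JavaScript stand-in for the Filter semantics; NOT the ITPilot API.
// filterRecords, predicate and the constant values are hypothetical.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, auxiliaryRecords, predicate, onError) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record, auxiliaryRecords)) {
        kept.push(record);
      }
    } catch (err) {
      if (onError !== ON_ERROR_IGNORE) {
        throw err;
      }
      // ON_ERROR_IGNORE: drop only the record that caused the error.
    }
  }
  return kept;
}

const records = [
  { TITLE: "A", PRICE: 10 },
  { TITLE: "B", PRICE: null },
  { TITLE: "C", PRICE: 30 },
];
// Auxiliary records take part in the condition but are not filtered themselves.
const limits = [{ MAX_PRICE: 20 }];

// Throws for records without a price, to exercise the error path.
function cheaperThanLimit(record, aux) {
  if (record.PRICE === null) {
    throw new Error("PRICE is missing in record " + record.TITLE);
  }
  return record.PRICE <= aux[0].MAX_PRICE;
}

const filtered = filterRecords(records, limits, cheaperThanLimit, ON_ERROR_IGNORE);
console.log(filtered.map((r) => r.TITLE));
```

In this sketch record B raises an error and is skipped, record C fails the condition, and only record A is returned. With ON_ERROR_RAISE the same call stops at record B.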
5.3.14 Form Iterator
bull Object: Form_Iterator
bull Description: generates an execution loop over a specific form, using predetermined values for each of the included fields in each run.
bull Functions
o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
bull findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
bull submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
bull baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
bull inputPage: input page from which the selected form can be invoked iteratively.
bull parallelIterator: if "true", the component executes its iterations in parallel.
o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
bull field: name of the multiple selection field.
bull position: position of the field among those with the same name, starting at position 0.
bull positionsArray: list that indicates the position held by each valuesArray element when values are replicated.
bull clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
bull field: name of the multiple selection field.
bull position: position of the field among those with the same name, starting at position 0.
bull valuesArray: list of values that must be selected in the field.
bull positionsArray: list that indicates the position held by each valuesArray element when values are replicated.
bull equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
bull clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
bull field: name of the HTML selection field.
bull position: position of the field when more than one field has the same name.
bull positions: values of the elements over which the component must iterate.
o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
bull field: name of the HTML text field.
bull position: position of the field when several fields on the form have the same name.
bull values: list of values that must be selected in the field.
bull positions: list that indicates the position held by each element of values when values are replicated.
bull equals: Boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may be contained in them ("false").
o click(field, value, state): selects an element and runs a "click" event on it.
bull field: name of the HTML field on which the click is to be made.
bull value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
bull state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
bull CLICKED_ELEMENT: mark the element.
bull NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o input(field, position, values): indicates the values added to an input field.
bull field: name of the HTML input field.
bull position: position of the field when several fields on the form have the same name.
bull values: list of values to add to the field.
o textarea(field, position, values): indicates the values added to a text area.
bull field: name of the HTML text area field.
bull position: position of the field when several fields on the form have the same name.
bull values: list of values to add to the field.
o toList(): returns the list of NSEQL sequences used in each iteration.
o setMaxIterations(count): sets the maximum number of iterations that can be executed.
bull count: maximum number of iterations.
o setRetries(count): sets the number of retries in the event of failures.
bull count: number of retries.
o setRetryDelay(mseconds): sets the waiting time between retries.
bull mseconds: waiting time between retries, in milliseconds.
o setParallelIterator(flag): makes the component launch its iterations in parallel.
bull flag: if "true", the iterations are executed in parallel.
o next(inputPage): returns the page resulting from running one iteration of the component.
bull inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.
o hasNext(): determines whether there are more results. Returns "true" if there is at least one more result, or "false" if there is not.
o close(): closes the iterator.
o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL program of the back sequence.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
bull 0: default browser implementation
bull 1: Internet Explorer browser implementation
bull 2: Firefox browser implementation
bull 3: Denodo HTTP browser implementation
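The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be sketched as follows. This is a plain-JavaScript illustration only: the constant values and the expandClickedArray helper are hypothetical, and the order in which ITPilot actually enumerates the combinations is not stated in this guide.

```javascript
// Hypothetical stand-ins for the ITPilot constants; the real values are not given here.
const CLICKED_ELEMENT = "CLICKED";
const NON_CLICKED_ELEMENT = "NON_CLICKED";
const CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Expands a clickedArray into every combination of marked (true) and
// unmarked (false) states, one entry per valuesArray element.
function expandClickedArray(clickedArray) {
  let combinations = [[]];
  for (const state of clickedArray) {
    let options;
    if (state === CLICKED_ELEMENT) {
      options = [true];
    } else if (state === NON_CLICKED_ELEMENT) {
      options = [false];
    } else if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
      options = [true, false];
    } else {
      throw new Error("Unknown state: " + state);
    }
    const next = [];
    for (const combo of combinations) {
      for (const option of options) {
        next.push(combo.concat([option]));
      }
    }
    combinations = next;
  }
  return combinations;
}

const valuesArray = ["red", "green", "blue"];
const clickedArray = [
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT,
];

const combos = expandClickedArray(clickedArray);
combos.forEach((combo, i) => {
  const marked = valuesArray.filter((_, j) => combo[j]);
  console.log("combination " + i + ": " + marked.join(", "));
});
```

With one element set to CLICKED_AND_NON_CLICKED_ELEMENT the sketch yields two combinations: one with "red" and "green" marked, and one with only "red" marked. Each additional element in that state doubles the count, which is worth keeping in mind when setting setMaxIterations.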
5.3.15 Get Page
bull Object: Get_Page
bull Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
bull Functions
o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
bull browserUuid: browser id.
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
bull pageType: type of browser used to access the page.
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL: last URL from which the page was reached.
bull lastURLMethod: access method (GET, POST) of the URL from which the page was reached.
bull lastURLPostParameters: POST-method parameters of the URL from which the page was reached.
bull cookie: stored information ("cookies").
bull proxyUser: user name to access the proxy, if required.
bull proxyPassword: user password to access the proxy, if required.
bull proxyDomain: proxy domain, if required.
5.3.16 Init
bull Object: Init
bull Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
bull Functions
o Constructor(input, output)
bull input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
bull output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
o get(name): returns the value of a field of the record created as the set of initialization parameters.
bull name: name of the record field.
o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
bull field: name of the field to create.
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
bull field: name of the field to create.
bull format: representation format of the date field. The format is optional but, once specified, it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
bull obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
bull OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
bull OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
bull FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
bull fixedValue: optional parameter that indicates a constant value assigned to the field.
o setName(name): updates the component name.
bull name: new component name.
o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
bull i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
o exec(): main function for running the component. Returns a record representing the wrapper initialization parameters.
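The difference between OPTIONAL, OBLIGATORY and FIXED fields can be sketched with a small stand-in class. InitSketch and the constant values below are hypothetical and only mimic the shape of setText(field, obl, fixedValue); in particular, this sketch passes the query values to exec() explicitly, whereas the real exec() takes no arguments and ITPilot reads them from the JavaScript context. How the real component reports a missing obligatory parameter is also an assumption.

```javascript
// Stand-in for the Init behaviour described above; NOT the ITPilot implementation.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitSketch {
  constructor() {
    this.fields = [];
  }

  // Mirrors the shape of setText(field, obl, fixedValue); obl defaults to OPTIONAL.
  setText(field, obl = OPTIONAL, fixedValue = null) {
    this.fields.push({ field, obl, fixedValue });
    return this;
  }

  // Assumption: query values are passed in explicitly for the sake of the example.
  exec(queryValues) {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant value, query input is ignored
      } else if (field in queryValues) {
        record[field] = String(queryValues[field]);
      } else if (obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + field);
      } else {
        record[field] = null; // optional and not provided
      }
    }
    return record;
  }
}

const init = new InitSketch()
  .setText("KEYWORD", OBLIGATORY)
  .setText("CATEGORY")
  .setText("LANGUAGE", FIXED, "en");

const initRecord = init.exec({ KEYWORD: "denodo" });
console.log(initRecord);
```

Here KEYWORD must be supplied in every query, CATEGORY may be omitted (it is left as null), and LANGUAGE always carries the constant "en". Calling init.exec({}) throws, because the obligatory KEYWORD is missing.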
5.3.17 Iterator
bull Object: Iterator
bull Description: component that iterates over a list of records, one by one.
bull Functions
o Constructor(list)
bull list: list of records over which to iterate.
o hasNext(): determines whether there are more results over which to iterate. Returns "true" if there is at least one more result.
o next(): returns the next iteration element. The list is an ordered sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while(iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
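For comparison, the sequential (non-parallel) use of hasNext() and next() can be sketched with a minimal stand-in. IteratorSketch is a hypothetical class written only for this example; inside a wrapper the Iterator object provided by ITPilot would be used instead.

```javascript
// Minimal stand-in for the Iterator component; NOT the ITPilot object.
class IteratorSketch {
  constructor(list) {
    this.list = list;
    this.index = 0;
  }

  // True while at least one record is left.
  hasNext() {
    return this.index < this.list.length;
  }

  // Returns the next record, in list order.
  next() {
    if (!this.hasNext()) {
      throw new Error("No more records");
    }
    return this.list[this.index++];
  }
}

const recordList = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
const iterator = new IteratorSketch(recordList);
const visited = [];

while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  visited.push(recordInstance.TITLE); // per-record processing goes here
}

console.log(visited.join(", "));
```

The records are visited in list order, one per loop pass. In the parallel form shown in Figure 5, the loop body hands each record to the Thread object instead of processing it inline.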
5.3.18 JDBCExtractor
bull Object: JDBCExtractor
bull Description: sends a query to any source available through JDBC and returns a record list with the results obtained.
bull Functions
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid: unique identifier of the component.
bull uri: connection URL of the database.
bull driver: driver class used to connect to the data source.
bull userName: user name.
bull password: user password.
bull structure: structure of the component's output record list. It is defined as a record of values.
bull baseRecords: record list to be used.
bull maxPoolSize: maximum number of connections that the pool can manage at the same time.
bull initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
bull checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
bull query: SQL query that returns the results required by the component.
o exec(query, baseRecords): executes the JDBCExtractor component.
bull query: SQL query that returns the results required by the component.
bull baseRecords: record list to be used.
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
bull maxPoolSize: maximum number of connections that the pool can manage at the same time.
bull initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
bull pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
o disablePool(): disables the connection pool.
o addDriverProperty(propname, propvalue): adds a JDBC driver property.
bull propname: property name.
bull propvalue: property value.
5.3.19 Loop
bull Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript with a while loop whose exit condition is a Condition object. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while(loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6: Using the Loop function
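The pattern in Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. ConditionSketch is hypothetical: it wraps a plain JavaScript predicate, whereas the real Condition evaluates an ITPilot expression and also supports onError, which is left out of this sketch.

```javascript
// Hypothetical stand-in for Condition: wraps a plain predicate function.
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }

  // Same call shape as in Figure 6: exec receives a list of input records.
  exec(inputRecords) {
    return Boolean(this.predicate(inputRecords));
  }
}

let pageNumber = 1;
const processedPages = [];

// The loop keeps running while the condition evaluates to true.
const loop = new ConditionSketch(() => pageNumber <= 3);

while (loop.exec([])) {
  processedPages.push(pageNumber); // loop operations go here
  pageNumber += 1;
}

console.log(processedPages);
```

The condition is checked before every pass, so the body runs three times here and would not run at all if the condition were false from the start.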
5.3.20 Next Interval Iterator
bull Object: Next_Interval_Iterator
bull Description: allows iterating through different interrelated pages by means of one or several browsing sequences.
bull Functions
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
bull iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must equal the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information.
bull inputPage: indicates the page from which the next browsing sequence is to be run.
o next(inputRecords, inputPage): returns the next iteration element.
bull inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.
bull inputPage: indicates the page from which the next pages are to be accessed.
o close(): closes the iterator.
o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
bull count: number of retries.
o setRetryDelay(count): sets the interval between two retries.
bull count: interval in milliseconds.
o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
bull back: NSEQL program of the back sequence.
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
bull 0: default browser implementation
bull 1: Internet Explorer browser implementation
bull 2: Firefox browser implementation
bull 3: Denodo HTTP browser implementation
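The order in which syncWithPost, setBackSequence and setBackPages are taken into account can be summarized in a small decision function. This is a sketch of the precedence described above and not ITPilot code: chooseBackStrategy, its configuration object and the strategy labels are hypothetical, and the default of one page when setBackPages has not been called is an assumption.

```javascript
// Hypothetical helper mirroring the documented precedence; not part of the ITPilot API.
function chooseBackStrategy(config) {
  // 1. POST synchronization wins when the flag is true (the documented default).
  if (config.syncWithPost) {
    return { strategy: "POST" };
  }
  // 2. Otherwise an explicit back sequence is used, if one was set.
  if (config.backSequence) {
    return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // 3. Otherwise the NSEQL Back() command is run.
  // Assumption: one page back when setBackPages was never called.
  const pages = config.backPages === undefined ? 1 : config.backPages;
  return { strategy: "BACK_COMMAND", pages: pages };
}

const withPost = chooseBackStrategy({
  syncWithPost: true,
  backSequence: "<NSEQL back program>",
});
const withSequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>",
});
const withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });

console.log(withPost.strategy, withSequence.strategy, withBack.strategy);
```

The three calls resolve to POST synchronization, the explicit back sequence, and the Back() command with two pages, in that order. The back sequence is left as a placeholder because its contents depend on the site being navigated.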
5.3.21 Output
bull Object: Output
bull Description: places a record in the wrapper output.
bull Functions
o Constructor(structure)
bull structure: indicates the component input record to be used as the wrapper result.
o add(record): adds the component input record that is to be used as the wrapper result.
bull record: record to use.
5.3.22 Record Constructor
bull Object: Record_Constructor
bull Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
bull Functions
o Constructor(recordsObj, name)
bull recordsObj: list of input elements. Each element of the list can be a record or a list of records.
bull name: name of the output record of the Record Constructor component.
o add(fieldName, expression, errorAction): adds a new field to the record under construction.
bull fieldName: name of the field.
bull expression: expression that defines the field. For example, "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
bull errorAction: action to be taken when the expression cannot be evaluated correctly. The possible values are:
bull ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
bull ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
bull Object: Record_Sequence
bull Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
bull Functions
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
bull sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
bull sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information. In general this value is "true", although it may not be a good option when the previous iterator runs in parallel.
bull inputPage: optional; indicates a starting page.
o exec(): returns a page object that represents the target page of the browsing sequences.
o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
bull Object: Release_Persistent_Browser
bull Description: accepts a browser id, or a page as browser identifier, and releases that specific browser.
bull Functions
o Constructor(page)
bull page: page loaded in the browser that is going to be released.
o Constructor(browserUuid)
bull browserUuid: browser identifier.
o exec(): executes the component.
5.3.25 Repeat
bull Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript with a do... while loop whose exit condition is a Condition object. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while(repeat.exec([]));
Figure 7: Using the Repeat function
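The practical difference between the structure in Figure 7 and a plain while loop is that the body of a do... while loop always runs at least once. The sketch below shows this with a hypothetical ConditionSketch class that wraps a JavaScript predicate; it is not the ITPilot Condition object and omits onError.

```javascript
// Hypothetical stand-in for Condition: wraps a plain predicate function.
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }

  exec(inputRecords) {
    return Boolean(this.predicate(inputRecords));
  }
}

// A condition that is false from the very beginning.
const neverTrue = new ConditionSketch(() => false);

// do... while: the body runs once before the condition is first checked.
let repeatRuns = 0;
do {
  repeatRuns += 1;
} while (neverTrue.exec([]));

// while: the condition is checked first, so the body never runs.
let whileRuns = 0;
while (neverTrue.exec([])) {
  whileRuns += 1;
}

console.log("do... while body runs: " + repeatRuns);
console.log("while body runs: " + whileRuns);
```

With a condition that is false from the start, the do... while body runs once and the while body never runs. This matters when the loop operations have side effects, such as navigating to a page.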
5.3.26 Script
bull Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When the component is used from the wrapper generation graphical interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
bull Object: Sequence
bull Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
bull Functions
o Constructor(sequence, sequenceType, reusableConnection, inputPage)
bull sequence: NSEQL browsing program (see [NSEQL]).
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
bull inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
o exec(inputValues, inputPage): runs the Sequence component, returning the last page reached by the browsing sequence.
bull inputValues: list of values that can be used as input parameters within the browsing sequence.
bull inputPage: optional parameter that indicates the page from which the browsing sequence of the component is run.
o setRetries(count): sets the number of retries in the event of failures.
bull count: number of retries.
o setRetryDelay(mseconds): sets the waiting time between retries.
bull mseconds: waiting time between retries, in milliseconds.
o close(): closes the connection with the running browser.
o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
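The way syncWithPost, setBackSequence and setBackPages interact can be read as a small decision procedure. The sketch below models that reading in plain JavaScript. It is an illustration only: chooseBackStrategy and the field names of its argument are hypothetical, and the fallback of one page when no number of back pages is given is an assumption, not something the text states.

```javascript
// Illustration only: models the order in which the text says ITPilot picks a
// way back to the source page. Function and field names are hypothetical.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Default behavior: re-issue the POST that produced the page.
        return "POST to the page URL with its original POST parameters";
    }
    if (config.backSequence) {
        // An explicit back sequence was set with setBackSequence.
        return "run back sequence: " + config.backSequence;
    }
    // Neither applies: fall back to the NSEQL Back() command.
    var pages = config.backPages || 1;   // assumption: one page when unset
    return "run NSEQL Back() for " + pages + " page(s)";
}

console.log(chooseBackStrategy({ syncWithPost: true }));
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" }));
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }));
```

In a real wrapper these choices are made by calling the corresponding functions on the Sequence object; the model only shows which setting takes precedence.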
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents entered as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
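The queueing behavior described for setMaxConcurrentThreads can be pictured with a small bookkeeping model. This is a sketch in plain JavaScript, not the ITPilot Thread object: ThreadModel, its finish method and the log format are hypothetical, and the model assumes queued requests start in arrival order.

```javascript
// Illustration only: bookkeeping model of a concurrency cap with a queue.
// ThreadModel and its methods are hypothetical, not the ITPilot Thread object.
function ThreadModel(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;      // executions currently in progress
    this.queue = [];       // requests waiting for a free slot
    this.log = [];
}
// A request starts at once if a slot is free; otherwise it is queued.
ThreadModel.prototype.execute = function (name) {
    if (this.running < this.maxConcurrent) {
        this.running++;
        this.log.push("start " + name);
    } else {
        this.queue.push(name);
        this.log.push("queue " + name);
    }
};
// When an execution ends, the oldest queued request takes its slot.
ThreadModel.prototype.finish = function (name) {
    this.running--;
    this.log.push("finish " + name);
    if (this.queue.length > 0) {
        this.running++;
        this.log.push("start " + this.queue.shift());
    }
};

var model = new ThreadModel(2);
model.execute("A");
model.execute("B");
model.execute("C");    // cap reached, so C waits
model.finish("A");     // frees a slot, so C starts
console.log(model.log.join(", "));
```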
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
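Put together, a custom component file might look like the following sketch of a hypothetical mycustom.js. Several things here are assumptions made for illustration: the field name TEXT, the structure name MYCUSTOM_INPUT, and above all the idea that the input can be read like a plain JavaScript object (this section does not say how ITPilot exposes record fields). Record_Structure and STRING_TYPE are declared as minimal stand-ins so that the sketch runs outside ITPilot; inside ITPilot they would be provided by the environment.

```javascript
// Stand-ins so the sketch runs outside ITPilot (illustration only).
var STRING_TYPE = 13;                       // value taken from the type list
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field) {
    this.fields.push(field);
};

// Input schema: one text field named TEXT (hypothetical field name).
function mycustom_getInputStructure() {
    var input = new Record_Structure("MYCUSTOM_INPUT");
    input.setText("TEXT");
    return input;
}

// Output type: a plain string, so mycustom_getOutputStructure() is not needed.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

// Main function. Assumption: the input can be read like a plain object.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input && mycustom_input.TEXT != null) {
        mycustom_output = String(mycustom_input.TEXT).toUpperCase();
    }
    return mycustom_output;
}

console.log(mycustom_main({ TEXT: "hello" }));
```

Because the output type is STRING_TYPE rather than RECORD_TYPE or LIST_TYPE, the sketch leaves out mycustom_getOutputStructure().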
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple. The VQL statement only has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.
4.3 PAGE TYPE
ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.
4.4 CUSTOM FUNCTION RETURN TYPE
As explained before, custom functions whose return type depends on the input values, or functions that return compound types, can implement an additional method to compute the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory-intensive than the return type calculation. This additional method must follow a few rules:
1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as its respective parameter in the execute method, or an equivalent one.
  o If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
  o If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
  o If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.
See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return parameters will have in ITPilot.
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot lets users write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", "", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one. It contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see footnote 1).
These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.
Footnote 1: Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described later in section 5.3 (the Component Catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):
• rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
• rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
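The rule in the note follows from the way positional arguments work in JavaScript: an argument can only be left out if everything to its right is left out too. The sketch below shows this with a stand-in function that mimics the setText(field, regexp, type) signature. The stand-in and the constant values are hypothetical and are not the ITPilot implementation; ITPilot may simply reject the invalid call, whereas the stand-in shows where the misplaced value would end up.

```javascript
// Stand-in for illustration only: mimics the positional-argument rule of
// setText(field, regexp, type). It is not the ITPilot implementation.
var OPTIONAL = "OPTIONAL";        // hypothetical values for the constants
var OBLIGATORY = "OBLIGATORY";

function setText(field, regexp, type) {
    // Omitted trailing arguments arrive as undefined and take their defaults.
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    return { field: field, regexp: regexp, type: type };
}

var a = setText("FIELD");                  // both trailing arguments omitted
var b = setText("FIELD", "", OPTIONAL);    // same result, written out in full
var c = setText("FIELD", OBLIGATORY);      // wrong: OBLIGATORY lands in regexp
var d = setText("FIELD", "", OBLIGATORY);  // right: regexp given so type can follow

console.log(JSON.stringify(a) === JSON.stringify(b));
console.log("c: regexp=" + c.regexp + ", type=" + c.type);
console.log("d: regexp=" + d.regexp + ", type=" + d.type);
```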
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is “”.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is “”.
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates the components that do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, the value is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
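The four error actions can be compared side by side with a small model. The sketch below is plain JavaScript written for illustration; runWithPolicy and the constant values are hypothetical and are not part of ITPilot. It models a retry as rerunning the failed step, ignores the delay between retries, and assumes that ON_ERROR_RETRY raises the error once the retries are exhausted, which the text implies but does not state.

```javascript
// Illustration only: a plain-JavaScript model of the four error actions.
// The constant values and runWithPolicy are hypothetical, not ITPilot code.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

function runWithPolicy(step, action, retries) {
    var retrying = (action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE);
    var attempts = retrying ? retries + 1 : 1;   // first run plus the retries
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step(i);
        } catch (e) {
            lastError = e;
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null;             // carry on; the failed step yields null
    }
    throw lastError;             // RAISE, or RETRY with retries exhausted
}

// A step that fails on its first two attempts and succeeds on the third.
function flaky(attempt) {
    if (attempt < 2) { throw new Error("attempt " + attempt + " failed"); }
    return "page";
}
function alwaysFails() { throw new Error("source is down"); }

var r1 = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var r2 = runWithPolicy(alwaysFails, ON_ERROR_IGNORE, 0);
var r3 = runWithPolicy(alwaysFails, ON_ERROR_RETRY_IGNORE, 2);
console.log(r1, r2, r3);
```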
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns “true” or “false”, depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
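The $0 and $1 placeholders refer to positions in the array passed to exec. The sketch below shows that mapping with a stand-in Condition written in plain JavaScript. It is an illustration only: the stand-in understands a single comparison between two placeholders, whereas the real component is provided by ITPilot and supports the functions listed in Appendix A of [GENER].

```javascript
// Stand-in for illustration only; the real Condition component is provided
// by ITPilot. This sketch understands one comparison such as "($0 <= $1)".
function Condition(expr) {
    var m = /^\(\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)$/.exec(expr);
    if (!m) { throw new Error("unsupported expression in this sketch: " + expr); }
    this.left = parseInt(m[1], 10);     // index of the left-hand element
    this.op = m[2];
    this.right = parseInt(m[3], 10);    // index of the right-hand element
}
Condition.prototype.exec = function (elements) {
    var a = elements[this.left];    // $0 is the first element, $1 the second
    var b = elements[this.right];
    switch (this.op) {
        case "<=": return a <= b;
        case ">=": return a >= b;
        case "<":  return a < b;
        default:   return a > b;
    }
};

var myCondition = new Condition("($0 <= $1)");
console.log(myCondition.exec([3, 7]));   // first element is not greater than the second
console.log(myCondition.exec([9, 7]));   // first element is greater than the second
```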
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component and returns a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal. “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored. With “false”, they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case. “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
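To see what the prefix and suffix labels do to a result page, the sketch below runs a small token diff in plain JavaScript and wraps added and deleted tokens in labels. It is an illustration of the idea, not ITPilot's comparison algorithm: markDiff, the bracket labels and the use of a longest-common-subsequence table are all choices made for this example, and the pages are assumed to be already split into tokens.

```javascript
// Illustration only: a small token diff that wraps additions and deletions in
// prefix/suffix labels. This is not ITPilot's algorithm; names are hypothetical.
function markDiff(baseTokens, finalTokens, labels) {
    var n = baseTokens.length, m = finalTokens.length, i, j;
    // lcs[i][j] = length of the longest common subsequence of the suffixes
    // baseTokens[i..] and finalTokens[j..].
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs.push([]);
        for (j = 0; j <= m; j++) { lcs[i].push(0); }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = (baseTokens[i] === finalTokens[j])
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk both token lists, emitting common tokens as they are and wrapping
    // tokens found only in one list with the matching labels.
    var out = [];
    i = 0; j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]); i++; j++;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.delPrefix + baseTokens[i] + labels.delSuffix); i++;
        } else {
            out.push(labels.addPrefix + finalTokens[j] + labels.addSuffix); j++;
        }
    }
    while (i < n) { out.push(labels.delPrefix + baseTokens[i++] + labels.delSuffix); }
    while (j < m) { out.push(labels.addPrefix + finalTokens[j++] + labels.addSuffix); }
    return out.join(" ");
}

var labels = { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]" };
var result = markDiff(["<p>", "old", "price", "</p>"],
                      ["<p>", "new", "price", "</p>"], labels);
console.log(result);   // prints "<p> [-old-] [+new+] price </p>"
```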
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
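The figure configures three retries with a delay of 3000 milliseconds between them. The following is a minimal sketch of how such a retry policy could behave, written in plain JavaScript so that it runs outside ITPilot. The function runWithRetries and its injected sleep parameter are hypothetical and are not part of the ITPilot API; the sketch also assumes that "retries" counts the additional attempts made after the first failure.

```javascript
// Hypothetical stand-in: not part of the ITPilot API.
// Runs `action`, retrying up to `retries` extra times and waiting
// `retryDelayMs` between attempts through the injected `sleep` function.
function runWithRetries(action, retries, retryDelayMs, sleep) {
  let lastError = null;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        sleep(retryDelayMs);
      }
    }
  }
  // Comparable to ON_ERROR_RAISE once all retries are exhausted.
  throw lastError;
}

// Usage: the action fails twice and succeeds on the third attempt.
const waits = [];
const result = runWithRetries(
  (attempt) => {
    if (attempt < 2) throw new Error("TIMEOUT_ERROR");
    return "page loaded";
  },
  3,
  3000,
  (ms) => waits.push(ms) // records the delay instead of really sleeping
);
```

Injecting sleep keeps the sketch synchronous and easy to check; a real wrapper relies on the component's own setRetries and setRetryDelay settings instead.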
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
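The positional placeholders ($0, $1, ...) refer to the elements of the list passed to exec. The sketch below illustrates that idea in plain JavaScript. The helper evaluatePositional is hypothetical: it is not the ITPilot expression engine and does not support the functions of Appendix A of [GENER]; it only substitutes each placeholder and evaluates the result as an ordinary JavaScript expression.

```javascript
// Hypothetical helper: NOT the ITPilot expression engine.
// Replaces each $n placeholder with the n-th element of `inputs`
// and evaluates the result as a plain JavaScript expression.
function evaluatePositional(expression, inputs) {
  const substituted = expression.replace(/\$(\d+)/g, (match, digits) => {
    const index = Number(digits);
    if (index >= inputs.length) {
      throw new RangeError("No input for placeholder " + match);
    }
    return JSON.stringify(inputs[index]);
  });
  return new Function("return (" + substituted + ");")();
}

// "$0 <= $1" compares the first and second elements of the input list.
const firstIsSmaller = evaluatePositional("($0 <= $1)", [3, 5]);
const firstIsLarger = evaluatePositional("($0 <= $1)", [7, 5]);
```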
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): determines the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further data extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
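The order in which the page state is recovered (POST synchronization first, then an explicit back sequence, then the Back() NSEQL command) can be summarized with a small decision function. This is only a sketch: chooseBackStrategy and its option names are hypothetical, and the default of one page for setBackPages is an assumption that the guide does not state.

```javascript
// Hypothetical helper summarizing the fallback order described above.
// The option names are illustrative, not ITPilot identifiers.
function chooseBackStrategy(options) {
  // POST synchronization is the default unless it is explicitly disabled.
  const usePost = options.syncWithPost !== false;
  if (usePost) {
    return { strategy: "POST", detail: "re-send the original POST parameters" };
  }
  if (options.backSequence) {
    return { strategy: "BACK_SEQUENCE", detail: options.backSequence };
  }
  // Assumption: go back one page when setBackPages was never called.
  const pages = options.backPages === undefined ? 1 : options.backPages;
  return { strategy: "BACK_COMMAND", detail: "go back " + pages + " page(s)" };
}

const byDefault = chooseBackStrategy({});
const withSequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>",
});
const withBackCommand = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```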
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
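The effect described in the note can be pictured with the following sketch, which filters a list in plain JavaScript and silently skips any record whose evaluation raises an error. The function filterIgnoringErrors, the record fields TITLE and PRICE, and the predicate are all hypothetical; the real Filter component evaluates an ITPilot expression rather than a JavaScript callback.

```javascript
// Hypothetical sketch: plain JavaScript, not the ITPilot Filter object.
function filterIgnoringErrors(inputRecords, condition) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (condition(record)) {
        kept.push(record);
      }
    } catch (error) {
      // ON_ERROR_IGNORE-like behavior: drop the offending record and go on.
    }
  }
  return kept;
}

const records = [
  { TITLE: "A", PRICE: 10 },
  { TITLE: "B", PRICE: null }, // raises inside the condition
  { TITLE: "C", PRICE: 30 },
  { TITLE: "D", PRICE: 5 },
];

const atLeastTen = filterIgnoringErrors(records, (record) => {
  if (record.PRICE === null) throw new TypeError("PRICE is missing");
  return record.PRICE >= 10;
});
```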
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, where predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each element of values in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither a back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
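To make the notion of "one run per combination of field values" concrete, the sketch below builds the list of iterations in plain JavaScript. It assumes that the iterations are the Cartesian product of the values configured for each field, which the guide does not confirm. The function buildIterations and the field names author and inStock are hypothetical; maxIterations plays the role of setMaxIterations(count).

```javascript
// Hypothetical sketch: assumes the iterations are the Cartesian product
// of the values configured per field (an assumption, not confirmed by
// the guide). `maxIterations` plays the role of setMaxIterations(count).
function buildIterations(fieldValues, maxIterations) {
  let combinations = [{}];
  for (const [field, values] of Object.entries(fieldValues)) {
    const extended = [];
    for (const partial of combinations) {
      for (const value of values) {
        extended.push({ ...partial, [field]: value });
      }
    }
    combinations = extended;
  }
  return combinations.slice(0, maxIterations);
}

const iterations = buildIterations(
  {
    author: ["Cervantes", "Borges"],
    // Like CLICKED_AND_NON_CLICKED_ELEMENT: one run marked, one unmarked.
    inStock: [true, false],
  },
  10
);
```

With two values for each of the two fields, the sketch yields four combinations; lowering maxIterations truncates the list.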
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, which is the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but it becomes compulsory if the field is filled in; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
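The three values of the obl parameter can be illustrated with a small validation routine. This is a sketch in plain JavaScript and not the Init component itself: the constants are local stand-ins for the ITPilot ones, buildInitRecord and the field names QUERY, MAXPRICE and LANG are hypothetical, and two behaviors are assumptions (an absent optional field is stored as null, and a FIXED field ignores any value supplied by the caller).

```javascript
// Hypothetical stand-ins for the OPTIONAL / OBLIGATORY / FIXED constants.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// fields: [{ name, obl, fixedValue }]; query: values supplied by the caller.
function buildInitRecord(fields, query) {
  const record = {};
  for (const field of fields) {
    if (field.obl === FIXED) {
      // Assumption: the constant always wins over a caller-supplied value.
      record[field.name] = field.fixedValue;
    } else if (field.name in query) {
      record[field.name] = query[field.name];
    } else if (field.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    } else {
      // Assumption: an optional field without a value is stored as null.
      record[field.name] = null;
    }
  }
  return record;
}

const fields = [
  { name: "QUERY", obl: OBLIGATORY },
  { name: "MAXPRICE", obl: OPTIONAL },
  { name: "LANG", obl: FIXED, fixedValue: "en" },
];
const initRecord = buildInitRecord(fields, { QUERY: "denodo", LANG: "es" });
```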
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
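Outside ITPilot, the hasNext()/next() contract can be tried out with a stand-in class. The sketch below is plain JavaScript; ListIterator and the TITLE field are hypothetical names, and raising an error when next() is called past the end of the list is an assumption, since the guide does not describe that case.

```javascript
// Hypothetical stand-in mirroring the hasNext()/next() contract.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }

  // True while at least one more record is available.
  hasNext() {
    return this.position < this.list.length;
  }

  // Returns the next record, in list order.
  next() {
    if (!this.hasNext()) {
      throw new RangeError("No more records"); // assumption, see lead-in
    }
    return this.list[this.position++];
  }
}

const iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }]);
const titles = [];
while (iterator.hasNext()) {
  titles.push(iterator.next().TITLE);
}
```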
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iterating over different inter-related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither a back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
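The rule given for the sequences parameter (a single sequence is reused in every iteration, several sequences are consumed one per iteration) can be sketched as follows. The helper sequenceForIteration is hypothetical, the sequence strings are placeholders rather than real NSEQL, and returning null once several sequences have all been used is an assumption that the guide does not state.

```javascript
// Hypothetical helper: picks the browsing sequence for a given iteration.
// Returns null when several sequences were given and all have been used
// (an assumption; the guide does not describe that case).
function sequenceForIteration(sequences, iterationIndex) {
  if (sequences.length === 0) {
    return null;
  }
  if (sequences.length === 1) {
    return sequences[0]; // a single sequence is reused in every iteration
  }
  return iterationIndex < sequences.length ? sequences[iterationIndex] : null;
}

const single = ["<NSEQL: go to next page>"];
const several = ["<NSEQL: go to page 2>", "<NSEQL: go to page 3>"];

const reused = sequenceForIteration(single, 5);
const second = sequenceForIteration(several, 1);
const exhausted = sequenceForIteration(several, 2);
```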
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
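The call order described above (constructor, one add() per field, then exec()) can be sketched as follows. This is only an illustration under stated assumptions: the Record_Constructor function and the ON_ERROR_* constants defined at the top are hypothetical stand-ins so the sketch runs outside ITPilot, the "$0.FIELD" reference syntax is assumed from the description of the expression parameter, and the way the stand-in skips a failing field is a guess at the ON_ERROR_IGNORE behavior, not the product's implementation.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot. Inside a wrapper,
// the ITPilot runtime supplies Record_Constructor and the ON_ERROR_* constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function Record_Constructor(recordsObj, name) {
    this.records = recordsObj;
    this.name = name;
    this.fields = [];
}
Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ name: fieldName, expression: expression, errorAction: errorAction });
};
Record_Constructor.prototype.exec = function () {
    var out = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        // The stand-in only understands "$<index>.<FIELD>" references.
        var m = /^\$(\d+)\.(\w+)$/.exec(f.expression);
        var rec = m ? this.records[Number(m[1])] : undefined;
        if (!m || rec === undefined || !(m[2] in rec)) {
            if (f.errorAction === ON_ERROR_RAISE) {
                throw new Error("Cannot evaluate " + f.expression);
            }
            continue; // ON_ERROR_IGNORE: skip the failing field (assumed behavior)
        }
        out[f.name] = rec[m[2]];
    }
    return out;
};

// Usage following the documented order: Constructor, add(...), exec().
var product = { NAME: "Widget", PRICE: "9.99" }; // hypothetical input records
var vendor = { VENDOR: "ACME" };

var rc = new Record_Constructor([product, vendor], "OFFER");
rc.add("TITLE", "$0.NAME", ON_ERROR_RAISE);     // field NAME of the first record
rc.add("SELLER", "$1.VENDOR", ON_ERROR_RAISE);  // field VENDOR of the second record
rc.add("RATING", "$0.STARS", ON_ERROR_IGNORE);  // missing field, error ignored
var offer = rc.exec();
```

Choosing the error action per field lets a wrapper fail fast on essential fields while tolerating optional ones.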
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: (optional) allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization mechanism.
    • flag: "true" indicates that this synchronization mechanism must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
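As an illustration of how the Sequence functions listed above fit together, the following sketch creates a sequence, configures its retry behavior and runs it. It is a minimal sketch under assumptions: the Sequence function and the SEQUENCE_IEBROWSER constant defined at the top are hypothetical stand-ins so the code runs outside ITPilot, the stand-in exec() only reports what it would run instead of driving a browser, and the NSEQL program and URL are illustrative only.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot. Inside a wrapper,
// the ITPilot runtime supplies Sequence and the SEQUENCE_* constants.
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER";

function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // A real exec() navigates and returns the last page reached;
    // the stand-in just echoes its configuration.
    return { reachedBy: this.sequence, inputs: inputValues };
};
Sequence.prototype.close = function () {};

// Usage: build the sequence, set retries before running it, then release it.
var home = new Sequence("Navigate(http://www.example.com,0);", SEQUENCE_IEBROWSER, true);
home.setRetries(3);        // up to 3 retries on failure
home.setRetryDelay(3000);  // wait 3000 ms between retries
var page = home.exec([]);  // no input values for this sequence
home.close();
```

Setting the retry count and delay before exec() keeps the retry policy next to the navigation it protects.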
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case the page content will be stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just developed.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3rd edition.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
4.5 EXAMPLE
Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Denodo provides a graphical, component-based wrapper generation tool that creates wrapper programs for accessing semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming. ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the 3rd edition of the ECMA-262 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
Each script can contain three functions, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the wrapper implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).
These functions correspond to components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as the Output component receives it.
(1) Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
This function, if it exists, determines the wrapper's output structure. The structure is a data record implemented by the Record_Structure object, described in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript with a series of functions that are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another argument to its right has to be specified. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).
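The positional rule in the note above can be made concrete with a small sketch. It assumes the setText(field, regexp, type) signature given for Record_Structure in section 5.3.2.1; the Record_Structure function and the OPTIONAL and OBLIGATORY constants below are hypothetical stand-ins so the code runs outside ITPilot. The stand-in silently binds a misplaced argument to the wrong parameter, whereas the real component may reject the call instead.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot. Inside a wrapper,
// the runtime supplies Record_Structure, OPTIONAL and OBLIGATORY.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function Record_Structure(name) {
    this.name = name;
    this.fields = {};
}
// setText(field, regexp, type): regexp defaults to "", type defaults to OPTIONAL.
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields[field] = {
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    };
};

var rs = new Record_Structure("OUT_REC");
rs.setText("A");                  // both optional parameters omitted
rs.setText("B", "", OPTIONAL);    // same thing, written out in full
rs.setText("C", OBLIGATORY);      // wrong: OBLIGATORY lands in the regexp position
rs.setText("D", "", OBLIGATORY);  // right: regexp given so that type can follow
```

Because arguments are matched by position, a trailing parameter can only be set once every parameter to its left has been given a value, even if that value is just the default.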
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression for generating the character string. By default, if no constraint exists, its value is "".
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression for generating the character string. By default, if no constraint exists, its value is "".
    • format: (optional) date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is used from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId values.
  o errorId: type of error for which the behavior is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" if there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the result is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: retry the operation. The number of retries and the time between retries are configured in each component (for example, with the setRetries and setRetryDelay functions of the Sequence component).
    • ON_ERROR_RETRY_IGNORE: retry as with ON_ERROR_RETRY, but continue with the wrapper execution if the error persists after the retries.
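Since onError can be invoked once per error type, a wrapper that configures many components may find it convenient to keep the policy in one table and apply it in a loop. The sketch below shows one way to do that. The applyErrorPolicy helper is not part of ITPilot, and StubComponent and the constants defined at the top are hypothetical stand-ins so the code runs outside ITPilot; only the onError(errorId, errorAction) call itself comes from the description above.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot. Inside a wrapper,
// the runtime supplies these constants and the real components.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var CONNECTION_ERROR = "CONNECTION_ERROR";
var HTTP_ERROR = "HTTP_ERROR";
var TIMEOUT_ERROR = "TIMEOUT_ERROR";
var SEQUENCE_ERROR = "SEQUENCE_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

function StubComponent() {
    this.actions = {};
}
StubComponent.prototype.onError = function (errorId, errorAction) {
    this.actions[errorId] = errorAction; // a later call for the same id overrides
};

// Helper (not an ITPilot function): applies a list of [errorId, errorAction]
// pairs to any component that offers onError, and returns the component.
function applyErrorPolicy(component, policy) {
    for (var i = 0; i < policy.length; i++) {
        component.onError(policy[i][0], policy[i][1]);
    }
    return component;
}

// Fail fast on programming errors, retry on transient network problems.
var policy = [
    [RUNTIME_ERROR, ON_ERROR_RAISE],
    [SEQUENCE_ERROR, ON_ERROR_RAISE],
    [CONNECTION_ERROR, ON_ERROR_RETRY],
    [HTTP_ERROR, ON_ERROR_RETRY],
    [TIMEOUT_ERROR, ON_ERROR_RETRY_IGNORE]
];
var fetchStep = applyErrorPolicy(new StubComponent(), policy);
```

The policy is kept as an array of pairs rather than an object so that the constants are passed to onError unchanged, whatever their actual type is in the runtime.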
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.
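The positional references in a condition expression ($0, $1, and so on) map to the elements of the array passed to exec. The sketch below illustrates this with the "($0 <= $1)" expression from the description. The Condition function defined here is a hypothetical stand-in so the code runs outside ITPilot; it only understands a single comparison between two positional elements, whereas the real component supports the functions of Appendix A of [GENER]. The variable names and the page limit are illustrative.

```javascript
// Hypothetical stand-in so the sketch runs outside ITPilot. Inside a wrapper,
// the runtime supplies Condition and its full expression language.
function Condition(expr) {
    this.expr = expr;
}
Condition.prototype.exec = function (elements) {
    // Accepts only "$i <op> $j", optionally wrapped in parentheses.
    var m = /^\(?\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)?$/.exec(this.expr);
    if (!m) {
        throw new Error("Stand-in cannot evaluate: " + this.expr);
    }
    var left = elements[Number(m[1])];   // $i is the i-th element of the array
    var right = elements[Number(m[3])];
    switch (m[2]) {
    case "<=": return left <= right;
    case ">=": return left >= right;
    case "<": return left < right;
    default: return left > right;
    }
};

// Usage: keep processing pages while the current page number is within a limit.
var withinLimit = new Condition("($0 <= $1)");
var visited = [];
var pageNumber = 1;
var lastPage = 3; // hypothetical limit
while (withinLimit.exec([pageNumber, lastPage])) {
    visited.push(pageNumber);
    pageNumber++;
}
```

Because the expression refers to its inputs by position, the same Condition object can be reused with different element arrays on each call.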
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for deleted content (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores HTML tag attributes when comparing both pages.
    • simplifyTags: “true” means that HTML tag attributes are ignored; “false” means that they are taken into account.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: “true” means that the deleted content is shown. If the value is “false”, the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
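The effect of the prefix and suffix labels and of the comparison flags can be pictured with a small token-level comparison. The following is only a sketch in plain JavaScript, not the ITPilot Diff implementation: the function diffTokens, its option names and the placeholder labels are hypothetical, and the pages are assumed to be already split into tokens.

```javascript
// Hypothetical stand-in for illustration only; not the ITPilot Diff component.
function diffTokens(baseTokens, finalTokens, options) {
  const opts = Object.assign({
    additionPrefix: "<ins>",   // placeholder labels, not ITPilot's defaults
    additionSuffix: "</ins>",
    deletionPrefix: "<del>",
    deletionSuffix: "</del>",
    showRemovedContent: true,
    caseInsensitive: false,
    nullWhenEquals: false,
  }, options);
  const key = (token) => (opts.caseInsensitive ? token.toLowerCase() : token);
  const n = baseTokens.length;
  const m = finalTokens.length;

  // lcs[i][j] = length of the longest common subsequence of
  // baseTokens[i..] and finalTokens[j..].
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = key(baseTokens[i]) === key(finalTokens[j])
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, labelling added and deleted tokens.
  const out = [];
  let i = 0;
  let j = 0;
  let changed = false;
  while (i < n || j < m) {
    if (i < n && j < m && key(baseTokens[i]) === key(finalTokens[j])) {
      out.push(finalTokens[j]);
      i++;
      j++;
    } else if (j < m && (i === n || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(opts.additionPrefix + finalTokens[j] + opts.additionSuffix);
      j++;
      changed = true;
    } else {
      if (opts.showRemovedContent) {
        out.push(opts.deletionPrefix + baseTokens[i] + opts.deletionSuffix);
      }
      i++;
      changed = true;
    }
  }
  return !changed && opts.nullWhenEquals ? null : out.join(" ");
}

const sourceTokens = ["<p>", "hello", "world", "</p>"];
const targetTokens = ["<p>", "hello", "there", "</p>"];
console.log(diffTokens(sourceTokens, targetTokens));
// prints "<p> hello <ins>there</ins> <del>world</del> </p>"
```

In this sketch, setting showRemovedContent to false drops the deleted tokens from the output, which is why the deletion labels no longer have any effect in that case.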
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
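The positional placeholders ($0, $1, ...) refer to the elements of the list passed to exec. A minimal sketch of that idea in plain JavaScript follows. The helper evalPositional is hypothetical and evaluates ordinary JavaScript operators; ITPilot evaluates expressions with its own function set (Appendix A of [GENER]), so this only illustrates the placeholder binding.

```javascript
// Hypothetical helper for illustration; not ITPilot's expression evaluator.
function evalPositional(expression, inputs) {
  // Rewrite each $n placeholder as an index into the inputs array.
  const body = expression.replace(/\$(\d+)/g, (match, index) => "inputs[" + index + "]");
  // Assumption: the expression text is trusted, plain JavaScript.
  return new Function("inputs", "return (" + body + ");")(inputs);
}

console.log(evalPositional("$0 <= $1", [3, 5])); // prints true
console.log(evalPositional("$0 + $1", [2, 4]));  // prints 6
```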
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. “true” if the pattern-merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): lets the user set the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further data extraction operations are going to be executed.
    • back: NSEQL back sequence program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST navigation has not been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
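The interplay between syncWithPost, setBackSequence and setBackPages amounts to a three-way choice. The sketch below restates that order of precedence in plain JavaScript. The function chooseReturnStrategy, its config object and the result shapes are hypothetical, and the default of one page for the Back() command is an assumption made here, not a documented value.

```javascript
// Hypothetical restatement of the precedence described above; not ITPilot code.
function chooseReturnStrategy(config) {
  // POST synchronization is described as the default, so only an explicit
  // false disables it.
  const syncWithPost = config.syncWithPost !== false;
  if (syncWithPost) {
    return { kind: "POST" };
  }
  // Next choice: an explicit back sequence, if one was configured.
  if (config.backSequence) {
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Last resort: a Back() command. The default of 1 page is an assumption.
  const pages = config.backPages === undefined ? 1 : config.backPages;
  return { kind: "BACK_COMMAND", pages: pages };
}

console.log(chooseReturnStrategy({}).kind);                                   // prints POST
console.log(chooseReturnStrategy({ syncWithPost: false, backPages: 2 }).kind); // prints BACK_COMMAND
```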
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which is described in the exec function.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
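The behaviour described in the note can be sketched in plain JavaScript as follows. This is an illustration only: filterRecords and its predicate callback are hypothetical stand-ins (the real component takes an expression string), and the two constants are redefined locally as plain strings.

```javascript
// Hypothetical stand-ins for illustration; not the ITPilot Filter component.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, predicate, auxiliaryRecords, onError) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      // auxiliaryRecords take part in the condition but are never filtered.
      if (predicate(record, auxiliaryRecords)) {
        kept.push(record);
      }
    } catch (err) {
      // With ON_ERROR_IGNORE the failing record is skipped; otherwise re-raise.
      if (onError !== ON_ERROR_IGNORE) {
        throw err;
      }
    }
  }
  return kept;
}

const offers = [{ price: 5 }, { price: null }, { price: 20 }];
const limits = [{ maxPrice: 10 }];
const cheapEnough = (record, aux) => {
  if (record.price === null) throw new Error("missing price");
  return record.price <= aux[0].maxPrice;
};
console.log(filterRecords(offers, cheapEnough, limits, ON_ERROR_IGNORE));
// prints [ { price: 5 } ]
```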
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: allows a run loop to be generated for a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field element has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each values element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function (true) or may simply contain them (false).
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
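One way to picture how a form iteration is driven is as the set of combinations of the values configured for each field, walked with hasNext() and next(). The sketch below models this in plain JavaScript. The class FormCombinations and its methods addValues, addCheckbox and combinations are hypothetical stand-ins (the real functions also take a field position and produce NSEQL sequences), and the order in which combinations are generated is an assumption.

```javascript
// Hypothetical model for illustration; not the ITPilot Form_Iterator object.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

class FormCombinations {
  constructor() {
    this.fields = [];
    this.maxIterations = Infinity;
    this.cursor = 0;
    this.cache = null;
  }
  // Values to try, one per iteration, for an input-like field.
  addValues(field, values) {
    this.fields.push({ field: field, options: values.slice() });
    return this;
  }
  // A checkbox contributes one state, or two for CLICKED_AND_NON_CLICKED_ELEMENT.
  addCheckbox(field, value, state) {
    const states = state === CLICKED_AND_NON_CLICKED_ELEMENT
      ? [true, false]
      : [state === CLICKED_ELEMENT];
    this.fields.push({
      field: field,
      options: states.map((checked) => ({ value: value, checked: checked })),
    });
    return this;
  }
  setMaxIterations(count) {
    this.maxIterations = count;
    return this;
  }
  // Cartesian product of the options of every configured field.
  combinations() {
    if (this.cache === null) {
      let combos = [{}];
      for (const { field, options } of this.fields) {
        const expanded = [];
        for (const combo of combos) {
          for (const option of options) {
            expanded.push({ ...combo, [field]: option });
          }
        }
        combos = expanded;
      }
      this.cache = combos;
    }
    return this.cache;
  }
  hasNext() {
    return this.cursor < Math.min(this.combinations().length, this.maxIterations);
  }
  next() {
    if (!this.hasNext()) throw new Error("no more iterations");
    return this.combinations()[this.cursor++];
  }
}

const formRuns = new FormCombinations()
  .addValues("city", ["Madrid", "Paris"])
  .addCheckbox("promo", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);
while (formRuns.hasNext()) {
  console.log(JSON.stringify(formRuns.next()));
}
```

Two values for one field and a checkbox in both states give four iterations in this model; a maximum number of iterations simply cuts the walk short.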
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in “cookies”.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: is responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional but, once specified, it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates the constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
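The three values of the obl parameter can be summarized with a small model of how an initialization record might be built from the query parameters. This is only a sketch in plain JavaScript: the class InitSketch and its setField method are hypothetical, exec receives the query parameters explicitly here (the real exec() takes no arguments), and filling a missing optional field with null is an assumption.

```javascript
// Hypothetical model for illustration; not the ITPilot Init object.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitSketch {
  constructor() {
    this.fields = [];
  }
  // Stands in for setText, setInt, setDate and so on; typing is left out.
  setField(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field: field, obl: obl, fixedValue: fixedValue });
    return this;
  }
  // Builds the initialization record from the parameters of one query.
  exec(queryParams) {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      const supplied = Object.prototype.hasOwnProperty.call(queryParams, field);
      if (obl === FIXED) {
        record[field] = fixedValue;           // constant, ignores the query
      } else if (supplied) {
        record[field] = queryParams[field];
      } else if (obl === OBLIGATORY) {
        throw new Error("missing obligatory parameter: " + field);
      } else {
        record[field] = null;                 // assumption for absent optionals
      }
    }
    return record;
  }
}

const init = new InitSketch()
  .setField("KEYWORDS", OBLIGATORY)
  .setField("CATEGORY")
  .setField("LANGUAGE", FIXED, "en");
console.log(init.exec({ KEYWORDS: "laptop" }));
// prints { KEYWORDS: 'laptop', CATEGORY: null, LANGUAGE: 'en' }
```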
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5. Using threads in the Iterator component
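For comparison, the sequential use of hasNext() and next() can be sketched with a plain JavaScript stand-in. The class ListIterator below is hypothetical and only mimics the two documented functions over an ordinary array; throwing when the list is exhausted is an assumption.

```javascript
// Hypothetical stand-in for illustration; not the ITPilot Iterator object.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.index = 0;
  }
  hasNext() {
    return this.index < this.list.length;
  }
  next() {
    if (!this.hasNext()) throw new Error("no more records");
    return this.list[this.index++];
  }
}

// Records are visited one by one, in list order.
const recordIterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }]);
const visitedTitles = [];
while (recordIterator.hasNext()) {
  visitedTitles.push(recordIterator.next().TITLE);
}
console.log(visitedTitles); // prints [ 'A', 'B' ]
```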
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL of the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6. Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds to the output the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
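The positional field reference used in the add function can be pictured with a small stand-alone sketch. It assumes the reference is written as $<index>.<FIELD> (for example $0.PARAM1) and uses plain JavaScript objects in place of ITPilot records. The helper resolveFieldReference is hypothetical and is not part of the ITPilot API; the real component also accepts the functions of Appendix A of [GENER], which this sketch does not attempt to cover.

```javascript
// Hypothetical helper, NOT part of ITPilot: resolves "$<index>.<FIELD>"
// against a list of plain objects that play the role of input records.
function resolveFieldReference(recordsObj, expression) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (match === null) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[parseInt(match[1], 10)]; // $0 is the first input record
    return record[match[2]];
}

// Two stand-in input records, in the order they would be passed to the constructor.
var inputRecords = [{ PARAM1: "first", PARAM2: "x" }, { PARAM1: "second" }];

var value0 = resolveFieldReference(inputRecords, "$0.PARAM1"); // field of record 0
var value1 = resolveFieldReference(inputRecords, "$1.PARAM1"); // field of record 1
```

The index after the dollar sign selects the position in the recordsObj list, so reordering that list changes which record a given reference points to.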
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a home page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
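As an illustration of the naming convention described above, the following is a minimal sketch of a custom component file for a hypothetical component named "shout". The component name, the assumption that the input can be treated as a plain string, and the local declaration of STRING_TYPE are choices made only for this sketch. The getInputStructure function is left out because it would rely on the Record_Structure object provided by ITPilot, and getOutputStructure is not needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Numeric code listed in this section for STRING_TYPE. It is declared locally
// so that the sketch runs standalone; this is an assumption of the example.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: the input is handled here as a plain string value.
function shout_main(shout_input) {
    var shout_output = null;
    if (shout_input !== null && shout_input !== undefined) {
        shout_output = String(shout_input).toUpperCase();
    }
    return shout_output;
}

// Declares the output type of the component.
function shout_getOutputType() {
    return STRING_TYPE;
}
```

Every function name starts with the component name followed by an underscore, which is what ties the functions in the file to the component.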
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been written.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside the code they must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
5.1 INTRODUCTION
Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic prior knowledge of the JavaScript language.
Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.
5.2 REPRESENTATION FORMAT OF A WRAPPER
An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }
    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }
    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript
There are three possible functions in each script, one mandatory and two optional:
1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).
These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.
(1) Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.
5.2.1 Initialization of Searchable Parameters
The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the component catalog).
5.2.2 Main Function
This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.
5.2.3 Generating the Output Structure
The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.
NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
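The rule about omitting parameters follows from the way JavaScript binds arguments by position. The sketch below is a stand-in written only for illustration: describeTextField is a hypothetical function with the same parameter order as setText(field, regexp, type), and the OPTIONAL and OBLIGATORY values are local placeholders rather than the constants supplied by ITPilot.

```javascript
// Local placeholder values, used by this sketch only.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in with the parameter order of setText(field, regexp, type).
// Omitted trailing arguments arrive as undefined and receive their defaults.
function describeTextField(field, regexp, type) {
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    return { field: field, regexp: regexp, type: type };
}

// Both trailing parameters omitted: the defaults are applied.
var withDefaults = describeTextField("FIELD");

// The regexp is given explicitly so that the type can be set.
var mandatory = describeTextField("FIELD", "", OBLIGATORY);

// The middle parameter is skipped: OBLIGATORY is bound to regexp, not to type.
var misplaced = describeTextField("FIELD", OBLIGATORY);
```

In the last call the field silently stays optional and the constant is taken as the regular expression, which is why an argument can only be omitted when nothing to its right needs a value.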
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see section 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • format (optional): date format following [DATEFORMAT]. By default its value is d-MMM-yyyy Hh mm ss
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR explained in section 5.3.22 must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. Where the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
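The difference between the four error actions can be sketched with a small simulation in plain JavaScript. This is not how ITPilot implements error handling: runWithErrorAction is a hypothetical helper, the four constants are local placeholders, "rerun" is modelled as re-executing the failing operation, and ON_ERROR_RETRY is assumed to behave like ON_ERROR_RAISE once its retries are exhausted (the text above does not state this explicitly).

```javascript
// Local placeholders; in a wrapper these names refer to ITPilot constants.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Hypothetical helper: runs `operation` under one of the four error actions.
function runWithErrorAction(operation, errorAction, retries) {
    var retrying = (errorAction === ON_ERROR_RETRY ||
                    errorAction === ON_ERROR_RETRY_IGNORE);
    var attempts = retrying ? retries + 1 : 1; // first run plus the retries
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return operation();
        } catch (e) {
            lastError = e;
        }
    }
    if (errorAction === ON_ERROR_IGNORE ||
        errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // carry on; most components yield null in this case
    }
    throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with no retries left
}

// An operation that fails twice and then succeeds.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) { throw new Error("simulated failure " + calls); }
    return "page";
}

var retried = runWithErrorAction(flaky, ON_ERROR_RETRY, 2);
var ignored = runWithErrorAction(function () {
    throw new Error("simulated failure");
}, ON_ERROR_IGNORE, 0);
```

With two retries the flaky operation succeeds on its third run, while the ignored failure produces null, so code that uses ON_ERROR_IGNORE generally has to be prepared for a null result.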
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string; e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
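As a minimal illustration of how the positional references in a condition expression map onto the elements passed to exec, the following sketch uses a plain JavaScript function as a stand-in for a Condition built with the expression ($0 <= $1). The function name lessOrEqualCondition is hypothetical and is not part of ITPilot; the real component evaluates the expression string itself.

```javascript
// Stand-in for new Condition("($0 <= $1)"); NOT the ITPilot object.
// $0 refers to elements[0] and $1 to elements[1].
function lessOrEqualCondition(elements) {
    return elements[0] <= elements[1];
}

// Equivalent in spirit to MyCondition.exec([3, 7]) and MyCondition.exec([9, 7]).
var met = lessOrEqualCondition([3, 7]);
var notMet = lessOrEqualCondition([9, 7]);
```

The order of the elements in the array is what gives $0 and $1 their meaning, so swapping the two elements reverses the result of this particular condition.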
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether deleted content is shown in the result page.
    • mergedDeletions: with "true", the deleted content is shown. With "false", the configuration set by the setDeletionPrefixLabel and setDeletionSuffixLabel functions is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions can be applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
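To illustrate how the addition and deletion labels delimit changed content in a result page, the following minimal sketch in plain JavaScript marks the differences between two token lists. It is not the ITPilot Diff implementation: the markDifferences function, the labels object and the <ins>/<del> labels are hypothetical, and the longest-common-subsequence comparison is only an assumed stand-in for whatever algorithm the component actually uses.

```javascript
// Hypothetical stand-in, NOT the ITPilot Diff component.
// Marks added and deleted tokens with configurable prefix/suffix labels.
function markDifferences(baseTokens, finalTokens, labels) {
    var n = baseTokens.length;
    var m = finalTokens.length;
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of
    // baseTokens[i..] and finalTokens[j..].
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs.push(new Array(m + 1).fill(0));
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = baseTokens[i] === finalTokens[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    var out = [];
    i = 0;
    j = 0;
    while (i < n || j < m) {
        if (i < n && j < m && baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]);            // unchanged token
            i++;
            j++;
        } else if (j < m && (i === n || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
            j++;
        } else {
            out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
            i++;
        }
    }
    return out.join(" ");
}

// Illustrative labels; the real defaults are green/red background HTML tags.
var diffLabels = {
    additionPrefix: "<ins>", additionSuffix: "</ins>",
    deletionPrefix: "<del>", deletionSuffix: "</del>"
};
var marked = markDifferences(
    ["<b>", "Price", "10", "</b>"],
    ["<b>", "Price", "12", "</b>"],
    diffLabels
);
// marked is "<b> Price <ins>12</ins> <del>10</del> </b>"
```

In this model, omitting the deleted tokens altogether would correspond to calling setShowRemovedContent with "false".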
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
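The positional placeholders ($0, $1, and so on) refer to the elements of the list passed to exec, in order. As a rough illustration only, the following plain JavaScript sketch shows that binding step; bindPlaceholders is a hypothetical helper and not part of the ITPilot API, and it substitutes values textually rather than evaluating the expression.

```javascript
// Hypothetical helper, NOT part of the ITPilot API.
// Replaces $0, $1, ... with the corresponding element of the input list.
function bindPlaceholders(expression, values) {
    return expression.replace(/\$(\d+)/g, function (match, index) {
        var position = Number(index);
        if (position >= values.length) {
            throw new Error("No input value for placeholder " + match);
        }
        return JSON.stringify(values[position]);
    });
}

var bound = bindPlaceholders("($0 <= $1)", [3, 7]);
// bound is "(3 <= 7)": $0 took the first list element, $1 the second
```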
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browse sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
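The fallback order described for syncWithPost, setBackSequence and setBackPages can be summarized in a small decision function. This is only a sketch in plain JavaScript of the order stated above; chooseBackStrategy, its config object and the returned strategy names are hypothetical, and the default of one page when backPages is not set is an assumption.

```javascript
// Hypothetical model of the page-state recovery order described above.
// NOT part of the ITPilot API; names and return shape are illustrative.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-send the POST parameters originally used to reach the page.
        return { strategy: "POST" };
    }
    if (config.backSequence) {
        // An explicit NSEQL back sequence was configured.
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command.
    // Assumption: go back one page when no number of pages was configured.
    return { strategy: "NSEQL_BACK", pages: config.backPages || 1 };
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```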
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are supplied in the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements, except for the one that caused the error.
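The effect of ON_ERROR_IGNORE described in the note can be pictured with a plain JavaScript sketch. The filterRecords function, its predicate argument and the ignoreErrors flag are hypothetical stand-ins for the Filter component and its error handler; they are not ITPilot API calls.

```javascript
// Hypothetical stand-in for the Filter component, NOT the ITPilot API.
// With ignoreErrors set, a record whose evaluation fails is skipped
// and the remaining records are still filtered.
function filterRecords(inputRecords, predicate, ignoreErrors) {
    var result = [];
    inputRecords.forEach(function (record) {
        try {
            if (predicate(record)) {
                result.push(record);
            }
        } catch (error) {
            if (!ignoreErrors) {
                throw error;   // comparable to raising the error
            }
            // comparable to ON_ERROR_IGNORE: drop only the failing record
        }
    });
    return result;
}

function isCheap(record) {
    if (typeof record.price !== "number") {
        throw new Error("price is not a number");
    }
    return record.price <= 10;
}

var sampleRecords = [{ price: 5 }, { price: "n/a" }, { price: 8 }, { price: 20 }];
var cheap = filterRecords(sampleRecords, isCheap, true);
// cheap holds the records with price 5 and 8; the "n/a" record is skipped
```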
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop on a specific form, in which predetermined values for each of the form fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): determines whether the component launches its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
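One way to picture how the values configured for several fields turn into form submissions is as a set of value combinations. The sketch below, in plain JavaScript, assumes that one iteration is produced per combination of field values and models CLICKED_AND_NON_CLICKED_ELEMENT as the two states true and false. Both points are assumptions made for illustration, and fieldCombinations is a hypothetical function, not part of the ITPilot API.

```javascript
// Hypothetical model, NOT the ITPilot Form_Iterator.
// Assumption: one iteration per combination of the configured field values.
function fieldCombinations(fields) {
    return fields.reduce(function (partial, field) {
        var extendedList = [];
        partial.forEach(function (combination) {
            field.values.forEach(function (value) {
                var extended = Object.assign({}, combination);
                extended[field.name] = value;
                extendedList.push(extended);
            });
        });
        return extendedList;
    }, [{}]);
}

var runs = fieldCombinations([
    { name: "city", values: ["Madrid", "Paris"] },
    // A checkbox set to be both marked and unmarked: two states.
    { name: "newsletter", values: [true, false] }
]);
// runs.length is 4: each city with the checkbox marked and unmarked
```

Under this model, a limit such as the one set by setMaxIterations would simply cap how many of these combinations are executed.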
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in cookies.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory once completed; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
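The three values of the obl parameter can be read as simple rules applied when a query reaches the wrapper. The following plain JavaScript sketch models those rules; it does not use the ITPilot Init object. The resolveParameters function and the constants declared here are local stand-ins, and two behaviours are assumptions: a FIXED field ignores any value supplied in the query, and an OPTIONAL field without a value is set to null.

```javascript
// Local stand-ins for illustration; NOT the ITPilot constants or Init object.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Builds the initialization record from field definitions and a query.
function resolveParameters(fields, query) {
    var record = {};
    fields.forEach(function (field) {
        var supplied = Object.prototype.hasOwnProperty.call(query, field.name);
        if (field.obl === FIXED) {
            // Assumption: the constant value always wins.
            record[field.name] = field.fixedValue;
        } else if (supplied) {
            record[field.name] = query[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        } else {
            // Assumption: an optional field without a value becomes null.
            record[field.name] = null;
        }
    });
    return record;
}

var searchFields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE", obl: OPTIONAL },
    { name: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];
var initRecord = resolveParameters(searchFields, { KEYWORD: "laptop" });
// initRecord is { KEYWORD: "laptop", MAX_PRICE: null, LANGUAGE: "en" }
```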
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
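For the sequential case, the hasNext()/next() contract can be tried out with a small stand-in. The ListIterator constructor below is a hypothetical plain JavaScript substitute for the ITPilot Iterator object, written only so that the loop can run outside ITPilot; raising an error when next() is called past the end of the list is an assumption.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

ListIterator.prototype.next = function () {
    if (!this.hasNext()) {
        // Assumption: calling next() past the end is treated as an error.
        throw new Error("No more records");
    }
    return this.list[this.position++];
};

var recordIterator = new ListIterator([{ title: "A" }, { title: "B" }]);
var titles = [];
while (recordIterator.hasNext()) {
    var currentRecord = recordIterator.next();
    titles.push(currentRecord.title);   // records arrive in list order
}
// titles is ["A", "B"]
```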
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function
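The same while pattern can be run outside ITPilot with a stand-in for the Condition object. In the plain JavaScript sketch below, makeCondition is a hypothetical helper that wraps an ordinary function instead of an ITPilot condition expression; only the exec(list) call shape is taken from the figure above.

```javascript
// Hypothetical stand-in for an ITPilot Condition object: exec() receives
// a list of inputs and returns true while the loop should continue.
function makeCondition(test) {
    return {
        exec: function (inputs) {
            return test.apply(null, inputs);
        }
    };
}

var lastPage = 3;
var currentPage = 1;
var visitedPages = [];
var pageLoop = makeCondition(function (current, last) {
    return current <= last;
});

// The loop body repeats as long as the condition is met.
while (pageLoop.exec([currentPage, lastPage])) {
    visitedPages.push(currentPage);   // <loop operations>
    currentPage++;
}
// visitedPages is [1, 2, 3]
```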
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information.
    • inputPage: indicates the page from which the next browsing sequence is to be executed.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result after the component has been created.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although this may not be a good option if the previous iterator is run in parallel with it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
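The do… while shape of Figure 7 can be tried outside ITPilot with a stand-in for the Condition object. The sketch below is only an illustration: ConditionSketch is a hypothetical replacement that takes a JavaScript predicate instead of an ITPilot expression string, and the names pagesVisited and MAX_PAGES are invented for the example. It loops while exec() returns true, exactly as the figure is written, and treats a null result as false.

```javascript
// Hypothetical stand-in for the ITPilot Condition object (not the real API):
// it wraps a plain JavaScript predicate instead of an expression string.
function ConditionSketch(predicate) {
    this.predicate = predicate;
}
// onError is accepted and ignored so the call pattern matches Figure 7.
ConditionSketch.prototype.onError = function (errorId, errorAction) {};
// exec applies the predicate to the element list; anything but true counts as false.
ConditionSketch.prototype.exec = function (elements) {
    return this.predicate.apply(null, elements) === true;
};

var MAX_PAGES = 3;      // invented limit for the example
var pagesVisited = 0;   // invented counter standing in for <loop_operations>

var repeat = new ConditionSketch(function (visited, limit) {
    return visited < limit;
});
repeat.onError("RUNTIME_ERROR", "ON_ERROR_RAISE");

do {
    pagesVisited++;     // the loop body always runs at least once
} while (repeat.exec([pagesVisited, MAX_PAGES]));
```

Because the test is evaluated after the body, the operations run once even when the condition is false from the start.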
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the indicated function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
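The call pattern described for Thread (one execute call per record, followed by a single wait) can be sketched as follows. ThreadSketch is a hypothetical stand-in, not the ITPilot object: it has no real concurrency, it simply queues each request and runs the queue in order when wait() is called, and it only records the value passed to setMaxConcurrentThreads. The function processRecord, the TITLE field and the function table passed to the constructor are likewise invented for the example.

```javascript
// Hypothetical sequential stand-in for the ITPilot Thread object.
// functions: table mapping function names to JavaScript functions (an assumption;
// the real component resolves the name inside the wrapper script).
function ThreadSketch(functions) {
    this.functions = functions;
    this.queue = [];
    this.maxConcurrent = 1;
}
// Recorded only; this stand-in always runs one job at a time.
ThreadSketch.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};
// execute(functionName, arg1, arg2, ...): queue one run of the named function.
ThreadSketch.prototype.execute = function (functionName) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.queue.push({ name: functionName, args: args });
};
// wait(): return only when every queued execution has finished.
ThreadSketch.prototype.wait = function () {
    while (this.queue.length > 0) {
        var job = this.queue.shift();
        this.functions[job.name].apply(null, job.args);
    }
};

var titles = [];
// Per-record processing; its arguments match those passed to execute.
function processRecord(record, prefix) {
    titles.push(prefix + record.TITLE);
}

var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var thread = new ThreadSketch({ processRecord: processRecord });
thread.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
    thread.execute("processRecord", records[i], "item-");
}
thread.wait();
```

In a real wrapper the executions may complete in any order, so shared state such as the titles array should not depend on ordering.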
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
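As a rough illustration of the file layout described in this section, the sketch below shows what a mycustom.js file might look like for a component that returns a string. Several things are assumptions made for the example rather than documented behavior: the input is treated as a plain object with a TEXT field, mycustom_getInputStructure is assumed to return the structure it builds, and RecordStructureSketch is a hypothetical stand-in for the Record Structure object that ITPilot itself supplies. The STRING_TYPE value is the one listed above.

```javascript
// Output type code taken from the list in this section.
var STRING_TYPE = 13;

// Hypothetical stand-in for the ITPilot Record Structure object, defined here
// only so that the sketch can run on its own.
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}
RecordStructureSketch.prototype.setText = function (field) {
    this.fields.push(field);
};

// Main function of the component named "mycustom".
// Assumption: the input element exposes its fields as properties.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = String(mycustom_input.TEXT).toUpperCase();
    return mycustom_output;
}

// Input schema: a single text field. Returning the structure is an assumption.
function mycustom_getInputStructure() {
    var input = new RecordStructureSketch("MYCUSTOM_INPUT");
    input.setText("TEXT");
    return input;
}

// Output type: a string, so mycustom_getOutputStructure is not needed
// (it is required only for RECORD_TYPE or LIST_TYPE).
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Keeping the output a simple type avoids having to describe an output structure at all, which makes this the smallest useful component shape.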
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement is written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been developed.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside the code they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.2.1 Initialization of Searchable Parameters
This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described later, in section 5.3 (the component catalog).
5.2.2 Main Function
This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.
5.2.3 Generating the Output Structure
This function, if it exists, determines the wrapper’s output structure. The structure is a data record implemented by the RecordStructure object, which is defined in the section 5.3 catalog.
5.3 PREDEFINED ITPILOT COMPONENT GUIDE
5.3.1 Introduction
This chapter lists the predefined ITPilot components. Each component is represented as an instantiable JavaScript object with a series of functions, which are described below.
NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText(FIELD) is equivalent to rs.setText(FIELD, "", OPTIONAL). However, rs.setText(FIELD, OBLIGATORY) is not valid; rs.setText(FIELD, "", OBLIGATORY) must be used instead.
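The rule that parameters can only be omitted from the right follows from how JavaScript binds arguments by position, and it can be seen with a small stand-in. In the sketch below, RecordStructureSketch is a hypothetical replacement for the ITPilot object, and the string values given to OPTIONAL and OBLIGATORY are invented for the example; only the defaults (an empty regular expression and an optional field) come from the descriptions in section 5.3.2.1.

```javascript
// Invented values; ITPilot defines its own OPTIONAL and OBLIGATORY constants.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in for the Record Structure object.
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}
// setText(field, regexp, type): regexp and type may be omitted from the right.
RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: regexp === undefined ? "" : regexp,  // documented default: ""
        type: type === undefined ? OPTIONAL : type   // documented default: optional
    });
};

var rs = new RecordStructureSketch("RESULT");
rs.setText("TITLE");                  // same as setText("TITLE", "", OPTIONAL)
rs.setText("PRICE", "", OBLIGATORY);  // regexp must be supplied to reach type
rs.setText("NOTE", OBLIGATORY);       // pitfall: OBLIGATORY is taken as the regexp
```

The last call shows why the shortened form is not valid: the second argument is bound to regexp, so the field silently stays optional.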
5.3.2 Data Structures
ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.
5.3.2.1 Record Structure
• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression for the character string. By default, if no constraint exists, its value is “”.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression for the character string. By default, if no constraint exists, its value is “”.
    • format: (optional) date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy Hh mm ss.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type: (optional) defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string representation.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into an array of JavaScript objects.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId values.
  o errorId: the type of error for which the behavior is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” if there is an error, except for FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, the result is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configurable (see the setRetries and setRetryDelay functions of the components that provide them).
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
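The four error actions can be pictured with a small model. This is a simplified sketch of the behavior described above, not ITPilot's implementation: the function runWithErrorAction, the string values of the constants and the flaky test operation are all invented, retries are applied to a single operation rather than to the wrapper, there is no delay between attempts, and it is assumed that ON_ERROR_RETRY raises the last error once the retries are exhausted.

```javascript
// Invented values; ITPilot defines its own constants with these names.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Hypothetical model: run operation() under the given error action.
// retries is the number of extra attempts for the two retry actions.
function runWithErrorAction(operation, errorAction, retries) {
    var retrying = errorAction === ON_ERROR_RETRY ||
                   errorAction === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return operation();       // success: hand back the result
        } catch (e) {
            lastError = e;            // remember the failure and maybe retry
        }
    }
    if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
        return null;                  // ignored errors surface as null
    }
    throw lastError;                  // RAISE, or RETRY with retries exhausted
}

// A source that fails twice and then answers.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) {
        throw new Error("simulated failure " + calls);
    }
    return "page";
}

var ignored = runWithErrorAction(function () { throw new Error("boom"); }, ON_ERROR_IGNORE, 0);
var retried = runWithErrorAction(flaky, ON_ERROR_RETRY, 3);
```

Code that consumes the result of a component configured with ON_ERROR_IGNORE therefore has to be prepared for a null value.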
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
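How the positional references $0 and $1 relate to the element list passed to exec can be shown with a toy evaluator. ConditionSketch below is a hypothetical stand-in written for this illustration only: it understands nothing but a single comparison between two positional references using <=, >=, < or >, whereas the real expression language and its functions are those described in Appendix A of [GENER].

```javascript
// Hypothetical toy stand-in for the Condition component.
// Supports only expressions of the form "($i OP $j)" with OP in <=, >=, <, >.
function ConditionSketch(expr) {
    var match = /^\(?\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!match) {
        throw new Error("unsupported expression: " + expr);
    }
    this.left = parseInt(match[1], 10);   // index of the left-hand element
    this.operator = match[2];
    this.right = parseInt(match[3], 10);  // index of the right-hand element
}
// exec(elements): $n is replaced by elements[n] before comparing.
ConditionSketch.prototype.exec = function (elements) {
    var a = elements[this.left];
    var b = elements[this.right];
    switch (this.operator) {
        case "<=": return a <= b;
        case ">=": return a >= b;
        case "<":  return a < b;
        default:   return a > b;
    }
};

var MyCondition = new ConditionSketch("($0 <= $1)");
var firstResult = MyCondition.exec([3, 7]);   // $0 = 3, $1 = 7
var secondResult = MyCondition.exec([9, 7]);  // $0 = 9, $1 = 7
```

The same condition object can be reused with different element lists, which is what makes it suitable as a loop test.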
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be properly identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
o setDeletionSuffixLabel(deletionSuffixLabel) modifies the deleted data ending tag
bull deletionSuffixLabel prefix to use when generating the result page for the deleted content (by default red background HTML endtag)
o setNullWhenEquals(nullWhenEquals) if the result page is identical to any of the two input pages the component will return ldquonullrdquo instead of the page itself
bull nullWhenEquals ldquotruerdquo implies that ldquonullrdquo will be returned when both pages are equal ldquofalserdquo means that the result page will be returned
o setIgnoreTagAttributes(simplifyTags) the component will not take into account the HTML tag attributes when comparing both pages
bull simplifyTags ldquotruerdquo means that the HTML tag attributes will be ignored With ldquofalserdquo they will not be ignored
o setCaseInsensitive (toLowerCase) used to establish whether the capitalization will be taken into account when comparing the pages
bull toLowerCase ldquotruerdquo transforms all HTML content to lower case ldquofalserdquo keeps the content as is
o setShowRemovedContent(mergedDeletions) whether the delete content is shown in the result page or not
bull mergedDeletions ldquotruerdquo the delete content will be shown If the value is ldquofalserdquo the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account
o addTokenReplacement(replacement) allows the addition of a regular expression to a list These regular expressions can be applied on HTML tokens of the source pages before comparing them
bull replacement Perl [PERL] regular expression
o addIgnoredToken(regexp) allows the addition of a regular expression to the list These regular expressions can be applied on HTML tokens of the page Those that match the regular expression will be discarded before starting the comparison
bull regexp Perl [PERL] regular expression
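The effect of addTokenReplacement and addIgnoredToken can be pictured with the following minimal sketch in plain JavaScript. It is not the implementation of the Diff component: the whitespace tokenizer, the {pattern, replacement} rule shape and the helper names normalizeTokens and samePage are assumptions made only for illustration.

```javascript
// Hypothetical helpers; none of these names belong to the ITPilot API.
function normalizeTokens(html, replacements, ignored) {
  return html
    .split(/\s+/)                                   // naive tokenizer (assumption)
    .filter(function (t) { return t.length > 0; })
    .map(function (token) {
      // Apply every replacement rule to the token, in order.
      return replacements.reduce(function (acc, rule) {
        return acc.replace(rule.pattern, rule.replacement);
      }, token);
    })
    // Discard the tokens that match any "ignored" expression.
    .filter(function (token) {
      return !ignored.some(function (re) { return re.test(token); });
    });
}

// Two pages are treated as equal when their normalized tokens coincide.
function samePage(a, b, replacements, ignored) {
  var ta = normalizeTokens(a, replacements, ignored);
  var tb = normalizeTokens(b, replacements, ignored);
  return ta.length === tb.length &&
    ta.every(function (t, i) { return t === tb[i]; });
}

var replacements = [{ pattern: /sessionid=\w+/, replacement: "sessionid=X" }];
var ignored = [/^<!--ad-->$/];
var oldPage = "<a href=p?sessionid=abc123>Offers</a> <!--ad-->";
var newPage = "<a href=p?sessionid=zzz999>Offers</a>";
console.log(samePage(oldPage, newPage, replacements, ignored));
```

With the replacement and the ignored token in place, the two sample pages differ only in content that is normalized away before the comparison.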
539 ExecuteJS
bull Description ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence This component is transformed into a Sequence command (see section 5327) that executes the ExecuteJS NSEQL command (see [NSEQL])
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
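The interplay between setRetries and setRetryDelay in Figure 4 can be sketched as follows. This is a stand-alone illustration, not ITPilot code: the helper execWithRetries is hypothetical, the delay is injected as a function so that the example runs instantly, and it is assumed (the guide does not state it) that the retry count refers to the extra attempts made after the first failure.

```javascript
// Hypothetical helper: runs `operation`, retrying when it throws.
// Assumption: `retries` counts the extra attempts after the first one.
function execWithRetries(operation, retries, retryDelayMs, sleep) {
  var lastError = null;
  for (var attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (e) {
      lastError = e;
      if (attempt < retries) {
        sleep(retryDelayMs); // wait before the next attempt
      }
    }
  }
  throw lastError; // retries exhausted: the error is raised to the caller
}

// Usage: the operation fails twice and succeeds on the third attempt.
var waits = [];
var result = execWithRetries(
  function (attempt) {
    if (attempt < 2) { throw new Error("TIMEOUT_ERROR"); }
    return "page loaded";
  },
  3,
  3000,
  function (ms) { waits.push(ms); } // stand-in for a real delay
);
console.log(result, waits);
```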
5310 Expression
bull Object Expression
bull Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value
bull Functions
o Constructor(expression)
bull expression: object that defines the expression. It is written as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
o exec(exprInput) method running the component and returning the value resulting from the expression indicated in the component constructor
bull exprInput list of zero or more values zero or more records or zero or more record lists that are used as part of the expression
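The positional placeholders ($0, $1, ...) refer to the elements of the list passed to exec. The following sketch shows that idea in plain JavaScript. MiniExpression is a hypothetical stand-in, not the ITPilot evaluator, and it accepts JavaScript operators only; the real component uses the functions listed in Appendix A of [GENER].

```javascript
// Hypothetical stand-in for an expression component; not the ITPilot evaluator.
function MiniExpression(expression) {
  // Rewrite $0, $1, ... as lookups into the input list given to exec.
  var body = expression.replace(/\$(\d+)/g, "input[$1]");
  this.evaluate = new Function("input", "return (" + body + ");");
}
MiniExpression.prototype.exec = function (exprInput) {
  return this.evaluate(exprInput);
};

var lessOrEqual = new MiniExpression("$0 <= $1");
console.log(lessOrEqual.exec([3, 7])); // compares the first value with the second

var total = new MiniExpression("$0 + $1 * 2");
console.log(total.exec([1, 4]));
```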
5311 Extractor
bull Object Extractor
bull Description this is responsible for extracting structured data from an HTML page thus generating a DEXTL program ([DEXTL])
bull Functions
o Constructor(name page specification structure)
bull name name of the Extractor component instance
bull page page-type ITPilot structure from where data is to be extracted
bull specification DEXTL data extraction specification (see [DEXTL])
bull structure name of the record (previously created) that will be used to return the data extracted by the specification
o exec() main extractor method running the specification indicated in the constructor This function returns a list of records of the type defined in the constructor in the structure parameter
o setMergePatterns(merge) This applies the technique of merging patterns for greater system optimization (see [GENER] for further information)
bull merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default
o setI18n(i18n) Function that updates the process internationalization
bull i18n type of internationalization to use ITPilot provides different types of internationalization options such as ES_EURO US_PST GB and so on See [GENER] for more information about internationalization in ITPilot
5312 Fetch
bull Object Fetch
bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format
bull Functions
o Constructor(url sequenceType reusableConnection binary page)
bull url URL where the resource to be downloaded can be found (OPTIONAL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull binary: "true" if the object to be downloaded is binary; "false" if it is in text format
bull page: optionally, the page from which the HTTP request is launched
o exec(page): runs the component, returning the string- or binary-type value obtained
bull page: optionally, the page from which the HTTP request is launched
o setEncoding(encoding) allows the user to determine the MIME type [MIME] of the information to send
bull encoding MIME type of the information to send
o syncWithPost(flag): sets the method used to recover the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method
bull flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further data extraction operations will be executed
bull back: back sequence (NSEQL program)
o setReusingConnection(reusingConnection): indicates whether connections will be reused or not
bull reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session
o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
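The order in which syncWithPost, setBackSequence and setBackPages are considered when returning to a page can be summarized with a small decision function. It is only a sketch of the rules described above: chooseBackNavigation and its configuration object are hypothetical, and the default of one page for Back() is an assumption.

```javascript
// Hypothetical decision helper that mirrors the rules described above.
function chooseBackNavigation(config) {
  if (config.syncWithPost) {
    // Default method: resend the POST parameters used to reach the page.
    return { method: "POST", detail: "original POST parameters" };
  }
  if (config.backSequence) {
    // An explicit back sequence takes precedence over Back().
    return { method: "BACK_SEQUENCE", detail: config.backSequence };
  }
  // Neither POST synchronization nor a back sequence: fall back to Back().
  var pages = config.backPages || 1; // default of 1 is an assumption
  return { method: "BACK", detail: pages + " page(s)" };
}

console.log(chooseBackNavigation({ syncWithPost: true }));
console.log(chooseBackNavigation({ syncWithPost: false, backSequence: "<NSEQL back program>" }));
console.log(chooseBackNavigation({ syncWithPost: false, backPages: 2 }));
```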
5313 Filter
bull Object Filter
bull Description this carries out a filtering operation from a list of records returning those meeting a given condition
bull Functions
o Constructor(expr, auxiliaryRecords)
bull expr: selection expression of the filtering operation, evaluated on the list of records passed to the exec function
bull auxiliaryRecords: record list that takes part in the filter condition but does not contain the records to filter
o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor
bull inputRecords: list of input records
bull auxiliaryRecords: record list that takes part in the filter condition but does not contain the records to filter
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error
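The behaviour described in the NOTE can be modelled as follows. This is a minimal sketch, not the Filter implementation: filterRecords, the predicate function and the string values given to the two error constants are hypothetical.

```javascript
// Hypothetical model of the behaviour in the NOTE; not the Filter component.
var ON_ERROR_RAISE = "raise";   // illustrative values only
var ON_ERROR_IGNORE = "ignore";

function filterRecords(inputRecords, predicate, errorAction) {
  var selected = [];
  inputRecords.forEach(function (record) {
    try {
      if (predicate(record)) { selected.push(record); }
    } catch (e) {
      if (errorAction !== ON_ERROR_IGNORE) { throw e; }
      // ON_ERROR_IGNORE: skip only the record that caused the error.
    }
  });
  return selected;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
function cheapEnough(r) {
  if (r.PRICE === null) { throw new Error("PRICE has no value"); }
  return r.PRICE <= 20;
}
var cheap = filterRecords(records, cheapEnough, ON_ERROR_IGNORE);
console.log(cheap);
```

With ON_ERROR_RAISE the same call would stop at the second record instead of skipping it.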
5314 Form Iterator
bull Object Form_Iterator
bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run
bull Functions
o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)
bull findForm NSEQL program that allows for the form to be used as the basis of the iteration to be found (see [NSEQL] for further information on NSEQL)
bull submitForm NSEQL program that allows for the form to be invoked (see [NSEQL] for further information on NSEQL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component
bull inputPage: input page from which the selected form can be invoked iteratively
bull parallelIterator: if "true", the component executes its iterations in parallel
o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form
bull field name of the multiple selection field
bull position: position of the field among those with the same name, starting at position 0
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form
bull field name of the multiple selection field
bull position: position of the field among those with the same name, starting at position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form
bull field name of the HTML selection field
bull position position occupied in the event of more than one field element with the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position: position of the field when several fields on the form share the same name
bull values: list of values that must be selected in the field
bull positions: list that indicates the position held by each element of values in the event of replicated values
bull equal: boolean value that indicates whether the provided values must be identical to those appearing in the field ("true") or merely contained in them ("false")
o click(field, value, state): selects an element and runs a "click" event on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons this parameter indicates the elements selected as a list (eg [0 1]) When run on Checkboxes it indicates the value of the selectable element
bull state when this function is run on Radio Buttons this parameter is not used When run on Checkboxes it indicates the status of the element
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field position values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values: list of values to be entered in the field
o textarea(field, position, values): indicates the values added to a text area
bull field: name of the HTML text area field
bull position: position of the field when several fields on the form share the same name
bull values: list of values to be entered in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag) the component launches the iteration in parallel
bull flag: if "true", the iterations are executed in parallel
o next(inputPage) this returns the page resulting from running a component iteration
bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run
o hasNext(): determines whether there are more results. Returns "true" if there is at least one more result, or "false" if there is not
o close(): closes the iterator
o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back: NSEQL back program
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither a back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
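The way field values and click states multiply into iterations can be illustrated with a small model. It assumes that the iterations are the Cartesian product of the values configured for each field, which the guide suggests but does not state outright. FormCombinations, its methods and the field names are hypothetical and are not part of the Form_Iterator API.

```javascript
// Hypothetical model of form iteration; names are illustrative, not ITPilot API.
var CLICKED = true, NON_CLICKED = false;

function FormCombinations(maxIterations) {
  this.fields = [];                 // [{ name, options: [...] }]
  this.maxIterations = maxIterations;
  this.cursor = 0;
}
// Text-like field: one option per value.
FormCombinations.prototype.input = function (name, values) {
  this.fields.push({ name: name, options: values });
};
// Checkbox set to "clicked and non-clicked": two options.
FormCombinations.prototype.clickBoth = function (name) {
  this.fields.push({ name: name, options: [CLICKED, NON_CLICKED] });
};
FormCombinations.prototype.total = function () {
  var product = this.fields.reduce(function (n, f) { return n * f.options.length; }, 1);
  return Math.min(product, this.maxIterations);
};
FormCombinations.prototype.hasNext = function () {
  return this.cursor < this.total();
};
FormCombinations.prototype.next = function () {
  var index = this.cursor++;
  var combination = {};
  // Decode the running index as a mixed-radix number, one digit per field.
  this.fields.forEach(function (f) {
    combination[f.name] = f.options[index % f.options.length];
    index = Math.floor(index / f.options.length);
  });
  return combination;
};

var it = new FormCombinations(10);
it.input("city", ["Madrid", "Paris"]);
it.clickBoth("onlyOffers");
var runs = [];
while (it.hasNext()) { runs.push(it.next()); }
console.log(runs.length, runs);
```

Two text values and one element with both click states give four iterations; a lower maximum number of iterations would cut the sequence short.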
5315 Get Page
bull Object Get_Page
bull Description: obtains an active browser from the browser pool by means of a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by means of a Sequence object (see section 5327)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie: stored cookie information ("cookies")
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5316 Init
bull Object Init
bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application
bull Functions
o Constructor(input output)
bull input: input record of the component. It is optional and used only when custom components are created (see section 54). In standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)
o get(name) this returns the value of a record field created as a group of initialization parameters
bull name name of the record field
o setText(field obl fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field obl fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field obl fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field obl fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field obl fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field format obl fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory: the date values must comply with it, otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
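The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be illustrated with a simplified model of the initialization record. MiniInit is a hypothetical stand-in written for this example. Unlike the real exec(), which takes no arguments and obtains the query parameters from the wrapper context, its build method receives them explicitly; representing an unset optional field as null is also an assumption.

```javascript
// Hypothetical stand-in for the initialization record; not the Init object itself.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

function MiniInit() { this.fields = {}; }
MiniInit.prototype.setText = function (field, obl, fixedValue) {
  this.fields[field] = { obl: obl || OPTIONAL, fixedValue: fixedValue };
};
// Builds the initialization record from the parameters of one query.
MiniInit.prototype.build = function (queryParams) {
  var record = {};
  Object.keys(this.fields).forEach(function (name) {
    var spec = this.fields[name];
    if (spec.obl === FIXED) {
      record[name] = spec.fixedValue;        // constant value, ignores the query
    } else if (Object.prototype.hasOwnProperty.call(queryParams, name)) {
      record[name] = queryParams[name];
    } else if (spec.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + name);
    } else {
      record[name] = null;                   // optional and not supplied
    }
  }, this);
  return record;
};

var init = new MiniInit();
init.setText("KEYWORD", OBLIGATORY);
init.setText("LANGUAGE", FIXED, "en");
init.setText("CATEGORY");                    // OPTIONAL by default
console.log(init.build({ KEYWORD: "laptop" }));
```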
5317 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate ldquotruerdquo is returned if there is at least one more result
o next() this returns the next iteration element The list is a sorted sequence of records
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5329
var _thread0 = new Thread();
while (iterator.hasNext()) {
  recordInstance = iterator.next();
  _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
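For reference, the hasNext()/next() contract used in Figure 5 can be reproduced outside ITPilot with a few lines of JavaScript. ListIterator is a hypothetical stand-in for the Iterator object, and the records are processed sequentially here, without the Thread object.

```javascript
// Hypothetical stand-in for the Iterator object described above.
function ListIterator(list) {
  this.list = list;
  this.position = 0;
}
ListIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
  if (!this.hasNext()) { throw new Error("No more records"); }
  return this.list[this.position++]; // records come out in list order
};

var iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }]);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(","));
```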
5318 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure: structure of the component's output record list. It is defined as a record of values
bull baseRecords: record list to be used
bull maxPoolSize: maximum number of connections that the connection pool can manage at the same time
bull initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used
bull checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist
bull query SQL query that returns the results required by the component
o exec(query baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration
bull maxPoolSize: maximum number of connections that the connection pool can manage at the same time
bull initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used
bull pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
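The role of maxPoolSize, initialPoolSize and the check query can be pictured with an in-memory model of a connection pool. The sketch below opens no real JDBC connections and does not describe how ITPilot implements its pool: MiniPool, fakeConnection and the ping method (which stands in for running the check query) are hypothetical.

```javascript
// Hypothetical in-memory pool; it opens no real JDBC connections.
function MiniPool(maxPoolSize, initialPoolSize, openConnection) {
  this.maxPoolSize = maxPoolSize;
  this.open = openConnection;
  this.idle = [];
  this.inUse = 0;
  // Idle connections established up front, ready to be used.
  for (var i = 0; i < initialPoolSize; i++) { this.idle.push(openConnection()); }
}
MiniPool.prototype.acquire = function () {
  // Reuse an idle connection only if it still answers the check query.
  while (this.idle.length > 0) {
    var candidate = this.idle.pop();
    if (candidate.ping()) { this.inUse++; return candidate; }
  }
  if (this.inUse >= this.maxPoolSize) { throw new Error("Pool exhausted"); }
  this.inUse++;
  return this.open();
};
MiniPool.prototype.release = function (connection) {
  this.inUse--;
  this.idle.push(connection);
};

var opened = 0;
function fakeConnection() {
  opened++;
  return { id: opened, alive: true, ping: function () { return this.alive; } };
}
var pool = new MiniPool(2, 1, fakeConnection);
var c1 = pool.acquire();   // reuses the connection opened at start-up
var c2 = pool.acquire();   // no idle connection left, so a second one is opened
var exhausted = false;
try { pool.acquire(); } catch (e) { exhausted = true; }
console.log(opened, exhausted);
```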
5319 Loop
bull Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript with a while loop whose condition is a Condition object. The Condition object is defined in section 535. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
  <loop operations>
  ...
}
Figure 6 Using the Loop function
5320 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description: allows iteration over a set of inter-related pages by means of one or several browsing sequences
bull Functions
o Constructor(sequences iterations sequenceType reuse inputPage)
bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration
bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session's information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back: NSEQL back program
o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither a back sequence nor POST synchronization has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
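The rule given for the sequences parameter (a single sequence is used in all iterations, several sequences are used one per iteration) can be written as a short helper. sequenceForIteration is hypothetical, the sequence strings are placeholders, and returning null once the list is exhausted is an assumption.

```javascript
// Hypothetical helper: which browsing sequence serves iteration `i` (0-based)?
function sequenceForIteration(sequences, i) {
  if (sequences.length === 0) { throw new Error("No sequences configured"); }
  if (sequences.length === 1) { return sequences[0]; }   // reused every time
  return i < sequences.length ? sequences[i] : null;     // one per iteration
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<sequence A>", "<sequence B>"];
console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2));
```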
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
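The positional notation used in field expressions can be illustrated with a small stand-alone sketch. This is not ITPilot code: resolveFieldRef is a hypothetical helper, plain JavaScript objects stand in for records, and the “$N.FIELD” form assumes a dot between the record position and the field name.

```javascript
// Hypothetical helper (not part of ITPilot): resolves a "$N.FIELD" reference
// against a list of input records, modelled here as plain objects.
function resolveFieldRef(expression, records) {
  var match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (match === null) {
    throw new Error("Unsupported expression: " + expression);
  }
  var record = records[Number(match[1])];
  if (record === undefined) {
    throw new Error("No input record at position " + match[1]);
  }
  return record[match[2]];
}

// $0 is the first element of the list passed to the constructor, $1 the second.
var recordsObj = [{ PARAM1: "foo", PARAM2: 10 }, { PRICE: 25 }];
var first = resolveFieldRef("$0.PARAM1", recordsObj);
var price = resolveFieldRef("$1.PRICE", recordsObj);
```

The sketch only covers direct field references; the functions of Appendix A of [GENER] that can appear in a real expression are outside its scope.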
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage (optional): allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
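The control flow of Figure 7 can be tried outside ITPilot with a stand-in for the Condition object. In the sketch below, StubCondition is hypothetical and simply wraps a JavaScript predicate; it assumes, as the while clause of the figure suggests, that the loop continues for as long as exec returns true.

```javascript
// Stand-in for the ITPilot Condition object (assumption: exec returns a boolean).
function StubCondition(predicate) {
  this.exec = function (elements) {
    return predicate(elements);
  };
}

var attempts = 0;
// The loop keeps going while fewer than three iterations have run.
var repeat = new StubCondition(function () {
  return attempts < 3;
});

do {
  attempts = attempts + 1; // placeholder for <loop_operations>
} while (repeat.exec([]));
```

Because the condition is evaluated after the body, the loop operations always run at least once, even when the condition is false from the start.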
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage (optional): indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage (optional): the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
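The descriptions of syncWithPost, setBackSequence and setBackPages imply an order in which the page-recovery options are considered. The following sketch spells that order out. It is a hypothetical helper, not ITPilot code: chooseBackStrategy, the shape of its config object and the default of one page back are all assumptions made for illustration.

```javascript
// Hypothetical decision helper (not ITPilot code): mirrors the documented order
// in which the page-recovery options are considered.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-issue the POST request with which the page was originally reached.
    return { strategy: "POST" };
  }
  if (config.backSequence) {
    // An explicit back sequence was set with setBackSequence.
    return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Assumption: a single page back when setBackPages was never called.
  var pages = config.backPages === undefined ? 1 : config.backPages;
  return { strategy: "NSEQL_BACK", pages: pages };
}

var byPost = chooseBackStrategy({ syncWithPost: true, backSequence: "Back();" });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "Back();" });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

In this reading, a back sequence is only consulted when POST synchronization is switched off, and the page count set with setBackPages only matters when neither of the other two options applies.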
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case the page content will be stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
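As a rough illustration of the naming convention, the skeleton below sketches a component called “mycustom” that returns a string. Several things are assumptions: the functions are declared with the function keyword, the type constants are declared locally so that the file runs on its own, the upper-casing logic is invented for the example, and needsOutputStructure is a hypothetical helper that only restates the rule above. mycustom_getInputStructure is omitted because its body depends on the Record Structure API.

```javascript
// Output type codes taken from the list above; declared locally for this sketch.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of a hypothetical component named "mycustom".
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  // Illustrative logic only: upper-case the input value.
  mycustom_output = String(mycustom_input).toUpperCase();
  return mycustom_output;
}

// Declares the component output type.
function mycustom_getOutputType() {
  return STRING_TYPE;
}

// Hypothetical helper: an output structure is only required for records and lists.
function needsOutputStructure(type) {
  return type === RECORD_TYPE || type === LIST_TYPE;
}

var result = mycustom_main("abc");
var structureRequired = needsOutputStructure(mycustom_getOutputType());
```

Since this component returns STRING_TYPE, no mycustom_getOutputStructure function would be needed for it.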
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing a VQL statement of the following form:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is assumed to be optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy Hh mm ss.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.
When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.
NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component must be used, as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all or almost all components and are therefore described in this first section. The catalog points out the components that do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. Where the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, the result is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
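How the four error actions differ can be seen in a small simulation. It is not ITPilot code: runWithPolicy is a hypothetical helper, the constants are declared locally as strings, the sketch retries a single operation rather than a whole wrapper, the delay between retries is left out, and it assumes that ON_ERROR_RETRY raises the error once the retries are exhausted.

```javascript
// Hypothetical simulation (not ITPilot code) of the four error actions.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

function runWithPolicy(operation, errorAction, retries) {
  var canRetry = errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE;
  var attemptsLeft = canRetry ? retries : 0;
  while (true) {
    try {
      return operation();
    } catch (error) {
      if (attemptsLeft > 0) {
        attemptsLeft = attemptsLeft - 1;
        continue; // the delay between retries is omitted in this sketch
      }
      if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // carry on with a null result
      }
      throw error; // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted
    }
  }
}

// An operation that fails twice and then succeeds.
var calls = 0;
function flaky() {
  calls = calls + 1;
  if (calls < 3) {
    throw new Error("temporary failure");
  }
  return "page";
}

var retried = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () {
  throw new Error("always fails");
}, ON_ERROR_RETRY_IGNORE, 2);
```

With three retries allowed, the flaky operation succeeds on its third call, whereas the operation that always fails ends with a null result under ON_ERROR_RETRY_IGNORE instead of stopping the run.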
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace file and 5 means that all message types will be written to it. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string; e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
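The relationship between the positional placeholders in the expression and the elements list passed to exec can be made concrete with a toy evaluator. It is only a sketch: evalCondition is a hypothetical function, it handles nothing beyond a single comparison between two placeholders, and the set of operators was chosen for illustration rather than taken from [GENER].

```javascript
// Hypothetical evaluator (not ITPilot code) for conditions of the form "($A <op> $B)".
function evalCondition(expr, elements) {
  var match = /^\(?\s*\$(\d+)\s*(<=|>=|<|>|==)\s*\$(\d+)\s*\)?$/.exec(expr);
  if (match === null) {
    throw new Error("Unsupported condition: " + expr);
  }
  // $0 refers to elements[0], $1 to elements[1], and so on.
  var left = elements[Number(match[1])];
  var right = elements[Number(match[3])];
  switch (match[2]) {
    case "<=": return left <= right;
    case ">=": return left >= right;
    case "<": return left < right;
    case ">": return left > right;
    default: return left === right;
  }
}

var holds = evalCondition("($0 <= $1)", [3, 7]);
var fails = evalCondition("($0 <= $1)", [9, 7]);
```

Here the same expression yields a different result depending solely on the order and values of the elements supplied at execution time.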
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.
    • nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match a regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or the functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. It is expressed as a character string; e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page by running a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main Extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): sets the internationalization configuration used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url (optional): URL where the resource to be downloaded can be found.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page (optional): the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page (optional): the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further information extraction operations will be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused.
• reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.
o setBackPages(pages): this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
• 0: default browser implementation
• 1: Internet Explorer browser implementation
• 2: Firefox browser implementation
• 3: Denodo HTTP browser implementation
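The descriptions of syncWithPost, setBackSequence and setBackPages can be read as a three-step fallback for returning to the source page. The sketch below only illustrates that decision order in plain JavaScript. The function chooseBackStrategy, its option names and the default of one page are hypothetical and are not part of the ITPilot API.

```javascript
// Hypothetical helper: it mirrors the documented fallback order only,
// it does not call or reproduce any ITPilot function.
function chooseBackStrategy(options) {
  if (options.syncWithPost) {
    // Default method: re-send the POST that originally produced the page.
    return { kind: "POST" };
  }
  if (options.backSequence) {
    // An explicit NSEQL back sequence was provided (setBackSequence).
    return { kind: "BACK_SEQUENCE", program: options.backSequence };
  }
  // Otherwise fall back to the NSEQL Back() command, going back the number
  // of pages given to setBackPages (a default of 1 is assumed here).
  return { kind: "BACK_COMMAND", pages: options.backPages || 1 };
}

// Example: POST synchronization disabled, no back sequence, two pages back.
const strategy = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(strategy.kind, strategy.pages);
```

The same order applies wherever these three functions appear together in other components.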
5.3.13 Filter
• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
o Constructor(expr, auxiliaryRecords)
• expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
• auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
• inputRecords: list of input records.
• auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
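The behaviour described in the note can be pictured with a small stand-in written in plain JavaScript. This is only a sketch of the semantics, not the ITPilot Filter object: filterRecords, the predicate function and the sample records are hypothetical, and the condition is expressed as a JavaScript function rather than as an ITPilot expression.

```javascript
// Stand-in for the documented behaviour, not the ITPilot Filter component.
function filterRecords(inputRecords, predicate, onError) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) kept.push(record);
    } catch (err) {
      // With ON_ERROR_IGNORE only the offending record is dropped;
      // any other setting propagates the error to the caller.
      if (onError !== "ON_ERROR_IGNORE") throw err;
    }
  }
  return kept;
}

// Hypothetical condition: fails on records without a price.
function priceAtMost20(record) {
  if (record.PRICE === null) throw new Error("PRICE has no value");
  return record.PRICE <= 20;
}

const records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
const cheap = filterRecords(records, priceAtMost20, "ON_ERROR_IGNORE");
console.log(cheap.length);
```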
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the included fields are used in each run.
• Functions:
o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
• submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
• sequenceType: type of pool to use. The possible values are:
• SEQUENCE_IEBROWSER
• SEQUENCE_HTTP_BROWSER
• SEQUENCE_FTP
• SEQUENCE_LOCAL
• reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
• baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
• inputPage: input page from which the selected form can be iteratively invoked.
• parallelIterator: “true” if the component is to execute its iterations in parallel.
o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
• field: name of the multiple selection field.
• position: position of the field among those with the same name, starting with position 0.
• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. There are JavaScript constants defined for this:
• CLICKED_ELEMENT: mark the element.
• NON_CLICKED_ELEMENT: leave the element unmarked.
• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected from a multiple selection field for the chosen form.
• field: name of the multiple selection field.
• position: position of the field among those with the same name, starting with position 0.
• valuesArray: list of values that must be selected in the field.
• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
• equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. There are JavaScript constants defined for this:
• CLICKED_ELEMENT: mark the element.
• NON_CLICKED_ELEMENT: leave the element unmarked.
• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o selectPositions(field, position, positions): this indicates the values selected from a selection field for the chosen form.
• field: name of the HTML selection field.
• position: position occupied in the event of more than one field element with the same name.
• positions: values of the elements on which the component must iterate.
o selectTexts(field, position, values, positions, equal): this indicates the values to be used in the different iterations on a text field.
• field: name of the HTML text field.
• position: position of the field in the event of several on the form with the same name.
• values: list of values that must be selected in the field.
• positions: list that indicates the position held for each value element in the event of replicated values.
• equals: Boolean value that indicates whether the field values must exactly match those provided to the function or only need to contain them.
o click(field, value, state): function that selects an element and runs a “click” event on it.
• field: name of the HTML field on which the click is to be made.
• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
• CLICKED_ELEMENT: mark the element.
• NON_CLICKED_ELEMENT: leave the element unmarked.
• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
o input(field, position, values): function that indicates the values added to an input field.
• field: name of the HTML input field.
• position: position of the field in the event of several on the form with the same name.
• values: list of values that must be selected in the field.
o textarea(field, position, values): this indicates the values added to a text area.
• field: name of the HTML input field.
• position: position of the field in the event of several on the form with the same name.
• values: list of values that must be selected in the field.
o toList(): returns the list with the NSEQL sequences used in each iteration.
o setMaxIterations(count): sets the maximum number of iterations that can be executed.
• count: number that determines the maximum number of iterations.
o setRetries(count): sets the number of retries in the event of failures.
• count: number of retries.
o setRetryDelay(mseconds): sets the waiting time between retries.
• mseconds: waiting time between retries, in milliseconds.
o setParallelIterator(flag): makes the component launch the iterations in parallel.
• flag: “true” if the iterations are to be executed in parallel.
o next(inputPage): this returns the page resulting from running a component iteration.
• inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
o hasNext(): function that determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
o close(): function that closes the iterator.
o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
• flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.
o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
• back: NSEQL back program.
o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
• reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
• 0: default browser implementation
• 1: Internet Explorer browser implementation
• 2: Firefox browser implementation
• 3: Denodo HTTP browser implementation
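The idea of running the form once per set of predetermined field values, consumed through hasNext() and next(), can be sketched in plain JavaScript. The sketch assumes one iteration per combination of the values given for each field. The source does not spell out the exact combination rule or ordering, so expandCombinations, makeFormIterator and the field names are hypothetical stand-ins, not the Form_Iterator implementation.

```javascript
// Hypothetical sketch: one iteration per combination of field values.
function expandCombinations(fields) {
  // fields: [{ name: string, values: array }, ...]
  let combos = [{}];
  for (const field of fields) {
    const extended = [];
    for (const combo of combos) {
      for (const value of field.values) {
        extended.push({ ...combo, [field.name]: value });
      }
    }
    combos = extended;
  }
  return combos;
}

function makeFormIterator(fields, maxIterations) {
  // maxIterations plays the role of setMaxIterations(count).
  const combos = expandCombinations(fields).slice(0, maxIterations);
  let index = 0;
  return {
    hasNext: function () { return index < combos.length; },
    next: function () { return combos[index++]; },
  };
}

const formIterator = makeFormIterator(
  [
    { name: "category", values: ["books", "music"] },
    // A checkbox set to CLICKED_AND_NON_CLICKED_ELEMENT yields both states.
    { name: "inStock", values: [true, false] },
  ],
  3
);

const seen = [];
while (formIterator.hasNext()) {
  seen.push(formIterator.next());
}
console.log(seen.length);
```

In a real wrapper each iteration would return a page rather than a plain object. The sketch only shows how the set of runs grows with the values declared per field and how a maximum cuts it short.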
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
• browserUuid: browser id.
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
• pageType: type of browser used to access the page.
• SEQUENCE_IEBROWSER = 1
• SEQUENCE_HTTP_BROWSER = 2
• lastURL: last URL where the page is coming from.
• lastURLMethod: access method (GET, POST) of the URL the page is coming from.
• lastURLPostParameters: POST-method parameters of the URL the page is coming from.
• cookie: information storage “cookies”.
• proxyUser: user name to access the proxy, if required.
• proxyPassword: user password to access the proxy, if required.
• proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: this is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.
• Functions:
o Constructor(input, output)
• input: input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
• output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).
o get(name): this returns the value of a field of the record created as the group of initialization parameters.
• name: name of the record field.
o setText(field, obl, fixedValue): this creates a text-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setInt(field, obl, fixedValue): this creates an integer-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setLong(field, obl, fixedValue): this creates a long-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setFloat(field, obl, fixedValue): this creates a float-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setDouble(field, obl, fixedValue): this creates a double-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBlob(field, obl, fixedValue): this creates a BLOB-type (binary large object) field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBoolean(field, obl, fixedValue): this creates a Boolean-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setLink(field, obl, fixedValue): this creates a URL-type field in the initialization record.
• field: name of the field to create.
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setDate(field, format, obl, fixedValue): this creates a date-type field in the initialization record.
• field: name of the field to create.
• format: representation format of the date field. This format is optional, but becomes compulsory if the field is completed; otherwise, the wrapper may not run. This representation format is defined in [DATEFORMAT].
• obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
• fixedValue: optional parameter that indicates a constant value assigned to the field.
o setName(name): function that updates the component name.
• name: new component name.
o setI18n(i18n): function that updates the internationalization (i18n) of the process.
• i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
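The three values of the obl parameter can be illustrated with a small stand-in that builds an initialization record from a query. This is a sketch of the semantics described above, written in plain JavaScript. resolveParameters, the field descriptors and the sample field names are hypothetical, and the handling of a query value supplied for a FIXED field (it is ignored here) is an assumption.

```javascript
// Stand-in constants; the names follow the text, the values are arbitrary.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Hypothetical checker: builds the initialization record for one query.
function resolveParameters(fields, query) {
  const record = {};
  for (const f of fields) {
    if (f.obl === FIXED) {
      // Constant value; a value in the query is ignored (assumption).
      record[f.field] = f.fixedValue;
    } else if (query[f.field] !== undefined) {
      record[f.field] = query[f.field];
    } else if (f.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + f.field);
    } else {
      // OPTIONAL and not supplied: the field stays empty.
      record[f.field] = null;
    }
  }
  return record;
}

const fields = [
  { field: "KEYWORD", obl: OBLIGATORY },
  { field: "MAX_PRICE", obl: OPTIONAL },
  { field: "COUNTRY", obl: FIXED, fixedValue: "ES" },
];
const initRecord = resolveParameters(fields, { KEYWORD: "laptop" });
console.log(initRecord);
```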
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates on a list of records, one by one.
• Functions:
o Constructor(list)
• list: list of records on which to iterate.
o hasNext(): this determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
o next(): this returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29.
    var _thread0 = new Thread();
    while(iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
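For comparison, the sequential hasNext()/next() pattern can be tried outside ITPilot with a stand-in object that exposes the same two methods. ListIterator and the sample records are hypothetical and serve only to make the loop runnable; they are not the ITPilot Iterator component.

```javascript
// Stand-in with the same two methods as the documented Iterator object.
function ListIterator(list) {
  let position = 0;
  this.hasNext = function () { return position < list.length; };
  this.next = function () { return list[position++]; };
}

const iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }]);
const titles = [];
while (iterator.hasNext()) {
  // Records are returned in list order, one per call to next().
  const recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(","));
```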
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
• uuid: unique identifier of the component.
• uri: connection URL to the database.
• driver: driver class to use to connect to the data source.
• userName: user name.
• password: user password.
• structure: structure of the component’s output record list. It is defined as a record of values.
• baseRecords: record list to be used.
• maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.
• initialPoolSize: initial number of browser pool connections. This number of idle connections is established, ready to be used.
• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
• query: SQL query that returns the results required by the component.
o exec(query, baseRecords): executes the JDBCExtractor component.
• query: SQL query that returns the results required by the component.
• baseRecords: record list to be used.
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
• maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.
• initialPoolSize: initial number of browser pool connections. This number of idle connections is established, ready to be used.
• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
o disablePool(): disables the connection pool.
o addDriverProperty(propname, propvalue): adds a JDBC driver property.
• propname: property name.
• propvalue: property value.
5.3.19 Loop
• Description: this allows loops to be made in the flow. The loop will be repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while(loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this allows for iteration through different inter-related pages, using one or several browsing sequences.
• Functions:
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
• sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
• iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.
• sequenceType: type of pool to use. The possible values are:
• SEQUENCE_IEBROWSER
• SEQUENCE_HTTP_BROWSER
• SEQUENCE_FTP
• SEQUENCE_LOCAL
• reuse: Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information.
• inputPage: this indicates the page from which the next browsing sequence is to be made.
o next(inputRecords, inputPage): this returns the next iteration element.
• inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
• inputPage: this indicates the page from which the next pages are to be accessed.
o close(): this closes the iterator.
o setRetries(count): this configures the number of retries in the event of an error in accessing the next page.
• count: number of retries.
o setRetryDelay(count): this configures the interval between two retries.
• count: interval in milliseconds.
o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
• flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.
o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
• back: NSEQL back program.
o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
• reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
• 0: default browser implementation
• 1: Internet Explorer browser implementation
• 2: Firefox browser implementation
• 3: HTTP browser implementation
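The combined effect of setRetries and setRetryDelay can be sketched as a generic retry loop. This is an illustration only. runWithRetries is hypothetical, the delay function is injected so that the sketch stays synchronous, and it assumes that the retry count does not include the first attempt, which the source does not state.

```javascript
// Hypothetical retry wrapper, not part of the ITPilot API.
// `sleep` is passed in so the example can run without real waiting.
function runWithRetries(action, retries, retryDelayMs, sleep) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (err) {
      lastError = err;
      // Wait between attempts, but not after the last one.
      if (attempt < retries) sleep(retryDelayMs);
    }
  }
  throw lastError;
}

// Simulated page access that fails twice before succeeding.
const waits = [];
const result = runWithRetries(
  function (attempt) {
    if (attempt < 2) throw new Error("timeout");
    return "page";
  },
  3,   // plays the role of setRetries(3)
  500, // plays the role of setRetryDelay(500)
  function (ms) { waits.push(ms); }
);
console.log(result, waits.length);
```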
5.3.21 Output
• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
o Constructor(structure)
• structure: parameter that indicates the component input record to be used as the wrapper result.
o add(record): this allows the component input record that is to be used as the wrapper result to be added subsequently.
• record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.
• Functions:
o Constructor(recordsObj, name)
• recordsObj: list of input elements. Each element of the list can be a record or a list of records.
• name: name of the output record of the Record Constructor component.
o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
• fieldName: name of the field.
• expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
• errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
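How a field expression refers to the input records, and how errorAction changes the outcome, can be sketched in plain JavaScript. The sketch assumes references of the form $<index>.<FIELD> (for example $0.PARAM1) and handles nothing else. The functions of Appendix A of [GENER] and inputs that are lists of records are not covered. resolveReference, buildRecord and the sample data are hypothetical and are not the Record_Constructor implementation.

```javascript
// Hypothetical resolver for references written as "$<index>.<FIELD>".
function resolveReference(expression, recordsObj) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) throw new Error("Unsupported expression: " + expression);
  const record = recordsObj[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot evaluate " + expression);
  }
  return record[match[2]];
}

// Builds the output record field by field, as add() + exec() would.
function buildRecord(recordsObj, fieldDefs) {
  const output = {};
  for (const def of fieldDefs) {
    try {
      output[def.fieldName] = resolveReference(def.expression, recordsObj);
    } catch (err) {
      // ON_ERROR_IGNORE skips the field; anything else stops the run.
      if (def.errorAction !== "ON_ERROR_IGNORE") throw err;
    }
  }
  return output;
}

const built = buildRecord(
  [{ PARAM1: "Madrid" }, { TOTAL: 3 }],
  [
    { fieldName: "CITY", expression: "$0.PARAM1", errorAction: "ON_ERROR_RAISE" },
    { fieldName: "COUNT", expression: "$1.TOTAL", errorAction: "ON_ERROR_RAISE" },
    { fieldName: "EXTRA", expression: "$2.MISSING", errorAction: "ON_ERROR_IGNORE" },
  ]
);
console.log(built);
```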
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
• sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
• sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
• sequenceType: type of pool to use. The possible values are:
• SEQUENCE_IEBROWSER
• SEQUENCE_HTTP_BROWSER
• SEQUENCE_FTP
• SEQUENCE_LOCAL
• reuse: Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
• inputPage: optional; this allows a starting page to be indicated.
o exec(): this returns a page object that represents the target page of the browsing sequences.
o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
o Constructor(page)
• page: page loaded on the browser that is going to be released.
o Constructor(browserUuid)
• browserUuid: browser identifier.
o exec(): executes the component.
5.3.25 Repeat
• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var repeat = null repeat = new Condition(ltoutput_conditiongt) repeatonError(RUNTIME_ERROR ON_ERROR_RAISE) do ltloop_operationsgt hellip while(repeatexec([]))
Figure 7 Using the Repeat function
5.3.26 Script
• Description: Allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: Creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage (optional): indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage (optional): the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
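The way syncWithPost, setBackSequence and setBackPages interact can be easier to follow as a small decision function. The following is only a model of the rules described above, written in plain JavaScript so that it runs outside ITPilot. The function name chooseBackStrategy, the config object and the returned labels are hypothetical and are not part of the ITPilot API.

```javascript
// Decision model only: NOT the ITPilot implementation.
// config.syncWithPost : value passed to syncWithPost(flag)
// config.backSequence : value passed to setBackSequence(back), or null
// config.backPages    : value passed to setBackPages(pages)
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // The page state is recovered by re-issuing the original POST.
    return { strategy: "POST" };
  }
  if (config.backSequence) {
    // An explicit back sequence was configured: it is used.
    return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Nothing else is configured: fall back to the NSEQL Back() command.
  return { strategy: "NSEQL_BACK", pages: config.backPages };
}

var withPost = chooseBackStrategy({
  syncWithPost: true, backSequence: "<NSEQL back program>", backPages: 1
});
var withBackSequence = chooseBackStrategy({
  syncWithPost: false, backSequence: "<NSEQL back program>", backPages: 1
});
var withBackCommand = chooseBackStrategy({
  syncWithPost: false, backSequence: null, backPages: 2
});
```

In this model the POST synchronization takes precedence whenever the flag is “true”, which matches the order in which the text describes the checks.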
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): makes the thread wait until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
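As an illustration of the naming scheme above, the following is a minimal sketch of a custom component file. The component name “upper”, the file name upper.js and the assumption that the input arrives as a plain string are hypothetical and chosen only for the example. STRING_TYPE is declared locally, with the value listed above, so that the sketch also runs outside ITPilot. The input-structure function is left out because its body depends on the Record Structure functions described in section 5.3.2.1, and no output-structure function is needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Hypothetical file: upper.js (custom component named "upper").
// Declared here only so the sketch runs outside ITPilot; the value
// comes from the list of output types above.
var STRING_TYPE = 13;

// Main function of the component: <name>_main(<name>_input).
// Assumption: the input is a plain string (or null).
function upper_main(upper_input) {
  var upper_output = null;
  if (upper_input !== null && upper_input !== undefined) {
    upper_output = String(upper_input).toUpperCase();
  }
  return upper_output;
}

// Output type of the component: a string.
function upper_getOutputType() {
  return STRING_TYPE;
}
```

A component that returned records or lists would additionally have to provide upper_getOutputStructure().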
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing a VQL statement as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that the character string must match. By default, if no constraint exists, its value is “”.
    • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string that represents it.
When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.
NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR explained in section 5.3.22 must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.
5.3.2.2 Record List
• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.
5.3.3 Common functions
Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when an error of a given type occurs. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” when there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return “null”, but it is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error is still happening after the retries.
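The difference between the four error actions can be sketched with a small model in plain JavaScript. This is not the ITPilot implementation: the function runWithPolicy, the string values given to the constants and the flaky example step are hypothetical, the delay between retries is left out, and the model assumes that ON_ERROR_RETRY raises the error once the retries are exhausted.

```javascript
// Illustrative model only. In ITPilot these constants are predefined;
// the string values used here are stand-ins.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs step() under the given error action. `retries` is the number of
// extra attempts made after the first failure (an assumption of the model).
function runWithPolicy(step, errorAction, retries) {
  var retrying = errorAction === ON_ERROR_RETRY ||
                 errorAction === ON_ERROR_RETRY_IGNORE;
  var attemptsLeft = retrying ? retries : 0;
  while (true) {
    try {
      return step();
    } catch (e) {
      if (attemptsLeft > 0) {
        attemptsLeft--;
        continue; // try the step again
      }
      if (errorAction === ON_ERROR_IGNORE ||
          errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // execution continues; the component yields null
      }
      throw e; // ON_ERROR_RAISE, or ON_ERROR_RETRY with no retries left
    }
  }
}

// Hypothetical step that fails twice before succeeding.
var calls = 0;
function flaky() {
  calls++;
  if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
  return "page";
}
var result = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
```

With ON_ERROR_IGNORE or ON_ERROR_RETRY_IGNORE the caller has to be prepared for a “null” result, which is why the text above singles out the components that behave differently.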
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition described in the constructor against the input parameter elements and returns “true” or “false” depending on whether it is met.
    • elements: list, in the format “[ELEMENT1, ELEMENT2, … ELEMENTN]”, of the elements on which the condition is evaluated.
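The positional placeholders ($0, $1, …) refer to the elements passed to exec by index. The following plain-JavaScript sketch models only that substitution for simple comparison expressions. The function evalPositional is hypothetical, it evaluates the expression as JavaScript rather than with the ITPilot functions of Appendix A of [GENER], and it is not how ITPilot evaluates conditions internally.

```javascript
// Stand-in evaluator (NOT the ITPilot expression engine): each $n
// placeholder is rewritten as a reference to elements[n], and the
// resulting text is evaluated as a JavaScript expression.
function evalPositional(expr, elements) {
  var body = expr.replace(/\$(\d+)/g, function (match, index) {
    return "elements[" + index + "]";
  });
  var evaluate = new Function("elements", "return (" + body + ");");
  return Boolean(evaluate(elements));
}

// $0 is the first element of the list and $1 the second.
var conditionMet = evalPositional("($0 <= $1)", [3, 10]);
var conditionNotMet = evalPositional("($0 <= $1)", [12, 10]);
```

Here conditionMet is true because 3 is not greater than 10, and conditionNotMet is false because 12 is greater than 10.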
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as separator between HTML page elements when the result page is generated, so that each of them can be identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of the pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are taken into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
    • mergedDeletions: with “true”, the deleted content will be shown. With “false”, the configuration set by setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. The tokens that match a regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
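The effect of setCaseInsensitive and addIgnoredToken on the comparison can be pictured as a clean-up pass over the tokens of each page before they are compared. The sketch below is a plain-JavaScript model of that idea, not the ITPilot implementation. The functions prepareTokens and sameTokens, the way pages are split into tokens, the order of the two steps and the sample tokens are all assumptions made for the example.

```javascript
// Model of the token clean-up step (NOT the ITPilot implementation).
// tokens          : array of HTML tokens of one page
// ignored         : array of RegExp, as registered with addIgnoredToken
// caseInsensitive : value passed to setCaseInsensitive
function prepareTokens(tokens, ignored, caseInsensitive) {
  var result = [];
  for (var i = 0; i < tokens.length; i++) {
    var token = caseInsensitive ? tokens[i].toLowerCase() : tokens[i];
    var discard = false;
    for (var j = 0; j < ignored.length; j++) {
      if (ignored[j].test(token)) { discard = true; break; }
    }
    if (!discard) { result.push(token); }
  }
  return result;
}

// Two token lists are considered identical if they match one by one.
function sameTokens(a, b) {
  return a.join("\u0000") === b.join("\u0000");
}

// Hypothetical pages that differ only in case and in an HTML comment.
var ignoredTokens = [/^<!--.*-->$/];
var baseTokens = prepareTokens(
  ["<B>", "Price", "<!-- ts=1 -->", "</B>"], ignoredTokens, true);
var targetTokens = prepareTokens(
  ["<b>", "PRICE", "<!-- ts=2 -->", "</b>"], ignoredTokens, true);
var identical = sameTokens(baseTokens, targetTokens);
```

In this model the two pages compare as identical once the comment tokens are discarded and case is ignored, which is the kind of noise these two functions are meant to remove.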
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. It is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method. It runs the specification indicated in the constructor and returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter, “true” if the pattern merge technique is to be applied or “false” if not. It is “true” by default.
  o setI18n(i18n): sets the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url (optional): URL where the resource to be downloaded can be found.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page (optional): the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page (optional): the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it has not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which more information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if “true”, the connection coming from previous components is reused; if “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: expression of the filtering operation for the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
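The behavior described in the note can be modeled in plain JavaScript as a filter that drops only the record whose evaluation fails. The function filterIgnoringErrors, the PRICE field and the sample records are hypothetical, and the sketch is a model of the documented outcome rather than ITPilot code.

```javascript
// Model of FILTER under ON_ERROR_IGNORE (NOT the ITPilot implementation):
// a record whose evaluation throws is left out of the result, and the
// remaining records are still filtered normally.
function filterIgnoringErrors(records, predicate) {
  var kept = [];
  for (var i = 0; i < records.length; i++) {
    try {
      if (predicate(records[i])) { kept.push(records[i]); }
    } catch (e) {
      // The error is ignored: only the offending record is skipped.
    }
  }
  return kept;
}

// Hypothetical records; the second one makes the condition fail.
var inputRecords = [
  { PRICE: "10" }, { PRICE: null }, { PRICE: "25" }, { PRICE: "3" }
];
function priceAtLeastTen(record) {
  // Throws a TypeError when PRICE is null.
  return parseInt(record.PRICE.trim(), 10) >= 10;
}
var filtered = filterIgnoringErrors(inputRecords, priceAtLeastTen);
```

The result contains the records with prices 10 and 25: the record with price 3 is filtered out by the condition, and the record with a null price is dropped because evaluating it raised an error.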
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop for a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form
bull field name of the multiple selection field
bull position: index of the field among the fields with the same name, starting at position 0.
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form
bull field name of the HTML selection field
bull position position occupied in the event of more than one field element with the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position: position of the field when several fields on the form have the same name.
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equal: boolean value that indicates whether the field values must exactly match the values provided to the function (true) or only need to contain them (false).
o click(field, value, state): selects an element and runs a "click" event on it.
bull field: name of the HTML field on which the click is to be made.
bull value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
bull state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field position values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o textarea(field position values) this indicates the values added to a text area
bull field: name of the HTML text area field.
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag) the component launches the iteration in parallel
bull flag: if "true", the iterations are executed in parallel.
o next(inputPage) this returns the page resulting from running a component iteration
bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run
o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
o close() function that closes the iterator
o syncWithPost(flag): indicates whether, to restore the state of the page, a POST request must be issued to the page URL with the POST parameters with which it was originally reached. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
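The effect of the clickedArray constants described above can be pictured with a small standalone sketch. It does not use the ITPilot runtime: the constant values, the helper names statesFor and expandClickedArray, and the assumption that the states of several elements are combined as a full cross product are all hypothetical and only serve to illustrate why CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations.

```javascript
// Hypothetical stand-ins for the ITPilot constants (real values are not given in this guide).
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Map one clickedArray entry to the marked/unmarked states it allows.
function statesFor(clicked) {
  if (clicked === CLICKED_ELEMENT) return [true];
  if (clicked === NON_CLICKED_ELEMENT) return [false];
  if (clicked === CLICKED_AND_NON_CLICKED_ELEMENT) return [true, false];
  throw new Error("Unknown click state: " + clicked);
}

// Expand a clickedArray into every combination of states
// (assumption: combinations multiply across elements).
function expandClickedArray(clickedArray) {
  let combos = [[]];
  for (const clicked of clickedArray) {
    const next = [];
    for (const combo of combos) {
      for (const state of statesFor(clicked)) {
        next.push(combo.concat([state]));
      }
    }
    combos = next;
  }
  return combos;
}

// One element always marked, one element tried both ways: two combinations.
console.log(expandClickedArray([CLICKED_ELEMENT, CLICKED_AND_NON_CLICKED_ELEMENT]));
```

Under these assumptions, each additional element flagged with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of iterations, which is one reason to bound them with setMaxIterations.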
5.3.15 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool from a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie: stored cookie information.
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object Init
bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application
bull Functions
o Constructor(input output)
bull input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
bull output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
o get(name) this returns the value of a record field created as a group of initialization parameters
bull name name of the record field
o setText(field obl fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field obl fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field obl fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field obl fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field obl fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field format obl fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format: representation format of the date field. Specifying a format is optional, but once one is specified it is compulsory to follow it; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
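The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be summarized with a standalone sketch. It is not the ITPilot Init object: the constant values, the function name buildInitRecord, the shape of the field definitions, and the choices of ignoring caller values for FIXED fields and of filling unset optional fields with null are assumptions made for illustration only.

```javascript
// Hypothetical stand-ins for the ITPilot constants.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// fieldDefs: [{ name, obl, fixedValue }] (hypothetical shape).
// queryParams: plain object holding the values sent by the calling application.
function buildInitRecord(fieldDefs, queryParams) {
  const record = {};
  for (const def of fieldDefs) {
    const obl = def.obl === undefined ? OPTIONAL : def.obl; // OPTIONAL is the default
    if (obl === FIXED) {
      // Constant value; assumption: any value sent by the caller is ignored.
      record[def.name] = def.fixedValue;
    } else if (Object.prototype.hasOwnProperty.call(queryParams, def.name)) {
      record[def.name] = queryParams[def.name];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + def.name);
    } else {
      record[def.name] = null; // assumption: unset optional fields are left empty
    }
  }
  return record;
}

const initRecord = buildInitRecord(
  [
    { name: "TITLE", obl: OBLIGATORY },
    { name: "LANGUAGE", obl: FIXED, fixedValue: "en" },
    { name: "YEAR" }
  ],
  { TITLE: "Data integration" }
);
console.log(initRecord);
```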
5.3.17 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
o next() this returns the next iteration element The list is a sorted sequence of records
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
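The sequential (non-parallel) use of hasNext() and next() follows the usual iterator pattern. The sketch below runs outside ITPilot: ListIterator is a hypothetical stand-in that mimics the two functions described above over a plain JavaScript array, and the records and the TITLE field are made up for the example.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object.
function ListIterator(list) {
  this.list = list;
  this.index = 0;
}

// True while at least one more record is available.
ListIterator.prototype.hasNext = function () {
  return this.index < this.list.length;
};

// Return the next record, in list order.
ListIterator.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error("No more records");
  }
  return this.list[this.index++];
};

// Process a list of records one by one.
const iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }]);
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  console.log(recordInstance.TITLE);
}
```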
5.3.18 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure: structure of the component's output record list. It is defined as a record of values.
bull baseRecords record list to be used
bull maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
bull initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
bull query SQL query that returns the results required by the component
o exec(query baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration
bull maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
bull initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5.3.19 Loop
bull Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}
Figure 6 Using the Loop function
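The control flow of Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. In this sketch, makeCondition is a hypothetical helper whose exec() simply evaluates a JavaScript predicate; it does not reproduce the real Condition expression language, and visitPages and its page counter are invented for the example.

```javascript
// Hypothetical stand-in for Condition: exec() evaluates a plain predicate.
function makeCondition(predicate) {
  return {
    exec: function (inputs) {
      return predicate(inputs);
    }
  };
}

// WHILE ... DO: the condition is checked before every pass,
// so the body may run zero times.
function visitPages(lastPage) {
  let page = 1;
  const visited = [];
  const loop = makeCondition(function () {
    return page <= lastPage;
  });
  while (loop.exec([])) {
    visited.push(page); // loop operations
    page = page + 1;
  }
  return visited;
}

console.log(visitPages(3));
console.log(visitPages(0));
```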
5.3.20 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description: allows iterating over several related pages, using either a single browsing sequence or several different ones.
bull Functions
o Constructor(sequences iterations sequenceType reuse inputPage)
bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration
bull iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag): indicates whether, to restore the state of the page, a POST request must be issued to the page URL with the POST parameters with which it was originally reached. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
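The rule given for the sequences parameter (a single sequence is reused in every iteration, several sequences are used one per iteration) can be written down as a small selection function. This is only a reading of the description above, not ITPilot code: sequenceForIteration is a hypothetical helper, the sequence strings are placeholders rather than real NSEQL programs, and returning null once the list is exhausted is an assumption.

```javascript
// Pick the browsing sequence for a zero-based iteration index.
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 0) {
    return null; // nothing to run
  }
  if (sequences.length === 1) {
    return sequences[0]; // a single sequence is reused in every iteration
  }
  // Several sequences: one per iteration (assumption: null when none is left).
  return iteration < sequences.length ? sequences[iteration] : null;
}

// Placeholder names, not NSEQL.
const single = ["NEXT_PAGE_SEQUENCE"];
const several = ["TO_PAGE_2", "TO_PAGE_3"];
console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2));
```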
5.3.21 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record): adds the component input record that is to be used as the wrapper result.
bull record record to use
5.3.22 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName expression errorAction) method for adding a new field to the record under construction
bull fieldName: name of the field.
bull expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
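How a field expression that points at an input record is resolved, and how the two error actions differ, can be sketched in plain JavaScript. The sketch assumes the reference is written as "$<index>.<FIELD>" (record index, dot, field name) and covers only that simple form, not the functions of Appendix A of [GENER]. The names resolveReference and buildRecord, the constant values, and the choice of skipping a failing field under ON_ERROR_IGNORE are hypothetical.

```javascript
// Hypothetical stand-ins for the ITPilot constants.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolve a reference of the assumed form "$<index>.<FIELD>" against the input records.
function resolveReference(expression, recordsObj) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) {
    throw new Error("Unsupported expression: " + expression);
  }
  const record = recordsObj[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot resolve " + expression);
  }
  return record[match[2]];
}

// fields: list of [fieldName, expression, errorAction] triples.
function buildRecord(recordsObj, fields) {
  const output = {};
  for (const [fieldName, expression, errorAction] of fields) {
    try {
      output[fieldName] = resolveReference(expression, recordsObj);
    } catch (err) {
      if (errorAction === ON_ERROR_RAISE) {
        throw err; // stop and report the source of the error
      }
      // ON_ERROR_IGNORE: assumption, the failing field is skipped and processing continues.
    }
  }
  return output;
}

const inputs = [{ PARAM1: "laptop" }, { PRICE: 10 }];
console.log(buildRecord(inputs, [
  ["NAME", "$0.PARAM1", ON_ERROR_RAISE],
  ["COST", "$1.PRICE", ON_ERROR_RAISE],
  ["MISSING", "$2.ANY", ON_ERROR_IGNORE]
]));
```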
5.3.23 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.
bull Functions
o Constructor(sequences sequenceDepends sequenceType reuse inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases this may not be a good option if the previous iterator is run in parallel.
bull inputPage: optional; indicates the starting page.
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5.3.24 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5.3.25 Repeat
bull Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript as a do ... while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    ...
} while (repeat.exec([]));
Figure 7 Using the Repeat function
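The practical difference from a while loop is that the body of a do ... while loop always runs at least once, because the condition is evaluated after each pass. The sketch below shows this outside ITPilot; makeCondition is a hypothetical stand-in for the Condition object whose exec() evaluates a plain predicate, and collectPages is invented for the example. As in Figure 7, the loop continues while exec() returns true.

```javascript
// Hypothetical stand-in for Condition: exec() evaluates a plain predicate.
function makeCondition(predicate) {
  return {
    exec: function (inputs) {
      return predicate(inputs);
    }
  };
}

// The condition is checked after every pass, so the body runs at least once.
function collectPages(lastPage) {
  let page = 0;
  const visited = [];
  const repeat = makeCondition(function () {
    return page < lastPage;
  });
  do {
    page = page + 1; // loop operations
    visited.push(page);
  } while (repeat.exec([]));
  return visited;
}

console.log(collectPages(3));
console.log(collectPages(0)); // one pass even though the condition never holds
```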
5.3.26 Script
bull Description: this component allows part of the logic of an ITPilot wrapper to be written directly in JavaScript. It has no specific JavaScript function associated with it: when it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence: NSEQL browsing program (see [NSEQL]).
bull sequenceType: type of pool to use. The possible values are:
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
bull inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag): indicates whether, to restore the state of the page, a POST request must be issued to the page URL with the POST parameters with which it was originally reached. This is the default synchronization method.
bull flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
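The way syncWithPost, setBackSequence and setBackPages interact when the component needs to return to a previous page can be summarized as a decision function. The sketch is a reading of the descriptions above and not ITPilot code: chooseBackStrategy, the shape of the config object, the result labels and the default of one page for the Back() command are assumptions.

```javascript
// config: { syncWithPost, backSequence, backPages } (hypothetical shape).
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-issue the POST request with the parameters the page was reached with.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // An explicit back sequence was defined with setBackSequence.
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Neither POST synchronization nor a back sequence: fall back to NSEQL Back().
  const pages = config.backPages === undefined ? 1 : config.backPages; // assumed default
  return { kind: "NSEQL_BACK", pages: pages };
}

console.log(chooseBackStrategy({ syncWithPost: true }));
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "MY_BACK_SEQUENCE" }));
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }));
```

"MY_BACK_SEQUENCE" is a placeholder string, not an NSEQL program.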
5.3.28 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
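The combined effect of setRetries and setRetryDelay, which several components in this chapter share, can be pictured as a retry loop. The sketch below is generic JavaScript rather than ITPilot code. It assumes that the retry count is the number of additional attempts after the first failure, and it takes the waiting function as a parameter (sleep) so that the example stays synchronous; runWithRetries and all names in the usage example are hypothetical.

```javascript
// Run operation(), retrying up to `retries` more times after a failure.
// `sleep` is injected by the caller and receives the delay in milliseconds.
function runWithRetries(operation, retries, retryDelayMs, sleep) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        sleep(retryDelayMs); // wait before the next attempt
      }
    }
  }
  throw lastError; // every attempt failed
}

// Usage: an operation that fails twice before succeeding.
let calls = 0;
const waits = [];
const outcome = runWithRetries(
  function () {
    calls = calls + 1;
    if (calls < 3) {
      throw new Error("temporary failure");
    }
    return "stored";
  },
  3,
  500,
  function (ms) {
    waits.push(ms); // record the delay instead of actually waiting
  }
);
console.log(outcome, calls, waits);
```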
5.3.29 Thread
bull Object Thread
bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished
o execute(functionName, <list of arguments>): launches the execution thread on the given function.
bull functionName: name of the JavaScript function to be run.
bull <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
bull int maximum number
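The queueing behaviour described for setMaxConcurrentThreads can be simulated in plain JavaScript. The sketch is not the ITPilot Thread object: BoundedRunner is a hypothetical stand-in in which each task receives a done callback and signals its own completion, which makes the order of events easy to follow in a single-threaded environment.

```javascript
// Hypothetical stand-in: runs at most maxConcurrent tasks at a time.
function BoundedRunner(maxConcurrent) {
  this.maxConcurrent = maxConcurrent;
  this.running = 0;
  this.queue = [];
}

// Start the task now if a slot is free; otherwise queue it.
BoundedRunner.prototype.execute = function (task) {
  if (this.running < this.maxConcurrent) {
    this._start(task);
  } else {
    this.queue.push(task); // later requests wait for a free slot
  }
};

BoundedRunner.prototype._start = function (task) {
  const self = this;
  let finished = false;
  self.running++;
  task(function done() {
    if (finished) return; // ignore repeated calls to done()
    finished = true;
    self.running--;
    if (self.queue.length > 0) {
      self._start(self.queue.shift()); // a slot is free: run the next queued task
    }
  });
};

// True when nothing is running or queued (the state a wait() call would wait for).
BoundedRunner.prototype.isIdle = function () {
  return this.running === 0 && this.queue.length === 0;
};

// Usage: three tasks, at most two at a time.
const runner = new BoundedRunner(2);
const pending = [];
["a", "b", "c"].forEach(function (name) {
  runner.execute(function (done) {
    console.log("started " + name);
    pending.push(done);
  });
});
pending[0](); // finishing "a" lets "c" start
```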
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions
bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output
o This is the main function where ldquo mycustomrdquo is the name of the custom component
bull mycustom_getInputStructure() hellip
o This function allows to define the input schema
bull mycustom_getOutputType() return ltTYPEgt
o This is the function that defines the component output type The possible values are
bull LIST_TYPE = 1
bull PAGE_TYPE = 2
bull RECORD_TYPE = 3
bull SIMPLE_TYPE = 4
bull ARRAY_TYPE = 5
bull BINARY_TYPE = 6
bull BOOLEAN_TYPE = 7
bull DATE_TYPE = 8
bull DOUBLE_TYPE = 9
bull FLOAT_TYPE = 10
bull INT_TYPE = 11
bull LONG_TYPE = 12
bull STRING_TYPE = 13
bull URL_TYPE = 14
bull BROWSER_ID_TYPE = 15
bull mycustom_getOutputStructure) hellip
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 62
o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE
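As an illustration of the naming convention above, the following is a minimal sketch of a custom component file. The component name "shout", the assumption that the input arrives as a plain JavaScript string, and the placeholder body of shout_getInputStructure are all hypothetical; the text does not show how the input schema is actually declared, so that function is left empty rather than guessed.

```javascript
// Output type code taken from the list of possible values above.
var STRING_TYPE = 13;

// Main function of a hypothetical component named "shout".
// Assumption: the input is a plain JavaScript string.
function shout_main(shout_input) {
    var shout_output = null;
    if (shout_input !== null && shout_input !== undefined) {
        shout_output = String(shout_input).toUpperCase();
    }
    return shout_output;
}

// Input schema. The guide elides the body of this function, so this
// sketch leaves it as a placeholder.
function shout_getInputStructure() {
    return null; // placeholder only, not a real schema definition
}

// Declares the output type of the component: a string.
function shout_getOutputType() {
    return STRING_TYPE;
}
```

Under these assumptions the file would be saved as shout.js. No shout_getOutputStructure function is defined because the declared output type is neither RECORD_TYPE nor LIST_TYPE.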
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires writing a VQL statement of the following form:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just developed.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.3 Common functions
Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog notes which components do not provide some of these “common” functions.
5.3.3.1 onError function
• onError(errorId, errorAction): tells the component how to behave when a given type of error occurs. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is a connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” when there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return “null”, but the result is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
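The four error actions can be compared with a small model. This is a sketch only, not ITPilot code: runWithErrorAction and makeFlakyTask are hypothetical helpers, the string values given to the constants are placeholders, and the model assumes that the retry applies to the failing step and that ON_ERROR_RETRY raises the error once the retries are exhausted (which the text implies but does not state).

```javascript
// Action names taken from the text; the string values are placeholders.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs task() under the given error action. "retries" is the number of
// extra attempts made by the two RETRY actions.
function runWithErrorAction(task, action, retries) {
    var retrying = (action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE);
    var attempts = 1 + (retrying ? retries : 0);
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return task();
        } catch (e) {
            lastError = e;
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null; // the run continues and the step yields null
    }
    throw lastError; // RAISE, or RETRY with every attempt exhausted
}

// Builds a task that fails the given number of times and then succeeds.
function makeFlakyTask(failures) {
    var calls = 0;
    return function () {
        calls++;
        if (calls <= failures) {
            throw new Error("simulated TIMEOUT_ERROR");
        }
        return "page";
    };
}

var ignored = runWithErrorAction(makeFlakyTask(1), ON_ERROR_IGNORE, 0);
var retried = runWithErrorAction(makeFlakyTask(2), ON_ERROR_RETRY, 3);
var gaveUp = runWithErrorAction(makeFlakyTask(5), ON_ERROR_RETRY_IGNORE, 3);
```

Here ignored and gaveUp are null, while retried holds the value returned by the third attempt.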
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns “true” or “false”, depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
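The positional references $0, $1, … in the condition expression can be illustrated with a small stand-in. SketchCondition is a hypothetical object written for this example; it only models how each $n is bound to the n-th element passed to exec, and it evaluates the expression as plain JavaScript, whereas the real component uses the ITPilot functions listed in Appendix A of [GENER].

```javascript
// Hypothetical stand-in that models positional references only:
// $0, $1, ... are bound to the elements passed to exec().
function SketchCondition(expr) {
    this.expr = expr;
}

SketchCondition.prototype.exec = function (elements) {
    // Rewrite "$n" as "elements[n]" and evaluate the result as JavaScript.
    var source = this.expr.replace(/\$(\d+)/g, function (match, index) {
        return "elements[" + index + "]";
    });
    var evaluate = new Function("elements", "return (" + source + ");");
    return Boolean(evaluate(elements));
};

var myCondition = new SketchCondition("($0 <= $1)");
var met = myCondition.exec([3, 7]);     // first element is less than or equal to the second
var notMet = myCondition.exec([9, 7]);  // first element is greater than the second
```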
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each element can be identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string with the HTML content of the pages in which the differences between them are marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted content.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are ignored when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: with “true”, the deleted content is shown. With “false”, the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions can be applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
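The role of the prefix and suffix labels and of the token separator can be seen in a small token-level comparison. The sketch below is not the algorithm used by the Diff component, which the text does not describe; sketchDiff is a hypothetical function based on a longest-common-subsequence comparison, and the <ins> and <del> labels are illustrative values rather than the component defaults.

```javascript
// Wraps added and deleted tokens in the given labels (LCS-based token diff).
function sketchDiff(baseCode, finalCode, labels) {
    var a = baseCode.split(labels.tokenSeparator);
    var b = finalCode.split(labels.tokenSeparator);
    var i, j;

    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs[i] = [];
        for (j = 0; j <= b.length; j++) {
            lcs[i][j] = 0;
        }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = (a[i] === b[j])
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    // Walk both token lists, emitting common, deleted and added tokens.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length && j < b.length) {
        if (a[i] === b[j]) {
            out.push(a[i]);
            i++;
            j++;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
            i++;
        } else {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
            j++;
        }
    }
    for (; i < a.length; i++) {
        out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
    }
    for (; j < b.length; j++) {
        out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
    }
    return out.join(labels.tokenSeparator);
}

var labels = {
    additionPrefix: "<ins>", additionSuffix: "</ins>",
    deletionPrefix: "<del>", deletionSuffix: "</del>",
    tokenSeparator: " "
};
var result = sketchDiff("price 10 EUR", "price 12 EUR", labels);
```

With these inputs, result is the string "price <del>10</del> <ins>12</ins> EUR": the unchanged tokens are copied as they are, and each changed token is wrapped in the corresponding pair of labels.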
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): sets the internationalization configuration used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
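The way syncWithPost, setBackSequence and setBackPages interact can be summarised as a decision function. This is a sketch of the order of precedence described above, not ITPilot code: chooseSyncStrategy, the settings object and the returned labels are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: models the choice described for syncWithPost,
// setBackSequence and setBackPages. It does not call ITPilot.
function chooseSyncStrategy(settings) {
    if (settings.syncWithPost) {
        // Default method: re-send the original POST parameters.
        return { method: "POST" };
    }
    if (settings.backSequence) {
        // An explicit NSEQL back sequence was supplied.
        return { method: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Neither POST nor a back sequence: fall back to the NSEQL Back() command,
    // going back the configured number of pages.
    return { method: "BACK_COMMAND", pages: settings.backPages };
}

var byDefault = chooseSyncStrategy({ syncWithPost: true, backSequence: null, backPages: 1 });
var explicit = chooseSyncStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>", backPages: 1 });
var fallback = chooseSyncStrategy({ syncWithPost: false, backSequence: null, backPages: 2 });
```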
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for the list of records, which is passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
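The behaviour described in the NOTE can be modelled as follows. This is a sketch under stated assumptions: filterIgnoringErrors and priceBelowTwenty are hypothetical helpers, the records are represented as plain JavaScript objects, and the condition is an ordinary JavaScript function rather than an ITPilot filter expression.

```javascript
// Keeps the records for which predicate(record) is true. A record whose
// evaluation throws is dropped, and the remaining records are still processed.
function filterIgnoringErrors(inputRecords, predicate) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                kept.push(inputRecords[i]);
            }
        } catch (e) {
            // Mirrors the NOTE: the failing element is left out of the result.
        }
    }
    return kept;
}

// Hypothetical condition: the price must be a number below 20.
function priceBelowTwenty(record) {
    var price = parseFloat(record.PRICE);
    if (isNaN(price)) {
        throw new Error("PRICE is not a number: " + record.PRICE);
    }
    return price < 20;
}

var records = [
    { TITLE: "a", PRICE: "10" },
    { TITLE: "b", PRICE: "n/a" }, // evaluation fails for this record
    { TITLE: "c", PRICE: "3" },
    { TITLE: "d", PRICE: "45" }   // evaluates correctly but does not match
];
var kept = filterIgnoringErrors(records, priceBelowTwenty);
```

In this run, kept contains records "a" and "c": record "b" is omitted because its evaluation failed, and record "d" is omitted because it does not meet the condition.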
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop over a specific form, in which predetermined values for each of the form fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position of each valuesArray element when values are repeated.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position of each valuesArray element when values are repeated.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: positions of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position of each element of values when values are repeated.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. With “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when neither a back sequence nor POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
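The hasNext, next and close functions follow the usual iterator pattern, which the sketch below illustrates with an in-memory stand-in. SketchFormIterator is hypothetical: it returns the combination of field values for each iteration instead of the resulting page, its total function does not exist in the component, and it assumes that the iterations are the Cartesian product of the values supplied per field, with the last field varying fastest. The text does not state how combinations are ordered.

```javascript
// Hypothetical stand-in: enumerates value combinations instead of submitting a form.
function SketchFormIterator() {
    this.fields = [];        // one entry per configured field
    this.maxIterations = -1; // -1 means "no limit" in this sketch
    this.cursor = 0;
    this.closed = false;
}

SketchFormIterator.prototype.input = function (field, position, values) {
    this.fields.push({ name: field, position: position, values: values });
};

SketchFormIterator.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};

// Helper of this sketch only: number of iterations that will be produced.
SketchFormIterator.prototype.total = function () {
    var n = (this.fields.length === 0) ? 0 : 1;
    for (var i = 0; i < this.fields.length; i++) {
        n *= this.fields[i].values.length;
    }
    return (this.maxIterations >= 0) ? Math.min(n, this.maxIterations) : n;
};

SketchFormIterator.prototype.hasNext = function () {
    return !this.closed && this.cursor < this.total();
};

SketchFormIterator.prototype.next = function () {
    // Decode the cursor as a mixed-radix number, last field varying fastest.
    var rest = this.cursor++;
    var combo = {};
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var values = this.fields[i].values;
        combo[this.fields[i].name] = values[rest % values.length];
        rest = Math.floor(rest / values.length);
    }
    return combo;
};

SketchFormIterator.prototype.close = function () {
    this.closed = true;
};

var formIterator = new SketchFormIterator();
formIterator.input("city", 0, ["Madrid", "Paris"]);
formIterator.input("guests", 0, ["1", "2", "3"]);

var seen = [];
while (formIterator.hasNext()) {
    var combo = formIterator.next();
    seen.push(combo.city + "/" + combo.guests);
}
formIterator.close();
```

With two values for one field and three for the other, the loop runs six times. In a real wrapper, the body of the loop would typically pass the page returned by next to an extraction step.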
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in “cookies”.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional but becomes compulsory if completed; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field is required in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that sets the constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
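The effect of the three obl modes can be illustrated outside ITPilot with a minimal sketch. Everything in it is a stand-in: resolveParameters, the shape of the fields array, the field names and the string values of the constants are hypothetical, and the error thrown for a missing OBLIGATORY value is an assumption, since this section does not describe how ITPilot reports that case.

```javascript
// Hypothetical stand-ins for the obl constants; not the values ITPilot uses.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Builds the parameter record for one query from the declared fields.
// fields: [{ name, obl, fixedValue }]; query: values sent by the caller.
function resolveParameters(fields, query) {
    var record = {};
    fields.forEach(function (field) {
        var supplied = Object.prototype.hasOwnProperty.call(query, field.name);
        if (field.obl === FIXED) {
            // A FIXED field always takes its constant value.
            record[field.name] = field.fixedValue;
        } else if (supplied) {
            record[field.name] = query[field.name];
        } else if (field.obl === OBLIGATORY) {
            // Assumed behaviour: reject queries that omit an obligatory field.
            throw new Error("Missing obligatory parameter: " + field.name);
        } else {
            // OPTIONAL fields may be left without a value.
            record[field.name] = null;
        }
    });
    return record;
}

var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE", obl: OPTIONAL },
    { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
var record = resolveParameters(fields, { KEYWORD: "laptop" });
```

In this sketch the caller only supplies KEYWORD: MAX_PRICE stays empty because it is optional, and COUNTRY takes its fixed value regardless of the query.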
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute("_functionIterator_1", structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
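The hasNext()/next() contract can be tried outside ITPilot with the following sketch. The Iterator defined inside collectTitles is a hypothetical stand-in backed by a plain array, not the ITPilot component, and the TITLE field is invented for the example. It processes the records sequentially; no Thread object is involved.

```javascript
// Runs the hasNext()/next() loop against a stand-in Iterator so that the
// snippet works outside ITPilot. The stand-in is not the real component.
function collectTitles(records) {
    function Iterator(list) {
        this.list = list;
        this.position = 0;
    }
    Iterator.prototype.hasNext = function () {
        return this.position < this.list.length;
    };
    Iterator.prototype.next = function () {
        return this.list[this.position++];
    };

    var iterator = new Iterator(records);
    var titles = [];
    while (iterator.hasNext()) {
        var recordInstance = iterator.next();
        titles.push(recordInstance.TITLE);
    }
    return titles;
}

var titles = collectTitles([{ TITLE: "first" }, { TITLE: "second" }]);
```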
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established at startup, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established at startup, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6: Using the Loop function
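The pattern in Figure 6 can be exercised outside ITPilot with a small sketch. The Condition used here is a hypothetical stand-in that takes a JavaScript predicate instead of an ITPilot expression string, and countIterations is an illustrative name; only the shape of the while loop is taken from the figure.

```javascript
// Counts how many times a WHILE-style loop body runs.
function countIterations(start, limit) {
    // Stand-in Condition: wraps a predicate function. The real component
    // receives an expression string instead.
    function Condition(predicate) {
        this.predicate = predicate;
    }
    Condition.prototype.exec = function (elements) {
        return Boolean(this.predicate.apply(null, elements));
    };

    var loop = new Condition(function (current, max) {
        return current < max;
    });
    var current = start;
    var iterations = 0;
    // The condition is checked before every pass, including the first one.
    while (loop.exec([current, limit])) {
        current++;
        iterations++;
    }
    return iterations;
}

var threePasses = countIterations(0, 3);
var noPasses = countIterations(5, 3);
```

Because the condition is evaluated before the body, a condition that is false from the start means the loop operations never run.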
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates through different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries if an error occurs when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL with the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: component input record to be used as the wrapper result.
  o add(record): adds, at a later point, the component input record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and generates new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field, e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
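How positional field references and the two error actions interact can be sketched in plain JavaScript. The sketch assumes that a reference takes the form $<index>.<FIELD>; resolveField, buildRecord, the string values of the constants and the sample field names are all hypothetical, and dropping a field whose error is ignored is an assumption made for the example, not documented ITPilot behaviour.

```javascript
// Hypothetical stand-ins for the errorAction constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a reference of the form "$<index>.<FIELD>" against the input records.
function resolveField(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[Number(match[1])];
    if (!record || !(match[2] in record)) {
        throw new Error("Cannot evaluate: " + expression);
    }
    return record[match[2]];
}

// Builds a record from [fieldName, expression, errorAction] triples.
function buildRecord(records, definitions) {
    var output = {};
    definitions.forEach(function (definition) {
        try {
            output[definition[0]] = resolveField(definition[1], records);
        } catch (error) {
            if (definition[2] !== ON_ERROR_IGNORE) {
                throw error; // ON_ERROR_RAISE: stop and report the error
            }
            // Assumed behaviour: an ignored error leaves the field out.
        }
    });
    return output;
}

var inputs = [{ PARAM1: "abc" }, { PRICE: 10 }];
var result = buildRecord(inputs, [
    ["NAME", "$0.PARAM1", ON_ERROR_RAISE],
    ["COST", "$1.PRICE", ON_ERROR_RAISE],
    ["MISSING", "$1.UNKNOWN", ON_ERROR_IGNORE]
]);
```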
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
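The practical difference from the Loop component is that a do... while body runs once before the Condition is evaluated for the first time. The sketch below shows this with a hypothetical stand-in Condition that wraps a JavaScript predicate rather than an ITPilot expression string; countPasses is an illustrative name.

```javascript
// Counts how many times a do... while body runs.
function countPasses(start, limit) {
    // Stand-in Condition: wraps a predicate function. The real component
    // receives an expression string instead.
    function Condition(predicate) {
        this.predicate = predicate;
    }
    Condition.prototype.exec = function (elements) {
        return Boolean(this.predicate.apply(null, elements));
    };

    var repeat = new Condition(function (current, max) {
        return current < max;
    });
    var current = start;
    var passes = 0;
    do {
        // The body always runs before the first check.
        current++;
        passes++;
    } while (repeat.exec([current, limit]));
    return passes;
}

var severalPasses = countPasses(0, 3);
var singlePass = countPasses(5, 3);
```

With a starting value already past the limit, the body still executes once, which is the behaviour to keep in mind when choosing between Loop and Repeat.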
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection is reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL with the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): waits until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      ...
      return mycustom_output;
  }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
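As an illustration of the naming convention, the following sketch outlines a hypothetical component called "shout" that returns a string. It assumes the input arrives as a plain string, which this section does not state, and it declares STRING_TYPE locally (with the value listed above) only so that the snippet runs standalone. The getInputStructure function is left out because the structure it has to return is not shown in this section.

```javascript
// Output type constant, with the value listed for STRING_TYPE above.
// Declared here only so that the sketch runs outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: the input arrives as a plain string.
function shout_main(shout_input) {
    var shout_output = null;
    if (shout_input !== null && shout_input !== undefined) {
        shout_output = String(shout_input).toUpperCase();
    }
    return shout_output;
}

// Declares the output as a string. Since the type is neither RECORD_TYPE
// nor LIST_TYPE, no shout_getOutputStructure function is needed.
function shout_getOutputType() {
    return STRING_TYPE;
}
```

Following the convention above, such a file would be named after the component and placed in the custom components directory.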
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
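The escaping step can be automated with a small helper. This is a sketch under two assumptions that should be checked against [VQL]: that the delimiter is the double quote and that the escape character is the backslash. buildCreateWrapper and the wrapper name are hypothetical, the optional MAINTENANCE clause is omitted, and code that already contains backslash-escaped quotes is not handled.

```javascript
// Builds a CREATE WRAPPER ITP statement around a piece of JavaScript code.
// Assumptions: double quotes delimit the code and "\" escapes inner quotes.
function buildCreateWrapper(name, jscode) {
    // Prefix every double quote in the code with a backslash.
    var escaped = jscode.replace(/"/g, '\\"');
    return 'CREATE WRAPPER ITP ' + name + ' "' + escaped + '"';
}

var statement = buildCreateWrapper("my_wrapper", 'var greeting = "hi";');
```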
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.3.2 debugLevel function
• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false", depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: elements on which the condition is evaluated. It must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]".
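The positional placeholders can be illustrated with a toy evaluator. This is not how ITPilot evaluates conditions: evaluateCondition is a hypothetical helper that substitutes each $N with the N-th element and lets JavaScript evaluate the result, so it only works for trusted expressions that happen to be valid JavaScript. ITPilot's own expression functions are those described in [GENER].

```javascript
// Illustrative evaluator: replaces $N with the N-th element of the list
// and evaluates the resulting JavaScript expression.
// Only meant for trusted, JavaScript-compatible expressions.
function evaluateCondition(expr, elements) {
    var body = expr.replace(/\$(\d+)/g, function (match, index) {
        return "elements[" + index + "]";
    });
    var test = new Function("elements", "return (" + body + ");");
    return Boolean(test(elements));
}

// $0 is the first element and $1 the second.
var met = evaluateCondition("($0 <= $1)", [3, 5]);
var notMet = evaluateCondition("($0 <= $1)", [7, 5]);
```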
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag that marks added content.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag that marks added content.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag that marks deleted content.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag that marks deleted content.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
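To make the relationship between diff, exec and the prefix and suffix labels more concrete, the following is a minimal sketch of a token-level comparison. DiffSketch is a hypothetical class written for this illustration and is not the ITPilot Diff component: splitting on whitespace, the default labels (<ins> and <del> instead of the coloured background tags) and the use of a longest-common-subsequence table are all assumptions, since the guide does not describe the comparison algorithm.

```javascript
// Illustrative stand-in, not ITPilot's Diff: whitespace tokenization and the
// default labels below are assumptions made for this sketch.
class DiffSketch {
  constructor(additionPrefix = "<ins>", additionSuffix = "</ins>",
              deletionPrefix = "<del>", deletionSuffix = "</del>") {
    this.labels = { additionPrefix, additionSuffix, deletionPrefix, deletionSuffix };
    this.caseInsensitive = false;
    this.nullWhenEquals = false;
  }

  setCaseInsensitive(flag) { this.caseInsensitive = flag; }
  setNullWhenEquals(flag) { this.nullWhenEquals = flag; }

  tokenize(code) {
    const text = this.caseInsensitive ? code.toLowerCase() : code;
    return text.split(/\s+/).filter((token) => token.length > 0);
  }

  // Returns true when both pages yield the same token sequence.
  diff(baseCode, finalCode) {
    const a = this.tokenize(baseCode);
    const b = this.tokenize(finalCode);
    return a.length === b.length && a.every((token, i) => token === b[i]);
  }

  // Marks additions and deletions using a longest-common-subsequence table.
  exec(baseCode, finalCode) {
    if (this.nullWhenEquals && this.diff(baseCode, finalCode)) {
      return null;
    }
    const a = this.tokenize(baseCode);
    const b = this.tokenize(finalCode);
    // lcs[i][j] holds the LCS length of a[i..] and b[j..].
    const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
    for (let i = a.length - 1; i >= 0; i--) {
      for (let j = b.length - 1; j >= 0; j--) {
        lcs[i][j] = a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
      }
    }
    const { additionPrefix, additionSuffix, deletionPrefix, deletionSuffix } = this.labels;
    const out = [];
    let i = 0;
    let j = 0;
    while (i < a.length || j < b.length) {
      if (i < a.length && j < b.length && a[i] === b[j]) {
        out.push(a[i]); // unchanged token
        i++;
        j++;
      } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
        out.push(additionPrefix + b[j] + additionSuffix); // token only in the target page
        j++;
      } else {
        out.push(deletionPrefix + a[i] + deletionSuffix); // token only in the source page
        i++;
      }
    }
    return out.join(" ");
  }
}

const pageDiff = new DiffSketch();
console.log(pageDiff.diff("<p> old price </p>", "<p> new price </p>")); // false
console.log(pageDiff.exec("<p> old price </p>", "<p> new price </p>"));
// <p> <ins>new</ins> <del>old</del> price </p>
```

In this sketch, diff answers only whether the pages are equal, while exec builds the annotated result page, which mirrors the split between the two functions described above.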
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or the functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g., MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.
  o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. This is "true" by default.
  o setI18n(i18n): function that updates the process internationalization.
    • i18n: type of internationalization to use. ITPilot provides different types of internationalization options such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): this function lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it has not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): this function lets the user optionally set an explicit browsing sequence back to the source page, against which further data extraction operations will be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): this function determines the number of pages ITPilot can go back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST-based synchronization has been configured.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
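The order in which the page-state recovery options described for syncWithPost, setBackSequence and setBackPages are considered can be summarised with a small decision function. chooseBackStrategy is a hypothetical helper that exists only in this sketch; the strategy names it returns and the default of one page for backPages are assumptions, while the default of POST synchronization comes from the description above.

```javascript
// Hypothetical helper, not an ITPilot function. It only reports which
// recovery strategy the description above would select.
function chooseBackStrategy(settings) {
  const { syncWithPost = true, backSequence = null, backPages = 1 } = settings || {};
  if (syncWithPost) {
    // Default method: repeat the POST that originally produced the page.
    return { strategy: "POST", detail: "re-send the original POST parameters" };
  }
  if (backSequence) {
    // An explicit NSEQL back sequence was configured with setBackSequence.
    return { strategy: "BACK_SEQUENCE", detail: backSequence };
  }
  // Last resort: the NSEQL Back() command, going back backPages pages
  // (the default of 1 is an assumption of this sketch).
  return { strategy: "BACK_COMMAND", detail: "Back() x " + backPages };
}

console.log(chooseBackStrategy({}).strategy); // POST
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back sequence>" }).strategy); // BACK_SEQUENCE
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }).detail); // Back() x 2
```

The value passed to setBackPages only matters in the last branch, which is consistent with the guide stating that it applies when no back sequence and no POST navigation have been configured.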
5.3.13 Filter
• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements, except for the one that caused the error.
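The effect of the NOTE above can be illustrated with a plain JavaScript function. filterSketch is a hypothetical helper, not part of the ITPilot API: it takes an ordinary predicate function instead of a filter expression, and a boolean flag stands in for the ON_ERROR_IGNORE and ON_ERROR_RAISE handlers.

```javascript
// Hypothetical helper, not part of the ITPilot API: it mimics the NOTE about
// ON_ERROR_IGNORE using a plain predicate instead of a filter expression.
function filterSketch(inputRecords, auxiliaryRecords, predicate, ignoreErrors) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record, auxiliaryRecords)) {
        kept.push(record);
      }
    } catch (error) {
      if (!ignoreErrors) {
        throw error; // comparable to raising the error
      }
      // With errors ignored, the offending record is simply left out.
    }
  }
  return kept;
}

const products = [{ price: 5 }, { price: null }, { price: 20 }];
const limits = [{ maxPrice: 10 }];
const cheaperThanLimit = (record, aux) => {
  if (record.price === null) {
    throw new Error("price is missing");
  }
  return record.price <= aux[0].maxPrice;
};

console.log(filterSketch(products, limits, cheaperThanLimit, true));
// [ { price: 5 } ]
```

The auxiliary records take part in the condition (here, as the price limit) but never appear in the result, which matches the role described for auxiliaryRecords.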
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each value element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may merely contain them ("false").
  o click(field, value, state): function that selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the elements selected, as a list (e.g., [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): function that indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): function that determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): function that closes the iterator.
  o syncWithPost(flag): this function indicates whether, in order to recover the page state, a POST message must be sent to the page URL containing the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it has not, an NSEQL Back() command is run.
  o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST navigation has been configured.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
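The idea of one iteration per combination of field values, including the two combinations produced by CLICKED_AND_NON_CLICKED_ELEMENT, can be sketched as follows. FormIterationSketch is a hypothetical class for illustration only: it assumes that the iterations are the cartesian product of the values configured for each field (the guide speaks of combinations but does not spell out how or in which order they are generated), it simplifies the input and click signatures, and the values given to the three constants are invented.

```javascript
// Hypothetical constants and class for illustration; the names mirror the
// guide, but the values and the combination order are assumptions.
const CLICKED_ELEMENT = "clicked";
const NON_CLICKED_ELEMENT = "non-clicked";
const CLICKED_AND_NON_CLICKED_ELEMENT = "both";

class FormIterationSketch {
  constructor() {
    this.fields = []; // one entry per configured field: { name, options }
    this.combinations = null;
    this.cursor = 0;
  }

  // Simplified counterpart of input(field, position, values).
  input(field, values) {
    this.fields.push({ name: field, options: values });
    return this;
  }

  // Simplified counterpart of click(field, value, state) on a checkbox.
  click(field, state) {
    const options = state === CLICKED_AND_NON_CLICKED_ELEMENT
      ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
      : [state];
    this.fields.push({ name: field, options });
    return this;
  }

  // Builds the cartesian product of all field options on first use.
  build() {
    if (this.combinations === null) {
      this.combinations = this.fields.reduce(
        (partial, field) => partial.flatMap(
          (combo) => field.options.map((option) => ({ ...combo, [field.name]: option }))
        ),
        [{}]
      );
    }
    return this.combinations;
  }

  hasNext() {
    return this.cursor < this.build().length;
  }

  next() {
    if (!this.hasNext()) {
      throw new Error("No more iterations");
    }
    return this.combinations[this.cursor++];
  }
}

const formIterator = new FormIterationSketch()
  .input("city", ["Madrid", "Palo Alto"])
  .click("onlyOffers", CLICKED_AND_NON_CLICKED_ELEMENT);

// Two cities combined with two checkbox states give four iterations.
while (formIterator.hasNext()) {
  console.log(formIterator.next());
}
```

The hasNext and next pair is used here in the same way as on the real component, except that the sketch returns the set of field values for each iteration rather than the page obtained by submitting the form.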
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL where the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in cookies.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optionally used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the standard process main function; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but it becomes compulsory if completed; otherwise, the wrapper may not run. This representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): function that updates the process internationalization (i18n).
    • i18n: type of internationalization to be used. ITPilot provides different types of i18n configurations such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
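The difference between OPTIONAL, OBLIGATORY and FIXED fields can be sketched with a small stand-in. InitSketch is a hypothetical class, not the ITPilot Init object: it only models text fields, the string values of the three constants and the error message are invented, and the query values are passed to exec explicitly, whereas the real component takes them from the JavaScript context.

```javascript
// Hypothetical stand-in for the Init component; the constant values and the
// error message are invented for this sketch.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitSketch {
  constructor() {
    this.fields = [];
  }

  // obl defaults to OPTIONAL, as the guide states for the set* functions.
  setText(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field, obl, fixedValue });
    return this;
  }

  // query: plain object with the values supplied by the calling application
  // (a simplification; the real exec() takes no arguments).
  exec(query) {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant, whatever the caller sent
      } else if (field in query) {
        record[field] = query[field];
      } else if (obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + field);
      } else {
        record[field] = null; // optional and not supplied
      }
    }
    return record;
  }
}

const init = new InitSketch()
  .setText("TITLE", OBLIGATORY)
  .setText("AUTHOR")
  .setText("LANGUAGE", FIXED, "en");

console.log(init.exec({ TITLE: "Dune" }));
// { TITLE: 'Dune', AUTHOR: null, LANGUAGE: 'en' }
```

Calling init.exec({}) in this sketch throws, because TITLE was declared OBLIGATORY and no value was supplied for it.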
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5: Using threads in the Iterator component
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: component unique identifier.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
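The call order described above can be sketched as follows. This is only an illustration under stated assumptions: the JDBCExtractor function defined here is a hypothetical stand-in that records its configuration so the sketch runs outside ITPilot, and the connection URL, driver class, credentials, queries, driver property and structure shape are all placeholder values, not values taken from this guide.

```javascript
// Hypothetical stand-in for the JDBCExtractor component. It only records its
// configuration so the call sequence can be run outside ITPilot; it opens no
// database connection.
function JDBCExtractor(uuid, uri, driver, userName, password, structure,
                       baseRecords, maxPoolSize, initialPoolSize, checkQuery, query) {
    this.uri = uri;
    this.driver = driver;
    this.query = query;
    this.pool = { enabled: true, max: maxPoolSize, initial: initialPoolSize, ping: checkQuery };
    this.driverProperties = {};
}
JDBCExtractor.prototype.setPoolConfig = function (maxPoolSize, initialPoolSize, pingQuery) {
    this.pool.max = maxPoolSize;
    this.pool.initial = initialPoolSize;
    this.pool.ping = pingQuery;
};
JDBCExtractor.prototype.disablePool = function () {
    this.pool.enabled = false;
};
JDBCExtractor.prototype.addDriverProperty = function (propname, propvalue) {
    this.driverProperties[propname] = propvalue;
};
JDBCExtractor.prototype.exec = function (query, baseRecords) {
    this.query = query;
    return []; // the real component returns the record list produced by the query
};

// All values below are placeholders chosen for illustration.
var customerStructure = { NAME: null, CITY: null };   // assumed shape of the output record
var extractor = new JDBCExtractor(
    "jdbc-extractor-1",                         // uuid
    "jdbc:postgresql://localhost:5432/sales",   // uri
    "org.postgresql.Driver",                    // driver
    "report_user", "secret",                    // userName, password
    customerStructure, [],                      // structure, baseRecords
    10, 2,                                      // maxPoolSize, initialPoolSize
    "SELECT 1",                                 // checkQuery
    "SELECT name, city FROM customers");        // query

// Adjust the pool and add a driver property before running the query.
extractor.setPoolConfig(20, 4, "SELECT 1");
extractor.addDriverProperty("loginTimeout", "10");
var customers = extractor.exec("SELECT name, city FROM customers WHERE city = 'Madrid'", []);
```

Inside a real wrapper the stand-in would be dropped and the component supplied by ITPilot used directly; only the order of the calls is meant to carry over.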
5.3.19 Loop
• Description: This component allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
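The pattern in Figure 6 can be tried outside ITPilot with the following minimal sketch. The Condition function, the two constants and the variables page, maxPages and visited are hypothetical stand-ins introduced for illustration; the stand-in only understands the expression ($0 < $1), whereas the real component accepts the expression language of Appendix A of [GENER]. Passing the loop variables through exec() is also an assumption of this sketch.

```javascript
// Hypothetical stand-in for the Condition component: it only evaluates
// "($0 < $1)" against the elements passed to exec().
function Condition(expr) {
    if (expr !== "($0 < $1)") {
        throw new Error("This sketch only supports ($0 < $1)");
    }
}
Condition.prototype.onError = function (errorType, action) {
    // The real component registers an error handler; the stand-in ignores it.
};
Condition.prototype.exec = function (elements) {
    return elements[0] < elements[1];
};

// Assumed values for the constants used in Figure 6; ITPilot supplies the real ones.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var page = 0;       // hypothetical loop variable
var maxPages = 3;   // hypothetical limit
var visited = [];

var loop = new Condition("($0 < $1)");
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([page, maxPages])) {
    visited.push(page);   // <loop operations> would go here
    page = page + 1;
}
```

Because the condition is checked before each pass, the body does not run at all when the condition is false on entry.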
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iterating over different inter-related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements, except for the one that caused the error.
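A minimal sketch of how add() and exec() fit together is shown below. The Record_Constructor function is a hypothetical stand-in written so the example runs outside ITPilot: it represents records as plain JavaScript objects, understands only expressions of the form $<index>.<FIELD>, and treats ON_ERROR_IGNORE as "skip the field", which is an assumption of this sketch. The constant values and the sample records are likewise invented for illustration.

```javascript
// Assumed values for the error-action constants; ITPilot supplies the real ones.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for Record_Constructor. Records are plain objects and
// only expressions such as "$0.TITLE" are understood.
function Record_Constructor(recordsObj, name) {
    this.records = recordsObj;
    this.name = name;
    this.fields = [];
}
Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ name: fieldName, expression: expression, errorAction: errorAction });
};
Record_Constructor.prototype.exec = function () {
    var output = {};
    for (var i = 0; i < this.fields.length; i++) {
        var field = this.fields[i];
        var match = /^\$(\d+)\.(\w+)$/.exec(field.expression);
        var source = match ? this.records[parseInt(match[1], 10)] : undefined;
        if (!source || !(match[2] in source)) {
            if (field.errorAction === ON_ERROR_RAISE) {
                throw new Error("Cannot evaluate " + field.expression);
            }
            continue; // ON_ERROR_IGNORE: leave the field out and keep going
        }
        output[field.name] = source[match[2]];
    }
    return output;
};

// Two sample input records; $0 refers to book and $1 to shop.
var book = { TITLE: "Dune", PRICE: 9.5 };
var shop = { NAME: "Example Books" };

var constructor1 = new Record_Constructor([book, shop], "OFFER");
constructor1.add("TITLE", "$0.TITLE", ON_ERROR_RAISE);
constructor1.add("SHOP", "$1.NAME", ON_ERROR_RAISE);
constructor1.add("ISBN", "$0.ISBN", ON_ERROR_IGNORE);   // book has no ISBN field
var offer = constructor1.exec();
```

With ON_ERROR_RAISE on the ISBN field instead, the stand-in would throw when exec() is called, mirroring the "stop the wrapper run" behavior described for that action.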
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: This creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator runs in parallel with it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: This component allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
5.3.26 Script
• Description: This component allows part of the description logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: This creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the run thread on the described function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
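The usual order of calls (configure, execute once per record, then wait) can be sketched as follows. The Thread function here is a hypothetical, synchronous stand-in so the example runs outside ITPilot: it calls each function immediately instead of running it concurrently, and wait() therefore has nothing left to wait for. The function processRecord, the sample records and their field names are invented for illustration.

```javascript
// Hypothetical synchronous stand-in for the Thread component. The real
// component runs the functions concurrently; this one calls them in order.
function Thread() {
    this.maxConcurrent = null;
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};
Thread.prototype.execute = function (fn) {
    // Forward every argument after the function itself.
    var args = Array.prototype.slice.call(arguments, 1);
    fn.apply(null, args);
};
Thread.prototype.wait = function () {
    // Nothing is pending in this synchronous stand-in.
};

var totals = [];

// Hypothetical per-record processing function; its parameters must match the
// arguments passed to execute().
function processRecord(structure, record) {
    totals.push(record.PRICE * record.UNITS);
}

var records = [{ PRICE: 2, UNITS: 3 }, { PRICE: 5, UNITS: 1 }];

var worker = new Thread();
worker.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
    worker.execute(processRecord, null, records[i]);
}
worker.wait();   // continue only after every execution has finished
```

With the real component the executions overlap, so results such as totals should only be read after wait() returns.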
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
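As an illustration of the naming convention, the following sketch outlines a hypothetical custom component called "greeting" that returns a string. The component name, the assumption that the input is a simple string, and the local definition of the type constants (copied from the list above so the sketch runs on its own) are all choices made for this example. The getInputStructure and getOutputStructure functions are left out because this section does not show what their bodies contain.

```javascript
// Output type codes, copied from the list in this section so the sketch is
// self-contained.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of the hypothetical "greeting" component. The input is
// assumed to be a simple string value.
function greeting_main(greeting_input) {
    var greeting_output = null;
    greeting_output = "Hello, " + greeting_input;
    return greeting_output;
}

// Declares that the component returns a string. Because the type is neither
// RECORD_TYPE nor LIST_TYPE, no greeting_getOutputStructure function is needed.
function greeting_getOutputType() {
    return STRING_TYPE;
}

var sampleOutput = greeting_main("ITPilot");
var needsOutputStructure =
    greeting_getOutputType() === RECORD_TYPE || greeting_getOutputType() === LIST_TYPE;
```

Following the storage rule of this section, such a component would live in a file named after the component (here, greeting.js) in the custom components directory.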
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement simply has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.4 Add Record To List
• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.
5.3.5 Condition
• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a string of characters; e.g. MyCondition = new Condition(“($0 <= $1)”) indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
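The relationship between the positional references in the constructor expression and the elements passed to exec can be sketched as follows. The Condition function below is a hypothetical stand-in so the example runs outside ITPilot: it understands only a single comparison between two positional references, whereas the real component supports the expression language of Appendix A of [GENER].

```javascript
// Hypothetical stand-in for the Condition component. It parses one comparison
// between two positional references, such as "($0 <= $1)".
function Condition(expr) {
    var match = /^\s*\(?\s*\$(\d+)\s*(<=|>=|<|>|==|!=)\s*\$(\d+)\s*\)?\s*$/.exec(expr);
    if (!match) {
        throw new Error("Unsupported expression in this sketch: " + expr);
    }
    this.left = parseInt(match[1], 10);
    this.operator = match[2];
    this.right = parseInt(match[3], 10);
}

// exec(elements): $0 refers to the first element of the array, $1 to the
// second, and so on.
Condition.prototype.exec = function (elements) {
    var a = elements[this.left];
    var b = elements[this.right];
    switch (this.operator) {
        case "<=": return a <= b;
        case ">=": return a >= b;
        case "<":  return a < b;
        case ">":  return a > b;
        case "==": return a == b;
        default:   return a != b;
    }
};

var myCondition = new Condition("($0 <= $1)");
var firstResult = myCondition.exec([3, 5]);    // is 3 <= 5 ?
var secondResult = myCondition.exec([7, 5]);   // is 7 <= 5 ?
```

The same Condition instance can be evaluated repeatedly with different element lists, which is how the Loop and Repeat components use it as their output condition.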
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them with regard to the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component will not take HTML tag attributes into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored. With “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: if “true”, the deleted content is shown. If “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters; e.g. MyCondition = new CONDITION(“($0 <= $1)”) indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, “false” if not. This is “true” by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
bull reusingConnection if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session
o setBackPages(pages) this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST navigation has not been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
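The order in which syncWithPost, setBackSequence and setBackPages are taken into account can be sketched as a small decision function. This is only an illustration of the rules described above: chooseBackStrategy and its option names are hypothetical and are not part of the ITPilot API, and the default of one page for the Back() fallback is an assumption.

```javascript
// Hypothetical helper (not part of the ITPilot API): decides how the
// previous page state would be recovered, following the rules above.
function chooseBackStrategy(options) {
    // POST synchronization is described as the default method.
    var syncWithPost = options.syncWithPost !== false;
    if (syncWithPost) {
        return { method: "POST" };
    }
    // Otherwise an explicit back sequence (an NSEQL program) is used if set.
    if (options.backSequence) {
        return { method: "BACK_SEQUENCE", sequence: options.backSequence };
    }
    // Last resort: the NSEQL Back() command for the configured number of pages.
    // Assumption: one page when no value was configured.
    var pages = options.backPages === undefined ? 1 : options.backPages;
    return { method: "BACK_COMMAND", pages: pages };
}

var byDefault = chooseBackStrategy({});
var withSequence = chooseBackStrategy({
    syncWithPost: false,
    backSequence: "<NSEQL back program>"   // placeholder, not real NSEQL
});
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });

console.log(byDefault.method, withSequence.method, withBack.method);
```

The same order applies to the other components that expose these functions (Form Iterator, Next Interval Iterator and Sequence).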
5.3.13 Filter
bull Object Filter
bull Description this carries out a filtering operation from a list of records returning those meeting a given condition
bull Functions
o Constructor(expr, auxiliaryRecords)
bull expr regular expression of the filtering operation, applied to the list of records described in the exec function
bull auxiliaryRecords list of records that take part in the filter condition but are not themselves the records to filter
o exec(inputRecords, auxiliaryRecords) function that receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor
bull inputRecords list of input records
bull auxiliaryRecords list of records that take part in the filter condition but are not themselves the records to filter
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error
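The behaviour described in the NOTE can be pictured with a plain JavaScript sketch. The filterRecords helper, the sample records and the use of a JavaScript function as the condition are all assumptions made for illustration; the real Filter component evaluates an ITPilot expression instead.

```javascript
// Stand-in constants and helper; these only mimic the behaviour described
// in the NOTE above and are not the ITPilot implementation.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, condition, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (condition(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            // With ON_ERROR_IGNORE the offending record is skipped;
            // otherwise the error stops the whole operation.
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e;
            }
        }
    }
    return result;
}

// Hypothetical sample data: the second record has no PRICE field.
var records = [
    { TITLE: "a", PRICE: 10 },
    { TITLE: "b" },
    { TITLE: "c", PRICE: 30 },
    { TITLE: "d", PRICE: 1 }
];

function priceAtLeastFive(record) {
    if (typeof record.PRICE !== "number") {
        throw new Error("PRICE is not a number in record " + record.TITLE);
    }
    return record.PRICE >= 5;
}

var kept = filterRecords(records, priceAtLeastFive, ON_ERROR_IGNORE);
console.log(kept.map(function (r) { return r.TITLE; }).join(","));   // a,c
```

With ON_ERROR_RAISE in place of ON_ERROR_IGNORE, the same call would stop at record "b" and propagate the error.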
5.3.14 Form Iterator
bull Object Form_Iterator
bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run
bull Functions
o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
bull findForm NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)
bull submitForm NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull baseElements optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component
bull inputPage input page from which the selected form can be iteratively invoked
bull parallelIterator if "true", the component executes its iterations in parallel
o selectMultiplePositions(field, position, positionsArray, clickedArray) indicates which positions are selected in a multiple selection field of the target form
bull field name of the multiple selection field
bull position position of the field among those with the same name, starting with position 0
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray) this indicates the values selected from a multiple selection field of the chosen form
bull field name of the multiple selection field
bull position position of the field among those with the same name, starting with position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field, position, positions) this indicates the values selected from a selection field of the chosen form
bull field name of the HTML selection field
bull position position of the field when more than one field has the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field, position, values, positions, equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field when several fields on the form have the same name
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equals boolean value that indicates whether the provided values must be identical to those appearing in the field or may be contained in them
o click(field, value, state) function that selects an element and runs a "click" event on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element
bull state when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field, position, values) function that indicates the values entered in an input field
bull field name of the HTML input field
bull position position of the field when several fields on the form have the same name
bull values list of values to be entered in the field
o textarea(field, position, values) this indicates the values entered in a text area
bull field name of the HTML text area field
bull position position of the field when several fields on the form have the same name
bull values list of values to be entered in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) sets the number of retries in the event of failure
bull count number of retries
o setRetryDelay(mseconds) sets the waiting time between retries
bull mseconds waiting time between retries, in milliseconds
o setParallelIterator(flag) indicates whether the component launches its iterations in parallel
bull flag if "true", the iterations are executed in parallel
o next(inputPage) this returns the page resulting from running a component iteration
bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run
o hasNext() function that determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not
o close() function that closes the iterator
o syncWithPost(flag) this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method
bull flag "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and POST navigation has not been configured as the back method
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
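The effect of the three click-state constants on the number of generated combinations can be sketched as follows. The constant names come from the text above, but their values, the expandClickStates helper and the idea that the states of several elements are combined as a Cartesian product are assumptions made for this illustration.

```javascript
// Stand-in constants: the names come from the text above, the values are
// arbitrary placeholders chosen for this sketch.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non_clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Hypothetical helper: expands a clickedArray into the list of
// marked/unmarked combinations (true = marked, false = unmarked).
function expandClickStates(clickedArray) {
    var combinations = [[]];
    clickedArray.forEach(function (state) {
        var choices;
        if (state === CLICKED_ELEMENT) {
            choices = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            choices = [false];
        } else if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
            choices = [true, false];   // two combinations for this element
        } else {
            throw new Error("Unknown click state: " + state);
        }
        var next = [];
        combinations.forEach(function (combination) {
            choices.forEach(function (choice) {
                next.push(combination.concat([choice]));
            });
        });
        combinations = next;
    });
    return combinations;
}

var combos = expandClickStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
console.log(JSON.stringify(combos));   // [[true,true,false],[true,false,false]]
```

Under this assumption, each additional CLICKED_AND_NON_CLICKED_ELEMENT entry doubles the number of combinations, which is one reason to bound the run with setMaxIterations.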
5.3.15 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool using a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain) executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET or POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie information stored in "cookies"
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object Init
bull Description this component is responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application
bull Functions
o Constructor(input, output)
bull input input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function)
o get(name) this returns the value of a field of the record created from the initialization parameters
bull name name of the record field
o setText(field, obl, fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field, obl, fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field, obl, fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field, obl, fixedValue) this creates a float-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field, obl, fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field, obl, fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field, obl, fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field, obl, fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field, format, obl, fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field. The format is optional, but once specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) sets the component name
bull name new component name
o setI18n(i18n) function that sets the i18n configuration of the process
bull i18n type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
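The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be sketched with a small stand-in. The buildInitRecord helper, the field names and the choice of null for an optional parameter without a value are assumptions made for illustration and do not reproduce the Init implementation.

```javascript
// Stand-in constants and helper illustrating the OPTIONAL / OBLIGATORY /
// FIXED semantics described above; not the ITPilot Init implementation.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function buildInitRecord(fields, queryParameters) {
    var record = {};
    fields.forEach(function (field) {
        var supplied = Object.prototype.hasOwnProperty.call(queryParameters, field.name);
        if (field.obl === FIXED) {
            // A fixed field always carries its constant value.
            record[field.name] = field.fixedValue;
        } else if (supplied) {
            record[field.name] = queryParameters[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        } else {
            // OPTIONAL parameter without a value in this query (assumed null).
            record[field.name] = null;
        }
    });
    return record;
}

// Hypothetical wrapper parameters.
var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE", obl: OPTIONAL },
    { name: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];

var initRecord = buildInitRecord(fields, { KEYWORD: "denodo" });
console.log(JSON.stringify(initRecord));
// {"KEYWORD":"denodo","MAX_PRICE":null,"LANGUAGE":"en"}
```

A query that omits KEYWORD would be rejected in this sketch, which mirrors the statement that an OBLIGATORY parameter is required in any query made on the wrapper.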
5.3.17 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate. "true" is returned if there is at least one more result
o next() this returns the next iteration element. The list is a sorted sequence of records
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
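For comparison, the sequential form of the same loop can be tried outside ITPilot with a minimal stand-in. RecordListIterator is a hypothetical class written for this sketch that only reproduces the hasNext() and next() functions described above, and the sample records are invented.

```javascript
// Minimal stand-in for the Iterator component, exposing only the
// hasNext() and next() functions described above.
function RecordListIterator(list) {
    this.list = list;
    this.position = 0;
}
RecordListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
RecordListIterator.prototype.next = function () {
    if (!this.hasNext()) {
        throw new Error("No more records");
    }
    // Records are returned in list order, one per call.
    return this.list[this.position++];
};

var iterator = new RecordListIterator([
    { TITLE: "first" },
    { TITLE: "second" },
    { TITLE: "third" }
]);

var visited = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    visited.push(recordInstance.TITLE);
}
console.log(visited.join(","));   // first,second,third
```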
5.3.18 JDBCExtractor
bull Object JDBCExtractor
bull Description this component sends a query to any source available via JDBC and returns a record list with the results obtained
bull Functions
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the component's output record list. It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established and kept ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist
bull query SQL query that returns the results required by the component
o exec(query, baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established and kept ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist
o disablePool() disables the connection pool
o addDriverProperty(propname, propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5.3.19 Loop
bull Description This allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
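The structure in Figure 6 can be exercised with a simplified stand-in for the Condition object. In this sketch Condition merely wraps a JavaScript function, which is an assumption made so that the example runs on its own; in ITPilot the object evaluates a condition expression, and onError is omitted here.

```javascript
// Stand-in for the Condition object: here it simply wraps a JavaScript
// function, whereas ITPilot evaluates a condition expression.
function Condition(test) {
    this.test = test;
}
Condition.prototype.exec = function (inputs) {
    return Boolean(this.test(inputs));
};

// Hypothetical work queue processed by the loop.
var pending = ["page1", "page2", "page3"];
var processed = [];

var loop = new Condition(function () { return pending.length > 0; });
while (loop.exec([])) {
    processed.push(pending.shift());
}

// A while loop checks the condition first, so the body may never run.
var runs = 0;
var neverTrue = new Condition(function () { return false; });
while (neverTrue.exec([])) {
    runs++;
}

console.log(processed.length, runs);   // 3 0
```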
5.3.20 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description this allows iteration over different inter-related pages, using one or several browsing sequences
bull Functions
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration
bull iterations this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse boolean value that indicates whether the browser used so far is reused or a new browser is launched that maintains the session's information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords, inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag) this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method
bull flag "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and POST navigation has not been configured as the back method
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
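How the values given to setRetries and setRetryDelay might interact can be sketched with a generic retry helper. runWithRetries is hypothetical, the pause is replaced by an injected function so that the sketch stays synchronous, and the reading of "retries" as extra attempts after the first failure is an assumption, not documented behaviour.

```javascript
// Hypothetical helper showing one way retries and a retry delay could
// interact; "sleep" is injected so the sketch stays synchronous.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    for (;;) {
        try {
            return action(attempt);
        } catch (e) {
            // Assumption: "retries" counts the extra attempts made after
            // the first failure, so at most retries + 1 attempts in total.
            if (attempt >= retries) {
                throw e;
            }
            attempt++;
            sleep(delayMs);
        }
    }
}

var waits = [];
function recordWait(ms) {
    waits.push(ms);   // stands in for a real pause
}

// Simulated access to the next page that fails twice before succeeding.
function flakyNextPage(attempt) {
    if (attempt < 2) {
        throw new Error("page not available yet");
    }
    return "next page reached on attempt " + (attempt + 1);
}

var outcome = runWithRetries(flakyNextPage, 3, 500, recordWait);
console.log(outcome);        // next page reached on attempt 3
console.log(waits.length);   // 2
```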
5.3.21 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this adds a component input record to be used as the wrapper result
bull record record to use
5.3.22 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj, name)
bull recordsObj list of input elements. Each element of the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName, expression, errorAction) method for adding a new field to the record under construction
bull fieldName name of the field
bull expression field definition expression, e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run if the expression cannot be evaluated correctly. The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error
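The way add() and exec() work together, including the per-field error action, can be sketched with a stand-in. RecordBuilder is a hypothetical class written for this example: field definitions are plain JavaScript functions rather than ITPilot expressions, and leaving a failed field as null under ON_ERROR_IGNORE is an assumption.

```javascript
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for Record_Constructor. Field definitions are
// plain JavaScript functions here, not ITPilot expressions.
function RecordBuilder(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fields = [];
}
RecordBuilder.prototype.add = function (fieldName, derive, errorAction) {
    this.fields.push({ fieldName: fieldName, derive: derive, errorAction: errorAction });
};
RecordBuilder.prototype.exec = function () {
    var record = {};
    var inputs = this.recordsObj;
    this.fields.forEach(function (field) {
        try {
            record[field.fieldName] = field.derive(inputs);
        } catch (e) {
            if (field.errorAction === ON_ERROR_RAISE) {
                throw e;   // stop, reporting the source of the error
            }
            // ON_ERROR_IGNORE: continue. Assumption: the field stays null.
            record[field.fieldName] = null;
        }
    });
    return record;
};

// Two hypothetical input records; index 0 plays the role of "$0".
var builder = new RecordBuilder(
    [{ PARAM1: "book", PRICE: "12.50" }, { CURRENCY: "EUR" }],
    "RESULT"
);
builder.add("ITEM", function (r) { return r[0].PARAM1; }, ON_ERROR_RAISE);
builder.add("AMOUNT", function (r) {
    return parseFloat(r[0].PRICE) + " " + r[1].CURRENCY;
}, ON_ERROR_RAISE);
// There is no third input record, so this definition fails and is ignored.
builder.add("SELLER", function (r) { return r[2].NAME; }, ON_ERROR_IGNORE);

var built = builder.exec();
console.log(JSON.stringify(built));
```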
5.3.23 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence built from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used so far is reused or a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel
bull inputPage optional; this allows a starting page to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5.3.24 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5.3.25 Repeat
bull Description This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
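The do… while structure of Figure 7 can be tried with the same kind of simplified stand-in for the Condition object. As before, Condition wraps a JavaScript function only so that the sketch runs on its own; this is an assumption and not the ITPilot object, and onError is omitted.

```javascript
// Stand-in for the Condition object: it wraps a JavaScript function,
// whereas ITPilot evaluates a condition expression.
function Condition(test) {
    this.test = test;
}
Condition.prototype.exec = function (inputs) {
    return Boolean(this.test(inputs));
};

// The body runs first and the condition is checked afterwards.
var attempts = 0;
var repeat = new Condition(function () { return attempts < 3; });
do {
    attempts++;
} while (repeat.exec([]));

// Unlike a while loop, the body runs once even if the condition
// is false from the start.
var runs = 0;
var alwaysFalse = new Condition(function () { return false; });
do {
    runs++;
} while (alwaysFalse.exec([]));

console.log(attempts, runs);   // 3 1
```

The practical difference from the Loop component is therefore that the operations of a Repeat are executed at least once.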
5.3.26 Script
bull Description This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the wrapper generation graphic interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow
5.3.27 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence, sequenceType, reusableConnection, inputPage)
bull sequence NSEQL browsing program (see [NSEQL])
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull inputPage optional parameter; this indicates the starting page. If it is omitted, the NSEQL program is run directly
o exec(inputValues, inputPage) this runs the Sequence component, returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) sets the number of retries in the event of failure
bull count number of retries
o setRetryDelay(mseconds) sets the waiting time between retries
bull mseconds waiting time between retries, in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method
bull flag "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
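The interplay between a retry count and a retry delay, as configured by setRetries and setRetryDelay, can be sketched in plain JavaScript. The runWithRetries helper is hypothetical and ITPilot applies these settings internally; the sketch also assumes that the retry count means extra attempts after the first failure, which is not confirmed here. The sleep function is injected so that the example stays synchronous.

```javascript
// Hypothetical helper; ITPilot applies its retry settings internally.
// Assumption: "retries" counts extra attempts after the first failure.
function runWithRetries(action, retries, delayMs, sleep) {
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (err) {
            lastError = err;
            if (attempt < retries) {
                sleep(delayMs); // wait before the next attempt
            }
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw lastError;
}

// Usage: the action fails twice, then succeeds on the third attempt.
var waits = [];
var retryResult = runWithRetries(function (attempt) {
    if (attempt < 2) {
        throw new Error("navigation failed");
    }
    return "page after attempt " + (attempt + 1);
}, 3, 3000, function (ms) {
    waits.push(ms); // record the delay instead of really sleeping
});
```

With three retries allowed, the third attempt succeeds after two recorded waits of 3000 milliseconds each.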
5.3.28 Store File
• Object: StoreFile
• Description: Stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: Represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches an execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
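The queuing behaviour described for setMaxConcurrentThreads can be modelled with a small bounded pool. The BoundedPool object below is a hypothetical, single-threaded model written only to show when a request runs and when it waits; it is not the ITPilot Thread object and does not start real threads.

```javascript
// Hypothetical model of a bounded worker pool; NOT the ITPilot Thread object.
function BoundedPool(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = []; // names of executions currently in progress
    this.queued = [];  // requests waiting for a free slot
}

// Start the execution if a slot is free, otherwise queue it.
BoundedPool.prototype.execute = function (name) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(name);
    } else {
        this.queued.push(name);
    }
};

// Mark one execution as finished and promote the oldest queued request.
BoundedPool.prototype.finish = function (name) {
    var position = this.running.indexOf(name);
    if (position === -1) {
        return;
    }
    this.running.splice(position, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var pool = new BoundedPool(2);
["record1", "record2", "record3"].forEach(function (name) {
    pool.execute(name);
});
var runningBefore = pool.running.slice(); // record1 and record2 run, record3 waits
pool.finish("record1");
var runningAfter = pool.running.slice();  // record3 takes the freed slot
```

A wait() call in this model would simply correspond to blocking until both the running and queued lists are empty.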
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is required only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
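As an illustration, a custom component file could look like the sketch below. Everything specific in it is hypothetical: the component name "upper", the assumption that the input arrives as a plain string, and the local definition of STRING_TYPE, which simply mirrors the value 13 from the list of output types (in a real wrapper it would presumably be supplied by the ITPilot runtime). The input structure function is left out because its body is not specified in this section, and no output structure function is needed since the output is neither a record nor a list.

```javascript
// Assumption: mirrors STRING_TYPE = 13 from the list of output types.
var STRING_TYPE = 13;

// Main function of the hypothetical "upper" component.
// Assumption: upper_input is a plain string.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Declares the output type of the component.
function upper_getOutputType() {
    return STRING_TYPE;
}
```

Following the naming rule of this section, such a file would be stored under the name of the component, here upper.js.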
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8. Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.5 Condition
• Object: Condition
• Description: Allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a string of characters; for example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements to which the condition is applied.
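The positional placeholders $0, $1, ... can be illustrated with a small evaluator written in plain JavaScript. The evalCondition function is hypothetical and is not the ITPilot expression engine: it understands only JavaScript operators and does not provide the functions of Appendix A of [GENER]. It shows how each $n is bound to the element at position n of the list passed at execution time.

```javascript
// Hypothetical evaluator; NOT the ITPilot expression engine.
// It only supports JavaScript operators and $n positional placeholders.
function evalCondition(expr, elements) {
    var source = expr.replace(/\$(\d+)/g, function (match, index) {
        var position = Number(index);
        if (position >= elements.length) {
            throw new Error("No element for placeholder " + match);
        }
        // Embed the element as a JavaScript literal.
        return JSON.stringify(elements[position]);
    });
    return Boolean(new Function("return (" + source + ");")());
}

var firstCheck = evalCondition("($0 <= $1)", [3, 5]);  // evaluates 3 <= 5
var secondCheck = evalCondition("($0 <= $1)", [7, 5]); // evaluates 7 <= 5
```

Building code from strings is acceptable here only because the sketch controls both the expression and the elements; it should not be applied to untrusted input.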
5.3.6 Create List
• Object: Create_List
• Description: Creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: Creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: The Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical and "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component and returns a character string that represents the HTML content of the pages, marking the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are ignored when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are taken into account.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: if "true", the deleted content is shown. If "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. HTML tokens of the page that match one of these regular expressions are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
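The role of the prefix and suffix labels and of the token separator can be sketched with a small token diff in plain JavaScript. The markDifferences function is hypothetical: it uses a textbook longest-common-subsequence comparison, which is an assumption made for illustration and not necessarily the algorithm used by the Diff component. The field names of the labels object are also invented for this sketch.

```javascript
// Hypothetical token diff; NOT the algorithm used by the ITPilot Diff component.
function markDifferences(baseCode, finalCode, labels) {
    var base = baseCode.split(labels.tokenSeparator);
    var target = finalCode.split(labels.tokenSeparator);
    var n = base.length, m = target.length, i, j;

    // lcs[i][j] = length of the longest common subsequence of base[i..] and target[j..].
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs.push([]);
        for (j = 0; j <= m; j++) {
            lcs[i].push(0);
        }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = base[i] === target[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    // Walk both token lists, wrapping deleted and added tokens in their labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < n || j < m) {
        if (i < n && j < m && base[i] === target[j]) {
            out.push(base[i]);
            i++;
            j++;
        } else if (j === m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
            out.push(labels.deletionPrefix + base[i] + labels.deletionSuffix);
            i++;
        } else {
            out.push(labels.additionPrefix + target[j] + labels.additionSuffix);
            j++;
        }
    }
    return out.join(labels.tokenSeparator);
}

var diffLabels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]",
    tokenSeparator: " "
};
var marked = markDifferences("price 10 EUR", "price 12 EUR", diffLabels);
```

In this sketch the deleted token "10" and the added token "12" are each wrapped in their own labels, while unchanged tokens pass through untouched.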
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: Allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters; for example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value that results from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: Extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification given in the constructor. It returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern-merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): sets the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: Obtains the contents of the URL or page passed as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component and returns the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, on which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: Filters a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
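The effect described in the note can be modelled in plain JavaScript. The filterIgnoringErrors function is a hypothetical model of filtering under an "ignore errors" policy; it does not use the ITPilot Filter object or its error constants, and the sample records are invented for this sketch.

```javascript
// Hypothetical model of filtering with an "ignore errors" policy.
function filterIgnoringErrors(records, predicate) {
    var kept = [];
    records.forEach(function (record) {
        try {
            if (predicate(record)) {
                kept.push(record);
            }
        } catch (err) {
            // The record that caused the error is left out; filtering continues.
        }
    });
    return kept;
}

var priced = [
    { title: "A", price: 10 },
    { title: "B", price: null },
    { title: "C", price: 30 },
    { title: "D", price: 5 }
];
var expensive = filterIgnoringErrors(priced, function (record) {
    if (record.price === null) {
        throw new Error("missing price");
    }
    return record.price >= 10;
});
```

Record B raises an error and is skipped, record D fails the condition, and the remaining records are returned in their original order.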
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: Generates an execution loop over a specific form, using predetermined values for each of the form fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be invoked iteratively.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position of each valuesArray element in the event of repeated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, left unmarked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position of each valuesArray element in the event of repeated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one that appears in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, left unmarked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position of each values element in the event of repeated values.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When it is run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When it is run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be entered in the field.
  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running one iteration of the component.
    • inputPage: optional parameter that sets a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
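The hasNext/next iteration over combinations of field values can be sketched in plain JavaScript. The makeFormCombinations function, the field names and the sample values are hypothetical; the sketch assumes that one iteration is generated per combination of the configured values, and models a checkbox configured with CLICKED_AND_NON_CLICKED_ELEMENT as a field with the two values true and false. It is not the ITPilot Form_Iterator object and submits no forms.

```javascript
// Hypothetical combination iterator; NOT the ITPilot Form_Iterator object.
function makeFormCombinations(fields) {
    var total = 1;
    fields.forEach(function (field) {
        total *= field.values.length;
    });
    var index = 0;
    return {
        hasNext: function () {
            return index < total;
        },
        next: function () {
            if (index >= total) {
                throw new Error("No more combinations");
            }
            // Decode the running index as a mixed-radix number, last field fastest.
            var remainder = index;
            var combination = {};
            for (var k = fields.length - 1; k >= 0; k--) {
                var values = fields[k].values;
                combination[fields[k].name] = values[remainder % values.length];
                remainder = Math.floor(remainder / values.length);
            }
            index += 1;
            return combination;
        }
    };
}

var combos = makeFormCombinations([
    { name: "query", values: ["java", "sql"] },
    // Marked and unmarked, as with CLICKED_AND_NON_CLICKED_ELEMENT.
    { name: "onlyTitles", values: [true, false] }
]);
var submitted = [];
while (combos.hasNext()) {
    submitted.push(combos.next());
}
```

Two text values combined with a marked and an unmarked checkbox yield four iterations, which is why a cap on the number of iterations can be useful when many fields are configured.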
5.3.15 Get Page
• Object: Get_Page
• Description: Obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL that the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL that the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL that the page comes from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may fail to run.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; returns a record that represents the wrapper initialization parameters.
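The effect of the three obl values can be pictured with a small stand-in. The sketch below does not use the real Init object: InitSketch, setField and resolve are hypothetical names invented for this illustration, and two behaviours are assumptions rather than documented facts (a FIXED field ignores any value supplied by the caller, and a missing OBLIGATORY value is reported by throwing an Error).

```javascript
// Stand-in for illustration only: NOT the ITPilot Init component.
// The mode names mirror the values documented for the obl parameter.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function InitSketch() {
    this.fields = [];
}

// Hypothetical counterpart of setText/setInt/...: registers one field.
InitSketch.prototype.setField = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
    return this;
};

// Hypothetical: builds the record of query parameters from the caller's values.
InitSketch.prototype.resolve = function (queryValues) {
    var record = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(queryValues, f.name);
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;      // constant value (assumed to win over input)
        } else if (supplied) {
            record[f.name] = queryValues[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;              // optional and not supplied
        }
    }
    return record;
};

var init = new InitSketch()
    .setField("KEYWORD", OBLIGATORY)
    .setField("MAX_PRICE", OPTIONAL)
    .setField("COUNTRY", FIXED, "ES");
var record = init.resolve({ KEYWORD: "laptop", COUNTRY: "US" });
```

With these assumptions, record holds the supplied KEYWORD, a null MAX_PRICE and the constant COUNTRY, while a query without KEYWORD is rejected.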
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
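Without parallel execution, the same hasNext()/next() pair drives a plain sequential loop. The following minimal sketch runs outside ITPilot; ListIteratorSketch is a hypothetical stand-in written for this example, not the Iterator component itself, and the records are plain objects with an invented TITLE field.

```javascript
// Stand-in for illustration only: NOT the ITPilot Iterator component.
function ListIteratorSketch(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one element has not been returned yet.
ListIteratorSketch.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the current element and advances to the following one.
ListIteratorSketch.prototype.next = function () {
    return this.list[this.position++];
};

var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var iterator = new ListIteratorSketch(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);   // per-record processing goes here
}
```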
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, ITPilot tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
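The behaviour configured with setRetries and setRetryDelay can be pictured with a generic retry helper. This is only a sketch of the general technique, not ITPilot’s implementation: runWithRetries is a hypothetical name, the interpretation of the retry count as “additional attempts after the first failure” is an assumption, and the delay is passed to an injected sleep function so that the example stays synchronous.

```javascript
// Illustration only: a generic retry helper, not ITPilot's implementation.
// sleep is injected so the sketch needs no real waiting.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e;                 // retries exhausted: propagate the last error
            }
            attempt++;
            sleep(retryDelayMs);         // wait before the next attempt
        }
    }
}

var waits = [];
var result = runWithRetries(
    function (attempt) {
        // Simulated page access that fails on the first two attempts.
        if (attempt < 2) { throw new Error("page not available"); }
        return "page loaded on attempt " + attempt;
    },
    3,                                   // value that would be passed to setRetries
    500,                                 // value that would be passed to setRetryDelay
    function (ms) { waits.push(ms); }    // records the delay instead of sleeping
);
```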
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): subsequently adds a component input record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences for accessing other pages to be created from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a home page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
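The practical difference between the while structure of the Loop component (section 5.3.19) and the do... while structure of Repeat is when the condition is first evaluated. The plain JavaScript sketch below runs outside ITPilot; alwaysFalse is a hypothetical stand-in for the call to the Condition object’s exec function.

```javascript
// Stand-in for a Condition whose evaluation is false from the start.
function alwaysFalse() {
    return false;
}

// Loop-style structure: the condition is checked before the body runs.
var whileRuns = 0;
while (alwaysFalse()) {
    whileRuns++;
}

// Repeat-style structure: the body runs once, then the condition is checked.
var doWhileRuns = 0;
do {
    doWhileRuns++;
} while (alwaysFalse());
```

With a condition that is false from the outset, the while body never runs, whereas the do... while body runs exactly once.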
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: stores in a file the contents received as the input parameter.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
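A component file following this layout might look like the sketch below. Several points are assumptions made only so that the example runs on its own: the component name “upper” is invented, the input is treated as a plain string (the guide does not describe the shape of the input element here), STRING_TYPE is defined locally with the value from the list above although the ITPilot runtime presumably provides it, and the body of upper_getInputStructure is left empty because this section does not show how a schema is declared.

```javascript
// Assumption: defined here only so the sketch runs outside ITPilot.
// The value matches the STRING_TYPE entry of the output type list.
var STRING_TYPE = 13;

// Main function, named <component>_main.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Input schema: intentionally left empty in this sketch.
function upper_getInputStructure() {
}

// Output type: a string, so no upper_getOutputStructure function is needed
// (that function is only required for RECORD_TYPE or LIST_TYPE).
function upper_getOutputType() {
    return STRING_TYPE;
}
```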
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward: the following VQL statement has to be written:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
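The escaping step can be automated when the statement is assembled by a program. The sketch below rests on two assumptions that this guide’s text does not spell out: that the delimiter is the double quote and that the escape character is the backslash; see [VQL] for the exact syntax. The function name buildCreateWrapperStatement and the wrapper name “sample” are hypothetical.

```javascript
// Assumptions (not confirmed by the text above):
//   - the JavaScript code is delimited by double quotes in the VQL statement
//   - inner double quotes are escaped with a backslash
function buildCreateWrapperStatement(name, jscode) {
    var escaped = jscode.replace(/"/g, '\\"');   // " becomes \"
    return 'CREATE WRAPPER ITP ' + name + ' "' + escaped + '"';
}

var statement = buildCreateWrapperStatement("sample", 'var s = "x";');
```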
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.6 Create List
• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, red background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component and returns a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, red background HTML tag).
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 29
o setDeletionSuffixLabel(deletionSuffixLabel) modifies the deleted data ending tag
bull deletionSuffixLabel prefix to use when generating the result page for the deleted content (by default red background HTML endtag)
o setNullWhenEquals(nullWhenEquals) if the result page is identical to any of the two input pages the component will return ldquonullrdquo instead of the page itself
bull nullWhenEquals ldquotruerdquo implies that ldquonullrdquo will be returned when both pages are equal ldquofalserdquo means that the result page will be returned
o setIgnoreTagAttributes(simplifyTags) the component will not take into account the HTML tag attributes when comparing both pages
bull simplifyTags ldquotruerdquo means that the HTML tag attributes will be ignored With ldquofalserdquo they will not be ignored
o setCaseInsensitive (toLowerCase) used to establish whether the capitalization will be taken into account when comparing the pages
bull toLowerCase ldquotruerdquo transforms all HTML content to lower case ldquofalserdquo keeps the content as is
o setShowRemovedContent(mergedDeletions) whether the delete content is shown in the result page or not
bull mergedDeletions ldquotruerdquo the delete content will be shown If the value is ldquofalserdquo the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account
o addTokenReplacement(replacement) allows the addition of a regular expression to a list These regular expressions can be applied on HTML tokens of the source pages before comparing them
bull replacement Perl [PERL] regular expression
o addIgnoredToken(regexp) allows the addition of a regular expression to the list These regular expressions can be applied on HTML tokens of the page Those that match the regular expression will be discarded before starting the comparison
bull regexp Perl [PERL] regular expression
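The preprocessing options above (setCaseInsensitive, addTokenReplacement and addIgnoredToken) can be pictured as a normalization pass over each page's tokens before the comparison. The following is only a sketch in plain JavaScript, not ITPilot's implementation: the function normalizeTokens, its option names and the { pattern, substitute } shape of a replacement are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of the ITPilot API: normalizes a token list
// in the way the Diff options above are described as doing before comparison.
function normalizeTokens(tokens, options) {
  var opts = options || {};
  var replacements = opts.replacements || []; // assumed shape: { pattern: RegExp, substitute: string }
  var ignored = opts.ignored || [];           // RegExp list; matching tokens are discarded
  var result = [];
  for (var i = 0; i < tokens.length; i++) {
    // Lower-casing stands in for setCaseInsensitive(true).
    var token = opts.caseInsensitive ? tokens[i].toLowerCase() : tokens[i];
    // Apply every replacement, as addTokenReplacement is described.
    for (var r = 0; r < replacements.length; r++) {
      token = token.replace(replacements[r].pattern, replacements[r].substitute);
    }
    // Drop tokens that match an ignored pattern, as addIgnoredToken is described.
    var discard = ignored.some(function (re) { return re.test(token); });
    if (!discard) {
      result.push(token);
    }
  }
  return result;
}

// Two pages that differ only in a session id, capitalization and an HTML comment.
var options = {
  caseInsensitive: true,
  replacements: [{ pattern: /sid=\d+/, substitute: "sid=X" }],
  ignored: [/^<!--/]
};
var base = normalizeTokens(["<A href='p?sid=123'>", "Home", "<!-- v1 -->"], options);
var target = normalizeTokens(["<a href='p?sid=987'>", "home"], options);
var identical = JSON.stringify(base) === JSON.stringify(target);
console.log(identical);
```

After normalization the two token lists are equal, so a comparison built on top of them would report the pages as identical even though the raw HTML differs.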
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
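Figure 4 configures three retries with a delay of 3000 milliseconds between them. As a rough illustration of what such a retry policy means, the sketch below implements retry-with-delay in plain JavaScript. It is not ITPilot code: runWithRetries and the injected wait callback are hypothetical, and it assumes that the retry count refers to the extra attempts made after the first failure.

```javascript
// Hypothetical helper, not part of the ITPilot API.
// Assumption: "retries" counts the extra attempts made after the first failure.
function runWithRetries(action, retries, retryDelayMs, wait) {
  var lastError = null;
  for (var attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        wait(retryDelayMs); // pause before the next attempt
      }
    }
  }
  throw lastError; // every attempt failed: surface the last error
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
var recordWait = function (ms) { waits.push(ms); }; // stand-in for a real sleep
var output = runWithRetries(function (attempt) {
  if (attempt < 2) {
    throw new Error("TIMEOUT_ERROR");
  }
  return "page loaded on attempt " + attempt;
}, 3, 3000, recordWait);
console.log(output, waits);
```

The wait function is injected so that the sketch stays synchronous and easy to test; a real implementation would block or schedule the next attempt instead.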
5.3.10 Expression
• Object: Expression
• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters. For example, in a Condition, MyCondition = new CONDITION("($0 <= $1)") indicates that, in the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
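The positional placeholders $0 and $1 refer to the first and second elements of the list passed to exec. The sketch below illustrates that binding in plain JavaScript. It is not ITPilot's expression engine: evaluateComparison is a hypothetical function that handles only a single comparison between two placeholders.

```javascript
// Hypothetical evaluator, not ITPilot's expression engine: it only handles
// a single comparison between two positional placeholders such as "($0 <= $1)".
function evaluateComparison(expression, inputs) {
  var match = /^\(?\s*\$(\d+)\s*(<=|>=|==|!=|<|>)\s*\$(\d+)\s*\)?$/.exec(expression);
  if (!match) {
    throw new Error("Unsupported expression: " + expression);
  }
  // $n is bound to the n-th element of the input list (zero-based).
  var left = inputs[Number(match[1])];
  var right = inputs[Number(match[3])];
  switch (match[2]) {
    case "<=": return left <= right;
    case ">=": return left >= right;
    case "==": return left === right;
    case "!=": return left !== right;
    case "<": return left < right;
    default: return left > right;
  }
}

var lowFirst = evaluateComparison("($0 <= $1)", [3, 5]);
var highFirst = evaluateComparison("($0 <= $1)", [7, 5]);
console.log(lowFirst, highFirst); // true false
```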
5.3.11 Extractor
• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. This is “true” by default.
  o setI18n(i18n): updates the internationalization of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): determines the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further data extraction operations will be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
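The fallback order described for syncWithPost, setBackSequence and setBackPages can be summarized as a small decision function. The sketch below is an illustration in plain JavaScript, not ITPilot code: chooseBackStrategy and the shape of its config argument are hypothetical, and the default of one page for the Back() command is an assumption.

```javascript
// Hypothetical decision helper, not part of the ITPilot API. It mirrors the
// fallback order described for syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // POST synchronization: re-send the original POST parameters.
    return { kind: "POST", detail: "re-send the POST parameters to the page URL" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence was configured.
    return { kind: "BACK_SEQUENCE", detail: config.backSequence };
  }
  // Last resort: the NSEQL Back() command.
  var pages = config.backPages || 1; // assumption: go back one page if unset
  return { kind: "BACK_COMMAND", detail: "Back() x " + pages };
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(byPost.kind, bySequence.kind, byBack.kind);
```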
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
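The behavior described in the note can be sketched as follows. This is a plain JavaScript stand-in, not the ITPilot Filter object: filterRecords is hypothetical, the two constants are local stand-ins for the ITPilot ones, and a JavaScript predicate replaces the filter expression.

```javascript
// Hypothetical stand-in for the Filter behavior, not the ITPilot implementation.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // local stand-ins for the ITPilot constants
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, predicate, onError) {
  var kept = [];
  inputRecords.forEach(function (record) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (error) {
      if (onError !== ON_ERROR_IGNORE) {
        throw error; // ON_ERROR_RAISE: abort the whole filter
      }
      // ON_ERROR_IGNORE: drop only the record that caused the error
    }
  });
  return kept;
}

// The second record has no price, so evaluating the predicate on it throws.
var records = [{ price: "10" }, { price: null }, { price: "30" }];
var hasPriceAtLeast10 = function (record) {
  return record.price.length > 0 && Number(record.price) >= 10;
};
var filtered = filterRecords(records, hasPriceAtLeast10, ON_ERROR_IGNORE);
console.log(filtered.length); // 2
```

With ON_ERROR_RAISE the same call would stop at the second record and propagate the error instead of returning a partial list.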
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop on a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: with “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. With the parameter set to “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
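One way to picture how the values configured with input, click and setMaxIterations turn into iterations is as a capped list of field-value combinations. The sketch below is plain JavaScript and not the ITPilot implementation: expandState and buildIterations are hypothetical, the constants are local stand-ins, and the assumption that field values are combined as a Cartesian product is not confirmed by this guide.

```javascript
// Hypothetical sketch, not the ITPilot implementation.
// Local stand-ins for the ITPilot constants:
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expands a checkbox state into the concrete states it stands for.
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
    : [state];
}

// Assumption: one iteration per element of the Cartesian product of the
// options of each field, capped at maxIterations.
function buildIterations(fieldOptions, maxIterations) {
  var combos = [{}];
  Object.keys(fieldOptions).forEach(function (field) {
    var next = [];
    combos.forEach(function (combo) {
      fieldOptions[field].forEach(function (value) {
        var extended = Object.assign({}, combo);
        extended[field] = value;
        next.push(extended);
      });
    });
    combos = next;
  });
  return combos.slice(0, maxIterations);
}

var iterations = buildIterations({
  city: ["Madrid", "Paris"],                               // values as given to input(...)
  newsletter: expandState(CLICKED_AND_NON_CLICKED_ELEMENT) // state as given to click(...)
}, 3);
console.log(iterations.length); // 3
```

Two text values and one checkbox in both states give four combinations; the cap of three drops the last one.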
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but once specified it becomes compulsory; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
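The three values of the obl parameter can be read as a small validation rule applied when the initialization record is built. The sketch below illustrates one possible reading in plain JavaScript; it is not the ITPilot Init object. buildInitRecord, the field descriptors and the field names are hypothetical, and two behaviors are assumptions: that a FIXED value overrides whatever the query supplies, and that an unsupplied OPTIONAL field is left as null.

```javascript
// Hypothetical sketch of the OPTIONAL / OBLIGATORY / FIXED semantics;
// not the ITPilot Init object. The constants are local stand-ins.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function buildInitRecord(fields, queryParameters) {
  var record = {};
  fields.forEach(function (field) {
    var supplied = Object.prototype.hasOwnProperty.call(queryParameters, field.name);
    if (field.obl === FIXED) {
      record[field.name] = field.fixedValue; // assumption: constant, whatever the query says
    } else if (supplied) {
      record[field.name] = queryParameters[field.name];
    } else if (field.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    } else {
      record[field.name] = null; // assumption: OPTIONAL and not supplied
    }
  });
  return record;
}

var fields = [
  { name: "KEYWORD", obl: OBLIGATORY },
  { name: "MAX_PRICE", obl: OPTIONAL },
  { name: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];
var initRecord = buildInitRecord(fields, { KEYWORD: "laptop", LANGUAGE: "fr" });
console.log(initRecord);
```

Calling buildInitRecord without KEYWORD throws, which corresponds to a query that omits an obligatory parameter.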
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
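To try the hasNext()/next() protocol outside ITPilot, a minimal stand-in can be written in plain JavaScript. ListIterator below is hypothetical and only mimics the two functions documented for the Iterator object; it runs sequentially and does not use the Thread object.

```javascript
// Minimal stand-in for the hasNext()/next() protocol; hypothetical, not the
// ITPilot Iterator object.
function ListIterator(list) {
  this.list = list;
  this.position = 0;
}
ListIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error("No more records");
  }
  return this.list[this.position++]; // records are returned in list order
};

// Sequential traversal of a record list.
var iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(",")); // a,b,c
```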
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop with a Condition object that determines whether the loop continues. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
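The pattern in Figure 6 depends on the condition being re-evaluated before every pass. The sketch below reproduces that control flow in plain JavaScript with a hypothetical StubCondition whose exec function takes the input list and returns a Boolean; it is a stand-in for experimentation, not the ITPilot Condition object, and it takes a JavaScript predicate instead of a condition expression.

```javascript
// Hypothetical stand-in for a Condition object: exec(inputs) applies a
// JavaScript predicate to the input list and returns a Boolean.
function StubCondition(predicate) {
  this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
  return Boolean(this.predicate.apply(null, inputs));
};

// WHILE... DO: the condition is checked before each pass, so the body
// runs zero or more times.
var pagesVisited = 0;
var loop = new StubCondition(function (count) { return count < 3; });
while (loop.exec([pagesVisited])) {
  pagesVisited++; // loop operations go here
}
console.log(pagesVisited); // 3
```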
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 49
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session’s information.
    • inputPage: the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
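The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarized in a small decision function. This is only a sketch of the rules described above, not ITPilot code: chooseRecovery and its option names are hypothetical, and the default of one back page is an assumption.

```javascript
// Illustrative only: mirrors the precedence described for syncWithPost,
// setBackSequence and setBackPages. None of these names belong to ITPilot.
//   options.syncWithPost : flag passed to syncWithPost(flag)
//   options.backSequence : NSEQL back program set with setBackSequence, if any
//   options.backPages    : value set with setBackPages (default 1 is assumed)
function chooseRecovery(options) {
  if (options.syncWithPost) {
    // Re-issue the POST with which the page was originally reached.
    return { kind: "post" };
  }
  if (options.backSequence) {
    // An explicit back sequence takes precedence over Back().
    return { kind: "backSequence", sequence: options.backSequence };
  }
  // Fall back to the NSEQL Back() command.
  var pages = options.backPages === undefined ? 1 : options.backPages;
  return { kind: "back", pages: pages };
}
```

For instance, chooseRecovery({ syncWithPost: false, backPages: 2 }) selects the Back() strategy with two pages, because no back sequence was supplied.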
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list given in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
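To make the “$0.PARAM1” convention and the two error actions more concrete, the following self-contained sketch resolves such references against plain JavaScript objects. It is not how ITPilot evaluates expressions: resolveReference and buildRecord are hypothetical helpers, the constants are stand-in values, the “$<index>.<FIELD>” syntax with a dot is assumed from the example above, and leaving out a field under ON_ERROR_IGNORE is an assumption of this sketch.

```javascript
// Stand-in values for illustration; these are not ITPilot's real constants.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

// Resolve a reference of the form "$<index>.<FIELD>" against a list of
// input records, modelled here as plain JavaScript objects.
function resolveReference(expression, records) {
  var match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (match === null) {
    throw new Error("Unsupported expression: " + expression);
  }
  var record = records[Number(match[1])];
  if (record === undefined || record === null || !(match[2] in record)) {
    throw new Error("Cannot resolve: " + expression);
  }
  return record[match[2]];
}

// Build an output record from a list of { name, expression, errorAction }.
function buildRecord(records, fields) {
  var output = {};
  for (var i = 0; i < fields.length; i++) {
    try {
      output[fields[i].name] = resolveReference(fields[i].expression, records);
    } catch (e) {
      if (fields[i].errorAction === ON_ERROR_RAISE) {
        throw e;
      }
      // ON_ERROR_IGNORE: this sketch leaves the field out and carries on.
    }
  }
  return output;
}
```

With two input records, a field defined as “$0.PARAM1” takes its value from the first record, while a field whose reference cannot be resolved either aborts the construction (ON_ERROR_RAISE) or is skipped (ON_ERROR_IGNORE).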
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a start page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7. Using the Repeat function
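The practical difference between the Loop component (section 5.3.19) and the Repeat component is when the condition is checked. The sketch below reproduces both patterns outside ITPilot; makeCondition is a hypothetical stand-in that only offers an exec method returning a boolean, and is not the ITPilot Condition object.

```javascript
// Hypothetical stand-in for a Condition object: exec() returns true while
// the shared counter is below the limit. Not part of the ITPilot API.
function makeCondition(state, limit) {
  return {
    exec: function (inputs) {
      return state.count < limit;
    }
  };
}

// Loop pattern (while): the condition is checked before every pass,
// so the body may not run at all.
function runLoop(limit) {
  var state = { count: 0 };
  var passes = 0;
  var loop = makeCondition(state, limit);
  while (loop.exec([])) {
    state.count++;
    passes++;
  }
  return passes;
}

// Repeat pattern (do ... while): the body runs once before the first check.
function runRepeat(limit) {
  var state = { count: 0 };
  var passes = 0;
  var repeat = makeCondition(state, limit);
  do {
    state.count++;
    passes++;
  } while (repeat.exec([]));
  return passes;
}
```

With a limit of 0 the while version performs no passes, whereas the do… while version still performs one; for larger limits both perform the same number of passes.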
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
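The interplay of setRetries and setRetryDelay can be pictured with a generic retry helper. This is a sketch under assumptions, not ITPilot's implementation: runWithRetries is a hypothetical function, the retry count is assumed to exclude the first attempt, and the wait function is injected so that the example stays synchronous.

```javascript
// Run `action`; if it throws, try again up to `retries` more times,
// calling `wait(delayMs)` before each retry. Rethrows the last error
// when every attempt fails. Illustrative helper, not an ITPilot function.
function runWithRetries(action, retries, delayMs, wait) {
  var lastError = null;
  for (var attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) {
      wait(delayMs);
    }
    try {
      return action(attempt);
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError;
}
```

With retries set to 3 and a delay of 3000, an action that fails twice and then succeeds is attempted three times, with two waits of 3000 milliseconds between attempts.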
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents given as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches an execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
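The queueing behaviour described for setMaxConcurrentThreads can be modelled with a small scheduler. The sketch below is a simulation only: BoundedRunner and its methods are hypothetical names, tasks are finished by hand instead of running in real threads, and first-in-first-out ordering of queued requests is an assumption.

```javascript
// Illustrative scheduler: at most `max` tasks are "running"; later
// requests wait in a FIFO queue until a running task finishes.
function BoundedRunner(max) {
  this.max = max;
  this.running = [];   // names of tasks currently running
  this.queued = [];    // names of tasks waiting for a free slot
}

// Start the task if a slot is free, otherwise queue it.
BoundedRunner.prototype.execute = function (name) {
  if (this.running.length < this.max) {
    this.running.push(name);
  } else {
    this.queued.push(name);
  }
};

// Mark a running task as finished and promote the oldest queued task.
BoundedRunner.prototype.finish = function (name) {
  var index = this.running.indexOf(name);
  if (index === -1) {
    throw new Error("Task is not running: " + name);
  }
  this.running.splice(index, 1);
  if (this.queued.length > 0) {
    this.running.push(this.queued.shift());
  }
};

// True when nothing is running or queued, which is the point at which
// a wait()-style call would return.
BoundedRunner.prototype.isIdle = function () {
  return this.running.length === 0 && this.queued.length === 0;
};
```

With a maximum of two, a third request stays queued until one of the first two finishes, and the runner only becomes idle once all three have finished.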
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components with the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8. Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.7 Create Persistent Browser
• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that started it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing the pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
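How the prefix and suffix labels and the token separator shape the result can be illustrated with a generic token diff. The sketch below uses a textbook longest-common-subsequence comparison; it is not the algorithm used by the Diff component, diffTokens is a hypothetical function, and the bracket markers in the usage note replace the default HTML tags, whose exact form the guide does not give.

```javascript
// Compare two token lists and join the result with labels.tokenSeparator.
// Tokens only in `base` are wrapped in the deletion labels, tokens only in
// `target` in the addition labels. Illustrative, not ITPilot's algorithm.
function diffTokens(base, target, labels) {
  var n = base.length;
  var m = target.length;
  var i, j;
  // lcs[i][j] = length of the longest common subsequence of
  // base[i..] and target[j..].
  var lcs = [];
  for (i = 0; i <= n; i++) {
    lcs[i] = [];
    for (j = 0; j <= m; j++) {
      lcs[i][j] = 0;
    }
  }
  for (i = n - 1; i >= 0; i--) {
    for (j = m - 1; j >= 0; j--) {
      if (base[i] === target[j]) {
        lcs[i][j] = lcs[i + 1][j + 1] + 1;
      } else {
        lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
      }
    }
  }
  // Walk both lists, emitting common, deleted and added tokens in order.
  var out = [];
  i = 0;
  j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && base[i] === target[j]) {
      out.push(base[i]);
      i++;
      j++;
    } else if (j >= m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
      out.push(labels.deletionPrefix + base[i] + labels.deletionSuffix);
      i++;
    } else {
      out.push(labels.additionPrefix + target[j] + labels.additionSuffix);
      j++;
    }
  }
  return out.join(labels.tokenSeparator);
}
```

Called with the token lists ["<p>", "old", "</p>"] and ["<p>", "new", "</p>"], addition labels "[+" and "+]", deletion labels "[-" and "-]" and a single space as separator, the function marks "old" as deleted and "new" as added while leaving the surrounding tags untouched.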
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence:ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: extracts structured data from an HTML page using a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method; runs the specification given in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): sets the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which more information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter will return the list of filtered elements except for the one that caused the error.
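The effect described in the note can be shown with a plain JavaScript filter in which the condition may fail for some records. This is only a sketch: filterRecords is a hypothetical helper, records are modelled as plain objects, and the condition is an ordinary function instead of an ITPilot expression.

```javascript
// Keep the records for which `predicate` returns true. When the predicate
// throws, either propagate the error or, if `ignoreErrors` is set, drop
// the offending record and continue with the rest. Illustrative only.
function filterRecords(inputRecords, predicate, ignoreErrors) {
  var kept = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) {
        kept.push(inputRecords[i]);
      }
    } catch (e) {
      if (!ignoreErrors) {
        throw e;
      }
      // Ignoring the error: the record that caused it is left out.
    }
  }
  return kept;
}
```

With ignoreErrors set, a record whose field cannot be compared is simply missing from the result; without it, the first such record stops the whole filtering operation.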
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, where predetermined values for each of its fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations, one with the element marked and another with the element unmarked
o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray) this indicates the values selected in a multiple selection field of the chosen form
bull field name of the multiple selection field
bull position position of the field among those with the same name, starting at position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position of each valuesArray element when values are repeated
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false)
bull clickedArray list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this:
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations, one with the element marked and another with the element unmarked
o selectPositions(field, position, positions) this indicates the values selected in a selection field of the chosen form
bull field name of the HTML selection field
bull position position of the field when more than one field has the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field, position, values, positions, equals) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field when several fields on the form have the same name
bull values list of values that must be selected in the field
bull positions list that indicates the position of each values element when values are repeated
bull equals Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them
o click(field, value, state) function that selects an element and runs a "click" event on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element
bull state when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field, position, values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o textarea(field, position, values) this indicates the values added to a text area
bull field name of the HTML text area field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag) this sets whether the component launches its iterations in parallel
bull flag if "true", the iterations will be executed in parallel
o next(inputPage) this returns the page resulting from running one iteration of the component
bull inputPage optional parameter that indicates a new starting page on which the next iteration is run
o hasNext() function that determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not
o close() function that closes the iterator
o syncWithPost(flag) this function indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method
bull flag "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
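The clickedArray constants used by selectMultiplePositions, selectMultipleTexts and click can be read as a rule for expanding one field into several form submissions. The following is a minimal sketch in plain JavaScript, not ITPilot code: the constant values, the expandClickedStates helper and the assumption that every combination of the per-element states is enumerated are hypothetical, chosen only to show how CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations.

```javascript
// Hypothetical stand-ins for the ITPilot constants; the real values are not
// given in this guide.
const CLICKED_ELEMENT = "clicked";
const NON_CLICKED_ELEMENT = "non_clicked";
const CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expands a clickedArray into every combination of marked (true) and
// unmarked (false) states, assuming all combinations are enumerated.
function expandClickedStates(clickedArray) {
  let combinations = [[]];
  for (const state of clickedArray) {
    const options =
      state === CLICKED_ELEMENT ? [true] :
      state === NON_CLICKED_ELEMENT ? [false] :
      [true, false]; // CLICKED_AND_NON_CLICKED_ELEMENT
    const next = [];
    for (const partial of combinations) {
      for (const option of options) {
        next.push(partial.concat([option]));
      }
    }
    combinations = next;
  }
  return combinations;
}

const combos = expandClickedStates([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT,
]);
console.log(combos.length); // 2
console.log(JSON.stringify(combos)); // [[true,true,false],[true,false,false]]
```

Under this assumption, each element flagged with CLICKED_AND_NON_CLICKED_ELEMENT multiplies the number of iterations by two, which is worth keeping in mind together with setMaxIterations.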
5315 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool from a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain) executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing by means of a Sequence object (see section 5327)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET, POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie stored information ("cookies")
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5316 Init
bull Object Init
bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application
bull Functions
o Constructor(input, output)
bull input input record of the component. It is optional and used only when custom components are created (see section 54). In standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process: if it is not specified, the record is generated at runtime (by the exec() function)
o get(name) this returns the value of a field of the record that holds the initialization parameters
bull name name of the record field
o setText(field, obl, fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field, obl, fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field, obl, fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field, obl, fixedValue) this creates a float-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field, obl, fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field, obl, fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field, obl, fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field, obl, fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field, format, obl, fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field. This format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function that updates the i18n configuration of the process
bull i18n type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component; it returns a record representing the wrapper initialization parameters
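The three values of the obl parameter can be illustrated with a small stand-in written in plain JavaScript. InitSketch is hypothetical and only mimics the documented method names (setText, setInt, get, exec); the constant values, the field names, the error raised for a missing OBLIGATORY parameter and the null returned for a missing OPTIONAL one are assumptions of this sketch, not documented ITPilot behaviour.

```javascript
// Hypothetical stand-ins for the ITPilot constants; real values are not given here.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Minimal stand-in for the documented Init interface. It only mimics the
// three "obl" modes; the field type is kept as a label and never checked.
class InitSketch {
  constructor(queryValues) {
    this.queryValues = queryValues || {}; // values supplied by the calling application
    this.fields = [];
    this.record = null;
  }
  setText(field, obl, fixedValue) { return this._add(field, "text", obl, fixedValue); }
  setInt(field, obl, fixedValue) { return this._add(field, "int", obl, fixedValue); }
  _add(field, type, obl, fixedValue) {
    this.fields.push({ field, type, obl: obl || OPTIONAL, fixedValue });
    return this;
  }
  exec() {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant value, the query cannot change it
      } else if (field in this.queryValues) {
        record[field] = this.queryValues[field];
      } else if (obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + field);
      } else {
        record[field] = null; // optional parameter left without a value
      }
    }
    this.record = record;
    return record;
  }
  get(name) { return this.record ? this.record[name] : undefined; }
}

const init = new InitSketch({ TITLE: "denodo" });
init.setText("TITLE", OBLIGATORY);
init.setInt("PAGE_SIZE", FIXED, 20);
init.setText("AUTHOR"); // OPTIONAL by default
const params = init.exec();
console.log(JSON.stringify(params)); // {"TITLE":"denodo","PAGE_SIZE":20,"AUTHOR":null}
```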
5317 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate; "true" is returned if there is at least one more result
o next() this returns the next iteration element. The list is a sorted sequence of records
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5329
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
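The call pattern of Figure 5 can be tried outside ITPilot with two small stand-ins. IteratorSketch and ThreadSketch are hypothetical classes that copy only the documented method names (hasNext, next, execute, wait); ThreadSketch calls the function immediately, so the sketch shows the order of calls and not real concurrency, and returning the collected results from wait() is a convenience of this example rather than documented behaviour.

```javascript
// Stand-in for the documented Iterator interface (hasNext, next) over an array.
class IteratorSketch {
  constructor(list) { this.list = list; this.index = 0; }
  hasNext() { return this.index < this.list.length; }
  next() { return this.list[this.index++]; }
}

// Stand-in for the documented Thread interface (execute, wait). It runs each
// function call synchronously and stores the return value.
class ThreadSketch {
  constructor() { this.results = []; }
  execute(fn, ...args) { this.results.push(fn(...args)); }
  wait() { return this.results; }
}

// Hypothetical per-record function, playing the role of _functionIterator_1.
function processRecord(prefix, record) {
  return prefix + record.TITLE;
}

const iterator = new IteratorSketch([{ TITLE: "a" }, { TITLE: "b" }]);
const thread = new ThreadSketch();
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  thread.execute(processRecord, "item:", recordInstance);
}
const processed = thread.wait();
console.log(processed.join(",")); // item:a,item:b
```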
5318 JDBCExtractor
bull Object JDBCExtractor
bull Description this component sends a query to any source accessible through JDBC and returns a list of records with the results obtained
bull Functions
o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the component's output record list. It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that the pool can manage at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established in advance, ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist
bull query SQL query that returns the results required by the component
o exec(query, baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that the pool can manage at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established in advance, ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5319 Loop
bull Description This allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 535. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}
Figure 6 Using the Loop function
5320 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description this iterates over a series of related pages by means of one or several browsing sequences
bull Functions
o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
bull sequences list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration
bull iterations this indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used so far is reused or a new browser is launched that keeps the session's information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords, inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag) this function indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function
bull flag "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
5321 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this adds the given record to the wrapper result
bull record record to use
5322 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj, name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName, expression, errorAction) method for adding a new field to the record under construction
bull fieldName name of the field
bull expression field definition expression, e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to take when the expression cannot be evaluated correctly. The possible values are:
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error
5323 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This builds a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component
bull Functions
o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used so far is reused or a new browser is launched that keeps the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator runs in parallel
bull inputPage optional parameter that indicates a starting page
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5324 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5325 Repeat
bull Description This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript as a do ... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 535. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    ...
} while (repeat.exec([]));
Figure 7 Using the Repeat function
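The structural difference between the Loop component (section 5319) and the Repeat component is the point at which the condition is evaluated. The sketch below shows this in plain JavaScript; makeCondition is a hypothetical stand-in that evaluates a JavaScript predicate instead of an ITPilot condition expression, and runLoop and runRepeat are illustrative helpers, not ITPilot functions.

```javascript
// Hypothetical stand-in for a Condition object: exec() evaluates a plain
// JavaScript predicate instead of an ITPilot condition expression.
function makeCondition(predicate) {
  return { exec: function () { return predicate(); } };
}

// Loop-style: the condition is checked before each pass.
function runLoop(condition, body) {
  let passes = 0;
  while (condition.exec()) { body(); passes++; }
  return passes;
}

// Repeat-style: the body runs once before the condition is first checked.
function runRepeat(condition, body) {
  let passes = 0;
  do { body(); passes++; } while (condition.exec());
  return passes;
}

const alwaysFalse = makeCondition(() => false);
console.log(runLoop(alwaysFalse, () => {}));   // 0
console.log(runRepeat(alwaysFalse, () => {})); // 1

let counter = 0;
const belowThree = makeCondition(() => counter < 3);
console.log(runLoop(belowThree, () => { counter++; })); // 3
```

A Repeat body therefore always runs at least once, which matters when the body performs a navigation or an extraction with side effects.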
5326 Script
bull Description This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence, sequenceType, reusableConnection, inputPage)
bull sequence NSEQL browsing program (see [NSEQL])
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull inputPage optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly
o exec(inputValues, inputPage) this runs the Sequence component, returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function
bull flag "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content, file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each record obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the specified function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
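The queueing behaviour of execute, wait and setMaxConcurrentThreads can be pictured with a small model. This is only a sketch that runs outside ITPilot: ThreadModel, results and batchSizes are hypothetical names, the model takes a function reference instead of a function name, and it runs the queued work in fixed-size batches rather than on real threads.

```javascript
// Hypothetical stand-in for the Thread object. It models only the queueing
// rules described above and does not create real threads.
function ThreadModel() {
    this.maxConcurrent = Infinity; // assumed default: no limit
    this.pending = [];             // executions requested but not yet run
    this.results = [];             // return values, in request order
    this.batchSizes = [];          // how many executions ran together
}

ThreadModel.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};

// Queues a call to fn with the remaining arguments.
ThreadModel.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ fn: fn, args: args });
};

// Runs everything that was queued, at most maxConcurrent executions per
// batch, and returns only when nothing is left pending.
ThreadModel.prototype.wait = function () {
    while (this.pending.length > 0) {
        var size = Math.min(this.maxConcurrent, this.pending.length);
        var batch = this.pending.splice(0, size);
        this.batchSizes.push(batch.length);
        for (var i = 0; i < batch.length; i++) {
            this.results.push(batch[i].fn.apply(null, batch[i].args));
        }
    }
};

// Usage: process three records with at most two executions at a time.
function processRecord(record, suffix) {
    return record + suffix;
}

var worker = new ThreadModel();
worker.setMaxConcurrentThreads(2);
var extracted = ["r1", "r2", "r3"];
for (var k = 0; k < extracted.length; k++) {
    worker.execute(processRecord, extracted[k], "-done");
}
worker.wait();
```

In this model the third request waits for the first batch of two to finish, which mirrors the statement that later requests are queued until the ongoing executions finish.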
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is required only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
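As an illustration of the naming scheme, the following sketch shows what a very small component file might look like. The component name “shout” is hypothetical, and the sketch assumes that the input arrives as a plain JavaScript string, which this guide does not state. The body of the input structure function is left empty because the calls that describe a schema are not shown in this section.

```javascript
// Type code taken from the list of output types above.
var STRING_TYPE = 13;

// Main function of a hypothetical custom component called "shout".
// Assumption: the input is a plain JavaScript string.
function shout_main(shout_input) {
    var shout_output = null;
    if (shout_input !== null && shout_input !== undefined) {
        shout_output = String(shout_input).toUpperCase();
    }
    return shout_output;
}

// The input schema would be described here; the calls needed for that are
// not covered in this section, so the body is intentionally empty.
function shout_getInputStructure() {
}

// The component returns a string.
function shout_getOutputType() {
    return STRING_TYPE;
}

// shout_getOutputStructure is omitted because the output type is neither
// RECORD_TYPE nor LIST_TYPE.
```

Under these assumptions the file would be saved as shout.js, so that the file name matches the component name.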
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires a VQL statement of the following form:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
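The escaping rule in the note can be applied mechanically before the statement is assembled. The helper below is a minimal sketch, not part of ITPilot: escapeForVql and buildCreateWrapper are hypothetical names, and the sketch assumes that the delimiter is the double quote and that the escape character is a backslash. It does not treat backslashes already present in the code, since this guide does not say how those are handled.

```javascript
// Escapes every double quote so that it does not end the quoted script.
// Assumptions: double-quote delimiter, backslash as the escape character.
function escapeForVql(jscode) {
    return jscode.replace(/"/g, '\\"');
}

// Assembles the statement in the form shown above, without the optional
// MAINTENANCE clause.
function buildCreateWrapper(name, jscode) {
    return 'CREATE WRAPPER ITP ' + name + ' "' + escapeForVql(jscode) + '"';
}

// Usage with a one-line script that contains quotes.
var statement = buildCreateWrapper("my_wrapper", 'var greeting = "hello";');
```

Building the statement through a helper of this kind keeps the wrapper script readable in its own file, with the escaping applied only at the point where the VQL text is produced.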
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.8 Diff
• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used to separate the HTML page elements when the result page is generated, so that each of them can be properly identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component and returns a character string with the HTML content of the pages, marking the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added data.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added data.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted data.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted data.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are ignored when comparing the pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; “false” means that they are taken into account.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: if “true”, the deleted content is shown. If “false”, the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions are applied to the HTML tokens of the page, and the tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
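To make the role of the prefix, suffix and separator labels concrete, the following sketch marks the differences between two token lists. It uses a plain longest-common-subsequence diff, which is a swapped-in textbook technique and not necessarily what the Diff component does internally. The function name diffTokens, the shape of the labels object and the bracket labels are all hypothetical.

```javascript
// Marks added and deleted tokens with the given labels. The comparison is
// a standard longest-common-subsequence (LCS) diff over whole tokens.
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length;
    var m = finalTokens.length;
    var i, j;
    // lcs[i][j] = LCS length of baseTokens[i..] and finalTokens[j..]
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs[i] = [];
        for (j = 0; j <= m; j++) {
            lcs[i][j] = 0;
        }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = baseTokens[i] === finalTokens[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk both lists, emitting common tokens as they are and wrapping the
    // rest in the deletion or addition labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]);
            i++;
            j++;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
            i++;
        } else {
            out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
            j++;
        }
    }
    for (; i < n; i++) {
        out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
    }
    for (; j < m; j++) {
        out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
    }
    return out.join(labels.tokenSeparator);
}

// Usage with bracket labels instead of coloured HTML tags.
var marked = diffTokens(
    ["<p>", "old", "</p>"],
    ["<p>", "new", "</p>"],
    {
        additionPrefix: "[+", additionSuffix: "+]",
        deletionPrefix: "[-", deletionSuffix: "-]",
        tokenSeparator: " "
    }
);
```

Replacing the bracket labels with opening and closing HTML tags gives the coloured result page that the default labels describe.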
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is translated into a Sequence component (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4 Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. It is written as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value that results from evaluating the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records or zero or more record lists that are used as part of the expression.
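The positional placeholders ($0, $1, and so on) refer to the elements of the list passed to exec. The sketch below imitates that binding outside ITPilot. It assumes that the expression body is plain JavaScript, which is a simplification, since real expressions use the functions listed in Appendix A of [GENER]. The names compileExpression and execExpression are hypothetical.

```javascript
// Turns an expression with $n placeholders into a function of one list.
// Assumption: apart from the placeholders, the expression is valid
// JavaScript; ITPilot's own expression functions are not modelled.
function compileExpression(expression) {
    var body = expression.replace(/\$(\d+)/g, "inputs[$1]");
    return new Function("inputs", "return (" + body + ");");
}

// Evaluates the expression against the given list of input values.
function execExpression(expression, inputs) {
    return compileExpression(expression)(inputs);
}

// Usage: $0 is bound to the first input and $1 to the second.
var lessOrEqual = execExpression("($0 <= $1)", [3, 7]);
var weighted = execExpression("$0 + $1 * 2", [1, 4]);
```

Compiling the expression once and calling the resulting function for each input list avoids repeating the placeholder substitution when the same expression is evaluated many times.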
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which the data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that is used to return the data extracted by the specification.
  o exec(): main extractor method. It runs the specification given in the constructor and returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): sets the internationalization configuration used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page passed as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL of the resource to be downloaded (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component and returns the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused.
    • reusingConnection: if “true”, the connection coming from previous components is reused; if “false”, a new browser is launched and the information from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
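The order in which the three page-state recovery options are considered can be summarized as a small decision function. This is only a sketch of the rules described for syncWithPost, setBackSequence and setBackPages: the function name, the option names, the returned labels and the default of one back page are hypothetical.

```javascript
// Decides how the page state would be recovered, following the order
// described above: POST first, then an explicit back sequence, then the
// NSEQL Back() command repeated over a number of pages.
function chooseSyncStrategy(options) {
    if (options.syncWithPost) {
        return { method: "POST" };
    }
    if (options.backSequence) {
        return { method: "BACK_SEQUENCE", sequence: options.backSequence };
    }
    // Assumed default of one page when no count has been configured.
    return { method: "BACK_COMMAND", pages: options.backPages || 1 };
}

// Usage: the back sequence text is a placeholder, not a real NSEQL program.
var withPost = chooseSyncStrategy({ syncWithPost: true, backSequence: "<back program>" });
var withSequence = chooseSyncStrategy({ syncWithPost: false, backSequence: "<back program>" });
var withBack = chooseSyncStrategy({ syncWithPost: false, backPages: 2 });
```

Note that a configured back sequence is ignored in this sketch while POST synchronization is enabled, in line with the description of the flag parameter.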
5.3.13 Filter
• Object: Filter
• Description: filters a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
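The effect described in the note, where a record that causes an error is left out and the rest of the filtered list is still returned, can be sketched as follows. The function filterIgnoringErrors and the sample records are hypothetical, and a JavaScript predicate stands in for the filter expression.

```javascript
// Keeps the records for which the predicate holds. A record whose
// evaluation throws is skipped, and the remaining records are still
// processed, which imitates the ON_ERROR_IGNORE behaviour.
function filterIgnoringErrors(inputRecords, predicate) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                kept.push(inputRecords[i]);
            }
        } catch (e) {
            // Ignored on purpose: the failing record is simply left out.
        }
    }
    return kept;
}

// Usage: the null entry makes the predicate throw and is dropped.
var sampleRecords = [{ price: 5 }, null, { price: 20 }, { price: 8 }];
var cheap = filterIgnoringErrors(sampleRecords, function (record) {
    return record.price < 10;
});
```

With any other error handling setting the failure would be expected to surface to the caller rather than being swallowed as it is here.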
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which predetermined values for each of the form fields are used in each iteration.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is invoked iteratively.
    • parallelIterator: if “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one that appears in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each element of values in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when the function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When it is run on checkboxes, it indicates the value of the selectable element.
    • state: when the function is run on radio buttons, this parameter is not used. When it is run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running one iteration of the component.
    • inputPage: optional parameter that sets a new starting page on which the new iteration is run.
  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result and “false” otherwise.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the page state, a POST message must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
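The hasNext, next and setMaxIterations functions follow the usual iterator pattern. The model below is a sketch that runs outside ITPilot and makes several assumptions that this guide does not confirm: that the iterations are the combinations of the values supplied for each field, that the last field varies fastest, and that each iteration can be represented by a plain object of field values instead of a page. FormIterationModel and its method signatures are hypothetical simplifications.

```javascript
// Hypothetical model of a form iterator. Each field contributes a list of
// candidate values and every combination of them is one iteration.
function FormIterationModel() {
    this.fields = [];               // [{ name: ..., values: [...] }]
    this.maxIterations = Infinity;  // assumed default: no limit
    this.produced = 0;              // iterations returned so far
}

// Simplified counterpart of input(field, position, values).
FormIterationModel.prototype.input = function (field, values) {
    this.fields.push({ name: field, values: values });
};

// Models CLICKED_AND_NON_CLICKED_ELEMENT: one combination with the element
// marked (true) and another with it unmarked (false).
FormIterationModel.prototype.clickBothStates = function (field) {
    this.fields.push({ name: field, values: [true, false] });
};

FormIterationModel.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};

// Number of iterations that will be produced in total.
FormIterationModel.prototype.total = function () {
    var total = this.fields.length > 0 ? 1 : 0;
    for (var i = 0; i < this.fields.length; i++) {
        total *= this.fields[i].values.length;
    }
    return Math.min(total, this.maxIterations);
};

FormIterationModel.prototype.hasNext = function () {
    return this.produced < this.total();
};

// Decodes the iteration number as a mixed-radix value, so that the last
// field added varies fastest.
FormIterationModel.prototype.next = function () {
    var index = this.produced++;
    var combination = {};
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var values = this.fields[i].values;
        combination[this.fields[i].name] = values[index % values.length];
        index = Math.floor(index / values.length);
    }
    return combination;
};

// Usage: two cities and a checkbox in both states give four combinations,
// of which only the first three are produced because of the limit.
var formModel = new FormIterationModel();
formModel.input("city", ["Madrid", "Paris"]);
formModel.clickBothStates("onlyOffers");
formModel.setMaxIterations(3);
var combinations = [];
while (formModel.hasNext()) {
    combinations.push(formModel.next());
}
```

The loop at the end has the same shape as a wrapper loop that calls hasNext() and next() on the real component and then closes the iterator with close().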
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool by means of a previously retrieved identifier.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser identifier.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, so that browsing is carried out later with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL the page comes from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The parameter is optional, but if it is supplied it must be a complete pattern; otherwise the wrapper may not run.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
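The three obl modes can be pictured with a small stand-alone sketch. It does not use the ITPilot runtime: resolveField, the plain-object field descriptors and the field names are hypothetical, and raising an error for a missing OBLIGATORY value is an assumption about the behaviour, not a documented fact.

```javascript
// Hypothetical stand-ins for the three obl modes described above.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Resolve the value of one initialization field for a query.
// `field` is { name, obl, fixedValue }; `queryValues` maps field names to values.
function resolveField(field, queryValues) {
  if (field.obl === FIXED) {
    return field.fixedValue; // constant value, the query input is not consulted
  }
  const hasValue = Object.prototype.hasOwnProperty.call(queryValues, field.name);
  if (field.obl === OBLIGATORY && !hasValue) {
    throw new Error("Missing obligatory parameter: " + field.name);
  }
  return hasValue ? queryValues[field.name] : null; // OPTIONAL may be absent
}

const fields = [
  { name: "MAXPRICE", obl: OPTIONAL },
  { name: "KEYWORD", obl: OBLIGATORY },
  { name: "COUNTRY", obl: FIXED, fixedValue: "ES" },
];

// A query that only supplies the obligatory field.
const resolved = fields.map((f) => resolveField(f, { KEYWORD: "laptop" }));
console.log(resolved);
```

In this sketch the optional field resolves to null, the obligatory field takes the query value, and the fixed field always takes its constant.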
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is an ordered sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
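The hasNext()/next() contract can be tried outside ITPilot with a minimal stand-in. ListIterator and the TITLE field below are hypothetical names used only for illustration; the real Iterator object is provided by the ITPilot runtime.

```javascript
// Hypothetical stand-in that mimics the hasNext()/next() contract described above.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  // True while at least one more record remains.
  hasNext() {
    return this.position < this.list.length;
  }
  // Returns the next record, in list order, and advances the cursor.
  next() {
    return this.list[this.position++];
  }
}

const iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }]);
const titles = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(","));
```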
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
5.3.21 Output
• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper output, to be used as a wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
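As a rough illustration of how positional references and the two error actions interact, the sketch below builds a record from plain objects. It is not the Record_Constructor implementation: resolveReference and buildRecord are hypothetical helpers, the dotted "$index.FIELD" reference form is assumed, and skipping a failing field under ON_ERROR_IGNORE is an assumed reading of the note above.

```javascript
// Stand-in constants; in ITPilot these are provided by the runtime.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolve a reference such as "$0.PARAM1": field PARAM1 of input record 0.
function resolveReference(reference, records) {
  const match = /^\$(\d+)\.(\w+)$/.exec(reference);
  if (!match) throw new Error("Unsupported expression: " + reference);
  const record = records[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot evaluate " + reference);
  }
  return record[match[2]];
}

// Build a record from { fieldName, expression, errorAction } definitions.
function buildRecord(records, definitions) {
  const output = {};
  for (const def of definitions) {
    try {
      output[def.fieldName] = resolveReference(def.expression, records);
    } catch (error) {
      if (def.errorAction === ON_ERROR_RAISE) throw error;
      // ON_ERROR_IGNORE: skip this field and continue (assumed behaviour).
    }
  }
  return output;
}

const built = buildRecord(
  [{ PARAM1: "abc" }, { PRICE: 10 }],
  [
    { fieldName: "NAME", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { fieldName: "MISSING", expression: "$2.X", errorAction: ON_ERROR_IGNORE },
  ]
);
console.log(built);
```

Here the third definition points at an input record that does not exist; because its error action is ON_ERROR_IGNORE, the other two fields are still produced.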
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information. In general this value will be "true", although it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; this allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser identifier, or a page that identifies the browser, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
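The practical difference between the Loop and Repeat structures is when the condition is tested. The following sketch shows it with a hypothetical makeCondition stand-in whose exec() simply evaluates a JavaScript predicate; it is not the ITPilot Condition object.

```javascript
// Hypothetical stand-in for a Condition: exec() reports whether to keep looping.
function makeCondition(predicate) {
  return { exec: (inputs) => predicate(...inputs) };
}

// Loop: the condition is tested before the body, so the body may never run.
let loopRuns = 0;
const loop = makeCondition((n) => n < 0);
while (loop.exec([loopRuns])) {
  loopRuns++;
}

// Repeat: the body runs once before the first test.
let repeatRuns = 0;
const repeat = makeCondition((n) => n < 0);
do {
  repeatRuns++;
} while (repeat.exec([repeatRuns]));

console.log(loopRuns, repeatRuns);
```

With a condition that is false from the start, the while form never runs its body, whereas the do... while form runs it exactly once.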
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component's browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
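The retry settings of the Sequence component (setRetries and setRetryDelay) can be pictured with the sketch below. runWithRetries is a hypothetical helper, not part of ITPilot, and it assumes that the retry count excludes the first attempt; the delay is injected as a callback so that the example runs without really waiting.

```javascript
// Hypothetical helper: run `action`, retrying up to `retries` times after a failure.
// `sleep` is injected so that the sketch stays synchronous and easy to test.
function runWithRetries(action, retries, retryDelayMs, sleep) {
  let lastError = null;
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) sleep(retryDelayMs); // wait only between attempts
    try {
      return action(attempt);
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}

const waits = [];
const page = runWithRetries(
  (attempt) => {
    if (attempt < 2) throw new Error("connection failed");
    return "page loaded on attempt " + attempt;
  },
  3,    // comparable to setRetries(3)
  3000, // comparable to setRetryDelay(3000)
  (ms) => waits.push(ms) // record the delay instead of really sleeping
);
console.log(page, waits);
```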
5.3.28 Store File
• Object: StoreFile
• Description: this stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is typically used when the processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      ...
      return mycustom_output;
  }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
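Put together, a custom component file might be laid out as in the following sketch. Only the function names and the type codes come from this section. The way structures are described (plain objects mapping field names to type names), the NAME and GREETING fields, and the body of the main function are hypothetical placeholders, since the actual structure API is not shown here.

```javascript
// Output type codes as listed in this section (only the ones used here).
const LIST_TYPE = 1;
const RECORD_TYPE = 3;

// Main function: receives the input element and returns the output.
// The plain-object record is a hypothetical placeholder for an ITPilot record.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  mycustom_output = { GREETING: "Hello " + mycustom_input.NAME };
  return mycustom_output;
}

// Input schema (hypothetical representation).
function mycustom_getInputStructure() {
  return { NAME: "string" };
}

// Output type: this component returns a single record.
function mycustom_getOutputType() {
  return RECORD_TYPE;
}

// Output structure (hypothetical representation); needed because the
// output type is RECORD_TYPE.
function mycustom_getOutputStructure() {
  return { GREETING: "string" };
}

// The output structure is only required for record and list outputs.
const needsStructure = [RECORD_TYPE, LIST_TYPE].includes(mycustom_getOutputType());
console.log(needsStructure, mycustom_main({ NAME: "ITPilot" }));
```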
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the closing tag of the deleted data.
    • deletionSuffixLabel: suffix to use for the deleted content when generating the result page (by default, the HTML end tag of the red background).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing the two pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.
    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; the tokens that match a regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
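The effect of ignored tokens and case-insensitive comparison can be sketched as a normalization step applied to both pages before they are compared. This is only an illustration: normalizeTokens is a hypothetical helper, the pages are already split into tokens by hand, and JavaScript regular expressions stand in for the Perl ones the component expects.

```javascript
// Hypothetical pre-comparison step: drop ignored tokens, optionally lower-case the rest.
function normalizeTokens(tokens, ignoredPatterns, toLowerCase) {
  return tokens
    .filter((token) => !ignoredPatterns.some((re) => re.test(token)))
    .map((token) => (toLowerCase ? token.toLowerCase() : token));
}

// Example patterns: script tags and clock times are not relevant to the comparison.
const ignored = [/^<script\b/i, /^\d{2}:\d{2}$/];
const oldPage = ["<P>", "Price", "10:15", "<script src='a.js'>"];
const newPage = ["<p>", "price", "11:42"];

const a = normalizeTokens(oldPage, ignored, true);
const b = normalizeTokens(newPage, ignored, true);

// After normalization the two token lists can be compared position by position.
const identical = a.length === b.length && a.every((t, i) => t === b[i]);
console.log(identical);
```

In this example the two pages differ only in capitalization and in tokens matched by the ignored patterns, so they compare as identical after normalization.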
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).
    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);
Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or on functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. It is expressed as a string of characters; e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification given in the constructor. This function returns a list of records of the type given in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5312 Fetch
bull Object Fetch
bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format
bull Functions
o Constructor(url sequenceType reusableConnection binary page)
bull url URL where the resource to be downloaded can be found (OPTIONAL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection This indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull binary "true": the object to be downloaded is binary; "false": the object to be downloaded is in text format
bull page Optionally the page from which the http request is launched can be indicated
o exec(page) This runs the component returning the string- or binary-type value obtained
bull page Optionally the page from which the http request is launched can be indicated
o setEncoding(encoding) allows the user to determine the MIME type [MIME] of the information to send
bull encoding MIME type of the information to send
o syncWithPost(flag) this function sets the method used to recover the page state. ITPilot sends a POST message to the page URL with the POST parameters that were originally used to access that page. This is the default synchronization method
bull flag "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be executed against it
bull back back sequence NSEQL program
o setReusingConnection(reusingConnection) this function indicates whether connections will be reused or not
bull reusingConnection if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session
o setBackPages(pages) this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor POST-based navigation has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
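The order in which syncWithPost, setBackSequence and setBackPages take effect can be summarized with a small standalone sketch. It does not call the ITPilot API: chooseBackStrategy, its config object and the returned labels are hypothetical names used only for this illustration.

```javascript
// Hypothetical helper (not part of ITPilot): returns a label for the way
// the page state would be recovered, following the order described above.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        return "POST";            // re-send the POST parameters used to reach the page
    }
    if (config.backSequence) {
        return "BACK_SEQUENCE";   // run the explicit NSEQL back sequence
    }
    return "BACK";                // fall back to the NSEQL Back() command
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var byBack = chooseBackStrategy({ syncWithPost: false });
```

In the last case, the value given to setBackPages would decide how many pages the Back() command goes back.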
5.3.13 Filter
bull Object Filter
bull Description this carries out a filtering operation from a list of records returning those meeting a given condition
bull Functions
o Constructor(expr auxiliaryRecords)
bull expr regular expression of the filtering operation for a list of records which are described in the exec function
bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter
o exec(inputRecords auxiliaryRecords) function receiving a list of records and returning the subgroup complying with the selection expression indicated in the constructor
bull inputRecords list of input records
bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Filter will return the list of filtered elements except for the one that caused the error
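The effect of ON_ERROR_IGNORE described in the note can be pictured with a plain JavaScript sketch. It does not use the ITPilot Filter object: filterRecords, the predicate function and the string constants below are hypothetical stand-ins defined only for this illustration.

```javascript
// Hypothetical stand-ins; in a wrapper these constants are supplied by ITPilot.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Returns the records for which predicate(record) is true.
// With ON_ERROR_IGNORE, a record whose evaluation throws is skipped;
// with any other error action the error is propagated.
function filterRecords(records, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < records.length; i++) {
        try {
            if (predicate(records[i])) {
                result.push(records[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e;
            }
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
var cheap = filterRecords(records, function (r) {
    if (r.PRICE === null) { throw new Error("PRICE is missing"); }
    return r.PRICE <= 20;
}, ON_ERROR_IGNORE);
// cheap contains only the record with PRICE 10; the record without a price was skipped
```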
5.3.14 Form Iterator
bull Object Form_Iterator
bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run
bull Functions
o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)
bull findForm NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)
bull submitForm NSEQL program that submits the form (see [NSEQL] for further information on NSEQL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull baseElements optional list of records that can be employed as variables to use in the different NSEQL browsing sequences used in this component
bull inputPage input page from which the selected form can be iteratively invoked
bull parallelIterator "true": the component will execute its iterations in parallel
o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form
bull field name of the multiple selection field
bull position position related to the field between those of the same name starting with position 0
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form
bull field name of the multiple selection field
bull position position related to the field between those of the same name starting with position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form
bull field name of the HTML selection field
bull position position occupied in the event of more than one field element with the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field in the event of several on the form with the same value
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equal boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may simply be contained in them ("false")
o click(field value state) function that allows for an element to be selected and a "click" event run on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons this parameter indicates the elements selected as a list (eg [0 1]) When run on Checkboxes it indicates the value of the selectable element
bull state when this function is run on Radio Buttons this parameter is not used When run on Checkboxes it indicates the status of the element
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field position values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o textarea(field position values) this indicates the values added to a text area
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag) the component launches the iteration in parallel
bull flag "true": the iterations will be executed in parallel
o next(inputPage) this returns the page resulting from running a component iteration
bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run
o hasNext() function that determines whether there are more results. The function returns "true" if there is at least one more result or "false" if there is not
o close() function that closes the iterator
o syncWithPost(flag) this function indicates whether the page state is to be recovered by issuing a POST message to the page URL, containing the POST parameters with which the page was originally reached. This is the default synchronization method
bull flag "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST navigation has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
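The way the clicked-state constants could multiply the number of form combinations is sketched below in plain JavaScript. This is an illustration under assumptions, not ITPilot code: expandStates is a hypothetical helper, the constants are redefined locally as strings, and the order in which the real component produces its combinations is not documented here.

```javascript
// Local stand-ins for the constants named in the text.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical helper: expands one state per element into every
// combination of marked (true) and unmarked (false) elements.
function expandStates(states) {
    var combos = [[]];
    for (var i = 0; i < states.length; i++) {
        var options;
        if (states[i] === CLICKED_ELEMENT) {
            options = [true];
        } else if (states[i] === NON_CLICKED_ELEMENT) {
            options = [false];
        } else {
            options = [true, false];   // both variants are generated
        }
        var next = [];
        for (var c = 0; c < combos.length; c++) {
            for (var o = 0; o < options.length; o++) {
                next.push(combos[c].concat([options[o]]));
            }
        }
        combos = next;
    }
    return combos;
}

var combos = expandStates([CLICKED_ELEMENT, CLICKED_AND_NON_CLICKED_ELEMENT]);
// combos is [[true, true], [true, false]]: two iterations instead of one
```

Each additional element declared as CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations, which is where setMaxIterations becomes useful.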
5.3.15 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool from a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType lastURL lastURLMethod lastURLPostParameters cookie proxyUser proxyPassword proxyDomain) executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie information storage "cookies"
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object Init
bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application
bull Functions
o Constructor(input output)
bull input input record of the component. Optionally used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)
o get(name) this returns the value of a record field created as a group of initialization parameters
bull name name of the record field
o setText(field obl fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field obl fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field obl fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field obl fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field obl fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field format obl fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field. The format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
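The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be illustrated with a standalone sketch of how an initialization record might be assembled from the caller's query. Nothing below is ITPilot code: buildInitRecord, the field descriptors and the field names are hypothetical, and the constants are redefined locally as strings.

```javascript
// Local stand-ins for the constants named in the text.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper.
// fields: list of { name, obl, fixedValue } descriptors
// query:  object with the values supplied by the calling application
function buildInitRecord(fields, query) {
    var record = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;        // constant value, the query is not consulted
        } else if (query.hasOwnProperty(f.name)) {
            record[f.name] = query[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        }
        // OPTIONAL fields without a value are simply left out
    }
    return record;
}

var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAXPRICE", obl: OPTIONAL },
    { name: "LANG", obl: FIXED, fixedValue: "en" }
];
var initRecord = buildInitRecord(fields, { KEYWORD: "laptop" });
// initRecord has KEYWORD "laptop" and LANG "en"; MAXPRICE is absent
```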
5.3.17 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate. "true" is returned if there is at least one more result
o next() this returns the next iteration element The list is a sorted sequence of records
The "Parallel Execution" option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5 Using threads in the Iterator component
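For comparison, the sequential hasNext() / next() pattern can be tried outside ITPilot with a minimal stand-in. ListIterator below is a hypothetical object written for this sketch; it only mimics the two methods described above and is not the ITPilot Iterator implementation.

```javascript
// Hypothetical stand-in for the Iterator component.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    // Returns the current record and advances to the following one.
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);   // records are visited in list order
}
// titles is ["A", "B", "C"]
```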
5.3.18 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the component's output record list. It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be managed by the browser pool at the same time
bull initialPoolSize initial number of browser pool connections. A number of idle connections are established, ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
bull query SQL query that returns the results required by the component
o exec(query baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be managed by the browser pool at the same time
bull initialPoolSize initial number of browser pool connections. A number of idle connections are established, ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5.3.19 Loop
bull Description This allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}
Figure 6 Using the Loop function
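The same control flow can be run outside ITPilot with a stand-in for the Condition object. StubCondition below is hypothetical: it wraps a JavaScript function instead of an ITPilot expression string and only mimics the exec method.

```javascript
// Hypothetical stand-in for Condition: exec() evaluates a JavaScript function.
function StubCondition(test) {
    this.test = test;
}
StubCondition.prototype.exec = function (values) {
    return this.test(values);
};

var page = 1;
var visited = [];
var loop = new StubCondition(function () { return page <= 3; });
while (loop.exec([])) {
    visited.push(page);   // loop operations
    page = page + 1;      // the body must eventually make the condition false
}
// visited is [1, 2, 3]
```

Because the condition is evaluated before each pass, the body is never executed if the condition is false from the start.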
5.3.20 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description this allows iteration over a series of inter-related pages using one or several browsing sequences
bull Functions
o Constructor(sequences iterations sequenceType reuse inputPage)
bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration
bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the session's information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag) this function indicates whether the page state is to be recovered by issuing a POST message to the page URL, containing the POST parameters with which the page was originally reached. This is the default synchronization function
bull flag "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST navigation has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
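The rule stated for the sequences parameter (a single sequence is reused in every iteration, several sequences are consumed one per iteration) can be sketched as follows. sequenceForIteration is a hypothetical helper, the sequence strings are placeholders rather than real NSEQL, and returning null once the list is exhausted is an assumption of this sketch.

```javascript
// Hypothetical helper: picks the browsing sequence for iteration i (0-based).
function sequenceForIteration(sequences, i) {
    if (sequences.length === 1) {
        return sequences[0];                 // one sequence: used in all iterations
    }
    // several sequences: one per iteration, nothing left afterwards (assumption)
    return i < sequences.length ? sequences[i] : null;
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<NSEQL sequence to page 2>", "<NSEQL sequence to page 3>"];

var reused = sequenceForIteration(single, 5);
var second = sequenceForIteration(several, 1);
var exhausted = sequenceForIteration(several, 2);
```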
5.3.21 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this adds a component input record to the wrapper result
bull record record to use
5.3.22 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName expression errorAction) method for adding a new field to the record under construction
bull fieldName name of the field
bull expression field definition expression, e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error
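The role of the errorAction parameter of add can be pictured with a standalone sketch. It is not the Record_Constructor implementation: buildRecord is a hypothetical helper, expressions are modelled as JavaScript functions rather than ITPilot expression strings, and leaving a failed field unset under ON_ERROR_IGNORE is an assumption made for the illustration.

```javascript
// Local stand-ins for the constants named in the text.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper.
// inputs:    list of input records (the recordsObj list)
// fieldDefs: list of { fieldName, expression, errorAction }
function buildRecord(inputs, fieldDefs) {
    var record = {};
    for (var i = 0; i < fieldDefs.length; i++) {
        var def = fieldDefs[i];
        try {
            record[def.fieldName] = def.expression(inputs);
        } catch (e) {
            if (def.errorAction === ON_ERROR_RAISE) {
                // stop and report which field failed
                throw new Error("Field " + def.fieldName + ": " + e.message);
            }
            // ON_ERROR_IGNORE: skip this field and continue with the next one
        }
    }
    return record;
}

var inputs = [{ PARAM1: "abc" }];
var built = buildRecord(inputs, [
    { fieldName: "NAME", expression: function (r) { return r[0].PARAM1.toUpperCase(); }, errorAction: ON_ERROR_RAISE },
    { fieldName: "TOTAL", expression: function (r) { return r[1].PRICE * 2; }, errorAction: ON_ERROR_IGNORE }
]);
// built has NAME "ABC"; TOTAL failed (there is no second input record) and was ignored
```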
5.3.23 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence created from the results of a record It allows sequences to be created for access to other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences sequenceDepends sequenceType reuse inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it
bull inputPage optional this allows for a homepage to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5.3.24 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5.3.25 Repeat
bull Description This allows for loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript using a do ... while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    ...
} while (repeat.exec([]));
Figure 7 Using the Repeat function
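The pattern in Figure 7 can be tried outside ITPilot with a stand-in object. The sketch below assumes a hypothetical stub named Condition whose exec() evaluates a plain JavaScript function; the real Condition takes an ITPilot expression and is only available in the wrapper runtime. The constants and the page counter are placeholders for illustration. As in Figure 7, the loop body runs again for as long as exec() returns true.

```javascript
// Hypothetical stand-in for ITPilot's Condition component (assumption:
// the real one evaluates an ITPilot expression, not a JavaScript function).
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
    // No-op in this stub; ITPilot uses it to configure error handling.
};
Condition.prototype.exec = function (values) {
    return this.predicate(values);
};

var RUNTIME_ERROR = "RUNTIME_ERROR";   // placeholder constants for the stub
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var pagesVisited = 0;
var maxPages = 3;                      // illustrative limit

var repeat = new Condition(function () { return pagesVisited < maxPages; });
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);

do {
    // <loop_operations> would go here; this sketch only counts iterations.
    pagesVisited++;
} while (repeat.exec([]));
```

Because the condition is checked after the body, the body always runs at least once, which is the difference between Repeat and a loop that tests its condition first.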
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation tool, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is omitted, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
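To make the effect of setRetries and setRetryDelay more concrete, the following sketch models a retry loop in plain JavaScript. It is not ITPilot code: runWithRetries, its parameters and the failing action are hypothetical, and the sketch assumes that the retry count does not include the first attempt, which the guide does not state.

```javascript
// Hypothetical helper, not part of the ITPilot API: runs `action` and
// retries it up to `retries` more times, calling `sleep(delayMs)` between
// attempts. Assumption: the retry count excludes the first attempt.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e;           // retries exhausted: propagate the error
            }
            attempt++;
            sleep(delayMs);        // wait before the next attempt
        }
    }
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("navigation failed"); }
        return "page after attempt " + attempt;
    },
    3,                                  // comparable to setRetries(3)
    3000,                               // comparable to setRetryDelay(3000)
    function (ms) { waits.push(ms); }   // records the delay instead of sleeping
);
```

The sleep function is injected so that the sketch can run instantly; in a real wrapper the waiting is handled by the component itself.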
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
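The calling pattern of execute and wait can be illustrated with a stand-in. ThreadStub below is hypothetical: it only queues the calls and runs them one after another when wait() is invoked, whereas the real Thread component runs them concurrently. The stub also receives the function itself rather than its name, which is an assumption made to keep the sketch self-contained. The record data and processRecord are illustrative.

```javascript
// Hypothetical stand-in for ITPilot's Thread object. It only mimics the
// calling pattern: execute() queues a call and wait() runs the queue
// sequentially. The real component runs the calls concurrently.
function ThreadStub() {
    this.pending = [];
    this.maxConcurrent = 1;
}
ThreadStub.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;   // stored only; the stub does not parallelize
};
ThreadStub.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ fn: fn, args: args });
};
ThreadStub.prototype.wait = function () {
    while (this.pending.length > 0) {
        var task = this.pending.shift();
        task.fn.apply(null, task.args);
    }
};

var details = [];
function processRecord(title, price) {   // illustrative per-record function
    details.push(title + ": " + price);
}

var records = [["Book A", 10], ["Book B", 12]];  // sample extracted records
var thread = new ThreadStub();
thread.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, records[i][0], records[i][1]);
}
thread.wait();   // returns once every queued call has finished
```

The arguments passed after the function must match its parameter list, as the description of execute requires.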
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
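As a minimal sketch of such a file, the following defines a component named "mycustom" that returns a string. The logic inside mycustom_main is purely illustrative, the input structure is left empty because the guide does not show its contents, and STRING_TYPE is declared locally with the value from the list above so that the sketch runs outside ITPilot (an assumption: inside ITPilot the constant is presumably predefined, and STRING_TYPE is assumed to be the appropriate type for a string result).

```javascript
// Sketch of a custom component file for a component named "mycustom".
// The body of each function is illustrative only.
var STRING_TYPE = 13;   // value taken from the output type list above

// Main function: receives the component input and returns its output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Illustrative logic: normalize the input to an upper-case string.
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Input schema definition. The guide elides the body, so it is left empty.
function mycustom_getInputStructure() {
}

// Output type: a plain string, so mycustom_getOutputStructure is not needed.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Since the output type is neither RECORD_TYPE nor LIST_TYPE, the sketch omits mycustom_getOutputStructure.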
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following piece of code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: write the VQL statement as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262: ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045: Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.9 ExecuteJS
• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value that results from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
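The positional placeholders $0, $1, … refer to the elements of the list passed to exec. The sketch below illustrates only that substitution idea in plain JavaScript. evaluatePositional is a hypothetical helper, it evaluates the expression as JavaScript, and it does not model ITPilot's own expression language or the functions of Appendix A of [GENER].

```javascript
// Hypothetical illustration of positional placeholders; ITPilot's own
// expression language and functions are not modeled here.
function evaluatePositional(expression, values) {
    // Rewrite $0, $1, ... as values[0], values[1], ... and evaluate the
    // result as a JavaScript expression.
    var body = expression.replace(/\$(\d+)/g, "values[$1]");
    return new Function("values", "return (" + body + ");")(values);
}

var inOrder = evaluatePositional("$0 <= $1", [3, 7]);     // first <= second
var outOfOrder = evaluatePositional("$0 <= $1", [9, 7]);  // first > second
```

With the input [3, 7] the expression holds, and with [9, 7] it does not, which mirrors the "less than or equal to" example given for the constructor.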
5.3.11 Extractor
• Object: Extractor
• Description: is responsible for extracting structured data from an HTML page by running a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method. It runs the specification given in the constructor and returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): sets the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary, "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component and returns the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which more data extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if "true", the connection coming from previous components is reused. If "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
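The behaviour described in the NOTE can be modeled in plain JavaScript. The sketch below is not the Filter component: filterIgnoringErrors and the sample records are hypothetical, and the predicate is an ordinary function rather than an ITPilot expression. It shows a filter that drops only the record whose evaluation fails and keeps processing the rest.

```javascript
// Hypothetical model of the behaviour described in the NOTE: a record whose
// evaluation throws is skipped instead of aborting the whole filter.
function filterIgnoringErrors(records, predicate) {
    var kept = [];
    for (var i = 0; i < records.length; i++) {
        try {
            if (predicate(records[i])) {
                kept.push(records[i]);
            }
        } catch (e) {
            // ON_ERROR_IGNORE-like handling: drop the offending record only.
        }
    }
    return kept;
}

var records = [{ price: 5 }, null, { price: 20 }, { price: 8 }];
var cheap = filterIgnoringErrors(records, function (r) {
    return r.price < 10;   // throws a TypeError for the null record
});
```

Here the null record raises an error and is left out, the record with price 20 fails the condition, and the two remaining records are returned.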
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of the form's fields in each iteration.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be invoked iteratively.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field in the event of more than one field with the same name.
    • positions: positions of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value in the event of replicated values.
    • equals: Boolean value that indicates whether the provided values must exactly match the field values or may just be contained in them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result and "false" otherwise.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
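The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be sketched in plain JavaScript. The helper expandClickedStates is hypothetical and the constant values are placeholders (ITPilot defines its own). It expands a clickedArray into every combination of marked and unmarked states, so each element flagged as "both" doubles the number of combinations.

```javascript
// Placeholder values for the sketch; ITPilot defines its own constants.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non_clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Hypothetical helper: expands a clickedArray into every combination of
// marked (true) / unmarked (false) states that would have to be visited.
function expandClickedStates(clickedArray) {
    var combinations = [[]];
    for (var i = 0; i < clickedArray.length; i++) {
        var options;
        if (clickedArray[i] === CLICKED_ELEMENT) {
            options = [true];
        } else if (clickedArray[i] === NON_CLICKED_ELEMENT) {
            options = [false];
        } else {
            options = [true, false];   // one combination per state
        }
        var next = [];
        for (var c = 0; c < combinations.length; c++) {
            for (var o = 0; o < options.length; o++) {
                next.push(combinations[c].concat([options[o]]));
            }
        }
        combinations = next;
    }
    return combinations;
}

var combos = expandClickedStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
```

For the three elements above, only the second one is flagged as "both", so the sketch yields two combinations. This growth is one reason a limit such as setMaxIterations can be useful.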
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identifier.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: is responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 43
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field obl fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field format obl fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field This format is optional but becomes compulsory if completed Otherwise the wrapper may not be run This representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 44
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
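The three obl modes can be pictured with a small stand-alone sketch. It does not use the ITPilot Init component: checkQueryParameters, the plain-object field descriptions and the string constants are hypothetical names introduced only for illustration. That a missing obligatory parameter raises an error, and that a FIXED field ignores any value in the query, are assumptions.

```javascript
// Stand-alone illustration of the OPTIONAL / OBLIGATORY / FIXED modes.
// Not ITPilot code: these constants and checkQueryParameters are hypothetical.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Returns the effective parameter values for a query, or throws when an
// obligatory field has no value (assumed behaviour).
function checkQueryParameters(fields, query) {
    var result = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(query, f.name);
        if (f.obl === FIXED) {
            result[f.name] = f.fixedValue;      // constant value (assumed to win over the query)
        } else if (supplied) {
            result[f.name] = query[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        }
        // OPTIONAL and not supplied: the field is simply left out.
    }
    return result;
}

var fields = [
    { name: "ISBN", obl: OBLIGATORY },
    { name: "MAX_PRICE", obl: OPTIONAL },
    { name: "COUNTRY", obl: FIXED, fixedValue: "US" }
];
var params = checkQueryParameters(fields, { ISBN: 12345 });
```

Here the query supplies only ISBN, so params holds ISBN and the fixed COUNTRY value, and MAX_PRICE is absent.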
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
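The loop shape of Figure 5 can be tried outside ITPilot with two stand-ins. ListIterator and SequentialThread below are hypothetical objects written for this sketch. They only mimic the hasNext/next and execute/wait call shapes, and SequentialThread runs each call immediately instead of concurrently.

```javascript
// Hypothetical stand-in for the Iterator component: walks an array in order.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Hypothetical stand-in for the Thread component. It runs each call
// immediately and in order; the real component is described as concurrent.
function SequentialThread() {}
SequentialThread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    fn.apply(null, args);       // arguments after fn must match fn's parameters
};
SequentialThread.prototype.wait = function () {
    // Nothing to wait for: every call has already finished.
};

var titles = [];
function processRecord(prefix, record) {
    titles.push(prefix + record.TITLE);
}

var iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }]);
var thread = new SequentialThread();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    thread.execute(processRecord, "book:", recordInstance);
}
thread.wait();
```

The arguments passed to execute after the function are handed to it unchanged, which is why their order has to match the function's parameter list.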
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be built into the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
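The following stand-alone sketch shows the WHILE… DO shape with a value that changes on each pass. ConditionSketch is a hypothetical stand-in whose exec evaluates a plain JavaScript predicate; it does not parse ITPilot condition expressions.

```javascript
// Hypothetical stand-in for a Condition object: exec() applies a plain
// JavaScript predicate to the list of input values.
function ConditionSketch(predicate) {
    this.predicate = predicate;
}
ConditionSketch.prototype.exec = function (inputs) {
    return this.predicate.apply(null, inputs);
};

var page = 1;
var visited = [];
var loop = new ConditionSketch(function (current, last) {
    return current <= last;
});

// WHILE ... DO: the condition is tested before every pass, so the body
// does not run at all when the condition is false from the start.
while (loop.exec([page, 3])) {
    visited.push(page);
    page = page + 1;
}
```

Because the test comes first, starting with page = 4 would leave visited empty.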
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iterating over a series of related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: the page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: the page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
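As a rough picture of what a retry count means, the sketch below repeats an action after a failure. runWithRetries is a hypothetical helper, not part of ITPilot. It assumes that "number of retries" means extra attempts after the first one, and it omits the delay between attempts that setRetryDelay would configure.

```javascript
// Hypothetical helper: runs action, repeating it after a failure up to
// `retries` additional times. The pause between attempts is left out.
function runWithRetries(action, retries) {
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (e) {
            lastError = e;      // remember the failure and try again
        }
    }
    throw lastError;            // every attempt failed
}

var calls = 0;
function accessNextPage() {
    calls = calls + 1;
    if (calls < 3) {
        throw new Error("page not available");
    }
    return "page reached on call " + calls;
}

var outcome = runWithRetries(accessNextPage, 2);
```

With two retries the third call succeeds; with a single retry the same action would end in the last error being raised.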
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
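How a positional field reference picks its value can be sketched as follows, assuming the reference is written as a record index, a dot and a field name ($0.PARAM1). resolveFieldReference is a hypothetical helper that handles only this simple form, not the functions of Appendix A of [GENER], and plain objects stand in for ITPilot records.

```javascript
// Hypothetical helper: resolves "$<index>.<FIELD>" against a list of records.
function resolveFieldReference(reference, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(reference);
    if (match === null) {
        throw new Error("Unsupported reference: " + reference);
    }
    var index = parseInt(match[1], 10);   // $0 is the first input record
    if (index >= recordsObj.length) {
        throw new Error("No input record at position " + index);
    }
    return recordsObj[index][match[2]];
}

var recordsObj = [
    { PARAM1: "denodo", PARAM2: 7 },
    { PRICE: 9.5 }
];
var first = resolveFieldReference("$0.PARAM1", recordsObj);
var second = resolveFieldReference("$1.PRICE", recordsObj);
```

The index counts from zero, so $1 addresses the second element of the input list.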
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be built into the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
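The practical difference from a while loop is that a do… while loop runs its body once before the condition is first evaluated. The sketch below shows this with a condition that is false from the start. RepeatConditionSketch is a hypothetical stand-in that evaluates a plain JavaScript predicate rather than an ITPilot condition expression.

```javascript
// Hypothetical stand-in for a Condition object used as the loop condition.
function RepeatConditionSketch(predicate) {
    this.predicate = predicate;
}
RepeatConditionSketch.prototype.exec = function (inputs) {
    return this.predicate.apply(null, inputs);
};

var attempts = 0;
// This predicate is false for every non-negative count.
var repeat = new RepeatConditionSketch(function (count) {
    return count < 0;
});

// do ... while: the body runs first, then the condition is evaluated.
do {
    attempts = attempts + 1;
} while (repeat.exec([attempts]));
```

The body still executes once even though the condition never holds, which is the behaviour to keep in mind when choosing between Loop and Repeat.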
5.3.26 Script
• Description: this component allows part of the description logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical interface of the generation tool, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
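A minimal skeleton of such a file might look as follows. This is a sketch under several assumptions: the functions are declared with the function keyword, the input is treated as a plain string, and the type constants are declared locally so that the sketch runs on its own (inside ITPilot they are presumably predefined). needsOutputStructure is a hypothetical helper that only restates the rule about when an output structure is required.

```javascript
// Output type codes as listed in this section (only the ones used here),
// declared locally so the sketch can run outside ITPilot.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of a hypothetical component named "mycustom".
// Assumption: the input is handled as a plain string.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Input schema. The calls that build a structure are not shown in this
// section, so the body is left as a placeholder.
function mycustom_getInputStructure() {
    return null;
}

// This component returns a simple string.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

// Hypothetical helper: an output structure function is only needed for
// RECORD_TYPE and LIST_TYPE outputs.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}

var sample = mycustom_main("itpilot");
var structureRequired = needsOutputStructure(mycustom_getOutputType());
```

Since the declared output type is STRING_TYPE, this component would not need a mycustom_getOutputStructure function.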
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
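The quoting rule in the NOTE can be applied mechanically before the statement is sent. The sketch below assumes that the delimiter is the double quote and that the escape character is the backslash. escapeForVql and buildCreateWrapper are hypothetical helpers, the optional MAINTENANCE clause is left out, and only the quote rule is handled.

```javascript
// Hypothetical helper: escapes every double quote in the JavaScript code
// with a backslash (assumed delimiter and escape character).
function escapeForVql(jsCode) {
    return jsCode.replace(/"/g, '\\"');
}

// Hypothetical helper: builds the statement text around the escaped code.
function buildCreateWrapper(name, jsCode) {
    return 'CREATE WRAPPER ITP ' + name + ' "' + escapeForVql(jsCode) + '"';
}

var statement = buildCreateWrapper("books", 'var title = "ITPilot";');
```

The quotes around ITPilot inside the script end up preceded by a backslash, while the outer pair delimits the whole script.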
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.10 Expression
• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
5.3.11 Extractor
• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further data extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot can go back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 35
5313 Filter
bull Object Filter
bull Description this carries out a filtering operation on a list of records, returning those that meet a given condition
bull Functions
o Constructor(expr auxiliaryRecords)
bull expr regular expression of the filtering operation, applied to the list of records passed to the exec function
bull auxiliaryRecords list of records that take part in the filter condition but are not the records to filter
o exec(inputRecords auxiliaryRecords) function that receives a list of records and returns the subset that complies with the selection expression indicated in the constructor
bull inputRecords list of input records
bull auxiliaryRecords list of records that take part in the filter condition but are not the records to filter
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error
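The behavior described in the note can be pictured with a minimal plain-JavaScript sketch. It does not use the ITPilot Filter object: filterRecords, the predicate function, and the sample records are hypothetical stand-ins, and the two constants only mirror the names used in this guide.

```javascript
// Hypothetical stand-ins; this is not the ITPilot Filter implementation.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Returns the records for which predicate(record, auxiliaryRecords) is true.
// With ON_ERROR_IGNORE, a record whose evaluation throws is left out of the
// result and filtering continues; otherwise the error propagates.
function filterRecords(inputRecords, auxiliaryRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i], auxiliaryRecords)) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e;
            }
        }
    }
    return result;
}

// Example: keep records priced below the limit held in an auxiliary record.
var records = [
    { TITLE: "A", PRICE: 10 },
    { TITLE: "B", PRICE: null },   // evaluating this record throws below
    { TITLE: "C", PRICE: 30 },
    { TITLE: "D", PRICE: 5 }
];
var auxiliary = [{ LIMIT: 20 }];

function cheaperThanLimit(record, aux) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing in record " + record.TITLE);
    }
    return record.PRICE < aux[0].LIMIT;
}

var kept = filterRecords(records, auxiliary, cheaperThanLimit, ON_ERROR_IGNORE);
console.log(kept.map(function (r) { return r.TITLE; }).join(","));
```

In this sketch the auxiliary list only supplies a value for the condition; it is never filtered itself, which matches the role given to auxiliaryRecords above.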
5.3.14 Form Iterator
bull Object Form_Iterator
bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run
bull Functions
o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)
bull findForm NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)
bull submitForm NSEQL program that submits the form (see [NSEQL] for further information on NSEQL)
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull baseElements optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component
bull inputPage input page from which the selected form can be iteratively invoked
bull parallelIterator if "true", the component executes its iterations in parallel
o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form
bull field name of the multiple selection field
bull position position of the field among those with the same name, starting at position 0
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form
bull field name of the multiple selection field
bull position position of the field among those with the same name, starting at position 0
bull valuesArray list of values that must be selected in the field
bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values
bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)
bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form
bull field name of the HTML selection field
bull position position occupied in the event of more than one field element with the same name
bull positions values of the elements on which the component must iterate
o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field
bull field name of the HTML text field
bull position position of the field in the event of several on the form with the same value
bull values list of values that must be selected in the field
bull positions list that indicates the position held for each value element in the event of replicated values
bull equals boolean value which indicates whether the field values must exactly match those provided to the function or may merely contain them
o click(field value state) function that selects an element and runs a "click" event on it
bull field name of the HTML field on which the click is to be made
bull value when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element
bull state when this function is run on Radio Buttons this parameter is not used When run on Checkboxes it indicates the status of the element
bull CLICKED_ELEMENT mark the element
bull NON_CLICKED_ELEMENT leave the element as unmarked
bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked
o input(field position values) function that indicates the values added to an input field
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o textarea(field position values) this indicates the values added to a text area
bull field name of the HTML input field
bull position position of the field in the event of several on the form with the same name
bull values list of values that must be selected in the field
o toList() returns the list with the NSEQL sequences used in each iteration
o setMaxIterations(count) sets the maximum number of iterations that can be executed
bull count number that determines the maximum number of iterations
o setRetries(count) update method for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o setParallelIterator(flag) the component launches the iteration in parallel
bull flag if "true", the iterations are executed in parallel
o next(inputPage) this returns the page resulting from running a component iteration
bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run
o hasNext() function that determines whether there are more results. The function returns "true" if there is at least one more result or "false" if there is not
o close() function that closes the iterator
o syncWithPost(flag) this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method
bull flag "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 Denodo HTTP browser implementation
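As an illustration of the clickedArray constants used by selectMultiplePositions, selectMultipleTexts and click, the following plain-JavaScript sketch expands a list of states into the combinations it would produce. It is not ITPilot code: expandClickedStates is a hypothetical helper, and the assumption that the states of several elements multiply into a cross product is ours, since the guide only states that CLICKED_AND_NON_CLICKED_ELEMENT yields two combinations for one element.

```javascript
// Hypothetical constants mirroring the names used in this section.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expands a clickedArray-style list into every combination of marked (true)
// and unmarked (false) states. Assumption: one combination drives one
// iteration, and elements combine as a cross product.
function expandClickedStates(clickedArray) {
    var combinations = [[]];
    clickedArray.forEach(function (state) {
        var options;
        if (state === CLICKED_ELEMENT) {
            options = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            options = [false];
        } else if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
            options = [true, false];
        } else {
            throw new Error("Unknown state: " + state);
        }
        var next = [];
        combinations.forEach(function (partial) {
            options.forEach(function (option) {
                next.push(partial.concat([option]));
            });
        });
        combinations = next;
    });
    return combinations;
}

var combos = expandClickedStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
console.log(JSON.stringify(combos));
```

Under that assumption, each element flagged CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of iterations, which is worth keeping in mind together with setMaxIterations.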
5.3.15 Get Page
bull Object Get_Page
bull Description obtains an active browser from the browser pool using a previously retrieved identification code
bull Functions
o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification
bull browserUuid browser id
o exec(pageType lastURL lastURLMethod lastURLPostParameters cookie proxyUser proxyPassword proxyDomain) executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27)
bull pageType type of browser used to access the page
bull SEQUENCE_IEBROWSER = 1
bull SEQUENCE_HTTP_BROWSER = 2
bull lastURL last URL where the page is coming from
bull lastURLMethod access method (GET POST) of the URL the page is coming from
bull lastURLPostParameters POST-method parameters of the URL the page is coming from
bull cookie stored cookie information
bull proxyUser user name to access the Proxy if required
bull proxyPassword user password to access the Proxy if required
bull proxyDomain Proxy domain if required
5.3.16 Init
bull Object Init
bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application
bull Functions
o Constructor(input output)
bull input input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context
bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)
o get(name) this returns the value of a record field created as a group of initialization parameters
bull name name of the record field
o setText(field obl fixedValue) this creates a text-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setInt(field obl fixedValue) this creates an integer-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLong(field obl fixedValue) this creates a long-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDouble(field obl fixedValue) this creates a double-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setLink(field obl fixedValue) this creates a URL-type field in the initialization record
bull field name of the field to create
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setDate(field format obl fixedValue) this creates a date-type field in the initialization record
bull field name of the field to create
bull format representation format of the date field. This format is optional but, once specified, it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT]
bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
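The three obligation values shared by the setText, setInt and related functions can be pictured with a small plain-JavaScript sketch. It does not call the Init object: buildInitRecord, the field definitions, and the sample query values are hypothetical, and the choice that a FIXED field ignores any value supplied in the query is an assumption.

```javascript
// Hypothetical stand-ins for the obligation values described above.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// fieldDefs: [{ name, obl, fixedValue }]; queryValues: { name: value }.
// Returns the record of initialization parameters, or throws if an
// OBLIGATORY field has no value in the query.
function buildInitRecord(fieldDefs, queryValues) {
    var record = {};
    fieldDefs.forEach(function (def) {
        var obl = def.obl || OPTIONAL; // OPTIONAL is the default value
        var supplied = Object.prototype.hasOwnProperty.call(queryValues, def.name);
        if (obl === FIXED) {
            record[def.name] = def.fixedValue; // assumption: query cannot override
        } else if (supplied) {
            record[def.name] = queryValues[def.name];
        } else if (obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + def.name);
        } else {
            record[def.name] = null; // optional and not supplied
        }
    });
    return record;
}

var defs = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE" },
    { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
var initRecord = buildInitRecord(defs, { KEYWORD: "laptop", COUNTRY: "US" });
console.log(JSON.stringify(initRecord));
```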
5.3.17 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate. "true" is returned if there is at least one more result
o next() this returns the next iteration element. The list is a sorted sequence of records
The "Parallel Execution" option in the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
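For readers unfamiliar with the hasNext()/next() protocol, the sequential case can be sketched in plain JavaScript. ListIterator is a hypothetical stand-in written for this example, not the ITPilot Iterator object, and the sample records are invented.

```javascript
// Hypothetical stand-in for the Iterator object, for illustration only.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one record remains.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record, in list order.
ListIterator.prototype.next = function () {
    if (!this.hasNext()) {
        throw new Error("No more records");
    }
    return this.list[this.position++];
};

var iterator = new ListIterator([{ ID: 1 }, { ID: 2 }, { ID: 3 }]);
var seen = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    seen.push(recordInstance.ID); // per-record processing would go here
}
console.log(seen.join(","));
```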
5.3.18 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the component's output record list. It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established, ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist
bull query SQL query that returns the results required by the component
o exec(query baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time
bull initialPoolSize initial number of connections in the pool. This number of idle connections is established, ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
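The roles of the three pool parameters can be pictured with a toy pool in plain JavaScript. This is only a sketch of the general pooling idea under our own assumptions: ToyPool, its acquire and release methods, the fake connections and the placeholder ping query are all hypothetical, and nothing here reflects how ITPilot implements its pool.

```javascript
// Toy pool, for illustration only; connections are plain objects.
function ToyPool(maxPoolSize, initialPoolSize, pingQuery, connect) {
    if (initialPoolSize > maxPoolSize) {
        throw new Error("initialPoolSize cannot exceed maxPoolSize");
    }
    this.maxPoolSize = maxPoolSize;
    this.pingQuery = pingQuery;
    this.connect = connect;
    this.idle = [];
    this.inUse = 0;
    for (var i = 0; i < initialPoolSize; i++) {
        this.idle.push(connect()); // idle connections, ready to be used
    }
}

// Hands out a live idle connection, or opens a new one if capacity allows.
ToyPool.prototype.acquire = function () {
    while (this.idle.length > 0) {
        var candidate = this.idle.pop();
        if (candidate.run(this.pingQuery)) { // ping: is it still alive?
            this.inUse++;
            return candidate;
        }
        // stale connection: drop it and try the next idle one
    }
    if (this.inUse >= this.maxPoolSize) {
        throw new Error("Pool exhausted");
    }
    this.inUse++;
    return this.connect();
};

// Returns a connection to the idle list.
ToyPool.prototype.release = function (connection) {
    this.inUse--;
    this.idle.push(connection);
};

var created = 0;
function connect() {
    created++;
    return { alive: true, run: function (query) { return this.alive; } };
}

var pool = new ToyPool(2, 1, "SELECT 1", connect); // placeholder ping query
var first = pool.acquire();   // reuses the connection opened at startup
var second = pool.acquire();  // no idle connection left, so a new one is opened
pool.release(first);
first.alive = false;          // simulate a connection that went stale
var third = pool.acquire();   // ping fails, stale one is dropped, new one opened
console.log(created);
```

The ping step is the reason the guide asks for a simple query: in this sketch it runs every time an idle connection is handed out.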
5.3.19 Loop
bull Description This allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description this allows iteration over different inter-related pages using one or several browsing sequences
bull Functions
o Constructor(sequences iterations sequenceType reuse inputPage)
bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration
bull iterations this indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session's information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
o syncWithPost(flag) this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function
bull flag "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
5.3.21 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this adds the given component input record to the wrapper result
bull record record to use
5.3.22 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName expression errorAction) method for adding a new field to the record under construction
bull fieldName name of the field
bull expression field definition expression, e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error
5.3.23 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences sequenceDepends sequenceType reuse inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it
bull inputPage optional this allows for a homepage to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
5.3.24 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
5.3.25 Repeat
bull Description This allows for loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
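The practical difference between the while structure generated for the Loop component and the do... while structure generated for Repeat is when the condition is first tested. The plain-JavaScript sketch below counts body executions for both forms; runWhile, runDoWhile and the sample conditions are hypothetical helpers and do not use the Condition object.

```javascript
// Counts how many times the body runs in a pre-test loop (while).
function runWhile(condition, maxRuns) {
    var runs = 0;
    while (condition(runs) && runs < maxRuns) {
        runs++; // loop body
    }
    return runs;
}

// Counts how many times the body runs in a post-test loop (do...while).
function runDoWhile(condition, maxRuns) {
    var runs = 0;
    do {
        runs++; // loop body
    } while (condition(runs) && runs < maxRuns);
    return runs;
}

function alwaysFalse() { return false; }
function fewerThanThree(runs) { return runs < 3; }

console.log(runWhile(alwaysFalse, 10));      // 0: the body never runs
console.log(runDoWhile(alwaysFalse, 10));    // 1: the body runs once
console.log(runWhile(fewerThanThree, 10));   // 3
console.log(runDoWhile(fewerThanThree, 10)); // 3
```

A flow whose operations must run at least once, for example a first page fetch, therefore fits the post-test form, while a flow that may have nothing to do fits the pre-test form.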
5.3.26 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
5.3.27 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence NSEQL browsing program (see [NSEQL])
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information
bull inputPage optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function
bull flag "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run
o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
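The order in which ITPilot decides how to return to a page (POST synchronization first, then an explicit back sequence, then the NSEQL Back() command) can be sketched in plain JavaScript. This is only an illustration of the rule described for syncWithPost, setBackSequence and setBackPages: chooseBackStrategy and the objects it returns are hypothetical, and the default of one page back is an assumption.

```javascript
// Hypothetical helper: NOT part of the ITPilot API. It only mirrors the
// fallback order described for syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(config) {
  // POST synchronization is the documented default, so only an explicit
  // false turns it off.
  if (config.syncWithPost !== false) {
    return { kind: "POST" };
  }
  if (config.backSequence) {
    return { kind: "BACK_SEQUENCE", nseql: config.backSequence };
  }
  // Assumption: go back one page when no page count was configured.
  var pages = config.backPages === undefined ? 1 : config.backPages;
  return { kind: "BACK_COMMAND", pages: pages };
}

var defaultStrategy = chooseBackStrategy({});
var explicitStrategy = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>" // placeholder, not real NSEQL
});
var fallbackStrategy = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

With no configuration the sketch picks POST synchronization; the Back() command is used only when POST synchronization is disabled and no back sequence has been set.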
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
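The effect of setRetries and setRetryDelay can be pictured with a small retry loop. The following is a minimal sketch in plain JavaScript, not ITPilot code: runWithRetries is a hypothetical helper, the pause is injected as a function so that the example does not really wait, and it assumes that the retry count refers to the extra attempts made after the first failure.

```javascript
// Hypothetical helper, not an ITPilot function: retries `action` after a failure.
// Assumption: `retries` counts the extra attempts made after the first failure.
function runWithRetries(action, retries, retryDelayMs, sleep) {
  var lastError;
  for (var attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) {
      sleep(retryDelayMs); // wait between retries, in milliseconds
    }
    try {
      return action(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Usage: the action fails twice and succeeds on the third attempt.
var waits = [];
var result = runWithRetries(
  function (attempt) {
    if (attempt < 2) {
      throw new Error("temporary failure");
    }
    return "stored";
  },
  3,
  500,
  function (ms) { waits.push(ms); } // recording stub instead of a real pause
);
```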
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each record obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
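The queuing behaviour described for setMaxConcurrentThreads can be illustrated with a simple bookkeeping model. This is a sketch under stated assumptions, not the ITPilot Thread object: BoundedQueueModel is a hypothetical name, nothing is actually run in parallel, and it assumes that queued requests are started in the order in which they arrived.

```javascript
// Hypothetical bookkeeping model, not the ITPilot Thread object: it only tracks
// which requests run and which wait when a concurrency limit is in place.
function BoundedQueueModel(maxConcurrent) {
  this.maxConcurrent = maxConcurrent;
  this.running = [];
  this.queued = [];
}

// A new request starts at once if a slot is free; otherwise it is queued.
BoundedQueueModel.prototype.execute = function (taskName) {
  if (this.running.length < this.maxConcurrent) {
    this.running.push(taskName);
  } else {
    this.queued.push(taskName);
  }
};

// When a running request finishes, the oldest queued request takes its slot
// (assumption: first in, first out).
BoundedQueueModel.prototype.finish = function (taskName) {
  var index = this.running.indexOf(taskName);
  if (index === -1) {
    throw new Error("Task is not running: " + taskName);
  }
  this.running.splice(index, 1);
  if (this.queued.length > 0) {
    this.running.push(this.queued.shift());
  }
};

var model = new BoundedQueueModel(2);
model.execute("record1");
model.execute("record2");
model.execute("record3"); // queued: both slots are busy
```

Calling model.finish("record1") afterwards frees a slot, and the queued request for "record3" starts.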
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is required only when the output type defined by the mycustom_getOutputType function is RECORD_TYPE or LIST_TYPE.
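As an illustration of the naming scheme, the following sketch shows two of these functions for a hypothetical component called "shout" that returns a string. It is not taken from the product: the component name and its behaviour are invented for the example, the input is treated as a plain JavaScript value (an assumption), and shout_getInputStructure is left out because the format it must return is not described in this section.

```javascript
// Output type code taken from the list of possible values (STRING_TYPE = 13).
var STRING_TYPE = 13;

// Main function of a hypothetical component named "shout".
// Assumption: the input is handled as a plain JavaScript value; how ITPilot
// actually passes the input structure is not covered by this sketch.
function shout_main(shout_input) {
  var shout_output = null;
  if (shout_input !== null && shout_input !== undefined) {
    shout_output = String(shout_input).toUpperCase();
  }
  return shout_output;
}

// Declares that the component returns a string value.
function shout_getOutputType() {
  return STRING_TYPE;
}
```

Because the output type is neither RECORD_TYPE nor LIST_TYPE, this sketch would not need a shout_getOutputStructure function.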
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward: the VQL statement simply has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
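The escaping rule in the note can be sketched with a small helper that prepares the JavaScript code before it is embedded in the statement. This is a hypothetical utility, not part of ITPilot or of any VQL tooling, and it rests on two assumptions that should be checked against [VQL]: that the code is delimited by double quotes and that the escape character is a backslash.

```javascript
// Hypothetical helper, not part of ITPilot or VQL tooling.
// Assumptions: the JavaScript code is delimited by double quotes and internal
// double quotes are escaped with a backslash.
function buildCreateWrapperStatement(name, jsCode) {
  var escaped = jsCode.replace(/"/g, '\\"');
  return 'CREATE WRAPPER ITP ' + name + ' "' + escaped + '"';
}

var statement = buildCreateWrapperStatement(
  "sample_wrapper",            // hypothetical wrapper name
  'var greeting = "hello";'    // code containing internal quotes
);
```

The helper only deals with quotes; whether other characters also need escaping is not covered here.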
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.11 Extractor
• Object: Extractor
• Description: is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. It returns a list of records of the type given in the constructor's structure parameter.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern-merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): sets the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.
5.3.12 Fetch
• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component and returns the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state: ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements, except for the one that caused the error.
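The behaviour described in the note can be sketched in plain JavaScript as a filter that drops any record whose evaluation fails and keeps processing the rest. This is an illustration only: filterIgnoringErrors, the sample records and the condition are hypothetical and do not use the ITPilot Filter component or its expression syntax.

```javascript
// Hypothetical helper, not the ITPilot Filter component: keeps the records that
// satisfy `condition` and skips any record whose evaluation throws an error.
function filterIgnoringErrors(inputRecords, condition) {
  var accepted = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (condition(inputRecords[i])) {
        accepted.push(inputRecords[i]);
      }
    } catch (error) {
      // Ignore-style handling: drop the offending record and carry on.
    }
  }
  return accepted;
}

var records = [
  { title: "A", price: 10 },
  { title: "B", price: null }, // evaluating this record fails
  { title: "C", price: 30 },
  { title: "D", price: 5 }
];

var cheap = filterIgnoringErrors(records, function (record) {
  if (record.price === null) {
    throw new Error("price is missing");
  }
  return record.price <= 10;
});
```

Here cheap contains records "A" and "D": record "C" does not meet the condition and record "B" is left out because its evaluation raised an error.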
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each iteration.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field element has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same value.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each values element in the event of replicated values.
    • equal: Boolean value that indicates whether the field values must match exactly those provided to the function or may simply contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether the page state is to be recovered by issuing a POST message to the page URL, containing the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
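The CLICKED_AND_NON_CLICKED_ELEMENT constant used by selectMultiplePositions, selectMultipleTexts and click is described as generating two combinations per element. The sketch below shows how such states could expand into form combinations. It is a hypothetical illustration in plain JavaScript: the constant values are stand-ins (their real values are not given here), and it assumes that the combinations are the Cartesian product of the states allowed for each element.

```javascript
// Stand-ins for the ITPilot constants; the real values are not given in this guide.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non_clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Hypothetical helper: lists every marked/unmarked combination implied by a
// clickedArray. Assumption: the combinations are the Cartesian product of the
// states allowed for each element.
function expandClickStates(clickedArray) {
  var combinations = [[]];
  for (var i = 0; i < clickedArray.length; i++) {
    var options;
    if (clickedArray[i] === CLICKED_AND_NON_CLICKED_ELEMENT) {
      options = [true, false];
    } else if (clickedArray[i] === CLICKED_ELEMENT) {
      options = [true];
    } else {
      options = [false];
    }
    var next = [];
    for (var j = 0; j < combinations.length; j++) {
      for (var k = 0; k < options.length; k++) {
        next.push(combinations[j].concat([options[k]]));
      }
    }
    combinations = next;
  }
  return combinations;
}

var combos = expandClickStates([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT
]);
```

In this example only the second element contributes two states, so combos holds two combinations; each additional element flagged with CLICKED_AND_NON_CLICKED_ELEMENT would double that number.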
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool by means of a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL that the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL that the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL that the page comes from.
    • cookie: stored information ("cookies").
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record that represents the wrapper initialization parameters.
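The meaning of the OPTIONAL, OBLIGATORY and FIXED values of the obl parameter can be sketched as a small resolution routine. This is plain JavaScript written for illustration, not the Init component: resolveInitRecord, the field names and the string values of the constants are hypothetical, and the sketch assumes that a FIXED field ignores any value supplied in the query and that an optional field without a value is left as null.

```javascript
// Stand-ins for the ITPilot constants; the real values are not given in this guide.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper, not the Init component: builds the initialization record
// from field definitions and the values supplied in a query.
function resolveInitRecord(fields, queryValues) {
  var record = {};
  for (var i = 0; i < fields.length; i++) {
    var field = fields[i];
    var obl = field.obl === undefined ? OPTIONAL : field.obl; // OPTIONAL is the default
    if (obl === FIXED) {
      record[field.name] = field.fixedValue; // assumption: the query value is ignored
    } else if (Object.prototype.hasOwnProperty.call(queryValues, field.name)) {
      record[field.name] = queryValues[field.name];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    } else {
      record[field.name] = null; // assumption: unset optional fields are null
    }
  }
  return record;
}

var fields = [
  { name: "KEYWORD", obl: OBLIGATORY },
  { name: "MAX_PRICE" },
  { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
var initRecord = resolveInitRecord(fields, { KEYWORD: "laptop" });
```

A query that omits KEYWORD would be rejected in this sketch, whereas MAX_PRICE can be left out and COUNTRY always takes its fixed value.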
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
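The hasNext()/next() contract of the Iterator component follows the usual iterator pattern, which can be tried out in plain JavaScript with a stand-in object. ListIteratorSketch below is hypothetical and is not the ITPilot Iterator; in particular, raising an error when next() is called past the end of the list is an assumption.

```javascript
// Hypothetical stand-in for the Iterator component, written in plain JavaScript.
function ListIteratorSketch(list) {
  this.list = list;
  this.position = 0;
}

// True while at least one more record is left.
ListIteratorSketch.prototype.hasNext = function () {
  return this.position < this.list.length;
};

// Returns the next record, in list order.
ListIteratorSketch.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error("No more records"); // assumption: behaviour past the end
  }
  return this.list[this.position++];
};

var iterator = new ListIteratorSketch([{ id: 1 }, { id: 2 }, { id: 3 }]);
var visited = [];
while (iterator.hasNext()) {
  visited.push(iterator.next().id);
}
```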
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source accessible through JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 47
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
5.3.19 Loop
• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE … DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: this indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: this indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that leads back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper execution, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper execution.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: this creates a browsing sequence from the results contained in a record. It allows sequences to be created to access other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator runs in parallel with it.
    • inputPage: optional; this allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7 Using the Repeat function
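The practical difference between the while form used by Loop (section 5.3.19) and the do … while form used by Repeat is when the condition is first evaluated. The following is a minimal sketch of that difference only. ConditionStub is a hypothetical stand-in written for this example so that it runs outside ITPilot; the real Condition object is built from an ITPilot expression, not from a JavaScript function.

```javascript
// Stand-in for the ITPilot Condition object (hypothetical): exec() simply
// evaluates the predicate function it was built with.
function ConditionStub(predicate) {
    this.predicate = predicate;
}
ConditionStub.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// A condition that is never satisfied.
var alwaysFalse = new ConditionStub(function () { return false; });

// Loop-style (while): the condition is checked first, so the body never runs.
var loopRuns = 0;
while (alwaysFalse.exec([])) {
    loopRuns++;
}

// Repeat-style (do ... while): the body runs once before the first check.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (alwaysFalse.exec([]));
```

With a condition that is false from the start, the while body is skipped entirely, whereas the do … while body is executed exactly once.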
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position the component occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; this indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that leads back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: this stores in a file the contents received as the input parameter.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
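As an illustration of the layout described above, the following is a minimal sketch of a custom component file for a hypothetical component named “mycustom” that returns a string. Several things here are assumptions made for the example: the input is treated as a plain string, the upper-casing logic is invented, and STRING_TYPE is declared locally (with the value listed above) only so that the snippet runs on its own. The input structure function is left out because its body is not specified in this section, and no output structure function is needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Output type constant, with the value taken from the list of possible values.
// Declared here only so that the sketch is self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component.
// Assumption: mycustom_input can be treated as a string.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Invented logic for the example: return the input in upper case.
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Declares that the component returns a string value.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

The function names follow the <component>_main and <component>_getOutputType pattern, so a component with a different name would change only the prefix.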
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple; only the following VQL statement has to be written:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
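The escaping step mentioned in the note can be automated before the statement is assembled. The sketch below assumes that the delimiting quotes are double quotes and that the escape character is a backslash; both function names and the wrapper name are invented for the example. It escapes quotes only, since this guide does not say how backslashes already present in the code should be treated.

```javascript
// Escapes every double quote in the JavaScript source with a backslash.
// Assumption: double quotes are the delimiter that has to be escaped.
function escapeQuotesForVql(jsCode) {
    return jsCode.replace(/"/g, "\\\"");
}

// Builds the CREATE WRAPPER statement around the escaped code
// (hypothetical helper; the MAINTENANCE clause is omitted).
function buildCreateWrapper(name, jsCode) {
    return "CREATE WRAPPER ITP " + name + " \"" + escapeQuotesForVql(jsCode) + "\"";
}

// Invented one-line script containing internal quotes.
var statement = buildCreateWrapper("sample_wrapper", "var greeting = \"hi\";");
```

Generating the statement this way avoids escaping long scripts by hand, where a single missed quote would terminate the code literal early.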
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.12 Fetch
• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component and returns the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): lets the user set the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes an NSEQL Back() command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that leads back to the source page, against which further information extraction operations are going to be executed.
    • back: NSEQL back sequence program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when an NSEQL Back() command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter will return the list of filtered elements, except for the one that caused the error.
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: this generates an execution loop over a specific form, where predetermined values for each of the fields involved are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is invoked iteratively.
    • parallelIterator: if “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element must be marked, not marked, or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element must be marked, not marked, or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each element of values in the event of replicated values.
    • equal: boolean value that indicates whether the field values must match exactly those provided to the function or may merely contain them.
  o click(field, value, state): function that selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): function that indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be used in the field.
  o toList(): returns the list of the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): update method for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. Returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    ▪ flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    ▪ back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    ▪ reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    ▪ 0: default browser implementation.
    ▪ 1: Internet Explorer browser implementation.
    ▪ 2: Firefox browser implementation.
    ▪ 3: Denodo HTTP browser implementation.
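The click, input and textarea functions above each supply a list of alternatives for a form field, and CLICKED_AND_NON_CLICKED_ELEMENT supplies two states for a checkbox. One way to picture how such alternatives could multiply into separate iterations is a Cartesian product. The sketch below is plain JavaScript under that assumption: expandCombinations and the field names are hypothetical and are not part of ITPilot, and the guide does not state that ITPilot computes its iterations this way.

```javascript
// Illustrative only: expandCombinations is a hypothetical helper, not an ITPilot function.
// Each key is a form field name; each value is the list of alternatives to try for it.
function expandCombinations(fieldAlternatives) {
  let combos = [{}];
  for (const [field, alternatives] of Object.entries(fieldAlternatives)) {
    const next = [];
    for (const partial of combos) {
      for (const value of alternatives) {
        next.push({ ...partial, [field]: value });
      }
    }
    combos = next;
  }
  return combos;
}

// A text input with two values and a checkbox tried both marked and unmarked
// (the CLICKED_AND_NON_CLICKED_ELEMENT case) give 2 x 2 candidate iterations.
const combos = expandCombinations({
  query: ["denodo", "itpilot"], // hypothetical field names
  onlyTitles: [true, false],
});
console.log(combos.length);
```

Under this reading, a limit such as the one set with setMaxIterations would simply cap how many of these combinations are run.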
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    ▪ browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    ▪ pageType: type of browser used to access the page:
      ▪ SEQUENCE_IEBROWSER = 1
      ▪ SEQUENCE_HTTP_BROWSER = 2
    ▪ lastURL: last URL, where the page is coming from.
    ▪ lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    ▪ lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    ▪ cookie: information stored in “cookies”.
    ▪ proxyUser: user name to access the proxy, if required.
    ▪ proxyPassword: user password to access the proxy, if required.
    ▪ proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    ▪ input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    ▪ output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    ▪ name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    ▪ field: name of the field to create.
    ▪ format: representation format of the date field. The format is optional, but it becomes compulsory if the field is completed; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    ▪ obl: indicates whether a value for the field is compulsory in queries. The possible values are:
      ▪ OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      ▪ OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      ▪ FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    ▪ fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    ▪ name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    ▪ i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
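The three obligatoriness values described for the set functions can be illustrated with a small stand-in. The sketch below is plain JavaScript that runs outside ITPilot: InitSketch is a hypothetical class, its constructor takes the query values directly (unlike the real Init constructor), and the handling of a missing optional value as null is an assumption.

```javascript
// Stand-in for illustration; the real Init object is provided by ITPilot.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitSketch {
  constructor(queryValues) {
    this.queryValues = queryValues; // values supplied by the calling application
    this.fields = [];
  }
  setText(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field, obl, fixedValue });
  }
  exec() {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant value, whatever the query says
      } else if (field in this.queryValues) {
        record[field] = this.queryValues[field];
      } else if (obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + field);
      } else {
        record[field] = null; // optional and not supplied (assumed behaviour)
      }
    }
    return record;
  }
}

const init = new InitSketch({ TITLE: "Dune" }); // hypothetical query values
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR"); // OPTIONAL by default
init.setText("LANG", FIXED, "en");
const initRecord = init.exec();
console.log(initRecord);
```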
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    ▪ list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. Returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is an ordered sequence of records.
The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
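The hasNext()/next() contract used in the loop above can be tried outside ITPilot with a stand-in. ListIteratorSketch below is a hypothetical class that only mimics the two functions over a plain array; the record contents are made up for illustration.

```javascript
// Stand-in with the same two-function contract as the Iterator component.
class ListIteratorSketch {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  hasNext() {
    return this.position < this.list.length;
  }
  next() {
    return this.list[this.position++];
  }
}

const iterator = new ListIteratorSketch([{ ID: 1 }, { ID: 2 }, { ID: 3 }]);
const seen = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  seen.push(recordInstance.ID); // per-record processing goes here
}
console.log(seen);
```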
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    ▪ uuid: unique identifier of the component.
    ▪ uri: connection URL to the database.
    ▪ driver: driver class used to connect to the data source.
    ▪ userName: user name.
    ▪ password: user password.
    ▪ structure: structure of the component’s output record list. It is defined as a record of values.
    ▪ baseRecords: record list to be used.
    ▪ maxPoolSize: maximum number of connections that the pool can manage at the same time.
    ▪ initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    ▪ checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    ▪ query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    ▪ query: SQL query that returns the results required by the component.
    ▪ baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    ▪ maxPoolSize: maximum number of connections that the pool can manage at the same time.
    ▪ initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    ▪ pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    ▪ propname: property name.
    ▪ propvalue: property value.
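The three pool settings (maximum size, initial size and a check query) interact in a way that a small model can make concrete. The sketch below is a generic connection-pool illustration in plain JavaScript, not the pool that ITPilot uses: PoolSketch, its callbacks and the connection objects are all hypothetical, and the check callback merely plays the role of the check query.

```javascript
// Illustration only: a tiny pool driven by the three settings named above.
class PoolSketch {
  constructor(maxPoolSize, initialPoolSize, checkConnection, openConnection) {
    this.maxPoolSize = maxPoolSize;
    this.checkConnection = checkConnection; // plays the role of the check query
    this.openConnection = openConnection;
    this.idle = [];
    this.inUse = 0;
    // Pre-open the initial idle connections so they are ready to be used.
    for (let i = 0; i < initialPoolSize; i++) this.idle.push(openConnection());
  }
  acquire() {
    while (this.idle.length > 0) {
      const candidate = this.idle.pop();
      if (this.checkConnection(candidate)) {
        this.inUse++;
        return candidate;
      }
      // Stale connection: drop it and try the next idle one.
    }
    if (this.inUse >= this.maxPoolSize) throw new Error("Pool exhausted");
    this.inUse++;
    return this.openConnection();
  }
  release(connection) {
    this.inUse--;
    this.idle.push(connection);
  }
}

let opened = 0;
const pool = new PoolSketch(
  2, // maxPoolSize
  1, // initialPoolSize
  (connection) => connection.alive,
  () => ({ id: ++opened, alive: true })
);
const first = pool.acquire(); // reuses the pre-opened idle connection
const second = pool.acquire(); // opens a new one, reaching maxPoolSize
console.log(first.id, second.id, opened);
```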
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6 Using the Loop function
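To try the pattern of Figure 6 outside ITPilot, the Condition object can be replaced by a stand-in. In the sketch below, LoopConditionSketch, the two constants and the counter are hypothetical: the real Condition evaluates an ITPilot condition expression, whereas the stand-in just calls a JavaScript predicate.

```javascript
// Stand-ins so the pattern runs outside ITPilot; the real Condition object,
// RUNTIME_ERROR and ON_ERROR_RAISE are supplied by the wrapper environment.
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";

class LoopConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  onError(errorType, action) {
    this.errorAction = action; // recorded only; the stand-in never fails
  }
  exec(inputs) {
    return this.predicate(inputs);
  }
}

let pagesVisited = 0;
const loop = new LoopConditionSketch(() => pagesVisited < 3);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
  pagesVisited++; // loop operations go here
}
console.log(pagesVisited);
```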
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    ▪ sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    ▪ iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    ▪ sequenceType: type of pool to use. The possible values are:
      ▪ SEQUENCE_IEBROWSER
      ▪ SEQUENCE_HTTP_BROWSER
      ▪ SEQUENCE_FTP
      ▪ SEQUENCE_LOCAL
    ▪ reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    ▪ inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    ▪ inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    ▪ inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    ▪ count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    ▪ count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    ▪ flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    ▪ back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    ▪ reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    ▪ 0: default browser implementation.
    ▪ 1: Internet Explorer browser implementation.
    ▪ 2: Firefox browser implementation.
    ▪ 3: HTTP browser implementation.
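The rule given for the sequences parameter (a single sequence is used in every iteration, several sequences are used one per iteration) can be written down as a small function. The helper below is hypothetical and is not part of ITPilot; the sequence strings are placeholders rather than real NSEQL, and returning null once the list is used up is an assumption, since the guide does not say what happens in that case.

```javascript
// Hypothetical helper mirroring the rule described for the sequences parameter.
function sequenceForIteration(sequences, iterationIndex) {
  if (sequences.length === 0) throw new Error("No browsing sequence given");
  if (sequences.length === 1) return sequences[0]; // reused in all iterations
  // Several sequences: one per iteration; null when none is left (assumption).
  return iterationIndex < sequences.length ? sequences[iterationIndex] : null;
}

const single = ["<next-page sequence>"]; // placeholder, not real NSEQL
const several = ["<sequence A>", "<sequence B>"];

console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2));
```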
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    ▪ structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    ▪ record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    ▪ recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    ▪ name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    ▪ fieldName: name of the field.
    ▪ expression: expression that defines the field, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    ▪ errorAction: action to take if the expression cannot be evaluated correctly. The possible values are:
      ▪ ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      ▪ ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
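The interplay between field expressions and the two error actions can be sketched in plain JavaScript. Everything below is a stand-in: buildRecord and resolveReference are hypothetical helpers, plain objects take the place of ITPilot records, and only bare references are handled, assuming they are written as $<index>.<FIELD>. The functions of the real expression language (see [GENER]) are not modelled, and leaving out a field whose expression fails is an assumption about what ignoring an error means.

```javascript
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves only plain references of the assumed form $<index>.<FIELD>.
function resolveReference(expression, records) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) throw new Error("Unsupported expression: " + expression);
  const record = records[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot evaluate " + expression);
  }
  return record[match[2]];
}

// Builds one output record from a list of field definitions.
function buildRecord(records, fieldDefinitions) {
  const output = {};
  for (const { fieldName, expression, errorAction } of fieldDefinitions) {
    try {
      output[fieldName] = resolveReference(expression, records);
    } catch (error) {
      if (errorAction === ON_ERROR_RAISE) throw error;
      // ON_ERROR_IGNORE: leave the field out and carry on (assumption).
    }
  }
  return output;
}

const built = buildRecord(
  [{ PARAM1: "abc" }, { PRICE: 10 }], // hypothetical input records
  [
    { fieldName: "NAME", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { fieldName: "MISSING", expression: "$1.NOPE", errorAction: ON_ERROR_IGNORE },
  ]
);
console.log(built);
```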
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    ▪ sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    ▪ sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    ▪ sequenceType: type of pool to use. The possible values are:
      ▪ SEQUENCE_IEBROWSER
      ▪ SEQUENCE_HTTP_BROWSER
      ▪ SEQUENCE_FTP
      ▪ SEQUENCE_LOCAL
    ▪ reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    ▪ inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    ▪ page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    ▪ browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
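The practical difference between the while form used by Loop and the do… while form used by Repeat is that the latter always runs its body at least once. The sketch below shows this with a stand-in: RepeatConditionSketch, runWhile and runRepeat are hypothetical names, and the stand-in condition simply returns the result of a JavaScript predicate instead of evaluating an ITPilot expression.

```javascript
// Stand-in for the Condition object, for illustration only.
class RepeatConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(inputs) {
    return this.predicate(inputs);
  }
}

// while form: the condition is checked before the first pass.
function runWhile(condition, body) {
  let runs = 0;
  while (condition.exec([])) {
    body();
    runs++;
  }
  return runs;
}

// do... while form: the body runs once before the condition is checked.
function runRepeat(condition, body) {
  let runs = 0;
  do {
    body();
    runs++;
  } while (condition.exec([]));
  return runs;
}

const alwaysFalse = new RepeatConditionSketch(() => false);
const whileRuns = runWhile(alwaysFalse, () => {});
const repeatRuns = runRepeat(alwaysFalse, () => {});
console.log(whileRuns, repeatRuns);
```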
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the generation graphic interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    ▪ sequence: NSEQL browsing program (see [NSEQL]).
    ▪ sequenceType: type of pool to use. The possible values are:
      ▪ SEQUENCE_IEBROWSER
      ▪ SEQUENCE_HTTP_BROWSER
      ▪ SEQUENCE_FTP
      ▪ SEQUENCE_LOCAL
    ▪ reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    ▪ inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    ▪ inputValues: list of values that can be used as input parameters within the browsing sequence.
    ▪ inputPage: optional parameter that describes the page from which the component’s browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    ▪ count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    ▪ mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    ▪ flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    ▪ back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    ▪ reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
    ▪ pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    ▪ 0: default browser implementation.
    ▪ 1: Internet Explorer browser implementation.
    ▪ 2: Firefox browser implementation.
    ▪ 3: Denodo HTTP browser implementation.
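The retry settings (a number of retries plus a delay in milliseconds between them) describe a common pattern that can be sketched independently of ITPilot. In the code below, runWithRetries and sleepMs are hypothetical helpers written for Node.js; treating the retry count as the number of additional attempts after the first failure is an assumption, as the guide does not spell this out.

```javascript
// Blocks the current thread for ms milliseconds (Node.js).
function sleepMs(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Hypothetical helper, not an ITPilot function: runs action once and, if it
// throws, up to `retries` more times, pausing `retryDelayMs` between attempts.
function runWithRetries(action, retries, retryDelayMs, sleep = sleepMs) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) sleep(retryDelayMs);
    }
  }
  throw lastError;
}

// A simulated navigation that fails twice before succeeding.
let calls = 0;
const page = runWithRetries(() => {
  calls++;
  if (calls < 3) throw new Error("page not loaded");
  return "result page";
}, 3, 10);
console.log(page, calls);
```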
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    ▪ content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    ▪ file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    ▪ generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    ▪ count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    ▪ mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    ▪ functionName: name of the JavaScript function to be run.
    ▪ <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    ▪ int: maximum number.
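The queueing behaviour described for setMaxConcurrentThreads can be modelled without real threads. The sketch below is a single-threaded simulation in plain JavaScript: ThreadSketch is a hypothetical class, and its execute takes a task that receives a completion callback, which differs from the real execute(functionName, arguments) signature. It only illustrates that requests beyond the limit wait until a running one finishes.

```javascript
// Single-threaded model of a concurrency limit with a waiting queue.
class ThreadSketch {
  constructor() {
    this.maxConcurrent = Infinity;
    this.running = 0;
    this.queue = [];
  }
  setMaxConcurrentThreads(max) {
    this.maxConcurrent = max;
  }
  // task is called with a `done` callback that it must invoke when finished.
  execute(task) {
    if (this.running < this.maxConcurrent) {
      this.start(task);
    } else {
      this.queue.push(task); // later requests wait for a free slot
    }
  }
  start(task) {
    this.running++;
    task(() => {
      this.running--;
      if (this.queue.length > 0) this.start(this.queue.shift());
    });
  }
}

const thread = new ThreadSketch();
thread.setMaxConcurrentThreads(2);
const started = [];
const finishers = [];
for (const name of ["a", "b", "c"]) {
  thread.execute((done) => {
    started.push(name);
    finishers.push(done);
  });
}
const startedBeforeFinish = started.slice(); // only "a" and "b" have started
finishers[0](); // "a" finishes, so the queued "c" starts
console.log(startedBeforeFinish, started);
```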
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    ▪ LIST_TYPE = 1
    ▪ PAGE_TYPE = 2
    ▪ RECORD_TYPE = 3
    ▪ SIMPLE_TYPE = 4
    ▪ ARRAY_TYPE = 5
    ▪ BINARY_TYPE = 6
    ▪ BOOLEAN_TYPE = 7
    ▪ DATE_TYPE = 8
    ▪ DOUBLE_TYPE = 9
    ▪ FLOAT_TYPE = 10
    ▪ INT_TYPE = 11
    ▪ LONG_TYPE = 12
    ▪ STRING_TYPE = 13
    ▪ URL_TYPE = 14
    ▪ BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
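As an illustration of how the four functions fit together, the following sketch fills them in for a made-up component that upper-cases a text field. It is plain JavaScript under several assumptions: plain objects stand in for ITPilot records and structures, the shape returned by the two structure functions is invented for the example, and the TEXT and UPPER_TEXT field names are hypothetical. A real component would build its structures with the record objects that the guide describes under Data Structures.

```javascript
// Output type codes taken from the list above (only the two used here).
const RECORD_TYPE = 3;
const STRING_TYPE = 13;

// Hypothetical input schema: a single text field (shape is an assumption).
function mycustom_getInputStructure() {
  return { TEXT: STRING_TYPE };
}

// The component returns a record, so an output structure is also needed.
function mycustom_getOutputType() {
  return RECORD_TYPE;
}

// Hypothetical output schema (shape is an assumption).
function mycustom_getOutputStructure() {
  return { UPPER_TEXT: STRING_TYPE };
}

// Main function: receives the input record and returns the output record.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  mycustom_output = { UPPER_TEXT: String(mycustom_input.TEXT).toUpperCase() };
  return mycustom_output;
}

const customResult = mycustom_main({ TEXT: "abc" });
console.log(customResult);
```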
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because no back sequence has been defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
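The ON_ERROR_IGNORE behaviour described in the note can be modelled in plain JavaScript. The sketch below is not the ITPilot Filter component: filterRecords, its predicate argument, the string constants and the sample records are hypothetical names chosen for illustration. It only shows the idea that a record whose evaluation fails is left out while the remaining records are still filtered.

```javascript
// Hypothetical stand-ins for the error-action constants (real values unknown).
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Model of the filtering step: keeps the records for which predicate() is true.
// If predicate() throws, the record is skipped (ignore) or the error is re-thrown (raise).
function filterRecords(inputRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e;
            }
            // ON_ERROR_IGNORE: drop only the record that caused the error.
        }
    }
    return result;
}

// Sample data: the second record cannot be evaluated.
var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];

function cheaperThan20(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE has no value");
    }
    return record.PRICE < 20;
}

var kept = filterRecords(records, cheaperThan20, ON_ERROR_IGNORE);
```

With ON_ERROR_RAISE as the third argument, the same call would stop at the second record instead of returning a partial list.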
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, where each run uses predetermined values for each of the fields included.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same value.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each element of values in the event of replicated values.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function (“true”) or may simply contain them (“false”).
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.
  o hasNext(): determines whether there are more results. Returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
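The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be illustrated with a small stand-alone sketch. It assumes that the states of the individual elements are combined as a Cartesian product, which this guide does not state explicitly. expandClickStates and the string values given to the three constants are hypothetical; this is not the Form_Iterator implementation.

```javascript
// Stand-in values for the constants named in this section (real values unknown).
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expands a clickedArray into every combination of marked (true) and
// unmarked (false) states. Assumption: elements combine as a Cartesian product.
function expandClickStates(clickedArray) {
    var combinations = [[]];
    for (var i = 0; i < clickedArray.length; i++) {
        var states;
        if (clickedArray[i] === CLICKED_AND_NON_CLICKED_ELEMENT) {
            states = [true, false];            // two combinations for this element
        } else {
            states = [clickedArray[i] === CLICKED_ELEMENT];
        }
        var next = [];
        for (var c = 0; c < combinations.length; c++) {
            for (var s = 0; s < states.length; s++) {
                next.push(combinations[c].concat([states[s]]));
            }
        }
        combinations = next;
    }
    return combinations;
}

// Three elements: always marked, both states, never marked.
var combos = expandClickStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
```

Under that assumption, each element flagged with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of form submissions, which is worth keeping in mind when choosing a value for setMaxIterations.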
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
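The three values of the obl parameter can be read as a small validation rule applied when a query reaches the wrapper. The following stand-alone sketch models that rule; it is not the Init component. buildInitRecord, the shape of the field definitions, the string constants and the field names QUERY, MAXPRICE and LANG are all hypothetical, and two behaviours are assumptions of this sketch: a FIXED field ignores any value supplied by the caller, and an OPTIONAL field with no value is stored as null.

```javascript
// Stand-in values for the constants named in this section (real values unknown).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Builds the initialization record from field definitions and the values
// supplied by the calling application.
function buildInitRecord(fieldDefs, queryValues) {
    var record = {};
    for (var i = 0; i < fieldDefs.length; i++) {
        var def = fieldDefs[i];
        var supplied = Object.prototype.hasOwnProperty.call(queryValues, def.field);
        if (def.obl === FIXED) {
            record[def.field] = def.fixedValue;          // assumption: caller value ignored
        } else if (supplied) {
            record[def.field] = queryValues[def.field];
        } else if (def.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + def.field);
        } else {
            record[def.field] = null;                    // assumption: OPTIONAL defaults to null
        }
    }
    return record;
}

// Hypothetical field definitions, one per obl value.
var defs = [
    { field: "QUERY", obl: OBLIGATORY },
    { field: "MAXPRICE", obl: OPTIONAL },
    { field: "LANG", obl: FIXED, fixedValue: "en" }
];

var initRecord = buildInitRecord(defs, { QUERY: "laptop" });
```

Calling buildInitRecord(defs, {}) raises an error in this model, because QUERY is declared OBLIGATORY and no value was supplied.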
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. Returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
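The hasNext()/next() protocol used in Figure 5 can be tried outside ITPilot with a stand-in object. ListIterator below is a hypothetical replacement written for this sketch, not the ITPilot Iterator component, and the TITLE field is an invented example. It only illustrates the sequential (non-parallel) traversal pattern.

```javascript
// Hypothetical stand-in for an iterator over a list of records.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one record has not been returned yet.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record, in list order.
ListIterator.prototype.next = function () {
    if (!this.hasNext()) {
        throw new Error("No more records");
    }
    return this.list[this.position++];
};

// Sequential traversal: always test hasNext() before calling next().
var iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
```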
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop that uses a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration through different inter-related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
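The rule given for the sequences parameter of Next_Interval_Iterator (a single sequence is reused in every iteration, several sequences are used one per iteration) can be expressed as a small selection function. This is a sketch of that rule only: sequenceForIteration is a hypothetical helper, the placeholder strings stand for real NSEQL programs, and returning null once the list is exhausted is an assumption rather than documented behaviour.

```javascript
// Picks the browsing sequence for a given iteration (0-based).
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 1) {
        return sequences[0];                 // single sequence: reused every time
    }
    if (iteration < sequences.length) {
        return sequences[iteration];         // several sequences: one per iteration
    }
    return null;                             // assumption: nothing left to run
}

// Placeholders standing in for NSEQL programs.
var single = ["<next_page_sequence>"];
var several = ["<sequence_for_page_2>", "<sequence_for_page_3>"];

var reused = sequenceForIteration(single, 5);
var second = sequenceForIteration(several, 1);
var exhausted = sequenceForIteration(several, 2);
```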
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken when the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although it may not be a good option if the previous iterator runs in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
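The shape of Figure 7 can be tried outside ITPilot with a small stand-in. The sketch below is not ITPilot code: the Condition defined here is a hypothetical substitute that takes a JavaScript predicate instead of an ITPilot condition expression, and runRepeat is an illustrative name. It shows that the body of a do… while loop always runs at least once, even when the condition is false from the start.

```javascript
// Stand-in for ITPilot's Condition (hypothetical: it takes a JavaScript
// predicate instead of an ITPilot condition expression).
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
    return Boolean(this.predicate.apply(null, inputs));
};

// Runs a do...while loop and returns how many times the body executed.
function runRepeat(maxPages) {
    var pagesVisited = 0;
    var repeat = new Condition(function () {
        return pagesVisited < maxPages;
    });
    do {
        pagesVisited++; // <loop_operations>
    } while (repeat.exec([]));
    return pagesVisited;
}
```

With this stand-in, runRepeat(3) executes the body three times, while runRepeat(0) still executes it once, because the condition is only evaluated after the first pass.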
5.3.26 Script
• Description: this component allows part of the logic that describes an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; describes the page from which the component's browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
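The descriptions of syncWithPost, setBackSequence and setBackPages can be read together as a precedence rule for returning to the source page. The sketch below expresses that reading as a plain JavaScript function. It is not ITPilot code: the function name, the settings object and the default of one back page are assumptions made only for illustration.

```javascript
// Illustrative only: decides how to return to the source page, following
// the precedence described for syncWithPost / setBackSequence / setBackPages.
function chooseBackNavigation(settings) {
    if (settings.syncWithPost) {
        // Default synchronization: re-issue the POST that produced the page.
        return { kind: "POST" };
    }
    if (settings.backSequence) {
        // An explicit NSEQL back program was configured.
        return { kind: "BACK_SEQUENCE", program: settings.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command.
    // Assumption: one page back when no value was configured.
    return { kind: "BACK_COMMAND", pages: settings.backPages || 1 };
}
```

For example, a settings object with syncWithPost set to false and no back sequence yields a BACK_COMMAND result carrying the configured number of back pages.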
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
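The queueing behaviour described for setMaxConcurrentThreads can be illustrated outside ITPilot. The sketch below is a stand-in written with standard JavaScript promises, not the ITPilot Thread object: TaskPool, its fields and its methods are hypothetical names, and the tasks are asynchronous functions rather than real threads. It only shows the idea that at most a fixed number of executions run at once, that later requests wait in a queue, and that a wait operation completes once everything has finished.

```javascript
// Hypothetical stand-in: runs asynchronous tasks with at most `max` in
// flight; extra tasks wait in a FIFO queue until a running one finishes.
function TaskPool(max) {
    this.max = max;
    this.running = 0;
    this.peak = 0;     // highest number of tasks observed in flight
    this.queue = [];   // starters of tasks that have not begun yet
    this.pending = []; // promises of every submitted task
}

// Submits fn(arg1, arg2, ...) for execution and returns a promise for it.
TaskPool.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    var self = this;
    var promise = new Promise(function (resolve, reject) {
        self.queue.push(function () {
            self.running++;
            self.peak = Math.max(self.peak, self.running);
            Promise.resolve()
                .then(function () { return fn.apply(null, args); })
                .then(resolve, reject)
                .then(function () {
                    self.running--;
                    self.drain();
                });
        });
    });
    this.pending.push(promise);
    this.drain();
    return promise;
};

// Starts queued tasks while there is free capacity.
TaskPool.prototype.drain = function () {
    while (this.running < this.max && this.queue.length > 0) {
        this.queue.shift()();
    }
};

// Counterpart of wait(): settles once every submitted task has finished.
TaskPool.prototype.wait = function () {
    return Promise.all(this.pending);
};
```

A caller would create the pool with the desired limit, call execute once per record, and then call wait before using the combined results.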
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is only necessary when the output type defined by the mycustom_getOutputType function is RECORD_TYPE or LIST_TYPE.
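As an illustration of the naming scheme above, the following sketch shows what the file of a custom component might look like. The component name “uppercase” and the assumption that the input arrives as a plain string are hypothetical. The STRING_TYPE constant is declared here only so that the sketch runs outside ITPilot (it is assumed, not confirmed, that ITPilot supplies these constants to real components), and the body of the input structure function is left as a placeholder because this guide does not show it.

```javascript
// Output type code taken from the list of possible values (STRING_TYPE = 13).
// Declared here only so the sketch is runnable outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical custom component "uppercase".
// Assumption: the input is a plain string (or null).
function uppercase_main(uppercase_input) {
    var uppercase_output = null;
    if (uppercase_input !== null && uppercase_input !== undefined) {
        uppercase_output = String(uppercase_input).toUpperCase();
    }
    return uppercase_output;
}

// Input schema definition. Its body is not shown in the guide, so it is
// left as a placeholder here.
function uppercase_getInputStructure() {
    // …
}

// Declares that the component returns a string.
function uppercase_getOutputType() {
    return STRING_TYPE;
}

// uppercase_getOutputStructure() is omitted: it is only needed when the
// output type is RECORD_TYPE or LIST_TYPE.
```

Following the rule that the file name matches the component name, this sketch would be saved as uppercase.js in the custom components directory.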
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.13 Filter
• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
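The effect described in the note can be illustrated with plain JavaScript. The sketch below is not the ITPilot Filter component: filterRecords is a hypothetical helper, the condition is a JavaScript function rather than a filter expression, and the two constants are declared locally with placeholder values only so that the example runs on its own.

```javascript
// Placeholder values for the error-handling modes named in this guide.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

// Returns the records that satisfy `predicate`. When evaluating a record
// fails and onError is ON_ERROR_IGNORE, only that record is left out.
function filterRecords(inputRecords, predicate, onError) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (onError !== ON_ERROR_IGNORE) {
                throw e; // any other mode propagates the error
            }
            // ON_ERROR_IGNORE: skip the offending record and continue.
        }
    }
    return result;
}

// Example input: record "b" has no PRICE field, so the predicate fails on it.
var sampleRecords = [
    { NAME: "a", PRICE: { amount: 10 } },
    { NAME: "b" },
    { NAME: "c", PRICE: { amount: 3 } },
    { NAME: "d", PRICE: { amount: 7 } }
];
function isExpensive(record) {
    return record.PRICE.amount > 5;
}
var kept = filterRecords(sampleRecords, isExpensive, ON_ERROR_IGNORE);
```

In this example kept contains the records named “a” and “d”. Record “b” is dropped because evaluating it raised an error, and record “c” is dropped because it does not meet the condition.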
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each element of values in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may simply contain them.
  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in “cookies”.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional but, if it is specified, it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
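The three values of the obl parameter can be illustrated with a short sketch. It is not the Init component: buildInitRecord and the field descriptor objects are hypothetical, the three constants are declared locally with placeholder values, and two behaviours are assumptions made for the example, namely that a FIXED field ignores any value supplied by the caller and that an OPTIONAL field without a value is stored as null.

```javascript
// Placeholder values for the obl constants named in this guide.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// fields: array of { name, obl, fixedValue } descriptors (hypothetical).
// query: object holding the values supplied by the calling application.
// Returns the resulting record of initialization parameters.
function buildInitRecord(fields, query) {
    var record = {};
    fields.forEach(function (f) {
        var supplied = Object.prototype.hasOwnProperty.call(query, f.name);
        if (f.obl === FIXED) {
            // Assumption: the constant wins over any supplied value.
            record[f.name] = f.fixedValue;
        } else if (f.obl === OBLIGATORY && !supplied) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            // Assumption: optional fields without a value become null.
            record[f.name] = supplied ? query[f.name] : null;
        }
    });
    return record;
}

var sampleFields = [
    { name: "QUERY", obl: OBLIGATORY },
    { name: "LANG", obl: FIXED, fixedValue: "en" },
    { name: "PAGE", obl: OPTIONAL }
];
var initRecord = buildInitRecord(sampleFields, { QUERY: "denodo", LANG: "fr" });
```

Here initRecord holds QUERY set to “denodo”, LANG set to the fixed value “en” and PAGE set to null, whereas calling buildInitRecord with an empty query fails because QUERY is obligatory.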
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
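For comparison with the parallel form in Figure 5, the sequential hasNext()/next() protocol can be tried with a stand-in. The sketch below is not the ITPilot Iterator: ListIterator is a hypothetical class that wraps a plain JavaScript array, and the TITLE field of the sample records is invented for the example.

```javascript
// Stand-in for the ITPilot Iterator (hypothetical): wraps a plain array.
function ListIterator(list) {
    this.list = list;
    this.index = 0;
}
// True while at least one element remains.
ListIterator.prototype.hasNext = function () {
    return this.index < this.list.length;
};
// Returns the next element, in list order.
ListIterator.prototype.next = function () {
    return this.list[this.index++];
};

var sampleList = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
var iterator = new ListIterator(sampleList);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // per-record processing goes here
}
```

After the loop, titles holds the three values in the same order as the input list, and hasNext() returns false.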
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: This component allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). It is implemented in JavaScript as a while loop whose exit condition is a Condition object (defined in section 5.3.5). To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6. Using the Loop function
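The WHILE… DO behaviour can be tried outside ITPilot with a small stand-in. The sketch below is only an illustration: StubCondition and runLoop are hypothetical names, and the stub evaluates a plain JavaScript predicate instead of an ITPilot condition expression. It shows that the condition is tested before every pass, so the body may run zero times.

```javascript
// Hypothetical stand-in for the ITPilot Condition object (section 5.3.5).
// The real object evaluates an ITPilot condition expression; this stub just
// calls a plain JavaScript predicate so the control flow can run anywhere.
function StubCondition(predicate) {
    this.exec = function (inputs) {
        return predicate(inputs);
    };
}

// WHILE ... DO: the condition is checked before each pass.
function runLoop(maxPasses) {
    var passes = 0;
    var loop = new StubCondition(function () {
        return passes < maxPasses;
    });
    while (loop.exec([])) {
        passes++; // <loop operations> would go here
    }
    return passes;
}
```

With maxPasses set to 0 the body never runs, which is the main difference from the Repeat component.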
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or a new browser is launched that keeps the session information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
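To make the field expressions more concrete, the following sketch resolves references of the form "$<index>.<FIELD>" against a list of plain JavaScript objects. It is not the ITPilot expression engine: resolveReference, buildRecord and the constant values are hypothetical, the dotted reference syntax is an assumption, and skipping a field on ON_ERROR_IGNORE is only one possible reading of "ignore the error".

```javascript
// Placeholder values for this sketch only; ITPilot defines the real constants.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

// Hypothetical resolver: the index selects an input record and the name
// selects a field inside that record.
function resolveReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (match === null) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[parseInt(match[1], 10)];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot resolve: " + expression);
    }
    return record[match[2]];
}

// Builds an output record field by field, applying the error action per field.
function buildRecord(recordsObj, fields) {
    var output = {};
    fields.forEach(function (f) {
        try {
            output[f.fieldName] = resolveReference(f.expression, recordsObj);
        } catch (e) {
            if (f.errorAction === ON_ERROR_RAISE) {
                throw e; // stop and report the failing expression
            }
            // ON_ERROR_IGNORE (assumed behaviour): skip the field and continue
        }
    });
    return output;
}
```

For example, buildRecord([{PARAM1: "a"}], [{fieldName: "F", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE}]) yields a record whose field F holds the value of PARAM1.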
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or a new browser is launched that keeps the session information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional. Indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: This component allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). It is implemented in JavaScript as a do… while loop whose exit condition is a Condition object (defined in section 5.3.5). To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7. Using the Repeat function
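The do… while form can be exercised with the same kind of stand-in. In this sketch StubCondition and runRepeat are hypothetical names and the stub evaluates a plain JavaScript predicate. It only illustrates that the body runs once before the condition is first tested.

```javascript
// Hypothetical stand-in for the ITPilot Condition object (section 5.3.5).
function StubCondition(predicate) {
    this.exec = function (inputs) {
        return predicate(inputs);
    };
}

// do ... while: the body runs first, then the condition decides
// whether another pass is made.
function runRepeat(maxPasses) {
    var passes = 0;
    var repeat = new StubCondition(function () {
        return passes < maxPasses;
    });
    do {
        passes++; // <loop_operations> would go here
    } while (repeat.exec([]));
    return passes;
}
```

Calling runRepeat(0) still performs one pass, because the condition is evaluated only after the body has run.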
5.3.26 Script
• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string or binary value that holds the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the caller on standby until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
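The usual calling pattern is to launch one execution per record and then wait for all of them. The sketch below reproduces only that pattern: StubThread is a hypothetical, single-threaded stand-in that runs the queued calls when wait() is invoked, it takes a function reference rather than a function name, and it does not provide real concurrency.

```javascript
// Hypothetical stand-in for the ITPilot Thread object. It records each
// execute() call and runs the queued calls, in order, when wait() is called.
function StubThread() {
    var pending = [];
    this.maxConcurrent = null;
    this.setMaxConcurrentThreads = function (n) {
        this.maxConcurrent = n; // stored only; the stub never runs in parallel
    };
    this.execute = function (fn) {
        var args = Array.prototype.slice.call(arguments, 1);
        pending.push(function () {
            fn.apply(null, args);
        });
    };
    this.wait = function () {
        while (pending.length > 0) {
            pending.shift()();
        }
    };
}

// One execution per record, then a single wait() before using the results.
function processAll(records) {
    var results = [];
    var thread = new StubThread();
    thread.setMaxConcurrentThreads(2);
    for (var i = 0; i < records.length; i++) {
        thread.execute(function (record) {
            results.push(record.toUpperCase());
        }, records[i]);
    }
    thread.wait();
    return results;
}
```

Reading the results only after wait() returns is the point of the pattern: before that call, some executions may not have finished.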
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be written in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is required only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
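As a rough illustration of these naming rules, the sketch below outlines a component called "mycustom" that trims a string. The trimming logic and the assumption that the input arrives as a plain string are invented for the example. The input and output structure functions are left out because their bodies depend on ITPilot structures not shown here; the output structure is not needed for a STRING_TYPE result.

```javascript
// Output type constant, with the value listed for STRING_TYPE in this section.
var STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component.
// Assumption for this sketch: the input is a plain string (or null).
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        // Remove leading and trailing whitespace.
        mycustom_output = String(mycustom_input).replace(/^\s+|\s+$/g, "");
    }
    return mycustom_output;
}

// Declares that the component returns a string.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

The file holding these functions would be named after the component, following the convention that each custom component lives in its own .js file.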
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8. Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
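The try… finally arrangement guarantees that the scope is closed even when the component fails. The following sketch demonstrates this with hypothetical stand-ins: the SCOPE object, the CUSTOM_COMPONENT constructor and the placeholder type value 0 are all invented here so that the pattern can run outside ITPilot.

```javascript
// Hypothetical stand-ins that only log calls; they are not the ITPilot objects.
var scopeLog = [];
var SCOPE = {
    create: function () { scopeLog.push("create"); },
    close: function () { scopeLog.push("close"); }
};

function CUSTOM_COMPONENT(type) {
    var name = null;
    this.setComponentName = function (n) { name = n; };
    this.exec = function (input) {
        if (name === null) {
            throw new Error("component name not set");
        }
        return name + "(" + input + ")";
    };
}

// Same shape as Figure 8: open the scope, run the component, always close.
function callComponent(componentName, input) {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT(0); // type value is a placeholder
        mycustom.setComponentName(componentName);
        return mycustom.exec(input);
    } finally {
        SCOPE.close(); // runs on success and on error alike
    }
}
```

After a failed call the log still ends with "close", which is the behaviour the finally block is there to provide.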
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement is written as follows:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
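The escaping described in the note can be automated with a small helper. This is a sketch under assumptions: escapeForVql and buildCreateWrapper are hypothetical names, the delimiter is passed in by the caller because the exact quote character is not confirmed here, and the backslash is assumed to be the escape character. Only quotes are escaped; how other characters are treated is not covered.

```javascript
// Prefixes every occurrence of the delimiting quote character with a backslash.
function escapeForVql(jsCode, quoteChar) {
    return jsCode.split(quoteChar).join("\\" + quoteChar);
}

// Assembles the statement text; the wrapper name and the script are supplied
// by the caller, and MAINTENANCE FALSE is added only on request.
function buildCreateWrapper(name, jsCode, quoteChar, maintenanceFalse) {
    var statement = "CREATE WRAPPER ITP " + name;
    if (maintenanceFalse) {
        statement += " MAINTENANCE FALSE";
    }
    return statement + " " + quoteChar + escapeForVql(jsCode, quoteChar) + quoteChar;
}
```

Keeping the delimiter as a parameter means the helper does not need to change if the statement turns out to use a different quote character.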
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.14 Form Iterator
• Object: Form_Iterator
• Description: generates a run loop for a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be invoked iteratively.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may simply contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result and "false" otherwise.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
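The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be seen with a short calculation. The sketch below is not ITPilot code: the constant values are placeholders and expandClickStates is a hypothetical helper that lists every marked/unmarked combination implied by a clickedArray, assuming the combinations are formed as a plain cross product.

```javascript
// Placeholder values for this sketch; ITPilot defines the real constants.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non_clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expands a list of per-element states into every combination of
// marked (true) and unmarked (false) elements.
function expandClickStates(clickedArray) {
    var combos = [[]];
    clickedArray.forEach(function (state) {
        // "both" contributes two options, the other states exactly one.
        var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
            ? [true, false]
            : [state === CLICKED_ELEMENT];
        var next = [];
        combos.forEach(function (combo) {
            options.forEach(function (marked) {
                next.push(combo.concat([marked]));
            });
        });
        combos = next;
    });
    return combos;
}
```

Under that assumption, each element set to CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations, so setMaxIterations can be useful to keep large forms bounded.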
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identifier.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL from which the page comes.
    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.
    • lastURLPostParameters: POST parameters of the URL from which the page comes.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory if completed; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
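The effect of the three obl values can be pictured with the following sketch. It is not the ITPilot implementation: the checkParameters function, the string stand-ins for the OPTIONAL, OBLIGATORY and FIXED constants, the field names and the rule that a FIXED field ignores any value supplied by the caller are all assumptions, made only to mimic the behaviour described above in code that runs outside ITPilot.

```javascript
// Hypothetical stand-ins for the ITPilot constants (real values are not documented here).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Field definitions, as they might be declared with setText/setInt (illustrative names).
var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE", obl: OPTIONAL },
    { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];

// Builds the initialization record for one query, applying the obl rules.
function checkParameters(fields, query) {
    var record = {};
    fields.forEach(function (field) {
        var supplied = Object.prototype.hasOwnProperty.call(query, field.name);
        if (field.obl === FIXED) {
            record[field.name] = field.fixedValue;   // constant value, always used
        } else if (supplied) {
            record[field.name] = query[field.name];  // value given by the caller
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        }
        // OPTIONAL fields without a value are simply left out of the record.
    });
    return record;
}

var record = checkParameters(fields, { KEYWORD: "laptop" });
```

Here the query supplies only KEYWORD, so the resulting record holds KEYWORD and the fixed COUNTRY value, while MAX_PRICE is absent; omitting KEYWORD would raise an error.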
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
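For comparison, the sequential (non-parallel) use of hasNext() and next() can be sketched as follows. The ListIterator stand-in, the TITLE field and the sample records are assumptions made so that the snippet runs outside ITPilot; inside a wrapper, the Iterator object provided by ITPilot would be used instead.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object, so the sketch runs on its own.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
// Returns true if there is at least one more record to visit.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
// Returns the next record, in list order, and advances the cursor.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Sample records (illustrative data only).
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];

var iterator = new ListIterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // process one record at a time
}
```

Each record is processed before the next one is fetched, whereas the structure in Figure 5 hands every record to a Thread object so that the processing can overlap.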
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iterating through different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): subsequently adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
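The practical difference between the while structure of the Loop component and the do... while structure of the Repeat component is that the body of the latter always runs at least once. A minimal sketch, assuming a hypothetical FakeCondition stand-in whose exec() simply evaluates a JavaScript function (it is not the ITPilot Condition object):

```javascript
// Hypothetical stand-in for Condition: exec() evaluates the given predicate.
function FakeCondition(predicate) {
    this.predicate = predicate;
}
FakeCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// A condition that is false from the very first evaluation.
var neverTrue = new FakeCondition(function () { return false; });

// Loop-style structure: the condition is checked before the body runs.
var loopRuns = 0;
while (neverTrue.exec([])) {
    loopRuns++;
}

// Repeat-style structure: the body runs once before the condition is checked.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (neverTrue.exec([]));
```

With a condition that never holds, the while body is skipped entirely (loopRuns stays at 0) and the do... while body runs exactly once (repeatRuns ends at 1).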
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the place the component holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
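Put together, a component file could look like the following sketch. The shape of the input (an object with a TEXT field) and the upper-casing logic are illustrative assumptions, and the body of mycustom_getInputStructure is left as a placeholder because the way an input schema is built is not covered in this section. Only the function names and the STRING_TYPE value come from the list above.

```javascript
// Output type constant, with the value given in the list of possible values.
// Defined locally only so that this sketch runs on its own.
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
// Assumption: the input is treated as an object exposing a TEXT field.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input && typeof mycustom_input.TEXT === "string") {
        mycustom_output = mycustom_input.TEXT.toUpperCase();
    }
    return mycustom_output;
}

// Input schema definition: placeholder only in this sketch.
function mycustom_getInputStructure() {
    return null; // to be replaced with the real input structure definition
}

// Output type: a string, so mycustom_getOutputStructure is not needed
// (it is required only for RECORD_TYPE or LIST_TYPE).
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Choosing a simple output type keeps the file to three functions; a component returning records or lists would also have to provide mycustom_getOutputStructure.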
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position of each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this purpose:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field element has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when there are several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position of each values element in the event of replicated values.
    • equal: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when there are several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when there are several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL where the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information stored in "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as one of the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional but, if it is specified, it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field being created is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
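The three values of the obl parameter can be illustrated with a small stand-in written in plain JavaScript. This is only a sketch under stated assumptions: InitSketch, its resolve() method and the string values given to the constants are hypothetical and are not part of the ITPilot API. The sketch also assumes that a FIXED field always takes its fixedValue, whatever the query supplies.

```javascript
// Hypothetical stand-ins for the ITPilot constants (the values are assumptions).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Plain-JavaScript stand-in that mimics how Init field declarations
// could be checked against the parameters of a wrapper query.
function InitSketch() {
    this.fields = [];
}

// Declares a text field; obl defaults to OPTIONAL, as described above.
InitSketch.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// Hypothetical method: builds the initialization record from the query parameters.
InitSketch.prototype.resolve = function (queryParams) {
    var record = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;            // constant value
        } else if (queryParams.hasOwnProperty(f.name)) {
            record[f.name] = queryParams[f.name];     // value given in the query
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;                    // optional and not supplied
        }
    }
    return record;
};

var init = new InitSketch();
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");                  // OPTIONAL by default
init.setText("LANG", FIXED, "en");
var initRecord = init.resolve({ TITLE: "Dune" });
```

In this sketch a query that omits TITLE is rejected, a query that omits AUTHOR is accepted, and LANG never needs to be supplied.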
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5 Using threads in the Iterator component
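The hasNext()/next() contract described above can be sketched in plain JavaScript. IteratorSketch below is a hypothetical stand-in, not the ITPilot Iterator object, and the records are ordinary objects invented for the example. It shows the sequential form of the loop, in which each record is processed directly instead of being handed to a Thread.

```javascript
// Hypothetical stand-in for the Iterator component: walks a list of
// records one by one, in list order.
function IteratorSketch(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one record is still pending.
IteratorSketch.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the pending record and advances to the following one.
IteratorSketch.prototype.next = function () {
    return this.list[this.position++];
};

var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];   // invented sample data
var iterator = new IteratorSketch(records);
var seenIds = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    seenIds.push(recordInstance.ID);                // per-record processing goes here
}
```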
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }
Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval, in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and generates new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
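The way a field reference selects a value from the recordsObj list can be sketched in plain JavaScript. The helper resolveReference is hypothetical and is not part of ITPilot; the sketch assumes that a reference has the form "$<index>.<FIELD>", with a zero-based index into the list, and it handles only bare references, not the full expression language of [GENER].

```javascript
// Hypothetical helper: resolves a reference of the assumed form
// "$<index>.<FIELD>" against the list of input records.
function resolveReference(reference, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(reference);
    if (match === null) {
        throw new Error("Not a field reference: " + reference);
    }
    var record = recordsObj[parseInt(match[1], 10)];
    if (record === undefined || !record.hasOwnProperty(match[2])) {
        throw new Error("Unknown field: " + reference);
    }
    return record[match[2]];
}

// Invented sample data: two input records.
var recordsObj = [{ PARAM1: "first" }, { PARAM1: "second", PRICE: 10 }];
var fromFirst = resolveReference("$0.PARAM1", recordsObj);   // field of the first record
var fromSecond = resolveReference("$1.PRICE", recordsObj);   // field of the second record
```

A reference to a missing record or field raises an error in this sketch, which corresponds to the situation that errorAction decides how to handle.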
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
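The practical difference between the while structure of the Loop component and the do... while structure of the Repeat component is the point at which the condition is evaluated. The sketch below uses a plain JavaScript function in place of the Condition object; conditionStub is hypothetical and simply evaluates to false every time.

```javascript
// Hypothetical stand-in for Condition.exec(): always evaluates to false.
function conditionStub() {
    return false;
}

// Loop style (while): the condition is checked before the body runs,
// so the body may not run at all.
var loopRuns = 0;
while (conditionStub()) {
    loopRuns++;
}

// Repeat style (do ... while): the body runs once before the first check.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (conditionStub());
```

With a condition that is false from the start, the while body is never executed and the do... while body is executed exactly once.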
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
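setRetries and setRetryDelay together describe a retry-with-delay policy. A minimal sketch of such a policy, independent of ITPilot, is given below. The helper runWithRetries and its wait argument are hypothetical, and the sketch assumes that count is the number of additional attempts made after the first failure, which the description above does not state.

```javascript
// Hypothetical helper: runs action(), retrying up to `retries` more times
// if it throws. `wait` is called with the delay (in milliseconds) between
// attempts; it is injected so that the sketch stays synchronous.
function runWithRetries(action, retries, delayMs, wait) {
    var attempt = 0;
    while (true) {
        try {
            return action();
        } catch (e) {
            if (attempt >= retries) {
                throw e;            // retries exhausted: propagate the failure
            }
            attempt++;
            wait(delayMs);          // pause before the next attempt
        }
    }
}

// Usage: an invented action that fails twice and then succeeds.
var calls = 0;
var waits = [];
var page = runWithRetries(function () {
    calls++;
    if (calls < 3) { throw new Error("navigation failed"); }
    return "page-" + calls;
}, 2, 500, function (ms) { waits.push(ms); });
```

Here the action succeeds on its third call, after two retries separated by the configured delay.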
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      ...
      return mycustom_output;
  }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
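The rule for when an output structure function is required can be written down as a small check. In the sketch below the constants repeat the values listed in this section and are redefined only so that the code runs on its own; needsOutputStructure is a hypothetical helper name used for illustration, and mycustom_getOutputType returns RECORD_TYPE purely as an example.

```javascript
// Output type constants with the values listed in this section.
// They are redefined here only so that the sketch runs on its own.
var LIST_TYPE = 1, PAGE_TYPE = 2, RECORD_TYPE = 3, STRING_TYPE = 13;

// Example output-type function of a custom component named "mycustom".
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Hypothetical helper: an output structure is only needed for
// record and list outputs.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}

var structureRequired = needsOutputStructure(mycustom_getOutputType());
```

A component that returns, for instance, STRING_TYPE or PAGE_TYPE would not need to define an output structure under this rule.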
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generate two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when the form contains several fields with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when the form contains several fields with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result and "false" otherwise.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identifier.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL, from which the page was reached.
    • lastURLMethod: access method (GET, POST) of the URL from which the page was reached.
    • lastURLPostParameters: POST parameters of the URL from which the page was reached.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process: if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field is required in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides several i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component and returns a record representing the wrapper initialization parameters.
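The OPTIONAL, OBLIGATORY and FIXED modes can be illustrated with a small standalone sketch. It does not use the ITPilot Init object: the field declarations, the resolveParameters helper, the numeric mode values and the plain-object record are all hypothetical stand-ins, and how ITPilot itself reports a missing obligatory parameter is not covered by this section.

```javascript
// Mode names mirror the values documented for the obl parameter.
// The numeric values are arbitrary and chosen only for this sketch.
var OPTIONAL = 0, OBLIGATORY = 1, FIXED = 2;

// Hypothetical declaration of three searchable parameters.
var fields = [
    { name: "AUTHOR", obl: OBLIGATORY },
    { name: "YEAR", obl: OPTIONAL },
    { name: "LANG", obl: FIXED, fixedValue: "en" }
];

// Builds the initialization record for one query (assumed behavior).
function resolveParameters(fields, query) {
    var record = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;   // constant value, the query is ignored
        } else if (query.hasOwnProperty(f.name)) {
            record[f.name] = query[f.name];  // value supplied by the caller
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;           // optional and not supplied
        }
    }
    return record;
}

var initRecord = resolveParameters(fields, { AUTHOR: "Borges", LANG: "es" });
```

In this sketch the LANG value sent by the caller is overridden by the fixed value, YEAR is left empty, and a query without AUTHOR is rejected.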
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5. Using threads in the Iterator component
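For comparison, the plain sequential use of hasNext() and next() can be sketched as follows. IteratorStub is a hypothetical stand-in written only so that the loop runs outside the ITPilot runtime; the plain-object records and the TITLE field are likewise assumptions.

```javascript
// Hypothetical stand-in for ITPilot's Iterator object.
// Only hasNext() and next() are modeled.
function IteratorStub(list) {
    this.list = list;
    this.pos = 0;
}
IteratorStub.prototype.hasNext = function () {
    return this.pos < this.list.length;
};
IteratorStub.prototype.next = function () {
    return this.list[this.pos++];
};

// Sequential traversal: visit each record in list order.
function collectTitles(records) {
    var iterator = new IteratorStub(records);
    var titles = [];
    while (iterator.hasNext()) {
        var recordInstance = iterator.next();
        titles.push(recordInstance.TITLE);
    }
    return titles;
}

var titles = collectTitles([{ TITLE: "first" }, { TITLE: "second" }]);
```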
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6. Using the Loop function
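The control flow of Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. ConditionStub below is hypothetical: the real Condition is built from a condition expression, whereas this stub takes a JavaScript predicate so that the sketch is runnable on its own. The page counter is also just an illustration.

```javascript
// Hypothetical stand-in for ITPilot's Condition object. The real constructor
// takes a condition expression; this stub takes a JavaScript predicate.
function ConditionStub(predicate) {
    this.predicate = predicate;
}
ConditionStub.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var page = 0;
var visited = [];

// Keep looping while fewer than three pages have been processed.
var loop = new ConditionStub(function () { return page < 3; });
while (loop.exec([])) {
    visited.push("page-" + page);   // placeholder for the loop operations
    page++;
}
```

Because the condition is evaluated before each pass, the body does not run at all if the condition is false on entry.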
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over different related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information.
    • inputPage: indicates the page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences for the next interval.
    • inputPage: indicates the page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
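The order in which syncWithPost, setBackSequence and setBackPages come into play can be summarized in a short decision function. This is only a sketch of the precedence described in this section: chooseBackStrategy, the config object, the returned shape and the default of one page are assumptions, not part of the ITPilot API.

```javascript
// Returns a description of how the previous page would be recovered,
// following the precedence described for syncWithPost and setBackSequence.
// The config object and the returned shape are assumptions of this sketch.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-issue the POST request with the original parameters.
        return { kind: "POST" };
    }
    if (config.backSequence) {
        // An explicit NSEQL back program was configured.
        return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // No explicit sequence: fall back to NSEQL Back(), browsing back the
    // configured number of pages (one page is assumed when none is set).
    return { kind: "NSEQL_BACK", pages: config.backPages || 1 };
}

var byPost = chooseBackStrategy({ syncWithPost: true, backSequence: "ignored" });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```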
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to be used as part of the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the contents of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information. In general this value will be "true", although this may not be a good option when the previous iterator runs in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7. Using the Repeat function
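The practical difference between the Loop and Repeat structures is when the condition is evaluated. The sketch below contrasts the two with a condition that is never met; makeCondition is a hypothetical helper that mimics only the exec() call of the Condition object.

```javascript
// Hypothetical stand-in for the Condition object: exec() simply returns
// the result of a JavaScript predicate.
function makeCondition(predicate) {
    return { exec: function (inputs) { return predicate(inputs); } };
}

var alwaysFalse = makeCondition(function () { return false; });

var whileRuns = 0;
while (alwaysFalse.exec([])) {   // Loop style: condition checked before the body
    whileRuns++;
}

var repeatRuns = 0;
do {                             // Repeat style: body runs, then the condition
    repeatRuns++;
} while (alwaysFalse.exec([]));
```

With the same condition, the while form never enters its body, whereas the do… while form runs its body once before the condition is checked.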
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the corresponding point in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the caller to wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
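The execute/wait calling pattern can be sketched with a stand-in object. ThreadStub is hypothetical and single-threaded: it queues the calls and runs them in order when wait() is invoked, so it shows only how the functions are combined, not real concurrency. Passing the function itself (rather than its name) to execute, and the processRecord function, are also assumptions of this sketch.

```javascript
// Hypothetical, single-threaded stand-in for ITPilot's Thread object.
// It records the calls and runs them in order when wait() is called,
// so it shows the calling pattern, not real concurrency.
function ThreadStub() {
    this.pending = [];
}
ThreadStub.prototype.execute = function (fn) {
    // Remaining arguments are forwarded to the function being launched.
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push(function () { fn.apply(null, args); });
};
ThreadStub.prototype.wait = function () {
    while (this.pending.length > 0) {
        this.pending.shift()();
    }
};

var results = [];
function processRecord(prefix, record) {
    results.push(prefix + record.ID);
}

var thread = new ThreadStub();
var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, "rec-", records[i]);
}
thread.wait();   // results should only be read after wait() returns
```

The point of the pattern is that the results of the launched executions are read only after wait() has returned.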
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
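As an illustration of the naming convention, the following sketch shows what the main and output-type functions of a component called "mycustom" could look like. It rests on several assumptions: the input is treated as a plain object with a TEXT field (ITPilot passes its own record structure), the type constants are redefined locally with the values listed above, needsOutputStructure is a hypothetical helper, and mycustom_getInputStructure is omitted because its body depends on ITPilot structure objects not covered here.

```javascript
// Output-type codes, redefined locally with the values listed in this section.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of the hypothetical component "mycustom".
// Assumption: the input behaves like a plain object with a TEXT field.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  if (mycustom_input && typeof mycustom_input.TEXT === "string") {
    mycustom_output = mycustom_input.TEXT.toUpperCase();
  }
  return mycustom_output;
}

// Declares that the component returns a string.
function mycustom_getOutputType() {
  return STRING_TYPE;
}

// Hypothetical helper: an output structure is needed only for records and lists.
function needsOutputStructure(type) {
  return type === RECORD_TYPE || type === LIST_TYPE;
}
```

Because this component returns STRING_TYPE, it would not need a mycustom_getOutputStructure function.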
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires writing a VQL statement of the following form:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
  o hasNext(): determines whether there are more results. Returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory; otherwise, the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component and returns a record representing the wrapper initialization parameters.
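The three values of the obl parameter can be pictured with a small stand-alone sketch. It does not use the Init object: resolveParameters, the field descriptors and the string constants are hypothetical, and the rule that a FIXED field always takes its fixedValue (ignoring any value in the query) is an assumption made for illustration.

```javascript
// Hypothetical stand-ins; in a wrapper these constants are provided by ITPilot.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Builds an initialization record from field descriptors and the query values.
function resolveParameters(fields, query) {
  const record = {};
  for (const { name, obl = OPTIONAL, fixedValue } of fields) {
    if (obl === FIXED) {
      // Assumption: a fixed field always carries its constant value.
      record[name] = fixedValue;
    } else if (name in query) {
      record[name] = query[name];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + name);
    } else {
      // Optional field without a value in this query.
      record[name] = null;
    }
  }
  return record;
}

const fields = [
  { name: "CITY", obl: OBLIGATORY },
  { name: "LANG", obl: FIXED, fixedValue: "en" },
  { name: "PAGE" } // defaults to OPTIONAL
];
const initRecord = resolveParameters(fields, { CITY: "Madrid" });
```

Here initRecord holds CITY from the query, LANG from its fixed value and PAGE as null; calling resolveParameters(fields, {}) would throw because CITY is obligatory.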
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. Returns "true" if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: creates loops in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
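The WHILE… DO behaviour can be tried outside ITPilot with a stand-in for the Condition object. In this sketch ConditionStub and collectPages are hypothetical names, and the stub simply evaluates a JavaScript predicate instead of an ITPilot condition expression.

```javascript
// Hypothetical stand-in for the ITPilot Condition object:
// exec() evaluates a plain JavaScript predicate.
class ConditionStub {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(inputs) {
    return Boolean(this.predicate(inputs));
  }
}

// Visits pages 1..maxPages, checking the condition before each iteration.
function collectPages(maxPages) {
  const visited = [];
  let page = 1;
  const loop = new ConditionStub(() => page <= maxPages);
  while (loop.exec([])) {
    visited.push(page);
    page++;
  }
  return visited;
}
```

Because the condition is checked before the first iteration, the body may never run: collectPages(0) returns an empty list.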
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages, reached through one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one, one sequence is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is given in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information.
    • inputPage: page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next element of the iteration.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds to the wrapper output the record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements, except for the one that caused the error.
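To make the role of the index in an expression concrete, the sketch below resolves simple field references against a recordsObj list. Everything in it is hypothetical: resolveReference and buildRecord are illustrative helpers, records are plain objects, only bare references of the form $<index> followed by a field name are handled (with or without a dot separator), and skipping a field under ON_ERROR_IGNORE is an assumption rather than documented behaviour. The real component evaluates full ITPilot expressions.

```javascript
// Hypothetical stand-ins for the ITPilot error-action constants.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a reference such as "$0.PARAM1": record index 0, field PARAM1.
function resolveReference(expression, recordsObj) {
  const match = /^\$(\d+)\.?([A-Za-z_]\w*)$/.exec(expression);
  if (!match) {
    throw new Error("Unsupported expression: " + expression);
  }
  const record = recordsObj[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot evaluate expression: " + expression);
  }
  return record[match[2]];
}

// Builds an output record from field definitions, honouring errorAction.
function buildRecord(recordsObj, fieldDefinitions) {
  const output = {};
  for (const { fieldName, expression, errorAction } of fieldDefinitions) {
    try {
      output[fieldName] = resolveReference(expression, recordsObj);
    } catch (error) {
      if (errorAction === ON_ERROR_RAISE) {
        throw error;
      }
      // Assumption: with ON_ERROR_IGNORE the failing field is skipped.
    }
  }
  return output;
}

const inputRecords = [{ PARAM1: "books" }, { PRICE: 10 }];
const built = buildRecord(inputRecords, [
  { fieldName: "CATEGORY", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
  { fieldName: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
  { fieldName: "MISSING", expression: "$2.X", errorAction: ON_ERROR_IGNORE }
]);
```

In this run, CATEGORY is taken from the first input record and COST from the second, while MISSING is left out because its reference points to a record that does not exist.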
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the functions offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: creates loops in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (!repeat.exec([]));
Figure 7: Using the Repeat function
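The REPEAT… UNTIL behaviour described above can be tried outside ITPilot with a stand-in for the Condition object. UntilCondition and repeatUntil are hypothetical names, the stub evaluates a plain JavaScript predicate, and the loop stops as soon as that predicate becomes true, following the prose description of the component.

```javascript
// Hypothetical stand-in for the ITPilot Condition object:
// exec() evaluates a plain JavaScript predicate.
class UntilCondition {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(inputs) {
    return Boolean(this.predicate(inputs));
  }
}

// Visits pages starting at 1 until the page counter exceeds the limit.
function repeatUntil(limit) {
  const visited = [];
  let page = 1;
  const repeat = new UntilCondition(() => page > limit);
  do {
    visited.push(page);
    page++;
  } while (!repeat.exec([])); // keep going until the condition is met
  return visited;
}
```

Since the condition is checked after each iteration, the body always runs at least once: repeatUntil(0) still visits page 1.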
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection is reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
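As a minimal sketch, the StoreFile functions listed above might be combined in a wrapper script as follows. The calls mirror the signatures in this section, but the StoreFile defined at the top is only a hypothetical stand-in that records its configuration so that the snippet runs outside ITPilot. The file path and the retry values are illustrative assumptions.

```javascript
// Hypothetical stand-in for the ITPilot StoreFile object. It only records
// how it was configured; the real component writes the file.
function StoreFile(content, file) {
    this.content = content;
    this.file = file;
    this.generateFilename = false;
    this.retries = 0;
    this.retryDelay = 0;
}
StoreFile.prototype.setGenerateFilename = function (generate) { this.generateFilename = generate; };
StoreFile.prototype.setRetries = function (count) { this.retries = count; };
StoreFile.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
StoreFile.prototype.exec = function () {
    // Stand-in behaviour: return a summary instead of touching the disk.
    return { file: this.file, length: this.content.length,
             retries: this.retries, retryDelay: this.retryDelay };
};

// Usage following the documented functions (path and values are assumptions).
var storeFile = new StoreFile("extracted page text", "/tmp/itpilot/output.txt");
storeFile.setGenerateFilename(false); // the file argument is already a full file name
storeFile.setRetries(3);              // retry up to three times on failure
storeFile.setRetryDelay(500);         // wait 500 ms between retries
var stored = storeFile.exec();
```

Configuring the retries before calling exec() keeps all the settings in one place; whether the real component requires that order is not stated in this section.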
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will run in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
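The fan-out pattern described above (one execution per record, then a single wait) can be sketched as follows. The Thread defined here is a hypothetical stand-in that runs each function synchronously so that the snippet works outside ITPilot; it does not reproduce real concurrency. Passing the function itself rather than its name, as well as the processRecord function and the record contents, are assumptions made for illustration.

```javascript
// Hypothetical stand-in for the ITPilot Thread object. It runs each function
// synchronously; the real component runs the functions concurrently.
function Thread() {
    this.maxConcurrentThreads = 1;
    this.finished = 0;
}
Thread.prototype.setMaxConcurrentThreads = function (max) { this.maxConcurrentThreads = max; };
Thread.prototype.execute = function (fn) {
    // Pass every argument after the function on to it.
    var args = Array.prototype.slice.call(arguments, 1);
    fn.apply(null, args);
    this.finished++;
};
// Nothing is ever pending in the stand-in, so wait() returns immediately.
Thread.prototype.wait = function () { return this.finished; };

// Per-record processing function (illustrative).
var results = [];
function processRecord(prefix, record) {
    results.push(prefix + record.title);
}

var records = [{ title: "A" }, { title: "B" }, { title: "C" }];
var thread = new Thread();
thread.setMaxConcurrentThreads(2);   // at most two executions in parallel
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, "item-", records[i]);
}
thread.wait();                       // continue only after every execution has finished
```

Calling wait() once after the loop, rather than after each execute(), is what allows the executions to overlap in the real component.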
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is required only when the output type defined by the mycustom_getOutputType function is RECORD_TYPE or LIST_TYPE.
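A minimal sketch of such a file, for a hypothetical component named "shout" that upper-cases a text value, could look like the following. The component name, the shape of the input (an object with a text property) and the placeholder returned by shout_getInputStructure are assumptions; STRING_TYPE is declared locally only so that the snippet runs outside ITPilot.

```javascript
// Output type constant as listed in this section; declared here only so the
// snippet is self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: shout_input is an object with a "text" property.
function shout_main(shout_input) {
    var shout_output = null;
    shout_output = String(shout_input.text).toUpperCase() + "!";
    return shout_output;
}

// Input schema. This plain object is only a placeholder; the real schema is
// built with the ITPilot data structures.
function shout_getInputStructure() {
    return { text: "string" };
}

// Output type: a simple string, so shout_getOutputStructure is not needed.
function shout_getOutputType() {
    return STRING_TYPE;
}
```

Because the output type is neither RECORD_TYPE nor LIST_TYPE, the sketch omits the getOutputStructure function.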
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires a VQL statement of the following form:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.15 Get Page
• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL that the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL that the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL that the page comes from.
    • cookie: information storage ("cookies").
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created to hold the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional but, once specified, it is compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; returns a record that represents the wrapper initialization parameters.
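As a hedged illustration of how the setter functions above declare the searchable parameters of a wrapper, the sketch below declares one obligatory, one optional, one fixed and one date field. The Init object and the OPTIONAL, OBLIGATORY and FIXED constants defined at the top are hypothetical stand-ins (their values and the array returned by exec() are assumptions) so that the snippet runs outside ITPilot; the field names are illustrative.

```javascript
// Hypothetical stand-ins so the snippet runs outside ITPilot. The constant
// values and the structure returned by exec() are assumptions.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";
function Init() { this.fields = []; this.i18n = null; }
Init.prototype.addField = function (type, field, obl, fixedValue, format) {
    // obl falls back to OPTIONAL, the documented default.
    this.fields.push({ name: field, type: type, obl: obl || OPTIONAL,
                       fixedValue: fixedValue, format: format });
};
Init.prototype.setText = function (field, obl, fixedValue) { this.addField("text", field, obl, fixedValue); };
Init.prototype.setInt = function (field, obl, fixedValue) { this.addField("int", field, obl, fixedValue); };
Init.prototype.setDate = function (field, format, obl, fixedValue) { this.addField("date", field, obl, fixedValue, format); };
Init.prototype.setI18n = function (i18n) { this.i18n = i18n; };
Init.prototype.exec = function () { return this.fields; };

// Declaring the wrapper parameters (field names are illustrative).
var init = new Init();
init.setI18n("ES_EURO");
init.setText("KEYWORD", OBLIGATORY);            // must be given in every query
init.setInt("MAX_RESULTS");                     // OPTIONAL by default
init.setText("LANGUAGE", FIXED, "en");          // constant value
init.setDate("SINCE", "dd/MM/yyyy", OPTIONAL);  // pattern syntax as in [DATEFORMAT]
var inputRecord = init.exec();
```

Note that setDate takes the format as its second argument, before obl, unlike the other setters.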
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results to iterate over. Returns "true" if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5: Using threads in the Iterator component
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. That number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. That number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions, defined in Appendix A of [GENER].
var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration over several inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is given in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions, defined in Appendix A of [GENER].
    • errorAction: action to take if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run and indicates the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript as a do ... while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions, defined in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    ...
} while (repeat.exec([]));
Figure 7: Using the Repeat function
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
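The Sequence functions above might be combined as in the following sketch, which configures retries before running the sequence and closes the connection in a finally block. The Sequence object defined at the top is a hypothetical stand-in that only records its configuration so that the snippet runs outside ITPilot; the NSEQL program is left as a placeholder (see [NSEQL] for the syntax), and closing the connection immediately after exec() is an assumption made for illustration.

```javascript
// Hypothetical stand-in for the ITPilot Sequence object; it records its
// configuration and does not drive a browser.
var SEQUENCE_IEBROWSER = 1;
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
    this.closed = false;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // Stand-in behaviour: describe the run instead of returning a real page.
    return { sequence: this.sequence, inputValues: inputValues };
};
Sequence.prototype.close = function () { this.closed = true; };

var nseql = "<NSEQL program>";       // placeholder, not real NSEQL
var sequence = new Sequence(nseql, SEQUENCE_IEBROWSER, false);
sequence.setRetries(2);              // retry twice on failure
sequence.setRetryDelay(1000);        // wait one second between retries
var page = null;
try {
    page = sequence.exec([]);        // no input values, no input page
} finally {
    sequence.close();                // always release the browser connection
}
```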
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also accepted as input, in which case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name is generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
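The call sequence for the Store File component can be sketched as follows. This is only an illustration: the StoreFile function defined here is a stand-in that records its configuration so the snippet runs outside ITPilot, its default values and the value returned by exec() are assumptions, and the content and file path are made up.

```javascript
// Stand-in for ITPilot's StoreFile so the call sequence can run outside ITPilot.
// It only records its configuration; the real component writes the file.
// The defaults below (false, 0, 0) are assumptions of this sketch.
function StoreFile(content, file) {
  this.content = content;
  this.file = file;
  this.generate = false;
  this.retries = 0;
  this.retryDelay = 0;
}
StoreFile.prototype.setGenerateFilename = function (generate) { this.generate = generate; };
StoreFile.prototype.setRetries = function (count) { this.retries = count; };
StoreFile.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
StoreFile.prototype.exec = function () {
  // Hypothetical return value, used here only to make the configuration visible.
  return {
    file: this.file,
    generate: this.generate,
    retries: this.retries,
    retryDelay: this.retryDelay
  };
};

// Wrapper-side usage: store a string, retrying up to 3 times, 500 ms apart.
var storeFile = new StoreFile("report contents", "/tmp/itpilot-output");
storeFile.setGenerateFilename(true);
storeFile.setRetries(3);
storeFile.setRetryDelay(500);
var result = storeFile.exec();
```

Inside a real wrapper only the last five statements would be needed, since ITPilot provides the StoreFile object.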
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used to process concurrently each of the records obtained in an extraction operation.
• Functions:
  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in the thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
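The queueing rule described for setMaxConcurrentThreads can be pictured with a small toy model. This is not ITPilot's Thread object and says nothing about how ITPilot schedules work internally: BoundedQueueModel and its methods are hypothetical names, and the first-in, first-out promotion of queued requests is an assumption of the sketch.

```javascript
// Toy model of the queueing rule described for setMaxConcurrentThreads.
// This is not ITPilot's Thread object; all names here are hypothetical.
function BoundedQueueModel(maxConcurrent) {
  this.maxConcurrent = maxConcurrent;
  this.running = [];   // requests currently executing
  this.queued = [];    // requests waiting for a free slot
}

// Accept a request: start it if a slot is free, otherwise queue it.
BoundedQueueModel.prototype.execute = function (name) {
  if (this.running.length < this.maxConcurrent) {
    this.running.push(name);
  } else {
    this.queued.push(name);
  }
};

// Mark one running request as finished and promote the oldest queued one.
BoundedQueueModel.prototype.finish = function (name) {
  var index = this.running.indexOf(name);
  if (index === -1) {
    return;
  }
  this.running.splice(index, 1);
  if (this.queued.length > 0) {
    this.running.push(this.queued.shift());
  }
};

var model = new BoundedQueueModel(2);
model.execute("record-1");
model.execute("record-2");
model.execute("record-3");            // no free slot: queued
var queuedBefore = model.queued.slice();
model.finish("record-1");             // frees a slot: record-3 starts
```

With a limit of 2, the third request waits until one of the first two finishes, which is the behavior the function description states for later requests.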
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is only necessary when the output type defined by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
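As an illustration of the naming scheme above, the following sketch shows part of what a component file could look like for a hypothetical component named "upper". It makes several assumptions: the input element is treated as a plain string, STRING_TYPE is declared locally with the value from the list above so that the snippet runs outside ITPilot, and upper_getInputStructure() is left out because the guide does not show its body.

```javascript
// Output type constant, copied from the list above so the file runs standalone.
var STRING_TYPE = 13;

// Main function of a hypothetical component named "upper".
// Assumption: the input element is treated here as a plain string.
function upper_main(upper_input) {
  var upper_output = null;
  if (upper_input !== null && upper_input !== undefined) {
    upper_output = String(upper_input).toUpperCase();
  }
  return upper_output;
}

// Declares the output type. STRING_TYPE is neither RECORD_TYPE nor LIST_TYPE,
// so upper_getOutputStructure() is not required for this component.
function upper_getOutputType() {
  return STRING_TYPE;
}
```

Following the rule in this section, such a file would be named after the component and stored in the itp-custom-components directory.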
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8. Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires the following VQL statement:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis/
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net/
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap/
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.16 Init
• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; it is used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process: if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. This format is optional, but becomes compulsory if completed; otherwise the wrapper may not run.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned with the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function that runs the component. It returns a record representing the wrapper initialization parameters.
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
Figure 5. Using threads in the Iterator component
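The hasNext()/next() protocol on its own, without threads, can be tried outside ITPilot with a small stand-in. In this sketch ArrayIterator is a hypothetical substitute for the Iterator object, backed by a plain array, and the records are plain JavaScript objects with a made-up TITLE field; ITPilot records may be accessed differently.

```javascript
// Stand-in for ITPilot's Iterator, backed by a plain array, so the loop runs anywhere.
function ArrayIterator(list) {
  this.list = list;
  this.position = 0;
}
// True while at least one element has not been returned yet.
ArrayIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};
// Returns the next element and advances the position.
ArrayIterator.prototype.next = function () {
  return this.list[this.position++];
};

// Hypothetical records; real ones come from an extraction component.
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];

var iterator = new ArrayIterator(records);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
```

The loop visits the records in list order, which matches the description of the list as a sorted sequence.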
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}
Figure 6. Using the Loop function
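The control flow of Figure 6 can be exercised outside ITPilot with a stand-in for the Condition object. In this sketch the stand-in wraps a JavaScript predicate instead of an ITPilot condition expression, exec() is assumed to return a boolean, the onError call is left out, and the counter variables are invented for the example.

```javascript
// Stand-in for ITPilot's Condition. Assumption: it wraps a JavaScript predicate
// instead of an ITPilot condition expression, and exec() returns a boolean.
function Condition(predicate) {
  this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
  return this.predicate(inputs);
};

var pagesVisited = 0;
var maxPages = 3;

// WHILE ... DO: the condition is checked before each pass.
var loop = new Condition(function () {
  return pagesVisited < maxPages;
});
while (loop.exec([])) {
  pagesVisited++;   // placeholder for the loop operations
}
```

Because the condition is evaluated before every pass, the body never runs if the condition is false from the start.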
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over several inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one, a different one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or a new browser is launched that keeps the session information.
    • inputPage: page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a post navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: component input record to be used as the wrapper result.
  o add(record): adds a record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
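How a positional reference such as the one in the expression example selects a field can be sketched in a few lines. The sketch assumes the reference has the form $<index>.<FIELD>, where the index selects an element of recordsObj; resolveFieldRef is a hypothetical helper, the records are plain objects with made-up fields, and ITPilot's own expression language (Appendix A of [GENER]) covers far more than this.

```javascript
// Resolve a reference of the form "$<index>.<FIELD>" against a list of records.
// Hypothetical helper: ITPilot evaluates such expressions internally, and its
// expression language supports more than plain field references.
function resolveFieldRef(expression, recordsObj) {
  var match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (match === null) {
    throw new Error("Unsupported expression: " + expression);
  }
  var record = recordsObj[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot resolve: " + expression);
  }
  return record[match[2]];
}

// Two made-up input records, in the order they would be passed to the constructor.
var recordsObj = [{ PARAM1: "madrid" }, { PRICE: 42 }];
var first = resolveFieldRef("$0.PARAM1", recordsObj);   // field of the first record
var second = resolveFieldRef("$1.PRICE", recordsObj);   // field of the second record
```

Throwing on an unresolvable reference corresponds loosely to ON_ERROR_RAISE; a caller that catches the error and carries on would mimic ON_ERROR_IGNORE.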
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or a new browser is launched that keeps the session information. In general this value will be “true”, although it may not be a good option if the previous iterator runs in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));
Figure 7. Using the Repeat function
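The practical difference between the while structure used by Loop and the do… while structure used by Repeat is when the condition is first evaluated. The sketch below shows this with a stand-in Condition that wraps a JavaScript predicate (an assumption, as real conditions are ITPilot expressions); runWhile and runDoWhile are hypothetical helper names.

```javascript
// Stand-in for ITPilot's Condition: wraps a JavaScript predicate (assumption).
function Condition(predicate) {
  this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
  return this.predicate(inputs);
};

// Loop-style structure: the condition is checked before each pass.
function runWhile(condition, body) {
  while (condition.exec([])) {
    body();
  }
}

// Repeat-style structure: the body runs once before the first check.
function runDoWhile(condition, body) {
  do {
    body();
  } while (condition.exec([]));
}

// A condition that is never satisfied makes the difference visible.
var never = new Condition(function () { return false; });
var whilePasses = 0;
var doWhilePasses = 0;
runWhile(never, function () { whilePasses++; });
runDoWhile(never, function () { doWhilePasses++; });
```

With a condition that is false from the start, the while form skips the body entirely, whereas the do… while form still executes it once.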
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated. When it is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 58
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 60
5329 Thread
bull Object Thread
bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished
o execute(functionName ltlist of argumentsgt) this launches the run thread on the described function
bull functionName name of the JavaScript function to be run
bull ltlist of argumentsgt list of arguments separated by commas which must match the arguments of the JavaScript function
o setMaxConcurrentThreads(int) allows to configure the maximum number of Thread instances that will be used in parallel Later requests will be queued until the ongoing executions finish
bull int maximum number
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
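The functions listed above can be put together in a single file. The following is a minimal sketch, assuming a hypothetical component named "mycustom" that receives a plain string and returns a string. The constant STRING_TYPE is copied from the list above and declared locally only so that the sketch runs outside ITPilot; the body of mycustom_main is illustrative. mycustom_getInputStructure is left out because the way an input schema is built is not shown in this section.

```javascript
// Stand-in for the output type code listed above (STRING_TYPE = 13).
// Declared here only so that the sketch runs outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical component "mycustom".
// Assumption: the input arrives as a plain string (or null).
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        mycustom_output = String(mycustom_input).toUpperCase();
    }
    return mycustom_output;
}

// The output is a simple string, so mycustom_getOutputStructure is not needed.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Because the output type is neither RECORD_TYPE nor LIST_TYPE, this sketch does not define mycustom_getOutputStructure.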
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following piece of code:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
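The try/finally structure of Figure 8 means that SCOPE.close() runs even when the component fails. The following sketch illustrates only that control flow. SCOPE and FailingComponent here are hypothetical stand-ins written for this example, not the objects that ITPilot provides, and the outer try/catch exists only so that the sketch can report the outcome.

```javascript
// Hypothetical stand-ins; in a wrapper, SCOPE and CUSTOM_COMPONENT come from ITPilot.
var events = [];
var SCOPE = {
    create: function () { events.push("create"); },
    close: function () { events.push("close"); }
};
function FailingComponent() {}
FailingComponent.prototype.exec = function () {
    events.push("exec");
    throw new Error("simulated component failure");
};

var caught = null;
try {
    try {
        SCOPE.create();
        var mycustom = new FailingComponent();
        mycustom.exec();
    } finally {
        SCOPE.close(); // runs even though exec() threw
    }
} catch (e) {
    caught = e.message;
}
console.log(events.join(" -> "));
```

The recorded sequence shows the scope being closed after the failed execution, before the error reaches the outer handler.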
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires writing the following VQL statement:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but becomes compulsory if a value is supplied for the field; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.
The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:
    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }
Figure 5: Using threads in the Iterator component
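The hasNext()/next() protocol described above can be tried outside ITPilot with a stand-in. The sketch below is not the ITPilot Iterator: ListIterator is a hypothetical class written for this example, plain JavaScript objects stand in for ITPilot records, and the TITLE field is invented for illustration.

```javascript
// Hypothetical stand-in for the ITPilot Iterator; it only mimics hasNext()/next().
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Plain objects stand in for ITPilot records.
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
var iterator = new ListIterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
console.log(titles.join(", "));
```

The loop visits the records in list order and stops as soon as hasNext() returns false.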
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }
Figure 6: Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next element of the iteration.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7: Using the Repeat function
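The Loop (section 5.3.19) and Repeat components differ in when the condition is evaluated: a while loop tests it before the first pass, whereas a do… while loop tests it after the first pass. The sketch below shows that difference in plain JavaScript. FakeCondition is a hypothetical stand-in whose exec() simply calls a JavaScript function; it says nothing about how the ITPilot Condition object evaluates its expression.

```javascript
// Hypothetical stand-in: exec() returns whatever the wrapped function returns.
function FakeCondition(test) {
    this.test = test;
}
FakeCondition.prototype.exec = function (inputs) {
    return this.test(inputs);
};

var alwaysFalse = new FakeCondition(function () { return false; });

// Loop style: the condition is checked first, so the body never runs here.
var loopPasses = 0;
while (alwaysFalse.exec([])) {
    loopPasses++;
}

// Repeat style: the body runs once before the condition is checked.
var repeatPasses = 0;
do {
    repeatPasses++;
} while (alwaysFalse.exec([]));

console.log(loopPasses, repeatPasses); // prints: 0 1
```

With a condition that is false from the start, the Loop-style body is skipped and the Repeat-style body runs exactly once.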
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection is reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name is generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name is generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
    • obl: parameter that indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory if completed; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
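The effect of the three obl values can be illustrated with a small stand-alone sketch. It does not use the ITPilot Init object: checkQueryValues, the fields array and the string constants are hypothetical names introduced only for this illustration, and the handling of a FIXED field (the query value is ignored) is an assumption.

```javascript
// Hypothetical stand-in, NOT the ITPilot Init component.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Field definitions in the spirit of setBoolean/setLink/setDate(field, obl, fixedValue).
var fields = [
    { name: "TITLE", obl: OBLIGATORY },
    { name: "IN_STOCK", obl: OPTIONAL },
    { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];

// Builds the initialization record for one query, or throws if an
// OBLIGATORY field was not given a value.
function checkQueryValues(fields, query) {
    var record = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;      // constant value (assumption: query value ignored)
        } else if (query.hasOwnProperty(f.name)) {
            record[f.name] = query[f.name];     // value supplied in the query
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;              // OPTIONAL and not supplied
        }
    }
    return record;
}

var initRecord = checkQueryValues(fields, { TITLE: "Dune" });
```

In this sketch a query that omits TITLE is rejected, while one that omits IN_STOCK is accepted and COUNTRY always carries its fixed value.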
5.3.17 Iterator
• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.
The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
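For comparison, sequential iteration follows the usual hasNext()/next() pattern. The sketch below runs outside ITPilot, so it uses StubIterator, a hypothetical stand-in that only mimics the three functions listed above; modelling records as plain objects is also an assumption.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object, for illustration only.
function StubIterator(list) {
    this.list = list;
    this.position = 0;
}
StubIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
StubIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Records are modelled as plain objects (an assumption of this sketch).
var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var iterator = new StubIterator(records);
var titles = [];

// Records are visited one by one, in list order.
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
```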
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
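The same control flow can be tried outside ITPilot with a stand-in for the Condition object. In the sketch below, StubCondition is a hypothetical class whose condition is a plain JavaScript predicate rather than an ITPilot condition expression; only the exec([]) call pattern is taken from the figure above.

```javascript
// Hypothetical stand-in for the ITPilot Condition object: the "condition" is a
// plain JavaScript predicate instead of an ITPilot condition expression.
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs) === true;
};

var page = 0;
var visited = [];
var loop = new StubCondition(function () { return page < 3; });

// WHILE ... DO: the condition is checked before every pass,
// so the body may run zero times.
while (loop.exec([])) {
    visited.push(page);
    page = page + 1;
}
```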
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() method is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
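The interplay between setRetries and setRetryDelay can be modelled with a short sketch. runWithRetries is a hypothetical helper, not part of the ITPilot API; it assumes that the retry count is the number of additional attempts after the first failure, and it receives the pause as an injected function so that the example stays synchronous.

```javascript
// Hypothetical helper, not part of the ITPilot API: it only models what
// setRetries(count) and setRetryDelay(ms) are expected to control.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e;               // retries exhausted: propagate the error
            }
            sleep(retryDelayMs);       // pause before the next attempt
            attempt = attempt + 1;
        }
    }
}

// Simulated page access that fails twice before succeeding.
var waits = [];
var nextPage = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("timeout"); }
        return "page-after-" + attempt + "-retries";
    },
    3,                                  // as in setRetries(3)
    500,                                // as in setRetryDelay(500)
    function (ms) { waits.push(ms); }   // stand-in for a real pause
);
```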
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): subsequently adds a component input record to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
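How a positional reference picks a field from the input list, and how the two error actions differ, can be sketched as follows. This is not ITPilot code: resolveReference and buildRecord are hypothetical helpers, only simple references of the form $<index><FIELD> (with or without a separating dot) are handled, and skipping a failing field under ON_ERROR_IGNORE is an assumption made for the illustration.

```javascript
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a positional reference against the list of input records.
// Index 0 is the first element of recordsObj.
function resolveReference(expression, recordsObj) {
    var match = /^\$(\d+)\.?([A-Za-z_]\w*)$/.exec(expression);
    if (!match) { throw new Error("Unsupported expression: " + expression); }
    var record = recordsObj[parseInt(match[1], 10)];
    if (!record || !record.hasOwnProperty(match[2])) {
        throw new Error("Cannot evaluate: " + expression);
    }
    return record[match[2]];
}

// Builds one output record from [fieldName, expression, errorAction] triples,
// mirroring the arguments of add(fieldName, expression, errorAction).
function buildRecord(recordsObj, definitions) {
    var output = {};
    for (var i = 0; i < definitions.length; i++) {
        var def = definitions[i];
        try {
            output[def[0]] = resolveReference(def[1], recordsObj);
        } catch (e) {
            if (def[2] === ON_ERROR_RAISE) { throw e; }
            // ON_ERROR_IGNORE: skip the failing field and continue (assumption).
        }
    }
    return output;
}

var inputs = [{ PARAM1: "alpha" }, { PRICE: 12 }];
var result = buildRecord(inputs, [
    ["NAME", "$0.PARAM1", ON_ERROR_RAISE],
    ["COST", "$1.PRICE", ON_ERROR_RAISE],
    ["MISSING", "$1.STOCK", ON_ERROR_IGNORE]
]);
```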
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional. It allows a home page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id, or a page used as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
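The practical difference from the Loop component is that a do… while body always runs at least once, because the condition is only checked after each pass. The sketch below shows this with plain JavaScript; the two counting functions and the predicates are hypothetical names used only for this comparison, and no ITPilot object is involved.

```javascript
// Counts how many times the body runs with a while loop (Loop component style).
function passesWithWhile(condition) {
    var passes = 0;
    while (condition(passes)) {
        passes = passes + 1;
    }
    return passes;
}

// Counts how many times the body runs with a do...while loop (Repeat component style).
function passesWithDoWhile(condition) {
    var passes = 0;
    do {
        passes = passes + 1;
    } while (condition(passes));
    return passes;
}

var neverTrue = function () { return false; };
var upToThree = function (passes) { return passes < 3; };

var whileNever = passesWithWhile(neverTrue);       // body never runs
var repeatNever = passesWithDoWhile(neverTrue);    // body runs once
var whileThree = passesWithWhile(upToThree);
var repeatThree = passesWithDoWhile(upToThree);
```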
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
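The order in which the back-navigation options are considered (POST synchronization first, then an explicit back sequence, then the NSEQL Back() command) can be summarized as a small decision function. chooseBackStrategy and the strategy labels are hypothetical, and the default of one back page is an assumption; the sketch only restates the rules given for syncWithPost, setBackSequence and setBackPages.

```javascript
// Hypothetical model of the back-navigation choice. It is not ITPilot code.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-issue the POST to the page URL with the original POST parameters.
        return { strategy: "POST" };
    }
    if (config.backSequence) {
        // An explicit NSEQL back program was given with setBackSequence.
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // No POST synchronization and no explicit back sequence: NSEQL Back(),
    // repeated for the number of pages set with setBackPages (default assumed: 1).
    return { strategy: "NSEQL_BACK", pages: config.backPages || 1 };
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>" });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```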
5.3.28 Store File
• Object: StoreFile
• Description: this stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the caller to wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
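The execute/wait call pattern can be tried with a single-threaded stand-in. StubThread below is hypothetical and runs nothing in parallel: execute() only queues the call and wait() drains the queue, which is enough to show that results should be read after wait() returns. Passing the function itself rather than its name is also an assumption of this sketch.

```javascript
// Hypothetical, single-threaded stand-in for the ITPilot Thread object. It only
// mirrors the call pattern: execute() queues work and wait() drains the queue.
function StubThread() {
    this.queue = [];
}
StubThread.prototype.execute = function (fn) {
    // Remaining arguments are forwarded to the function, in order.
    var args = Array.prototype.slice.call(arguments, 1);
    this.queue.push(function () { fn.apply(null, args); });
};
StubThread.prototype.wait = function () {
    while (this.queue.length > 0) {
        this.queue.shift()();
    }
};

var processed = [];
function processRecord(structureInstance, recordInstance) {
    processed.push(structureInstance.name + ":" + recordInstance.ID);
}

var thread = new StubThread();
var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, { name: "ITEM" }, records[i]);
}
var pendingBeforeWait = thread.queue.length;
thread.wait();   // after this call every queued execution has finished
```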
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is only necessary when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
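Put together, a custom component file might look like the following minimal sketch. The component name pricewithtax, the PRICE field and the shape returned by the input structure function are hypothetical; the type constants are declared locally only so that the sketch runs on its own, on the assumption that ITPilot provides them at run time.

```javascript
// Sketch of a custom component file, e.g. "pricewithtax.js" (hypothetical name).
// Type constants with the values listed in this section, declared here only
// to keep the sketch self-contained.
var RECORD_TYPE = 3;
var DOUBLE_TYPE = 9;

// Main function: receives the input element and returns the component output.
function pricewithtax_main(pricewithtax_input) {
    var pricewithtax_output = null;
    pricewithtax_output = pricewithtax_input.PRICE * 1.25;
    return pricewithtax_output;
}

// Input schema. The returned shape is an assumption; the guide does not show it.
function pricewithtax_getInputStructure() {
    return { PRICE: DOUBLE_TYPE };
}

// Output type. It is a simple double, so pricewithtax_getOutputStructure is
// not needed (it is only required for RECORD_TYPE or LIST_TYPE).
function pricewithtax_getOutputType() {
    return DOUBLE_TYPE;
}

var sampleOutput = pricewithtax_main({ PRICE: 100 });
```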
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward. The VQL statement simply has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 44
bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query
bull OBLIGATORY The parameter is obligatory in any query made on the wrapper
bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below
bull fixedValue optional parameter that indicates a constant value assigned to the field
o setName(name) update function for the component name
bull name new component name
o setI18n(i18n) function which updates the process i18n
bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot
o exec() main function for running the component returning a record representing the wrapper initialization parameters
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 45
5317 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate ldquotruerdquo is returned if there is at least one more result
o next() this returns the next iteration element The list is a sorted sequence of records
The ldquoParallel Executionrdquo option existing in the ITPilot graphic interface becomes the next JavaScript structure using the Thread object described in section 5329
var _thread0 = new Thread() while(iteratorhasNext()) recordInstance = iteratornext() _thread0execute(_functionIterator_1 structureInstance recordInstance)
Figure 5 Using threads in the Iterator component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 46
5318 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the componentrsquos output record list It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be manager by the browser pool at the same time
bull initialPoolSize initial number of browser pool connections A number of idle connections as established ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
bull query SQL query that returns the results required by the component
o exec(query baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be manager by the browser pool at the same time
bull initialPoolSize initial number of browser pool connections A number of idle connections as established ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 47
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 48
5.3.19 Loop
• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object acting as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
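The pattern in Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. In this minimal sketch the Condition constructor, which takes a JavaScript predicate instead of an ITPilot condition expression, and the string values given to RUNTIME_ERROR and ON_ERROR_RAISE are hypothetical; only the while structure follows the guide.

```javascript
// Hypothetical stand-in for ITPilot's Condition object (section 5.3.5).
// The real object evaluates an ITPilot condition expression.
function Condition(predicate) {
  this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
  this.errorAction = action; // recorded only; this stub never raises
};
Condition.prototype.exec = function (inputs) {
  return this.predicate(inputs);
};

// Assumed placeholder values for the constants used in Figure 6.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var page = 0;
var visited = [];

var loop = new Condition(function () { return page < 3; });
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
  visited.push(page); // stands for <loop operations>
  page = page + 1;
}
```

The condition is evaluated before every pass, so the body does not run at all if the condition is false on entry.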
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages, reached through one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is given in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, the NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
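The fallback order described for syncWithPost, setBackSequence and setBackPages can be summarised as a small decision function. This is only a reading of the text above, not ITPilot's implementation: the function name chooseBackStrategy, the settings object, the returned labels and the default of one back page are all hypothetical.

```javascript
// Hypothetical helper: describes how the source page would be recovered
// for a given configuration of the component.
function chooseBackStrategy(settings) {
  if (settings.syncWithPost) {
    // syncWithPost(true): re-issue the POST with the original parameters.
    return { kind: "POST" };
  }
  if (settings.backSequence) {
    // setBackSequence(back): run the explicit NSEQL back program.
    return { kind: "BACK_SEQUENCE", program: settings.backSequence };
  }
  // Neither is configured: fall back to the NSEQL Back() command, going
  // back setBackPages(pages) pages (assumed to default to 1 here).
  return { kind: "NSEQL_BACK", pages: settings.backPages || 1 };
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Here byPost selects the POST strategy, while byBack falls through to Back() with two pages because no back sequence was supplied.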
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler for this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
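The interplay between add() and the two error actions can be sketched with a stand-in object. The Record_Constructor below is hypothetical and far simpler than the real component: it assumes field references take the form $<index>.<FIELD>, understands nothing else, and treats ON_ERROR_IGNORE as "skip the failing field". The constant values and the sample records are also assumptions.

```javascript
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // assumed placeholder values
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in. It resolves only "$<index>.<FIELD>" references,
// whereas the real component accepts full ITPilot expressions.
function Record_Constructor(recordsObj, name) {
  this.recordsObj = recordsObj;
  this.name = name;
  this.fields = [];
}
Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
  this.fields.push({ fieldName: fieldName, expression: expression,
                     errorAction: errorAction });
};
Record_Constructor.prototype.exec = function () {
  var output = {};
  for (var i = 0; i < this.fields.length; i++) {
    var field = this.fields[i];
    var match = /^\$(\d+)\.(\w+)$/.exec(field.expression);
    var source = match ? this.recordsObj[Number(match[1])] : undefined;
    if (!source || !(match[2] in source)) {
      if (field.errorAction === ON_ERROR_RAISE) {
        throw new Error("Cannot evaluate " + field.expression);
      }
      continue; // ON_ERROR_IGNORE: skip this field and carry on
    }
    output[field.fieldName] = source[match[2]];
  }
  return output;
};

var rc = new Record_Constructor([{ PARAM1: "Madrid" }, { PRICE: 12 }], "CITY_PRICE");
rc.add("CITY", "$0.PARAM1", ON_ERROR_RAISE);
rc.add("PRICE", "$1.PRICE", ON_ERROR_RAISE);
rc.add("MISSING", "$1.NOPE", ON_ERROR_IGNORE);
var built = rc.exec();
```

The third field refers to an attribute that does not exist; because it was added with ON_ERROR_IGNORE, the run continues and the field is simply absent from the result.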
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component (section 5.3.27).
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object acting as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
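The practical difference from the Loop component is that the body of a do… while loop runs before the condition is evaluated for the first time. The sketch below shows this with a condition that is never true. As the real Condition object is only available inside ITPilot, the Condition constructor taking a JavaScript predicate and the constant values are hypothetical stand-ins.

```javascript
// Hypothetical stand-in for ITPilot's Condition object (section 5.3.5).
function Condition(predicate) {
  this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
  this.errorAction = action; // recorded only; this stub never raises
};
Condition.prototype.exec = function (inputs) {
  return this.predicate(inputs);
};

// Assumed placeholder values for the constants used in Figure 7.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var attempts = 0;

var repeat = new Condition(function () { return false; }); // never true
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  attempts = attempts + 1; // stands for <loop_operations>
} while (repeat.exec([]));
```

Even though the condition never holds, the body has run once by the time the loop exits, which suits operations such as "fetch a page, then check whether another one follows".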
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component’s browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, the NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
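The retry settings are easiest to understand by watching a failing navigation recover. The sketch below uses a hypothetical stand-in for Sequence with an injectable navigate function in place of a real browser. It assumes that setRetries(n) means one initial attempt plus up to n further attempts, and it records the retry delay without actually waiting; the NSEQL string and the constant value are illustrative only.

```javascript
var SEQUENCE_HTTP_BROWSER = "SEQUENCE_HTTP_BROWSER"; // assumed placeholder value

// Hypothetical stand-in. `navigate` replaces the real browser and can be
// overridden so that a failure can be simulated.
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
  this.sequence = sequence;
  this.sequenceType = sequenceType;
  this.retries = 0;
  this.retryDelay = 0;
  this.navigate = function () { throw new Error("no browser attached"); };
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.exec = function (inputValues, inputPage) {
  var lastError = null;
  // One initial attempt plus `retries` further attempts (an assumption).
  for (var attempt = 0; attempt <= this.retries; attempt++) {
    try {
      return this.navigate(this.sequence, inputValues);
    } catch (e) {
      lastError = e; // the real component would wait retryDelay ms here
    }
  }
  throw lastError;
};

var calls = 0;
var seq = new Sequence("Navigate(http://www.example.com,0);",
                       SEQUENCE_HTTP_BROWSER, true);
seq.setRetries(2);
seq.setRetryDelay(500);
seq.navigate = function (program) {
  calls = calls + 1;
  if (calls < 3) { throw new Error("timeout"); } // fail twice, then succeed
  return { url: "http://www.example.com" };
};
var page = seq.exec([]);
```

With two retries configured, the third attempt succeeds and exec() returns the page; with setRetries(1) the same scenario would end in an error.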
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string or binary value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the processing that follows an extraction operation is carried out concurrently on each of the records obtained.
• Functions:
  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
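The execute and wait pair follows a fork and join shape: every record is handed to execute(), and wait() blocks until all of the launched calls have finished. The sketch below reproduces only that call shape. Its Thread is a hypothetical stand-in that queues the calls and runs them one after another inside wait(), whereas the real object runs them concurrently; processRecord and the sample records are also assumptions.

```javascript
// Hypothetical stand-in: it queues the calls and runs them sequentially
// inside wait(). The real Thread object runs them concurrently.
function Thread() {
  this.pending = [];
  this.maxConcurrent = 1;
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
  this.maxConcurrent = max; // recorded only in this stub
};
Thread.prototype.execute = function (fn /*, arguments... */) {
  var args = Array.prototype.slice.call(arguments, 1);
  this.pending.push(function () { fn.apply(null, args); });
};
Thread.prototype.wait = function () {
  while (this.pending.length > 0) {
    this.pending.shift()();
  }
};

var results = [];
function processRecord(prefix, record) {
  results.push(prefix + record.TITLE);
}

var records = [{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }];
var thread = new Thread();
thread.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
  // The arguments after the function must match its parameter list.
  thread.execute(processRecord, "item-", records[i]);
}
thread.wait(); // returns once every queued call has finished
```

In the real runtime the order in which records finish is not guaranteed, so code that runs after wait() should not depend on the order of the collected results.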
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
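A minimal component file following this naming scheme might look like the sketch below, for a hypothetical component called upper that returns its input in upper case. The component name, the assumption that the input arrives as a plain string, and the local definition of STRING_TYPE (added only so the file runs outside ITPilot) are not taken from the guide.

```javascript
// Value taken from the list of output types above. It is defined here
// only so that the sketch runs outside ITPilot.
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
// The input is assumed to be a plain string.
function upper_main(upper_input) {
  var upper_output = null;
  upper_output = String(upper_input).toUpperCase();
  return upper_output;
}

// Input schema. The guide elides the body of this function, so none is
// invented here.
function upper_getInputStructure() {
}

// Output type: a simple string, so upper_getOutputStructure is not needed
// (it is required only for RECORD_TYPE and LIST_TYPE).
function upper_getOutputType() {
  return STRING_TYPE;
}
```

Under the rules above, this file would be saved as upper.js so that its name matches the component name.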
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
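The point of the try… finally structure in Figure 8 is that SCOPE.close() runs whether or not the component fails. The sketch below demonstrates this with hypothetical stand-ins for SCOPE and CUSTOM_COMPONENT: the counter on SCOPE, the behaviour of exec(), the runUpper helper, and the "upper" type and name are all assumptions made for illustration.

```javascript
// Hypothetical stand-ins so that the pattern can run outside ITPilot.
var SCOPE = {
  open: 0,
  create: function () { this.open = this.open + 1; },
  close: function () { this.open = this.open - 1; }
};
function CUSTOM_COMPONENT(type) {
  this.type = type;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
  this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
  if (input === null) { throw new Error(this.name + ": no input"); }
  return String(input).toUpperCase();
};

// Hypothetical helper wrapping the pattern from Figure 8.
function runUpper(input) {
  try {
    SCOPE.create();
    var mycustom = new CUSTOM_COMPONENT("upper"); // type value is an assumption
    mycustom.setComponentName("upper");
    return mycustom.exec(input);
  } finally {
    SCOPE.close(); // runs on success and on failure alike
  }
}

var ok = runUpper("abc");
var failed = false;
try {
  runUpper(null);
} catch (e) {
  failed = true;
}
```

After both calls the stand-in's open counter is back at zero, which shows that the scope opened by the failing call was still closed.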
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple: a VQL statement of the following form has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
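Escaping the quotes by hand is error-prone for anything longer than a few lines, so the statement can be assembled by a small helper. The sketch below is hypothetical: the function names are invented, and the delimiter (double quote) and escape character (backslash) are assumptions, which is why both are parameters. It escapes only the delimiting quote and makes no attempt to handle other characters.

```javascript
// Hypothetical helper: escapes every occurrence of the delimiting quote.
// The defaults (double quote, backslash) are assumptions.
function escapeForVql(jsCode, quoteChar, escapeChar) {
  quoteChar = quoteChar || '"';
  escapeChar = escapeChar || "\\";
  return jsCode.split(quoteChar).join(escapeChar + quoteChar);
}

// Hypothetical helper: assembles the CREATE WRAPPER statement.
function buildCreateWrapper(name, jsCode) {
  return "CREATE WRAPPER ITP " + name + ' "' + escapeForVql(jsCode) + '"';
}

var statement = buildCreateWrapper("my_wrapper", 'var city = "Madrid";');
```

The resulting statement carries the script between the outer quotes, with each inner quote preceded by the escape character.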
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 45
5317 Iterator
bull Object Iterator
bull Description component that iterates on a list of records one by one
bull Functions
o Constructor(list)
bull list list of records on which to iterate
o hasNext() this determines whether there are more results on which to iterate ldquotruerdquo is returned if there is at least one more result
o next() this returns the next iteration element The list is a sorted sequence of records
The ldquoParallel Executionrdquo option existing in the ITPilot graphic interface becomes the next JavaScript structure using the Thread object described in section 5329
var _thread0 = new Thread() while(iteratorhasNext()) recordInstance = iteratornext() _thread0execute(_functionIterator_1 structureInstance recordInstance)
Figure 5 Using threads in the Iterator component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 46
5318 JDBCExtractor
bull Object JDBCExtractor
bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results
bull Functions
o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)
bull uuid component unique identifier
bull uri connection URL to the database
bull driver driver class to use to connect to the data source
bull userName user name
bull password user password
bull structure structure of the componentrsquos output record list It is defined as a record of values
bull baseRecords record list to be used
bull maxPoolSize maximum number of connections that can be manager by the browser pool at the same time
bull initialPoolSize initial number of browser pool connections A number of idle connections as established ready to be used
bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
bull query SQL query that returns the results required by the component
o exec(query baseRecords) executes the JDBCExtractor component
bull query SQL query that returns the results required by the component
bull baseRecords record list to be used
o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration
bull maxPoolSize maximum number of connections that can be manager by the browser pool at the same time
bull initialPoolSize initial number of browser pool connections A number of idle connections as established ready to be used
bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 47
o disablePool() disables the connection pool
o addDriverProperty(propname propvalue) adds a JDBC driver property
bull propname property name
bull propvalue property value
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 48
5319 Loop
bull Description This allows loops to be made in the flow The loop will be repeated as long as the given condition is met (WHILEhellip DO) The loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
var loop = null loop = new Condition(ltoutput_conditiongt) looponError(RUNTIME_ERROR ON_ERROR_RAISE) while(loopexec([])) ltloop operationsgt hellip
Figure 6 Using the Loop function
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 49
5320 Next Interval Iterator
bull Object Next_Interval_Iterator
bull Description this allows for iteration by different inter-related pages by one or by different browsing sequences
bull Functions
o Constructor(sequences iterations sequenceType reuse inputPage)
bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration
bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the sessionrsquos information
bull inputPage this indicates the page from which the next browsing sequence is to be made
o next(inputRecords inputPage) this returns the next iteration element
bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval
bull inputPage this indicates the page from which the next pages are to be accessed
o close() this closes the iterator
o setRetries(count) this configures the number of retries in the event of error in accessing the next page
bull count number of retries
o setRetryDelay(count) this configures the interval between two retries
bull count interval in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 50
o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 51
5321 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this allows for the component input record to be used as the wrapper result to be subsequently added
bull record record to use
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 52
5322 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName expression errorAction) method for adding a new field to the record under construction
bull fieldname name of the field
bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 53
5323 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence created from the results of a record It allows sequences to be created for access to other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences sequenceDepends sequenceType reuse inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the sessionrsquos information In general this value will be ldquotruerdquo although in some cases it may not be a good option if the previous iterator is run in parallel to it
bull inputPage optional this allows for a homepage to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 54
5324 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 55
5325 Repeat
bull Description This allows for loops to be made in the flow The loop is repeated until the given condition is met (REPEAThellip UNTIL) The Repeat component is implemented in JavaScript using a dohellip while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null repeat = new Condition(ltoutput_conditiongt) repeatonError(RUNTIME_ERROR ON_ERROR_RAISE) do ltloop_operationsgt hellip while(repeatexec([]))
Figure 7 Using the Repeat function
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 56
5326 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 57
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component’s browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
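The precedence described for syncWithPost, setBackSequence and setBackPages can be summarised as a small decision function. This is only an illustrative sketch in plain JavaScript: chooseBackStrategy and the shape of its settings argument are hypothetical and not part of the ITPilot API, the back program is left as a placeholder string, and the default of one page back is an assumption.

```javascript
// Illustrative only: models the order in which the text says ITPilot decides
// how to return to the source page. None of these names exist in ITPilot.
function chooseBackStrategy(settings) {
  if (settings.syncWithPost) {
    // Re-issue the POST with which the page was originally reached.
    return { kind: "post" };
  }
  if (settings.backSequence) {
    // An explicit NSEQL back program was supplied.
    return { kind: "backSequence", program: settings.backSequence };
  }
  // Fall back to the NSEQL Back() command, repeated backPages times.
  // Assumption: one page back when no count has been configured.
  const pages = settings.backPages === undefined ? 1 : settings.backPages;
  return { kind: "backCommand", pages: pages };
}

const examples = [
  chooseBackStrategy({ syncWithPost: true, backSequence: "<back program>" }),
  chooseBackStrategy({ syncWithPost: false, backSequence: "<back program>" }),
  chooseBackStrategy({ syncWithPost: false, backPages: 2 }),
];
console.log(examples.map((e) => e.kind).join(", "));
```

Writing the rule as a pure function makes the ordering explicit: the POST synchronization wins when it is enabled, an explicit back sequence comes next, and the Back() command is the last resort.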
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
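The effect of the setRetries and setRetryDelay settings can be pictured as a retry loop. The following is a minimal sketch in plain JavaScript, not ITPilot code: withRetries is a hypothetical helper, the wait function is injected so that the example stays synchronous, and it assumes that a retry count of N means at most N additional attempts after the first one.

```javascript
// Hypothetical helper, not an ITPilot function: runs `operation` and, if it
// throws, retries it up to `retries` more times, calling `wait(delayMs)`
// between attempts. `wait` is injected so the sketch stays synchronous.
function withRetries(operation, retries, delayMs, wait) {
  let lastError = null;
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) {
      wait(delayMs);
    }
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Example: an operation that fails twice and then succeeds.
const waits = [];
const outcome = withRetries(
  (attempt) => {
    if (attempt < 2) {
      throw new Error("transient failure " + attempt);
    }
    return "stored on attempt " + attempt;
  },
  3,   // comparable to setRetries(3)
  500, // comparable to setRetryDelay(500)
  (ms) => waits.push(ms)
);
console.log(outcome, waits);
```

In this run the operation succeeds on the third attempt, so the wait function is called twice and the remaining retry is never used.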
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the specified function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until ongoing executions finish.
    • int: maximum number.
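The queueing behaviour described for setMaxConcurrentThreads can be modelled in plain JavaScript without real threads. The sketch below is an assumption-laden illustration, not the ITPilot Thread object: BoundedRunner, processRecord and the done callback are hypothetical names, and each task signals completion by calling the callback it receives as its last argument.

```javascript
// Hypothetical model, not the ITPilot Thread object: at most `maxConcurrent`
// tasks are in flight; further requests wait in a FIFO queue.
class BoundedRunner {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
    this.queue = [];
  }

  // Each task receives its arguments followed by a `done` callback that it
  // must call when its work has finished.
  execute(task, ...args) {
    this.queue.push({ task, args });
    this.startQueued();
  }

  startQueued() {
    while (this.running < this.maxConcurrent && this.queue.length > 0) {
      const { task, args } = this.queue.shift();
      this.running++;
      let finished = false;
      task(...args, () => {
        if (finished) return; // ignore repeated calls to done()
        finished = true;
        this.running--;
        this.startQueued();
      });
    }
  }

  // True once nothing is running or queued (what wait() would block on).
  isIdle() {
    return this.running === 0 && this.queue.length === 0;
  }
}

// Example: three records, at most two processed at the same time.
const runner = new BoundedRunner(2);
const started = [];
const pendingDone = [];
function processRecord(name, done) {
  started.push(name);
  pendingDone.push(done);
}
["r1", "r2", "r3"].forEach((name) => runner.execute(processRecord, name));
console.log(started.slice()); // r1 and r2 have started; r3 is still queued
pendingDone[0]();             // r1 finishes, which lets r3 start
console.log(started.slice(), runner.isIdle());
```

The third request only starts once one of the first two has reported completion, which mirrors the statement that later requests are queued until ongoing executions finish.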
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
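As a minimal sketch of the naming convention, the fragment below outlines two of the functions for a hypothetical component called “shout”. Several things are assumptions made for illustration: the component name, the idea that the input arrives as a plain string, and the local STRING_TYPE constant, which stands in for the value listed above so that the fragment can run outside ITPilot. The shout_getInputStructure function is left out because the format of its body is not specified in this section.

```javascript
// Stand-in for the STRING_TYPE constant listed above; inside ITPilot the
// constant is assumed to be provided by the runtime.
const STRING_TYPE = 13;

// Main function of a hypothetical component called "shout". Assumption:
// the input arrives as a plain string (the real input format is whatever
// shout_getInputStructure declares, which is not covered here).
function shout_main(shout_input) {
  var shout_output = null;
  shout_output = String(shout_input).toUpperCase() + "!";
  return shout_output;
}

// Declares that the component returns a simple string value, so no
// shout_getOutputStructure function is needed.
function shout_getOutputType() {
  return STRING_TYPE;
}

console.log(shout_main("hello"), shout_getOutputType());
```

Because the declared output type is neither RECORD_TYPE nor LIST_TYPE, the sketch does not define an output structure function.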
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
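The reason for the try… finally structure is that the scope is closed whether or not the component fails. The sketch below demonstrates this outside ITPilot. The SCOPE and CUSTOM_COMPONENT objects defined here are hypothetical stand-ins that only record calls (the real ones are provided by the ITPilot runtime and may behave differently), runCustomComponent is an illustrative name, and the type value 1 is an arbitrary placeholder.

```javascript
// Hypothetical stand-ins so the pattern can run outside ITPilot. They only
// record calls; the real SCOPE and CUSTOM_COMPONENT objects are provided by
// the ITPilot runtime and may behave differently.
const calls = [];
const SCOPE = {
  create: () => calls.push("create"),
  close: () => calls.push("close"),
};
class CUSTOM_COMPONENT {
  constructor(type) {
    this.type = type;
    this.name = null;
  }
  setComponentName(name) {
    this.name = name;
  }
  exec(input) {
    if (input === undefined) {
      throw new Error("missing input for " + this.name);
    }
    calls.push("exec " + this.name);
    return input;
  }
}

// Same shape as the figure above, wrapped in a function for reuse.
function runCustomComponent(type, name, input) {
  try {
    SCOPE.create();
    const component = new CUSTOM_COMPONENT(type);
    component.setComponentName(name);
    return component.exec(input);
  } finally {
    SCOPE.close(); // runs on success and on failure alike
  }
}

const result = runCustomComponent(1, "mycustom", ["value"]);
let failed = false;
try {
  runCustomComponent(1, "mycustom", undefined);
} catch (error) {
  failed = true;
}
console.log(result, failed, calls);
```

In the recorded call list, every create is followed by a close, including the second run in which exec throws.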
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is straightforward. The VQL statement is written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.18 JDBCExtractor
• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: This component allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser that maintains the session’s information is launched.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
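The rule stated for the sequences parameter (a single sequence is reused in every iteration, several sequences are used one per iteration) can be written down as a small function. This is a sketch under stated assumptions rather than ITPilot behaviour: sequenceForIteration is a hypothetical helper, the sequences are placeholder strings, and returning null once the list is exhausted is an assumption, since the text does not say what happens in that case.

```javascript
// Hypothetical helper, not part of ITPilot: picks the browsing sequence for a
// given zero-based iteration following the rule stated for `sequences`.
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 0) {
    return null; // nothing to navigate with
  }
  if (sequences.length === 1) {
    return sequences[0]; // a single sequence is reused every time
  }
  // Several sequences: one per iteration. Assumption: null once exhausted.
  return iteration < sequences.length ? sequences[iteration] : null;
}

const single = ["<next-page sequence>"];
const several = ["<sequence A>", "<sequence B>"];
console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2));
```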
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
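How an indexed field reference and the two error actions interact can be illustrated with a toy resolver. This is a deliberately simplified sketch and not ITPilot’s expression language: it assumes references are written as $<index>.<FIELD>, resolveReference and buildRecord are hypothetical names, the two constants are local stand-ins for the ITPilot values of the same name, and dropping a field that fails under ON_ERROR_IGNORE is an assumption about the intended behaviour.

```javascript
// Illustrative stand-ins for the two error actions named above.
const ON_ERROR_RAISE = "raise";
const ON_ERROR_IGNORE = "ignore";

// Hypothetical resolver, far simpler than ITPilot's expression language:
// it only understands references of the form "$<index>.<FIELD>".
function resolveReference(expression, records) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) {
    throw new Error("unsupported expression: " + expression);
  }
  const record = records[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("cannot resolve " + expression);
  }
  return record[match[2]];
}

// Builds an output record from [fieldName, expression, errorAction] triples.
function buildRecord(records, fields) {
  const output = {};
  for (const [fieldName, expression, errorAction] of fields) {
    try {
      output[fieldName] = resolveReference(expression, records);
    } catch (error) {
      if (errorAction === ON_ERROR_RAISE) {
        throw error; // stop and report the failing expression
      }
      // ON_ERROR_IGNORE: skip this field and carry on (assumption)
    }
  }
  return output;
}

const inputs = [{ PARAM1: "books" }, { PRICE: 12 }];
const built = buildRecord(inputs, [
  ["CATEGORY", "$0.PARAM1", ON_ERROR_RAISE],
  ["COST", "$1.PRICE", ON_ERROR_RAISE],
  ["MISSING", "$2.NOPE", ON_ERROR_IGNORE],
]);
console.log(built);
```

Here the third field refers to an input record that does not exist, so with ON_ERROR_IGNORE it is simply left out of the result, whereas the same reference with ON_ERROR_RAISE would abort the construction.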
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser that maintains the session’s information is launched. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel with it.
    • inputPage: optional parameter that indicates a home page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id, or a page used as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 55
5325 Repeat
bull Description This allows for loops to be made in the flow The loop is repeated until the given condition is met (REPEAThellip UNTIL) The Repeat component is implemented in JavaScript using a dohellip while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null repeat = new Condition(ltoutput_conditiongt) repeatonError(RUNTIME_ERROR ON_ERROR_RAISE) do ltloop_operationsgt hellip while(repeatexec([]))
Figure 7 Using the Repeat function
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 56
5326 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 57
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information
bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 58
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 60
5329 Thread
bull Object Thread
bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished
o execute(functionName ltlist of argumentsgt) this launches the run thread on the described function
bull functionName name of the JavaScript function to be run
bull ltlist of argumentsgt list of arguments separated by commas which must match the arguments of the JavaScript function
o setMaxConcurrentThreads(int) allows to configure the maximum number of Thread instances that will be used in parallel Later requests will be queued until the ongoing executions finish
bull int maximum number
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 61
54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
541 Developing Custom Components
Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions
bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output
o This is the main function where ldquo mycustomrdquo is the name of the custom component
bull mycustom_getInputStructure() hellip
o This function allows to define the input schema
bull mycustom_getOutputType() return ltTYPEgt
o This is the function that defines the component output type The possible values are
bull LIST_TYPE = 1
bull PAGE_TYPE = 2
bull RECORD_TYPE = 3
bull SIMPLE_TYPE = 4
bull ARRAY_TYPE = 5
bull BINARY_TYPE = 6
bull BOOLEAN_TYPE = 7
bull DATE_TYPE = 8
bull DOUBLE_TYPE = 9
bull FLOAT_TYPE = 10
bull INT_TYPE = 11
bull LONG_TYPE = 12
bull STRING_TYPE = 13
bull URL_TYPE = 14
bull BROWSER_ID_TYPE = 15
bull mycustom_getOutputStructure) hellip
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 62
o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE
542 Using Custom Components
If a custom component developed in JavaScript is to be used then it should be stored in JavaScript format (with js extension) in the ltDENODO_HOMEgtmetadataitp-custom-components directory Each component is represented as a js file the name of which matches the name of the custom component The main function of the custom component is ltcomponentgt_main(Inputelement) where ltcomponentgt is the name of the custom component as mentioned in the previous section To use a custom component from a wrapper developed in JavaScript the following piece of code should be used
try SCOPEcreate() mycustom = new CUSTOM_COMPONENT(ltcustomcomponent_typegt) mycustomsetComponentName(ltcomponent_namegt) mycustom_output = mycustomexec(ltinput_parametersgt) finally SCOPEclose()
Figure 8 Using custom components from JavaScript
where bull ltcustomcomponent_typegt is the type of the custom component to be used bull ltcomponent_namegt represents the name of the component bull ltinput_parametersgt is the list of input parameters the custom component receives as input
55 WRAPPER DEVELOPMENT
Once the script has been developed creating a wrapper is very simple as the VQL statement has simply to be written as follows
CREATE WRAPPER ITP ltnamegt [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code
NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character
ITPilot 46 Developer Guide
References 63
REFERENCES
[AXIS] Apache Axis Web Server httpwsapacheorgaxis
[DATEFORMAT] Java Format Representation for dates httpjavasuncomj2se150docsapijavatextSimpleDateFormathtml
[DEXTL] Denodo DEXTL 46 Manual Denodo Technologies 2011
[DOTNET] Microsoft NET Framework httpwwwmicrosoftcomnet
[DPORT] Denodo Virtual DataPort 46 Administration Guide Denodo Technologies 2011
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30
[GENER] Denodo ITPilot 46 Generation Environment Guide Denodo Technologies 2011
[JDOC] Javadoc documentation of the Developer API
[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME)
[NSEQL] Denodo ITPilot 46 NSEQL Manual (Navigation SEQuence Language) Denodo Technologies 2011
[PERL] PERL Language httpwwwperlcom
[USER] Denodo ITPilot 46 User Guide Denodo Technologies 2011
[SOAP] SOAP Version 12 W3C Recommendation httpwwww3orgTRsoap
[VQL] Denodo Virtual DataPort 46 Advanced VQL Guide Denodo Technologies 2011
[WSDL] Web Services Description Language (WSDL) 11 W3C Note httpwwww3orgTRwsdl
ITPilot 4.6 Developer Guide
Developing ITPilot Wrappers with JavaScript
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
5.3.19 Loop
• Description: This component allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
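The pattern in Figure 6 can be tried outside ITPilot with a stand-in for Condition. In the sketch below, the Condition function, the two constants and collectPages are hypothetical stubs written only for illustration (the real Condition takes a condition expression, not a JavaScript function). The point is that the condition is checked before every pass, so the loop body may run zero times.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot; NOT the real classes.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

function Condition(predicate) {
  this.predicate = predicate; // assumption: a plain function replaces the condition expression
  this.errorActions = {};
}
Condition.prototype.onError = function (errorType, action) {
  this.errorActions[errorType] = action; // only recorded; the stub never raises
};
Condition.prototype.exec = function (inputs) {
  return Boolean(this.predicate(inputs));
};

// WHILE ... DO: the condition is evaluated before each pass.
function collectPages(maxPages) {
  var visited = [];
  var pageNumber = 0;
  var loop = new Condition(function () { return pageNumber < maxPages; });
  loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
  while (loop.exec([])) {
    pageNumber += 1;
    visited.push("page-" + pageNumber);
  }
  return visited;
}

console.log(collectPages(3)); // three passes
console.log(collectPages(0)); // condition false at the start: no pass at all
```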
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: this component iterates over a series of inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session's information.
    • inputPage: the page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: the page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries when an error occurs while accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
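The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarized as a small decision function. This is a simplified model of the rule described above, not ITPilot code: chooseBackStrategy and the shape of its settings argument are hypothetical names introduced for illustration.

```javascript
// Simplified model of the documented back-navigation rule; not part of the ITPilot API.
// settings.syncWithPost : boolean, the value that would be passed to syncWithPost(flag)
// settings.backSequence : string or null, the value that would be passed to setBackSequence(back)
// settings.backPages    : number, the value that would be passed to setBackPages(pages)
function chooseBackStrategy(settings) {
  if (settings.syncWithPost) {
    // Default behaviour: re-issue the POST with which the page was reached.
    return { kind: "POST", detail: "re-issue the POST that reached the page" };
  }
  if (settings.backSequence) {
    // An explicit back sequence wins over the generic Back() command.
    return { kind: "BACK_SEQUENCE", detail: settings.backSequence };
  }
  // Neither is configured: fall back to NSEQL Back(), going back backPages pages.
  return { kind: "NSEQL_BACK", detail: "browse back " + settings.backPages + " page(s)" };
}

console.log(chooseBackStrategy({ syncWithPost: true, backSequence: null, backPages: 1 }).kind);
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>", backPages: 1 }).kind);
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 }).kind);
```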
5.3.21 Output
• Object: Output
• Description: this component places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: this component constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to take when the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
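The effect of the two error actions can be modeled in a few lines of plain JavaScript. The sketch below is not the Record_Constructor implementation. Its assumptions are that expressions have the form $<index>.<FIELD> (the separator is an assumption), that an ignored error simply leaves the failing field out, and that buildRecord and resolveReference are hypothetical helper names.

```javascript
// Simplified model of add(fieldName, expression, errorAction); not ITPilot code.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves an expression such as "$0.PARAM1" against the list of input records.
function resolveReference(records, expression) {
  var match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) {
    throw new Error("Unsupported expression: " + expression);
  }
  var record = records[Number(match[1])];
  if (!record || !(match[2] in record)) {
    throw new Error("Cannot resolve " + expression);
  }
  return record[match[2]];
}

// Builds the output record field by field, applying each field's error action.
function buildRecord(records, fieldDefinitions) {
  var output = {};
  fieldDefinitions.forEach(function (definition) {
    try {
      output[definition.fieldName] = resolveReference(records, definition.expression);
    } catch (error) {
      if (definition.errorAction === ON_ERROR_RAISE) {
        throw error; // stop and report the source of the error
      }
      // ON_ERROR_IGNORE: leave this field out and continue with the next one
    }
  });
  return output;
}

var inputRecords = [{ PARAM1: "laptop" }, { PRICE: 999 }];
console.log(buildRecord(inputRecords, [
  { fieldName: "QUERY", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
  { fieldName: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
  { fieldName: "STOCK", expression: "$1.STOCK", errorAction: ON_ERROR_IGNORE }
]));
```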
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: This component creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; indicates the starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: This component allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
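The practical difference from the Loop component is that the body of a do… while loop always runs at least once, because the condition is evaluated after each pass. The sketch below follows Figure 7 as printed and uses a hypothetical stand-in for Condition (a plain function replaces the real condition expression); countPasses is likewise an illustrative name.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot; NOT the real classes.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

function Condition(predicate) {
  this.predicate = predicate; // assumption: a function instead of a condition expression
}
Condition.prototype.onError = function (errorType, action) {
  this.errorAction = action; // only recorded; the stub never raises
};
Condition.prototype.exec = function (inputs) {
  return Boolean(this.predicate(inputs));
};

// The body runs first; the condition is checked afterwards.
function countPasses(limit) {
  var passes = 0;
  var repeat = new Condition(function () { return passes < limit; });
  repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
  do {
    passes += 1;
  } while (repeat.exec([]));
  return passes;
}

console.log(countPasses(3)); // 3
console.log(countPasses(0)); // 1: the body still runs once
```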
5.3.26 Script
• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation tool, it becomes a JavaScript function that is invoked from its position in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: This component creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
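The interplay of setRetries and setRetryDelay can be pictured as a plain retry loop. The sketch below is a generic illustration, not the Sequence implementation. It assumes that count means extra attempts after the first failure, and runWithRetries and its sleep hook are hypothetical names (the hook is injected so that the example does not actually block).

```javascript
// Generic retry loop; a model of "retries plus delay", not ITPilot code.
// action  : function that returns a result or throws on failure
// retries : extra attempts after the first failure (assumed meaning of setRetries)
// delayMs : pause between attempts (the value given to setRetryDelay)
// sleep   : function(ms) supplied by the caller (hypothetical hook)
function runWithRetries(action, retries, delayMs, sleep) {
  var lastError = null;
  for (var attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) {
      sleep(delayMs); // wait only between attempts, never before the first one
    }
    try {
      return action(attempt);
    } catch (error) {
      lastError = error; // remember the failure and try again if attempts remain
    }
  }
  throw lastError; // every attempt failed
}

// A navigation that fails twice and then succeeds.
var attemptsSeen = 0;
var pauses = [];
var page = runWithRetries(
  function () {
    attemptsSeen += 1;
    if (attemptsSeen < 3) {
      throw new Error("page not reachable");
    }
    return "result page";
  },
  2,
  500,
  function (ms) { pauses.push(ms); }
);
console.log(page, attemptsSeen, pauses);
```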
5.3.28 Store File
• Object: StoreFile
• Description: this component stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: this component represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): blocks until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
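The queuing behaviour described for setMaxConcurrentThreads can be illustrated with a small simulation. This is a model only: it does not start real threads, it assumes requests are served in submission order, and simulateQueue is a hypothetical name. Given the duration of each requested execution, it computes when each one would start and when a call to wait() would return.

```javascript
// Simulation of a bounded pool with a first-in, first-out queue; not ITPilot code.
// durations            : time units each requested execution takes, in submission order
// maxConcurrentThreads : limit that would be passed to setMaxConcurrentThreads
function simulateQueue(durations, maxConcurrentThreads) {
  if (maxConcurrentThreads < 1) {
    throw new Error("maxConcurrentThreads must be at least 1");
  }
  var runningEnds = []; // end times of the executions currently holding a slot
  var starts = [];
  var finishedAt = 0;
  durations.forEach(function (duration) {
    var start = 0; // all requests are submitted at time 0
    if (runningEnds.length >= maxConcurrentThreads) {
      // Pool is full: this request waits for the earliest running execution to end.
      runningEnds.sort(function (a, b) { return a - b; });
      start = runningEnds.shift();
    }
    starts.push(start);
    runningEnds.push(start + duration);
    finishedAt = Math.max(finishedAt, start + duration);
  });
  // finishedAt is the moment at which wait() would return.
  return { starts: starts, finishedAt: finishedAt };
}

console.log(simulateQueue([5, 1, 1, 1], 2)); // starts 0, 0, 1, 2 and finishes at 5
console.log(simulateQueue([5, 1, 1, 1], 1)); // starts 0, 5, 6, 7 and finishes at 8
```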
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
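As an illustration of the naming scheme, the sketch below outlines a component called "shout" (a hypothetical name) that returns its input in upper case. It assumes that the input arrives as a plain string, which this section does not confirm, and it redefines STRING_TYPE locally only so that the file can be run on its own. shout_getInputStructure is left out because this section does not show what that function returns, and shout_getOutputStructure is not needed for a STRING_TYPE output.

```javascript
// Local copy of the type code listed above, so the sketch runs standalone.
var STRING_TYPE = 13;

// Main function: <component name>_main(<input>).
// Assumption: the input can be treated as a plain string.
function shout_main(shout_input) {
  var shout_output = null;
  shout_output = String(shout_input).toUpperCase();
  return shout_output;
}

// Declares the output type of the component.
function shout_getOutputType() {
  return STRING_TYPE;
}

console.log(shout_main("denodo"));
console.log(shout_getOutputType());
```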
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing a VQL statement as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
bull int maximum number
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 61
54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
541 Developing Custom Components
Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions
bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output
o This is the main function where ldquo mycustomrdquo is the name of the custom component
bull mycustom_getInputStructure() hellip
o This function allows to define the input schema
bull mycustom_getOutputType() return ltTYPEgt
o This is the function that defines the component output type The possible values are
bull LIST_TYPE = 1
bull PAGE_TYPE = 2
bull RECORD_TYPE = 3
bull SIMPLE_TYPE = 4
bull ARRAY_TYPE = 5
bull BINARY_TYPE = 6
bull BOOLEAN_TYPE = 7
bull DATE_TYPE = 8
bull DOUBLE_TYPE = 9
bull FLOAT_TYPE = 10
bull INT_TYPE = 11
bull LONG_TYPE = 12
bull STRING_TYPE = 13
bull URL_TYPE = 14
bull BROWSER_ID_TYPE = 15
bull mycustom_getOutputStructure) hellip
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 62
o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE
542 Using Custom Components
If a custom component developed in JavaScript is to be used then it should be stored in JavaScript format (with js extension) in the ltDENODO_HOMEgtmetadataitp-custom-components directory Each component is represented as a js file the name of which matches the name of the custom component The main function of the custom component is ltcomponentgt_main(Inputelement) where ltcomponentgt is the name of the custom component as mentioned in the previous section To use a custom component from a wrapper developed in JavaScript the following piece of code should be used
try SCOPEcreate() mycustom = new CUSTOM_COMPONENT(ltcustomcomponent_typegt) mycustomsetComponentName(ltcomponent_namegt) mycustom_output = mycustomexec(ltinput_parametersgt) finally SCOPEclose()
Figure 8 Using custom components from JavaScript
where bull ltcustomcomponent_typegt is the type of the custom component to be used bull ltcomponent_namegt represents the name of the component bull ltinput_parametersgt is the list of input parameters the custom component receives as input
55 WRAPPER DEVELOPMENT
Once the script has been developed creating a wrapper is very simple as the VQL statement has simply to be written as follows
CREATE WRAPPER ITP ltnamegt [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code
NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character
ITPilot 46 Developer Guide
References 63
REFERENCES
[AXIS] Apache Axis Web Server httpwsapacheorgaxis
[DATEFORMAT] Java Format Representation for dates httpjavasuncomj2se150docsapijavatextSimpleDateFormathtml
[DEXTL] Denodo DEXTL 46 Manual Denodo Technologies 2011
[DOTNET] Microsoft NET Framework httpwwwmicrosoftcomnet
[DPORT] Denodo Virtual DataPort 46 Administration Guide Denodo Technologies 2011
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30
[GENER] Denodo ITPilot 46 Generation Environment Guide Denodo Technologies 2011
[JDOC] Javadoc documentation of the Developer API
[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME)
[NSEQL] Denodo ITPilot 46 NSEQL Manual (Navigation SEQuence Language) Denodo Technologies 2011
[PERL] PERL Language httpwwwperlcom
[USER] Denodo ITPilot 46 User Guide Denodo Technologies 2011
[SOAP] SOAP Version 12 W3C Recommendation httpwwww3orgTRsoap
[VQL] Denodo Virtual DataPort 46 Advanced VQL Guide Denodo Technologies 2011
[WSDL] Web Services Description Language (WSDL) 11 W3C Note httpwwww3orgTRwsdl
5.3.20 Next Interval Iterator
• Object: Next_Interval_Iterator
• Description: iterates over a series of related pages, reached through one browsing sequence or through several.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in every iteration. If there is more than one, a different one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must equal the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched that keeps the session's information.
    • inputPage: indicates the page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries when an error occurs while accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
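The retry settings described above (setRetries and setRetryDelay) can be pictured with a small stand-alone sketch. It is not ITPilot code: runWithRetries, the sleep callback and the failing operation are hypothetical names, and reading the retry count as "attempts made after the first failure" is an assumption. It runs in a plain JavaScript engine such as Node.js.

```javascript
// Hypothetical helper: run an operation and retry it when it throws.
// Assumption: `retries` counts the extra attempts made after the first failure.
function runWithRetries(operation, retries, retryDelayMs, sleep) {
  var lastError = null;
  for (var attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
      // Wait only if another attempt is still to come.
      if (attempt < retries) {
        sleep(retryDelayMs);
      }
    }
  }
  // Every attempt failed: report the last error.
  throw lastError;
}

// Usage: an operation that fails twice and then succeeds.
var delays = [];
var calls = 0;
var page = runWithRetries(
  function () {
    calls += 1;
    if (calls < 3) {
      throw new Error("page not available");
    }
    return "page-" + calls;
  },
  3,   // comparable to setRetries(3)
  500, // comparable to setRetryDelay(500)
  function (ms) { delays.push(ms); } // records the delay instead of sleeping
);
console.log(page, delays);
```

Passing the sleep function as a parameter keeps the sketch synchronous and easy to test; a real wait between attempts would replace the recording callback.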
5.3.21 Output
• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.
5.3.22 Record Constructor
• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record in the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to take when the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.
NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
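How a field expression picks a value out of the input records can be sketched outside ITPilot as follows. The helpers resolveFieldRef and buildRecord, the plain-object records and the constant values are hypothetical. The sketch covers only a simple "$index.FIELD" reference (that written form is itself an assumption about the syntax), whereas the real component evaluates full expressions with the functions of [GENER]. Skipping a field on ON_ERROR_IGNORE is likewise an assumption made for illustration.

```javascript
// Placeholder values; the real constants are supplied by ITPilot.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper: resolve "$<index>.<FIELD>" against a list of records.
function resolveFieldRef(recordsObj, expression) {
  var match = /^\$(\d+)\.([A-Za-z_][A-Za-z0-9_]*)$/.exec(expression);
  if (match === null) {
    throw new Error("Unsupported expression: " + expression);
  }
  var record = recordsObj[Number(match[1])];
  var fieldName = match[2];
  if (record === undefined || !(fieldName in record)) {
    throw new Error("Cannot resolve " + expression);
  }
  return record[fieldName];
}

// Hypothetical helper: build one output record from field definitions.
function buildRecord(recordsObj, fields) {
  var output = {};
  fields.forEach(function (field) {
    try {
      output[field.name] = resolveFieldRef(recordsObj, field.expression);
    } catch (error) {
      if (field.errorAction === ON_ERROR_RAISE) {
        throw error; // stop and report the source of the error
      }
      // Otherwise (ON_ERROR_IGNORE in this sketch): leave the field out.
    }
  });
  return output;
}

// Usage: two input records, three field definitions.
var inputs = [{ PARAM1: "denodo", PARAM2: 7 }, { TITLE: "ITPilot" }];
var record = buildRecord(inputs, [
  { name: "QUERY", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
  { name: "TITLE", expression: "$1.TITLE", errorAction: ON_ERROR_RAISE },
  { name: "MISSING", expression: "$1.NOPE", errorAction: ON_ERROR_IGNORE }
]);
console.log(record);
```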
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the contents of a record. It allows sequences that access other pages to be built from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched that keeps the session's information. In general this value will be “true”, although it may not be a good option if the previous iterator runs in parallel with this component.
    • inputPage: optional; indicates the starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.
5.3.25 Repeat
• Description: creates loops in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript with a do … while loop in which a Condition object serves as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));
Figure 7 Using the Repeat function
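The loop shape of Figure 7 can be tried outside ITPilot with a stand-in for the Condition object. Everything below is a sketch: this Condition is a hypothetical stub whose exec() simply calls a JavaScript predicate, and the two constants are placeholder values. Only the do … while structure mirrors the figure, and as in the figure the loop continues while exec() returns true.

```javascript
// Placeholder values; the real constants are supplied by ITPilot.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Hypothetical stub standing in for the Condition component.
function Condition(predicate) {
  this.predicate = predicate;
  this.errorActions = {};
}
Condition.prototype.onError = function (errorType, action) {
  // The stub only remembers the configured action.
  this.errorActions[errorType] = action;
};
Condition.prototype.exec = function (inputs) {
  return Boolean(this.predicate(inputs));
};

// Same structure as Figure 7: the body always runs at least once.
var pagesVisited = 0;
var repeat = null;
repeat = new Condition(function () { return pagesVisited < 3; });
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  pagesVisited += 1; // stands in for <loop_operations>
} while (repeat.exec([]));
console.log(pagesVisited);
```

Because the condition is checked after the body, the loop operations run once even when the condition is false from the start.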
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked at the point it occupies within the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
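The order in which the options above are consulted when ITPilot needs to return to a previous page can be summarised in a small decision function. This is a sketch derived only from the descriptions of syncWithPost, setBackSequence and setBackPages. The function chooseBackStrategy, its settings object and the returned labels are hypothetical, and the default of one page for backPages is an assumption.

```javascript
// Hypothetical summary of the documented precedence:
// 1. POST synchronization, 2. explicit back sequence, 3. NSEQL Back().
function chooseBackStrategy(settings) {
  if (settings.syncWithPost) {
    // Re-issue the POST with which the page was originally reached.
    return { kind: "post" };
  }
  if (settings.backSequence) {
    // Run the back program given through setBackSequence.
    return { kind: "backSequence", sequence: settings.backSequence };
  }
  // Fall back to Back(), repeated for the configured number of pages.
  var pages = settings.backPages === undefined ? 1 : settings.backPages;
  return { kind: "nseqlBack", pages: pages };
}

// Usage: three configurations, one per branch.
var viaPost = chooseBackStrategy({ syncWithPost: true });
var viaSequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>" // placeholder, not real NSEQL
});
var viaBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(viaPost.kind, viaSequence.kind, viaBack.kind, viaBack.pages);
```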
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string or binary value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the records obtained in an extraction operation are each to be processed concurrently.
• Functions:
  o wait(): blocks until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that run in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
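The queueing behaviour described for setMaxConcurrentThreads can be illustrated with a stand-alone sketch. BoundedRunner is a hypothetical stand-in written for Node.js, not the ITPilot Thread object: it takes a function reference instead of a function name, models work as Promises instead of real threads, and its wait() returns a Promise instead of blocking.

```javascript
// Hypothetical stand-in: at most `maxConcurrent` tasks run at once,
// later requests are queued until a running task finishes.
function BoundedRunner(maxConcurrent) {
  this.maxConcurrent = maxConcurrent; // comparable to setMaxConcurrentThreads
  this.running = 0;
  this.queue = [];
  this.idleWaiters = [];
}

// Comparable to execute(functionName, <list of arguments>).
BoundedRunner.prototype.execute = function (fn) {
  var args = Array.prototype.slice.call(arguments, 1);
  this.queue.push(function () { return fn.apply(null, args); });
  this._drain();
};

// Comparable to wait(): settles once nothing is running or queued.
BoundedRunner.prototype.wait = function () {
  var self = this;
  if (self.running === 0 && self.queue.length === 0) {
    return Promise.resolve();
  }
  return new Promise(function (resolve) { self.idleWaiters.push(resolve); });
};

// Start queued tasks while there is spare capacity.
BoundedRunner.prototype._drain = function () {
  var self = this;
  while (self.running < self.maxConcurrent && self.queue.length > 0) {
    var task = self.queue.shift();
    self.running += 1;
    Promise.resolve()
      .then(task)
      .catch(function () {}) // the sketch ignores task errors
      .then(function () {
        self.running -= 1;
        self._drain();
        if (self.running === 0 && self.queue.length === 0) {
          self.idleWaiters.splice(0).forEach(function (resolve) { resolve(); });
        }
      });
  }
};

// Usage: five records processed with at most two in flight.
var processed = [];
function processRecord(id) {
  return new Promise(function (resolve) {
    setTimeout(function () { processed.push(id); resolve(); }, 10);
  });
}
var runner = new BoundedRunner(2);
for (var id = 1; id <= 5; id++) {
  runner.execute(processRecord, id);
}
runner.wait().then(function () {
  console.log("processed", processed.length, "records");
});
```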
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
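Put together, a component file might look like the sketch below, written for a hypothetical component named "uppercase" that returns a string. Only the four function-name suffixes and the numeric type code come from the list above. The component name, the shape of the input (a plain string) and the local constant declaration are assumptions. The two structure functions are left empty because the calls used to define a schema are not described in this section.

```javascript
// uppercase.js: hypothetical custom component (illustration only).

// Type code taken from the list above; declared locally so that the sketch
// runs stand-alone (ITPilot may already provide this constant).
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function uppercase_main(uppercase_input) {
  var uppercase_output = null;
  if (uppercase_input !== null && uppercase_input !== undefined) {
    uppercase_output = String(uppercase_input).toUpperCase();
  }
  return uppercase_output;
}

// Input schema: intentionally left empty in this sketch.
function uppercase_getInputStructure() {
}

// Output type: a simple string.
function uppercase_getOutputType() {
  return STRING_TYPE;
}

// Output structure: not needed here, because the output type is
// neither RECORD_TYPE nor LIST_TYPE.
function uppercase_getOutputStructure() {
}

console.log(uppercase_main("itpilot"), uppercase_getOutputType());
```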
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following piece of code:
    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }
Figure 8 Using custom components from JavaScript
where
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
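One point of the try … finally in Figure 8 is that SCOPE.close() runs even when the component fails. The sketch below demonstrates this with stand-ins: SCOPE, CUSTOM_COMPONENT, the registry of components and the "failing" component are all hypothetical stubs written for this example, not the ITPilot implementations.

```javascript
// Hypothetical stubs that only record what was called.
var scopeLog = [];
var SCOPE = {
  create: function () { scopeLog.push("create"); },
  close: function () { scopeLog.push("close"); }
};

// Stand-in for the component files: name -> main function.
var registry = {
  echo: function (input) { return input; },
  failing: function () { throw new Error("component failed"); }
};

function CUSTOM_COMPONENT(type) {
  this.type = type;
  this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
  this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
  return registry[this.name](input);
};

// Same shape as Figure 8, wrapped in a function for reuse.
function runComponent(name, input) {
  try {
    SCOPE.create();
    var component = new CUSTOM_COMPONENT("custom");
    component.setComponentName(name);
    return component.exec(input);
  } finally {
    SCOPE.close(); // runs on success and on failure
  }
}

var echoed = runComponent("echo", "hello");
var failure = null;
try {
  runComponent("failing", null);
} catch (error) {
  failure = error.message;
}
console.log(echoed, failure, scopeLog);
```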
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper is simple: write the VQL statement as follows:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 50
o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation
bull 1 Internet Explorer browser implementation
bull 2 Firefox browser implementation
bull 3 HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 51
5321 Output
bull Object Output
bull Description this places a record in the wrapper output
bull Functions
o Constructor(structure)
bull structure parameter that indicates the component input record to be used as the wrapper result
o add(record) this allows for the component input record to be used as the wrapper result to be subsequently added
bull record record to use
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 52
5322 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName expression errorAction) method for adding a new field to the record under construction
bull fieldname name of the field
bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 53
5.3.23 Record Sequence or Extractor Sequence
• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be built for accessing other pages from the pages processed by the Extractor component.
• Functions:
    o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
        • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
        • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although this may not be a good option if the previous iterator runs in parallel with it.
        • inputPage: optional. Allows a starting page to be specified.
    o exec(): returns a page object that represents the target page of the browsing sequences.
    o All of the methods offered by the Sequence component.
5.3.24 Release Persistent Browser
• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
    o Constructor(page)
        • page: page loaded in the browser that is going to be released.
    o Constructor(browserUuid)
        • browserUuid: browser identifier.
    o exec(): executes the component.
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript as a do ... while loop, with a Condition object used as the loop exit condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
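As a rough illustration of the pattern in Figure 7, the following sketch runs the same do ... while structure outside ITPilot. The Condition object here is a hypothetical stub that wraps a plain JavaScript predicate, whereas the real component takes an ITPilot condition expression; the constant values are arbitrary, and pagesVisited is an invented counter that stands in for the loop operations.

```javascript
// Hypothetical stand-ins; in ITPilot these names are provided by the runtime.
var RUNTIME_ERROR = "runtime_error";
var ON_ERROR_RAISE = "raise";

// Stub Condition: evaluates a JavaScript predicate instead of an ITPilot expression.
function Condition(predicate) {
    this.predicate = predicate;
    this.handlers = {};
}
Condition.prototype.onError = function (errorType, action) {
    this.handlers[errorType] = action; // stored only; the stub never raises
};
Condition.prototype.exec = function (inputs) {
    return Boolean(this.predicate.apply(null, inputs));
};

var pagesVisited = 0;
var repeat = new Condition(function () { return pagesVisited < 3; });
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);

do {
    pagesVisited++; // loop operations go here
} while (repeat.exec([]));
```

Because the test sits at the end of the loop, the body always runs at least once, and the loop continues for as long as exec returns true.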
5.3.26 Script
• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. No specific JavaScript function is associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
    o Constructor(sequence, sequenceType, reusableConnection, inputPage)
        • sequence: NSEQL browsing program (see [NSEQL]).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • inputPage: optional parameter that indicates the starting page. If it is omitted, the NSEQL program is run directly.
    o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
        • inputValues: list of values that can be used as input parameters within the browsing sequence.
        • inputPage: optional parameter that indicates the page from which the browsing sequence of the component is run.
    o setRetries(count): sets the number of retries in the event of failure.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o close(): closes the connection with the running browser.
    o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization mechanism.
        • flag: "true" indicates that this synchronization mechanism must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
    o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out there.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
    o setBackPages(pages): sets the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
        • pages: number of pages to go back.
    o toString(): returns the NSEQL sequence (see [NSEQL]).
    o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
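The retry settings of the Sequence component can be pictured with a small stand-in. This is a sketch under stated assumptions, not the ITPilot implementation: the Sequence stub, its navigate hook and its attempts counter are hypothetical, the NSEQL program is left as a placeholder string, the constant value is arbitrary, and the stub assumes that setRetries(n) allows n further attempts after the first failure.

```javascript
// Hypothetical stand-in constant; the value is arbitrary.
var SEQUENCE_IEBROWSER = "ie";

// Stub: "navigate" is a demo-only hook; the real component runs NSEQL in a browser.
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
    this.attempts = 0;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    var lastError = null;
    for (var attempt = 0; attempt <= this.retries; attempt++) {
        this.attempts++;
        try {
            return this.navigate(inputValues, inputPage);
        } catch (e) {
            lastError = e; // a real implementation would wait retryDelay ms here
        }
    }
    throw lastError;
};

// Demo: the simulated navigation fails twice and then succeeds.
var failuresLeft = 2;
var seq = new Sequence("<NSEQL program>", SEQUENCE_IEBROWSER, true);
seq.navigate = function (inputValues) {
    if (failuresLeft-- > 0) { throw new Error("simulated timeout"); }
    return { inputs: inputValues };
};
seq.setRetries(3);
seq.setRetryDelay(500);
var page = seq.exec(["denodo"]);
```

With three retries allowed and two simulated failures, the third attempt returns the page and no error reaches the caller.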
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
    o Constructor(content, file)
        • content: string or binary value with the contents to be stored. A page value is also supported as input, in which case the page content is stored.
        • file: path and name of the file where the contents are to be stored.
    o exec(): runs the component.
    o setGenerateFilename(generate): determines whether the output file name is generated automatically when the input file is null or is a directory.
        • generate: indicates whether the file name should be generated automatically.
    o setRetries(count): sets the number of retries in the event of failure.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
    o wait(): blocks until all the executions launched with the execute function have finished.
    o execute(functionName, <list of arguments>): launches a thread that runs the given function.
        • functionName: name of the JavaScript function to be run.
        • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
    o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that run in parallel. Further requests are queued until the ongoing executions finish.
        • int: maximum number of concurrent threads.
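The execute/wait pattern can be sketched with a single-threaded stand-in. Everything below is hypothetical and only mimics the call order: the Thread stub queues the work and runs it in order inside wait(), it takes a function reference where the documented execute takes the function name, and processRecord and the record contents are invented for the example.

```javascript
// Single-threaded stub: it queues tasks and drains them in wait().
// ITPilot would instead run up to the configured maximum concurrently.
function Thread() {
    this.maxConcurrent = Infinity;
    this.pending = [];
}
Thread.prototype.setMaxConcurrentThreads = function (max) { this.maxConcurrent = max; };
Thread.prototype.execute = function (fn) {
    // Remaining arguments are forwarded to the function, as in the documented signature.
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ fn: fn, args: args });
};
Thread.prototype.wait = function () {
    while (this.pending.length > 0) {
        var task = this.pending.shift();
        task.fn.apply(null, task.args);
    }
};

// Hypothetical per-record processing function.
var details = [];
function processRecord(record, prefix) {
    details.push(prefix + record.TITLE);
}

var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var thread = new Thread();
thread.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, records[i], "item:");
}
thread.wait(); // returns only once every queued execution has run
```

The point of the pattern is that results collected by the launched functions should be read only after wait() returns.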
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be written in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
    o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
    o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
    o This function defines the output type of the component. The possible values are:
        • LIST_TYPE = 1
        • PAGE_TYPE = 2
        • RECORD_TYPE = 3
        • SIMPLE_TYPE = 4
        • ARRAY_TYPE = 5
        • BINARY_TYPE = 6
        • BOOLEAN_TYPE = 7
        • DATE_TYPE = 8
        • DOUBLE_TYPE = 9
        • FLOAT_TYPE = 10
        • INT_TYPE = 11
        • LONG_TYPE = 12
        • STRING_TYPE = 13
        • URL_TYPE = 14
        • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
    o This function defines the output structure returned by the component. It is required only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
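A minimal sketch of such a component file might look as follows. It covers only the main function and the output type: the bodies of the input and output structure functions are not described in this section, so they are left out rather than guessed. The component name "mycustom", the TITLE field and the use of a plain JavaScript object as the record are assumptions made for illustration; only the type codes come from the list above.

```javascript
// Output type codes used in this sketch, as listed above.
var RECORD_TYPE = 3;

// Main function of a hypothetical component named "mycustom".
// The input is assumed to be a plain object; ITPilot records may differ.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Illustrative logic only: return a record whose TITLE is upper-cased.
    mycustom_output = { TITLE: String(mycustom_input.TITLE).toUpperCase() };
    return mycustom_output;
}

// Declares that the component returns a record, so an output structure
// function would also be needed in a real component.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}
```

Keeping the component name as the prefix of every function matters, since the component is located through these names.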
5.4.2 Using Custom Components
A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
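To show why the call is wrapped in try ... finally, the following sketch runs the pattern of Figure 8 against stand-ins. SCOPE, CUSTOM_COMPONENT and the registry used to look up the component are hypothetical stubs, and the type value "RECORD" and the component body are placeholders chosen for the example, not values taken from ITPilot.

```javascript
// Hypothetical stubs so the pattern runs outside ITPilot.
var SCOPE = {
    depth: 0,
    create: function () { this.depth++; },
    close: function () { this.depth--; }
};

// Stand-in for the component files stored in the custom components directory.
var registry = {
    mycustom: function (input) { return { TITLE: String(input.TITLE).toUpperCase() }; }
};

function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.componentName = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) { this.componentName = name; };
CUSTOM_COMPONENT.prototype.exec = function (input) {
    var main = registry[this.componentName];
    if (!main) { throw new Error("Unknown custom component: " + this.componentName); }
    return main(input);
};

var mycustom_output = null;
try {
    SCOPE.create();
    var mycustom = new CUSTOM_COMPONENT("RECORD"); // placeholder type value
    mycustom.setComponentName("mycustom");
    mycustom_output = mycustom.exec({ TITLE: "abc" });
} finally {
    SCOPE.close(); // runs even if exec throws, so the scope is always released
}
```

Placing SCOPE.close() in the finally block keeps the create and close calls balanced whether or not the component fails.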
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper is simple. Write the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"

where jscode is the JavaScript code just generated.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
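The escaping step can be automated with a small helper. This is a sketch that assumes the code is delimited by double quotes and that the escape character is a backslash; escapeQuotes and buildCreateWrapper are hypothetical names, the optional MAINTENANCE clause is left out, and backslashes already present in the code are not treated in any special way.

```javascript
// Escapes every double quote in the wrapper code with a backslash (assumed escape character).
function escapeQuotes(jsCode) {
    return jsCode.replace(/"/g, '\\"');
}

// Builds the CREATE WRAPPER statement around the escaped code.
function buildCreateWrapper(name, jsCode) {
    return 'CREATE WRAPPER ITP ' + name + ' "' + escapeQuotes(jsCode) + '"';
}

var statement = buildCreateWrapper("books", 'var title = "ITPilot";');
```

Using single quotes for string literals inside the wrapper code is another way to reduce the amount of escaping needed.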
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java format representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 53
5323 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence created from the results of a record It allows sequences to be created for access to other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences sequenceDepends sequenceType reuse inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the sessionrsquos information In general this value will be ldquotruerdquo although in some cases it may not be a good option if the previous iterator is run in parallel to it
bull inputPage optional this allows for a homepage to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 54
5324 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 55
5325 Repeat
bull Description This allows for loops to be made in the flow The loop is repeated until the given condition is met (REPEAThellip UNTIL) The Repeat component is implemented in JavaScript using a dohellip while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null repeat = new Condition(ltoutput_conditiongt) repeatonError(RUNTIME_ERROR ON_ERROR_RAISE) do ltloop_operationsgt hellip while(repeatexec([]))
Figure 7 Using the Repeat function
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 56
5326 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 57
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information
bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 58
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 60
5329 Thread
bull Object Thread
bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished
o execute(functionName ltlist of argumentsgt) this launches the run thread on the described function
bull functionName name of the JavaScript function to be run
bull ltlist of argumentsgt list of arguments separated by commas which must match the arguments of the JavaScript function
o setMaxConcurrentThreads(int) allows to configure the maximum number of Thread instances that will be used in parallel Later requests will be queued until the ongoing executions finish
bull int maximum number
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 61
54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
541 Developing Custom Components
Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions
bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output
o This is the main function where ldquo mycustomrdquo is the name of the custom component
bull mycustom_getInputStructure() hellip
o This function allows to define the input schema
bull mycustom_getOutputType() return ltTYPEgt
o This is the function that defines the component output type The possible values are
bull LIST_TYPE = 1
bull PAGE_TYPE = 2
bull RECORD_TYPE = 3
bull SIMPLE_TYPE = 4
bull ARRAY_TYPE = 5
bull BINARY_TYPE = 6
bull BOOLEAN_TYPE = 7
bull DATE_TYPE = 8
bull DOUBLE_TYPE = 9
bull FLOAT_TYPE = 10
bull INT_TYPE = 11
bull LONG_TYPE = 12
bull STRING_TYPE = 13
bull URL_TYPE = 14
bull BROWSER_ID_TYPE = 15
bull mycustom_getOutputStructure) hellip
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 62
o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE
542 Using Custom Components
If a custom component developed in JavaScript is to be used then it should be stored in JavaScript format (with js extension) in the ltDENODO_HOMEgtmetadataitp-custom-components directory Each component is represented as a js file the name of which matches the name of the custom component The main function of the custom component is ltcomponentgt_main(Inputelement) where ltcomponentgt is the name of the custom component as mentioned in the previous section To use a custom component from a wrapper developed in JavaScript the following piece of code should be used
try SCOPEcreate() mycustom = new CUSTOM_COMPONENT(ltcustomcomponent_typegt) mycustomsetComponentName(ltcomponent_namegt) mycustom_output = mycustomexec(ltinput_parametersgt) finally SCOPEclose()
Figure 8 Using custom components from JavaScript
where bull ltcustomcomponent_typegt is the type of the custom component to be used bull ltcomponent_namegt represents the name of the component bull ltinput_parametersgt is the list of input parameters the custom component receives as input
55 WRAPPER DEVELOPMENT
Once the script has been developed creating a wrapper is very simple as the VQL statement has simply to be written as follows
CREATE WRAPPER ITP ltnamegt [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code
NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character
ITPilot 46 Developer Guide
References 63
REFERENCES
[AXIS] Apache Axis Web Server httpwsapacheorgaxis
[DATEFORMAT] Java Format Representation for dates httpjavasuncomj2se150docsapijavatextSimpleDateFormathtml
[DEXTL] Denodo DEXTL 46 Manual Denodo Technologies 2011
[DOTNET] Microsoft NET Framework httpwwwmicrosoftcomnet
[DPORT] Denodo Virtual DataPort 46 Administration Guide Denodo Technologies 2011
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30
[GENER] Denodo ITPilot 46 Generation Environment Guide Denodo Technologies 2011
[JDOC] Javadoc documentation of the Developer API
[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME)
[NSEQL] Denodo ITPilot 46 NSEQL Manual (Navigation SEQuence Language) Denodo Technologies 2011
[PERL] PERL Language httpwwwperlcom
[USER] Denodo ITPilot 46 User Guide Denodo Technologies 2011
[SOAP] SOAP Version 12 W3C Recommendation httpwwww3orgTRsoap
[VQL] Denodo Virtual DataPort 46 Advanced VQL Guide Denodo Technologies 2011
[WSDL] Web Services Description Language (WSDL) 11 W3C Note httpwwww3orgTRwsdl
- DENODO ITPILOT 46 DEVELOPER GUIDE
- INDEX
- FIGURES
- PREFACE
- 1 INTRODUCTION
- 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
-
- 21 WEB SERVICE TYPES
- 22 INVOKING SOAP WEB SERVICES
- 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
-
- 231 HTML Output Configuration
-
- 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
-
- 3 ITPILOT DEVELOPMENT API
-
- 31 CONNECTING TO THE SERVER
- 32 OBTAINING WRAPPERS
- 33 USING WRAPPERS
- 34 PROCESSING QUERY RESULTS
-
- 341 Canceling Queries
-
- 35 EXAMPLE OF USE
-
- 4 CREATING CUSTOM ITPILOT FUNCTIONS
-
- 41 NAMING CONVENTIONS AND ANNOTATIONS
- 42 COMPOUND TYPES
- 43 PAGE TYPE
- 44 CUSTOM FUNCTION RETURN TYPE
- 45 EXAMPLE
-
- 5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
-
- 51 INTRODUCTION
- 52 REPRESENTATION FORMAT OF A WRAPPER
-
- 521 Initialization of Searchable Parameters
- 522 Main Function
- 523 Generating the Output Structure
-
- 53 PREDEFINED ITPILOT COMPONENT GUIDE
-
- 531 Introduction
- 532 Data Structures
-
- 5321 Record Structure
- 5322 Record List
-
- 533 Common functions
-
- 5331 onError function
- 5332 debugLevel function
-
- 534 Add Record To List
- 535 Condition
- 536 Create List
- 537 Create Persistent Browser
- 538 Diff
- 539 ExecuteJS
- 5310 Expression
- 5311 Extractor
- 5312 Fetch
- 5313 Filter
- 5314 Form Iterator
- 5315 Get Page
- 5316 Init
- 5317 Iterator
- 5318 JDBCExtractor
- 5319 Loop
- 5320 Next Interval Iterator
- 5321 Output
- 5322 Record Constructor
- 5323 Record Sequence or Extractor Sequence
- 5324 Release Persistent Browser
- 5325 Repeat
- 5326 Script
- 5327 Sequence
- 5328 Store File
- 5329 Thread
-
- 54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
-
- 541 Developing Custom Components
- 542 Using Custom Components
-
- 55 WRAPPER DEVELOPMENT
-
- REFERENCES
-
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 52
5322 Record Constructor
bull Object Record_Constructor
bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones
bull Functions
o Constructor(recordsObj name)
bull recordsObj list of input elements Each element from the list can be a record or a list of records
bull name name of the output record of the Record Constructor component
o add(fieldName expression errorAction) method for adding a new field to the record under construction
bull fieldname name of the field
bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are
bull ON_ERROR_RAISE stop wrapper run indicating the source of the error
bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run
o exec() this runs the Record Constructor component instance returning an object that represents the record obtained
NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 53
5323 Record Sequence or Extractor Sequence
bull Object Record_Sequence
bull Description This creates a browsing sequence created from the results of a record It allows sequences to be created for access to other pages from pages processed by the Extractor component
bull Functions
o Constructor(sequences sequenceDepends sequenceType reuse inputPage)
bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component
bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list
bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the sessionrsquos information In general this value will be ldquotruerdquo although in some cases it may not be a good option if the previous iterator is run in parallel to it
bull inputPage optional this allows for a homepage to be indicated
o exec() this returns a page object that represents the target page of the browsing sequences
o All of the methods offered by the Sequence component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 54
5324 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 55
5.3.25 Repeat
• Description: Allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript using a do ... while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
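The loop in Figure 7 only runs inside ITPilot. To make its control flow easier to follow, the sketch below reproduces the same do ... while shape in plain JavaScript. FakeCondition, runRepeat and maxPages are hypothetical names introduced for illustration; FakeCondition stands in for the ITPilot Condition object and evaluates an ordinary JavaScript predicate instead of an ITPilot condition expression.

```javascript
// Hypothetical stand-in for the ITPilot Condition object: exec() evaluates
// a plain JavaScript predicate instead of an ITPilot condition expression.
function FakeCondition(predicate) {
    this.predicate = predicate;
}
FakeCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs) ? true : false;
};

// Runs `body` in a do ... while loop, with the same shape as Figure 7,
// and returns how many times the body was executed.
function runRepeat(condition, body) {
    var iterations = 0;
    do {
        body(iterations);
        iterations++;
    } while (condition.exec([]));
    return iterations;
}

var pagesVisited = [];
var maxPages = 3; // illustrative limit, not an ITPilot setting
var keepGoing = new FakeCondition(function () {
    return pagesVisited.length < maxPages;
});
var total = runRepeat(keepGoing, function (i) {
    pagesVisited.push("page-" + i);
});
```

Two properties of the construct are visible here: the loop body always runs at least once, and, as the loop is written in Figure 7, it continues for as long as exec() returns true.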
5.3.26 Script
• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: Creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
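The descriptions of syncWithPost, setBackSequence and setBackPages together define a fallback order for returning to the source page. The sketch below restates that order as a plain JavaScript function. chooseBackStrategy and the shape of its settings argument are hypothetical and are not part of the ITPilot API; defaulting to one page when no number of back pages is given is also an assumption of this sketch.

```javascript
// Hypothetical helper that mirrors the fallback order described for
// syncWithPost, setBackSequence and setBackPages. It is not ITPilot code.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // POST synchronization is enabled: re-issue the POST request.
        return { kind: "POST" };
    }
    if (settings.backSequence) {
        // An explicit NSEQL back program was configured.
        return { kind: "BACK_SEQUENCE", program: settings.backSequence };
    }
    // Neither option applies: fall back to the NSEQL Back() command.
    // Using 1 when backPages is missing is an assumption of this sketch.
    return { kind: "BACK_COMMAND", pages: settings.backPages || 1 };
}

var withPost = chooseBackStrategy({ syncWithPost: true });
var withSequence = chooseBackStrategy({
    syncWithPost: false,
    backSequence: "<NSEQL back program>" // placeholder, not real NSEQL
});
var withBackCommand = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Here withPost.kind is "POST", withSequence.kind is "BACK_SEQUENCE" and withBackCommand is a "BACK_COMMAND" strategy that goes back two pages.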
5.3.28 Store File
• Object: StoreFile
• Description: Stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
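The interaction between setRetries and setRetryDelay can be illustrated with a small retry loop. This is only a sketch: this guide does not show how ITPilot implements retries, so runWithRetries is a hypothetical function, and the reading that "count" means additional attempts after the first one is an assumption. The sleep function is passed in so that the example runs without actually waiting.

```javascript
// Hypothetical sketch of retry handling; ITPilot's real implementation
// is not shown in this guide. `sleep` is injected so the example can run
// without actually waiting.
function runWithRetries(operation, retries, retryDelayMs, sleep) {
    var attempt = 0;
    var lastError = null;
    while (attempt <= retries) {
        try {
            return { value: operation(attempt), attempts: attempt + 1 };
        } catch (e) {
            lastError = e;
            attempt++;
            if (attempt <= retries) {
                // Wait only when another attempt will follow.
                sleep(retryDelayMs);
            }
        }
    }
    // All attempts failed: report the last error to the caller.
    throw lastError;
}

var waits = [];
var result = runWithRetries(function (attempt) {
    if (attempt < 2) {
        throw new Error("simulated write failure");
    }
    return "stored";
}, 3, 500, function (ms) { waits.push(ms); });
```

In this run the simulated operation fails twice and succeeds on the third attempt, so two delays of 500 milliseconds are recorded and the remaining retry is never used.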
5.3.29 Thread
• Object: Thread
• Description: Represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): blocks until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
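The queueing behaviour described for setMaxConcurrentThreads can be modelled without real threads. The sketch below is a simplified, single-threaded model. FakeThreadPool and its finish method are hypothetical names, not the ITPilot Thread object, and starting queued requests in first-in, first-out order is an assumption.

```javascript
// Hypothetical model of the queueing behaviour; not the ITPilot Thread object.
function FakeThreadPool(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];
    this.queued = [];
}

// Starts the task if a slot is free, otherwise queues it.
FakeThreadPool.prototype.execute = function (taskName) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(taskName);
    } else {
        this.queued.push(taskName);
    }
};

// Marks a running task as finished and promotes the oldest queued task.
FakeThreadPool.prototype.finish = function (taskName) {
    var stillRunning = [];
    for (var i = 0; i < this.running.length; i++) {
        if (this.running[i] !== taskName) {
            stillRunning.push(this.running[i]);
        }
    }
    this.running = stillRunning;
    if (this.queued.length > 0 && this.running.length < this.maxConcurrent) {
        this.running.push(this.queued.shift());
    }
};

var pool = new FakeThreadPool(2);
pool.execute("record-1");
pool.execute("record-2");
pool.execute("record-3"); // queued: both slots are busy
var queuedBefore = pool.queued.slice();
pool.finish("record-1");  // frees a slot, so record-3 starts
```

With a limit of two, the third request waits in the queue until one of the first two finishes.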
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is needed only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
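As an illustration of how these functions fit together, the following sketch shows a component named "mycustom" that returns a string, so no output structure function is needed. Several points are assumptions: the type constants are declared here only so that the sketch runs outside ITPilot, treating the input as a plain string is a simplification, the input schema is left empty because its format is not described in this section, and needsOutputStructure is a hypothetical helper that is not part of ITPilot.

```javascript
// Type constants with the values listed in this guide. They are declared
// here only so the sketch runs outside ITPilot.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of a hypothetical component named "mycustom".
// Treating the input as a plain string is an assumption of this sketch.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

function mycustom_getInputStructure() {
    // The schema format is not described in this section; left empty.
    return null;
}

function mycustom_getOutputType() {
    return STRING_TYPE;
}

// Helper (not part of ITPilot) that states when
// mycustom_getOutputStructure would have to be defined.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}

var sampleOutput = mycustom_main("abc");
var structureRequired = needsOutputStructure(mycustom_getOutputType());
```

Because the declared output type is STRING_TYPE, structureRequired is false and mycustom_getOutputStructure can be omitted; a component returning RECORD_TYPE or LIST_TYPE would have to define it.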
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, it must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
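The try ... finally shape of Figure 8 ensures that the scope is closed even when the component fails. The sketch below demonstrates this outside ITPilot. Everything the ITPilot runtime would normally provide is replaced by a hypothetical stand-in: the SCOPE object, the CUSTOM_COMPONENT constructor, the registry of components and the "example-type" value are all assumptions made so that the example can run, and they say nothing about how ITPilot implements them.

```javascript
// Stand-ins for objects that the ITPilot runtime is assumed to provide.
var scopeLog = [];
var SCOPE = {
    create: function () { scopeLog.push("create"); },
    close: function () { scopeLog.push("close"); }
};

// Registry standing in for the .js files in the custom components directory.
var registry = {
    shout: function (input) { return String(input).toUpperCase(); },
    broken: function () { throw new Error("simulated component failure"); }
};

function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    return registry[this.name](input);
};

// Same try/finally shape as Figure 8, wrapped in a function for reuse.
function callComponent(name, input) {
    try {
        SCOPE.create();
        var component = new CUSTOM_COMPONENT("example-type"); // placeholder type
        component.setComponentName(name);
        return component.exec(input);
    } finally {
        SCOPE.close();
    }
}

var output = callComponent("shout", "hello");
var failed = false;
try {
    callComponent("broken", null);
} catch (e) {
    failed = true;
}
```

After both calls, scopeLog contains a matching "close" for every "create", including the call in which the component threw an error.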
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating a wrapper only requires writing the following VQL statement:
    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code that has just been developed.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
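The escaping rule in the note above can be applied mechanically before the code is pasted into the VQL statement. The helper below is a minimal sketch and is not part of ITPilot or of any VQL tooling. It assumes that the escape character is a backslash, and it takes the delimiting quote character as a parameter because this section does not state which quote character VQL uses here. Whether backslashes already present in the code need further treatment is not covered.

```javascript
// Hypothetical helper, not part of ITPilot or VQL tooling.
// Prefixes every occurrence of quoteChar with a backslash, which is
// assumed to be the escape character.
function escapeQuotes(jscode, quoteChar) {
    return jscode.split(quoteChar).join("\\" + quoteChar);
}

var original = "var s = 'a';";
var escaped = escapeQuotes(original, "'");
```

The escaped text, rather than the original script, is what would be placed in the jscode position of the statement.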
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 54
5324 Release Persistent Browser
bull Object Release_Persistent_Browser
bull Description accepts a browser id or a page as browser identifier and releases that specific browser
bull Functions
o Constructor(page)
bull page page loaded on the browser that is going to be released
o Constructor(browserUuid)
bull browserUuid browser identifier
o exec() executes the component
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 55
5325 Repeat
bull Description This allows for loops to be made in the flow The loop is repeated until the given condition is met (REPEAThellip UNTIL) The Repeat component is implemented in JavaScript using a dohellip while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]
var repeat = null repeat = new Condition(ltoutput_conditiongt) repeatonError(RUNTIME_ERROR ON_ERROR_RAISE) do ltloop_operationsgt hellip while(repeatexec([]))
Figure 7 Using the Repeat function
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 56
5326 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 57
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information
bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 58
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 60
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing that follows an extraction operation is to be carried out concurrently on each of the extracted records.
• Functions:
  o wait(): blocks until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in the thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
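A minimal sketch of the usual pattern (configure, launch one execution per record, then wait) is shown below. The Thread function here is a hypothetical stand-in that runs each call immediately and sequentially so the snippet runs outside ITPilot; processRecord and the record values are invented for illustration. The stand-in takes a function reference, whereas this section speaks of a function name, so the exact form expected by ITPilot should be checked against the product.

```javascript
// Hypothetical stand-in for the ITPilot Thread component. It runs each call
// immediately and sequentially; the real component runs them concurrently.
function Thread() {
    this.maxConcurrentThreads = 1;
    this.finished = 0;
}
Thread.prototype.setMaxConcurrentThreads = function (max) { this.maxConcurrentThreads = max; };
Thread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1); // arguments after the function
    fn.apply(null, args);
    this.finished++;
};
Thread.prototype.wait = function () {
    // Nothing to wait for: the stand-in has already run every call.
};

var processed = [];
function processRecord(title, price) { // illustrative per-record processing
    processed.push(title + ":" + price);
}

var records = [["Book A", 10], ["Book B", 12], ["Book C", 8]];
var thread = new Thread();
thread.setMaxConcurrentThreads(2); // at most two executions in parallel
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, records[i][0], records[i][1]);
}
thread.wait(); // continue only after every launched execution has finished
```

Calling wait() before reading the shared results matters in the real component, because the executions may still be running when the loop ends.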
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the output type of the component. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
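As a minimal sketch of the naming convention, the file below defines a hypothetical component called "uppercase". The component name, the assumption that the input arrives as a plain string, and the local definition of STRING_TYPE (added only so the file runs standalone) are illustrative choices, not part of this guide. The getInputStructure and getOutputStructure functions are left out because this section does not show their bodies; the latter is not required here since the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Output type constant with the value listed in this section. Defined locally
// only so the sketch runs outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical "uppercase" component.
// Assumption: the input is a plain string (or null).
function uppercase_main(uppercase_input) {
    var uppercase_output = null;
    if (uppercase_input !== null && uppercase_input !== undefined) {
        uppercase_output = String(uppercase_input).toUpperCase();
    }
    return uppercase_output;
}

// Declares the output type of the component.
function uppercase_getOutputType() {
    return STRING_TYPE;
}
```

Following the convention above, this file would be saved as uppercase.js so that its name matches the prefix of its functions.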
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. As described in the previous section, the main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component. To use a custom component from a wrapper developed in JavaScript, use the following code:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8 Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
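The point of the try/finally structure is that SCOPE.close() runs whether or not the component fails. The sketch below illustrates this with hypothetical stand-ins for SCOPE and CUSTOM_COMPONENT, defined only so the pattern runs outside ITPilot; the type value "illustrative_type", the runCustom helper and the failure on null input are assumptions made for the example.

```javascript
// Hypothetical stand-ins so the pattern can be run outside ITPilot.
var scopeLog = [];
var SCOPE = {
    create: function () { scopeLog.push("create"); },
    close: function () { scopeLog.push("close"); }
};
function CUSTOM_COMPONENT(type) { this.type = type; this.name = null; }
CUSTOM_COMPONENT.prototype.setComponentName = function (name) { this.name = name; };
CUSTOM_COMPONENT.prototype.exec = function (input) {
    if (input === null) { throw new Error("no input for " + this.name); }
    return this.name + " received " + input;
};

// Helper (illustrative) wrapping the pattern described in this section.
function runCustom(input) {
    var output = null;
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("illustrative_type"); // placeholder type value
        mycustom.setComponentName("mycustom");
        output = mycustom.exec(input);
    } finally {
        SCOPE.close(); // runs on success and on failure
    }
    return output;
}

var okOutput = runCustom("abc");
var failed = false;
try {
    runCustom(null); // the stand-in throws, but the scope is still closed
} catch (e) {
    failed = true;
}
```

After both calls, scopeLog holds one "create" and one "close" per call, including the call that failed.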
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper is straightforward. Write the following VQL statement:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code generated previously.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
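The escaping step can be sketched as a small helper that prefixes every delimiter character in the script and then assembles the statement. The helper names are hypothetical, the optional MAINTENANCE clause is omitted, and the delimiter and escape characters are passed as parameters; the double quote and backslash used in the call are assumptions that should be checked against [VQL].

```javascript
// Prefixes every occurrence of the delimiting quote character with the escape
// character. Both characters are parameters because they are assumptions here.
function escapeQuotes(jscode, quoteChar, escapeChar) {
    return jscode.split(quoteChar).join(escapeChar + quoteChar);
}

// Assembles the CREATE WRAPPER statement around the escaped script
// (illustrative helper, without the optional MAINTENANCE clause).
function buildCreateWrapper(name, jscode, quoteChar, escapeChar) {
    return "CREATE WRAPPER ITP " + name + " " +
        quoteChar + escapeQuotes(jscode, quoteChar, escapeChar) + quoteChar;
}

var statement = buildCreateWrapper("books", 'var title = "x";', '"', "\\");
```

Only the delimiter is escaped here; whether other characters in the script also need escaping is not covered by this section.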
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
5.3.25 Repeat
• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript as a do … while loop, with a Condition object as the loop exit condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].
var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while(repeat.exec([]));
Figure 7 Using the Repeat function
5.3.26 Script
• Description: allows part of the logic of an ITPilot wrapper to be written directly in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.
5.3.27 Sequence
• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection is reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is omitted, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the browsing sequence of the component is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection is reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
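A typical call order for the functions above (construct, configure, run, close) might look like the sketch below. The Sequence function and the SEQUENCE_IEBROWSER value are hypothetical stand-ins defined only so the snippet runs outside ITPilot. The NSEQL program, the input value, the boolean form of reusableConnection and the label returned by exec() are illustrative assumptions.

```javascript
// Hypothetical stand-ins so the call order can be run outside ITPilot.
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER"; // real constant value is not shown in this section
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.settings = { retries: 0, retryDelay: 0, browserType: 0 };
    this.closed = false;
}
Sequence.prototype.setRetries = function (count) { this.settings.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.settings.retryDelay = mseconds; };
Sequence.prototype.setBrowserType = function (browserType) { this.settings.browserType = browserType; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // The real component returns the last page reached; the stand-in returns a label.
    return "page reached with " + inputValues.length + " input value(s)";
};
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.close = function () { this.closed = true; };

var nseql = "Navigate(http://www.example.com,0);"; // illustrative NSEQL program
var seq = new Sequence(nseql, SEQUENCE_IEBROWSER, true); // no inputPage: run the program directly
seq.setRetries(2);        // retry twice on failure
seq.setRetryDelay(1000);  // wait one second between retries
seq.setBrowserType(1);    // 1 = Internet Explorer implementation
var page;
try {
    page = seq.exec(["denodo"]); // values available to the browsing sequence
} finally {
    seq.close();                 // release the browser connection in every case
}
```

Wrapping exec() in try/finally is a design choice of this sketch, so that close() is reached even if the browsing sequence fails.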
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 60
5329 Thread
bull Object Thread
bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished
o execute(functionName ltlist of argumentsgt) this launches the run thread on the described function
bull functionName name of the JavaScript function to be run
bull ltlist of argumentsgt list of arguments separated by commas which must match the arguments of the JavaScript function
o setMaxConcurrentThreads(int) allows to configure the maximum number of Thread instances that will be used in parallel Later requests will be queued until the ongoing executions finish
bull int maximum number
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 61
54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
541 Developing Custom Components
Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions
bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output
o This is the main function where ldquo mycustomrdquo is the name of the custom component
bull mycustom_getInputStructure() hellip
o This function allows to define the input schema
bull mycustom_getOutputType() return ltTYPEgt
o This is the function that defines the component output type The possible values are
bull LIST_TYPE = 1
bull PAGE_TYPE = 2
bull RECORD_TYPE = 3
bull SIMPLE_TYPE = 4
bull ARRAY_TYPE = 5
bull BINARY_TYPE = 6
bull BOOLEAN_TYPE = 7
bull DATE_TYPE = 8
bull DOUBLE_TYPE = 9
bull FLOAT_TYPE = 10
bull INT_TYPE = 11
bull LONG_TYPE = 12
bull STRING_TYPE = 13
bull URL_TYPE = 14
bull BROWSER_ID_TYPE = 15
bull mycustom_getOutputStructure) hellip
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 62
o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE
542 Using Custom Components
If a custom component developed in JavaScript is to be used then it should be stored in JavaScript format (with js extension) in the ltDENODO_HOMEgtmetadataitp-custom-components directory Each component is represented as a js file the name of which matches the name of the custom component The main function of the custom component is ltcomponentgt_main(Inputelement) where ltcomponentgt is the name of the custom component as mentioned in the previous section To use a custom component from a wrapper developed in JavaScript the following piece of code should be used
try SCOPEcreate() mycustom = new CUSTOM_COMPONENT(ltcustomcomponent_typegt) mycustomsetComponentName(ltcomponent_namegt) mycustom_output = mycustomexec(ltinput_parametersgt) finally SCOPEclose()
Figure 8 Using custom components from JavaScript
where bull ltcustomcomponent_typegt is the type of the custom component to be used bull ltcomponent_namegt represents the name of the component bull ltinput_parametersgt is the list of input parameters the custom component receives as input
55 WRAPPER DEVELOPMENT
Once the script has been developed creating a wrapper is very simple as the VQL statement has simply to be written as follows
CREATE WRAPPER ITP ltnamegt [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code
NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character
ITPilot 46 Developer Guide
References 63
REFERENCES
[AXIS] Apache Axis Web Server httpwsapacheorgaxis
[DATEFORMAT] Java Format Representation for dates httpjavasuncomj2se150docsapijavatextSimpleDateFormathtml
[DEXTL] Denodo DEXTL 46 Manual Denodo Technologies 2011
[DOTNET] Microsoft NET Framework httpwwwmicrosoftcomnet
[DPORT] Denodo Virtual DataPort 46 Administration Guide Denodo Technologies 2011
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30
[GENER] Denodo ITPilot 46 Generation Environment Guide Denodo Technologies 2011
[JDOC] Javadoc documentation of the Developer API
[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME)
[NSEQL] Denodo ITPilot 46 NSEQL Manual (Navigation SEQuence Language) Denodo Technologies 2011
[PERL] PERL Language httpwwwperlcom
[USER] Denodo ITPilot 46 User Guide Denodo Technologies 2011
[SOAP] SOAP Version 12 W3C Recommendation httpwwww3orgTRsoap
[VQL] Denodo Virtual DataPort 46 Advanced VQL Guide Denodo Technologies 2011
[WSDL] Web Services Description Language (WSDL) 11 W3C Note httpwwww3orgTRwsdl
- DENODO ITPILOT 46 DEVELOPER GUIDE
- INDEX
- FIGURES
- PREFACE
- 1 INTRODUCTION
- 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
-
- 21 WEB SERVICE TYPES
- 22 INVOKING SOAP WEB SERVICES
- 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
-
- 231 HTML Output Configuration
-
- 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
-
- 3 ITPILOT DEVELOPMENT API
-
- 31 CONNECTING TO THE SERVER
- 32 OBTAINING WRAPPERS
- 33 USING WRAPPERS
- 34 PROCESSING QUERY RESULTS
-
- 341 Canceling Queries
-
- 35 EXAMPLE OF USE
-
- 4 CREATING CUSTOM ITPILOT FUNCTIONS
-
- 41 NAMING CONVENTIONS AND ANNOTATIONS
- 42 COMPOUND TYPES
- 43 PAGE TYPE
- 44 CUSTOM FUNCTION RETURN TYPE
- 45 EXAMPLE
-
- 5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
-
- 51 INTRODUCTION
- 52 REPRESENTATION FORMAT OF A WRAPPER
-
- 521 Initialization of Searchable Parameters
- 522 Main Function
- 523 Generating the Output Structure
-
- 53 PREDEFINED ITPILOT COMPONENT GUIDE
-
- 531 Introduction
- 532 Data Structures
-
- 5321 Record Structure
- 5322 Record List
-
- 533 Common functions
-
- 5331 onError function
- 5332 debugLevel function
-
- 534 Add Record To List
- 535 Condition
- 536 Create List
- 537 Create Persistent Browser
- 538 Diff
- 539 ExecuteJS
- 5310 Expression
- 5311 Extractor
- 5312 Fetch
- 5313 Filter
- 5314 Form Iterator
- 5315 Get Page
- 5316 Init
- 5317 Iterator
- 5318 JDBCExtractor
- 5319 Loop
- 5320 Next Interval Iterator
- 5321 Output
- 5322 Record Constructor
- 5323 Record Sequence or Extractor Sequence
- 5324 Release Persistent Browser
- 5325 Repeat
- 5326 Script
- 5327 Sequence
- 5328 Store File
- 5329 Thread
-
- 54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
-
- 541 Developing Custom Components
- 542 Using Custom Components
-
- 55 WRAPPER DEVELOPMENT
-
- REFERENCES
-
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 56
5326 Script
bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 57
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information
bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 58
o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out
bull back NSEQL back program
o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not
bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session
o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence
bull pages number of back pages
o toString() this returns the NSEQL (see [NSEQL]) sequence
o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are
bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 59
5328 Store File
bull Object StoreFile
bull Description this stores the contents entered as the input parameter in a file
bull Functions
o Constructor(content file)
bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored
bull file path and name of the file where the contents are to be stored
o exec() runs the component
o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory
bull generate indicates if the file name should be automatically generated
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 60
5329 Thread
bull Object Thread
bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently
bull Functions
o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished
o execute(functionName ltlist of argumentsgt) this launches the run thread on the described function
bull functionName name of the JavaScript function to be run
bull ltlist of argumentsgt list of arguments separated by commas which must match the arguments of the JavaScript function
o setMaxConcurrentThreads(int) allows to configure the maximum number of Thread instances that will be used in parallel Later requests will be queued until the ongoing executions finish
bull int maximum number
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 61
54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
541 Developing Custom Components
Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions
bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output
o This is the main function where ldquo mycustomrdquo is the name of the custom component
bull mycustom_getInputStructure() hellip
o This function allows to define the input schema
bull mycustom_getOutputType() return ltTYPEgt
o This is the function that defines the component output type The possible values are
bull LIST_TYPE = 1
bull PAGE_TYPE = 2
bull RECORD_TYPE = 3
bull SIMPLE_TYPE = 4
bull ARRAY_TYPE = 5
bull BINARY_TYPE = 6
bull BOOLEAN_TYPE = 7
bull DATE_TYPE = 8
bull DOUBLE_TYPE = 9
bull FLOAT_TYPE = 10
bull INT_TYPE = 11
bull LONG_TYPE = 12
bull STRING_TYPE = 13
bull URL_TYPE = 14
bull BROWSER_ID_TYPE = 15
bull mycustom_getOutputStructure) hellip
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 62
o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE
542 Using Custom Components
If a custom component developed in JavaScript is to be used then it should be stored in JavaScript format (with js extension) in the ltDENODO_HOMEgtmetadataitp-custom-components directory Each component is represented as a js file the name of which matches the name of the custom component The main function of the custom component is ltcomponentgt_main(Inputelement) where ltcomponentgt is the name of the custom component as mentioned in the previous section To use a custom component from a wrapper developed in JavaScript the following piece of code should be used
try SCOPEcreate() mycustom = new CUSTOM_COMPONENT(ltcustomcomponent_typegt) mycustomsetComponentName(ltcomponent_namegt) mycustom_output = mycustomexec(ltinput_parametersgt) finally SCOPEclose()
Figure 8 Using custom components from JavaScript
where bull ltcustomcomponent_typegt is the type of the custom component to be used bull ltcomponent_namegt represents the name of the component bull ltinput_parametersgt is the list of input parameters the custom component receives as input
55 WRAPPER DEVELOPMENT
Once the script has been developed creating a wrapper is very simple as the VQL statement has simply to be written as follows
CREATE WRAPPER ITP ltnamegt [MAINTENANCE FALSE] jscode
where jscode is the recently generated JavaScript code
NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character
ITPilot 46 Developer Guide
References 63
REFERENCES
[AXIS] Apache Axis Web Server httpwsapacheorgaxis
[DATEFORMAT] Java Format Representation for dates httpjavasuncomj2se150docsapijavatextSimpleDateFormathtml
[DEXTL] Denodo DEXTL 46 Manual Denodo Technologies 2011
[DOTNET] Microsoft NET Framework httpwwwmicrosoftcomnet
[DPORT] Denodo Virtual DataPort 46 Administration Guide Denodo Technologies 2011
[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30
[GENER] Denodo ITPilot 46 Generation Environment Guide Denodo Technologies 2011
[JDOC] Javadoc documentation of the Developer API
[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME)
[NSEQL] Denodo ITPilot 46 NSEQL Manual (Navigation SEQuence Language) Denodo Technologies 2011
[PERL] PERL Language httpwwwperlcom
[USER] Denodo ITPilot 46 User Guide Denodo Technologies 2011
[SOAP] SOAP Version 12 W3C Recommendation httpwwww3orgTRsoap
[VQL] Denodo Virtual DataPort 46 Advanced VQL Guide Denodo Technologies 2011
[WSDL] Web Services Description Language (WSDL) 11 W3C Note httpwwww3orgTRwsdl
- DENODO ITPILOT 46 DEVELOPER GUIDE
- INDEX
- FIGURES
- PREFACE
- 1 INTRODUCTION
- 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
-
- 21 WEB SERVICE TYPES
- 22 INVOKING SOAP WEB SERVICES
- 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
-
- 231 HTML Output Configuration
-
- 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
-
- 3 ITPILOT DEVELOPMENT API
-
- 31 CONNECTING TO THE SERVER
- 32 OBTAINING WRAPPERS
- 33 USING WRAPPERS
- 34 PROCESSING QUERY RESULTS
-
- 341 Canceling Queries
-
- 35 EXAMPLE OF USE
-
- 4 CREATING CUSTOM ITPILOT FUNCTIONS
-
- 41 NAMING CONVENTIONS AND ANNOTATIONS
- 42 COMPOUND TYPES
- 43 PAGE TYPE
- 44 CUSTOM FUNCTION RETURN TYPE
- 45 EXAMPLE
-
- 5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
-
- 51 INTRODUCTION
- 52 REPRESENTATION FORMAT OF A WRAPPER
-
- 521 Initialization of Searchable Parameters
- 522 Main Function
- 523 Generating the Output Structure
-
- 53 PREDEFINED ITPILOT COMPONENT GUIDE
-
- 531 Introduction
- 532 Data Structures
-
- 5321 Record Structure
- 5322 Record List
-
- 533 Common functions
-
- 5331 onError function
- 5332 debugLevel function
-
- 534 Add Record To List
- 535 Condition
- 536 Create List
- 537 Create Persistent Browser
- 538 Diff
- 539 ExecuteJS
- 5310 Expression
- 5311 Extractor
- 5312 Fetch
- 5313 Filter
- 5314 Form Iterator
- 5315 Get Page
- 5316 Init
- 5317 Iterator
- 5318 JDBCExtractor
- 5319 Loop
- 5320 Next Interval Iterator
- 5321 Output
- 5322 Record Constructor
- 5323 Record Sequence or Extractor Sequence
- 5324 Release Persistent Browser
- 5325 Repeat
- 5326 Script
- 5327 Sequence
- 5328 Store File
- 5329 Thread
-
- 54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
-
- 541 Developing Custom Components
- 542 Using Custom Components
-
- 55 WRAPPER DEVELOPMENT
-
- REFERENCES
-
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 57
5327 Sequence
bull Object Sequence
bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])
bull Functions
o Constructor(sequence sequenceType reusableConnection inputPage)
bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are
bull SEQUENCE_IEBROWSER
bull SEQUENCE_HTTP_BROWSER
bull SEQUENCE_FTP
bull SEQUENCE_LOCAL
bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information
bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly
o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached
bull inputValues list of values that can be used as input parameters within the browsing sequence
bull inputPage optional parameter this describes the page from which the component browsing sequence is run
o setRetries(count) update function for the number of retries in the event of failures
bull count number of retries
o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated
bull mseconds this indicates the waiting time between retries in milliseconds
o close() this closes the connection with the running browser
o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function
bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 58
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of pages to go back.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
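The order in which the return-to-page options above are considered can be summarized with a small decision function. This is only an illustrative model of the rules as described in the text, not ITPilot code: chooseBackStrategy, the settings object and its field names are hypothetical, and the fallback of one page when no page count is given is an assumption.

```javascript
// Illustrative model only (not part of the ITPilot API). It restates the
// documented precedence: POST synchronization, then an explicit back
// sequence, then the NSEQL Back() command.
function chooseBackStrategy(settings) {
    // syncWithPost(true): re-issue the POST request that produced the page.
    if (settings.syncWithPost) {
        return { kind: "POST", pages: 0 };
    }
    // setBackSequence(back): an explicit NSEQL back program is used if defined.
    if (settings.backSequence) {
        return { kind: "BACK_SEQUENCE", pages: 0 };
    }
    // Otherwise Back() is run for the number of pages given to setBackPages.
    // Assumption: one page when no value has been provided.
    var pages = settings.backPages || 1;
    return { kind: "NSEQL_BACK", pages: pages };
}

// Usage: the placeholder string stands in for a real NSEQL back program.
var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>", backPages: 2 });
var withBack = chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 });
console.log(withPost.kind);
console.log(withBack.kind + " x " + withBack.pages);
```

Writing the rule out this way makes it clear that setBackPages only has an effect in the last branch.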
5.3.28 Store File
• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value holding the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name is generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
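The combined effect of setRetries and setRetryDelay can be pictured with a generic retry loop. The sketch below is not ITPilot code: runWithRetries and the injected sleep function are hypothetical names, and it assumes that the retry count means additional attempts after the first failure, which the text does not state explicitly.

```javascript
// Generic retry helper (illustration only, not part of the ITPilot API).
// 'retries' and 'delayMs' mirror the values passed to setRetries(count)
// and setRetryDelay(mseconds).
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e; // retries exhausted: propagate the last failure
            }
            attempt++;
            sleep(delayMs); // wait before the next attempt
        }
    }
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("temporary failure"); }
        return "stored on attempt " + (attempt + 1);
    },
    3,
    500,
    function (ms) { waits.push(ms); } // stand-in for a real pause
);
console.log(result);       // stored on attempt 3
console.log(waits.length); // 2
```

The pause is passed in as a function so that the example stays synchronous and can be run without actually waiting.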
5.3.29 Thread
• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each of the records obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
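The queuing behaviour described for setMaxConcurrentThreads can be illustrated with a small scheduling model. This is not ITPilot code and it starts no real threads: simulateQueue is a hypothetical helper that takes task durations in submission order, assumes a limit of at least one concurrent execution, and computes when each task would start.

```javascript
// Illustrative model only (not part of the ITPilot API): tasks are submitted
// in order, and each one starts as soon as one of 'maxConcurrent' slots is free.
function simulateQueue(durations, maxConcurrent) {
    var slots = [];   // time at which each slot becomes free
    var starts = [];  // computed start time of each task
    var makespan = 0; // time at which the last task finishes
    var i, s;
    for (i = 0; i < maxConcurrent; i++) {
        slots.push(0);
    }
    for (i = 0; i < durations.length; i++) {
        // Pick the slot that becomes free first.
        var free = 0;
        for (s = 1; s < slots.length; s++) {
            if (slots[s] < slots[free]) { free = s; }
        }
        starts.push(slots[free]);
        slots[free] += durations[i];
        if (slots[free] > makespan) { makespan = slots[free]; }
    }
    return { starts: starts, makespan: makespan };
}

// Usage: four tasks, at most two running at once.
var plan = simulateQueue([3, 1, 2, 2], 2);
console.log(plan.starts.join(",")); // 0,0,1,3
console.log(plan.makespan);         // 5
```

In this model the third and fourth tasks are the queued requests, and the makespan corresponds to the moment at which a call to wait() would return.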
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is required only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
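As a rough sketch of the file layout described above, the following defines a hypothetical component called shout that returns its input in upper case. Several things here are assumptions rather than documented behaviour: the component name, the treatment of the input as a plain string (a simplification), and the local redeclaration of the type constants, which is done only so that the sketch runs on its own. The shout_getInputStructure function is left out because the text does not show what its body contains, and needsOutputStructure is a helper invented for illustration.

```javascript
// Output type codes taken from the list above, redeclared locally so that
// this sketch is self-contained.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: the input is handled as a plain string.
function shout_main(shout_input) {
    var shout_output = null;
    shout_output = String(shout_input).toUpperCase();
    return shout_output;
}

// Declares the output type of the component.
function shout_getOutputType() {
    return STRING_TYPE;
}

// Helper (not part of ITPilot): tells whether a <component>_getOutputStructure
// function is needed for the given output type.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}

console.log(shout_main("itpilot"));                       // ITPILOT
console.log(needsOutputStructure(shout_getOutputType())); // false
```

Because the output type is STRING_TYPE, this component would not need a shout_getOutputStructure function; a component returning RECORD_TYPE or LIST_TYPE would.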
5.4.2 Using Custom Components
If a custom component developed in JavaScript is to be used, it must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper is straightforward. The following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"

where jscode is the JavaScript code that has just been generated.
NOTE: The VQL syntax uses quotes (") to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
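The quoting rule in the note can be applied mechanically before the statement is written. The helper below is a minimal sketch, not a Denodo utility: buildCreateWrapper and its parameters are hypothetical, and it assumes that the delimiter is the double quote and the escape character is the backslash. It escapes only quotes; whether other characters need escaping is not covered by the text.

```javascript
// Builds a CREATE WRAPPER ITP statement from a wrapper name and its
// JavaScript code (illustration only, not a Denodo-provided function).
function buildCreateWrapper(name, jsCode, disableMaintenance) {
    // Assumption: internal double quotes are escaped with a backslash.
    var escaped = jsCode.replace(/"/g, '\\"');
    // The MAINTENANCE FALSE clause is optional in the documented syntax.
    var clause = disableMaintenance ? " MAINTENANCE FALSE" : "";
    return "CREATE WRAPPER ITP " + name + clause + ' "' + escaped + '"';
}

// Usage with a one-line script that contains quotes.
var statement = buildCreateWrapper("demo_wrapper", 'var s = "a";', true);
console.log(statement);
```

Generating the statement this way avoids escaping by hand each time the script changes.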
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
ITPilot 46 Developer Guide
Developing ITPilot Wrappers with JavaScript 61
5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be written directly in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:
• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that the component returns. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
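The functions listed above can be put together in a single file. The following is only a minimal sketch of what such a file might look like for a hypothetical component named "mycustom". Several things in it are assumptions and not taken from this guide: the type constants are redeclared locally so that the sketch runs on its own (ITPilot presumably predefines them), the input is assumed to be a plain string, and the trimming and upper-casing logic is invented purely for illustration. The body of mycustom_getInputStructure is left elided, as in the guide, because the schema representation is not described in this section.

```javascript
// Output-type codes as listed in this section. Redeclared here only so the
// sketch is self-contained; the ITPilot runtime presumably provides them.
var LIST_TYPE = 1, PAGE_TYPE = 2, RECORD_TYPE = 3, SIMPLE_TYPE = 4,
    ARRAY_TYPE = 5, BINARY_TYPE = 6, BOOLEAN_TYPE = 7, DATE_TYPE = 8,
    DOUBLE_TYPE = 9, FLOAT_TYPE = 10, INT_TYPE = 11, LONG_TYPE = 12,
    STRING_TYPE = 13, URL_TYPE = 14, BROWSER_ID_TYPE = 15;

// Main function of the hypothetical component "mycustom".
// Assumption: the input is a plain string (or null/undefined).
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        // Illustrative logic only: strip surrounding whitespace and upper-case.
        mycustom_output = String(mycustom_input)
            .replace(/^\s+|\s+$/g, "")
            .toUpperCase();
    }
    return mycustom_output;
}

// Input schema definition. The guide elides this body, so it stays elided here.
function mycustom_getInputStructure() {
    // ...
}

// Declares that this component returns a string.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

// mycustom_getOutputStructure() is omitted on purpose: it is needed only when
// the output type is RECORD_TYPE or LIST_TYPE.
```

Because the declared output type is STRING_TYPE, the sketch does not need an output structure function; a component returning records or lists would have to add one.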
5.4.2 Using Custom Components
To use a custom component developed in JavaScript, store it as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a single .js file whose name matches the name of the custom component. As described in the previous section, the main function of the component is <component>_main(Inputelement), where <component> is the name of the custom component. To invoke a custom component from a wrapper developed in JavaScript, use the following code:
try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}
Figure 8. Using custom components from JavaScript
where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.
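The try/finally shape of the invocation in Figure 8 suggests that the scope is meant to be closed even when the component fails. The sketch below illustrates that control flow outside ITPilot. The SCOPE object and the CUSTOM_COMPONENT constructor defined here are stand-ins written for this example, not the real runtime objects: they only record the calls made to them, and their behavior (including the returned string and the error thrown for a null input) is an assumption. The component type "hypothetical_type", the name "mycustom", and the helper runCustom are likewise illustrative.

```javascript
// Stand-ins for the ITPilot runtime objects used in Figure 8. They are NOT the
// real implementations; they only log calls so the control flow is observable.
var calls = [];
var SCOPE = {
    create: function () { calls.push("create"); },
    close: function () { calls.push("close"); }
};

function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    calls.push("exec:" + this.name);
    if (input === null) {
        throw new Error("no input"); // simulated component failure
    }
    return "processed " + input;
};

// Same sequence as Figure 8, wrapped in a hypothetical helper function.
function runCustom(input) {
    var mycustom_output = null;
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("hypothetical_type");
        mycustom.setComponentName("mycustom");
        mycustom_output = mycustom.exec(input);
    } finally {
        // Runs on both the normal path and the error path.
        SCOPE.close();
    }
    return mycustom_output;
}
```

With these stand-ins, a failing call to runCustom still records a "close" entry after the "exec" entry before the error propagates to the caller, which is the property the try/finally block provides.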
5.5 WRAPPER DEVELOPMENT
Once the script has been developed, creating the wrapper only requires a VQL statement of the following form:
CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode
where jscode is the JavaScript code developed in the previous steps.
NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
REFERENCES
[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl