Basics of DataStage 8
-
8/4/2019 Basics of DataStage 8
1/17
Prepared by:
Sanghamitra Barman
-
DataStage is a client/server application in which some components sit on a client system while other components sit on a server system (which must be Windows NT, 2000, XP or UNIX).
The client components are the DataStage Administrator, the DataStage Designer, the DataStage Director and the Repository Manager. On the server there is the DataStage Repository (where metadata, etc., are stored) and software that supports interaction with the repository and the outside world.
-
[Diagram: DataStage client/server architecture]
SERVER: DataStage "Engine", Repository
Clients: DataStage Administrator, Repository Manager, DataStage Designer, DataStage Director
Clients and server linked by: UniVerse Objects
Connections to data sources, e.g. ODBC, native API, FTP, etc.
-
Designer: A design interface for creating DataStage applications (jobs). Each job defines its sources, targets and transformations. The Designer lets you compile jobs and offers a visual debugger for job debugging. You also define sequences and containers in this interface.
Director: A user interface to validate, schedule, monitor and run DataStage Server jobs. You can view the log files for each job execution, and it is also used to group related jobs into batches.
Manager: A user interface to view and edit the contents of the repository and to import and export DataStage components.
Administrator: A user interface for administrative tasks such as setting up users and adding projects.
-
UniVerse to DataStage Engine
More products added
DataStage Manager has been dissolved
Quick Find and Advanced Find included
Resource Estimation & Performance Analysis
Job Comparison is possible
Enhancements on Job Locking
New Stages incorporated
New Objects included
-
Earlier, communication between the client tools and the DataStage server was effected using UniVerse Objects.
With the need to adapt to the demands of volume processing, Ascential acquired and integrated the parallel processing engine into DataStage.
-
The functions of DataStage Manager have been incorporated into DataStage Designer.
Newly added features of Designer include:
Import/export of DataStage components
Multi-job compilation
Editing of the configuration file
-
Quick Find and Advanced Find are very useful tools for developers, as they quickly search within the project or the repository for object usage or dependencies.
The results are presented in a detailed view or a graphical view.
This facility can be used for impact analysis or when making project-wide job changes.
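
A where-used search of the kind that supports impact analysis can be pictured as a reverse walk over a dependency map. The Python sketch below is only an illustration: the object names and the `find_dependents` helper are hypothetical, and it does not reflect how DataStage stores or queries its repository.

```python
from collections import deque

# Hypothetical repository contents: each object maps to the objects it uses.
USES = {
    "SeqNightlyLoad": ["LoadCustomers", "LoadOrders"],
    "LoadCustomers": ["CustomerTableDef", "SharedCleanse"],
    "LoadOrders": ["OrderTableDef", "SharedCleanse"],
    "SharedCleanse": [],
    "CustomerTableDef": [],
    "OrderTableDef": [],
}

def find_dependents(target, uses=USES):
    """Return every object that directly or indirectly uses `target`."""
    # Invert the map so we can walk from an object to the objects that use it.
    used_by = {}
    for obj, deps in uses.items():
        for dep in deps:
            used_by.setdefault(dep, []).append(obj)

    # Breadth-first walk over the inverted map.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for user in used_by.get(current, []):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return sorted(seen)

# Which objects are affected if the shared cleansing container changes?
print(find_dependents("SharedCleanse"))
```

Changing the shared container touches both load jobs and, through them, the sequence that runs them, which is the sort of answer an impact analysis is after.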
-
The Resource Estimation utility can project the CPU, scratch and disk requirements of a job, per operator and per partition, given a sample size of data.
The output is presented in graphical form and can also be generated as an HTML report.
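
To make the idea of projecting from a sample concrete, here is a rough Python sketch that scales per-operator, per-partition measurements up to a full data volume. It assumes usage grows linearly with row count, which is a simplification and not a description of the model DataStage actually uses; the operator names, metrics and numbers are all hypothetical.

```python
def project_usage(sample_rows, sample_usage, target_rows):
    """Scale usage measured on a sample up to a target row count.

    Assumes linear growth with row count (an illustrative simplification).
    """
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    factor = target_rows / sample_rows
    return {
        key: {metric: value * factor for metric, value in metrics.items()}
        for key, metrics in sample_usage.items()
    }

# Hypothetical measurements from a 10,000-row sample,
# keyed by (operator, partition).
sample = {
    ("Sort", 0): {"cpu_sec": 0.5, "scratch_mb": 20.0, "disk_mb": 0.0},
    ("Sort", 1): {"cpu_sec": 0.25, "scratch_mb": 18.0, "disk_mb": 0.0},
    ("Write", 0): {"cpu_sec": 0.125, "scratch_mb": 0.0, "disk_mb": 12.0},
}

projected = project_usage(10_000, sample, 1_000_000)
for (operator, partition), metrics in sorted(projected.items()):
    print(operator, partition, metrics)
```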
-
The Performance Analysis tool is useful for identifying bottlenecks in a job.
It offers several categories of visualizations:
Record throughput (rows/sec)
CPU utilization
Job timing
Job memory utilization
Physical machine utilization
Performance visualization charts can be saved and printed.
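
As a small illustration of the record-throughput metric, the Python sketch below computes rows/sec per stage and picks the slowest stage as the likely bottleneck. The stage names and timings are made up, and the code does not read real DataStage performance data.

```python
# Hypothetical per-stage measurements from one job run.
stage_stats = [
    {"stage": "ReadOrders", "rows": 1_000_000, "elapsed_sec": 40.0},
    {"stage": "LookupCustomer", "rows": 1_000_000, "elapsed_sec": 125.0},
    {"stage": "WriteTarget", "rows": 1_000_000, "elapsed_sec": 50.0},
]

def throughput(stat):
    """Rows processed per second for one stage."""
    return stat["rows"] / stat["elapsed_sec"]

def slowest_stage(stats):
    """Return the stage with the lowest throughput, the likeliest bottleneck."""
    return min(stats, key=throughput)

for stat in stage_stats:
    print(f'{stat["stage"]}: {throughput(stat):.0f} rows/sec')
print("Likely bottleneck:", slowest_stage(stage_stats)["stage"])
```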
-
1. Go to Designer.
2. Go to Job Properties.
3. Click on the Execution tab.
4. Turn on the Record Job Performance Data radio button.
5. Run the job after selecting Record Job Performance Data.
6. To view the performance analysis results, go to Designer and click on Performance Analysis.
-
In WebSphere DataStage, jobs can be opened in read-only mode.
Developers need not guess who has a job locked, as this information is provided when opening jobs that are in use.
Sessions can also be disconnected from the web console if there is a need, such as restarting the DataStage server while there are existing connections.
-
ODBC Connectors
Connect to external sources
Test database connectivity without running the job or viewing data
Give maximum parallel performance and offer more features than the enterprise/plugin stages
SCD (Slowly Changing Dimension)
Supports both Type 1 and Type 2
Allows for in-memory lookup updates, surrogate key generation and updates to dimension tables
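
The difference between Type 1 and Type 2 changes can be sketched in plain Python: Type 1 overwrites the current row in place, while Type 2 expires it and adds a new version with a fresh surrogate key. This is a generic illustration of the two techniques, not the SCD stage's implementation; the column names (`business_key`, `is_current`, `start_date`, `end_date`) and the sample data are assumptions.

```python
from datetime import date
from itertools import count

_next_key = count(1)  # simple surrogate key generator (illustrative)

def insert_row(dimension, business_key, attrs, effective):
    """Append a new current row with a fresh surrogate key."""
    row = {"surrogate_key": next(_next_key), "business_key": business_key,
           "start_date": effective, "end_date": None, "is_current": True,
           **attrs}
    dimension.append(row)
    return row

def apply_type1(dimension, business_key, attrs, effective):
    """Type 1: overwrite attributes on the current row; no history is kept."""
    for row in dimension:
        if row["business_key"] == business_key and row["is_current"]:
            row.update(attrs)
            return row
    return insert_row(dimension, business_key, attrs, effective)

def apply_type2(dimension, business_key, attrs, effective):
    """Type 2: expire the current row and add a new version of it."""
    for row in dimension:
        if row["business_key"] == business_key and row["is_current"]:
            if all(row.get(k) == v for k, v in attrs.items()):
                return row  # nothing changed, keep the current version
            row["is_current"] = False
            row["end_date"] = effective
    return insert_row(dimension, business_key, attrs, effective)

customers = []
apply_type2(customers, "C001", {"city": "Pune"}, date(2019, 1, 1))
apply_type2(customers, "C001", {"city": "Mumbai"}, date(2019, 6, 1))     # new version
apply_type1(customers, "C001", {"phone": "555-0100"}, date(2019, 7, 1))  # overwrite
for row in customers:
    print(row)
```

After these three calls the dimension holds two rows for `C001`: the expired Pune version and the current Mumbai version, with the phone number written only onto the current row.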
-
Parameter Set
Contains parameter names and values that can be shared across jobs.
Parameter Sets provide an easier and faster way to add parameters to a job, eliminating the need to add parameters individually to each job.
Parameter values can be stored in a file, adding flexibility when changing parameter values at runtime or when changing environments.
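
The idea of one shared set of named values, kept in a file and overridable at run time, can be illustrated with a short Python sketch. The JSON file format, the parameter names and the `load_parameter_set` helper are assumptions made for the example; they are not DataStage's own value-file format or API.

```python
import json
import os
import tempfile

def load_parameter_set(path, overrides=None):
    """Read name/value pairs from a JSON file, then apply run-time overrides."""
    with open(path, encoding="utf-8") as handle:
        values = json.load(handle)
    values.update(overrides or {})
    return values

with tempfile.TemporaryDirectory() as tmp:
    # Write a hypothetical values file for a "dev" environment.
    path = os.path.join(tmp, "db_params_dev.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"DB_HOST": "devhost", "DB_NAME": "sales",
                   "COMMIT_ROWS": 1000}, handle)

    # Two hypothetical jobs share the same set; one overrides a value at run time.
    load_job = load_parameter_set(path)
    audit_job = load_parameter_set(path, overrides={"COMMIT_ROWS": 1})

print(load_job)
print(audit_job)
```

Switching environments then means pointing the jobs at a different values file rather than editing each job.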
-
Data Connection Object
A reusable component that stores database connection information.
One data connection object can be created for each particular database and has to be associated with a particular stage type. Once configured, it can simply be dragged and dropped during job design.
A very handy feature when creating jobs that read from or write to the same database.
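
A reusable connection tied to one stage type can be modelled in a few lines of Python. The `DataConnection` and `Stage` classes, their fields and the sample values are hypothetical and serve only to show the reuse pattern; they do not mirror DataStage's internal object model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataConnection:
    """Reusable connection details associated with one stage type."""
    name: str
    stage_type: str
    server: str
    database: str
    username: str

@dataclass
class Stage:
    name: str
    stage_type: str
    connection: DataConnection

    def __post_init__(self):
        # A connection object may only be used with its associated stage type.
        if self.connection.stage_type != self.stage_type:
            raise ValueError(
                f"{self.connection.name} is for {self.connection.stage_type} "
                f"stages, not {self.stage_type}"
            )

sales_db = DataConnection("SalesDB", "ODBC Connector", "dbhost", "sales", "etl_user")

# Both stages reuse the same connection details instead of re-entering them.
reader = Stage("ReadOrders", "ODBC Connector", sales_db)
writer = Stage("WriteSummary", "ODBC Connector", sales_db)
print(reader.connection is writer.connection)
```

Attaching `sales_db` to a stage of a different type raises a `ValueError`, which mirrors the rule that a data connection object is bound to a particular stage type.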