Basics of DataStage 8

Upload: sanghamitra-barman-nazir

Post on 07-Apr-2018


  • 8/4/2019 Basics of Datastge 8

    1/17

    Prepared by:

    Sanghamitra Barman


    DataStage is a client/server application, in which some components sit on a client system while other components sit on a server system (which must be Windows NT, 2000, XP or UNIX).

    The client components are the DataStage Administrator, the DataStage Designer, the DataStage Director and the Repository Manager. On the server there is the DataStage Repository (where metadata, etc., are stored) and software that supports interaction with the repository and the outside world.


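    [Figure: DataStage architecture. SERVER: DataStage "Engine" and Repository. Clients: DataStage Administrator, Repository Manager, DataStage Designer and DataStage Director, linked to the server through UniVerse Objects. The server has connections to data sources, e.g. ODBC, native API, FTP, etc.]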


    Designer: A design interface for creating DataStage applications (jobs). Each job defines its sources, targets and transformations. The Designer lets you compile jobs and offers a visual debugger for job debugging. You also define sequences and containers in this interface.

    Director: A user interface to validate, schedule, monitor and run DataStage Server jobs. You can view the log files for each job execution, and it is also used to group related jobs into batches.

    Manager: A user interface to view and edit the contents of the repository and to import and export DataStage components.

    Administrator: A user interface for administrative tasks such as setting up users and adding projects.


    UniVerse to DataStage Engine

    More products added

    DataStage Manager has been dissolved

    Quick Find and Advanced Find included

    Resource Estimation & Performance Analysis

    Job Comparison is possible

    Enhancements on Job Locking

    New Stages incorporated

    New Objects included


    Earlier, communication between the client tools and the DataStage server was carried out using UniVerse Objects.

    With the need to adapt to the demands of volume processing, Ascential acquired the parallel processing engine and integrated it into DataStage.


    The functions of DataStage Manager have been incorporated into DataStage Designer.

    Newly added features of Designer include:

    Import/Export of DataStage components

    Multicompilation of jobs

    Editing of configuration file


    Quick Find and Advanced Find are very useful tools for developers: they search quickly within the project or within the repository for object usage or dependencies.

    The results are presented in a detailed view or a graphical view.

    This facility can be used for impact analysis or when making project-wide job changes.
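
A "where used" search of this kind can be pictured as a reverse lookup over a dependency map. The Python sketch below is only an illustration of that idea, not how DataStage implements it; the object names and the `find_dependents` helper are hypothetical.

```python
from collections import deque

# Hypothetical repository contents: each object maps to the objects it uses.
USES = {
    "job_load_customers": ["tabledef_customer", "paramset_db"],
    "job_load_orders": ["tabledef_order", "paramset_db", "container_audit"],
    "container_audit": ["tabledef_audit"],
    "seq_nightly": ["job_load_customers", "job_load_orders"],
}


def find_dependents(target, uses=USES):
    """Return every object that directly or indirectly uses `target`."""
    # Invert the map: object -> set of objects that use it directly.
    used_by = {}
    for obj, deps in uses.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(obj)

    # Breadth-first walk up the "used by" edges.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in used_by.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)
```

Calling `find_dependents("paramset_db")` lists everything that would be touched by a change to that shared object, which is the question impact analysis answers.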


    The Resource Estimation utility can project how much CPU, scratch space and disk space the job will use per operator per partition, given a sample size of data.

    The output is presented in graphical form and can also be generated as an HTML report.
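
To make "per operator per partition" concrete, here is a minimal Python sketch that scales measurements from a sample run up to a full data volume. It assumes plain linear scaling, which is a simplification chosen for illustration; the slides do not describe the model the utility actually uses. The function name, metric names and figures are all hypothetical.

```python
def project_resources(sample_rows, full_rows, sample_usage):
    """Scale per-(operator, partition) sample measurements linearly.

    Assumption: usage grows in proportion to row count. Real operators
    (a sort, for example) need not behave this way.
    """
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    factor = full_rows / sample_rows
    return {
        key: {metric: value * factor for metric, value in metrics.items()}
        for key, metrics in sample_usage.items()
    }


# Hypothetical measurements from a sample run: (operator, partition) -> usage.
SAMPLE_USAGE = {
    ("sort", 0): {"cpu_sec": 1.5, "scratch_mb": 20.0, "disk_mb": 0.0},
    ("sort", 1): {"cpu_sec": 1.0, "scratch_mb": 18.0, "disk_mb": 0.0},
    ("dataset_write", 0): {"cpu_sec": 0.5, "scratch_mb": 0.0, "disk_mb": 40.0},
}

# Project from a 10,000-row sample to a 1,000,000-row run.
PROJECTED = project_resources(10_000, 1_000_000, SAMPLE_USAGE)
```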


    The Performance Analysis tool is useful for identifying bottlenecks in a job.

    It offers several categories of visualizations:

    Record throughput (rows/sec)

    CPU utilization

    Job timing

    Job memory utilization

    Physical machine utilization

    Performance visualization charts can be saved and printed.


    1. Go to Designer.

    2. Go to Job Properties.

    3. Click on the Execution tab.

    4. Turn on the Record Job Performance Data radio button.

    5. Run the job after selecting Record Job Performance Data.

    6. To view the performance analysis results, go to Designer and click on Performance Analysis.


    In WebSphere DataStage, jobs can be opened in read-only mode.

    Developers need not guess who has the job locked, as this information is provided when opening jobs that are in use.

    Sessions can also be disconnected from the web console if needed, for example when restarting the DataStage server while there are existing connections.


    ODBC Connectors

    To connect to external sources

    Test database connectivity without running a job or viewing data

    Gives maximum parallel performance and offers more features compared to enterprise/plug-in stages

    SCD (Slowly Changing Dimension)

    Supports both Type 1 and Type 2

    Allows for in-memory lookup updates, surrogate key generation and updates to dimension tables
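
The difference between Type 1 and Type 2 handling can be shown with a small Python sketch. This is a generic illustration of the two techniques, not the SCD stage itself; the `CustomerDimension` class, its fields and the sample data are hypothetical. It keeps an in-memory lookup of current rows and generates surrogate keys from a counter.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List


@dataclass
class DimRow:
    surrogate_key: int
    business_key: str
    city: str
    is_current: bool = True


class CustomerDimension:
    def __init__(self):
        self.rows: List[DimRow] = []            # the dimension table
        self._current: Dict[str, DimRow] = {}   # in-memory lookup: business key -> current row
        self._next_key = count(1)               # surrogate key generator

    def _insert(self, business_key, city):
        row = DimRow(next(self._next_key), business_key, city)
        self.rows.append(row)
        self._current[business_key] = row
        return row

    def apply_type1(self, business_key, city):
        """Type 1: overwrite the attribute in place; no history is kept."""
        row = self._current.get(business_key)
        if row is None:
            return self._insert(business_key, city)
        row.city = city
        return row

    def apply_type2(self, business_key, city):
        """Type 2: expire the current row and add a new version."""
        row = self._current.get(business_key)
        if row is not None:
            if row.city == city:
                return row  # nothing changed, keep the current version
            row.is_current = False
        return self._insert(business_key, city)
```

With this sketch, two `apply_type2` calls for the same customer with different cities leave two rows (one expired, one current), whereas a following `apply_type1` call only overwrites the city on the current row.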


    Parameter Set

    Contains parameter names and values that can be shared across jobs.

    Parameter Sets provide an easier and faster method when adding parameters to a job, eliminating the need to add parameters individually to each job.

    Parameter values can be stored in a file, adding flexibility when changing parameter values at runtime or when changing environment.
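
The idea of keeping values in a file and overlaying them on a shared set of defaults can be sketched in Python as follows. The `name=value` file layout, the `#` comment lines, the helper names and the sample parameters are assumptions made for this illustration; they are not taken from the slides.

```python
def load_values(text):
    """Parse `name=value` lines; blank lines and `#` comment lines are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got: {line!r}")
        values[name.strip()] = value.strip()
    return values


def resolve(defaults, file_values):
    """Overlay file values on the set's defaults; unknown names are rejected."""
    unknown = set(file_values) - set(defaults)
    if unknown:
        raise KeyError(f"not in parameter set: {sorted(unknown)}")
    return {**defaults, **file_values}


# Hypothetical parameter set shared by several jobs.
DB_PARAMS_DEFAULTS = {"DB_NAME": "devdb", "DB_USER": "etl", "SCHEMA": "staging"}

# Hypothetical values file for another environment.
PROD_VALUES = """
# production overrides
DB_NAME=proddb
SCHEMA = warehouse
"""
```

Switching environment then means pointing at a different values file, for example `resolve(DB_PARAMS_DEFAULTS, load_values(PROD_VALUES))`, without editing any job.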


    Data Connection Object

    A reusable component that stores database connection information.

    One data connection object can be created for each particular database and has to be associated with a particular stage type. Once configured, it can simply be dragged and dropped during job design.

    A very handy feature when creating jobs that read from or write to the same database.
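
As a rough analogy in Python, a data connection object behaves like one shared, read-only record that several stages point to, with a check that the stage type matches. The class names, fields and values below are hypothetical and only illustrate the reuse pattern; they do not mirror the DataStage object model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataConnection:
    """Reusable connection details, tied to one stage type."""
    name: str
    stage_type: str
    server: str
    database: str
    username: str


@dataclass
class Stage:
    name: str
    stage_type: str
    connection: Optional[DataConnection] = None

    def attach(self, connection: DataConnection) -> "Stage":
        """Reuse an existing connection; the stage types must match."""
        if connection.stage_type != self.stage_type:
            raise ValueError(
                f"{connection.name} is for {connection.stage_type}, "
                f"not {self.stage_type}"
            )
        self.connection = connection
        return self
```

A source stage and a target stage of the same type can both call `attach` with the same `DataConnection`, so the connection details are entered once and shared.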