ds stages
8/3/2019 ds stages
1/6
1. What is the exact difference between the BASIC Transformer and the normal
Transformer?
A. The Transformer stage is inherent PX functionality,
whereas the BASIC Transformer uses a Server interface to
call a Server Transformer stage. This has a severe performance
impact as well as partitioning limitations, but it does
give a PX job some access to existing Server functionality.
2. There are two types of transformer: i. the Basic
transformer and ii. the Active transformer. The Basic transformer
is used on SMP systems and not on MPP or cluster systems. (BASIC
is the language supported by the DataStage server engine and
available in Server jobs.) In DataStage PX, the Active
transformer is used.
3. Transformer stages are always active stages. The
BASIC Transformer stage is part of the Server product, but
the PX engine allows this stage to be called (the opposite,
using a PX stage in a Server job, is not possible).
2. Does the Sequential File stage accept .xl files and XML, and how?
Yes, it accepts them. Use a fixed line pattern.
3. What is the main difference between the Change Capture and Change Apply stages?
The Change Capture stage compares two data sets (before and after) and
makes a record of the differences.
The Change Apply stage combines the changes from the Change
Capture stage with the original before data set to
reproduce the after data set.
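The relationship between the two stages can be illustrated with a small Python sketch. This is not DataStage code: the function names, the "insert" / "edit" / "delete" change codes, and the sample rows are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
# Hypothetical change record: (change code, key value, row payload)
Change = Tuple[str, object, Row]


def change_capture(before: List[Row], after: List[Row], key: str) -> List[Change]:
    """Compare two data sets by key and record the differences."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}
    changes: List[Change] = []
    for k, row in after_by_key.items():
        if k not in before_by_key:
            changes.append(("insert", k, row))   # only in the after set
        elif row != before_by_key[k]:
            changes.append(("edit", k, row))     # same key, different values
    for k, row in before_by_key.items():
        if k not in after_by_key:
            changes.append(("delete", k, row))   # only in the before set
    return changes


def change_apply(before: List[Row], changes: List[Change], key: str) -> List[Row]:
    """Apply the recorded changes to the before set to rebuild the after set."""
    result = {row[key]: row for row in before}
    for code, k, row in changes:
        if code == "delete":
            result.pop(k, None)
        else:  # insert or edit: take the row carried by the change record
            result[k] = row
    return sorted(result.values(), key=lambda r: r[key])


before = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]
after = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Rob"}, {"id": 4, "name": "Dee"}]

changes = change_capture(before, after, key="id")
rebuilt = change_apply(before, changes, key="id")
```

In this sketch, applying the captured changes to the before rows yields the after rows again, which is the round trip the answer describes.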
4. Difference between a server shared container and a parallel shared container
1. Server shared containers contain server stage types;
parallel shared containers contain parallel stage types.
2. With a parallel shared container, the logic
can be reused across many jobs.
Introduction
DataStage Enterprise Edition is a package of three products: DataStage Server
Edition, the parallel extender with parallel ETL jobs and the MetaStage product
described on the Metadata Workbench entry. The flagship tool of Enterprise Edition
is parallel ETL jobs.
History
During the 1990s the data integration vendors such as Ascential and Informatica
were competing to deliver tools that provided a wide range of data connectivity and
transformation functions in a mostly code-free environment. Towards the late 1990s
data stores were becoming large, and data warehouses and business intelligence were
demanding larger volumes of data loads. The physical architecture of these loads
was hitting a limit on the volume that a single server could handle and was moving
towards clusters or grids of servers.
The data integration vendors needed to be able to integrate data across a massively
scalable architecture to keep up with the increased data volumes.
Ascential started to roll out a parallel capability in the DataStage Server Edition
product called multiple instance jobs. This allowed some additional manual
programming to partition and process data in parallel. In November 2001 they
switched to a buy approach and purchased Torrent Systems for $46 million.
Torrent had the capability to run tools on a massively parallel processing (MPP)
platform.
Versions
This section lists each major release of DataStage Enterprise Edition and the
enhancements for DataStage parallel jobs. For a list of enhancements to the client
tools, see the versions on the DataStage Server Edition page, as it is the version that
has been delivered with every release going back to DataStage 1.
All releases of DataStage 7 can import and upgrade DataStage 6 export files.
DataStage 8 can only import and upgrade DataStage 7.5.1 or 7.5.2 jobs.
DataStage 6
Released in September 2002, ten months after the acquisition of Torrent, it was the
first version of DataStage to feature the Parallel Extender (PX), the parallel platform
that allows processes to run in parallel across a multiple processor environment.
New parallel job type with a new set of parallel stages, some with the same
name as server job stages but with different properties and options.
Server job shared container for parallel jobs.
CPU based licensing instead of server based licensing.
Support for SAS 6.12 and 8.2.
This release was followed by the client-only 6.0.1 release, which fixed a number
of problems.
DataStage 7
Released in September 2003, it uses much the same architecture as the previous
version, with improvements to usability. This was the first release to have no
server job improvements but many parallel job improvements.
XML Pack 2.0 provides improved XML metadata support for parallel jobs.
National Language Support (NLS) for parallel jobs but not for all parallel
stages.
Parallel shared and local stages.
Enhanced transformer with improved reject row handling, string handling,
timestamp conversion and compile performance.
Modify, Switch and Filter stages added.
Multiple-instance parallel jobs.
Non blocking funnel stage.
DataStage 7.5
Unknown release date.
Parallel complex flat file stage.
A parallel job message handler for demoting or removing warning messages
from the job log.
Lookup stage changes from a property screen to a drag and drop mapping screen.
Multi node import of sequential files.
Additional options for sequential file and file set stages such as Read First
Rows, Row Number Column and First Line is Column Names.
View data support for custom stages.
New Parallel Advanced Job Developers Guide.
DataStage 7.5.1
Released in March 2005.
New SQL Builder for building SQL query statements from a database plugin
stage.
Command line job search function added.
DataStage parallel jobs for Unix System Services (USS) on the mainframe.
Remote job deployment to deliver and run jobs across a cluster or grid.
Vector support in the parallel transformer stage.
Sybase and ODBC stages added to parallel jobs.
Complex Flat File stage improvements: multiple output links, automatically
generated fillers, MVS dataset support.
Thread based job monitoring for parallel jobs.
DataStage 7.5X2
Released in December 2004, this was the first release of parallel jobs that could run
on Windows. While the Server runs on all the same Unix and Linux platforms as
7.5.1, it adds the platform of Windows 2003 Standard or Enterprise on the
Intel x86 Processor Family.
There were no changes to parallel jobs in this release apart from the capability to
compile and run them on Windows.
DataStage 8
Released in October 2006 for Windows and April 2007 for Unix, this is the first
version to run on the IBM Information Server. There are a number of parallel job
improvements in this release:
Lookup stage now supports two new lookup types: range lookup and caseless
lookup.
New Slowly Changing Dimension stage.
New QualityStage stages for parallel jobs.
What is the difference between a Filter and a Swit...
________________________________________
A Filter stage is used to filter the incoming data. For example, suppose you want
the details of customer 20: if you give customer 20 as the constraint in the Filter
stage, it outputs only the customer 20 records. You can also add a reject link, and
the rest of the records will go to the reject link.
In the Switch stage, by contrast,
we need to specify cases,
such as case1 and case2:
case1=10;
case2=20;
It outputs the customer 10 and customer 20 records.
The Switch stage checks the cases and executes them.
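The contrast above can be sketched in Python. This is only an illustration, not DataStage syntax: the function names, the link names "case1" and "case2", the "customer" column, and the handling of rows that match no case are all assumptions.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]


def filter_stage(rows: List[Row],
                 predicate: Callable[[Row], bool]) -> Tuple[List[Row], List[Row]]:
    """Send rows that satisfy the constraint to the output, the rest to reject."""
    passed: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        (passed if predicate(row) else rejected).append(row)
    return passed, rejected


def switch_stage(rows: List[Row], selector: str,
                 cases: Dict[int, str]) -> Tuple[Dict[str, List[Row]], List[Row]]:
    """Route each row to the output link whose case value matches the selector."""
    outputs: Dict[str, List[Row]] = {link: [] for link in cases.values()}
    unmatched: List[Row] = []  # assumption: rows matching no case are set aside
    for row in rows:
        link = cases.get(row[selector])
        if link is None:
            unmatched.append(row)
        else:
            outputs[link].append(row)
    return outputs, unmatched


rows = [
    {"customer": 10, "amount": 5},
    {"customer": 20, "amount": 7},
    {"customer": 30, "amount": 9},
    {"customer": 20, "amount": 1},
]

# Filter: one constraint, one output plus a reject link.
passed, rejected = filter_stage(rows, lambda r: r["customer"] == 20)

# Switch: case1=10 and case2=20, one output per case.
outputs, unmatched = switch_stage(rows, "customer", {10: "case1", 20: "case2"})
```

Here the filter produces a single stream of customer 20 rows with everything else rejected, while the switch splits the same input into one stream per case value.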