Post on 17-Nov-2014

Sorter Transformation Overview

By PenchalaRaju.Yanamala

Transformation type: Active, Connected

You can sort data with the Sorter transformation. You can sort data in ascending or descending order according to a specified sort key. You can also configure the Sorter transformation for case-sensitive sorting, and specify whether the output rows should be distinct. The Sorter transformation is an active transformation. It must be connected to the data flow.

You can sort data from relational or flat file sources. You can also use the Sorter transformation to sort data passing through an Aggregator transformation configured to use sorted input.

When you create a Sorter transformation in a mapping, you specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order. You also configure sort criteria the Integration Service applies to all sort key ports and the system resources it allocates to perform the sort operation.
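As a rough illustration of a sort key made of several ports, each with its own direction, the Python sketch below sorts in-memory rows by ORDER_ID ascending and then QUANTITY descending. The sample rows and the sort_rows helper are hypothetical. The sketch only mimics the result of such a configuration and says nothing about how the Integration Service implements the sort.

```python
from operator import itemgetter

def sort_rows(rows, sort_key):
    """Sort dict rows by a list of (port_name, ascending) pairs.

    Relies on Python's stable sort: the keys are applied from least to
    most significant, so earlier entries in sort_key take precedence.
    """
    result = list(rows)
    for port, ascending in reversed(sort_key):
        result.sort(key=itemgetter(port), reverse=not ascending)
    return result

# Hypothetical rows, not taken from the guide.
rows = [
    {"ORDER_ID": 45, "QUANTITY": 2},
    {"ORDER_ID": 41, "QUANTITY": 5},
    {"ORDER_ID": 45, "QUANTITY": 3},
]

# ORDER_ID ascending, then QUANTITY descending within each ORDER_ID.
sorted_rows = sort_rows(rows, [("ORDER_ID", True), ("QUANTITY", False)])
```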

At session run time, the Integration Service passes the following rows into the Sorter transformation:

ORDER_ID ITEM_ID QUANTITY DISCOUNT

45 123456 3 3.04

45 456789 2 12.02

43 000246 6 34.55

41 000468 5 .56

After sorting the data, the Integration Service passes the following rows out of the Sorter transformation:

ORDER_ID ITEM_ID QUANTITY DISCOUNT

41 000468 5 .56

43 000246 6 34.55

45 123456 3 3.04

45 456789 2 12.02
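The example above can be reproduced with a short Python sketch. It assumes the sort key is ORDER_ID in ascending order; the text does not name the key, but the output rows are consistent with that choice. The variable names are illustrative only.

```python
# Rows as they enter the Sorter transformation in the example above.
# ITEM_ID is kept as a string so that leading zeros survive.
input_rows = [
    (45, "123456", 3, 3.04),
    (45, "456789", 2, 12.02),
    (43, "000246", 6, 34.55),
    (41, "000468", 5, 0.56),
]

# Assumed sort key: ORDER_ID (the first column), ascending.
# sorted() is stable, so the two ORDER_ID 45 rows keep their input order.
output_rows = sorted(input_rows, key=lambda row: row[0])

for row in output_rows:
    print(*row)
```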

Sorter Transformation Properties

The Sorter transformation has several properties that specify additional sort criteria. The Integration Service applies these criteria to all sort key ports. The Sorter transformation properties also determine the system resources the Integration Service allocates when it sorts data.

Sorter Cache Size

The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. The Integration Service passes all incoming data into the Sorter transformation before it performs the sort operation. You can configure a numeric value for the Sorter cache, or you can configure the Integration Service to determine the cache size at run time. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service.

Before starting the sort operation, the Integration Service allocates the amount of memory configured for the Sorter cache size. If the Integration Service runs a partitioned session, it allocates the specified amount of Sorter cache memory for each partition.
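Because the cache is allocated per partition, the memory taken by one Sorter transformation grows with the partition count. The sketch below is simple arithmetic under that assumption. The function name and the partition count of 4 are hypothetical, and 16,777,216 bytes is the default cache size given in this document.

```python
DEFAULT_SORTER_CACHE_BYTES = 16_777_216  # 16 MB default stated in this document

def total_sorter_cache(cache_size_bytes, partitions):
    """Memory allocated for one Sorter across all partitions of a session."""
    if partitions < 1:
        raise ValueError("a session has at least one partition")
    return cache_size_bytes * partitions

# Hypothetical session with 4 partitions and the default cache size.
total = total_sorter_cache(DEFAULT_SORTER_CACHE_BYTES, 4)  # 67,108,864 bytes
```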

If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine. Allocate at least 16 MB (16,777,216 bytes) of physical memory to sort data using the Sorter transformation. Sorter cache size is set to 16,777,216 bytes by default.

If the amount of incoming data is greater than the Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. When it stores data in the work directory, the Integration Service requires disk space of at least twice the amount of incoming data. If the amount of incoming data is significantly greater than the Sorter cache size, the Integration Service may require considerably more than twice the amount of incoming data in work directory disk space.

Use the following formula to determine the size of incoming data:

number_of_input_rows × [(Σ column_size) + 16]

Table 20-1 gives the column size values by datatype for Sorter data calculations:

Table 20-1. Column Sizes for Sorter Data Calculations

Datatype                                            Column Size
Binary                                              precision + 8, rounded to the nearest multiple of 8
Date/Time                                           29
Decimal, high precision off (all precision)         16
Decimal, high precision on (precision <=18)         24
Decimal, high precision on (precision >18, <=28)    32
Decimal, high precision on (precision >28)          16
Decimal, high precision on (negative scale)         16
Double                                              16
Real                                                16
Integer                                             16
Small integer                                       16
Bigint                                              64
NString, NText, String, Text                        Unicode mode: 2*(precision + 5)
                                                    ASCII mode: precision + 9

The column sizes include the bytes required for a null indicator.
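The incoming data size formula and the disk space guidance can be combined in a small worked example. In the Python sketch below, the row layout (three Integer ports and one Double port) and the row count of 1,000,000 are hypothetical; the column sizes come from Table 20-1, and the cache size is the documented default. Treat it as an estimate aid, not as the Integration Service's own calculation.

```python
PER_ROW_CONSTANT = 16  # the "+ 16" term in the formula

def incoming_data_size(number_of_input_rows, column_sizes):
    """number_of_input_rows * (sum of column sizes + 16), in bytes."""
    return number_of_input_rows * (sum(column_sizes) + PER_ROW_CONSTANT)

# Hypothetical row: three Integer ports and one Double port,
# each 16 bytes according to Table 20-1.
column_sizes = [16, 16, 16, 16]
size = incoming_data_size(1_000_000, column_sizes)  # 80,000,000 bytes

sorter_cache = 16_777_216  # default Sorter cache size
spills_to_disk = size > sorter_cache

# Work directory needs at least twice the incoming data when it spills.
min_work_dir_bytes = 2 * size if spills_to_disk else 0
```

With these assumed inputs the data exceeds the default cache, so the work directory would need at least 160,000,000 bytes free.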

To increase performance for the sort operation, the Integration Service aligns all data for the Sorter transformation memory on an 8-byte boundary. Each Sorter column includes rounding to the nearest multiple of eight.

The Integration Service also writes the row size and amount of memory the Sorter transformation uses to the session log when you configure the Sorter transformation tracing level to Normal.

Case Sensitive

The Case Sensitive property determines whether the Integration Service considers case when sorting data. When you enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

Work Directory

You must specify a work directory the Integration Service uses to create temporary files while it sorts data. After the Integration Service sorts the data, it deletes the temporary files. You can specify any directory on the Integration Service machine to use as a work directory. By default, the Integration Service uses the value specified for the $PMTempDir process variable.

When you partition a session with a Sorter transformation, you can specify a different work directory for each partition in the pipeline. To increase session performance, specify work directories on physically separate disks on the Integration Service system.

Distinct Output Rows

You can configure the Sorter transformation to treat output rows as distinct. If you configure the Sorter transformation for distinct output rows, the Mapping Designer configures all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation.
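The effect of distinct output can be sketched in Python: with every port in the sort key, sorting places identical rows next to each other, and each row equal to its predecessor is dropped. The sample rows are hypothetical, and this is one way to get the same result, not a description of the Integration Service internals.

```python
# Hypothetical input rows (ORDER_ID, ITEM_ID, QUANTITY).
rows = [
    (45, "123456", 3),
    (41, "000468", 5),
    (45, "123456", 3),  # exact duplicate of the first row
    (45, "123456", 4),  # differs in one column, so it is kept
]

# Sorting whole tuples uses every column as part of the sort key, which
# makes duplicates adjacent. Keep a row only if it differs from the
# previously kept row.
distinct_rows = []
for row in sorted(rows):
    if not distinct_rows or row != distinct_rows[-1]:
        distinct_rows.append(row)
```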

Tracing Level

Configure the Sorter transformation tracing level to control the number and type of Sorter error and status messages the Integration Service writes to the session log. At Normal tracing level, the Integration Service writes the size of the row passed to the Sorter transformation and the amount of memory the Sorter transformation allocates for the sort operation. The Integration Service also writes the time and date when it passes the first and last input rows to the Sorter transformation.

If you configure the Sorter transformation tracing level to Verbose Data, the Integration Service writes the time the Sorter transformation finishes passing all data to the next transformation in the pipeline. The Integration Service also writes the time to the session log when the Sorter transformation releases memory resources and removes temporary files from the work directory.

Null Treated Low

You can configure the way the Sorter transformation treats null values. Enable this property if you want the Integration Service to treat null values as lower than any other value when it performs the sort operation. Disable this option if you want the Integration Service to treat null values as higher than any other value.
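A minimal Python sketch of the two null orderings follows, with None standing in for a null value. The sort_with_nulls helper and its null_treated_low parameter are hypothetical names chosen to mirror the property.

```python
def sort_with_nulls(values, null_treated_low=True):
    """Sort ascending, placing None before or after all other values."""
    def key(value):
        is_null = value is None
        # The first tuple element decides where nulls go; the second
        # orders the non-null values. Nulls get a placeholder of 0 that
        # is only ever compared against other nulls.
        if null_treated_low:
            return (0 if is_null else 1, 0 if is_null else value)
        return (1 if is_null else 0, 0 if is_null else value)
    return sorted(values, key=key)

values = [3, None, 1, 2]
low = sort_with_nulls(values, null_treated_low=True)    # [None, 1, 2, 3]
high = sort_with_nulls(values, null_treated_low=False)  # [1, 2, 3, None]
```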

Transformation Scope

The transformation scope specifies how the Integration Service applies the transformation logic to incoming data:

Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.

All Input. Applies the transformation logic to all incoming data. When you choose All Input, PowerCenter drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source.
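The difference between the two scopes can be sketched in Python by modelling each transaction as a list of values. The sort_by_scope function and the sample data are hypothetical and only illustrate which rows are sorted together under each setting.

```python
def sort_by_scope(transactions, scope):
    """Sort values per transaction or across all input.

    transactions: list of lists, one inner list per transaction.
    Returns a list of sorted groups.
    """
    if scope == "Transaction":
        # Each transaction is sorted on its own; boundaries are kept.
        return [sorted(t) for t in transactions]
    if scope == "All Input":
        # Transaction boundaries are dropped; everything is sorted together.
        all_rows = [row for t in transactions for row in t]
        return [sorted(all_rows)]
    raise ValueError(f"unknown scope: {scope}")

transactions = [[3, 1], [2, 0]]
per_transaction = sort_by_scope(transactions, "Transaction")  # [[1, 3], [0, 2]]
all_input = sort_by_scope(transactions, "All Input")          # [[0, 1, 2, 3]]
```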

Creating a Sorter Transformation

To add a Sorter transformation to a mapping, complete the following steps.

To create a Sorter transformation:

1. In the Mapping Designer, click Transformation > Create. Select the Sorter transformation.

The naming convention for Sorter transformations is SRT_TransformationName. Enter a description for the transformation. This description appears in the Repository Manager, making it easier to understand what the transformation does.

2. Enter a name for the Sorter and click Create.

The Designer creates the Sorter transformation.

3. Click Done.

4. Drag the ports you want to sort into the Sorter transformation.

The Designer creates the input/output ports for each port you include.

5. Double-click the title bar of the transformation to open the Edit Transformations dialog box.

6. Select the Ports tab.

7. Select the ports you want to use as the sort key.

8. For each port selected as part of the sort key, specify whether you want the Integration Service to sort data in ascending or descending order.

9. Select the Properties tab. Modify the Sorter transformation properties.

10. Select the Metadata Extensions tab. Create or edit metadata extensions for the Sorter transformation.

11. Click OK.