

    IBM DB2 for Linux, UNIX, and Windows

Best Practices: Data Life Cycle Management

    Christopher Tsounis

    Executive IT Specialist

    Information Management Technical Sales

    Enzo Cialini

    Senior Technical Staff Member

    DB2 Data Server Development

    Last updated: 2009-10-23


Executive Summary
Introduction
Partitioning techniques
    What is database partitioning?
    What is table partitioning?
Multi-dimensional clustering
    Features of MDC that benefit roll-in and roll-out of data
Using database partitioning, table partitioning and multi-dimensional clustering in the same database design
Additional techniques to support life cycle management
    Large table spaces
    SET INTEGRITY operation
    Asynchronous index cleanup
Designing and implementing your table partitioning strategy
    Design best practices
    Maximizing the benefits of partition elimination
    Operational considerations
Rolling in data: Which solution to use?
Best practices for roll-in of compressed table partitions
Best practice for roll-in and roll-out with continuous updates
After roll-out: How to manage data growth and retention?
    Using UNION ALL views
    Using IBM Optim Data Growth Solution
Best Practices
Conclusion
Further reading
    Contributors
Notices
    Trademarks


Executive Summary

Today's database applications frequently require scalability and rapid roll-in and roll-out of data with minimal disruption to data access by applications. Roll-in of data refers to the addition of new data as it becomes available, while roll-out of data refers to the moving out (usually archiving) of historical data. Many applications today are accessed 24x7, eliminating the batch window that was previously available for data updates. Also, many applications require a continuous feed of data updates while applications concurrently access the data.

    The DB2 database system provides a variety of facilities that enable scalability and

    facilitate the continuous feed or roll-in and roll-out of data, with minimal interruption of

    data access. This document recommends best practices to design and implement these

    DB2 facilities to achieve these goals.


Introduction

This paper describes the best DB2 design practices to facilitate the life-cycle management of DB2 data. Life-cycle management is the efficient addition (roll-in) of new data, and archival (roll-out) of data no longer required in the main database. The DB2 database system provides the following features that you can use in combination to facilitate life-cycle management:

    Database partitioning

    Table partitioning

    Multi-dimensional clustering

    UNION ALL views

    In addition to these DB2 features, the IBM Optim Data Growth solution facilitates

    archiving for data life cycle management.

    An important benefit of DB2 database system partitioning facilities is the ability to

    deploy and modify these facilities without impacting existing application code.

This paper is part of a family of related best practice papers; you would also benefit from reading the following best practice papers:

    Physical Database Design

    Minimizing Planned Outages

    Row Compression.

    The target audience for this paper is personnel responsible for database design for DB2

    applications. Database personnel who want to achieve scalability and efficient life cycle

    management of data should also find it valuable. This paper assumes you have moderate

    experience in designing DB2 databases.

    This paper is based on the facilities available in DB2 Version 9.5 and DB2 Version 9.7.

    Subsequent releases of the DB2 database system might provide enhancements that alter

    the best practices recommendations in this document.


    Partitioning techniques

    What is database partitioning?

Database partitioning (formerly known as DPF) distributes data across logical nodes of the database by using a key-hashing algorithm. The goal of database partitioning is to

    maximize scalability by evenly distributing data across clusters of computers. Database

    partitioning further enhances scalability by reducing the granularity of DB2 utility

    operations. It also parallelizes query and update operations on the database.

    The following example demonstrates how to specify database partitioning:

CREATE TABLE Test
  (Account_Number INTEGER,
   Trade_date DATE)
  DISTRIBUTE BY HASH (Account_Number)

    Note: In DB2 Version 9.1, the PARTITION KEY clause is renamed DISTRIBUTE BY.

    Database partitioning is completely transparent, so it does not impact existing

    application code. Also, you can modify partitioning online using the redistribution

    utility, without affecting application code.
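For example, a minimal sketch (the partition group name is hypothetical) of redistributing data evenly across the partitions of a database partition group by using the redistribution utility:

REDISTRIBUTE DATABASE PARTITION GROUP pg_sales UNIFORM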

    When you design your database partitioning strategy, use a partitioning key column

    with high cardinality to help ensure even distribution of data across logical nodes. A

    column with high cardinality has many unique values (rather than most values being the

    same). Also, unique indexes must be a superset of the partitioning key.

    Try to use the same partitioning key on tables that get joined. This increases the

    collocation of joins.
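For example, a minimal sketch (table and column names are hypothetical) of two tables that share a distribution key, so that a join on that key can be resolved locally on each database partition:

CREATE TABLE Account
  (Account_Number INTEGER NOT NULL,
   Branch_Id INTEGER)
  DISTRIBUTE BY HASH (Account_Number)

CREATE TABLE Trade
  (Account_Number INTEGER NOT NULL,
   Trade_date DATE)
  DISTRIBUTE BY HASH (Account_Number)

-- Both tables hash on Account_Number, so this join is collocated:
SELECT t.Trade_date, a.Branch_Id
FROM Account a, Trade t
WHERE a.Account_Number = t.Account_Number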

What is table partitioning?

Table partitioning (frequently called range partitioning) splits data by specific ranges of key values over one or more physical objects within a logical database partition. The goal of table partitioning is to organize data to facilitate optimal data access and the roll-out of data. Table partitioning might also facilitate the roll-in of data for certain applications; however, multi-dimensional clustering (discussed next, in the section Multi-dimensional clustering) is often a better choice to enhance roll-in. Database partitioning remains the best practice for reducing the granularity of utility operations for the scalability of very large databases.

    Table partitioning has the following benefits:


Improved query performance by eliminating irrelevant partitions. The optimizer can limit SQL access to only the partitions relevant to the WHERE clause.

    Optimized roll-in and roll-out processing of ranges. Table partitioning allows

    easy addition and removal of table partitions with no data movement.

    Applications can perform read and write operations against older data when

    partitions are added (queries are drained for a brief period).

    Maintained compression ratios across data that changes over time. Each table

    partition has its own compression dictionary. Thus, compressed data in older

    partitions is not affected by the changing characteristics of newly inserted data.

    Optimized management of very large tables. Table-partitioned tables can be

    virtually unlimited in size, because the limits are per partition (not per table).

    You can place ranges of data across multiple table spaces to facilitate the backup

    and restore of this data.

Greater flexibility of index placement in SMS table spaces. You can store indexes in separate SMS large table spaces (supported for table-partitioned tables only).

    Separate index placement into DMS table spaces is available for all tables.

    DB2 Version 9.7 enhances table partitioning with the ability to create partitioned indexes.

Partitioned indexes are stored locally with the table partitions. The benefits of partitioned indexes are:

Avoids the overhead of maintaining global indexes during SET INTEGRITY processing when attaching table partitions

Avoids asynchronous index cleanup when detaching table partitions

Improves the performance of reorganize-by-partition operations

May improve query performance by reducing the cost of index processing, due to more compact indexes

    The following example demonstrates specifying table partitioning:

CREATE TABLE Test
  (Account_Number INTEGER,
   Trade_date DATE)
  IN ts1, ts2, ts3
  PARTITION BY RANGE (Trade_date)
    (STARTING '1/1/2000' ENDING '3/31/2000',
     STARTING '4/1/2000' ENDING '6/30/2000',
     STARTING '7/1/2000' ENDING '9/30/2000')

    The following example demonstrates the creation of a partitioned table with a partitioned

    index. Partitioned indexes are created by default in DB2 Version 9.7 whenever possible:

CREATE TABLE T1 (I1 INTEGER, I2 INTEGER)
  PARTITION BY RANGE (I1)
    (STARTING (1) ENDING (10) EVERY (5));

    CREATE INDEX IND1 ON T1(I1) PARTITIONED;

    The following example demonstrates the creation of a non-partitioned index:

    CREATE INDEX IND2 ON T1(I2) NOT PARTITIONED;

    There are many additional techniques available to specify how a table is partitioned that

    are described in your DB2 documentation.

Multi-dimensional clustering

Multi-dimensional clustering (MDC) is a unique capability available only with the DB2 database system. MDC organizes data in a table by multiple key values (cells). The goal of MDC is to facilitate access to data by using multiple dimensions, keeping data access only to selected relevant cells. MDC helps to ensure that the data is always clustered by dimensions, avoiding the need for reorganization of data (MDC is designed to keep data in order).

    MDC also utilizes block indexes on each dimension (and the combined dimensions)

    versus row ID (RID) indexes. This can result in a substantial reduction in the index size

    and index levels. For example, if 100 rows can fit into a DB2 cell, the block index will

    only point to the cell rather than each of the 100 rows. This results in a reduction in I/O

    for reading and updating data (the index is only updated when the block is full).

MDC facilitates the roll-in and roll-out of data and is completely transparent to applications.

The following example demonstrates how to specify multi-dimensional clustering:

CREATE TABLE order
  (Account_Number INTEGER,
   Trade_Date DATE,
   Region CHAR(10),
   Order_Month INTEGER GENERATED ALWAYS AS (MONTH(Trade_Date)))
  IN ts1
  ORGANIZE BY DIMENSIONS (Region, Order_Month)

    When designing your MDC strategy, specify low-cardinality columns to avoid sparsely

    populated cells. Sparsely populated cells can significantly increase disk space usage. A

    column with low cardinality is likely to have many values that are the same (rather than

many unique values). You can also use a generated column to produce a highly clustered dimension. For example, a generated column or built-in function can convert a date into a month. This reduces the cardinality significantly (for a year of data, the cardinality is reduced from 365 to 12).

Features of MDC that benefit roll-in and roll-out of data

MDC is designed to maintain clustering in all dimensions, avoiding the need for reorganization of data. This can greatly reduce I/O during the roll-in process (but does

    use sequential big block I/O). Also, because indexes on MDC dimensions are block

    indexes, this allows MDC to avoid excessive index I/O during the roll-in process. Block

    indexes are smaller and shallower than a normal RID-based index, because the index

    entries point to a block rather than an entire row.

    Also, during the roll-in process, MDC reduces index maintenance because the block

    index is only updated once when the block is full (not for each row inserted as with other

    indexes). This also helps to reduce I/O.

    INSERT statements run faster when you use MDC, because MDC reuses existing empty

    blocks without the need for page splitting. Locking is also reduced for inserts because

    they occur at a block level rather than a row level.

    MDC improves the roll-out of data, because entire pages are deleted rather than each

    row. Logging is also reduced with MDC deletes (just a few bytes per page).

    Use a single-column MDC design to facilitate roll-in and roll-out and minimize an

    increase in disk space usage.

See the section called "Best practice for roll-in and roll-out with continuous updates" for

    a hypothetical application with characteristics that benefit from using MDC for rolling in

    data.


Using database partitioning, table partitioning and multi-dimensional clustering in the same database design

The best practice approach for deploying large-scale applications is to implement database partitioning, table partitioning, and MDC simultaneously in the same database design. Database partitioning provides scalability and helps ensure the even distribution of data across logical partitions; table partitioning facilitates query partition elimination and roll-out of data; and MDC improves query performance and facilitates the roll-in of data.

    For example:

CREATE TABLE Test
  (A INT, B INT, C INT, D INT)
  IN tbsp_a, tbsp_b, tbsp_c
  INDEX IN tbsp_b
  DISTRIBUTE BY HASH (A)
  PARTITION BY RANGE (B) (STARTING FROM (100) ENDING (300) EVERY (100))
  ORGANIZE BY DIMENSIONS (C, D)

Table partitioning may not fully solve scaling issues in DB2. Continue to use database partitioning to solve scalability issues for large-scale data warehouses. DB2 database partitioning and the shared-nothing architecture is the best way to provide linear scaling of your application while minimizing software bottlenecks.


    Additional techniques to support life cycle

    management

Large table spaces

Using large table spaces (the default for DB2 Version 9.1) better accommodates larger tables or indexes. It also allows more rows per page within the DB2 server.

Use large table spaces for tables that use deep compression (many rows per page) and for table-partition global indexes that are expected to exceed 64 GB with a 4 KB page size. If you are not affected by these issues, you do not have to use large table spaces. Also, you can avoid the need for large table spaces by placing each table-partition global index into a separate table space (highly recommended). Local partitioned indexes further reduce the need to deploy large table spaces.

The following table compares a regular table space (4-byte RIDs) with a large table space (6-byte RIDs), showing the maximum table space size and the maximum records per page / minimum record length for various page sizes.

Page Size   Regular: Max Size   Regular: Max Records /   Large: Max Size   Large: Max Records /
                                Min Record Length                          Min Record Length
4 KB        64 GB               251 / 14                 2 TB              287 / 12
8 KB        128 GB              253 / 30                 4 TB              580 / 12
16 KB       256 GB              254 / 62                 8 TB              1165 / 12
32 KB       512 GB              253 / 127                16 TB             2335 / 12

Note: If you alter a table space to be large, the change does not take effect until all indexes on the tables in that table space have been reorganized.
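For example, a minimal sketch (table space, container path, and table names are hypothetical) of creating a large table space and of converting an existing regular DMS table space:

CREATE LARGE TABLESPACE ts_large
  MANAGED BY DATABASE
  USING (FILE '/db2/cont/ts_large.dat' 10 G)

-- Convert an existing regular DMS table space to large; the 6-byte RIDs
-- take effect for a table only after its indexes are reorganized:
ALTER TABLESPACE ts_reg CONVERT TO LARGE
REORG INDEXES ALL FOR TABLE sales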


SET INTEGRITY operation

Running SET INTEGRITY is required when you attach a new partition to a table and when you detach a table partition from a table with a materialized query table (MQT). (Note that data in the new partition is not visible until the SET INTEGRITY process completes.) SET INTEGRITY is a potentially long-running operation that validates data and maintains global indexes. This maintenance activity is logged and might produce a large volume of log entries. DB2 Version 9.7 supports partitioned indexes, which can be created prior to attaching a new partition. This greatly reduces the time required for the SET INTEGRITY operation.

The key benefit of SET INTEGRITY is that existing data is available for read and write access during its operation. You can minimize the impact of SET INTEGRITY for large volumes of data by using MDC, implementing partitioned indexes, and minimizing your use of global indexes and MQTs. User-maintained MQTs are an alternative that you can specify to speed up SET INTEGRITY.

The section Designing and implementing your table partitioning strategy contains recommendations on the use of SET INTEGRITY.

The section Best practices for roll-in of compressed table partitions describes how to attach a table partition without requiring the execution of SET INTEGRITY.
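For example, a minimal sketch (the table name is hypothetical) of validating a newly attached partition while keeping the existing data available for writes:

SET INTEGRITY FOR sales ALLOW WRITE ACCESS IMMEDIATE CHECKED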

Asynchronous index cleanup

Asynchronous index cleanup (AIC) is a DB2 feature that reclaims space in an index after a table partition is detached; the cleanup runs at a low priority as a background process that is invoked automatically by the DB2 database system. The detachment of a table partition is near-instantaneous because of the AIC feature: detachment does not have to wait for index cleanup to complete. AIC is not performed for partitioned local indexes.


Designing and implementing your table partitioning strategy

Applications that benefit from table partitioning use the following kinds of tables:

Very large tables

Tables with queries accessing range-subsets of a table

Tables with roll-out requirements

Tables with roll-in requirements (as an alternative, consider MDC for roll-in of data)

Design best practices:

When designing your table partitioning strategy, consider the following table partitioning design best practices:

Partition on a date column (or columns) to facilitate roll-out

Partition on a column (or columns) that assists partition elimination, as discussed later in this section

Match the granularity of ranges with roll-in and roll-out criteria. This avoids the need for reorganization to reclaim space when you run DETACH.

Consider placing different ranges in separate table spaces to facilitate backup and recovery. The DB2 database system can back up and restore an entire range partition when it is placed in a separate table space.

    Consider separating active from historical data.

    To position for enhancements in a future release of the DB2 database system,

    consider making unique indexes a superset of the table partitioning key. Non-

    unique indexes can be on any columns.

Specify the placement of each of your global indexes into its own table space (use large table spaces, if required). It is a good practice to minimize the size of the table spaces containing global indexes, in order to improve backup time. Also, use database partitioning to reduce the granularity of global indexes.

Use partitioned indexes instead of global partitioned indexes where possible.

Split up the global index into multiple table spaces to ensure that a single table space does not grow too large.


    Consider strategies to minimize the impact of SET INTEGRITY. Consider the

    logging impact and elapsed time of SET INTEGRITY when attaching large

    ranges. SET INTEGRITY can also impact restart time if there is a failure.

    Prototype in your environment to see if the elapsed time is acceptable.

Otherwise, consider alternative design strategies, discussed in the sections Rolling in data: Which solution to use? and Best practices for roll-in of compressed table partitions.

    For deep compression, DB2 Version 9.5 is strongly recommended, because of its

    ability to automatically build compression dictionaries during LOAD, IMPORT

    or INSERT operations. DB2 Version 9.1 requires table reorganization to compress

    data in a table partition if a compression dictionary is not present.

With DB2 Version 9.7, consider the following partitioned index design best practices:

Partitioned indexes improve ATTACH and DETACH processing time and may improve query performance in a large database environment. The design guidelines for the creation of partitioned local indexes are:

o Non-unique indexes are partitioned indexes by default. There are no design restrictions on creating non-unique indexes as partitioned.

o Unique indexes can be partitioned only if the key is a superset of the table partitioning key (for DPF configurations, the key must also be a superset of the database partitioning key). For example:

    Database partition key: Account_Num

    Table partition key: Sales_Month

    Potential unique index that can be partitioned:

    Account_Num, Sales_Month, Store_Num

To gain the benefits of partitioned indexes, verify whether uniqueness is truly required by the application:

    o A downstream data source may enforce uniqueness

    o Non-unique indexes may increase sorting time for DISTINCT, ORDER

    BY, and GROUP BY predicates

    o Uniqueness may not be required.

    Unique partitioned indexes have to be created prior to the attachment to avoid

    index maintenance overhead

    Placing index partitions in a separate table space is a best practice


A major benefit of placing partitioned indexes in their own table space is that there is no data movement when attaching a partition, if the separate index was built prior to the ATTACH.

Partitioned indexes are placed in the same table space as the table by default, but they may be placed in a separate table space. To place a partitioned index in a separate table space, use the partition-level INDEX IN clause of the CREATE TABLE DDL, or specify it when you ALTER ... ADD the partition (DMS storage only). The table-level INDEX IN clause of the CREATE TABLE DDL is for non-partitioned indexes only.

    Partition index migration considerations and best practices

    Partition indexes created in DB2 Version 9.5 and migrated to DB2 Version 9.7 are

    placed in the same table space as the table. Data movement is required to put the

    index into a separate table space after a migration from DB2 Version 9.5.

To migrate to partitioned indexes in separate table spaces in DB2 Version 9.7:

    1. Create a new partition index in a separate table space.

    create index date_part on sales(date, status)

    partitioned;

    2. Drop the existing original partition index from the same table space as the

    data.

    drop index dateidx;

    3. Rename the new partition index in a separate table space to the same name

    as the original partition index.

    rename index date_part to dateidx;

An alternative method is to create a new table with partitioned indexes and move the data by using an online table move, in order to place the partitioned indexes into a separate table space.
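For example, a minimal sketch (schema, table, and table space names are hypothetical) of such an online table move using the ADMIN_MOVE_TABLE procedure available in DB2 Version 9.7:

CALL SYSPROC.ADMIN_MOVE_TABLE
  ('APPSCHEMA', 'SALES',
   'TS_DATA', 'TS_IDX', 'TS_LOB',
   '', '', '', '', '', 'MOVE')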

Other partitioned index design considerations

o Indexes on the source table to be attached need to match indexes on the target table

Some mismatches can be corrected before a failure occurs

Indexes that do not match will be dropped


    o Although several catalog statistics are moved during an ATTACH, the

    best practice is to run RUNSTATS after an ATTACH.

Maximizing the benefits of partition elimination:

Table partitioning provides a powerful facility to limit data access to the partitions that are required to satisfy the SQL WHERE clause. In order to benefit from partition elimination, do the following tasks:

    Prefix the cluster key with the partition key

    Ensure the range partitioning column is frequently used in the WHERE clause

    Ensure the leading columns of the composite partition key are in the WHERE

    clause

    If you are using generated columns, use them where appropriate to assist in

    partition elimination. Generated columns can be partition keys.

    Use generated columns as MDC dimensions, where appropriate to reduce the

    granularity of the dimension.

    Use multiple, separate ranges to eliminate unnecessary searches, if possible. For

    example, partition elimination could access only the months of January and

    December instead of the whole year.

    If you are using joins, partition elimination is used for inner access of the nested

    loop join only.

Partition elimination with parameter markers is pushed down at run time, when values are bound at execution time.
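For example, assuming the range-partitioned Test table created earlier (partitioned by Trade_date), a predicate on the partitioning column lets the optimizer limit the scan to a single range:

SELECT Account_Number, Trade_date
FROM Test
WHERE Trade_date BETWEEN '4/1/2000' AND '4/30/2000'
-- Only the range STARTING '4/1/2000' ENDING '6/30/2000' is scanned;
-- the other partitions are eliminated.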

Operational considerations:

Use the following best practices to enhance the operational characteristics of table partitioning:

    Issue a COMMIT statement after each step of your roll-in or roll-out procedure to

    release locks (for example, after ATTACH, DETACH, or SET INTEGRITY, and so

    on)

    Explicitly name each table partition. These names are easier to manage than the

    system generated names.

    Always terminate a failed LOAD utility run. Subsequent operations (for

    example, DROP TABLE) cannot proceed until LOAD is terminated.

    If you are appending data to a partition, specify LOAD INSERT. Performing

    LOAD REPLACE of a partition replaces an entire table (all partitions).


    Avoid attaching a partition with the same name as a detached partition. This

    results in a duplicate name until asynchronous index cleanup (AIC) completes.
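For example, a minimal sketch (table, partition, and staging-table names are hypothetical) of a roll-in and roll-out sequence that commits after each step:

ALTER TABLE sales ATTACH PARTITION sales_2000q2
  STARTING '4/1/2000' ENDING '6/30/2000'
  FROM sales_2000q2_stage
COMMIT

SET INTEGRITY FOR sales ALLOW WRITE ACCESS IMMEDIATE CHECKED
COMMIT

ALTER TABLE sales DETACH PARTITION sales_1999q2
  INTO sales_1999q2_archive
COMMIT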


Rolling in data: Which solution to use?

There are several factors that affect how you choose the best roll-in solution for your installation:

Minimizing the time it takes to bring new data into the system and make it available

Minimizing the amount of logging activity that occurs as part of the SET INTEGRITY operation during roll-in

Whether you have a requirement for continuous updates rather than a daily batch process

Maximizing compression for new ranges to effectively manage data skew

The following are the two techniques for the roll-in of data with table partitions:

    1. ALTER/ATTACH

    With the ALTER/ATTACH method you first populate the table offline, and then

    attach the partition. You must run SET INTEGRITY (a potentially long-running

    operation for large data volumes). The impact of running SET INTEGRITY may

be reduced by using partitioned indexes in DB2 Version 9.7.

    Advantages:

    Concurrent access

    All previous partitions are available for updates

No partial data view (new data cannot be seen until SET INTEGRITY completes)

    Disadvantages:

    Additional log space is required

    Long elapsed times

    Draining of queries is required

    2. ALTER/Add

    With the ALTER/Add method, you attach an empty table partition, and then

    populate it using the LOAD utility or INSERT statements.

    You do not need to run SET INTEGRITY.


    Advantages:

    Faster elapsed times

    SET INTEGRITY is not required

    Less log space for global index maintenance

    Disadvantages:

    Partial data view occurs when you use INSERT statements (not with

    LOAD utility).

    LOAD utility allows read-only access to older partitions

    Recommendation:

    For larger data volumes, utilize the ALTER/Add method for roll-in of a table partition or

    utilize MDC for roll-in if many non-partitioned indexes are deployed.
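A minimal sketch of the ALTER/Add method (table, partition, and input-file names are hypothetical): add an empty range, then populate it, so that no SET INTEGRITY run is needed:

ALTER TABLE sales ADD PARTITION sales_2000q4
  STARTING '10/1/2000' ENDING '12/31/2000'
COMMIT

LOAD FROM q4_sales.del OF DEL INSERT INTO sales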


Best practices for roll-in of compressed table partitions:

These best practices use the ALTER/ATTACH method of attaching a table partition, which is described in the preceding section.

For Version 9.1, rapidly attach a table partition with compressed data (large data volumes) by using the following technique:

    1. Load a subset of data (a true random sample) into a separate DB2 table

    2. Alter the standalone table to enable compression

    3. Reorganize the subset of data to build a compression dictionary

    4. Empty the table or retain minimal data (so the dictionary is retained)

    5. ALTER/ATTACH the table as a new table partition (the dictionary is retained)

    6. Execute SET INTEGRITY (this is rapid, due to minimal data)

    7. Populate data by using the LOAD utility or INSERT statements (compression

    will occur). For applications with continuous updates, load data into a staging

    table using the LOAD utility. Then, use an insert with a sub-select from the

    staging table or run an ETL (extract, transform, and load) job to update the

    primary tables (compression will occur). The roll-in of data can be improved

    further if you exploit the benefits of MDC within the table partition.
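A minimal sketch of steps 1 through 7 (table, partition, and file names are hypothetical; sample.del holds a true random sample of the incoming data):

-- Steps 1-3: build a compression dictionary from the sample
LOAD FROM sample.del OF DEL REPLACE INTO sales_new
ALTER TABLE sales_new COMPRESS YES
REORG TABLE sales_new

-- Step 4: empty the table; the dictionary is retained
DELETE FROM sales_new
COMMIT

-- Steps 5-6: attach and validate (fast, because the partition is nearly empty)
ALTER TABLE sales ATTACH PARTITION sales_2000q4
  STARTING '10/1/2000' ENDING '12/31/2000'
  FROM sales_new
COMMIT
SET INTEGRITY FOR sales ALLOW WRITE ACCESS IMMEDIATE CHECKED
COMMIT

-- Step 7: populate the new range; rows are compressed using the dictionary
LOAD FROM q4_sales.del OF DEL INSERT INTO sales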

For Version 9.5, the technique to rapidly attach a table partition is simplified by automatic dictionary creation:

1. ALTER/Add the empty table partition.

2. Populate the table with data by using the LOAD utility or an INSERT/SELECT statement (data is compressed with automatic dictionary creation).

    Note that a full offline reorganization of a fully-loaded partition is likely to achieve better

    compression than can be achieved with this method. DB2 Version 9.7 fix pack 1 supports

    rapid reorganization by partition when using partitioned indexes to improve

    compression results.


Best practice for roll-in and roll-out with continuous updates:

This database design combines various features of the DB2 database system to facilitate roll-in and roll-out of data with continuous update requirements.

    This design is for applications with the following characteristics:

    Continuous updates occur all day long (which prevents performing ALTER/Add

    to attach a partition).

    Data is added daily.

    Queries frequently access a certain day.

    Table partitioning on day results in too many partitions (for example, 365 days

    times 3 years).

    Roll-out occurs weekly or monthly (typically on a reporting boundary).

    Recommended database design:

    To facilitate the roll-in of data, specify a single-dimension MDC on day (see the section

    Features of MDC that benefit roll-in and roll-out of data).

    To facilitate the roll-out of data, specify a table partition range per week or month. This

    provides the same time dimension as MDC but at a coarser scale.

Applications with long-running reports might not be able to drain queries for the execution of the DB2 LOAD utility. The best practice in this case is to use the LOAD utility to rapidly load data into staging tables. Then populate the primary tables by using an insert with a sub-select.
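A minimal sketch of this design (table and column names are hypothetical): a generated day column serves as the single MDC dimension, while the same column defines one range per month for roll-out:

CREATE TABLE sales
  (Account_Number INTEGER,
   Trade_ts TIMESTAMP,
   Sales_day DATE GENERATED ALWAYS AS (DATE(Trade_ts)))
  PARTITION BY RANGE (Sales_day)
    (STARTING '1/1/2009' ENDING '1/31/2009',
     STARTING '2/1/2009' ENDING '2/28/2009',
     STARTING '3/1/2009' ENDING '3/31/2009')
  ORGANIZE BY DIMENSIONS (Sales_day)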


After roll-out: How to manage data growth and retention?

To satisfy corporate policy, government regulations, or audit requirements, you might need to retain your data and keep it accessible for long periods of time. For example, the Health Insurance Portability and Accountability Act (HIPAA) contains medical record retention requirements for health-care organizations. The Sarbanes-Oxley Act sets out certain record retention requirements for corporate accountants. Additionally, some enterprises are also finding value in performing analytics on historical data and are therefore retaining data for longer durations.

    Therefore, in addition to implementing a suitable roll-in and roll-out strategy and an

    appropriate database design, you need to consider the complete lifespan of your data

and include a policy for data retention and retrieval. You could do nothing and continually add hardware capacity and resources to accommodate the additional data growth for retention purposes; however, there are better practices for data retention, as described in this paper.

Using UNION ALL views

One practice is to keep all the data in the database but roll out certain ranges for retention, and create UNION ALL views over the ranges that require easy accessibility.

    The following example demonstrates how to create a UNION ALL view:

    CREATE VIEW all_sales AS

    (

    SELECT * FROM sales_0105

    WHERE sales_date BETWEEN '01-01-2005' AND '01-31-2005'

    UNION ALL

    SELECT * FROM sales_0205

    WHERE sales_date BETWEEN '02-01-2005' AND '02-28-2005'

    UNION ALL

    ...

    UNION ALL

    SELECT * FROM sales_1207

    WHERE sales_date BETWEEN '12-01-2007' AND '12-31-2007'

    );


    Using UNION ALL views addresses data retention and real time accessibility while

    keeping all the data maintained online in the database using primary storage. A problem

    caused by this method is that you might be unnecessarily maintaining this data in

    associated backup images. Also, historical data typically does not require high

    performance, so does not need the indexing or other high-cost factors encountered with

    your primary data.

    There are a variety of ways you could use UNION ALL views:

    Access active data using UNION ALL views and keep your historical data

    compressed in a range-partitioned table.

Keep active data in a range-partitioned table and use a UNION ALL view to access historical data in another range-partitioned table.

Using UNION ALL views has some limitations. When you have a large number of ranges, use range-partitioned tables instead, because some complex predicates and joins are not pushed down into UNION ALL views.

    However, in some situations UNION ALL views are advantageous. For example, a

    UNION ALL view may work in a federated environment, whereas a range-partitioned

    table does not.

    Although UNION ALL views may be useful in some environments, DB2 version 9.7

    users should strongly consider migrating to table partitioning.

Using IBM Optim Data Growth Solution

Depending on your service level agreement (SLA) objectives for your historical data, usually the best practice to address both data growth and retention is to implement data archiving with IBM Optim Data Growth Solution.

    IBM Optim Data Growth Solution is a leading solution for addressing growth,

    compliance and management of data. It preserves application integrity by archiving

    complete business objects, rather than single tables. For example, it retains foreign keys

    and preserves metadata within the archive. These features enable you to have:

    Flexible access to data.

    The ability to selectively or fully restore archived data into the original database

    table, or into a new table, or even into an alternate database.

    The following steps guide you through the process of determining how best to

    implement your archiving strategy.

    STEP 1: Classify your applications

    First, you need to classify your applications according to their archival requirements.

    By understanding which transactions you need to retain from your application data,


    you can group applications with similar data requirements for archive accessibility

and performance. Some applications require that only current transactions be retained; some require access to only historical transactions; and others require access to a mix of current and historical transactions (with a varying current-to-historical ratio).

    Also, consider the service level agreement (SLA) objectives for your archived data.

    An SLA is a formal agreement between groups that defines the expectations between

    them and includes objectives for items such as services, priorities, and

    responsibilities. SLA objectives are often formulated using response time goals. For

    example, a specific human resources report might need to run, on average, within 5

    minutes.

    STEP 2: Assess the temperature of your data:

    Data derives its temperature from the following criteria:

    How frequently the data is accessed

    How long it takes to access the data

    How rapidly the data changes (volatility)

    User and application requirements

    The temperature varies from enterprise to enterprise, but typically the data

    temperatures fall into common classifications across industries. The following table

    provides guidelines for data temperatures.

Hot (Tactical Data): The bulk of the queries are for current data, accessed frequently and heavily, and requiring a quick response-time turnaround.

Warm (Traditional Decision Support Data): Queries access this data less frequently, and data retrieval does not require the urgency of a quick turnaround in response time.

Cold (Deep Historical Data): Queries rarely access this data, but it must be available for periodic access.

Dormant (Regulatory): Data that needs to be available on an exception basis.

    There are various means of assessing the temperature of data. Consider business and

    application definitions and requirements, roll-out criteria, and workload and query

    tracking statistics as potential methods for determining how to classify your data

    according to temperature. Gather the following potential workload and query

    information to assess the data temperature:

    Which objects are (and are not) being accessed


The frequency with which each object is accessed

The common time intervals at which objects are accessed; for example: THIS_WEEK, LAST_WEEK, THIS_QUARTER, LAST_QUARTER

    Which data within an object is being accessed

    You can use DB2 Version 9.5 workload management (WLM) to assist in discovering

    data temperatures. The WLM historical analysis tool provides statistics on which

    tables, indexes and columns have, or have not, been accessed, along with the

    associated frequency.

The WLM historical analysis tool consists of two scripts:

wlmhist.pl: generates historical data

wlmhistrep.pl: produces reports from the historical data

    To discover which data within an object is being accessed, analyze the SQL statement

    using an ACTIVITIES event monitor to collect data on workload activities, including

    the SQL statement text. You might want to collect information about workload

    management objects such as workloads, service classes, and work classes (through

    work actions). Enable activity collection using the COLLECT ACTIVITY DATA

    WITH DETAILS clause of the CREATE or ALTER statements for the workload

    management objects for which you want to collect information, as shown in the

    following example:

ALTER SERVICE CLASS sysdefaultsubclass
  UNDER sysdefaultuserclass
  COLLECT ACTIVITY DATA ON ALL WITH DETAILS

    The WITH DETAILS clause enables collection of the statement text for both static and

    dynamic SQL.

If applications make use of parameter markers within the statement text, you should also include the AND VALUES clause (so that you have COLLECT ACTIVITY DATA WITH DETAILS AND VALUES). The AND VALUES clause collects the data values associated with the parameter markers in addition to the detailed statement information.
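For example, a minimal sketch (the monitor name is hypothetical) of creating and activating an activities event monitor that writes the collected activity and statement information to tables:

CREATE EVENT MONITOR act_mon FOR ACTIVITIES WRITE TO TABLE
SET EVENT MONITOR act_mon STATE 1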

    STEP 3: Discover and classify your business objects

    Business objects, such as insurance claims, invoices, or purchase orders, represent

    business transactions. By classifying your business objects, you can begin to define


    rules and associated business drivers for managing these objects at different stages in

    the data life cycle.

    From a database perspective, a business object represents a group of related rows

    from related tables.

Simplified example of a business object:

Given the following three tables:

PROJECT
PROJNO   PROJNAME               DEPTNO   PRJENDATE
OP1010   OPERATION              E11      2/1/2006
OP1000   OPERATION SUPPORT      E01      2/1/2006
MA2113   W L PROD CONT PROGS    D11      12/1/2002
MA2112   W L ROBOT DESIGN       D11      12/1/2002
MA2111   W L PROGRAM DESIGN     D11      12/1/2002
MA2110   W L PROGRAMMING        D11      2/1/2006
MA2100   WELD LINE AUTOMATION   D01      2/1/2006
IF2000   USER EDUCATION         C01      2/1/2006

EMPLOYEE
EMPNO   LASTNAME   WORKDEPT
310     OCONNELL   E11
170     CIALINI    D11
140     TYRRELL    C01
160     CASSELLS   D11
150     GOODMAN    D11
130     VINCENT    C01
60      TSOUNIS    D11

DEPARTMENT
DEPTNO   DEPTNAME
E11      OPERATIONS
D21      ADMINISTRATION SYSTEMS
D11      MANUFACTURING SYSTEMS
C01      INFORMATION CENTER

The business object is the Project, together with its related Department and Employee rows.

For data retention and archiving purposes, you want the complete business object to be represented, such that you have a historical point-in-time snapshot of a business transaction. Creating a historical snapshot requires both transactional detail and related master information, which involves multiple tables in the database.

Archiving complete business objects allows the archives to be intact and accurate and to provide a standalone repository of transaction history. To respond to inquiries or discovery requests, you can query this repository without the need to access "hot" data.


    In this example, to ensure the complete object is available, the archived business

    object must consist of associated data from the DEPARTMENT and EMPLOYEE

    tables. After archiving, you would only want to delete the data in the production

    PROJECT table and not in the associated EMPLOYEE and DEPARTMENT data.

    You can discover business objects based on data relationships within the schema, as

    demonstrated in this example. However, you might also want to include other

    related tables that do not have any schema relationship, but, for example, might be

    related through use of an application. In addition, you might elect to remove certain

    discovered relationships from the business object.

    STEP 4: Produce your comprehensive data classification:

    After you have classified your applications and business objects and determined

    their associated data temperatures, you can produce a data classification table to

    summarize this information. This table articulates the aging of the data.

    The following table provides a sample data classification:

Application   Business Object   Production   Online Archive   Offline Archive   Delete
AppA          Claims            0-2 yrs      3-5 yrs          6-10 yrs          >10 yrs

    STEP 5: Determine the post-archive storage type

    To determine what storage type is most appropriate for your aged data, consider the

    following questions:

    Who needs to access the archive data, and for what purpose?

    What are the response time expectations?

    How will the archive data age?

    How many storage tiers and what type of storage should be deployed, for

    example, SAN, WORM, or tape?

    For example, for online archive you could use ATA disks or large capacity slower

    drives. For offline archive, you could use tape or WORM (IBM DR550, EMC

    Centera).


[Figure: Tiered archive architecture. The production database holds current data (0-2 years). An online archive (3-5 years) resides on a non-DBMS retention platform such as an ATA file server, IBM DR550, or EMC Centera; an offline archive (6+ years) uses an offline retention platform such as CD, tape, or optical media. Data can be restored from either archive tier back to production. Universal, application-independent access to application data is provided through IBM federation, report writers, XML, ODBC/JDBC, and native application interfaces.]

STEP 6: Access to archived data

The Optim Data Growth Solution access layer uses SQL-92 capabilities and various protocols (as shown in the figure above) to provide access to the archived data. This access is out-of-line from the production database, and so does not use any resources from the production database system.

    Alternatively, you can use a federated system (using IBM DB2 Federated Server) to

    provide transparent access to the archive from the production database.

    Both methods allow for direct access to archived data, without the need to retrieve or

    restore the archived data.

    The following example demonstrates how to use a UNION ALL view to access both

    active and archived data. The example renames the database table called project to a

    different name, and then creates a UNION ALL view that is also named project.

RENAME TABLE project TO project_active

CREATE VIEW project AS
SELECT * FROM project_active
WHERE prjendate >= (CURRENT DATE - 5 YEARS)
UNION ALL
SELECT * FROM project_arch
WHERE prjendate < (CURRENT DATE - 5 YEARS)

As an alternative, the following example avoids the need to rename the table in the database. Instead, the example creates a UNION ALL view called project_all that the application can query to get the complete project data set:

CREATE VIEW project_all AS
SELECT * FROM project
WHERE prjendate >= (CURRENT DATE - 5 YEARS)
UNION ALL
SELECT * FROM project_arch
WHERE prjendate < (CURRENT DATE - 5 YEARS)


Best Practices

For database partitioning, use a partitioning key column that has high cardinality and is frequently used in join predicates.

Use database partitioning to improve scalability for large-scale data warehouses.

Use table partitioning for very large tables, tables with queries that access range-subsets of data, and tables with roll-out requirements.

For MDC, specify low-cardinality columns or use generated columns to reduce cardinality.

Use a single-column MDC design to facilitate roll-in and roll-out and to minimize increased disk space usage.

For large-scale applications, implement database partitioning, table partitioning, and MDC simultaneously.

Use large table spaces for tables with deep compression if you expect very small row sizes. For table partitioning, place each table-partition global index in a separate table space (this might avoid the need for large table spaces), or use partitioned local indexes.

For larger data volumes, use the ALTER/Add method to roll in a table partition, or use MDC.

For Version 9.1, to attach a table partition with compressed data, build a dictionary with minimal data prior to ALTER/ATTACH to avoid table reorganization.

For Version 9.5, to attach a table partition, use the ALTER/Add method.

For continuous updates, facilitate roll-in of data by specifying a single-dimension MDC on day.

Use federation to facilitate access to archived data from production databases.

Use UNION ALL views for transparent access to archived data.

IBM Optim Data Growth Solution is the recommended tool for data retention and retrieval.


    Conclusion

Careful selection of the most appropriate partitioning method for your DB2 database, and use of the most efficient roll-in and roll-out technique for your system, can maximize your system's overall performance and efficiency.

    Devote sufficient time to analyzing and understanding your data so that you can make

    the best use of the guidelines in this paper and take advantage of the features the DB2

    database system provides to help make your system as efficient as possible.

    You can use database partitioning to provide scalability and to help ensure even

    distribution of data across partitions. Follow the guidelines in the section Designing and

    implementing your table partitioning strategy to devise the most effective table

    partitioning strategy. Use MDC to help improve the performance of queries and to

    facilitate the roll-in of data.

If you need to roll in large volumes of data into compressed table partitions, upgrade to Version 9.5 of the DB2 database system and use the ALTER/Add method to attach a table partition.

    If you need to accommodate continuous updates, your best strategy is to use MDC to

    facilitate the roll-in process.

To determine how to handle the needs of your historical data, follow the guidelines in the section After roll-out: How to manage data growth and retention?

Before you are ready to roll out your data and archive it, you need to determine a policy for data retention and retrieval from archive that suits your organization.

You can better understand your organization's technical requirements for retention and retrieval by analyzing the following factors:

    The kind of transactions you need to retain

    The temperature of your data

    How your business objects are composed

Your policy should include what kind of post-archive storage is most appropriate, and how best to access the archived data. The guidelines in the section After roll-out: How to manage data growth and retention? can assist you in producing your policy.


Further reading

DB2 Best Practices - http://www.ibm.com/developerworks/db2/bestpractices/

Leveraging DB2 Data Warehouse Edition for Business Intelligence - http://www.redbooks.ibm.com/redbooks/SG247274/wwhelp/wwhimpl/java/html/wwhelp.htm

Database Partitioning, Table Partitioning, and MDC for DB2 9 - http://www.redbooks.ibm.com/redbooks/SG247467/wwhelp/wwhimpl/java/html/wwhelp.htm

DB2 V9.5 Information Center - http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp

DB2 V9.7 Information Center - http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp

Optim Data Growth Management - http://www.optimsolution.com/solutions/DataGrowth.asp

    Contributors

    Tim Vincent

    Chief Architect DB2 LUW

Bill O'Connell

    Data Warehousing CTO

    Miriam Goodwin

    Technical Sales Specialist

    Tim Smith

    Optim Product Manager

    Phrederick Tyrrell

    Data Warehousing Competitive Specialist

    Aamer Sachedina

    Senior Technical Staff Member

    DB2 Technology Development

    Matthew Huras

    DB2 LUW Kernel, Chief Architect

    Joyce Simmonds

    DB2 Information Management


Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not

    intended to state or imply that only that IBM product, program, or service may be used. Any

    functionally equivalent product, program, or service that does not infringe any IBM

    intellectual property right may be used instead. However, it is the user's responsibility to

    evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these

    patents. You can send license inquiries, in writing, to:

    IBM Director of Licensing

    IBM Corporation

    North Castle Drive

    Armonk, NY 10504-1785

    U.S.A.

    The following paragraph does not apply to the United Kingdom or any other country where

    such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES

    CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER

    EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-

    INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do

    not allow disclaimer of express or implied warranties in certain transactions, therefore, this

    statement may not apply to you.

Without limiting the above disclaimers, IBM provides no representations or warranties regarding the accuracy, reliability or serviceability of any information or recommendations

    provided in this publication, or with respect to any results that may be obtained by the use of

    the information or observance of any recommendations provided herein. The information

    contained in this document has not been submitted to any formal IBM test and is distributed

    AS IS. The use of this information or the implementation of any recommendations or

techniques herein is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Anyone attempting to adapt these techniques to their own environment does so at their own risk.

    This document and the information contained herein may be used solely in connection with

    the IBM products discussed in this document.

    This information could include technical inaccuracies or typographical errors. Changes are

    periodically made to the information herein; these changes will be incorporated in new

    editions of the publication. IBM may make improvements and/or changes in the product(s)

    and/or the program(s) described in this publication at any time without notice.

    Any references in this information to non-IBM Web sites are provided for convenience only

    and do not in any manner serve as an endorsement of those Web sites. The materials at

    those Web sites are not part of the materials for this IBM product and use of those Web sites is

    at your own risk.

IBM may use or distribute any of the information you supply in any way it believes

    appropriate without incurring any obligation to you.

    Any performance data contained herein was determined in a controlled environment.

    Therefore, the results obtained in other operating environments may vary significantly. Some

    measurements may have been made on development-level systems and there is no

    guarantee that these measurements will be the same on generally available systems.

    Furthermore, some measurements may have been estimated through extrapolation. Actual

    results may vary. Users of this document should verify the applicable data for their specific

    environment.


    Information concerning non-IBM products was obtained from the suppliers of those products,

    their published announcements or other publicly available sources. IBM has not tested those

    products and cannot confirm the accuracy of performance, compatibility or any other

    claims related to non-IBM products. Questions on the capabilities of non-IBM products should

    be addressed to the suppliers of those products.

    All statements regarding IBM's future direction or intent are subject to change or withdrawal

    without notice, and represent goals and objectives only.

    This information contains examples of data and reports used in daily business operations. To

    illustrate them as completely as possible, the examples include the names of individuals,

    companies, brands, and products. All of these names are fictitious and any similarity to the

    names and addresses used by an actual business enterprise is entirely coincidental.

    COPYRIGHT LICENSE:

    This information contains sample application programs in source language, which illustrate

    programming techniques on various operating platforms. You may copy, modify, and

    distribute these sample programs in any form without payment to IBM, for the purposes of

    developing, using, marketing or distributing application programs conforming to the

    application programming interface for the operating platform for which the sample

    programs are written. These examples have not been thoroughly tested under all conditions.

    IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these

    programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall

    not be liable for any damages arising out of your use of the sample programs.

    Trademarks

    IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International

Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml

    Windows is a trademark of Microsoft Corporation in the United States, other countries, or

    both.

    UNIX is a registered trademark of The Open Group in the United States and other countries.

    Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

    Other company, product, or service names may be trademarks or service marks of others.