© 2009 IBM Corporation Essentials for Test Data Management



Essentials for Test Data Management


Agenda

• Drivers for Effective Test Data Management (TDM)

• Effective Test Data Management

• Test Environment Creation

• Data Masking Considerations

• Editing Test Data

• Compare

• Refreshing Test Environments

• IBM Optim

• Q&A


The Challenge

1. Production 500GB
2. Training 500GB
3. Unit Test 500GB
4. System Test 500GB
5. UAT 500GB
6. Integration 500GB

Total 3 TB
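The arithmetic behind this slide can be sketched in a few lines. The environment names and the 500GB size come from the slide; the 10% subset ratio is a purely hypothetical figure added for comparison, and 1 TB is taken as 1000 GB.

```python
# Storage footprint when every environment is a full clone of production.
# Sizes come from the slide; SUBSET_RATIO is a hypothetical illustration.
PRODUCTION_GB = 500
NON_PRODUCTION = ["Training", "Unit Test", "System Test", "UAT", "Integration"]
SUBSET_RATIO = 0.10  # assumed: each subset holds 10% of production


def total_footprint_gb(production_gb, environments, ratio=1.0):
    """Production plus one copy per environment, each scaled by ratio."""
    return production_gb + len(environments) * production_gb * ratio


cloned_gb = total_footprint_gb(PRODUCTION_GB, NON_PRODUCTION)
subset_gb = total_footprint_gb(PRODUCTION_GB, NON_PRODUCTION, SUBSET_RATIO)

print(f"Full clones:  {cloned_gb / 1000:.2f} TB")
print(f"With subsets: {subset_gb / 1000:.2f} TB")
```

With full clones the six databases add up to the 3 TB shown on the slide; the second figure changes with whatever subset ratio is assumed.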


Test Data Management (TDM): What & Why?

What?

• TDM refers to the need to manage data used in testing and other non-production environments

• Extract related subsets of production data that are targeted to functionality under test

• Edit data to create error and boundary conditions

• De-identify (mask) test data to protect privacy

• Compare “before” and “after” images of test data

Why?

• Your business can deploy new/improved enterprise applications faster without sacrificing quality

  • increase revenue generation

• Your business can benefit from using IT resources more effectively

  • reduce costs

• Your company can implement a reliable database upgrade

  • ensure positive customer experience


Data Privacy Considerations

• Organizations need the ability to de-identify, mask and transform sensitive data

• Companies can apply a range of transformation techniques to substitute customer data with contextually-accurate but fictionalized data to produce accurate test results

• By masking personally-identifying information, you protect the privacy and security of confidential customer data, and support compliance with local, state, national, international and industry-based privacy regulations


What If I Don’t Do Anything?

• Infrastructure Costs – higher storage costs

• Cloning databases requires more storage hardware

• Larger databases could mean more license costs

• Higher staff costs

• Greater data volumes take longer to clone

• Greater data volumes equate to longer test cycles

• Defects can be expensive

• Costs to resolve defects in production can be 10 – 100 times greater than those caught in the development environment

• Privacy breaches


© 2009 IBM Corporation, 6/19/2009

The Symptoms of Poor Testing Strategies

• Management notices that new application functionality is delayed three months

• The business is unable to compete for customers because their software lacks “state-of-the-art” functionality

• The CFO is complaining about how high the IT budget has become to fix application defects

• Developers are sitting around waiting for their copy of the database to work with


TDM: Benefits to Key Stakeholders

CIO

• Speed time-to-market without sacrificing quality.

• Ensure consistent testing methodologies and reduce costs.

• Minimize threat of data breach.

VP, Line of Business

• Ensure a reliable, positive customer experience.

• Sustain or react to competitive situations quickly.

• Provide customers with sense of security.

Director of IT

• Populate realistic test data to improve testing and quality.

• Streamline testing processes for optimal environment.

• Consistent methodology for privatization of data.


Effective Test Data Management


Test Data Management – Building Blocks

[Diagram: the building blocks arranged around the test cycle]

(1) Create Test Environment

(2) Privatization of Personal Information

(3) Inspect and Edit Data to Test Error Routines

(4) Compare Before/After Data

(5) Refresh Test Data

Other blocks in the diagram: Create/Modify Application, TEST, Go Production!, Archive Old Data, Correct Errors in Production Data


(1) Environment Creation: Some Current Practices

#1 – Clone Production

• Request for Copy

• Wait

• Clone Production (Production Database, Copy)

• After Changes, manual examination: Right data? What changed? Correct results? Unintended result? Someone else modify?

#2 – Write SQL

• Share test database with everyone else

• RI Accuracy?

• Right Data?

• Expensive, Dedicated Staff, Ongoing Responsibility

• Complex

• Subject to Change

• Write SQL, Extract, Changes, Extract After


(1) What is Subsetting?

[Diagram labels: Production or Production Clone; Development Environment; QA Environment; Test Environment; Training Environment]

• Create targeted, right-sized test environments instead of cloning entire production environments

• Development environments are then more manageable, speeding the testing process!


(1) Testing Best Practices – Oracle

Tip #27: Test with a Representative Subset of Production Data

“When performing the development upgrade, it is important to leverage a representative subset of production data instead of an exact copy; this is because the development environment usually has less capacity in both memory and hard drive space than the test and production environments. Limiting the size of the conversion files during the development upgrade will better ensure that the processes will complete in a timely manner.”


(1) Test Environment Creation Using Subsetting

[Diagram labels: Production Environment; Baseline Subset; Extract/Archive File; Test (DB2 LUW/AIX); Dev (Oracle/Solaris); QA (Sybase/Linux)]

• Dynamically load relationally intact data sets & objects based on selection criteria


(1) Creating the Baseline Subset

[Diagram labels: Production Environment; Baseline Subset]

2 Common Approaches:

• Clone production and truncate transactions

• Extract and seed common setup data


(1) Extracting a Subset Using Templates

• Criteria can be based on one or more modules

• All Date Values

• Create Date

• Transaction Date

• Effective Date

• Organizations

• Status

• Order number(s)

• “And/Or” combinations

• More….


(1) Ensure Referential Integrity in Subset

Complete Business Object
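One way to picture a referentially intact subset is the minimal sketch below, which uses Python's built-in sqlite3 module. The schema (customers, orders, order_items), the sample rows, and the helper names are all hypothetical; this is not how any particular TDM product works. The idea is to pick the driving rows by selection criteria (here status and create date), then pull in the parent and child rows so that each extracted order is a complete business object.

```python
import sqlite3

# Hypothetical schema: customers <- orders <- order_items.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    status TEXT, created TEXT);
CREATE TABLE order_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    sku TEXT);
"""


def new_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce RI on every insert
    conn.executescript(SCHEMA)
    return conn


def rows_where_in(conn, table, column, values):
    """Fetch rows whose column is in values (table/column are fixed names)."""
    if not values:
        return []
    marks = ",".join("?" * len(values))
    sql = f"SELECT * FROM {table} WHERE {column} IN ({marks})"
    return conn.execute(sql, list(values)).fetchall()


def copy_rows(conn, table, rows):
    if rows:
        marks = ",".join("?" * len(rows[0]))
        conn.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)


def extract_subset(src, status, created_from):
    """Select orders by criteria, then add their parents and children."""
    orders = src.execute(
        "SELECT * FROM orders WHERE status = ? AND created >= ?",
        (status, created_from)).fetchall()
    order_ids = [row[0] for row in orders]
    customer_ids = sorted({row[1] for row in orders})
    dst = new_db()
    # Load parents before children so foreign keys always resolve.
    copy_rows(dst, "customers",
              rows_where_in(src, "customers", "id", customer_ids))
    copy_rows(dst, "orders", orders)
    copy_rows(dst, "order_items",
              rows_where_in(src, "order_items", "order_id", order_ids))
    dst.commit()
    return dst


src = new_db()
copy_rows(src, "customers", [(1, "A"), (2, "B"), (3, "C")])
copy_rows(src, "orders", [(10, 1, "OPEN", "2009-01-15"),
                          (11, 2, "CLOSED", "2009-02-01"),
                          (12, 3, "OPEN", "2008-06-30")])
copy_rows(src, "order_items", [(100, 10, "W1"), (101, 10, "W2"),
                               (102, 11, "W1"), (103, 12, "W3")])

subset = extract_subset(src, "OPEN", "2009-01-01")
for table in ("customers", "orders", "order_items"):
    count = subset.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    print(table, count)
```

Only one order meets both criteria, so the subset holds that order, its one customer, and its two items. Because foreign keys are enforced in the target, a missing parent row would make the load fail rather than leave an orphan.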


(2) Data Masking

• Also known as: data de-identification, depersonalization, desensitization, obfuscation, data scrubbing

• Technology that helps conceal real data

• Scrambles data to create new, legible data

• Retains the data's properties, such as its width, type and format

• Common data masking algorithms include random, substring, concatenation, date aging

• Used in non-production environments as a Best Practice to protect sensitive data
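Three of the algorithms named above (random, substring, date aging) can be sketched as small functions that keep each value's width, type, and format. The function names and sample values are illustrative assumptions, not the API of any masking product.

```python
import random
import string
from datetime import date, timedelta


def mask_random(value, rng):
    """Replace each digit or letter with a random one of the same class.

    Punctuation and spacing are kept, so width and format are preserved.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = (string.ascii_uppercase if ch.isupper()
                    else string.ascii_lowercase)
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)


def mask_substring(value, keep=1, filler="X"):
    """Keep the first `keep` characters and pad to the original width."""
    return value[:keep] + filler * (len(value) - keep)


def age_date(original, days):
    """Date aging: shift a date by a fixed number of days."""
    return original + timedelta(days=days)


rng = random.Random(0)  # seeded only so repeated runs give the same mask
masked_ssn = mask_random("123-45-6789", rng)
masked_name = mask_substring("Winters")
aged = age_date(date(2009, 6, 19), 30)

print(masked_ssn)
print(masked_name)
print(aged.isoformat())
```

The masked SSN still has the nnn-nn-nnnn shape, so application code that validates the format keeps working against test data.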


(2) Data Privacy – General Principles

• Do What is Needed – But Not more

• Balance Costs vs. Data Breach Risks.

• Identify Company Best Practices

• Designate an internal champion

• Meet Regulatory/Legal Needs

• Government Regulations/Internal Privacy Policies

• Understand Application and Business Requirements

• Developers should be debugging the test application, not the test data. Data should be masked appropriately and consistently in the application.

• Volume of Data – Independent Test Environments

• Use smaller test beds of data for frequent refreshes


(2) Data Masking Techniques

• String literal values

• Character substrings

• Random/sequential numbers

• Arithmetic expressions

• Concatenated expressions

• Date aging

• Lookup values

• Intelligence

Two examples are shown on the slide.

Data is masked with contextually correct data to preserve integrity of test data

Client Information
Client No.  SSN          Name            Address            City    State  Zip
112233      123-45-6789  Amanda Winters  40 Bayberry Drive  Elgin   IL     60123

Client Information
Client No.  SSN          Name            Address            City    State  Zip
123456      333-22-4444  Erica Schafer   12 Murray Court    Austin  TX     78704

Referential integrity is maintained with key propagation

Personal Info Table (original)
PersNbr  FirstName  LastName
08054    Alice      Bennett
19101    Carl       Davis
27645    Elliot     Flynn

Event Table (original)
PersNbr  FstNEvtOwn  LstNEvtOwn
27645    Elliot      Flynn
27645    Elliot      Flynn

Personal Info Table (masked)
PersNbr  FirstName  LastName
10000    Jeanne     Renoir
10001    Claude     Monet
10002    Pablo      Picasso

Event Table (masked)
PersNbr  FstNEvtOwn  LstNEvtOwn
10002    Pablo       Picasso
10002    Pablo       Picasso
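Key propagation can be sketched with the Personal Info and Event rows from this slide. The approach below is one plausible reading of the example, not a documented algorithm: each parent key is replaced with a sequential number starting at 10000, names are taken from a lookup list, and the same mapping is then applied to the child table. The function name and the lookup list are assumptions for illustration.

```python
# Lookup values used for the masked names (taken from the slide's example).
LOOKUP_NAMES = [("Jeanne", "Renoir"), ("Claude", "Monet"), ("Pablo", "Picasso")]


def mask_with_key_propagation(persons, events, start=10000):
    """Mask the parent table, then reuse the same mapping for child rows.

    persons and events are lists of (PersNbr, first name, last name).
    An event that references an unknown PersNbr raises KeyError.
    """
    key_map = {}
    masked_persons = []
    for offset, (pers_nbr, _first, _last) in enumerate(persons):
        new_key = str(start + offset)  # sequential replacement key
        first, last = LOOKUP_NAMES[offset % len(LOOKUP_NAMES)]
        key_map[pers_nbr] = (new_key, first, last)
        masked_persons.append((new_key, first, last))
    # Child rows get the masked values of the parent they point to.
    masked_events = [key_map[pers_nbr] for pers_nbr, _f, _l in events]
    return masked_persons, masked_events


persons = [("08054", "Alice", "Bennett"),
           ("19101", "Carl", "Davis"),
           ("27645", "Elliot", "Flynn")]
events = [("27645", "Elliot", "Flynn"),
          ("27645", "Elliot", "Flynn")]

masked_persons, masked_events = mask_with_key_propagation(persons, events)
for row in masked_persons + masked_events:
    print(*row)
```

Because both tables are masked through one mapping, every Event row still joins to exactly one Personal Info row after masking.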


(3) Browse and Edit Test Data

• You must be sure that all logic paths are tested

• BUT…

• Your production data may not contain all the needed test cases

• Errors

• Boundary conditions

• Unusual combinations of data


(3) Editing Test Data

• Browse or edit referentially intact sets of data, from multiple related tables, simultaneously on one screen

• Create data values to test program logic

• Inspect and correct data that is causing problems

• Verify execution results

• Dynamically “join” related tables and views, synchronously scroll related data, and edit the data displayed.

• Boundary conditions

• Error conditions

• Rare combinations of data.


(4) Comparing Data

• Compare the "before" and "after" data from an application test

• Compare results after running modified application during regression testing

• Identify differences between separate databases

• Audit changes to a database

• Compare analyzes complete sets of data, finding changes in rows in tables

• Single-table or multi-table compare

• Creates compare file of results

• Displays results on screen


(4) Analyzing Test Data Results

• Both Invoices total $100

• Composition is different

• Could we have missed an error?

INVOICES, Version 1
27645  86-4538  Widget#1     $80.00
27645  86-4538  Widget#PG13  $20.00
Invoice Total                $100.00

INVOICES, Version 2
27645  86-4538  Widget#1     $50.00
27645  86-4538  Widget#PG13  $50.00
Invoice Total                $100.00
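The invoice example shows why a total-only check is not enough. The sketch below compares the two versions row by row, keyed on the first three columns of each line; treating those columns as the match key is an assumption, since the slide does not label them.

```python
from decimal import Decimal


def compare_rows(before, after):
    """Return keys only in before, keys only in after, and changed values."""
    only_before = sorted(before.keys() - after.keys())
    only_after = sorted(after.keys() - before.keys())
    changed = {key: (before[key], after[key])
               for key in sorted(before.keys() & after.keys())
               if before[key] != after[key]}
    return only_before, only_after, changed


# Invoice lines from the slide; the key columns are assumed, not labeled.
version_1 = {("27645", "86-4538", "Widget#1"): Decimal("80.00"),
             ("27645", "86-4538", "Widget#PG13"): Decimal("20.00")}
version_2 = {("27645", "86-4538", "Widget#1"): Decimal("50.00"),
             ("27645", "86-4538", "Widget#PG13"): Decimal("50.00")}

totals_match = sum(version_1.values()) == sum(version_2.values())
only_before, only_after, changed = compare_rows(version_1, version_2)

print("Totals match:", totals_match)
for key, (old, new) in changed.items():
    print(key[2], old, "->", new)
```

The totals match, yet both line items differ, which is the kind of discrepancy a row-level compare surfaces and a summary check hides.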


(4) Browsing the Compare File

• Generated for each pair of tables

• Identifies tables containing unmatched rows

• Identifies tables containing duplicate match keys


(4) View Details of Discrepancies


(5) Easily Refresh Test Environments

[Diagram labels: Production Environment; Baseline Subset; Extract/Archive File; Test (DB2 LUW/AIX); Dev (Oracle/Solaris); QA (Sybase/Linux)]

• Dynamically load relationally intact data sets & objects based on selection criteria


Additional TDM Features to Consider

• Compare Pre and Post Mask

• Extract File Browsing

• Schedule Jobs

• Command Line Interface

• Federated Data Access

• MetaData Extracts

• Re-startability


Benefits of Test Data Management

• Efficient creation and management of test environments

• Environment size reduction

• Replication time reduction

• Fewer users per environment through a segmented test process.

• Increase accuracy of testing through fresher Data

• Reduced time to conduct the tests

• More parallel testing possible

• Reusable tool and methodology

• Reduced risk to test data

• Reduced volume of exposed data.

• Reduced value of exposed data via masking

• Increased regulatory compliance

• Reduced risk of legal exposure


Why Do Something? TDM Saves Money

Leading North American Financial Institution –

Eliminated downtime associated with rebuilding test environments, with savings of up to $250,000 per year. Achieved more than $100,000 annual savings collectively for 10 to 15 projects.

Large International Financial Services Group –

Reduced the time needed to create a test environment by up to 90% (from 20 days to just 2 days). Improved time-to-deployment of new application functionality, contributing to critical business/financial initiatives.

Leading Banking & Payment Technology Solutions –

Reduced operational cost and improved efficiencies by reducing the size of test database from 1.2TB to 24GB
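The reductions quoted in these examples can be checked with a short calculation. The helper name is illustrative, and 1.2TB is taken as 1200 GB (a decimal convention assumed here).

```python
def reduction_pct(before, after):
    """Percentage reduction from before to after."""
    return (before - after) / before * 100


# Figures from the customer examples; 1.2TB is assumed to mean 1200 GB.
build_time_pct = reduction_pct(20, 2)    # days to create a test environment
db_size_pct = reduction_pct(1200, 24)    # test database size in GB

print(f"Build time reduced by {build_time_pct:.0f}%")
print(f"Database size reduced by {db_size_pct:.0f}%")
```

The build-time figure agrees with the 90% stated in the example, and the database shrinks by about 98% under the stated assumption.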


Questions?


Thank You


About IBM Optim

• Proven leader in Integrated Data Management (IDM):

• Manage and Control Data Growth

• Data Retention, Compliance & Discovery

• Speed Application Delivery & Quality with Test Data Management

• Speed Application Upgrades & Migrations

• Application Retirement

• Improve Storage Management – ILM

• Improve Application Performance and SLAs

• Solving complex data management issues since 1989

• Global company: 2500 clients; 50% of Fortune 500

• Recognized by Gartner, IDC, META as EDM industry leader with 46% market share.


Optim™ Solves the IDM Challenge

• Archiving

• Improve performance

• Control data growth, save storage

• Support retention compliance

• Streamline upgrades

• Test Data Management

• Create targeted, right sized test environments

• Improve application quality

• Speed iterative testing processes

• Data Privacy

• Mask confidential data

• Comply with privacy policies

• Application Migration & Retirement

• Maintain referential integrity


IBM Integrated Data Management

IBM Optim: Enterprise Architecture

Database Design, Development & Administration, Data Growth, Data Privacy, Test Data Management, Application Upgrades & Retirements, Data Retention & E-Discovery


Trademarks and disclaimers

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others. Information is provided "AS IS" without warranty of any kind.

The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.

Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.

Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your geography.

Photographs shown may be engineering prototypes. Changes may be incorporated in production models.

© IBM Corporation 1994-2008. All rights reserved.

References in this document to IBM products or services do not imply that IBM intends to make them available in every country.

Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the World Wide Web at http://www.ibm.com/legal/copytrade.shtml.