A system integration test approach mitigating the unavailability of good quality data in DWT

Saurabh Shinde, Partha Majhi
Infosys Limited (NASDAQ: INFY)


Page 1: A system integration test approach mitigating the unavailability of good quality data in DWT

A system integration test approach mitigating the unavailability of good quality data in DWT

Saurabh Shinde

Partha Majhi

Infosys Limited (NASDAQ: INFY)

Page 2: A system integration test approach mitigating the unavailability of good quality data in DWT

Abstract

Retail banks face the ever-increasing challenge of servicing millions of customers with varied needs, and hence must maintain sound functional systems that deliver on time. Some of the key challenges in testing such systems are proper knowledge of the interacting systems, business awareness, and the availability of good quality test data. A bad production release might impact the overall functioning of the bank, affect the credit risk analyzers, or even lose the trust of its customers. Hence it is imperative for the bank's quality management team to have a system integration test approach that also gives due diligence to ensuring good test data.


Page 3: A system integration test approach mitigating the unavailability of good quality data in DWT

Abstract (contd.)

We would like to share our experience with an approach to such quality testing for a banking organization whose need was to ensure that its data is Basel compliant. Typical challenges in this area of testing are lack of domain and technical expertise, stringent delivery timelines with no issues expected, etc. Hence a testing approach is required that encompasses testing techniques driven by business rules and data, and provides the means to validate with good quality test data.


Page 4: A system integration test approach mitigating the unavailability of good quality data in DWT

Target Audience

This tutorial can benefit:

Test Managers, to plan the approach for creating test data and performing the system integration test.

Test Leads/Engineers, to apply the approach to performing the system integration test along with creating valid test data.


Page 5: A system integration test approach mitigating the unavailability of good quality data in DWT

Outline of the Tutorial

1. Introduction
   a. Objectives of the session
   b. User expectations
   c. Context setting
2. Banking operations
   a. Overview and relation to context
3. Data warehouse testing
   a. Overview and relation to context
4. Limitations of test data encountered in testing
   a. Limitations
5. Managing test data
   a. Approach to manage test data
6. System integration test approach
   a. System integration test approach
7. Case study (application of the approach)
   a. Case studies
8. Summarization
9. Closure
   a. Q&A

Page 6: A system integration test approach mitigating the unavailability of good quality data in DWT

Objectives, Expectations…

Objective of this session – to prepare you for the situations that we come across daily in our testing life with respect to test data and tend to ignore.

Your expectations for this session – it won't solve all of your test data related woes, but we promise to handle a handful of those that trouble you the most.

Page 7: A system integration test approach mitigating the unavailability of good quality data in DWT

Banking Operations

Retail banks provide varied payment services to their customers:
• Personal accounts (checking, saving)
• Cards (credit, debit)
• Mortgage loans
• Home equity loans
• Personal loans

They have to manage the related data for day-to-day functioning.

They charge interest on these services and generate the revenue which keeps them functioning. They carry the risk of 'failure to generate revenue' in the event that a customer defaults. Hence, it is critical for every bank to analyze and manage its risks to stay profitable. The data is processed through such 'risk application' systems; typically, the data would flow as depicted on the slide.

Page 8: A system integration test approach mitigating the unavailability of good quality data in DWT

Banking Operations

The application systems are supported by data warehouses. Testing these systems requires a sound understanding of the data processing they perform.

Page 9: A system integration test approach mitigating the unavailability of good quality data in DWT

Data Warehouse testing

Typically, DWT project testing is done in three phases:

o EXTRACT
o TRANSFORM
o LOAD

Page 10: A system integration test approach mitigating the unavailability of good quality data in DWT

Data Warehouse testing

The testing to be done in the EXTRACT and LOAD phases is predominantly process oriented and has little to no data dependency. A subset of production data is sufficient to test these two phases.

The challenge comes when we are to test the TRANSFORM phase. This is the process of altering data based on a set of business rules.

o Simple conversion: the value of a field is converted based on another. E.g., a currency field is converted from its native currency to the local currency value.

o Complex conversion: the value of a field is derived from the relation between multiple fields. This might also necessitate joining different tables.

o Filter out: some data is filtered out based on defined business criteria. E.g., duplicate records are dropped.
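To make these three cases concrete, below is a minimal Python sketch of such a transform step. The field names (account_id, txn_date, native_amount, currency, collateral, exposure) and the derived coverage flag are illustrative assumptions, not the actual business rules.

def transform(records, fx_rates):
    # Apply the three kinds of TRANSFORM rules described above (illustrative only).
    seen_keys = set()
    output = []
    for rec in records:
        # Filter out: drop duplicate records based on a business key.
        key = (rec["account_id"], rec["txn_date"])
        if key in seen_keys:
            continue
        seen_keys.add(key)

        out = dict(rec)
        # Simple conversion: one field converted based on another
        # (native currency amount to local currency amount).
        out["local_amount"] = rec["native_amount"] * fx_rates[rec["currency"]]
        # Complex conversion: a field derived from the relation between
        # multiple fields (a hypothetical coverage flag here).
        out["fully_covered"] = rec["collateral"] >= rec["exposure"]
        output.append(out)
    return output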

  

Page 11: A system integration test approach mitigating the unavailability of good quality data in DWT

Limitations of test data encountered in testing

To visualize an example, consider a transform process that has many logical branches based on which the Source data is transformed.

e.g.
If (0 <= amount_field < 100) then attribute_1 = 1
else if (100 <= amount_field < 200) then attribute_1 = 2
else if (200 <= amount_field < 300) then attribute_1 = 3
else if (300 <= amount_field < 400) then attribute_1 = 4
else if (400 <= amount_field < 500) then attribute_1 = 5
...
else if (1000 <= amount_field < 1100) then attribute_1 = A

We require test data that could test all the conditions in order to validate the logical branches.
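One way to guarantee that every branch above is exercised is to generate boundary-value test data for each range. A minimal Python sketch, assuming the eleven 100-wide ranges shown in the example:

# Each range [low, high) is hit at its lower bound, an interior value,
# and just below its upper bound, so every logical branch receives data.
ranges = [(low, low + 100) for low in range(0, 1100, 100)]  # 0-100, 100-200, ..., 1000-1100

amount_field_values = []
for low, high in ranges:
    amount_field_values.extend([low, (low + high) // 2, high - 1])

print(amount_field_values)  # seed these values into the source test data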


Page 13: A system integration test approach mitigating the unavailability of good quality data in DWT

Limitations of test data encountered in testing

But most often, we experience:

o Test data does not contain values to test all scenarios
o Test data is not available for certain fields that have restricted permissions
o Test data sourced from upstream may not be available in time for the testing cycle, leading to a reduced testing window for the release
o Test data is not production-like, possibly leading to missing specific production issues
o Even if we get sanitized production test data, it may not provide coverage for all business rules covering the application domain

To mitigate this unavailability of test data, we need to manage the test data so that it resembles production data and, at the same time, covers the business scenarios and is based on our requirements.

Page 14: A system integration test approach mitigating the unavailability of good quality data in DWT

Managing test data

Managing test data is a process involving several activities.

o Creating instances of test environment layers
• Layer 1: Procuring data from production
• Layer 2: Manipulating data as per test requirements
• Layer 3: Storing data as a 'Regression Set' for future regression use
• Layer 4: Test instance where the data is made available for test

o Procure data from production
• Analyze the test scope and identify the data that is to be fetched from production
• Batch job programs could be created (with assistance from the development team) to fetch the data from production into the Test Layer 1 instance
• Handling production data is difficult if it is of huge volume; hence, based on the test scope and test requirements, a proper subset should be chosen (a sketch follows)
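A minimal sketch of that procurement step, using SQLite as a stand-in for the production and Layer 1 instances; the transactions table, its columns and the product-code filter are hypothetical.

import sqlite3

def load_layer1_subset(prod_db, layer1_db, product_codes, sample_limit=10000):
    # Pull only the rows relevant to the test scope, capped to a manageable volume.
    placeholders = ",".join("?" * len(product_codes))
    with sqlite3.connect(prod_db) as prod, sqlite3.connect(layer1_db) as layer1:
        rows = prod.execute(
            f"SELECT account_id, product_code, amount FROM transactions "
            f"WHERE product_code IN ({placeholders}) LIMIT ?",
            (*product_codes, sample_limit),
        ).fetchall()
        layer1.executemany(
            "INSERT INTO transactions (account_id, product_code, amount) VALUES (?, ?, ?)",
            rows,
        )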

Page 15: A system integration test approach mitigating the unavailability of good quality data in DWT

Managing test data

o Masking the data for security
• Customer-sensitive information such as customer name, customer account number, customer address, customer phone number, customer identification number (country specific), etc., should be masked in order to protect such confidential data.
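A hedged sketch of such masking, applied record by record; the field list is illustrative, and a real implementation would follow the bank's own data-privacy standards.

import hashlib

SENSITIVE_FIELDS = ["customer_name", "account_number", "address",
                    "phone_number", "national_id"]

def mask_record(record):
    masked = dict(record)
    for field in SENSITIVE_FIELDS:
        if masked.get(field) is not None:
            # Deterministic masking hides the real value while keeping
            # referential integrity across tables that share the field.
            digest = hashlib.sha256(str(masked[field]).encode()).hexdigest()
            masked[field] = digest[:12]
    return masked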

o Analyze data for test coverage
• Analyze the data fetched from production to verify it is sufficient to cover all the business scenarios of the test requirements
• Usually the data does not provide coverage for all business scenarios, and performing tests with such data potentially lets defects move into production
• In order to manipulate the data as per the business scenarios under test, copy the Test Layer 1 instance data to the Test Layer 2 instance
• Data could be modified directly in Layer 2, or, if the volume of data is small, Excel files could be used for modification

Page 16: A system integration test approach mitigating the unavailability of good quality data in DWT

Managing test data

o Manipulating data as per test requirements
• Study the business scenarios
• Identify the data that most nearly matches the business scenarios to be tested
• Update or create test data as needed
• The test data covering the business scenarios as per the test requirements is now available in Layer 2

o Test data regression set
• Identify regression test scenarios/test cases
• Store the data created for these in Layer 3 for future use
• Create a mapping between test scenarios/business scenarios and test data (a sketch of such a mapping follows)
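A minimal sketch of such a mapping kept alongside the Layer 3 regression set; the scenario IDs and file names are purely illustrative.

# Index mapping test/business scenarios to the stored regression data sets.
REGRESSION_SET_INDEX = {
    "SC_001_currency_conversion": "layer3/currency_conversion.csv",
    "SC_014_duplicate_drop": "layer3/duplicate_records.csv",
    "SC_027_null_amount_filter": "layer3/null_amounts.csv",
}

def data_for_scenario(scenario_id):
    # Look up which stored regression data set covers a given scenario.
    return REGRESSION_SET_INDEX[scenario_id]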

Page 17: A system integration test approach mitigating the unavailability of good quality data in DWT

Managing test data

o Move test data to the testing environment
• Once the Layer 2 data is ready, it could be copied to the desired test environment instance (Layer 4)

Other salient activities
o A timely clean-up mechanism should be planned
• The Layer 1 and Layer 2 instances serve as temporary instances for creating the required test data
• A periodic clean-up, as applicable to the testing cycles, should be planned (a minimal sketch follows)
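A minimal sketch of that periodic clean-up, again with SQLite standing in for the temporary Layer 1/Layer 2 instances; real environments would use the warehouse's own housekeeping jobs.

import sqlite3

def cleanup_layer(db_path, tables):
    # Purge the working data created for the finished testing cycle.
    with sqlite3.connect(db_path) as conn:
        for table in tables:
            conn.execute(f"DELETE FROM {table}")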

Page 18: A system integration test approach mitigating the unavailability of good quality data in DWT

Managing test data

o New project initiative
• If testing is to be performed for a completely new project, and hence no production data is available, one should understand the business scenarios and create data that meets the coverage

o Multiple project requirements
• A mechanism for provisioning simultaneous projects requiring different sets of data should be planned

Page 19: A system integration test approach mitigating the unavailability of good quality data in DWT

System Integration test approach

An integration test could be made more effective with the following approach

Page 20: A system integration test approach mitigating the unavailability of good quality data in DWT

System Integration test approach

Reference
o Get the count of records sourced from the input source
o Get the sum of critical amount fields
o These form the reference to check the loading process into the staging area
o The data at this stage is selected from production

Test 1
o Compare the count of records and the sum of critical amount fields against the reference values
o Take into account records that are designed to be dropped (e.g., duplicate records, null value records, etc.)
o This validates the loading process into the staging area
o Manipulate the data to cover all business scenarios
o This data forms the input to perform the functional test on the system
o Get the count of records and the sum of critical amount fields (a reconciliation sketch follows)
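A hedged sketch of that count-and-sum reconciliation between the source reference and the staging area; the table names, the amount column and the single-database setup are assumptions.

import sqlite3

def snapshot(conn, table):
    # Record count and sum of a critical amount field for one area.
    return conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {table}"
    ).fetchone()

def test_staging_load(conn, expected_dropped_count=0, expected_dropped_amount=0.0):
    src_count, src_sum = snapshot(conn, "source_feed")    # the Reference values
    stg_count, stg_sum = snapshot(conn, "staging_area")
    # Records designed to be dropped (duplicates, null values) are accounted for.
    assert stg_count == src_count - expected_dropped_count
    assert abs(stg_sum - (src_sum - expected_dropped_amount)) < 0.01
    return stg_count, stg_sum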

Page 21: A system integration test approach mitigating the unavailability of good quality data in DWT

System Integration test approach

Test 2
o Compare the count of records and the sum of critical amount fields against the values from the staging area
o Take into account records that are designed to be dropped or modified as per the business design

Page 22: A system integration test approach mitigating the unavailability of good quality data in DWT

System Integration test approach

o Perform thorough business functionality testing; some of the tests here would include:
• Straight move
• Valid values / lookup values validation
• Data derivation
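A minimal sketch of those three field-level checks on a (source row, target row) pair; the field names, the lookup set and the derivation rule are hypothetical.

VALID_ASSET_CLASSES = {"COMMERCIAL", "RETAIL"}

def check_row(source_row, target_row, fx_rate):
    findings = []
    # Straight move: the field must be carried over unchanged.
    if target_row["account_id"] != source_row["account_id"]:
        findings.append("straight-move mismatch on account_id")
    # Valid values / lookup values validation: the value must come from the lookup set.
    if target_row["asset_class"] not in VALID_ASSET_CLASSES:
        findings.append("asset_class outside lookup values")
    # Data derivation: the derived field must follow the documented rule.
    expected_local = round(source_row["native_amount"] * fx_rate, 2)
    if round(target_row["local_amount"], 2) != expected_local:
        findings.append("local_amount derivation incorrect")
    return findings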

Page 23: A system integration test approach mitigating the unavailability of good quality data in DWT

System Integration test approach

Test 3
o Compare the count of records and the sum of critical amount fields against the values from the system's intermediate area. In most systems, these should match
o Perform tests for the critical business functionalities
o This is the final system area, which is available to business users and downstream applications

Benefits of the approach
o Identification of defects earlier in the system process
o Identification of the precise defect origination stage within the system
o Uncovering of defects related not just to data, but also to processes within the system
o Early defect fixing reduces the cost to fix the defect, improves system quality and accelerates the release cycle
o Gains the confidence of the end user that the system is delivered with the expected quality and functionality

Page 24: A system integration test approach mitigating the unavailability of good quality data in DWT

Case Study (application of the approach)

Consider a credit risk reporting system of a financial institution that gets data from multiple sources; this data is transformed and used further for regulatory reporting. There are numerous processes used to transform this incoming data. We explain here the difficulty we faced when we tried to test a process whose purpose was to set a flag called the Asset flag. The Asset flag indicated what kind of asset we are dealing with: based on a combination of parameters, assets (in this case loans) were classified into the "Commercial" or "Retail" category. The Asset flag was very important, as further Basel calculations were done based on this flag's value. These calculations are critical for regulatory reporting; these kinds of things make or break a bank in today's world. We have represented a subset of scenarios and test data requirements using the grid below.

Page 25: A system integration test approach mitigating the unavailability of good quality data in DWT

Case Study (application of the approach)

• Column 1 gives the scenario number
• Columns 2 to 9 are the input parameters
• Column 10 is the output (the Asset flag value)
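A hedged sketch of driving tests from such a grid: each row supplies the input parameters (columns 2 to 9) and the expected Asset flag (column 10). The parameter names and values below are illustrative only, not rows from the actual grid, and classify_asset stands for the transform process under test.

GRID = [
    # (scenario number, input parameters, expected Asset flag)
    (1, {"product_type": "LOAN", "exposure": 800000, "counterparty": "CORPORATE"}, "COMMERCIAL"),
    (2, {"product_type": "LOAN", "exposure": 50000, "counterparty": "INDIVIDUAL"}, "RETAIL"),
]

def run_grid(classify_asset):
    # Execute every grid row and report the scenarios where the flag is wrong.
    failures = []
    for scenario_no, params, expected in GRID:
        actual = classify_asset(**params)
        if actual != expected:
            failures.append((scenario_no, expected, actual))
    return failures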

Page 26: A system integration test approach mitigating the unavailability of good quality data in DWT

Case Study (application of the approach)

Here we show only a sample of the actual grid; in practice this grid had a few thousand rows. Due to the sheer volume of data (millions of records), each row cannot be validated individually.

One approach to testing was to execute these processes on month-end (production) data and validate the output of the process using exception queries. The month-end data may or may not satisfy all the scenarios given in the grid above. So even when test scenarios were not satisfied by the data, the test passed as long as no exception was returned by the exception query. Hence many of the logical branches were not tested, yet passed. When you consider a few hundred sources of data, the problem multiplies by that factor, and you end up with a huge amount of code that has not been tested but is still QA certified.

To overcome this, we had to resort to test data management. We created a golden copy of data: a subset of production data that satisfied the maximum number of our test scenarios. Based on the testing requirement, this data was then cloned and sanitized further. So whenever a new source of data was introduced, the data was sanitized and manipulated to satisfy all the test scenarios and made available for testing.
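A minimal sketch of such an exception query, with a hypothetical classification rule and hypothetical table and column names. It illustrates the weakness described above: the query only flags rows that contradict the rule, so a branch that the month-end data never exercises returns no rows and appears to pass.

EXCEPTION_QUERY = """
SELECT loan_id, counterparty_type, asset_flag
FROM transformed_loans
WHERE (counterparty_type = 'INDIVIDUAL' AND asset_flag <> 'RETAIL')
   OR (counterparty_type = 'CORPORATE' AND asset_flag <> 'COMMERCIAL')
"""

def find_exceptions(conn):
    # Rows returned here are defects; an empty result only proves the rule holds
    # for the data that actually exercised it.
    return conn.execute(EXCEPTION_QUERY).fetchall()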

Page 27: A system integration test approach mitigating the unavailability of good quality data in DWT

Case Study (application of the approach)

Page 28: A system integration test approach mitigating the unavailability of good quality data in DWT

Summarization

To summarize, in today's time, where data warehouse testing is an inseparable part of the banking industry, there is an urgent need to incorporate test data management in the QA process so that no part of the code remains untested.

o Incorporating test data management will result in:
• Data for all test scenarios
• Increased data quality
• Better testing
• Better test coverage

Page 29: A system integration test approach mitigating the unavailability of good quality data in DWT

Closure


Did we meet the objectives?

Page 30: A system integration test approach mitigating the unavailability of good quality data in DWT

References

Infosys project experience
Infosys resources (www.infosys.com)