ETL Test Scenarios and Test Cases


Based on my experience, I have prepared as many test scenarios and test cases as possible to validate the ETL process. I will keep updating this content. Thanks.

Test scenarios and their test cases:

Mapping doc validation: Verify whether the corresponding ETL information is provided in the mapping doc or not. A change log should be maintained in every mapping doc. Define a default test strategy in case the mapping docs miss some optional information. Ex: data types, lengths, etc.

Structure validation:
1. Validate the source and target table structure against the corresponding mapping doc (a metadata comparison sketch follows this list).

2. The source data type and target data type should be the same.

3. The length of data types in both source and target should be equal.

4. Verify that data field types and formats are specified.

5. The source data type length should not be less than the target data type length.

6. Validate the names of the columns in the table against the mapping doc.
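As a rough illustration of how points 1 to 6 can be automated, the sketch below compares column metadata between a hypothetical source table SRC_ACCOUNT and target table TGT_ACCOUNT. It assumes an Oracle-style dictionary view (ALL_TAB_COLUMNS); adjust the view, table names, and checks for your own environment.

-- Columns whose name, data type, or length differ between source and target
-- (SRC_ACCOUNT and TGT_ACCOUNT are hypothetical names for illustration)
SELECT s.column_name,
       s.data_type   AS src_type, t.data_type   AS tgt_type,
       s.data_length AS src_len,  t.data_length AS tgt_len
  FROM (SELECT column_name, data_type, data_length
          FROM all_tab_columns WHERE table_name = 'SRC_ACCOUNT') s
  LEFT JOIN (SELECT column_name, data_type, data_length
               FROM all_tab_columns WHERE table_name = 'TGT_ACCOUNT') t
    ON t.column_name = s.column_name
 WHERE t.column_name  IS NULL              -- column missing in the target
    OR s.data_type    <> t.data_type       -- data type mismatch
    OR s.data_length  <> t.data_length;    -- length mismatch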

Constraint Validation: Ensure that the constraints are defined for the specific table as expected.

Data Consistency Issues:
1. The data type and length for a particular attribute may vary across files or tables even though the semantic definition is the same. Example: an account number may be defined as Number(9) in one field or table and Varchar2(11) in another table.
2. Misuse of integrity constraints: when referential integrity constraints are misused, foreign key values may be left dangling or inadvertently deleted. Example: an account record is missing, but its dependent records are not deleted.

Data Completeness Issues: Ensure that all expected data is loaded into the target table.
1. Compare record counts between source and target and check for any rejected records (a count and key comparison sketch follows this list).

2. Check that data is not truncated in the columns of the target table.

3. Check boundary value analysis (Ex: only data for the year >= 2008 has to be loaded into the target).

4. Compare the unique values of key fields between the source data and the data loaded to the warehouse. This is a valuable technique that points out a variety of possible data errors without doing a full validation on all fields.
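For example, the record count and key comparisons above can be scripted roughly as below, assuming hypothetical SRC_ACCOUNT and TGT_ACCOUNT tables keyed on ACCOUNT_NO (Oracle-style SQL).

-- 1. Record counts should match, allowing for intentionally rejected records
SELECT (SELECT COUNT(*) FROM src_account) AS src_count,
       (SELECT COUNT(*) FROM tgt_account) AS tgt_count
  FROM dual;

-- 4. Key values present in the source but missing from the warehouse
SELECT account_no FROM src_account
MINUS
SELECT account_no FROM tgt_account;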

Data Correctness Issues:
1. Data that is misspelled or inaccurately recorded.

2. Null, non-unique, or out-of-range data may be stored when the integrity constraints are disabled. Example: the primary key constraint is disabled during an import function, and data is entered into the existing data with null unique identifiers.
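A simple sketch for point 2, assuming a hypothetical TGT_ACCOUNT table whose primary key column is ACCOUNT_NO: it lists key values that are null or duplicated, which can happen when the constraint was disabled during an import.

-- Primary key values that are null or duplicated in the target
SELECT account_no, COUNT(*) AS occurrences
  FROM tgt_account
 GROUP BY account_no
HAVING COUNT(*) > 1 OR account_no IS NULL;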

Data Transformation:
1. Create a spreadsheet of input data scenarios and expected results and validate these with the business customer. This is an excellent requirements elicitation step during design and can also be used as part of testing.

2. Create test data that includes all scenarios. Utilize an ETL developer to automate the process of populating data sets from the scenario spreadsheet, which allows flexibility because scenarios are likely to change.

3. Utilize data profiling results to compare the range and distribution of values in each field between the source and target data.

4. Validate accurate processing of ETL generated fields; for example, surrogate keys.

5. Validate that the data types within the warehouse are the same as specified in the data model or design.

6. Create data scenarios between tables that test referential integrity.

7. Validate parent-to-child relationships in the data. Create data scenarios that test the management of orphaned child records.
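For point 7, orphaned child records can be detected with a query along these lines; the parent table TGT_ACCOUNT, child table TGT_CLAIM, and join column ACCOUNT_NO are hypothetical names used only for illustration.

-- Child records whose parent row is missing (orphans)
SELECT c.claim_no, c.account_no
  FROM tgt_claim c
  LEFT JOIN tgt_account p
    ON p.account_no = c.account_no
 WHERE p.account_no IS NULL;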

Data Quality:
1. Number check: if the source values are formatted as xx_30 but the target expects only 30, then the prefix (xx_) should not be loaded; we need to validate this.

2. Date check: date values have to follow the date format, and it should be the same across all records. Standard format: yyyy-mm-dd, etc.

3. Precision check: the precision value should display as expected in the target table. Example: the source value is 19.123456, but in the target it should display as 19.123 or be rounded off to 20.

4. Data check: based on business logic, records that do not meet certain criteria should be filtered out. Example: only records whose date_sid >= 2008 and GLAccount != CM001 should be loaded into the target table.

5. Null check: some columns should display null based on business requirements. Example: the Termination Date column should display null unless the Active Status column is 'T' or 'Deceased'. Note: data cleanness rules are decided during the design phase only.
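A few of the quality checks above can be expressed roughly as below. The table and column names (TGT_ACCOUNT, AMOUNT, DATE_SID, GLACCOUNT, TGT_EMPLOYEE, ACTIVE_STATUS, TERMINATION_DATE) are assumptions for illustration; the filter and null rules mirror the examples in points 4 and 5, and each query should return nothing (or zero) if the data is clean.

-- 3. Precision check: values stored with more than 3 decimal places
SELECT amount FROM tgt_account
 WHERE amount <> ROUND(amount, 3);

-- 4. Data check: rows that violate the load filter should not exist in the target
SELECT COUNT(*) AS bad_filter_rows FROM tgt_account
 WHERE date_sid < 2008 OR glaccount = 'CM001';

-- 5. Null check: termination date must be null unless the status rule applies
SELECT COUNT(*) AS null_rule_violations FROM tgt_employee
 WHERE active_status NOT IN ('T', 'Deceased')
   AND termination_date IS NOT NULL;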

Null Validation: Verify that there are no null values in columns where "Not Null" is specified.
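A minimal sketch, assuming a hypothetical TGT_ACCOUNT table where ACCOUNT_NAME is specified as Not Null; the query should return zero.

-- Rows violating a "Not Null" rule on a mandatory column
SELECT COUNT(*) AS null_violations
  FROM tgt_account
 WHERE account_name IS NULL;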

Duplicate check:
1. Validate that the unique key, primary key, and any other columns that should be unique as per the business requirements do not contain any duplicate rows.

2. Check whether any duplicate values exist in a column that is built by extracting multiple columns from the source and combining them into one column.

3. Sometimes, as per the client requirements, we need to ensure that there are no duplicates across a combination of multiple columns within the target only.

Example: one policy holder can take multiple policies and have multiple claims. In this case we need to verify the combination of CLAIM_NO, CLAIMANT_NO, COVEREGE_NAME, EXPOSURE_TYPE, EXPOSURE_OPEN_DATE, EXPOSURE_CLOSED_DATE, EXPOSURE_STATUS, PAYMENT
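Using the column names from the example above and a hypothetical TGT_CLAIM table, a duplicate check over the key combination might look like this; any row returned indicates a duplicate.

-- Duplicates over the business-key combination from the example
SELECT claim_no, claimant_no, coverege_name, exposure_type,
       exposure_open_date, exposure_closed_date, exposure_status,
       COUNT(*) AS dup_count
  FROM tgt_claim
 GROUP BY claim_no, claimant_no, coverege_name, exposure_type,
          exposure_open_date, exposure_closed_date, exposure_status
HAVING COUNT(*) > 1;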

Date Validation: Date values are used in many areas of ETL development:

1. To know the row creation date. Ex: CRT_TS

2. To identify active records from the ETL development perspective. Ex: VLD_FROM, VLD_TO

3. To identify active records from the business requirements perspective. Ex: CLM_EFCTV_T_TS, CLM_EFCTV_FROM_TS

4. Sometimes the updates and inserts are generated based on the date values.

Possible Test scenarios to validate the Date values:

a. From_Date should not be greater than To_Date.
b. The format of the date values should be proper.
c. Date values should not contain any junk values or null values.
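Possible query sketches for scenarios a and c, assuming the hypothetical date columns mentioned above (VLD_FROM, VLD_TO, CRT_TS) on a TGT_CLAIM table and Oracle-style SQL; for a proper DATE column, scenario b is largely enforced by the data type itself. Both queries should return zero.

-- a. From_Date greater than To_Date
SELECT COUNT(*) AS bad_date_ranges FROM tgt_claim
 WHERE vld_from > vld_to;

-- c. Date values that are null or outside a sensible range
SELECT COUNT(*) AS bad_date_values FROM tgt_claim
 WHERE crt_ts IS NULL
    OR crt_ts < DATE '1900-01-01'
    OR crt_ts > SYSDATE;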

Complete Data Validation (using minus and intersect):
1. To validate the complete data set between the source and target tables, the minus query is the best solution.

2. We need to run both source minus target and target minus source.

3. If the minus query returns any rows, those should be considered mismatched rows.

4. We also need to check the matching rows between source and target using the intersect statement.

5. The count returned by intersect should match the individual counts of the source and target tables.

6. If the minus query returns zero rows and the intersect count is less than the source count or the target table count, then we can consider that duplicate rows exist.
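Putting points 1 to 6 together, the queries might look like the sketch below, assuming hypothetical SRC_ACCOUNT and TGT_ACCOUNT tables with columns ACCOUNT_NO, ACCOUNT_NAME, and BALANCE (Oracle MINUS/INTERSECT syntax).

-- Source minus target: rows present in the source but missing or different in the target
SELECT account_no, account_name, balance FROM src_account
MINUS
SELECT account_no, account_name, balance FROM tgt_account;

-- Target minus source: rows present in the target but not in the source
SELECT account_no, account_name, balance FROM tgt_account
MINUS
SELECT account_no, account_name, balance FROM src_account;

-- Intersect count: should equal both individual table counts
SELECT COUNT(*) AS matched_rows
  FROM (SELECT account_no, account_name, balance FROM src_account
        INTERSECT
        SELECT account_no, account_name, balance FROM tgt_account);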

Some Useful Test Scenarios:
1. Verify that the extraction process did not extract duplicate data from the source (this usually happens in repeatable processes where at point zero we need to extract all data from the source file, but during subsequent intervals we only need to capture the modified and new rows).

2. The QA team will maintain a set of SQL statements that are automatically run at this stage to validate that no duplicate data has been extracted from the source systems.
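One possible shape for such a statement, assuming a hypothetical staging table STG_ACCOUNT keyed on ACCOUNT_NO with a LAST_UPDT_TS audit column; both queries should return no rows if the extract is clean. The second query is only a sketch of a change-data-capture check: it flags rows that were re-extracted even though they are not newer than what is already loaded.

-- Duplicate natural keys within the extracted staging data
SELECT account_no, COUNT(*) AS dup_extracts
  FROM stg_account
 GROUP BY account_no
HAVING COUNT(*) > 1;

-- Rows re-extracted although not newer than the already loaded version
SELECT s.account_no
  FROM stg_account s
  JOIN tgt_account t
    ON t.account_no = s.account_no
 WHERE s.last_updt_ts <= t.last_updt_ts;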

Data cleanness: Unnecessary columns should be deleted before loading into the staging area.

Example 1: If a name column contains extra spaces, we have to trim them; before loading into the staging area, the space will be trimmed with the help of an expression transformation.

Example 2: Suppose the telephone number and STD code are in different columns and the requirement says they should be in one column; then, with the help of an expression transformation, we will concatenate the values into one column.
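In Informatica this trimming and concatenation happens inside the expression transformation; for validation on the database side, the same logic can be approximated in SQL. The SRC_ACCOUNT and SRC_CUSTOMER tables and their columns below are hypothetical names used only for illustration.

-- Example 1: trim extra spaces before loading to the staging area
SELECT TRIM(account_name) AS account_name
  FROM src_account;

-- Example 2: concatenate STD code and telephone number into one column
SELECT std_code || phone_no AS full_phone_no
  FROM src_customer;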