Holy Warehouse, Holy Grail The Quest for the Single Version of Truth August 2, 2006 Michael Covert


Holy Warehouse, Holy Grail

The Quest for the Single Version of Truth

August 2, 2006

Michael Covert


Agenda

Warehouse scenarios
– Why are they built?
– What can we expect when building one?

Guiding principles
– Leadership and Communication
– Modeling your business
– Incremental Delivery and Construction
– Technology Decisions

Building a Data Management road map

A Case Study


Warehouse Scenarios

Why do people build data warehouses?
– To allow people to focus on analysis instead of collection and assembly of data. This is in response to growing needs to understand a business holistically.
    Finding it: where is that data?
    Integrating it: merging data from multiple sources. This is often a source of error and inconsistency.
    Reporting on it: spreadsheets and expensive tools. Spreadsheets continue to be where most analysis occurs.
– To produce a single version of truth
    Believing it: people and data errors have been corrected. Differing business meanings have been reconciled so that aggregations and variances make sense.
– To reduce internal friction and improve decision making
    Reports from non-integrated systems often produce inconsistent and contradictory information. This leads to slowed decision making and, in some cases, infighting.


Warehouse Scenarios

What can we expect when building a data warehouse?

– Data warehouses typically integrate data from multiple areas:
    Finance, HR, Sales, Inventory, etc.
– This invariably leads to larger than normal decision making processes and to multi-departmental involvement
    Politics will occur.
    Big picture (enterprise) issues will arise and they will be difficult to solve.
      – Definitions, processes, frequency, and reliability


Warehouse Scenarios

Business data can vary greatly in complexity and in definition
– Variation in attributes associated with a common entity type leads to an explosion in the number of tables
    Loans: auto loan, revolving credit, overdraft protection, mortgage, ad infinitum
    Apparel: a myriad of categories, subcategories, and lower level hierarchies of product-specific attributes
– Integration and aggregation of these entities requires cross-departmental alignment of the most basic definitions
    Risk elements: probability of default, loss given default, collateral codes, service charge allocation
    Sales and marketing: promotion and ring code usage, campaign phase definition, category definition, even SKU and package encoding


Warehouse Scenarios

The great technology debates
– How will I organize my warehouse?
    Should I build a star schema or a 3NF database?
– How many technologies will I use?
    The database itself
      – Warehouse, warehouse farms
      – Operational Data Store
      – Downstream data marts
    The ETL Layer
    A metadata repository
    Master data management
    How will I cleanse my data?
    Usage of data marts and downstream systems
    How will I implement my analytics and reporting layers?


Guiding Principles

Leadership
– The best functioning data warehouse teams are cross-departmental and collaborative. Leadership is essential.
– A shared expense model is preferred, but again requires leadership and cultural adoption. Project based expenses are tempting, but can lead to departmental development of an enterprise asset.

Communication
– Communicate frequently. Advise on new data areas as they become available. Strive for smaller but more frequent additions.
– Track adoption and use communication to increase usage.
– When sufficient adoption has been achieved, if possible TURN OFF THE OLD SYSTEM(S)!!!


Guiding Principles

Model your business and maintain these models.

Use a data modeling tool to maintain your models.


Guiding Principles

Incremental delivery
– Break the warehouse into subject areas that can be developed and evolved incrementally.
– Think multi-dimensionally.
    Devise a multi-dimensional structure for each subject area.
    Identify overlaps where shared dimensions exist.

3NF versus Star schemas
– In most cases, 3NF versus star schema decisions should be based on:
    Skill sets
    In-place technology and processes
    Technology governance
– Many times, 3NF schemas have views that emulate star schemas. Beware of view join overhead.

* Build on successes
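The point about 3NF schemas exposing views that emulate star schemas can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module only for convenience; the table names, the view name dim_product, and the sample rows are hypothetical and do not come from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (3NF) tables: category names are stored once, not per product.
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE product  (product_id INTEGER PRIMARY KEY, product_name TEXT,
                           category_id INTEGER REFERENCES category(category_id));
    CREATE TABLE sale     (sale_id INTEGER PRIMARY KEY,
                           product_id INTEGER REFERENCES product(product_id),
                           amount REAL);

    INSERT INTO category VALUES (1, 'Outerwear'), (2, 'Footwear');
    INSERT INTO product  VALUES (10, 'Rain jacket', 1), (11, 'Parka', 1), (20, 'Boot', 2);
    INSERT INTO sale     VALUES (100, 10, 80.0), (101, 11, 120.0), (102, 20, 60.0);

    -- Hypothetical view presenting a flattened, star-style product dimension.
    CREATE VIEW dim_product AS
        SELECT p.product_id, p.product_name, c.category_name
        FROM product p
        JOIN category c ON c.category_id = p.category_id;
""")

def sales_by_category(connection):
    """Query the fact table through the star-style view."""
    return connection.execute("""
        SELECT d.category_name, SUM(s.amount)
        FROM sale s
        JOIN dim_product d ON d.product_id = s.product_id
        GROUP BY d.category_name
        ORDER BY d.category_name
    """).fetchall()

totals = sales_by_category(conn)
print(totals)  # [('Footwear', 60.0), ('Outerwear', 200.0)]
```

The view is not stored: every query through dim_product re-runs the product-to-category join underneath. With deeper hierarchies and larger tables, that repeated work is the view join overhead the slide warns about.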


Guiding Principles

Technology selection
– Limit technology through IT stewardship and governance.
    Stay within product families where possible.
    Implement an ETL layer. Use it to reuse data interfaces and reduce point-to-point data complexity.
    Use multi-dimensional systems to offload data aggregation complexity.
– Maintain conforming dimensions.
    Strive for enterprise reporting, but realize that it is very difficult to achieve.
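One way to picture a conforming dimension is a single dimension table that two independently delivered subject areas both key into, so that their figures can be lined up in one report. The sketch below assumes hypothetical names (dim_region, fact_sales, fact_inventory) and invented sample data, and uses sqlite3 from the Python standard library purely as a stand-in for whatever database is in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One shared dimension, defined once and reused by every subject area.
    CREATE TABLE dim_region (region_key INTEGER PRIMARY KEY, region_name TEXT);

    -- Two subject areas, built separately, both keyed to dim_region.
    CREATE TABLE fact_sales (
        region_key INTEGER REFERENCES dim_region(region_key), revenue REAL);
    CREATE TABLE fact_inventory (
        region_key INTEGER REFERENCES dim_region(region_key), units_on_hand INTEGER);

    INSERT INTO dim_region     VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales     VALUES (1, 100.0), (1, 50.0), (2, 70.0);
    INSERT INTO fact_inventory VALUES (1, 30), (2, 10), (2, 5);
""")

def drill_across(connection):
    """Aggregate each fact table on its own, then join the results
    on the shared dimension key so the rows line up by region."""
    return connection.execute("""
        SELECT r.region_name, s.revenue, i.units
        FROM dim_region r
        JOIN (SELECT region_key, SUM(revenue) AS revenue
              FROM fact_sales GROUP BY region_key) s
          ON s.region_key = r.region_key
        JOIN (SELECT region_key, SUM(units_on_hand) AS units
              FROM fact_inventory GROUP BY region_key) i
          ON i.region_key = r.region_key
        ORDER BY r.region_name
    """).fetchall()

report = drill_across(conn)
print(report)  # [('East', 150.0, 30), ('West', 70.0, 15)]
```

Each fact table is aggregated before the join so that rows from one subject area do not multiply rows from the other. If the two areas had each kept their own region list with differing codes or names, this cross-area report would first need a reconciliation step.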


Guiding Principles

Layer your database to provide:
– A data “landing zone” (also referred to as a staging area)
– A data cleansing and integration layer
    Assign data ownership to the owning line of business.
    Build in “data lineage” traceability.
    Program defensively from the very beginning.
    Put cleaning rules here. Avoid intermingling them into operational code at all costs. Strive to reverse engineer them out of surviving systems.
– A cleansed, integrated layer that is used to:
    Feed downstream systems.
    Provide the primary reporting interface for end user systems.
    Resist allowing access to any other layer, specifically the landing zone! ***
– Use this layered technique to build in audit controls and “restartability”.
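The layering above can be sketched in a few dozen lines. This is only an illustration under stated assumptions, not the design used in the case study: the table names (landing_loans, clean_loans, rejects, load_audit), the single cleansing rule, and the sample rows are all hypothetical. It shows raw data landing untouched, cleansing rules kept in one function, lineage columns carried into the cleansed layer, and an audit table that makes a batch load safe to re-run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Landing zone: raw values exactly as received (everything is text).
    CREATE TABLE landing_loans (batch_id INTEGER, source_system TEXT,
                                loan_id TEXT, balance TEXT);
    -- Cleansed layer: typed values plus lineage columns (source_system, batch_id).
    CREATE TABLE clean_loans (loan_id TEXT PRIMARY KEY, balance REAL,
                              source_system TEXT, batch_id INTEGER);
    -- Rejected rows are kept for the data owner to review, not silently dropped.
    CREATE TABLE rejects (batch_id INTEGER, loan_id TEXT, reason TEXT);
    -- Audit control: one row per completed batch.
    CREATE TABLE load_audit (batch_id INTEGER PRIMARY KEY,
                             rows_loaded INTEGER, rows_rejected INTEGER);
""")

def clean_balance(raw):
    """Cleansing rule kept in one place: return a float, or None if unusable."""
    try:
        return float(raw.replace(",", "").strip())
    except (AttributeError, ValueError):
        return None

def load_batch(connection, batch_id):
    """Move one landed batch into the cleansed layer. Safe to re-run."""
    done = connection.execute(
        "SELECT 1 FROM load_audit WHERE batch_id = ?", (batch_id,)).fetchone()
    if done:
        return "skipped"  # batch already completed; do nothing
    loaded = rejected = 0
    with connection:  # one transaction: a failure rolls the whole batch back
        rows = connection.execute(
            "SELECT source_system, loan_id, balance FROM landing_loans "
            "WHERE batch_id = ?", (batch_id,)).fetchall()
        for source_system, loan_id, raw_balance in rows:
            balance = clean_balance(raw_balance)
            if loan_id is None or balance is None:
                connection.execute(
                    "INSERT INTO rejects VALUES (?, ?, ?)",
                    (batch_id, loan_id, "missing id or unusable balance"))
                rejected += 1
            else:
                connection.execute(
                    "INSERT OR REPLACE INTO clean_loans VALUES (?, ?, ?, ?)",
                    (loan_id, balance, source_system, batch_id))
                loaded += 1
        connection.execute("INSERT INTO load_audit VALUES (?, ?, ?)",
                           (batch_id, loaded, rejected))
    return "loaded"

# Hypothetical sample batch landed from two made-up source systems.
conn.executemany("INSERT INTO landing_loans VALUES (?, ?, ?, ?)", [
    (1, "system_a", "A-1", "1,200.50"),
    (1, "system_a", "A-2", "n/a"),
    (1, "system_b", "A-3", " 300 "),
])
conn.commit()

first = load_batch(conn, 1)
second = load_batch(conn, 1)  # re-running the same batch is a no-op
clean = conn.execute("SELECT loan_id, balance, source_system, batch_id "
                     "FROM clean_loans ORDER BY loan_id").fetchall()
audit = conn.execute("SELECT * FROM load_audit").fetchall()
print(first, second, clean, audit)
```

Because the audit row is written in the same transaction as the cleansed rows, a load that fails partway leaves no audit record and can simply be started again. Reporting users would be pointed at clean_loans only, never at landing_loans.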


Building a Data Management Road Map

A data management roadmap defines all data management processes and control objectives.
– CoBiT, ITIL, et al.


A Case Study

This case study involves a large financial institution with a significant portion of its business involving collateral-secured loans.

The loan reporting environment was manually intensive and heavily driven off of Microsoft Access databases (approx. 400 Access databases).

End user queries were run against transactional systems. The data that was retrieved was then integrated in redundant and often inconsistent processes.

Limited scalability for future growth (Microsoft Access databases).

Similar queries and duplicated analyses were performed by Business Analysts.

Business Analysts spent approximately 80% of their time gathering, re-keying and developing reports.

Major inconsistencies in these reports were producing political pressures between sales and the financial analysts.


Case Study – Initial Environment

[Diagram. Surviving labels:
Data Sources: DRA, Loan Originators, Dealers, Locations, Employees, DSS, ILS, TSER, RMS, EDS, Other, Sales Goals.
Indirect Dealer Finance Access Database Environment: many Access databases (plus "Additional 390+ Access Databases"), Excel, and multiple Financial Spreadsheets and Reports.
Reporting (PDF, Prompt Rpts, Excel).
Key: * Various “Group By” Report views.
Data Sources (second list): Origination, Dealers, Locations, Employees, Servicing, Risk, Securitization.]


Case Study – New Environment

[Diagram. Surviving labels: Landing Zone; Cleanse and Integrate.]


Case Study - Results

Excellent user acceptance of new system

Consolidated database now reduces departmental time required to access data

Simplified and pre-integrated data has reduced many inconsistencies

Excellent load and response times

Microsoft Reporting Services is being used to produce increasingly complex reports used by end users

Microsoft Analysis Services produces cubes that are easy to access and have nearly instantaneous response time

The new system architecture is much easier to change since it is simplified. New data sources have been added incrementally.


Conclusion

Leadership, Stewardship, and Communication

Business modeling and mapping

Data quality and ownership

Technology Governance and Control
– Adoption of a data management roadmap
– Data architecture
– Technological risk management

Incremental delivery