dw architecture

Data Warehouse Lutfi Freij Konstantin Rimarchuk Vasken Chamlaian John Sahakian Suzan Ton

Upload: sudharani84

Post on 03-Apr-2018


Page 1: DW Architecture

7/28/2019 DW Architecture

http://slidepdf.com/reader/full/dw-architecture 1/29

Data Warehouse

Lutfi Freij

Konstantin Rimarchuk 

Vasken Chamlaian

John Sahakian

Suzan Ton

Page 2: DW Architecture

Inmon

Father of the data warehouse

Co-creator of the Corporate Information Factory.

He has 35 years of experience in database technology management and data warehouse design.

Page 3: DW Architecture

Inmon-Cont’d 

Bill has written about a variety of topics on the building, usage, & maintenance of the data warehouse & the Corporate Information Factory.

He has written more than 650 articles (Datamation, ComputerWorld, and Byte Magazine).

Inmon has published 45 books.

Many of his books have been translated into Chinese, Dutch, French, German, Japanese, Korean, Portuguese, Russian, and Spanish.

Page 4: DW Architecture

Introduction

What is a Data Warehouse?

A data warehouse is a collection of integrated databases designed to support a DSS.

According to the definition by Inmon, the father of data warehousing (Inmon, 1992a, p. 5):

It is a collection of integrated, subject-oriented databases designed to support the DSS function, where each unit of data is non-volatile and relevant to some moment in time.

Page 5: DW Architecture

Introduction-Cont’d. 

Where is it used?

It is used for evaluating future strategy.

It needs a successful technician:

Flexible.

Team player.

Good balance of business and technical understanding.

Page 6: DW Architecture

Introduction-Cont’d. 

The ultimate use of the data warehouse is mass customization.

For example, it increased Capital One’s customers from 1 million to approximately 9 million in 8 years.

Just like a muscle: the DW increases in strength with active use.

With each new test and product, valuable information is added to the DW, allowing the analyst to learn from the successes and failures of the past.

The key to survival is the ability to analyze, plan, and react to changing business conditions in a much more rapid fashion.

Page 7: DW Architecture

Data Warehouse

In order for data to be effective, DW must be:

Consistent.

Well integrated.

Well defined.

Time stamped.

DW environment:

The data store, data mart & the metadata.

Page 8: DW Architecture

The Data Store

An operational data store (ODS) stores data for a specific application. It feeds the data warehouse a stream of desired raw data.

It is the most common component of the DW environment.

The data store is generally subject-oriented, volatile, and current, and is commonly focused on customers, products, orders, policies, claims, etc.

Page 9: DW Architecture

Data Store & Data Warehouse

Data store & data warehouse: Table 10-1, page 296

Page 10: DW Architecture

The data store-Cont’d. 

Its day-to-day function is to store the data for a single, specific set of operational applications.

Its function is to feed the data warehouse data for the purpose of analysis.

Page 11: DW Architecture

The Data Mart

It is a lower-cost, scaled-down version of the DW.

Data marts offer a targeted and less costly method of gaining the advantages associated with data warehousing and can be scaled up to a full DW environment over time.

Page 12: DW Architecture

The Metadata

Last component of the DW environment.

It is information that is kept about the warehouse rather than information kept within the warehouse.

Legacy systems generally don’t keep a record of the characteristics of the data (such as what pieces of data exist and where they are located).

The metadata is simply data about data.
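As a rough illustration of "data about data", the sketch below keeps a small catalog that records which data elements exist and where they are located. The names `MetadataEntry` and `MetadataCatalog`, and the example system and table names, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    """Describes one data element; it holds no business data itself."""
    element: str        # name of the data element
    source_system: str  # where the element originates (hypothetical name)
    location: str       # table and column inside the warehouse
    description: str


class MetadataCatalog:
    """A tiny in-memory directory of metadata entries."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.element] = entry

    def locate(self, element):
        """Return where an element is stored, or None if it is unknown."""
        entry = self._entries.get(element)
        return entry.location if entry else None

    def elements(self):
        """List every element the catalog knows about, sorted by name."""
        return sorted(self._entries)


catalog = MetadataCatalog()
catalog.register(MetadataEntry(
    "customer_id", "orders_ods", "dw.customer.customer_id", "Unique customer key"))
catalog.register(MetadataEntry(
    "order_total", "orders_ods", "dw.orders.total_usd", "Order value in US dollars"))
```

The catalog answers questions about the warehouse (what exists, where it lives) without storing any of the warehouse's own rows.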

Page 13: DW Architecture

Conclusion

A data warehouse is a collection of integrated, subject-oriented databases designed to support a DSS. Each unit of data is non-volatile and relevant to some moment in time.

An operational data store (ODS) stores data for a specific application. It feeds the data warehouse a stream of desired raw data.

A data mart is a lower-cost, scaled-down version of a data warehouse, usually designed to support a small group of users (rather than the entire firm).

The metadata is information that is kept about the warehouse.

Page 14: DW Architecture

Data Warehouse

Subject oriented

Data integrated

Time variant

Nonvolatile

Page 15: DW Architecture

Characteristics of Data Warehouse

Subject oriented. Data are organized based on how the users refer to them.

Integrated. All inconsistencies regarding naming conventions and value representations are removed.

Nonvolatile. Data are stored in read-only format and do not change over time.

Time variant. Data are not current but normally time series.

Page 16: DW Architecture

A Data Warehouse is Subject Oriented

Page 17: DW Architecture

Subject Orientation

Application environment: Design activities must be equally focused on both process and database design.

Data warehouse environment: The DW world is primarily void of process design and tends to focus exclusively on issues of data modeling and database design.

Page 18: DW Architecture

Data Integrated

Integration – consistent naming conventions and measurement attributes, accuracy, and common aggregation.

Establishment of a common unit of measure for all synonymous data elements from dissimilar databases.

The data must be stored in the DW in an integrated, globally acceptable manner.
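The integration step can be sketched as a translation from each source's own names and encodings into one common form. In the example below, the two source systems, their field names, and the choice of centimeters as the common unit are all assumptions made for illustration.

```python
# Hypothetical per-source rules: (source field name, factor to convert into centimeters).
LENGTH_RULES = {
    "system_a": ("len_in", 2.54),     # inches -> centimeters
    "system_b": ("length_cm", 1.0),   # already in centimeters
}

# Hypothetical per-source encodings of the same attribute, mapped to one convention.
GENDER_CODES = {
    "system_a": {"m": "M", "f": "F"},
    "system_b": {1: "M", 0: "F"},
}


def integrate(source, record):
    """Translate one source record into the warehouse's common names and units."""
    length_field, factor = LENGTH_RULES[source]
    return {
        "length_cm": round(record[length_field] * factor, 2),
        "gender": GENDER_CODES[source][record["gender"]],
    }


row_a = integrate("system_a", {"len_in": 10, "gender": "m"})
row_b = integrate("system_b", {"length_cm": 25.4, "gender": 1})
```

Both records describe the same measurement and end up identical once the naming and unit differences are removed.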

Page 19: DW Architecture

Data Integrated

Page 20: DW Architecture

Time Variant

In an operational application system, the expectation is that all data within the database are accurate as of the moment of access. In the DW, data are simply assumed to be accurate as of some moment in time and not necessarily right now.

One of the places where DW data display time variance is in the structure of the record key. Every primary key contained within the DW must contain, either implicitly or explicitly, an element of time (day, week, month, etc.).
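A minimal sketch of a key that carries an explicit element of time is shown below. The `BalanceKey` class, the account identifier, and the daily granularity are hypothetical; a real warehouse might use week or month instead.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BalanceKey:
    """Composite key: the business identifier plus an explicit time element."""
    account_id: str
    as_of: date  # the moment in time the value is accurate for


# Hypothetical snapshot table keyed by (account, day); values are balances.
balances = {
    BalanceKey("ACC-1", date(2019, 7, 1)): 100.0,
    BalanceKey("ACC-1", date(2019, 7, 2)): 140.0,
}


def balance_as_of(account_id, day):
    """Return the most recent snapshot at or before `day`, or None."""
    candidates = [
        key for key in balances
        if key.account_id == account_id and key.as_of <= day
    ]
    if not candidates:
        return None
    return balances[max(candidates, key=lambda key: key.as_of)]
```

Because the date is part of the key, a new day's balance is stored as a new record next to the earlier one rather than in place of it.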

Page 21: DW Architecture

Time Variant

Every piece of data contained within the

warehouse must be associated with a

particular point in time if any useful

analysis is to be conducted with it.

 Another aspect of time variance in DW

data is that, once recorded, data within the

warehouse cannot be updated or changed.

Page 22: DW Architecture

Nonvolatility

Typical activities such as deletes, inserts, and changes that are performed in an operational application environment are completely nonexistent in a DW environment.

Only two data operations are ever performed in the DW: data loading and data access.
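The two permitted operations can be sketched as a store whose interface offers nothing but loading and access. The class name `NonvolatileStore` and the example rows are hypothetical, and this is an in-memory illustration rather than how any particular warehouse product works.

```python
class NonvolatileStore:
    """Append-only store exposing just two operations: load and access."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Append a batch of rows; existing rows are never modified."""
        self._rows.extend(dict(row) for row in rows)

    def access(self, predicate=lambda row: True):
        """Return copies of matching rows so callers cannot alter stored data."""
        return [dict(row) for row in self._rows if predicate(row)]


store = NonvolatileStore()
store.load([{"sku": "A", "qty": 3}, {"sku": "B", "qty": 5}])

result = store.access(lambda row: row["qty"] > 4)
result[0]["qty"] = 0  # mutating the returned copy leaves the stored row untouched
```

There is deliberately no update or delete method, so the only way data changes is by loading more of it.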

Page 23: DW Architecture

The Data Warehouse Architecture

The architecture consists of various interconnected elements:

Operational and external database layer – the source data for the DW

Information access layer – the tools the end user accesses to extract and analyze the data

Data access layer – the interface between the operational and information access layers

Metadata layer – the data directory or repository of metadata information

Page 24: DW Architecture

Components of the Data

Warehouse Architecture

Page 25: DW Architecture

The Data Warehouse Architecture

Additional layers are:

Process management layer – the scheduler or job controller

Application messaging layer – the “middleware” that transports information around the firm

Physical data warehouse layer – where the actual data used in the DSS are located

Data staging layer – all of the processes necessary to select, edit, summarize and load warehouse data from the operational and external databases
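The four staging steps named above (select, edit, summarize, load) can be sketched as a small pipeline. The field names, the sample rows, and the use of a plain list as the warehouse table are assumptions for illustration only.

```python
from collections import defaultdict


def select(rows):
    """Keep only rows that carry the fields the warehouse needs."""
    return [r for r in rows if r.get("region") and r.get("amount") is not None]


def edit(rows):
    """Clean values: trim and upper-case region codes, coerce amounts to float."""
    return [
        {"region": r["region"].strip().upper(), "amount": float(r["amount"])}
        for r in rows
    ]


def summarize(rows):
    """Aggregate amounts per region."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def load(warehouse, summary):
    """Append summary rows to the (hypothetical) warehouse table."""
    for region, total in sorted(summary.items()):
        warehouse.append({"region": region, "total_amount": total})
    return warehouse


# Hypothetical rows from operational and external sources.
source_rows = [
    {"region": " east ", "amount": "10.5"},
    {"region": "EAST", "amount": 4.5},
    {"region": "west", "amount": 7},
    {"region": "", "amount": 99},  # dropped by select(): no region
]

warehouse_table = load([], summarize(edit(select(source_rows))))
```

Keeping each step as its own function mirrors the layer description: the operational sources stay untouched, and only the summarized result reaches the warehouse.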

Page 26: DW Architecture

Data Warehousing Typology

The virtual data warehouse – the end users have direct access to the data stores, using tools enabled at the data access layer

The central data warehouse – a single physical database contains all of the data for a specific functional area

The distributed data warehouse – the components are distributed across several physical databases

Page 27: DW Architecture

Data Warehouse Architecture

[Figure: data warehouse architecture diagram. Labels: Data Warehouse Engine, Optimized Loader, Extraction, Cleansing, Analyze, Query, Metadata Repository, Relational Databases, Legacy Data, Purchased Data, ERP Systems.]

Page 28: DW Architecture

Architecture of data warehousing

[Figure: architecture diagram. Labels: External data, Data Acquisition, Data Manager, Warehouse data, Data Dictionary, Information Directory, Middleware, Design, Management, Data Access.]

Page 29: DW Architecture

 Architecture of 
