dw architecture
TRANSCRIPT
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 1/29
Data Warehouse
Lutfi Freij
Konstantin Rimarchuk
Vasken ChamlaianJohn Sahakian
Suzan Ton
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 2/29
Inmon
Father of the data warehouse
Co-creator of the Corporate
Information Factory.He has 35 years of
experience in database
technology managementand data warehouse design.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 3/29
Inmon-Cont’d
Bill has written about a variety
of topics on the building, usage,
& maintenance of the data warehouse
& the Corporate Information Factory.
He has written more than 650
articles (Datamation, ComputerWorld,
and Byte Magazine).
Inmon has published 45 books.
Many of books has been translated to Chinese, Dutch, French, German,Japanese, Korean, Portuguese, Russian, and Spanish.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 4/29
Introduction
What is Data Warehouse?
A data warehouse is a collection of integrateddatabases designed to support a DSS.
According to Inmon’s (father of data warehousing)definition(Inmon,1992a,p.5):
It is a collection of integrated, subject-oriented
databases designed to support the DSS function,where each unit of data is non-volatile and relevantto some moment in time.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 5/29
Introduction-Cont’d.
Where is it used?
It is used for evaluating future strategy.
It needs a successful technician:
Flexible.
Team player. Good balance of business and technical
understanding.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 6/29
Introduction-Cont’d.
The ultimate use of data warehouse is Mass Customization.
For example, it increased Capital One’s customers from 1
million to approximately 9 millions in 8 years.
Just like a muscle: DW increases in strength with active use. With each new test and product, valuable information is
added to the DW, allowing the analyst to learn from the
success and failure of the past.
The key to survival: Is the ability to analyze, plan, and react to changing
business conditions in a much more rapid fashion.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 7/29
Data Warehouse
In order for data to be effective, DW must be:
Consistent.
Well integrated.
Well defined.
Time stamped.
DW environment:
The data store, data mart & the metadata.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 8/29
The Data Store
An operational data store (ODS) stores data for a
specific application. It feeds the data warehouse a
stream of desired raw data.
Is the most common component of DW environment.
Data store is generally subject oriented, volatile,current commonly focused on customers, products,
orders, policies, claims, etc…
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 9/29
Data Store & Data Warehouse
Data store & Data warehouse, table 10-1 page
296
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 10/29
The data store-Cont’d.
Its day-to-day function is to store the data for a
single specific set of operational application.
Its function is to feed the data warehouse data
for the purpose of analysis.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 11/29
The Data Mart
It is lower-cost, scaled down version of the
DW.
Data Mart offer a targeted and less costly
method of gaining the advantages associated
with data warehousing and can be scaled up to
a full DW environment over time.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 12/29
The Meta Data
Last component of DW environments.
It is information that is kept about the warehouse
rather than information kept within the warehouse.
Legacy systems generally don’t keep a record of characteristics of the data (such as what pieces of data
exist and where they are located).
The metadata is simply data about data.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 13/29
Conclusion
A Data Warehouse is a collection of integrated subject-oriented databases designed to support a DSS. Each unit of data is non-volatile and relevant to some moment in time.
An operational data store (ODS) stores data for a specificapplication. It feeds the data warehouse a stream of desiredraw data.
A data mart is a lower-cost, scaled-down version of a data
warehouse, usually designed to support a small group of users(rather than the entire firm).
The metadata is information that is kept about the warehouse.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 14/29
Data Warehouse
Subject oriented
Data integrated
Time variant
Nonvolatile
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 15/29
Characteristics of Data Warehouse
Subject oriented. Data are organized based onhow the users refer to them.
Integrated. All inconsistencies regarding
naming convention and value representationsare removed.
Nonvolatile. Data are stored in read-only formatand do not change over time.
Time variant. Data are not current but normallytime series.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 16/29
A Data Warehouse is Subject Oriented
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 17/29
Subject Orientation
Application Environment Data warehouse
Environment Design activities must be equally
focused on both process and database
design
DW world is primarily void of process
design and tends to focus exclusively on
issues of data modeling and database
design
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 18/29
Data Integrated
Integration –consistency naming
conventions and measurement attributers,
accuracy, and common aggregation.
Establishment of a common unit of
measure for all synonymous data
elements from dissimilar database.
The data must be stored in the DW in an
integrated, globally acceptable manner
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 19/29
Data Integrated
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 20/29
Time Variant
In an operational application system, theexpectation is that all data within the databaseare accurate as of the moment of access. In theDW data are simply assumed to be accurate asof some moment in time and not necessarilyright now.
One of the places where DW data display timevariance is in the structure of the record key.Every primary key contained within the DWmust contain, either implicitly or explicitly anelement of time( day, week, month, etc)
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 21/29
Time Variant
Every piece of data contained within the
warehouse must be associated with a
particular point in time if any useful
analysis is to be conducted with it.
Another aspect of time variance in DW
data is that, once recorded, data within the
warehouse cannot be updated or changed.
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 22/29
Nonvolatility
Typical activities such as deletes, inserts,
and changes that are performed in an
operational application environment are
completely nonexistent in a DWenvironment.
Only two data operations are ever
performed in the DW: data loading anddata access
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 23/29
The Data Warehouse
ArchitectureThe architecture consists of various
interconnected elements:
Operational and external database layer – the
source data for the DW Information access layer – the tools the end
user access to extract and analyze the data
Data access layer – the interface between the
operational and information access layers Metadata layer – the data directory or
repository of metadata information
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 24/29
Components of the Data
Warehouse Architecture
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 25/29
The Data Warehouse
Architecture Additional layers are:
Process management layer – the scheduler or job
controller
Application messaging layer –
the “middleware” that
transports information around the firm
Physical data warehouse layer – where the actual
data used in the DSS are located
Data staging layer –
all of the processes necessary toselect, edit, summarize and load warehouse data
from the operational and external data bases
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 26/29
Data Warehousing Typology
The vir tual data warehouse – the end users
have direct access to the data stores, using tools
enabled at the data access layer
The central data warehouse – a single physicaldatabase contains all of the data for a specific
functional area
The dist r ibuted data warehouse – the
components are distributed across several
physical databases
D t W h A hit t
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 27/29
Data Warehouse Architecture
Data Warehouse
Engine
Optimized Loader
Extraction
Cleansing
Analyze
Query
Metadata Repository
Relational
Databases
Legacy
Data
Purchased
Data
ERPSystems
Architecture of data
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 28/29
Architecture of datawarehousing
28
External
data
Data
Acquisition
Data Manager
Warehous
e data
External data
Data
Dictionary
Information
Directiory
Warehous
e data
Middleware
Design
Management
Data
Access
7/28/2019 DW Architecture
http://slidepdf.com/reader/full/dw-architecture 29/29
Architecture of
29