[db tech showcase tokyo 2015] data warehouse basics by wiliiam inmon

25
Forest R im Technology Copyright Inmon Consulting Services, 2008 C DATA WAREHOUSE BASICS a presentation by W H Inmon

Upload: insight-technology-inc

Post on 23-Jan-2018

2.765 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

DATA WAREHOUSE BASICS

a presentation by

W H Inmon

Page 2: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

The data warehouse

- a definition

A subject oriented, non volatile,

integrated, time variant collection

of data for the support of management’s

decisions

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 3: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Granular, detailed data and lots of it

Data that can be shaped and reshaped

A foundation of reconcilability

A basis for new, unknown analysis

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 4: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

key

time

primary data

secondary data

What a typical record of the data warehouse

looks like

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 5: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

key

An identifier

Unique or non unique

Often a compound key

May be natural or blind

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 6: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

time

Time variancy

- continuous

- from date/to date

- periodic discrete

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 7: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Name

Address

Phone

Zip

Email

…….. A continuous

time span record

from

date

to

date

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 8: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Name

Address

Phone

Zip

Email

……..

from

date

to

dateName

Address

Phone

Zip

Email

……..

from

date

to

dateName

Address

Phone

Zip

Email

……..

from

dateto

date

A sequence of time span records

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 9: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

No overlap

Discontinuity is a possibility

999000 From the beginning of time to the end of time

Continuous time span data

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 10: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Periodic discrete structure

Jan 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

Feb 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

Mar 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

Apr 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

The notion of taking a snapshot as of some one

moment in timeForest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 11: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Periodic discrete structure

Jan 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

Feb 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

Mar 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

Apr 1Expenses

Revenues

No of employees

Stock price

Price per share

………………….

The structure says nothing about values as of any other

date

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 12: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Periodic discrete structure

For few variables

For slow changing variables

Continuous time span data

For many variables

For quickly changing variables

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 13: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Primary data

Primary data relates directly to the key

Example – key – ssno

- primary data – name, date of birth

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 14: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Secondary data

Secondary data relates directly to

the primary data

Example – key – ssno

- primary data – name, date of birth

- secondary data – address, zip, phoneForest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 15: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

The granular data in the

data warehouse –

- serves as a basis for

many other forms of DSS

- is instantly available

- forms a foundation of

reconcilability

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 16: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Relational

structures Star joins

requirements

The data warehouse is shaped by the data model;

The star join world is shaped by requirements

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 17: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Often called

Multi dimensional data

Often called

Atomic data

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 18: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

applications

Legacy data

Operational data

Transactional data

Atomic

data

Data

warehouse

The source of data warehouse data

is the operational environment

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 19: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

m/f

1/0

x/y

male/

female

gender

m/f

integration of data in the data warehouse

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 20: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

inches

cms

feet

miles

unit of

measure

cms

units of measurement need

to be integrated

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 21: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

ETL

Extract/transform/load

The integration and conversion of data

is the most difficult part of the data warehouse

processForest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 22: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

Transformation code can

be generated manually or

automatically.

Automatically is always

preferred

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 23: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

The functions performed

by the ETL process are

not trivial -

Convert

Reformat

Add time element

Restructure

New key

Add default values

Change dbms

Change operating system

Summarize

Break into multiple records

Convert key structure

Merge records

Collect metadata

Conform to data model

Select data/reject data

Add indexes

Change encoding

Change hardware environments

Resequence data

Ascii to ebcdic;ebcdic to ascii

Partition data

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 24: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

ETL performed in host

environment

ETL performed in

source environment

ETL processing can be

performed in different places

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C

Page 25: [db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by Wiliiam Inmon

data warehouse –

at the center of the

decision making of

the corporation

Forest Rim Technology

Copyright Inmon Consulting Services, 2008C