dw design 1_dim_facts

Post on 27-Jan-2015


DESCRIPTION

Dimensional modeling

TRANSCRIPT

DATA WAREHOUSING Multi Dimensional Data Modeling. Facts and Dimensions

2

While an entity-relationship modeling approach from relational database design could be used, the dimensional modeling approach to logical design is more often used for a data warehouse.

3

End users cannot understand, remember, or navigate an E/R model (not even with a GUI).

One reason is that an enterprise-level ERM would be too complex to understand.

4

Software cannot usefully query an E/R model

5

E/R modeling does not meet the purpose of a DW: intuitive, high-performance querying.

6

7

Figure: star schema with one fact table and five dimension tables.

Fact table:

Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $ . . .

Dimension tables:

Time_Dim: TimeKey, TheDate . . .

Employee_Dim: EmployeeKey, EmployeeID . . .

Product_Dim: ProductKey, ProductID . . .

Customer_Dim: CustomerKey, CustomerID . . .

Shipper_Dim: ShipperKey, ShipperID . . .
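The star schema in the figure can be written down as table definitions. The following is a minimal sketch using Python's built-in sqlite3 module. The table and key names come from the figure; the column types and the name SalesAmount (standing in for the "$" measure) are assumptions made for illustration.

```python
import sqlite3

# Table and key names follow the figure. Column types and the
# SalesAmount column (standing in for "$") are assumptions.
DDL = """
CREATE TABLE Time_Dim     (TimeKey     INTEGER PRIMARY KEY, TheDate    TEXT);
CREATE TABLE Employee_Dim (EmployeeKey INTEGER PRIMARY KEY, EmployeeID TEXT);
CREATE TABLE Product_Dim  (ProductKey  INTEGER PRIMARY KEY, ProductID  TEXT);
CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE Shipper_Dim  (ShipperKey  INTEGER PRIMARY KEY, ShipperID  TEXT);
CREATE TABLE Sales_Fact (
    TimeKey     INTEGER REFERENCES Time_Dim(TimeKey),
    EmployeeKey INTEGER REFERENCES Employee_Dim(EmployeeKey),
    ProductKey  INTEGER REFERENCES Product_Dim(ProductKey),
    CustomerKey INTEGER REFERENCES Customer_Dim(CustomerKey),
    ShipperKey  INTEGER REFERENCES Shipper_Dim(ShipperKey),
    SalesAmount REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Each dimension table has a single-column key, and the fact table carries one foreign key per dimension plus the measure.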

8

Figure: a fact table holding dimension keys (Geographic, Product, Time) and measures (Units, $), linked to the Geographic, Product, and Time dimension tables.

Several distinct dimensions, combined with facts, enable you to answer business questions.

Dimensions are normally textual descriptions of the business.

9

Dimensions

Dimension tables contain relatively small amounts of relatively static data.

10

Dimensions

Dimension tables are usually not normalized.

11

Dimensions

Dimensions are independent of each other, not hierarchically related.

12

Dimensions

Dimensional attributes (non-key attributes) help to describe the dimensional value.

13

Dimensional attributes

Facts are (usually numerical) measures of the business.

14

Facts

The fact table is the largest table in the star schema and holds large volumes of data.

15

Facts

The fact table is (often) normalized.

16

Facts

The fact table has a composite primary key made up of foreign keys.

17

Facts

PK = (FK_1, FK_2, ..., FK_n)

The fact table usually contains one or more numerical facts that occur for the combination of keys that defines each record.
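To illustrate a composite primary key built from foreign keys, here is a minimal sketch with sqlite3. It uses only two of the dimension keys to stay short, and the SalesAmount column and the sample values are assumptions. A second row for the same key combination is rejected, because the key combination defines the record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sales_Fact (
        TimeKey     INTEGER NOT NULL,
        ProductKey  INTEGER NOT NULL,
        SalesAmount REAL,              -- assumed name for the measure
        PRIMARY KEY (TimeKey, ProductKey)
    )
""")
conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 25.0)")
conn.execute("INSERT INTO Sales_Fact VALUES (1, 11, 40.0)")  # new key combination: accepted

# The same (TimeKey, ProductKey) pair again violates the primary key.
try:
    conn.execute("INSERT INTO Sales_Fact VALUES (1, 10, 99.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Sales_Fact").fetchone()[0]
print(duplicate_rejected, row_count)
```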

18

Facts

measures

A fact table contains either detail-level facts or facts that have been aggregated (summary tables)
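The difference between detail-level facts and aggregated facts can be sketched as follows. This is an illustrative example only: the Sales_By_Product summary table, the SalesAmount column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Detail-level fact table: one row per (TimeKey, ProductKey).
conn.execute(
    "CREATE TABLE Sales_Fact (TimeKey INTEGER, ProductKey INTEGER, SalesAmount REAL)")
conn.executemany(
    "INSERT INTO Sales_Fact VALUES (?, ?, ?)",
    [(1, 10, 5.0), (2, 10, 7.0), (3, 10, 4.0), (1, 11, 3.0)],
)

# Summary table: the same facts aggregated over the time dimension.
conn.execute("""
    CREATE TABLE Sales_By_Product AS
    SELECT ProductKey, SUM(SalesAmount) AS TotalAmount
    FROM Sales_Fact
    GROUP BY ProductKey
""")

summary = conn.execute(
    "SELECT ProductKey, TotalAmount FROM Sales_By_Product ORDER BY ProductKey"
).fetchall()
print(summary)
```

The summary table has fewer rows and fewer dimensions than the detail table it was derived from.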

19

Facts

Σ

Facts are:

additive

semi-additive

non-additive

20

Facts

Additive facts can be summed along all of the dimensions.

Non-additive facts cannot be added at all. An example of this is averages.

Semi-additive facts can be aggregated along some of the dimensions and not along others. current_Balance is a semi-additive fact: it makes sense to add balances up across all accounts (what is the total current balance for all accounts in the bank?), but it does not make sense to add them up through time (adding up all current balances for a given account for each day of the month does not give us any useful information).

The most useful measures are numeric and additive.
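A small worked example may make the three kinds of additivity concrete. The sketch below uses plain Python with made-up balances and sales figures; all names and numbers are hypothetical.

```python
# Hypothetical end-of-day balances keyed by (account, day).
balances = {
    ("A", 1): 100.0, ("A", 2): 120.0,
    ("B", 1): 50.0, ("B", 2): 50.0,
}

# Semi-additive: summing across accounts for one day is meaningful.
total_on_day_2 = sum(v for (_, day), v in balances.items() if day == 2)

# Summing one account across days is not meaningful; the closing balance
# or the average over the period is normally reported instead.
account_a = [v for (acct, _), v in sorted(balances.items()) if acct == "A"]
meaningless_sum = sum(account_a)
closing_balance = account_a[-1]
average_balance = sum(account_a) / len(account_a)

# Non-additive: averages cannot be added or re-averaged.
store_sales = {"S1": [10.0, 20.0, 30.0], "S2": [100.0]}
store_averages = {s: sum(v) / len(v) for s, v in store_sales.items()}
average_of_averages = sum(store_averages.values()) / len(store_averages)
all_sales = [x for v in store_sales.values() for x in v]
true_average = sum(all_sales) / len(all_sales)

print(total_on_day_2, meaningless_sum, closing_balance, average_balance)
print(average_of_averages, true_average)
```

The average of the per-store averages differs from the average over all sales, which is why an average should be recomputed from additive components (a sum and a count) rather than stored and aggregated.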

21

Facts

Grain: the atomic level of data of the business process.

It defines the highest level of detail that is supported in a data warehouse.

22

A fact table usually contains facts with the same level of aggregation

A proper dimensional design allows only facts of a uniform grain (the same dimensionality) to coexist in a single fact table.

23

Some perfectly good fact tables represent measurements that have no facts! This kind of measurement is often called an event. The classic example of such a factless fact table is a record representing a student attending a class on a specific day. The dimensions are Day, Student, Professor, Course, and Location, but there are no obvious numeric facts. The tuition paid and grade received are good facts but not at the grain of the daily attendance.
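The attendance example can be sketched as a table that holds only dimension keys. The table name Attendance_Fact, the key column names, and the sample rows below are assumptions. Since there is no measure column, questions are answered by counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Factless fact table: dimension keys only, no numeric fact column.
conn.execute("""
    CREATE TABLE Attendance_Fact (
        DayKey       INTEGER,
        StudentKey   INTEGER,
        ProfessorKey INTEGER,
        CourseKey    INTEGER,
        LocationKey  INTEGER,
        PRIMARY KEY (DayKey, StudentKey, ProfessorKey, CourseKey, LocationKey)
    )
""")
conn.executemany(
    "INSERT INTO Attendance_Fact VALUES (?, ?, ?, ?, ?)",
    [(1, 101, 7, 300, 5), (1, 102, 7, 300, 5), (2, 101, 7, 300, 5)],
)

# Each row records one attendance event; attendance per day is a row count.
per_day = conn.execute("""
    SELECT DayKey, COUNT(*)
    FROM Attendance_Fact
    WHERE CourseKey = 300
    GROUP BY DayKey
    ORDER BY DayKey
""").fetchall()
print(per_day)
```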

24

Degenerate dimensions are dimensions without attributes, such as a transaction number or order number.

Put the value into the fact table even though it is not an additive fact.
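As an illustration, an order number can be stored directly in the fact table with no dimension table behind it (this is commonly called a degenerate dimension). The table name Order_Line_Fact, the OrderNumber and SalesAmount columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# OrderNumber lives in the fact table itself; it has no descriptive
# attributes, so there is no Order_Dim table to join to.
conn.execute("""
    CREATE TABLE Order_Line_Fact (
        TimeKey     INTEGER,
        ProductKey  INTEGER,
        OrderNumber TEXT,
        SalesAmount REAL
    )
""")
conn.executemany(
    "INSERT INTO Order_Line_Fact VALUES (?, ?, ?, ?)",
    [(1, 10, "SO-1001", 5.0), (1, 11, "SO-1001", 3.0), (1, 10, "SO-1002", 7.0)],
)

# The order number is not summed; it is used to group the line items.
order_totals = conn.execute("""
    SELECT OrderNumber, COUNT(*), SUM(SalesAmount)
    FROM Order_Line_Fact
    GROUP BY OrderNumber
    ORDER BY OrderNumber
""").fetchall()
print(order_totals)
```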

25

26

27

Figure: star schema with Sales_Fact at the center, joined to each dimension table on its key.

Dimension tables:

Employee_Dim: EmployeeKey, EmployeeID . . . (joined on EmployeeKey)

Time_Dim: TimeKey, TheDate . . . (joined on TimeKey)

Product_Dim: ProductKey, ProductID . . . (joined on ProductKey)

Customer_Dim: CustomerKey, CustomerID . . . (joined on CustomerKey)

Shipper_Dim: ShipperKey, ShipperID . . . (joined on ShipperKey)

Fact table:

Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey (the dimensional keys, which together form the multipart key); $ . . . (the measures)

The fact table provides statistics for sales broken down by the product, time, employee, shipper, and customer dimensions.
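Breaking sales down by dimensions amounts to joining the fact table to the dimension tables and grouping by dimension attributes. The following sketch uses only the time and product dimensions; the SalesAmount column and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time_Dim    (TimeKey    INTEGER PRIMARY KEY, TheDate   TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductID TEXT);
CREATE TABLE Sales_Fact  (TimeKey INTEGER, ProductKey INTEGER, SalesAmount REAL);

INSERT INTO Time_Dim    VALUES (1, '2015-01-26'), (2, '2015-01-27');
INSERT INTO Product_Dim VALUES (10, 'P-10'), (11, 'P-11');
INSERT INTO Sales_Fact  VALUES (1, 10, 5.0), (1, 11, 3.0), (2, 10, 7.0);
""")

# Star join: the fact table is joined to each dimension on its key,
# then the measure is summed per combination of dimension attributes.
sales_by_date_and_product = conn.execute("""
    SELECT t.TheDate, p.ProductID, SUM(f.SalesAmount)
    FROM Sales_Fact AS f
    JOIN Time_Dim    AS t ON t.TimeKey    = f.TimeKey
    JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
    GROUP BY t.TheDate, p.ProductID
    ORDER BY t.TheDate, p.ProductID
""").fetchall()
print(sales_by_date_and_product)
```

Dropping a column from the GROUP BY clause rolls the result up along that dimension, for example grouping by ProductID alone gives total sales per product.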

28

1. Choose the data mart for the small group of end users we deal with.

Choose a business process to model, e.g., orders, invoices, etc.

29

2. Determine the fact table granularity (the smallest defined level of data in the table).

30

3. Select the fact table dimensions.

Choose the dimensions that will apply to each fact table record.

Add dimensions for "everything you know" about this grain.

31

4. Determine the facts for the table. In most cases, the granularity is at the transaction level, so the fact is the amount.

Choose the measures that will populate each fact table record.

Add numeric measured facts that are true to the grain.
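The four design steps can be recorded as a simple checklist structure. The sketch below is only one possible way to capture them; the FactTableDesign class and the example values are hypothetical, with the dimension names borrowed from the Sales_Fact example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FactTableDesign:
    """Records the outcome of the four design steps for one fact table."""
    business_process: str                                 # step 1
    grain: str                                            # step 2
    dimensions: List[str] = field(default_factory=list)   # step 3
    facts: List[str] = field(default_factory=list)        # step 4

    def missing_steps(self) -> List[str]:
        """Return the names of the steps that have not been filled in yet."""
        steps = {
            "business_process": self.business_process,
            "grain": self.grain,
            "dimensions": self.dimensions,
            "facts": self.facts,
        }
        return [name for name, value in steps.items() if not value]


design = FactTableDesign(
    business_process="orders",
    grain="one row per order line (transaction level)",
    dimensions=["Time", "Employee", "Product", "Customer", "Shipper"],
)
print(design.missing_steps())

design.facts.append("amount")
print(design.missing_steps())
```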

32

Ralph Kimball and Margy Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. Second Edition.
