Business Intelligence: Multidimensional Analysis

Business Intelligence Michael Lamont, ’12 [email protected]


Post on 20-Jun-2015


Category:

Data & Analytics



DESCRIPTION

An introduction to multidimensional business intelligence and OnLine Analytical Processing (OLAP) suitable for both a technical and non-technical audience. Covers dimensions, attributes, measures, Key Performance Indicators (KPIs), aggregates, hierarchies, and data cubes.

TRANSCRIPT

Page 1: Business Intelligence: Multidimensional Analysis

Business Intelligence

Michael Lamont, ’12

[email protected]

Page 2: Business Intelligence: Multidimensional Analysis

The Analysis Gap


Page 3: Business Intelligence: Multidimensional Analysis

Soda Example

Products: Cola, Cherry, Grape, Lemon-Lime

Cities: Munich, Frankfurt, Cologne, Berlin

Page 4: Business Intelligence: Multidimensional Analysis

Soda Example

Time     $ Sales
Q3       $16,000
Q4       $16,000
Total    $32,000

Page 5: Business Intelligence: Multidimensional Analysis

Soda Example

Time         $ Sales
Q3           $16,000
Q4           $16,000
Total        $32,000

Product      $ Sales
Cola         $8,000
Cherry       $8,000
Grape        $8,000
Lemon-Lime   $8,000
Total        $32,000

Geography    $ Sales
Munich       $8,000
Frankfurt    $8,000
Cologne      $8,000
Berlin       $8,000
Total        $32,000

Page 6: Business Intelligence: Multidimensional Analysis

Soda Example

Time  Product    Munich   Frankfurt  Cologne  Berlin   Total
Q3    Cola       $ -      $ -        $2,500   $1,500   $4,000
      Cherry     $ -      $ -        $2,000   $2,000   $4,000
      Grape      $1,000   $3,000     $ -      $ -      $4,000
      Lem-Lime   $2,000   $2,000     $ -      $ -      $4,000
      Total Q3   $3,000   $5,000     $4,500   $3,500   $16,000
Q4    Cola       $4,000   $ -        $ -      $ -      $4,000
      Cherry     $1,000   $3,000     $ -      $ -      $4,000
      Grape      $ -      $ -        $1,500   $2,500   $4,000
      Lem-Lime   $ -      $ -        $2,000   $2,000   $4,000
      Total Q4   $5,000   $3,000     $3,500   $4,500   $16,000
Total            $8,000   $8,000     $8,000   $8,000   $32,000
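
The three single-dimension summaries and the cross-tab above can all be derived from the same underlying sales facts. The following is a minimal Python sketch, assuming each non-empty cell of the table is stored as a (quarter, product, city, dollars) tuple. The names FACTS, DIMENSIONS, and total_by are illustrative and do not come from the slides.

```python
from collections import defaultdict

# Non-empty cells of the soda table: (quarter, product, city, dollar sales).
FACTS = [
    ("Q3", "Cola", "Cologne", 2500), ("Q3", "Cola", "Berlin", 1500),
    ("Q3", "Cherry", "Cologne", 2000), ("Q3", "Cherry", "Berlin", 2000),
    ("Q3", "Grape", "Munich", 1000), ("Q3", "Grape", "Frankfurt", 3000),
    ("Q3", "Lem-Lime", "Munich", 2000), ("Q3", "Lem-Lime", "Frankfurt", 2000),
    ("Q4", "Cola", "Munich", 4000),
    ("Q4", "Cherry", "Munich", 1000), ("Q4", "Cherry", "Frankfurt", 3000),
    ("Q4", "Grape", "Cologne", 1500), ("Q4", "Grape", "Berlin", 2500),
    ("Q4", "Lem-Lime", "Cologne", 2000), ("Q4", "Lem-Lime", "Berlin", 2000),
]

# Position of each dimension inside a fact tuple (an assumed layout).
DIMENSIONS = {"time": 0, "product": 1, "geography": 2}


def total_by(dimension):
    """Sum dollar sales for each member of one dimension."""
    index = DIMENSIONS[dimension]
    totals = defaultdict(int)
    for fact in FACTS:
        totals[fact[index]] += fact[3]
    return dict(totals)


print(total_by("time"))       # one total per quarter
print(total_by("product"))    # one total per soda flavor
print(total_by("geography"))  # one total per city
```

Each single-dimension view looks perfectly flat ($16,000 per quarter, $8,000 per product, $8,000 per city), which is why the variation only becomes visible once the dimensions are crossed.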

Page 7: Business Intelligence: Multidimensional Analysis

Multidimensional Analysis

Intuitive way for people with business training to analyze data

Natural

Easy

Effective

Difficult to get data into a format that supports multidimensional analysis

Page 8: Business Intelligence: Multidimensional Analysis

Operational Databases

Where did our data come from?

Lots of individual shoppers buying a soda

Each transaction stored in a database designed to store checkout transactions

Operational Database: supports the day-to-day operations of a company

Data in operational databases can’t easily be analyzed

Page 9: Business Intelligence: Multidimensional Analysis

Operational Databases

Core operational database functionality:

Gather data

Update data

Store data

Retrieve data

Archive data

Page 10: Business Intelligence: Multidimensional Analysis

Operational Databases

OLTP: Online Transaction Processing

Page 11: Business Intelligence: Multidimensional Analysis

OLTP Example

Buying toothpaste at Target:

1. You place toothpaste on conveyor belt

2. Cashier swipes barcode over POS scanner

3. POS system looks up price of toothpaste

4. POS totals cost of transaction + tax

5. POS prompts for payment

6. You swipe debit card and enter PIN

7. POS system transfers cost of toothpaste from your bank account to Target’s account

8. POS generates receipt and cashier bags purchase
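
The system-side steps of this checkout (price lookup, totaling with tax, funds transfer, recording the sale) can be sketched as one database transaction. This is only an illustration using Python's built-in sqlite3 module. The table names, the 7% tax rate, the starting balances, and the checkout function are all assumptions made for the example, not details of any real POS system.

```python
import sqlite3

# Hypothetical schema and seed data; prices and balances are integer cents.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT PRIMARY KEY, price_cents INTEGER);
    CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance_cents INTEGER);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, sku TEXT, total_cents INTEGER);
    INSERT INTO products VALUES ('toothpaste', 300);
    INSERT INTO accounts VALUES ('shopper', 1000);
    INSERT INTO accounts VALUES ('store', 0);
""")


def checkout(conn, sku, payer, payee, tax_percent=7):
    """Run one sale as an all-or-nothing transaction; return the total in cents."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        # Step 3: look up the price.
        (price,) = conn.execute(
            "SELECT price_cents FROM products WHERE sku = ?", (sku,)
        ).fetchone()
        # Step 4: total the cost plus tax (integer arithmetic, assumed rate).
        total = price + price * tax_percent // 100
        (balance,) = conn.execute(
            "SELECT balance_cents FROM accounts WHERE owner = ?", (payer,)
        ).fetchone()
        if balance < total:
            raise ValueError("insufficient funds")
        # Step 7: move the money from the shopper to the store.
        conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents - ? WHERE owner = ?",
            (total, payer),
        )
        conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents + ? WHERE owner = ?",
            (total, payee),
        )
        # Step 8: record the sale (the data a receipt would be printed from).
        conn.execute(
            "INSERT INTO sales (sku, total_cents) VALUES (?, ?)", (sku, total)
        )
    return total


print(checkout(conn, "toothpaste", "shopper", "store"))
```

The sketch touches a handful of rows and either completes fully or not at all, which is the shape of work OLTP systems are built for.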

Page 12: Business Intelligence: Multidimensional Analysis

Key OLTP Characteristics

Processes a transaction according to rules

Performs all elements of a transaction in real time

Continually processes multiple transactions

Page 13: Business Intelligence: Multidimensional Analysis

OLTP Systems

OLTP systems are everywhere:

Order tracking

Invoicing

Credit card processing

Retail POS

Banking

Airline reservations

OLTP is optimized for managing low-level business data

Page 14: Business Intelligence: Multidimensional Analysis

OLTP Systems

OLTP systems can be used to answer transactional questions

Raw transactional data isn’t directly useful for business intelligence

OLTP systems can’t be used to answer most analysis questions

Can’t efficiently search, sort, and summarize large numbers of records

Can’t handle the required calculations

Analysis queries have a negative impact on OLTP system performance

Page 15: Business Intelligence: Multidimensional Analysis

OLTP Systems

OLTP systems gather the raw data used for multidimensional analysis

Raw data has to be converted into something suitable for analysis

Converting raw data to something useful isn’t easy

Page 16: Business Intelligence: Multidimensional Analysis

OLTP Systems

IT departments used to spend most of their time and resources on operational systems

Operational systems are usually purchased as packaged apps today

Today’s operational apps usually include some meaningful reporting capabilities

Page 17: Business Intelligence: Multidimensional Analysis

OLTP Systems

Packaged systems have 2 big limitations:

1. Can only report on their own data – “silos” of data

2. Don’t really support multidimensional analysis

Silos: Sales, Marketing, Accounting, Finance

Page 18: Business Intelligence: Multidimensional Analysis

OLTP Systems

Every large company has some sort of BI system to analyze operational data

OLTP system vendors are constantly improving their ability to integrate with BI systems

Page 19: Business Intelligence: Multidimensional Analysis

OLAP

Modern BI systems are designed to follow the OnLine Analytical Processing (OLAP) model

Named by E.F. Codd (who invented the relational database model at IBM)

All OLAP systems have to meet three key criteria

Page 20: Business Intelligence: Multidimensional Analysis

Three Key OLAP Criteria

1. Must support multidimensional analysis

Top managers and analysts have always thought multidimensionally

The “by” qualifiers in a requested view (sales by product, by city, by quarter) are usually dimensions

OLAP systems organize data into multidimensional structures

Provide tools for users to examine and filter dimensional data

Page 21: Business Intelligence: Multidimensional Analysis

Three Key OLAP Criteria

2. Fast retrieval times

Answer more questions in less time

“Infinite Question Syndrome”

3. Calculation engine that can handle specialized multidimensional math

Lets analysts use simple formulas that are auto-performed across dimensions

Page 22: Business Intelligence: Multidimensional Analysis

Dimensions

Dimension: categorically consistent view of data

Two tests for dimensionality:

1. Can data about members be compared?

○ Sales numbers of one product compared to sales numbers of another product

2. Can data from members be aggregated into summaries?

○ Jan, Feb, Mar aggregate together as Q1

Page 23: Business Intelligence: Multidimensional Analysis

Slicing & Dicing

Dimensions let you “slice and dice” multidimensional data

Page 24: Business Intelligence: Multidimensional Analysis

Slicing & Dicing

Product X

Page 25: Business Intelligence: Multidimensional Analysis

Slicing & Dicing

[Figure: grid with months (Jan, Feb, Mar, Apr, May) across the top and cities (Boston, New York, Philadelphia, Baltimore, Washington) down the side]

Page 26: Business Intelligence: Multidimensional Analysis

Pivoted Soda Data

City       Time    Cola     Cherry   Grape    Lem-Lime  Total
Munich     Qtr 3   $ -      $ -      $1,000   $2,000    $3,000
           Qtr 4   $4,000   $1,000   $ -      $ -       $5,000
           Total   $4,000   $1,000   $1,000   $2,000    $8,000
Frankfurt  Qtr 3   $ -      $ -      $3,000   $2,000    $5,000
           Qtr 4   $ -      $3,000   $ -      $ -       $3,000
           Total   $ -      $3,000   $3,000   $2,000    $8,000
Cologne    Qtr 3   $2,500   $2,000   $ -      $ -       $4,500
           Qtr 4   $ -      $ -      $1,500   $2,000    $3,500
           Total   $2,500   $2,000   $1,500   $2,000    $8,000
Berlin     Qtr 3   $1,500   $2,000   $ -      $ -       $3,500
           Qtr 4   $ -      $ -      $2,500   $2,000    $4,500
           Total   $1,500   $2,000   $2,500   $2,000    $8,000
Grand Total        $8,000   $8,000   $8,000   $8,000    $32,000
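
Pivoting changes which dimensions form the rows and columns without touching the underlying facts. A rough Python sketch of the idea follows, reusing the soda figures from the tables. The pivot function and its row_key and col_key parameters are hypothetical names chosen for this example.

```python
from collections import defaultdict

# Non-empty cells of the soda table: (quarter, product, city, dollar sales).
FACTS = [
    ("Q3", "Cola", "Cologne", 2500), ("Q3", "Cola", "Berlin", 1500),
    ("Q3", "Cherry", "Cologne", 2000), ("Q3", "Cherry", "Berlin", 2000),
    ("Q3", "Grape", "Munich", 1000), ("Q3", "Grape", "Frankfurt", 3000),
    ("Q3", "Lem-Lime", "Munich", 2000), ("Q3", "Lem-Lime", "Frankfurt", 2000),
    ("Q4", "Cola", "Munich", 4000),
    ("Q4", "Cherry", "Munich", 1000), ("Q4", "Cherry", "Frankfurt", 3000),
    ("Q4", "Grape", "Cologne", 1500), ("Q4", "Grape", "Berlin", 2500),
    ("Q4", "Lem-Lime", "Cologne", 2000), ("Q4", "Lem-Lime", "Berlin", 2000),
]


def pivot(facts, row_key, col_key):
    """Return table[row][column] = summed dollars for the chosen layout.

    row_key is a tuple of dimension names nested on the rows;
    col_key is the single dimension spread across the columns.
    """
    table = defaultdict(lambda: defaultdict(int))
    for quarter, product, city, dollars in facts:
        record = {"quarter": quarter, "product": product, "city": city}
        row = tuple(record[name] for name in row_key)
        table[row][record[col_key]] += dollars
    return {row: dict(cols) for row, cols in table.items()}


# Original layout: quarter and product on the rows, cities across the top.
by_quarter = pivot(FACTS, row_key=("quarter", "product"), col_key="city")
# Pivoted layout: city and quarter on the rows, products across the top.
by_city = pivot(FACTS, row_key=("city", "quarter"), col_key="product")

print(by_quarter[("Q3", "Grape")])
print(by_city[("Munich", "Q4")])
```

Both layouts are built from the same fact list, so their grand totals always agree.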

Page 27: Business Intelligence: Multidimensional Analysis

OLAP

Geography Dimension: Munich, Frankfurt, Cologne, Berlin

Page 28: Business Intelligence: Multidimensional Analysis

OLAP

Time Dimension: Q1, Q2, Q3, Q4

Page 29: Business Intelligence: Multidimensional Analysis

OLAP

Cola

Cherry

Grape

Lemon-Lime

Page 30: Business Intelligence: Multidimensional Analysis

OLAP

Geography Dimension: Munich, Frankfurt, Cologne, Berlin

Time Dimension: Q1, Q2, Q3, Q4

Product: Cola, Cherry, Grape, Lemon-Lime

Page 31: Business Intelligence: Multidimensional Analysis

OLAP

Geography Dimension: Munich, Frankfurt, Cologne, Berlin

Time Dimension: Q1, Q2, Q3, Q4

Product: Cola, Cherry, Grape, Lemon-Lime

Value shown: $2,000

Page 32: Business Intelligence: Multidimensional Analysis

OLAP

Geography Dimension: Munich, Frankfurt, Cologne, Berlin

Time Dimension: Q1, Q2, Q3, Q4

Product: Cola, Cherry, Grape, Lemon-Lime

Value shown: $32,000

Page 33: Business Intelligence: Multidimensional Analysis

OLAP

Geography Dimension: Munich, Frankfurt, Cologne, Berlin

Time Dimension: Q1, Q2, Q3, Q4

Product: Cola, Cherry, Grape, Lemon-Lime

Page 34: Business Intelligence: Multidimensional Analysis

OLAP

Geography Dimension: Munich, Frankfurt, Cologne, Berlin

Time Dimension: Q1, Q2, Q3, Q4

Product: Cola, Cherry, Grape, Lemon-Lime

Value shown: $8,000

Page 35: Business Intelligence: Multidimensional Analysis

OLAP

Data cubes can have very large numbers of members

OLAP Cube: multidimensional structure that stores and maintains discrete intersection values

Some OLAP systems let cubes intersect with each other
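
One simple way to picture a cube of discrete intersection values is a mapping from (city, quarter, product) coordinates to a number. The sketch below is an assumption-laden toy, not how any particular OLAP product stores data; CUBE, cell, and slice_total are illustrative names. The figures of $2,000, $8,000, and $32,000 shown on the earlier cube slides are consistent with a single cell, a one-member slice, and the whole cube, although the slides do not say which cell or slice was highlighted.

```python
# Toy cube: (city, quarter, product) -> dollar sales, from the soda tables.
CUBE = {
    ("Cologne", "Q3", "Cola"): 2500, ("Berlin", "Q3", "Cola"): 1500,
    ("Cologne", "Q3", "Cherry"): 2000, ("Berlin", "Q3", "Cherry"): 2000,
    ("Munich", "Q3", "Grape"): 1000, ("Frankfurt", "Q3", "Grape"): 3000,
    ("Munich", "Q3", "Lem-Lime"): 2000, ("Frankfurt", "Q3", "Lem-Lime"): 2000,
    ("Munich", "Q4", "Cola"): 4000,
    ("Munich", "Q4", "Cherry"): 1000, ("Frankfurt", "Q4", "Cherry"): 3000,
    ("Cologne", "Q4", "Grape"): 1500, ("Berlin", "Q4", "Grape"): 2500,
    ("Cologne", "Q4", "Lem-Lime"): 2000, ("Berlin", "Q4", "Lem-Lime"): 2000,
}


def cell(city, quarter, product):
    """Value at one intersection; empty intersections read as 0."""
    return CUBE.get((city, quarter, product), 0)


def slice_total(city=None, quarter=None, product=None):
    """Sum every cell that matches the members that were fixed.

    Leaving an argument as None means 'all members of that dimension'.
    """
    wanted = (city, quarter, product)
    return sum(
        value
        for coords, value in CUBE.items()
        if all(w is None or w == c for w, c in zip(wanted, coords))
    )


print(cell("Munich", "Q3", "Lem-Lime"))          # one intersection
print(slice_total(city="Munich"))                # slice: fix one dimension
print(slice_total(city="Berlin", quarter="Q4"))  # dice: fix two dimensions
print(slice_total())                             # the whole cube
```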

Page 36: Business Intelligence: Multidimensional Analysis

Hierarchies

Typical analysis task:

3 measures: Units Sold, Average Price, Dollar Sales

100 products

24 months

200 major cities

Total data points: 3 × 100 × 24 × 200 = 1,440,000

Not all products are sold in all cities during all months, so many of those data points are empty
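
The data point count is the product of the dimension sizes multiplied by the number of measures. A short Python check of that arithmetic, with variable names chosen only for this example:

```python
from math import prod

measures = ["Units Sold", "Average Price", "Dollar Sales"]
dimension_sizes = {"product": 100, "month": 24, "city": 200}

# Every combination of one product, one month, and one city is an intersection.
intersections = prod(dimension_sizes.values())
# Each intersection can hold one value per measure.
data_points = intersections * len(measures)

print(f"{intersections:,} intersections, {data_points:,} data points")
```

Adding a fourth dimension or another measure multiplies the count again, which is why cube sizes grow so quickly.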

Page 37: Business Intelligence: Multidimensional Analysis

Hierarchies

Hierarchy – organizes data by levels

Each level in the hierarchy is the aggregate of the level beneath it

Examples:

Monthly data rolls up to quarters and years

Cities roll up to states and regions

Products roll up to product lines and groups

Calculations, like Average Price, can be back-calculated at each hierarchy level (e.g. from the aggregated Dollar Sales and Units Sold)
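
Back-calculation matters because ratio measures cannot simply be summed or averaged up a hierarchy. A minimal sketch, using made-up monthly figures for a single product (every number below is hypothetical), rolls months up to quarters and then recomputes Average Price from the aggregated totals:

```python
# Assumed month-to-quarter mapping for the first half of a year.
MONTH_TO_QUARTER = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}

# Hypothetical monthly figures for one product: (units sold, dollar sales).
MONTHLY = {
    "Jan": (100, 200.0), "Feb": (300, 450.0), "Mar": (100, 250.0),
    "Apr": (200, 400.0), "May": (200, 500.0), "Jun": (100, 300.0),
}


def roll_up(monthly, mapping):
    """Sum the additive measures (units, dollars) into each parent member."""
    totals = {}
    for month, (units, dollars) in monthly.items():
        parent = mapping[month]
        parent_units, parent_dollars = totals.get(parent, (0, 0.0))
        totals[parent] = (parent_units + units, parent_dollars + dollars)
    return totals


def average_price(units, dollars):
    """Average Price is derived from the two additive measures."""
    return dollars / units


quarterly = roll_up(MONTHLY, MONTH_TO_QUARTER)
q1_units, q1_dollars = quarterly["Q1"]

# Back-calculated at the quarter level: total dollars / total units.
q1_price = average_price(q1_units, q1_dollars)

# Naive alternative: average the three monthly prices, ignoring volume.
q1_naive = sum(average_price(*MONTHLY[m]) for m in ("Jan", "Feb", "Mar")) / 3

print(q1_units, q1_dollars, q1_price, q1_naive)
```

With these figures the back-calculated Q1 price is lower than the naive average, because the high-volume month also had the lowest price.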

Page 38: Business Intelligence: Multidimensional Analysis

Hierarchies

Hierarchies let you drill down into data to explore interesting patterns and anomalies

Top-down approach is like “20 Questions”

Start by exploring broad trends

Become more focused as analysis progresses

Top-down thinking is a natural way for humans to organize complex info

Page 39: Business Intelligence: Multidimensional Analysis

Ad hoc Analysis

Point-and-click drill-down is made usable by OLAP’s rapid response model

Lets managers and analysts perform ad hoc analysis

Paper-based reporting gives fixed answers to fixed questions

OLAP-based ad hoc analysis lets virtually any question be answered quickly

Page 40: Business Intelligence: Multidimensional Analysis

Ad hoc Analysis

Virtually any report can be formatted multidimensionally (pivoting & nesting dimensions)

Virtually anyone can be taught how to do their own analysis work with minimal training

Page 41: Business Intelligence: Multidimensional Analysis

Sample Hierarchy

2013
  Q1: Jan, Feb, Mar
  Q2: Apr, May, Jun
  Q3: Jul, Aug, Sep
  Q4: Oct, Nov, Dec
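
Drilling down through this hierarchy means replacing one member with its children and showing each child's aggregate. The Python sketch below mirrors the 2013 hierarchy above. The monthly sales numbers are invented purely for illustration, and HIERARCHY, total, and drill_down are hypothetical names.

```python
# Parent -> children, mirroring the sample 2013 hierarchy.
HIERARCHY = {
    "2013": ["Q1", "Q2", "Q3", "Q4"],
    "Q1": ["Jan", "Feb", "Mar"],
    "Q2": ["Apr", "May", "Jun"],
    "Q3": ["Jul", "Aug", "Sep"],
    "Q4": ["Oct", "Nov", "Dec"],
}

# Hypothetical dollar sales at the lowest (month) level.
MONTHLY_SALES = {
    "Jan": 10, "Feb": 20, "Mar": 30, "Apr": 40, "May": 50, "Jun": 60,
    "Jul": 70, "Aug": 80, "Sep": 90, "Oct": 100, "Nov": 110, "Dec": 120,
}


def total(member):
    """Aggregate a member by summing its children; months are the leaves."""
    children = HIERARCHY.get(member)
    if children is None:
        return MONTHLY_SALES[member]
    return sum(total(child) for child in children)


def drill_down(member):
    """Show the next level beneath a member, with each child's aggregate."""
    return {child: total(child) for child in HIERARCHY.get(member, [])}


print(total("2013"))       # broad view: the whole year
print(drill_down("2013"))  # one level down: quarters
print(drill_down("Q3"))    # one more level: months in Q3
```

A point-and-click OLAP front end performs the same kind of step each time a user expands a row.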

Page 42: Business Intelligence: Multidimensional Analysis

Attributes

Attribute: descriptive non-hierarchical information

Examples:

Model number

Size

List price

Color

Flavor

Street address

Page 43: Business Intelligence: Multidimensional Analysis

Measures

Measure: any quantitative expression contained in an OLAP system

A measure is the data that’s being analyzed across multiple dimensions

Example: Dollar Sales of soda by month, by product, and by city

Page 44: Business Intelligence: Multidimensional Analysis

Measures

Four important properties of a measure:

1. Always a quantity or expression that yields a quantity

2. Can take any quantitative format

3. Can be derived from any original data source or calculation

4. At least one measure required to perform OLAP analysis

Page 45: Business Intelligence: Multidimensional Analysis

Measures

The measures to be analyzed depend on the purpose of the OLAP system

In BI, measures are known by different names depending on the application:

Metric/Key Performance Indicator (KPI)

Benchmark

Ratio

Page 46: Business Intelligence: Multidimensional Analysis

Summary

Analysis gap between raw data and BI can be bridged by combining OLTP systems with BI systems

OLAP systems provide ad hoc analysis, slicing and dicing, pivoting dimensions, and drilling down through hierarchies

OLAP provides significant capabilities beyond standard single-dimensional analysis

Page 47: Business Intelligence: Multidimensional Analysis

Michael Lamont, ’12

[email protected]