Business Intelligence: Multidimensional Analysis
DESCRIPTION
An introduction to multidimensional business intelligence and OnLine Analytical Processing (OLAP) suitable for both a technical and non-technical audience. Covers dimensions, attributes, measures, Key Performance Indicators (KPIs), aggregates, hierarchies, and data cubes.
TRANSCRIPT
The Analysis Gap
Soda Example
Product: Cola, Cherry, Grape, Lemon-Lime
Geography: Munich, Frankfurt, Cologne, Berlin
Soda Example
Time $ Sales
Q3 $16,000
Q4 $16,000
Total $32,000
Product $ Sales
Cola $8,000
Cherry $8,000
Grape $8,000
Lemon-Lime $8,000
Total $32,000
Geography $ Sales
Munich $8,000
Frankfurt $8,000
Cologne $8,000
Berlin $8,000
Total $32,000
Soda Example
                 Munich   Frankfurt  Cologne  Berlin   Total
Q3  Cola         $ -      $ -        $2,500   $1,500   $4,000
    Cherry       $ -      $ -        $2,000   $2,000   $4,000
    Grape        $1,000   $3,000     $ -      $ -      $4,000
    Lemon-Lime   $2,000   $2,000     $ -      $ -      $4,000
    Total Q3     $3,000   $5,000     $4,500   $3,500   $16,000
Q4  Cola         $4,000   $ -        $ -      $ -      $4,000
    Cherry       $1,000   $3,000     $ -      $ -      $4,000
    Grape        $ -      $ -        $1,500   $2,500   $4,000
    Lemon-Lime   $ -      $ -        $2,000   $2,000   $4,000
    Total Q4     $5,000   $3,000     $3,500   $4,500   $16,000
Total            $8,000   $8,000     $8,000   $8,000   $32,000
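The contrast between the flat single-dimension totals and the detailed table can be reproduced with a few lines of Python. This is only a sketch: the record layout and the helper name `total_by` are assumptions made for illustration, while the dollar values are the non-empty cells of the soda table.

```python
from collections import defaultdict

# (quarter, product, city, dollar_sales): the non-empty cells of the soda table
SALES = [
    ("Q3", "Cola", "Cologne", 2500), ("Q3", "Cola", "Berlin", 1500),
    ("Q3", "Cherry", "Cologne", 2000), ("Q3", "Cherry", "Berlin", 2000),
    ("Q3", "Grape", "Munich", 1000), ("Q3", "Grape", "Frankfurt", 3000),
    ("Q3", "Lemon-Lime", "Munich", 2000), ("Q3", "Lemon-Lime", "Frankfurt", 2000),
    ("Q4", "Cola", "Munich", 4000),
    ("Q4", "Cherry", "Munich", 1000), ("Q4", "Cherry", "Frankfurt", 3000),
    ("Q4", "Grape", "Cologne", 1500), ("Q4", "Grape", "Berlin", 2500),
    ("Q4", "Lemon-Lime", "Cologne", 2000), ("Q4", "Lemon-Lime", "Berlin", 2000),
]

def total_by(position):
    """Sum dollar sales grouped by one dimension (0=quarter, 1=product, 2=city)."""
    totals = defaultdict(int)
    for row in SALES:
        totals[row[position]] += row[3]
    return dict(totals)

# Viewed one dimension at a time, every member looks identical.
by_quarter = total_by(0)
by_product = total_by(1)
by_city = total_by(2)
print(by_quarter)
print(by_product)
print(by_city)

# Combining two dimensions (city and quarter) exposes the variation.
munich_by_quarter = defaultdict(int)
for quarter, product, city, dollars in SALES:
    if city == "Munich":
        munich_by_quarter[quarter] += dollars
print(dict(munich_by_quarter))
```

Each single-dimension view reports the same flat totals, while the Munich breakdown shows sales rising from Q3 to Q4.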
Multidimensional Analysis
Intuitive way for people with business
training to analyze data
Natural
Easy
Effective
Difficult to get data into a format that
supports multidimensional analysis
Operational Databases
Where did our data come from?
Lots of individual shoppers buying a soda
Each transaction stored in database
designed to store checkout transactions
Operational Database: supports the
day-to-day operations of a company
Data in operational databases can’t
easily be analyzed
Operational Databases
Core operational database functionality:
Gather data
Update data
Store data
Retrieve data
Archive data
Operational Databases
OLTP: Online Transaction Processing
OLTP Example
Buying toothpaste at Target:
1. You place toothpaste on conveyor belt
2. Cashier swipes barcode over POS scanner
3. POS system looks up price of toothpaste
4. POS totals cost of transaction + tax
5. POS prompts for payment
6. You swipe debit card and enter PIN
7. POS system transfers cost of toothpaste from your
bank account to Target’s account
8. POS generates receipt and cashier bags
purchase
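The checkout steps above can be sketched as a single database transaction. This is a minimal illustration using Python's built-in `sqlite3` module, not how any real POS system is implemented; the table names, account names, prices, and tax rate are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0))"
)
conn.execute("CREATE TABLE sales (item TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("shopper", 2000), ("store", 0)]
)
conn.commit()

def checkout(conn, item, price_cents, tax_rate):
    """Record one sale and move the money as an all-or-nothing transaction."""
    total = round(price_cents * (1 + tax_rate))
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute("INSERT INTO sales VALUES (?, ?)", (item, total))
            conn.execute(
                "UPDATE accounts SET balance_cents = balance_cents + ? "
                "WHERE name = 'store'", (total,))
            # The CHECK constraint rejects an overdraft, undoing the steps above.
            conn.execute(
                "UPDATE accounts SET balance_cents = balance_cents - ? "
                "WHERE name = 'shopper'", (total,))
        return True
    except sqlite3.IntegrityError:
        return False

def balance(conn, name):
    row = conn.execute(
        "SELECT balance_cents FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

ok = checkout(conn, "toothpaste", 300, 0.10)
declined = checkout(conn, "television", 50000, 0.10)
print(ok, declined, balance(conn, "shopper"), balance(conn, "store"))
```

The second purchase exceeds the shopper's balance, so the whole transaction is rolled back and no partial sale is recorded.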
Key OLTP Characteristics
Processes a transaction according to
rules
Performs all elements of a transaction in
real time
Continually processes multiple
transactions
OLTP Systems
OLTP systems are everywhere:
Order tracking
Invoicing
Credit card processing
Retail POS
Banking
Airline reservations
OLTP is optimized for managing low-
level business data
OLTP Systems
OLTP systems can be used to answer
transactional questions
Raw transactional data not really useful
for business intelligence
OLTP systems can’t be used to answer
most analysis questions
Can’t search, sort, & summarize large
numbers of records
Can’t handle required calculations
Negative impact on OLTP system performance
OLTP Systems
OLTP systems gather raw data used for
multidimensional analysis
Raw data has to be converted into
something suitable for analysis
Converting raw data to something useful
isn’t easy
OLTP Systems
IT departments used to spend most of their time
and resources on operational systems
Operational systems are usually purchased as
packaged apps today
Today’s operational apps usually include
some meaningful reporting capabilities
OLTP Systems
Packaged systems have 2 big limitations:
1. Can only report on their own data – “silos” of data
2. Don’t really support multidimensional analysis
Sales Marketing Accounting Finance
OLTP Systems
Every large company has some sort of
BI system to analyze operational data
OLTP system vendors are constantly
improving their ability to integrate with BI
systems
OLAP
Modern BI systems are designed to follow the
Online Analytical Processing (OLAP)
model
Term coined by E.F. Codd (inventor of the
relational database model)
All OLAP systems have to meet three
key criteria
Three Key OLAP Criteria
1. Must support multidimensional analysis
Top managers/analysts have always
thought multidimensionally
The “by” qualifiers in a request (sales by product, by city) are usually dimensions
OLAP systems organize data into
multidimensional structures
Provide tools for users to examine/filter
dimensional data
Three Key OLAP Criteria
2. Fast retrieval times
Answer more questions in less time
“Infinite Question Syndrome”
3. Calculation engine that can handle
specialized multidimensional math
Lets analysts use simple formulas that are
auto-performed across dimensions
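As a rough illustration of a formula that is written once and applied across a dimension, the sketch below computes the Q3-to-Q4 change for every city in the soda example. The helper `apply_across` is a hypothetical name, not part of any OLAP product; the figures are the per-city quarterly totals from the soda table.

```python
# Total dollar sales per city and quarter, from the soda example
sales = {
    "Munich": {"Q3": 3000, "Q4": 5000},
    "Frankfurt": {"Q3": 5000, "Q4": 3000},
    "Cologne": {"Q3": 4500, "Q4": 3500},
    "Berlin": {"Q3": 3500, "Q4": 4500},
}

def apply_across(cube, formula):
    """Evaluate one formula for every member of the outer dimension."""
    return {member: formula(values) for member, values in cube.items()}

# The analyst writes the formula once; it runs for every city.
change = apply_across(sales, lambda v: v["Q4"] - v["Q3"])
for city, delta in change.items():
    print(f"{city}: {delta:+,}")
```

A real calculation engine would do the same across every combination of dimensions, not just one.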
Dimensions
Dimension: categorically consistent
view of data
Two tests for dimensionality:
1. Can data about members be compared?
○ Sales numbers of one product compared to
sales numbers of another product
2. Can data from members be aggregated
into summaries?
○ Jan, Feb, Mar aggregate together as Q1
Slicing & Dicing
Dimensions let you “slice and dice”
multidimensional data
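A minimal sketch of slicing and dicing, assuming the cube is held as a plain Python dictionary keyed by (quarter, product, city). The function names `slice_cube` and `dice_cube` are illustrative, and the values are the non-empty cells of the soda example.

```python
# Soda example cube: (quarter, product, city) -> dollar sales.
# Empty cells ("$ -" in the table) are simply not stored.
CUBE = {
    ("Q3", "Cola", "Cologne"): 2500, ("Q3", "Cola", "Berlin"): 1500,
    ("Q3", "Cherry", "Cologne"): 2000, ("Q3", "Cherry", "Berlin"): 2000,
    ("Q3", "Grape", "Munich"): 1000, ("Q3", "Grape", "Frankfurt"): 3000,
    ("Q3", "Lemon-Lime", "Munich"): 2000, ("Q3", "Lemon-Lime", "Frankfurt"): 2000,
    ("Q4", "Cola", "Munich"): 4000,
    ("Q4", "Cherry", "Munich"): 1000, ("Q4", "Cherry", "Frankfurt"): 3000,
    ("Q4", "Grape", "Cologne"): 1500, ("Q4", "Grape", "Berlin"): 2500,
    ("Q4", "Lemon-Lime", "Cologne"): 2000, ("Q4", "Lemon-Lime", "Berlin"): 2000,
}

def slice_cube(cube, axis, member):
    """Fix one dimension to a single member; the result has one fewer dimension."""
    return {
        key[:axis] + key[axis + 1:]: value
        for key, value in cube.items()
        if key[axis] == member
    }

def dice_cube(cube, allowed):
    """Keep cells whose member on each listed axis is in the allowed set."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[axis] in members for axis, members in allowed.items())
    }

# Slice: everything sold in Q4, keyed by (product, city)
q4 = slice_cube(CUBE, 0, "Q4")
print(sum(q4.values()))

# Dice: a sub-cube of two products in two cities
sub = dice_cube(CUBE, {1: {"Cola", "Cherry"}, 2: {"Munich", "Frankfurt"}})
print(sub)
```

A slice removes a dimension by pinning it to one member, while a dice keeps all dimensions but restricts each to a subset of members.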
Slicing & Dicing
[Figure: slice for Product X; columns Jan, Feb, Mar, Apr, May; rows Boston, New York, Philadelphia, Baltimore, Washington]
Pivoted Soda Data
                    Cola     Cherry   Grape    Lem-Lime  Total
Munich     Qtr 3    $ -      $ -      $1,000   $2,000    $3,000
           Qtr 4    $4,000   $1,000   $ -      $ -       $5,000
           Total    $4,000   $1,000   $1,000   $2,000    $8,000
Frankfurt  Qtr 3    $ -      $ -      $3,000   $2,000    $5,000
           Qtr 4    $ -      $3,000   $ -      $ -       $3,000
           Total    $ -      $3,000   $3,000   $2,000    $8,000
Cologne    Qtr 3    $2,500   $2,000   $ -      $ -       $4,500
           Qtr 4    $ -      $ -      $1,500   $2,000    $3,500
           Total    $2,500   $2,000   $1,500   $2,000    $8,000
Berlin     Qtr 3    $1,500   $2,000   $ -      $ -       $3,500
           Qtr 4    $ -      $ -      $2,500   $2,000    $4,500
           Total    $1,500   $2,000   $2,500   $2,000    $8,000
Grand Total         $8,000   $8,000   $8,000   $8,000    $32,000
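Pivoting amounts to choosing which dimensions go on the rows and which go on the columns of the same underlying data. The sketch below assumes a simple record list and a hypothetical `crosstab` helper; it is not tied to any particular OLAP tool.

```python
from collections import defaultdict

# (quarter, product, city, dollar_sales): the non-empty cells of the soda table
SALES = [
    ("Q3", "Cola", "Cologne", 2500), ("Q3", "Cola", "Berlin", 1500),
    ("Q3", "Cherry", "Cologne", 2000), ("Q3", "Cherry", "Berlin", 2000),
    ("Q3", "Grape", "Munich", 1000), ("Q3", "Grape", "Frankfurt", 3000),
    ("Q3", "Lemon-Lime", "Munich", 2000), ("Q3", "Lemon-Lime", "Frankfurt", 2000),
    ("Q4", "Cola", "Munich", 4000),
    ("Q4", "Cherry", "Munich", 1000), ("Q4", "Cherry", "Frankfurt", 3000),
    ("Q4", "Grape", "Cologne", 1500), ("Q4", "Grape", "Berlin", 2500),
    ("Q4", "Lemon-Lime", "Cologne", 2000), ("Q4", "Lemon-Lime", "Berlin", 2000),
]
DIMENSIONS = {"quarter": 0, "product": 1, "city": 2}

def crosstab(rows, row_dims, col_dim):
    """Sum dollar sales into {row_key: {column_member: total}}.

    row_dims is a tuple of dimension names nested on the rows;
    col_dim is the single dimension spread across the columns.
    """
    table = defaultdict(lambda: defaultdict(int))
    for record in rows:
        row_key = tuple(record[DIMENSIONS[d]] for d in row_dims)
        table[row_key][record[DIMENSIONS[col_dim]]] += record[3]
    return {key: dict(cols) for key, cols in table.items()}

# Original layout: quarter and product on rows, city on columns
original = crosstab(SALES, ("quarter", "product"), "city")
# Pivoted layout: city and quarter on rows, product on columns
pivoted = crosstab(SALES, ("city", "quarter"), "product")

print(original[("Q3", "Cola")])
print(pivoted[("Munich", "Q4")])
```

Both layouts are built from the same records; only the arrangement of dimensions changes.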
OLAP
[Figure sequence: a data cube built from three dimensions. Geography Dimension: Munich, Frankfurt, Cologne, Berlin. Time Dimension: Q1, Q2, Q3, Q4. Product members: Cola, Cherry, Grape, Lemon-Lime. Values shown on the cube in later frames: $2,000, $32,000, $8,000.]
OLAP
Data cubes can have very large
numbers of members
OLAP Cube: multidimensional structure
that stores and maintains discrete
intersection values
Some OLAP systems let cubes intersect
with each other
Hierarchies
Typical analysis task:
Units Sold, Average Price, Dollar Sales
100 products
24 months
200 major cities
Total data points: 1,440,000
Not all products sold in all cities during
all months
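The total on this slide is the product of the number of measures and the number of members in each dimension. A quick check in Python (variable names are illustrative):

```python
measures = ["Units Sold", "Average Price", "Dollar Sales"]
products, months, cities = 100, 24, 200

# One potential data point per measure at every product/month/city intersection
data_points = len(measures) * products * months * cities
print(f"{data_points:,}")  # 1,440,000
```

This is the upper bound; since not every product sells in every city every month, the number of populated intersections is smaller.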
Hierarchies
Hierarchy – organizes data by levels
Each level in the hierarchy is the
aggregate of the levels beneath it
Examples:
Monthly data rolls up to quarters and years
Cities roll up to states and regions
Products roll up to product lines and groups
Calculations, like Average Price, can be
back-calculated at each hierarchy level
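Back-calculation can be sketched as follows: additive measures (units, dollars) are summed up the hierarchy, and Average Price is then recomputed from those sums at each level instead of averaging the lower-level averages. The monthly figures below are invented purely for illustration and do not come from the presentation.

```python
# Hypothetical monthly figures for one product (invented for illustration)
monthly = {
    "Jan": {"units": 100, "dollars": 200.0},
    "Feb": {"units": 300, "dollars": 450.0},
    "Mar": {"units": 100, "dollars": 250.0},
}

def roll_up(members):
    """Aggregate additive measures, then back-calculate the average price."""
    members = list(members)
    units = sum(m["units"] for m in members)
    dollars = sum(m["dollars"] for m in members)
    return {"units": units, "dollars": dollars, "avg_price": dollars / units}

q1 = roll_up(monthly.values())

# For comparison: a plain average of the three monthly average prices
naive_avg = sum(m["dollars"] / m["units"] for m in monthly.values()) / len(monthly)

print(q1["avg_price"], naive_avg)
```

The two results differ because the plain average ignores how many units were sold in each month.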
Hierarchies
Hierarchies let you drill-down into data
to explore interesting patterns and
anomalies
Top-down approach is like “20
Questions”
Start by exploring broad trends
Become more focused as analysis
progresses
Top-down thinking is natural way for
humans to organize complex info
Ad hoc Analysis
Point-and-click drill-down is made
usable by OLAP’s rapid response model
Lets managers and analysts perform ad
hoc analysis
Paper-based reporting gives fixed
answers to fixed questions
OLAP-based ad hoc analysis lets
virtually any question be answered
quickly
Ad hoc Analysis
Virtually any report can be formatted
multidimensionally (pivoting & nesting
dimensions)
Virtually anyone can be taught how to
do their own analysis work with minimal
training
Sample Hierarchy
2013
  Q1: Jan, Feb, Mar
  Q2: Apr, May, Jun
  Q3: Jul, Aug, Sep
  Q4: Oct, Nov, Dec
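One simple way to represent this sample hierarchy in code is a nested mapping from year to quarters to months. The sketch below, with a hypothetical `drill_down` helper, returns the members one level beneath a given path.

```python
# The sample time hierarchy: year -> quarters -> months
HIERARCHY = {
    "2013": {
        "Q1": ["Jan", "Feb", "Mar"],
        "Q2": ["Apr", "May", "Jun"],
        "Q3": ["Jul", "Aug", "Sep"],
        "Q4": ["Oct", "Nov", "Dec"],
    }
}

def drill_down(*path):
    """Return the members one level below the given path (year, then quarter)."""
    node = HIERARCHY
    for member in path:
        node = node[member]
    return list(node)

print(drill_down())              # top level
print(drill_down("2013"))        # quarters in 2013
print(drill_down("2013", "Q2"))  # months in Q2
```

Each call moves one level further down, mirroring a point-and-click drill-down from year to quarter to month.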
Attributes
Attribute: descriptive non-hierarchical
information
Examples:
Model number
Size
List price
Color
Flavor
Street address
Measures
Measure: any quantitative expression
contained in an OLAP system
A measure is the data that’s being
analyzed across multiple dimensions
Example: Dollar Sales of soda by
month, by product, and by city
Measures
Four important properties of a measure:
1. Always a quantity or expression that yields
a quantity
2. Can take any quantitative format
3. Can be derived from any original data
source or calculation
4. At least one measure required to perform
OLAP analysis
Measures
The measures to be analyzed depend
on the purpose of the OLAP system
In BI, measures known by different
names depending on application:
Metric/Key Performance Indicator (KPI)
Benchmark
Ratio
Summary
Analysis gap between raw data and BI
can be bridged by combining OLTP
systems with BI systems
OLAP systems provide ad hoc analysis,
slicing and dicing, pivoting dimensions,
and drilling down through hierarchies
OLAP provides significant capabilities
beyond standard single-dimensional
analysis
Michael Lamont, ’12