online analytical processing (olap) hweichao lu cs157b-02 spring 2007
Post on 15-Jan-2016
Online Analytical Processing (OLAP)
Hweichao Lu
CS157B-02 Spring 2007
What is OLAP
Basic idea: converting data into information that decision makers need
Concept: analyzing data along multiple dimensions in a structure called a data cube
History
In 1993, E. F. Codd came up with the term online analytical processing (OLAP) and proposed 12 criteria to define an OLAP database
The term OLAP seems perfect to describe databases designed to facilitate decision making (analysis) in an organization
Purpose of OLAP
To derive summarized information from large-volume databases
To generate automated reports for human viewing
Why We Need OLAP over a Relational Database I
Consistently fast response
OLAP obtains a consistently fast response by prestoring calculated values
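The idea of prestoring calculated values can be sketched as follows. This is a minimal illustration, not how any particular OLAP product works; the sample rows and the names `SALES`, `precompute_totals`, and `units_for_region` are invented for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, units_sold). Invented sample data.
SALES = [
    ("west", "skirt", 4),
    ("west", "dress", 7),
    ("east", "skirt", 2),
    ("east", "dress", 5),
    ("east", "skirt", 3),
]

def precompute_totals(rows):
    """Scan the detail rows once and store the total units per region."""
    totals = defaultdict(int)
    for region, _product, units in rows:
        totals[region] += units
    return dict(totals)

# Built once, for example when the data is loaded.
TOTALS_BY_REGION = precompute_totals(SALES)

def units_for_region(region):
    """Answer a query with a dictionary lookup instead of rescanning SALES."""
    return TOTALS_BY_REGION.get(region, 0)

print(units_for_region("east"))  # 10
```

The query cost no longer depends on the number of detail rows, which is where the consistent response time comes from. The price is the up-front work and the storage for the totals.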
Why We Need OLAP over a Relational Database II
Metadata-based queries
Provide analysis functions that are difficult or impossible to express in SQL
SQL was developed primarily for transaction systems, not for reporting applications
Why We Need OLAP over a Relational Database III
Spreadsheet-style formulas
Design the data structure with users in mind.
Spreadsheets are key components of business management because they are intuitive to create
Step I
1. Identify multidimensional data
measure attribute (measures some value and can be aggregated upon)
dimension attribute (defines the dimensions on which measure attributes, and summaries of them, are viewed)
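As a small illustration of the two kinds of attribute, the sketch below treats `item` and `color` as dimension attributes and `quantity` as the measure attribute. The rows and the helper name `summarize` are assumptions made up for this example.

```python
from collections import defaultdict

# Invented sample rows. "item" and "color" act as dimension attributes;
# "quantity" is the measure attribute.
ROWS = [
    {"item": "skirt", "color": "dark", "quantity": 8},
    {"item": "skirt", "color": "pastel", "quantity": 35},
    {"item": "dress", "color": "dark", "quantity": 20},
    {"item": "dress", "color": "pastel", "quantity": 10},
]

def summarize(rows, dimension, measure):
    """Sum the measure attribute for each distinct value of one dimension attribute."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

by_item = summarize(ROWS, "item", "quantity")
by_color = summarize(ROWS, "color", "quantity")
print(by_item)   # {'skirt': 43, 'dress': 30}
print(by_color)  # {'dark': 28, 'pastel': 45}
```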
(Cont.)
Each dimension is typically expressed as a "hierarchy"
Hierarchy: the analyst is interested in different levels of detail of a dimension
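A time dimension is a common example of a hierarchy. The sketch below assumes a year > quarter > month > day hierarchy, which is an illustrative choice and not one named in the slides; `hierarchy_levels` is a hypothetical helper.

```python
from datetime import date

def hierarchy_levels(d):
    """Return the value of a date at each level of a year > quarter > month > day hierarchy."""
    quarter = (d.month - 1) // 3 + 1
    return {
        "year": str(d.year),
        "quarter": f"{d.year}-Q{quarter}",
        "month": f"{d.year}-{d.month:02d}",
        "day": d.isoformat(),
    }

levels = hierarchy_levels(date(2007, 5, 14))
print(levels["quarter"])  # 2007-Q2
```

Each level gives a coarser or finer label for the same fact, so the same measure can be summarized at whichever level the analyst picks.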
Step II
2. Analyze multidimensional data in a cross-tabulation
row header: values for one attribute
column header: values for another attribute
individual cell: an aggregated value
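The cross-tabulation described above can be sketched in a few lines. The data and the function names `cross_tab` and `render` are invented for illustration; the aggregate used here is a sum, though other aggregates would fit the same structure.

```python
from collections import defaultdict

# Invented sample rows: (item, color, quantity).
ROWS = [
    ("skirt", "dark", 8),
    ("skirt", "pastel", 35),
    ("dress", "dark", 20),
    ("dress", "pastel", 10),
    ("skirt", "dark", 2),
]

def cross_tab(rows):
    """Return {row_header: {column_header: summed quantity}}."""
    table = defaultdict(lambda: defaultdict(int))
    for item, color, quantity in rows:
        table[item][color] += quantity
    return {item: dict(cols) for item, cols in table.items()}

def render(table):
    """Format the cross-tab as plain text with one line per row header."""
    columns = sorted({c for cols in table.values() for c in cols})
    lines = ["item".ljust(8) + "".join(c.rjust(8) for c in columns)]
    for item in sorted(table):
        cells = "".join(str(table[item].get(c, 0)).rjust(8) for c in columns)
        lines.append(item.ljust(8) + cells)
    return "\n".join(lines)

TABLE = cross_tab(ROWS)
print(render(TABLE))
```

Here `item` supplies the row headers, `color` the column headers, and each cell holds the summed quantity for that pair.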
Step III
3. Visualize the n-dimensional cube (data cube)
The word "cube" describes what in the relational world would be the integration of the fact table with the dimension tables
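One way to picture a data cube is as the set of totals for every combination of dimensions, with a placeholder standing in for each dimension that has been summed away. The sketch below assumes three invented dimensions and uses the string "all" as that placeholder; `build_cube` is a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("item", "color", "size")

# Invented sample rows.
ROWS = [
    {"item": "skirt", "color": "dark", "size": "small", "quantity": 2},
    {"item": "skirt", "color": "dark", "size": "large", "quantity": 6},
    {"item": "skirt", "color": "pastel", "size": "small", "quantity": 11},
    {"item": "dress", "color": "dark", "size": "small", "quantity": 4},
]

ALL = "all"  # placeholder for a dimension that has been aggregated away

def build_cube(rows, dimensions, measure):
    """Sum the measure for every subset of dimensions (2**n groupings in total)."""
    cube = defaultdict(int)
    for r in range(len(dimensions) + 1):
        for kept in combinations(dimensions, r):
            for row in rows:
                key = tuple(row[d] if d in kept else ALL for d in dimensions)
                cube[key] += row[measure]
    return dict(cube)

CUBE = build_cube(ROWS, DIMENSIONS, "quantity")
print(CUBE[("skirt", ALL, ALL)])  # 19
print(CUBE[(ALL, ALL, ALL)])      # 23
```

The sketch assumes that no real dimension value equals "all"; a production design would use a distinct marker.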
Step IV
After you design the cube, you will use the cube's structure to build a relational database (known as a star schema) to house the data for the cube
Step V
Once you load data into the relational database, and then into the cube, you'll be able to see how attributes, dimensions, measures, and measure groups fit together within a cube to create a powerful analytical tool.
Star Schema
Cubes are easily stored in relational databases, using a denormalized data structure called the star schema, developed by Ralph Kimball
Starts with a central fact table
Each row in the central fact table contains some combination of keys that makes it unique. These keys are called dimensions.
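A minimal star schema can be sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_product`, `dim_store`) and the data are assumptions for illustration only. The composite primary key on the fact table models the point that the combination of dimension keys makes each fact row unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
-- Central fact table: one row per (product, store) combination.
-- REFERENCES documents the link to each dimension table; SQLite only
-- enforces it when PRAGMA foreign_keys is switched on.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id INTEGER REFERENCES dim_store(store_id),
    units INTEGER,
    PRIMARY KEY (product_id, store_id)
);
INSERT INTO dim_product VALUES (1, 'skirt'), (2, 'dress');
INSERT INTO dim_store VALUES (1, 'San Jose'), (2, 'Boston');
INSERT INTO fact_sales VALUES (1, 1, 4), (2, 1, 7), (1, 2, 5);
""")

def add_fact(product_id, store_id, units):
    """Insert one fact row; return False if that key combination already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO fact_sales VALUES (?, ?, ?)",
                (product_id, store_id, units),
            )
        return True
    except sqlite3.IntegrityError:
        return False

NEW_ROW = add_fact(2, 2, 3)    # new key combination
DUPLICATE = add_fact(1, 1, 9)  # (1, 1) is already present
print(NEW_ROW, DUPLICATE)      # True False
```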
Slicing & Dicing
Additional functionality that can be thought of as viewing a slice of the data cube, particularly when values for multiple dimensions are fixed.
Slicing/dicing simply consists of selecting specific values for these attributes, which are then displayed on top of the cross-tab
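Fixing values for some dimensions and cross-tabulating the rest might look like the sketch below. The rows and the helper names `slice_rows` and `cross_tab` are invented; fixing one value is shown for the slice and fixing two for the dice.

```python
from collections import defaultdict

# Invented sample rows with three dimension attributes and one measure.
ROWS = [
    {"item": "skirt", "color": "dark", "size": "small", "quantity": 2},
    {"item": "skirt", "color": "dark", "size": "large", "quantity": 6},
    {"item": "skirt", "color": "pastel", "size": "small", "quantity": 11},
    {"item": "dress", "color": "dark", "size": "small", "quantity": 4},
]

def slice_rows(rows, **fixed):
    """Keep only the rows whose dimension values match every fixed value."""
    return [r for r in rows if all(r[dim] == val for dim, val in fixed.items())]

def cross_tab(rows, row_dim, col_dim, measure):
    """Sum the measure into {row value: {column value: total}}."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_dim]][r[col_dim]] += r[measure]
    return {k: dict(v) for k, v in table.items()}

# Slice: fix size = "small", then cross-tabulate item by color.
SMALL_ONLY = cross_tab(slice_rows(ROWS, size="small"), "item", "color", "quantity")

# Dice: fix values for two dimensions at once.
DICED = slice_rows(ROWS, size="small", color="dark")

print(SMALL_ONLY)  # {'skirt': {'dark': 2, 'pastel': 11}, 'dress': {'dark': 4}}
print(len(DICED))  # 2
```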
Rollup & Drill-down
OLAP permits users to view data at any desired level of granularity.
Rollup: moving from finer-granularity data to coarser granularity
Drill-down: the opposite of rollup, moving from coarser-granularity data to finer granularity
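Rollup and drill-down can be sketched as viewing the same detail data at different levels of a hierarchy. The example assumes daily totals keyed by ISO date strings, so that a string prefix identifies the month or the year; the data and the name `view_at` are invented.

```python
from collections import defaultdict

# Invented daily totals keyed by ISO date (YYYY-MM-DD).
DAILY = {
    "2007-01-15": 5,
    "2007-01-20": 3,
    "2007-02-02": 4,
    "2008-01-05": 7,
}

# Number of leading characters of the ISO date that identify each level.
LEVEL_PREFIX = {"day": 10, "month": 7, "year": 4}

def view_at(daily, level):
    """Aggregate the daily totals to the requested level of granularity."""
    width = LEVEL_PREFIX[level]
    totals = defaultdict(int)
    for day, units in daily.items():
        totals[day[:width]] += units
    return dict(totals)

by_month = view_at(DAILY, "month")  # rollup from day to month
by_year = view_at(DAILY, "year")    # rollup from day to year
print(by_month)  # {'2007-01': 8, '2007-02': 4, '2008-01': 7}
print(by_year)   # {'2007': 12, '2008': 7}
```

Drill-down goes the other way, for example from `by_year` back to `by_month`. It cannot be computed from the coarser totals alone, so the finer-granularity data has to be kept available.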
OLAP Implementation
Multidimensional OLAP (MOLAP)
Relational OLAP (ROLAP)
Hybrid OLAP (HOLAP)
MOLAP
The database is stored in a special, usually proprietary, structure that is optimized for multidimensional analysis.
+: very fast query response time because data is mostly pre-calculated
-: practical limit on the size because of the time taken to calculate the database and the space required to hold these pre-calculated values
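The size limit can be made concrete with a rough count. The sketch below assumes a fully pre-calculated cube in which every dimension also gets an "all" member, so a dimension with c distinct values contributes c + 1 positions. It counts cells in the dense worst case and does not describe the storage format of any real MOLAP engine; the function names are hypothetical.

```python
from math import prod

def grouping_count(num_dimensions):
    """Number of distinct groupings (subsets of dimensions) in a full cube."""
    return 2 ** num_dimensions

def cube_cell_count(cardinalities):
    """Upper bound on the cells in a full cube, counting an 'all' member per dimension."""
    return prod(c + 1 for c in cardinalities)

print(grouping_count(3))              # 8
print(cube_cell_count([10, 20, 30]))  # 7161
```

Three modest dimensions already give thousands of cells, and each added dimension multiplies the total, which is why calculation time and space become the practical limit.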
ROLAP
The database is a standard relational database and the database model is a multidimensional model, often referred to as a star or snowflake model or schema.
+: more scalable solution
-: performance of the queries will be largely governed by the complexity of the SQL and the number and size of the tables being joined in the query
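In a ROLAP setting the summary is computed by SQL at query time. The sketch below uses `sqlite3` with invented tables and data to show a typical star join; every extra dimension in the question adds another join, which is the cost the slide refers to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE fact_sales (product_id INTEGER, store_id INTEGER, units INTEGER);
INSERT INTO dim_product VALUES (1, 'clothing'), (2, 'shoes');
INSERT INTO dim_store VALUES (1, 'west'), (2, 'east');
INSERT INTO fact_sales VALUES (1, 1, 4), (2, 1, 7), (1, 2, 5), (2, 2, 3);
""")

def units_by_category_and_region(connection):
    """Join the fact table to both dimension tables and aggregate at query time."""
    sql = """
        SELECT p.category, s.region, SUM(f.units)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY p.category, s.region
        ORDER BY p.category, s.region
    """
    return connection.execute(sql).fetchall()

RESULT = units_by_category_and_region(conn)
for row in RESULT:
    print(row)
```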
HOLAP
A hybrid of ROLAP and MOLAP
Can be thought of as a virtual database whereby the higher levels of the database are implemented as MOLAP and the lower levels of the database as ROLAP
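The split between higher and lower levels can be sketched as a simple router. This is only an illustration of the idea under invented assumptions: region totals stand in for the pre-calculated (MOLAP-style) upper level, and city queries fall through to SQL over the detail table (ROLAP-style). The table, data, and `query` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, city TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("west", "San Jose", 4), ("west", "Seattle", 7), ("east", "Boston", 5)],
)

# Higher level: region totals are calculated once and kept in memory.
REGION_TOTALS = dict(
    conn.execute("SELECT region, SUM(units) FROM sales GROUP BY region")
)

def query(level, value):
    """Serve region totals from the pre-calculated store, city totals from SQL."""
    if level == "region":
        return REGION_TOTALS.get(value, 0)
    if level == "city":
        row = conn.execute(
            "SELECT COALESCE(SUM(units), 0) FROM sales WHERE city = ?", (value,)
        ).fetchone()
        return row[0]
    raise ValueError(f"unknown level: {level}")

print(query("region", "west"))  # 11
print(query("city", "Boston"))  # 5
```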
DOLAP
The previous terms are used to refer to server-based OLAP technologies
DOLAP (Desktop OLAP)
DOLAP enables users to quickly pull together small cubes that run on their desktops or laptops
Conclusion
OLAP is a significant improvement over query systems
OLAP is an interactive system that lets users view different summaries of multidimensional data by interactively selecting attributes in a multidimensional data cube