![Page 1: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/1.jpg)
Data Aggregation in Today's Data Warehouse
New England Business Objects User Group
Yossi Matias, CTO
HyperRoll
![Page 2: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/2.jpg)
2 1/24/2006
Recap – BI made easy
![Page 3: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/3.jpg)
Reports made easy
![Page 4: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/4.jpg)
… and widespread …
![Page 5: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/5.jpg)
Business Objects XI Platform Capabilities
• High performance
• Scalability
• Reliability
• Service-oriented architecture
• But what about the underlying data warehouse?
![Page 6: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/6.jpg)
An example application
![Page 7: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/7.jpg)
BO Universe
![Page 8: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/8.jpg)
Aggregate queries
![Page 9: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/9.jpg)
Typical Data Warehouse
• A schema
• Methodology
• Lots of summary tables
• Table management challenges
– Numbers of tables
– Complex configurations
– Table refresh
– Redundant storage
![Page 10: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/10.jpg)
What’s wrong with this picture?
Multiple views of summary tables and complex universe
![Page 11: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/11.jpg)
Performance Fundamentals: Aggregations
[Chart: processing time and query time (y-axis: Time) against the number of aggregations (x-axis)]
• Pre-calculated summaries of data
• Intersections of levels from each dimension
• Tradeoff between processing and query times
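The idea of pre-calculating one summary per intersection of dimension levels can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not HyperRoll's algorithm: the fact rows, the two dimensions (time and geography), and their level names are all hypothetical and do not come from the presentation.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (day, month, city, region, amount).
FACTS = [
    ("2006-01-03", "2006-01", "Boston", "East", 120.0),
    ("2006-01-17", "2006-01", "Hartford", "East", 80.0),
    ("2006-02-02", "2006-02", "Boston", "East", 50.0),
    ("2006-02-09", "2006-02", "Denver", "West", 200.0),
]

# Assumed levels per dimension, mapped to the column index in a fact row.
TIME_LEVELS = {"day": 0, "month": 1}
GEO_LEVELS = {"city": 2, "region": 3}
AMOUNT = 4


def build_aggregate(facts, time_level, geo_level):
    """Pre-calculate the summary for one intersection of levels."""
    t, g = TIME_LEVELS[time_level], GEO_LEVELS[geo_level]
    totals = defaultdict(float)
    for row in facts:
        totals[(row[t], row[g])] += row[AMOUNT]
    return dict(totals)


def build_all(facts):
    """Processing step: one aggregate per combination of levels."""
    return {
        (tl, gl): build_aggregate(facts, tl, gl)
        for tl, gl in product(TIME_LEVELS, GEO_LEVELS)
    }


# More aggregates mean more processing up front, but each query
# becomes a dictionary lookup instead of a scan of the fact rows.
aggregates = build_all(FACTS)
month_region = aggregates[("month", "region")]
print(month_region[("2006-01", "East")])
```

Building every combination maximizes processing time and minimizes query time; building none does the opposite, which is the tradeoff the chart describes.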
![Page 12: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/12.jpg)
The Summary Table Dilemma
[Chart: query performance and maintenance burden (from Simple to Unbearable) against the number of summary tables]
• ROLAP engines require a steady diet of summary tables to perform
• A few queries have acceptable performance…
• …but the majority of queries, especially ad-hoc requests, perform poorly and system adoption suffers
• At some point summary table maintenance becomes unbearable
![Page 13: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/13.jpg)
Typical Data Warehouse Environments
[Diagram]
Applications, Databases, Flat Files, Mainframe, EAI/EDI
ETL Layer
Data Warehouse, ODS, Data marts
$: Summary Tables, Multidimensional Data Stores, Bursted Reports, Data Alerts, Cached Reports, Extra Hardware (Memory, CPUs)
Need for real-time information (low to high): DSS, Ad-hoc Query, Budgeting & Planning, Operational BI, CPM, BAM, Real-Time Dashboards
Poor Query Performance & Poor User Concurrency
Longer Batch Window
![Page 14: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/14.jpg)
On the limitation of RDBMS
“In fact, relational DBMS were never intended to provide the very powerful functions for data synthesis, analysis, and consolidation that is being defined as multi-dimensional data analysis.
These types of functions were always intended to be provided by separate, end-user tools that were outside and complementary to the relational DBMS products.”
E.F. Codd, S.B. Codd and C.T. Salley, Providing OLAP to User-Analysts: An IT Mandate
![Page 15: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/15.jpg)
The Catch-22 of data aggregation in DW
• We want a Data Warehouse that performs data aggregations effectively
• The Data Warehouse should ideally consist of relational databases
• Relational databases are not designed to support data aggregation effectively
![Page 16: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/16.jpg)
The HyperRoll approach
• Build an effective non-relational data aggregation server
• Have the data aggregation server provide “aggregation services” to a relational database
• As a result, have a HyperRoll-enabled relational database that effectively supports aggregations
![Page 17: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/17.jpg)
HyperRoll for Relational
[Diagram: HyperRoll Engine with Loading, Storage, and Access layers, alongside the DBMS]
DBMS, FACT TABLE, ETL
DB2 CLI, ODBC, Oracle OCI, ASCII
Data is loaded into HR in order to build aggregates
DBMS View, Gateway
• Benefit: Up to 90% reduction in batch window compared to existing aggregation strategies
• Benefit: Summary table storage & maintenance reduced or eliminated
• Benefit: Up to 100x faster queries, and end users continue to use familiar applications
![Page 18: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/18.jpg)
HyperRoll-enabled Data Warehouse
[Diagram: ROLAP Queries (SQL); Data Warehouse or Mart with Star Schema and Aggregates View; HyperRoll]
• 10x – 100x performance improvement
• Replace or complement summary tables (but does NOT build or store summary tables)
![Page 19: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/19.jpg)
A DW implementation
[Diagram: Fact1, Fact2, HyperRoll, MV1 through MV12, Query Tools; row counts shown: 400 million and 36 million]
![Page 20: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/20.jpg)
Typical Data Warehouse
• A schema
• Methodology
• Lots of summary tables
• Table management challenges
– Numbers of tables
– Complex configurations
– Table refresh
– Redundant storage
![Page 21: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/21.jpg)
Data Warehouse with HyperRoll
• Same methodology
• Same schema
• Now only “one summary” table
– Represents all aggregations
– Simplifies management
![Page 22: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/22.jpg)
Data Warehouse with HyperRoll
• Same methodology
• Same schema
• Now only “one summary” table
– Represents all aggregations
– Simplifies management
![Page 23: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/23.jpg)
What’s wrong with this picture?
Multiple views of summary tables and complex universe
![Page 24: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/24.jpg)
Data Warehouse with HyperRoll
One View of All Possible Tables
![Page 25: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/25.jpg)
Query to the HyperRoll View
HyperRoll View: Simple Query
Few seconds query response time!
![Page 26: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/26.jpg)
Significant Performance Enhancement
[Chart: number of seconds to complete query (y-axis, 0 to 3500) against millions of records (x-axis: 1M, 5M, 10M, 15M, 20M); series: Business Objects + Oracle + HyperRoll, and Business Objects + Oracle; annotation: “Less than 1 second”]
![Page 27: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/27.jpg)
Typical Data Warehouse Environments
[Diagram]
Applications, Databases, Flat Files, Mainframe, EAI/EDI
ETL Layer
Data Warehouse, ODS, Data marts
$: Summary Tables, Multidimensional Data Stores, Bursted Reports, Data Alerts, Cached Reports, Extra Hardware (Memory, CPUs)
Need for real-time information (low to high): DSS, Ad-hoc Query, Budgeting & Planning, Operational BI, CPM, BAM, Real-Time Dashboards
Poor Query Performance & Poor User Concurrency
Longer Batch Window
![Page 28: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/28.jpg)
Typical Data Warehouse Environments
[Diagram: data warehouse environment with the HyperRoll Engine added (shown twice)]
Applications, Databases, Flat Files, Mainframe, EAI/EDI
ETL Layer
Data Warehouse, ODS, Data marts
HyperRoll Engine
$: Summary Tables, Multidimensional Data Stores, Bursted Reports, Data Alerts, Cached Reports, Extra Hardware (Memory, CPUs)
Need for real-time information (low to high): DSS, Ad-hoc Query, Budgeting & Planning, Operational BI, CPM, BAM, Real-Time Dashboards
Poor Query Performance & Poor User Concurrency
Longer Batch Window
![Page 29: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/29.jpg)
The best of both worlds
Relational (RDBMS)
• “Unlimited” scope of data
• Variety of client tools
• High maintenance
• Complex table joins and aggregations slow down queries
• Complex analysis difficult
OLAP
• Fast queries
• Complex analysis
• Limited scope
• Long cube builds
• Limited client tools
HyperRoll offers the best of both worlds
• Transparent integration with both Relational and Multidimensional databases
• Seamless to the existing client tools
• Fast build process (dramatically faster than OLAP)
• Fast queries without having to design, build and maintain multiple summary tables
• Broader scope of analysis (dimensions and data)
• Eliminates complex Joins and GroupBy
![Page 30: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/30.jpg)
The HyperRoll aggregation server
• What’s the magic with the HyperRoll aggregation server?
• Does it compute all possible aggregates?
• How come it can perform so much better than OLAP cubes?
![Page 31: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/31.jpg)
Multidimensional Cube
[Diagram: cube showing the theoretical scope of data; legend: leaf level data, aggregated data]
Problems:
• Sparsity
• Irregularity
![Page 32: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/32.jpg)
Typical OLAP Problems: Data Explosion
Data Explosion Syndrome
[Chart: number of aggregations (y-axis, 0 to 70000) against number of dimensions (x-axis), with 4 levels in each dimension]
Data points (dimensions: aggregations): 2: 16, 3: 64, 4: 256, 5: 1024, 6: 4096, 7: 16384, 8: 65536
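The growth shown in the chart follows from simple counting: if an aggregation is one level chosen from each dimension, then with 4 levels per dimension there are 4 raised to the number of dimensions possible intersections. A minimal sketch of that arithmetic follows; the function name is illustrative only.

```python
def aggregation_count(dimensions, levels_per_dimension=4):
    """Number of level intersections: one level chosen per dimension."""
    return levels_per_dimension ** dimensions


# Reproduce the chart's x-axis range of 2 to 8 dimensions.
counts = {d: aggregation_count(d) for d in range(2, 9)}
for d, n in counts.items():
    print(f"{d} dimensions: {n} aggregations")
```

Each added dimension multiplies the count by the number of levels, which is why pre-computing every aggregate quickly becomes impractical.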
![Page 33: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/33.jpg)
What is the HyperRoll?
• An intelligent Aggregation Server
• Software engine based on proprietary algorithms for data aggregation
– Pre-computes a small-footprint data store
– Enables quick computation of aggregate values
– Highly-efficient I/O
• The logical equivalent of OLAP for relational without the limitations
• Patented Architecture for standalone data aggregation
• Integrated into existing relational databases and Business Intelligence systems
![Page 34: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/34.jpg)
What about Hardware-based solutions?
• Will better H/W make the aggregation problem go away?
• The good news:
– Better h/w platforms improve performance
• The bad news:
– The problem will just get worse over time
![Page 35: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/35.jpg)
InfoGlut is Only Getting Worse
[Chart: volume of corporate data versus linear processing capability (Moore’s Law), as a multiple (1, 2, 3) over time (9 mo., 18 mo.)]
Consequences
• Longer Processing Times
• Soaring Costs
• Limited Analysis Flexibility
• Out of Date Information
![Page 36: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/36.jpg)
Financial Institution
Financial Services Company
Application: Expense Management
Primary Business Issue: Analyst Productivity
Oracle, Business Objects
Before
• Expense process required 4 reports to run sequentially
• Total time to complete task: 4 hours
• Queries ran from 11 to 14 minutes
After
• Query Performance Increase: 37 to 90X
• Process now completed in minutes
• Queries run in 2 to 18 seconds
• Projected manpower saving: >$500K
![Page 37: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/37.jpg)
Customer Test Results

| Query Name | Oracle Timing (MV) | HR Timing | Improvement |
| --- | --- | --- | --- |
| Bill to | 7 min 51 sec | 14 sec | 31 X |
| Territory | 7 min 15 sec | 1 sec | 427 X |
| Region | 11 min 4 sec | 1 sec | 422 X |
| Sales Force | 12 min 12 sec | 1 sec | 438 X |
| All Sales Force | 16 min 37 sec | 1 sec | 541 X |
![Page 38: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/38.jpg)
Customers
![Page 39: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/39.jpg)
Setting up Queries to work with HyperRoll
Step 1: Analyze the Business
Analyze the Business Objects Universe, reports, queries and schema
• Analyze Reports
• Analyze Semantic Layer
• Select Measures
• Select Dimensions
• Select Hierarchies
• Obtain Design Validation
• Look for hidden requirements
![Page 40: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/40.jpg)
Setting up Queries to work with HyperRoll
Step 2: Design the HyperRoll metadata structure using HyperRoll HDF Builder
![Page 41: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/41.jpg)
Setting up Queries to work with HyperRoll
Step 3: HyperRoll is loaded with source data:
• RDBMS
• Flat files
[Diagram: HyperRoll]
• Source data is read
• HyperRoll aggregation engine is loaded and calculated
• Hierarchies are developed
![Page 42: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/42.jpg)
Setting up Queries to work with HyperRoll
Step 4: Create the Database view and link it to HyperRoll
[Diagram: HyperRoll]
• Define an ODBC System DSN for HyperRoll
• Create a DBlink for the DSN
• Create View as Select * from HyperRoll@DBlink
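The last two bullets can be sketched as a small Python helper that produces Oracle-style DDL strings. This is only an illustration under assumptions: the DSN alias `HYPERROLL_DSN` and link name `HR_LINK` are hypothetical, the remote object name `HyperRoll` is taken from the bullet above, and the view name `SH.HR_SALES_VW` is the one used later in the Aggregate Aware example. Defining the ODBC System DSN itself happens outside the database, and a real link through ODBC may also need gateway configuration and credentials that are not shown here.

```python
def view_setup_statements(dsn_alias, db_link, view_name, remote_table="HyperRoll"):
    """Return Oracle-style DDL for the database link and the view, in order.

    All names are supplied by the caller; nothing here is executed.
    """
    return [
        # Link that points at the connect alias configured for the DSN.
        f"CREATE DATABASE LINK {db_link} USING '{dsn_alias}'",
        # View that exposes the remote aggregates as an ordinary table.
        f"CREATE VIEW {view_name} AS SELECT * FROM {remote_table}@{db_link}",
    ]


statements = view_setup_statements("HYPERROLL_DSN", "HR_LINK", "SH.HR_SALES_VW")
for statement in statements:
    print(statement + ";")
```

Once such a view exists, query tools can treat it like any other table in the schema.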
![Page 43: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/43.jpg)
Setting up Business Objects to work with HyperRoll
Step 5: Modify the Business Objects Universe by adding the Database View that points to HyperRoll
• Add the View to the Universe
• Enable the Aggregate Aware Function
![Page 44: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/44.jpg)
Setting up Queries to work with HyperRoll
Step 6: Users execute queries against the database as they normally would
[Diagram: RDBMS, HyperRoll]
• Transparent redirection between detail and aggregate data
• No user training
• Dramatically improved query response
• Dramatically improved manageability
![Page 45: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/45.jpg)
Add the View to Business Objects
• Here the new Database view has been added to the current BO Universe
• The view comprises the aggregated data for the existing schema
Database View Accessing HyperRoll data
![Page 46: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/46.jpg)
Enable Aggregate Aware Function
• In the Aggregate Aware function place the matching column from the view as the first parameter, and the column from the fact table as the second parameter
@Aggregate_Aware(SH.HR_SALES_VW.AMOUNT_SOLD, SH.SALES.AMOUNT_SOLD)
![Page 47: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/47.jpg)
Setting up Queries to work with HyperRoll
[Flow diagram]
Business Objects End User Layer: Query Request
Business Objects SQL Generation: Is it a summary request?
• N: SQL Request to the FACT TABLE in the RDBMS (DETAILED)
• Y: SQL Request to the VIEW in the RDBMS (SUMMARIZED), which reaches the HyperRoll Instance through the Gateway
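The decision in this flow can be sketched as a tiny router in Python. It is a simplification under stated assumptions, not how Business Objects actually generates SQL: here a request counts as a summary request whenever it groups by at least one column, the table and view names are borrowed from the Aggregate Aware example, and `PROD_ID` is a hypothetical column.

```python
from dataclasses import dataclass

FACT_TABLE = "SH.SALES"          # detailed rows stay in the RDBMS
HYPERROLL_VIEW = "SH.HR_SALES_VW"  # view backed by HyperRoll


@dataclass(frozen=True)
class QueryRequest:
    measure: str
    group_by: tuple = ()  # empty tuple: detail rows are requested


def is_summary_request(request):
    """Assumed rule: any grouping makes the request a summary request."""
    return len(request.group_by) > 0


def generate_sql(request):
    """Route summaries to the view and detail requests to the fact table."""
    if is_summary_request(request):
        cols = ", ".join(request.group_by)
        return (
            f"SELECT {cols}, SUM({request.measure}) "
            f"FROM {HYPERROLL_VIEW} GROUP BY {cols}"
        )
    return f"SELECT {request.measure} FROM {FACT_TABLE}"


summary_sql = generate_sql(QueryRequest("AMOUNT_SOLD", ("PROD_ID",)))
detail_sql = generate_sql(QueryRequest("AMOUNT_SOLD"))
print(summary_sql)
print(detail_sql)
```

The end user issues the same kind of request in both cases; only the generated SQL target differs.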
![Page 48: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/48.jpg)
HyperRoll Value Propositions
• Improved query performance
• Reduced batch window to load data
• Lower maintenance and support costs
• Enables operational BI
• Complementary to existing BI, DB and DW infrastructures
![Page 49: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/49.jpg)
How to learn more
• On algorithms for massive data sets
– http://theory.stanford.edu/~matias/
• On HyperRoll
– Talk to me over the break
– Talk to Kathleen: [email protected], (845)-928-6974
– Take a webinar: www.hyperroll.com
![Page 50: Data Aggregation in Today's Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020705/61fb88382e268c58cd5f4917/html5/thumbnails/50.jpg)
Yossi Matias, CTO
HyperRoll
NEBOUG
January 19, 2006
Realizing the Potential of Business Intelligence