XLDB Challenges for Structural Data
Focus: Analytics @ Nokia
Pekka Barck
Head of Data Warehousing
Nokia
1 © 2009 Nokia Aug 28-2009/ Pekka Barck
Nokia
Scale & Size
•100 TB + centralized in EDW, ERP DW, Research
Center, service-specific systems on Teradata, Oracle, Hadoop…
•100 TB + in hundreds of decentralized / federated
DWs/marts, mostly Oracle + 10 + other technologies.
•100 TB + in application specific logs where history is
discarded for cost reasons.
•1 PB + in multiple new sensor data applications.
Structures and Processing
•Data Structures
• Commercial normalized - denormalized databases
• ROLAP, MOLAP, HOLAP
• Hadoop, Map/Reduce
• Virtualization through cross-database drill-through
• In-memory (tokenized) databases
•Data Preparation
• Transformation tools with too high costs for
reconciliation and audit trail
• Business rule engine type of approach
• Messaging software
• Nokia custom-made loading tools for data access from
phones
The Tools
Data exploration/mining: 4 tools including
SPSS, Minitab + custom made
Text mining: 3 tools including Attensity
Web analytics: Omniture SiteCatalyst,
Hitbox
OLAP: 6 tools including Cognos,
BO, OBI (Hyperion)
Reporting: 12 tools including
Cognos, BO, OBI (Siebel
Analytics), SAP products
Analytics Distribution
• Normal queries are noise
and as such do not
affect computing
capacity needs
• Precooked 70 %,
2 % of capacity
• Ad-hoc 30 %,
55 % of capacity
• Exploration 2 % of
users, 43 % of capacity
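The skew above becomes clearer when each class's share of capacity is divided by its share of users. A minimal sketch using only the percentages on the slide; it assumes the 70 % and 30 % figures are shares of users, as the exploration line states explicitly, and the names `workloads` and `capacity_per_user_share` are illustrative, not from the talk. The user shares as given sum to 102 %, so the ratios are rough.

```python
# Shares as stated on the slide: (percent of users, percent of capacity).
# Assumption: the 70 % and 30 % figures are shares of users.
workloads = {
    "precooked": (70, 2),
    "ad-hoc": (30, 55),
    "exploration": (2, 43),
}


def capacity_per_user_share(users_pct, capacity_pct):
    """Capacity consumed per percentage point of users."""
    return capacity_pct / users_pct


ratios = {
    name: capacity_per_user_share(users, capacity)
    for name, (users, capacity) in workloads.items()
}

# Print the classes from least to most capacity-hungry per user share.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:12s} {ratio:6.2f}")
```

Under these assumptions, exploration users consume far more capacity per head than any other class, which is consistent with the slide's point that normal queries do not drive capacity needs.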
The Challenges
1. AGILITY
Project Specific Solutions June 2008
ADW
Presentation Layer
PRODUCTION
Solution
Specific ADW
STAGING
STRENGTHS
Sharing of Important Resources
Reduced Data Movement
Faster Delivery
Increased Security
CHALLENGES
Data Not Fully Integrated
Workload Management
Maintaining Solution Consistency
Additional Resource Demands on Production
FastValue on IT Aug 2008
IT
ADW
Limited Copy
STAGING
ADW
Presentation Layer
PRODUCTION
Solution
Specific ADW
STAGING
STRENGTHS
Quick Determination of Data Value
Easy Access for Data Research
More Efficient Use of Business Input
Less Demand on Production System Resources
Automated Loading
CHALLENGES
Data Integrated
Additional Data Movement
Freshness of IT Data
Not a Production Environment
Exploration on Production May 2009
ADW
Presentation Layer
PRODUCTION
Solution
Specific ADW
Analytics
LAB
STAGING
STRENGTHS
Reduced Data Movement
Increased Data Freshness
More Powerful Platform Resources
Faster Development of Analytic Metrics
CHALLENGES
All Data Must Enter Through Staging
Mixed Workload Management
Additional Resource Demands on Production
Personal Upload June 2009
ADW
Presentation Layer
PRODUCTION
Solution
Specific ADW
Analytics
LAB
External Data
STAGING
STRENGTHS
Some Data Loads Can Bypass Staging
Quick Access to External Data
Quick Recognition of Data Value
CHALLENGES
Some Data Not Fully Integrated
Mixed Workload Management
One Architecture Slide Removed
Ensuring Business Value
• Business is moving towards advanced analytics in
all areas
•Now 100 sources. 1 new source added every week.
•Everything has to link together.
•The data model becomes very complex.
•Different data has different business value.
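One way to see why the data model becomes complex: if any source may have to link to any other, the number of potential pairwise links grows quadratically with the number of sources. A rough sketch; the source count and weekly growth rate are from the slide, while counting pairwise links as a measure of complexity is an assumption made here for illustration, not a claim from the talk.

```python
def potential_links(n_sources):
    """Number of distinct source pairs: n * (n - 1) / 2."""
    return n_sources * (n_sources - 1) // 2


current_sources = 100  # from the slide
new_per_week = 1       # from the slide

# Project the source count and the pair count half a year and a year out.
for weeks in (0, 26, 52):
    n = current_sources + new_per_week * weeks
    print(f"after {weeks:2d} weeks: {n} sources, "
          f"{potential_links(n)} potential pairwise links")
```

Most of these pairs will never need a real link, so the figure is an upper bound rather than an estimate of modeling effort.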
Other Challenges
• Efficient (open)
compression in
databases -> Green IT
• Efficient quality
procedures in databases /
Integration tools
•We need more hours per
day: We materialize for
performance but reach a
dead end.
More Challenges
• Growing need for a high-volume/low-usage XXL
database. Volume estimates between 3 and 40 PB.
• Cross-database drill-through capabilities: Oracle,
Teradata, Hadoop…
• Integration costs too high; tools and integrators
have not brought significant enough improvements
during the past 10 years.
• Linguistics-based free-text analytics: 10+ most
important languages.
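Cross-database drill-through, listed above, means starting from an aggregate held in one system and fetching the underlying detail rows from another. A minimal sketch of the idea, not of Nokia's setup: two in-memory SQLite databases stand in for two separate platforms, and every table name, column name, and value is hypothetical.

```python
import sqlite3

# Two independent in-memory databases stand in for two different platforms.
# (Hypothetical stand-ins; the talk names Oracle, Teradata and Hadoop.)
summary_db = sqlite3.connect(":memory:")
detail_db = sqlite3.connect(":memory:")

# Aggregated figures live in the first database (made-up sample data).
summary_db.execute("CREATE TABLE sales_summary (region TEXT, total INTEGER)")
summary_db.executemany(
    "INSERT INTO sales_summary VALUES (?, ?)",
    [("north", 30), ("south", 12)],
)

# Row-level detail lives in the second database (made-up sample data).
detail_db.execute(
    "CREATE TABLE sales_detail (region TEXT, item TEXT, amount INTEGER)"
)
detail_db.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?)",
    [("north", "a", 10), ("north", "b", 20), ("south", "c", 12)],
)


def drill_through(region):
    """Return the summary total from one database and its detail rows from the other."""
    total = summary_db.execute(
        "SELECT total FROM sales_summary WHERE region = ?", (region,)
    ).fetchone()[0]
    rows = detail_db.execute(
        "SELECT item, amount FROM sales_detail WHERE region = ? ORDER BY item",
        (region,),
    ).fetchall()
    return total, rows


total, rows = drill_through("north")
print(total, rows)
```

The shared key (`region` here) is what makes the drill-through possible; in practice, keeping such keys consistent across platforms is part of the integration cost the slide mentions.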
Thank you