Big Data Evolution and Tips to Be Future-Ready
Joanna Schloss, Business & Big Data Analytics
O’Reilly and Cloudera Strata + Hadoop World
2 Dell Software Group
Agenda
• Database Technology Trends & Tips to be Future-Ready
– The End of One-Size-Fits-All
– Big Data and Hadoop
– NoSQL
– The End of Disk?
• Analytic Requirements & Use Cases
Data and Content Drivers
• Internet
• Big Data
• Social Media
• IoT
Significant Database Technology Trends
The end of “one size fits all”
Big Data and Hadoop
NoSQL
The end of disk?
Trend #1: The end of “one size fits all”
History of databases
[Timeline figure; time axis: 1940-50, 1950-60, 1960-70, 1970-80, 1980-90, 1990-2000, 2000-2010]
Pre-computer technologies: printing press, Dewey decimal system, punched cards
Storage media: magnetic tape with “flat” (sequential) files, magnetic disk
Models and access methods: Indexed Sequential Access Method (ISAM), hierarchical model, network model, relational model defined
Products shown: IMS, IDMS, ADABAS, System R, Oracle V2, Ingres, dBase, DB2, Informix, Sybase, SQL Server, Access, Postgres, MySQL, Cassandra, Hadoop, Vertica, Riak, HBase, Dynamo, MongoDB, Redis, VoltDB, Hana, Neo4J, Aerospike
Why?
• 3rd Platform drives new demands on the database:
– Global High Availability
– Data volumes
– Unstructured data
– Transaction rates
– Latency
• 2nd Platform: driving DW and BI architecture
– Semantic Layers
– Modelling
– Metrics
• 1st - A single architecture cannot meet all those demands
Enter Internet of All Things
Falling costs and modern technologies are fueling IoT solutions
• Cost: Falling sensor and device costs
• Power: Improved efficiencies
• Cloud: Rise of cloud, IaaS, PaaS, SaaS
• Big Data/Analytics: New capabilities to gain insight from data
• Wireless: Increasing coverage
• Mobility: Ubiquity of mobile devices and apps
It takes all sorts
• Operational RDBMS (Oracle, SQL Server, …)
• In-memory analytics (HANA, Exalytics, …)
• In-memory processing (Spark)
• Hadoop
• Web DBMS (MySQL, Cassandra, Cloudera)
• ERP & in-house CRM
• Analytic/BI software (Statistica, Tableau)
• Web Server
• Data Warehouse RDBMS (Oracle, Teradata, …)
Future Ready and Future Proof Tip #1
Are you ready for the influx of data and the onslaught of new database technologies?
– Leveraging all existing technology, from IoT to big data to the traditional data warehouse
– Possible questions
› Data origination?
› Data destination?
› Frequency of analytic refresh?
› End user ownership?
– Myriad of apps and analytical solutions
› Real-time
› High Availability
› Ingest
› Hot or Cold data
Fail Fast -> Fail Frequently -> Fail Forward
Trend #2: The Rise of Big Data
The 3-4 “V”s
Volume
• Terabytes
• Petabytes
• Exabytes
• Zettabytes
Velocity
• Transaction rates
• User populations
• Machines
Variety
• Structured
• Unstructured
• Human Generated
• Machine Generated
Value
We all know these
We all have these
We all live these
The Instrumented Human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
The Instrumented World
Trend #3: NoSQL
Scale Up vs Scale Out
NoSQL + Spark allows you to address both today’s and future requirements
CAP Theorem says something has to give
• CAP (Brewer’s) Theorem says you can only have two out of three
• All analytics can leverage CAP theorem
– Volume
– Speed
– Accuracy
Consistency
• Everyone always sees the same data
Availability
• System stays up when nodes fail
Partition Tolerance
• System stays up when the network between nodes fails
NO GO
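The "two out of three" trade-off can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Replica` class, the `write` function, and the "CP"/"AP" mode names are hypothetical and do not correspond to any real database API. It shows two replicas during a network partition, where the system must either refuse the write (staying consistent) or accept it on one node (staying available).

```python
class Replica:
    """Hypothetical single-value store standing in for one database node."""

    def __init__(self):
        self.value = None


def write(replicas, value, partitioned, mode):
    """Send a write to replicas[0] and report whether it was accepted.

    mode "CP": keep replicas consistent by refusing writes during a partition.
    mode "AP": stay available by accepting the write locally, so replicas diverge.
    """
    if not partitioned:
        for replica in replicas:  # healthy network: every node gets the write
            replica.value = value
        return True
    if mode == "CP":
        return False  # gives up availability to stay consistent
    replicas[0].value = value  # gives up consistency to stay available
    return True


a, b = Replica(), Replica()
write([a, b], "v1", partitioned=False, mode="CP")

cp_accepted = write([a, b], "v2", partitioned=True, mode="CP")
cp_consistent = a.value == b.value

ap_accepted = write([a, b], "v3", partitioned=True, mode="AP")
ap_consistent = a.value == b.value

print(cp_accepted, cp_consistent, ap_accepted, ap_consistent)
```

In the CP case the write is rejected and both replicas still agree; in the AP case the write succeeds but the replicas no longer hold the same value.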
Future Ready and Future Proof Tip #2
• Understanding the business drivers for these different database systems
• How the data platform selections (plural) assist in anticipating the analytic requirements of the company
• Allowing you to build the analytic content that your business requires and is eager to deploy
Trend #4: The End of Disk?
5MB HDD circa 1956
The more that things change....
In-Memory Databases
• Cost of RAM falling 50% every 18 months
• Some databases can fit entirely within the RAM of a single server or cluster of servers
[Chart: RAM cost (US$/GB, log scale from $1.00 to $100,000.00) and size (GB, log scale from 0.001 to 100) by year, 1990 to 2020]
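The "50% every 18 months" rule of thumb is a simple exponential decay, which a short Python sketch can make concrete. The function name `projected_cost_per_gb` and the $10/GB starting price are hypothetical choices for illustration, not figures from the chart.

```python
def projected_cost_per_gb(start_cost, years, halving_period_years=1.5):
    """Cost per GB after `years`, assuming it halves every `halving_period_years`."""
    return start_cost * 0.5 ** (years / halving_period_years)


# Hypothetical starting price of $10/GB; two halvings happen in 3 years.
cost_after_3_years = projected_cost_per_gb(10.0, 3)

# Ten halvings happen in 15 years, so the price falls by a factor of 2**10 = 1024.
reduction_factor_15_years = 10.0 / projected_cost_per_gb(10.0, 15)

print(cost_after_3_years, reduction_factor_15_years)
```

Under this assumption, a working set that is too expensive to hold in memory today becomes roughly a thousand times cheaper to hold within 15 years, which is the economic argument behind in-memory databases.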
In-Memory Market Opportunity: Driving Analytics
• 2015 Market Size: 5.58 billion (USD)
• Projected 2020 Market Size: 23.15 billion (USD)
• Compound Annual Growth Rate (CAGR): 32.9%
• Better Performance: Improve performance of OLTP and analytic systems
• Kick-start Big Data Projects: Optimize analytics and the use of business intelligence applications
• All DB vendors offer in-memory technology
Source: In-Memory Computing Market - Global Forecast to 2020
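The three market figures on this slide are tied together by the standard compound-growth formula, and a quick Python check shows they are mutually consistent. The function names below are illustrative; the only inputs are the 2015 size, the 2020 projection, and the quoted CAGR from the slide.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding `rate` annually for `years`."""
    return start_value * (1 + rate) ** years


size_2015 = 5.58   # billion USD, from the slide
size_2020 = 23.15  # billion USD, from the slide

cagr = implied_cagr(size_2015, size_2020, years=5)
projected_2020 = project(size_2015, 0.329, years=5)

print(f"Implied CAGR: {cagr:.1%}")
print(f"2020 size at 32.9% CAGR: {projected_2020:.2f} billion USD")
```

The implied rate rounds to the quoted 32.9%, and compounding the 2015 figure at that rate lands within rounding distance of the 2020 projection.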
Time: Reshaped or Redefined
Surveys indicate
• For data and systems of operations/record
– People expect sub-second responses on OLTP
– Speed of thought for Web and internet applications
– Recommendations: preceding thoughts
• For analytics and systems of change
– 10 minutes: willing to wait for data refresh
– 1 day: willing to wait for BI reporting
– ?? = willing to wait for prediction or prescriptive content
Changing DBMS affects the analytics world
• Database trends drive and affect change
• This movement creates analytic motivators
– 2nd generation real-time requirements emerging
– Event, push or edge based analytics
– Driving complex data environment
• In-memory considerations driving market shift
– In memory with all clouds
– In memory with all persisted sources
Future Ready and Future Proof Tip #3
• Are you ready for new time analytics?
• Address the all-data requirements
– In-memory technology combined with new database technology
– NoSQL + Spark may do the trick
• Application development with analytics in mind will drive better solutions and agile applications to meet those pesky “right time” analytics
Use Case: Dell
SupportAssist Big Data Analytics
Taking Billions of Data Points
Discovering Key Relationships
Extracting Predictive Insight
Our World: The Unified Data Model
Sources: Peripheral Supplier, Component Supplier, Product Assembly, Customer Data, Customer Call Logs, Part Replacements, Returns & Repair, Failure Analysis
• Push solutions to customer before problem happens
• Fewer dispatches by linking commonly dispatched parts
• Optimize test process through data mining
• Close problems detected across the supply chain
[Diagram labels: Customer Info, Manufacturers, Repair Centers, Supply Hubs, Assembly Operations]
Overwhelming Volume of Logs
One Customer:
• 25,000+ Log Files
• 400+ GB of Data
Improving the Customer Experience
Past
[Timeline across Customer Environment and Dell Support]
• Customer Experiences Problem
• Customer Reports Problem
• Discovery
• Problem Research
• Solution Testing
• Resolution or Dispatch
Today: SupportAssist Predictive Analytics
[Timeline across Customer Environment and Dell Support]
• Predictive Analytics
• Problem Identified
• Problem Research
• Solution Testing
• Fix
• Solution Implemented
• Incoming Call
STATISTICA Building Predictive Models
• Data Consolidation & Manipulation
• Data Sampling & Manipulation
• Modelling
• Output Generation
Finding Outliers
Using OMNEO:
• Identify outliers
• Areas for Research
• Analyze & Discover
Isolating Real Issues
Isolate:
• Software Issues
• Configuration Problems
• Driver Issues
Prioritize Issues to Resolve
Find:
• What impacts
• Most Systems
• Frequency/System
• Target Systems
• Research More…
Drive most impact: Most systems and/or biggest customer impact? Having > 60 BSODs/System
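The prioritization described here (rank by how many systems an issue touches, then by frequency per system, and pick out target systems over the BSOD threshold) can be sketched in Python. This is not how OMNEO or Statistica works internally; the `records` data, the issue names, and the `prioritize` function are hypothetical, and only the "> 60 BSODs/System" threshold comes from the slide.

```python
from collections import defaultdict

BSOD_THRESHOLD = 60  # from the slide: "Having > 60 BSODs/System"

# Hypothetical records: (system_id, issue, bsod_count). Not real Dell data.
records = [
    ("sys-01", "driver", 75),
    ("sys-02", "driver", 62),
    ("sys-03", "configuration", 12),
    ("sys-04", "software", 90),
    ("sys-05", "driver", 8),
]


def prioritize(records, threshold=BSOD_THRESHOLD):
    """Rank issues by systems affected, then by average BSODs per system."""
    by_issue = defaultdict(list)
    for system_id, issue, count in records:
        by_issue[issue].append((system_id, count))

    ranked = []
    for issue, rows in by_issue.items():
        counts = [count for _, count in rows]
        ranked.append({
            "issue": issue,
            "systems": len(rows),                      # most systems
            "avg_bsods": sum(counts) / len(counts),    # frequency/system
            "targets": sorted(s for s, c in rows if c > threshold),  # target systems
        })
    ranked.sort(key=lambda r: (r["systems"], r["avg_bsods"]), reverse=True)
    return ranked


ranking = prioritize(records)
for row in ranking:
    print(row["issue"], row["systems"], round(row["avg_bsods"], 1), row["targets"])
```

With this sample data the driver issue ranks first because it affects the most systems, even though the single software issue has a higher BSOD count on its one machine. Swapping the sort key is how one would favor "biggest customer impact" instead.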
10x Efficiency Gain
At recent launch of Dell XPS13…
Noticed LCD was flickering on 2 of 6 demo units.
Demonstrating the power of “Analytics at the Speed of Thought”
Carlos used Omneo to identify and isolate the problem in 3 hours vs. 3 days.
Firmware issue contained the following day.
Future Ready and Future Proof Tip #4
• Are you able to examine and investigate all options?
• Creative solutions couple emerging technologies with tried and true solutions
• Analytics systems are typically both tactical and strategic
• Key analytics are immutable – enhancing and delighting our customers with the creation of new opportunities and solutions
Reuse, Repurpose, and Reinvest
• Identify relevant systems of record
• BI investments
• Advanced analytics
• IoT, Predictive Analytics, Dashboards/Viz
• Software innovations: Spark, Cassandra
• Leverage hardware breakthroughs and innovations
[Diagram labels: Reuse, Re-Invest, Repurpose]
Practicality vs. Technology
In memory, disk, cloud
Data and content
Business Analytics
Analytics benefit all stakeholders
Operations: Function
BI, Viz, Data Science
Reports, metrics, predictions
Future Ready and Future Proof Tip #5
• All good analytics require inspection and evaluation
– Time windows have different meaning
– Metrics take on different meaning
– Analysts ask more questions and need the ability to keep asking follow-up questions
• Build it together - Data Owners and Analytics Enthusiasts together
Conclusion
Data
• All data environment
• New Data
• Real time data or data at the edge
Analytics
• Accurate and consistent analytics
• Driving and defining data science
• Right time analytics/edge or on premises analytics
Thank you
Feel free to reach out:
• Follow me - @joschloss