Big Data: A definition
• Big data is the realization of greater intelligence by storing, processing, and analyzing data that was previously ignored due to the limitations of traditional data management technologies
• Big data is the merging of several data sets whose complexity becomes greater than the sum of the individual data sets.
Big Data means that Technology Makes it Possible to Analyze ALL Available Data on any Phenomenon
Cost-effectively manage and analyze all available data in its native form: unstructured, structured, streaming
ERP
CRM
RFID
Website
Network Switches
Social Media
Billing
Lots of data
• 2.5 quintillion bytes of data are generated every day!
• A quintillion is 10^18
• Data come from many quarters.
  • Social media sites
  • Sensors
  • Digital photos
  • Business transactions
  • Location-based data
Source: IBM http://www-01.ibm.com/software/data/bigdata/
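To put the daily figure in more familiar units, the arithmetic can be sketched as follows. This is only an illustration: the 2.5 quintillion bytes per day comes from the slide, while the decimal (SI) unit definitions and the variable names are assumptions made for the example.

```python
# Figures taken from the slide: 2.5 quintillion bytes per day,
# where one quintillion is 10^18.
QUINTILLION = 10**18
bytes_per_day = 25 * QUINTILLION // 10  # 2.5 * 10^18, kept as an exact integer

SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: decimal (SI) units, i.e. 1 EB = 10^18 bytes and 1 TB = 10^12 bytes.
exabytes_per_day = bytes_per_day / 10**18
terabytes_per_second = bytes_per_day / SECONDS_PER_DAY / 10**12

print(f"{exabytes_per_day} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

In other words, the quoted volume is roughly 2.5 exabytes per day, or a little under 29 terabytes every second.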
The four dimensions of Big Data
• Volume: Large volumes of data
• Velocity: Quickly moving data
• Variety: structured, unstructured, images, etc.
• Veracity: Trust and integrity are both a challenge and a must; they matter for big data just as for traditional relational DBs
Source: IBM http://www-01.ibm.com/software/data/bigdata/
Reversing the usual Paradigm
Traditional Approach: Structured & Repeatable Analysis
• Narrow Use: Determine what questions to ask
• IT: Structures the data to answer that question
• Garbage In, Garbage Out
Big Data Approach: Iterative & Exploratory Analysis
• IT: Delivers a platform to enable creative discovery
• Business: Explores what questions could be asked
• New Discoveries
Current Situation
Analyst: I need to evaluate the possible relationship between variables X, Y, Z.
IT: OK. We have to evaluate a lot of statistics, set the correct database indexes and partitioning. It will take us 5 days. Go away.
After 5 days ...
Analyst: Okay, I went away and now I am back.
IT: Done. You can run your analytical query.
After 1 day ...
Analyst: Great. I can see here some nice correlations. Now I need to look at it from a different perspective.
IT: Ohhh, welcome dear friend. Understand. So, it’s …. another 5 days of our work.
Analyst: You guys suck. I am outta here.
And now with Some Magic Compute Box
Analyst: I need to evaluate the possible relationship between X, Y, Z. I will use the Magic Box.
After 10 minutes ...
Analyst: Great. I can see here some nice correlations. Now I need to look at it from a different perspective. With the Magic Box I can run the query immediately. Go away IT.
IT can do something else – much more
useful – if that is even possible …
Built-In Expertise Makes This as Simple as an Appliance
Dedicated device
Optimized for purpose
Complete solution
Fast installation
Very easy operation
Standard interfaces
Low cost
Real Magic Box results using T-Mobile Czech Rep.
RESPONSE TIME MASSIVELY IMPROVED
Report                                          Original Platform   Netezza
Workflow Reporting                              2 hours             1 minute
Invoicing and Payments reporting
  Payment discipline of current month invoices  33 minutes          17 seconds
  Overdue Debt of Invoices – in Current Month   10 hours            23 seconds
  Average Monthly Invoice Figures               50 minutes          38 seconds
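The response times reported for the original platform and for Netezza can be turned into rough speedup factors. The sketch below is only illustrative: the durations are the ones quoted on the slide, the pairing of each duration with a report is read from the slide layout, and the `speedup` helper and variable names are hypothetical.

```python
# Response times as quoted on the slide, converted to seconds.
# Tuple layout: (report name, original platform seconds, Netezza seconds)
HOUR, MINUTE = 3600, 60
reports = [
    ("Workflow Reporting", 2 * HOUR, 1 * MINUTE),
    ("Payment discipline of current month invoices", 33 * MINUTE, 17),
    ("Overdue Debt of Invoices in Current Month", 10 * HOUR, 23),
    ("Average Monthly Invoice Figures", 50 * MINUTE, 38),
]


def speedup(before_s, after_s):
    """Return how many times faster the new response time is."""
    return before_s / after_s


speedups = {name: speedup(before, after) for name, before, after in reports}

for name, factor in speedups.items():
    print(f"{name}: about {factor:.0f}x faster")
```

On these figures the improvement ranges from roughly 80x to well over 1,000x, depending on the report.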
Big Data Conundrum
• Problems:
  • Although there is a massive spike in available data, the percentage of the data that an enterprise can understand is on the decline
  • The data that the enterprise is trying to understand is saturated with both useful signals and lots of noise
Source: IBM http://www-01.ibm.com/software/data/bigdata/