big data at #waday11
DESCRIPTION
An overview of big data, big data analytics, data products and data scientists. My contribution at WADAY11 (a Bit Bang Web Analytics Day, Milan, Italy, 11.11.11)
TRANSCRIPT
@CosimoAccoto Are you ready for the era of “Big Data”?
BIG data
Web Analytics Day, #waday11, Milano
Let’s talk about…
Why: Competing on (Big) Analytics
How: Data Products & Leadership
What: Big Data! Reality Beyond Hype
Sorting Reality from the Hype
✓ Big Data: a top tech trend for 2012 (Forrester Research)
✓ Big Data: a new game-changing asset (The Economist)
✓ Big Data: a scientific revolution (Harvard Business Review)
Science Paradigms Evolution
- Empirical Science: describing natural phenomena
- Theoretical Modeling: using models and generalizations
- Computational Simulations: simulating complex phenomena
- Data-Intensive Computing: unifying theory, experiment and simulation at scale
source: Gray J., The Fourth Paradigm. Data-Intensive Scientific Discovery, 2009, p. xviii
The “Fourth” Paradigm
The techniques and technologies for such data-intensive science are so different that it is worth distinguishing data-intensive science from computational science as a new, fourth paradigm for scientific exploration
source: Gray J., The Fourth Paradigm. Data-Intensive Scientific Discovery, 2009, p. xix
“Big Data!!!” “…Say what?”
source: McKinsey, “Big Data: The Next Frontier for Innovation, Competition, and Productivity”, May 2011, p. 1
“Big Data!!!” “…Say what?”
source: Loukides, “Big Data Now”, O’Reilly Media, 2011, p. 8
The Attack of the Exponentials
Over the past five decades, the cost of storage, CPU, and bandwidth has been exponentially dropping, while network access has exponentially increased*
source: Plattner and Zeier, “In-Memory Data Management”, 2011, p. 15-16; * Driscoll, “Big Data Now”
[Figure: growth of the digital universe, 1.8 ZB to 7 ZB; years shown: 2011, 2015]
source: IDC, “2011 Digital Universe Study”, June 2011; Image: Wikibon, 2011
Big Data is not just “big”
The 3 Vs of Big Data
source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011
The Data Deluge: Volume
source: Rogers, “Big Data is scaling BI and Analytics”, Information Management Magazine, 10/2011
Boeing jet engines can produce 10 terabytes of operational information for every 30 minutes they turn. A four-engine jumbo jet can create 640 terabytes of data on just one Atlantic crossing; multiply that by the more than 25,000 flights flown each day, and you get an understanding of the impact that sensor and machine-produced data can make on a BI environment.
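The arithmetic behind these figures can be checked with a short sketch. The 8-hour crossing time is an assumption, inferred only because it makes the quoted numbers agree (4 engines × 10 TB × 16 half-hours = 640 TB); it is not stated in the source. Treating all 25,000 daily flights as four-engine Atlantic crossings is likewise a deliberate over-simplification, so the daily total is an upper-bound illustration rather than a measurement.

```python
# Figures quoted on the slide (Rogers, Information Management Magazine, 10/2011).
TB_PER_ENGINE_PER_HALF_HOUR = 10
ENGINES = 4
FLIGHTS_PER_DAY = 25_000

# Assumption (not in the source): an 8-hour crossing, chosen because it
# reproduces the quoted 640 TB per flight.
CROSSING_HOURS = 8

half_hours = CROSSING_HOURS * 2
tb_per_flight = TB_PER_ENGINE_PER_HALF_HOUR * ENGINES * half_hours

# Upper bound: treats every daily flight as a four-engine Atlantic crossing.
tb_per_day = tb_per_flight * FLIGHTS_PER_DAY
eb_per_day = tb_per_day / 1_000_000  # decimal units: 1 EB = 1,000,000 TB

print(f"{tb_per_flight} TB per flight")
print(f"{tb_per_day:,} TB per day, about {eb_per_day:.0f} EB")
```

Under these assumptions the sensor data alone would reach the exabyte range every day, which is the scale the slide is pointing at.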
Streaming Real-Time Data: Velocity
Online Advertising Serving – 40 milliseconds to respond with the decision (deliver the right ad to the right user profile)
Financial Services – near 1 millisecond to calculate customer scoring probabilities
There are many examples of data that might demand analysis in real time or near real time, or at least in less than a day. RFID sensor data and GPS spatial data show up in time-sensitive transportation logistics. Fast-moving financial trading data feeds fraud detection and risk assessments.
source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011 (image)
Data outside of Databases: Variety
Wal-Mart, the world's largest retailer, is logging one million customer transactions per hour and feeding information into databases estimated at 2.5 petabytes.
Old & New Data Sources:
RFID, sensors, mobile payment, in-vehicle tracking, …
Sources of Retail Data
Channel, Reseller, Retailer, DC, Store, Online
Brand, Product, SKU, Serial Number, RFID
Sell-in, Sell-thru (and again), Sell-out
Channel/Trade programs, discounts, rebates
CRM, Loyalty, personalized coupons
Price, margin, elasticity
Advertising, promotion lift library, web-to-store, POP
source: Craig and Craig, “Retail Lesson Learned..”, Strata Conference, 2011
Variety & Variability
If you just have high volume or velocity, then big data may not be appropriate. As characteristics accumulate, however, big data becomes attractive by way of cost. The two main drivers are volume and velocity, while variety and variability shift the curve.
source: Evelson and Hopkins, “Expand Your Digital Horizon with Big Data”, Forrester Research, 2011
source: Hadoop, 2011
Not all industries are created equal
source: McKinsey Quarterly, “Are you ready for the era of ‘big data’?”, October 2011
Not all retail subsectors are created equal
source: McKinsey, “Big Data: The Next Frontier for Innovation, Competition, and Productivity”, May 2011, p. 82
source: IBM CMO C-suite studies, “From Stretched to Strengthened”, 2011, p. 16
Growing Pains
Storing, securing and reconciling data are the most fundamental aspects of any data management strategy. But the heavy lifting starts when companies begin extracting meaningful insights from the data and disseminating them throughout the organization.
source: Economist Intelligence Unit, “Big Data: Harnessing a game-changing asset”, 2011, p. 17
From Stretched to Strengthened
source: Economist Intelligence Unit, “Big Data: Harnessing a game-changing asset”, 2011, p. 11
If the 1st CMO challenge is the data deluge, the 1st planned CIO investment for 2012 is BI and analytics
source: IBM CMO C-suite studies, “From Stretched to Strengthened”, 2011, p. 16
“Big Data” & “Analytics” Together?
big data analytics is the application of advanced analytic techniques to very big data sets
source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011, p. 5
advanced analytics as a discovery mission
… and a data products builder
Big Data Analytics
source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011, p. 23
Growth/Commitment
Data visualization/discovery BI/Predictive analytics
Data/Text/Content Mining Pattern Recognition
In-memory/real-time analytics Machine Learning
source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011, p. 25
Datavis vs Infographics
Infographics is useful for referring to any visual representation of data that is:
• manually drawn (and therefore a custom treatment of the information);
• specific to the data at hand (and therefore nontrivial to recreate with different data);
• aesthetically rich (strong visual content meant to draw the eye and hold interest).
Data visualization and information visualization (casually, data viz and info viz) are useful for referring to any visual representation of data that is:
• algorithmically drawn (may have custom touches but is largely rendered with the help of computerized methods);
• easy to regenerate with different data (the same form may be repurposed to represent different datasets with similar dimensions or characteristics);
• often aesthetically barren (data is not decorated); and
• relatively data-rich (large volumes of data are welcome and viable, in contrast to infographics).
source: Iliinsky and Steele, “Designing Data Visualizations”, 2011, pp. 5-7
Being (Big) Data-Driven
A data-driven organization acquires, processes, and leverages data in a timely fashion to create efficiencies, iterate on and develop new products and navigate the competitive landscape.
source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011, p. 2
Being (Big) Data-Driven
Zynga constantly monitors who their users are and what they are doing, generating an incredible amount of data in the process. By analyzing how people interact with a game over time, they have identified tipping points that lead to a successful game. They know how the probability that users will become long-term changes based on the number of interactions they have with others, the number of buildings they build in the first n days, the number of mobsters they kill in the first m hours, etc. They have figured out the keys to the engagement challenge and have built their product to encourage users to reach those goals.
source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011, p. 2
Data Scientists and “Data Products”
• Products that provide highly personalized content (e.g., the ordering/ranking of information in a news feed).
• Products that help drive the company’s value proposition (e.g., “People You May Know” and other applications that suggest friends or other types of connections).
• Products that facilitate the introduction into other products (e.g., “Groups You May Like,” which funnels you into LinkedIn’s Groups product area).
• Products that prevent dead ends (e.g., collaborative filters that suggest further purchases, such as Amazon’s “People who viewed this item also viewed ...”).
• Products that are stand alone (e.g., news relevancy products like Google News, LinkedIn Today, etc.).
source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011, p. 2
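To make the “people who viewed this item also viewed” idea concrete, here is a minimal sketch based on plain co-occurrence counting over browsing sessions. It is only an illustration of the general idea, not how Amazon or LinkedIn build their products; the function names `build_co_views` and `also_viewed` and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_views(sessions):
    """Count, for every item, how often each other item shares a session with it."""
    co_views = defaultdict(Counter)
    for items in sessions:
        # set() so that repeated views inside one session count only once
        for a, b in permutations(set(items), 2):
            co_views[a][b] += 1
    return co_views


def also_viewed(co_views, item, k=3):
    """Return up to k items most often viewed together with `item`."""
    counts = co_views.get(item, Counter())
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical browsing sessions, one list of viewed items per visitor.
sessions = [
    ["camera", "tripod", "sd_card"],
    ["camera", "sd_card"],
    ["camera", "tripod"],
    ["laptop", "sd_card"],
]

co_views = build_co_views(sessions)
print(also_viewed(co_views, "camera"))
```

Real recommenders normalize for item popularity and work at a very different scale, but the sketch shows why such a product “prevents dead ends”: every viewed item leads to a ranked list of next items.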
The roles of a data scientist
• Decision sciences and business intelligence
• Product and marketing analytics
• Fraud, abuse, risk and security
• Data services and operations
• Data engineering and infrastructure
• Organizational and reporting alignment
source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011
Being (Big) Data-Driven
…Hey look!!! I’m not Zynga, Google or FB… I’m a retailer, I’m a bank, I’m an insurer, I’m a publisher, I’m a fashion brand… What does it mean to me?
Big Data in Consumer Electronics
HP and the “Project Fusion” - To correlate social media conversations about specific product features to actual customer transactions in real-time
1. “unstructured data” (Amazon.com reviews, customer surveys, customer support logs, and other natural-language text);
2. “structured data” (customer support tickets, sales transactions, customer demographics)
source: Prasanna Dhore “Customer Intelligence at HP”, 2010
Big Data in Consumer Electronics
source: Prasanna Dhore “Customer Intelligence at HP”, 2010
customer sentiment analysis (unstructured) + product profile (structured)
Big Data in Consumer Electronics
source: Prasanna Dhore “Customer Intelligence at HP”, 2010
Machine Learning in Travel Services
Improve the customer experience (reduce latency, increase coverage) when searching for hotel rates while controlling impact on suppliers (maintain “look-to-book”).
• Hotel sort optimization: how can we improve the ranking of hotel search results in order to show consumers hotels that more closely match their preferences?
• Cache optimization: can we intelligently cache hotel rates in order to optimize the performance of hotel searches?
• Personalization/segmentation: can we show targeted search results to specific consumer segments?
source: Orbitz Worldwide, 2011
Machine Learning in Travel Services
Data Driven Approaches:
• Traffic Partitioning: identify the subset of traffic that is most efficient and optimize that subset through prefetching and increased bursting.
• TTL Optimization: use historic logs of availability and rate change information to predict volatility of hotel rates and optimize cache TTL.
source: Orbitz Worldwide, 2011
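The TTL optimization idea can be sketched as follows: estimate how often a hotel's rate has changed in the past and cache its rate for a fraction of that interval. This is a simplified illustration under stated assumptions, not the approach actually used by the company cited above. The function names, the 0.5 safety fraction, the TTL bounds and the sample timestamps are all hypothetical.

```python
from statistics import mean


def mean_change_interval(change_times):
    """Average gap, in seconds, between consecutive rate changes (None if unknown)."""
    gaps = [later - earlier for earlier, later in zip(change_times, change_times[1:])]
    return mean(gaps) if gaps else None


def cache_ttl(change_times, fraction=0.5, min_ttl=60, max_ttl=86_400):
    """Pick a cache TTL in seconds from the historic rate-change timestamps.

    Volatile rates (short gaps) get a short TTL, stable rates a long one.
    `fraction`, `min_ttl` and `max_ttl` are illustrative tuning knobs.
    """
    interval = mean_change_interval(sorted(change_times))
    if interval is None:
        return min_ttl  # no usable history: stay conservative
    return int(min(max_ttl, max(min_ttl, interval * fraction)))


# Hypothetical logs: timestamps (in seconds) at which a hotel's rate changed.
volatile_hotel = [0, 600, 1200, 1800]  # a change every 10 minutes
stable_hotel = [0, 43_200, 86_400]     # a change every 12 hours

print(cache_ttl(volatile_hotel))
print(cache_ttl(stable_hotel))
```

A longer TTL reduces the number of requests sent to suppliers, which helps the “look-to-book” ratio, at the cost of a higher chance of showing a stale rate; the fraction controls that trade-off.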
Big Data Value for Car Rental
The goal is to identify car and equipment rental performance levels to enable pinpointing issues and making the necessary adjustments to improve customer satisfaction levels. Using analytics software, Hertz location managers are able to effectively monitor customer comments to deliver top customer satisfaction scores for this critical level of service. In Philadelphia, survey feedback led managers to discover that delays were occurring at the returns area during certain parts of the day. They quickly adjusted staffing levels and ensured a manager was always present in the area during these specific times.
source: IBM Big Data Cases, 2011
Big Data in Location-Based Services
Customer support/suggestions
source: Amazon Big Data Use Cases, 2011
Big Data in Location-Based Services and more…
• Analyze ad stats (reporting, billing, algorithm inputs)
• Analyze A/B test results
• Detect duplicate business listings
• Email bounce processing
• Identify bots based on traffic patterns
Not only Data Products…
source: Amazon Big Data Use Cases, 2011
Are you ready to be a Big Data Leader? ;-)
Thanks
@CosimoAccoto
This research is part of a broader project on control and management in digital markets, on which the author collaborates as a field expert with Prof. Andreina Mandelli (SDA Bocconi, Milan, and USI, Lugano), within the framework of the BIT (Business Information Technology) project.
http://www.anderson.ucla.edu
For more information on the project: [email protected]