big data at #waday11

@CosimoAccoto Are you ready for the era of “Big Data”? BIG Web Analytics Day, #waday11, Milano data

Upload: cosimo-accoto

Post on 19-Jan-2015


Category:

Business



DESCRIPTION

An overview of big data, big data analytics, data products and data scientists. My contribution at WADAY11 (a Bit Bang Web Analytics Day, Milan, Italy, 11.11.11)

TRANSCRIPT

Page 1: Big Data at #WADAY11

@CosimoAccoto Are you ready for the era of “Big Data”?

BIG Web Analytics Day, #waday11, Milano

data

Page 2: Big Data at #WADAY11

Let’s talk about…

Why: Competing on (Big) Analytics

How: Data Products & Leadership

What: Big Data! Reality Beyond Hype

Page 3: Big Data at #WADAY11

Page 4: Big Data at #WADAY11

Sorting Reality from the Hype

•  Big Data: a top tech trend for 2012 (Forrester Research)

•  Big Data: a new game-changing asset (The Economist)

•  Big Data: a scientific revolution (Harvard Business Review)

Page 5: Big Data at #WADAY11

Science Paradigms Evolution

- Empirical Science: describing natural phenomena

- Theoretical Modeling: using models and generalizations

- Computational Simulations: simulating complex phenomena

- Data-intensive computing: unifying theory, experiment and simulation at scale

source: Gray J., The Fourth Paradigm. Data-Intensive Scientific Discovery, 2009, p. xviii

Page 6: Big Data at #WADAY11

The “Fourth” Paradigm

The techniques and technologies for such data-intensive science are so different that it is worth distinguishing data-intensive science from computational science as a new, fourth paradigm for scientific exploration

source: Gray J., The Fourth Paradigm. Data-Intensive Scientific Discovery, 2009, p. xix

Page 7: Big Data at #WADAY11

“Big Data!!!” “…Say what?”

source: McKinsey, “Big Data: The Next Frontier for Innovation, Competition, and Productivity”, May 2011, p. 1

Page 8: Big Data at #WADAY11

“Big Data!!!” “…Say what?”

source: Loukides, “Big Data Now”, O’Reilly Media, 2011, p. 8

Page 9: Big Data at #WADAY11

source: Plattner and Zeier, “In-Memory Data Management”, 2011, p. 15-16; * Driscoll, “Big Data Now”

The Attack of the Exponentials

Over the past five decades, the cost of storage, CPU, and bandwidth has been exponentially dropping, while network access has exponentially increased*

Page 10: Big Data at #WADAY11

source: IDC, “2011 Digital Universe Study”, June, 2011, 2015; Image: Wikibon, 2011

7ZB

1.8ZB

Page 11: Big Data at #WADAY11

Big Data is not just “big”

The 3V of Big Data

source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011

Page 12: Big Data at #WADAY11

The Data Deluge: Volume

source: Rogers, “Big Data is scaling BI and Analytics”, Information Management Magazine, 10/2011

Boeing jet engines can produce 10 terabytes of operational information for every 30 minutes they turn. A four-engine jumbo jet can create 640 terabytes of data on just one Atlantic crossing; multiply that by the more than 25,000 flights flown each day, and you get an understanding of the impact that sensor and machine produced data can make on a BI environment.
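The arithmetic behind the jet-engine figures can be checked in a few lines. This is only a sketch: the per-engine rate, the engine count and the 640-terabyte total come from the quote above, while the resulting flight duration is an inference from those numbers, not a figure given by the source.

```python
# Figures quoted on the slide (Rogers, Information Management Magazine, 10/2011)
TB_PER_ENGINE_PER_30_MIN = 10
ENGINES = 4
CROSSING_TB = 640

# Data rate for the whole aircraft, in terabytes per hour
tb_per_hour = TB_PER_ENGINE_PER_30_MIN * 2 * ENGINES

# Flight duration implied by the quoted total (an inference, not a source figure)
implied_hours = CROSSING_TB / tb_per_hour

print(f"{tb_per_hour} TB/hour, implied crossing time: {implied_hours} hours")
```

Under these assumptions the quoted total corresponds to roughly an eight-hour crossing with all four engines reporting at the stated rate.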

Page 13: Big Data at #WADAY11

Online Advertising Serving – 40 milliseconds to respond with the decision (deliver the right ad to the right user profile)

Financial Services – near 1 millisecond to calculate customer scoring probabilities

There are many examples of data that might demand analysis in real time or near real time, or at least in less than a day. RFID sensor data and GPS spatial data show up in time-sensitive transportation logistics. Fast-moving financial trading data feeds fraud-detection and risk assessments.

Streaming Real-Time Data: Velocity

source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011 (image)

Page 14: Big Data at #WADAY11

Data outside of Databases: Variety

Wal-Mart, the world's largest retailer, is logging one million customer transactions per hour and feeding information into databases estimated at 2.5 petabytes.

Old & New Data Sources: RFID tags, sensors, mobile payment, in-vehicle tracking, …

Sources of Retail Data

•  Channel, Reseller, Retailer, DC, Store, Online

•  Brand, Product, SKU, Serial Number, RFID

•  Sell-in, Sell-thru (and again), Sell-out

•  Channel/Trade programs, discounts, rebates

•  CRM, Loyalty, personalized coupons

•  Price, margin, elasticity

•  Advertising, promotion lift library, web-to-store, POP

source: Craig and Craig, “Retail Lesson Learned..”, Strata Conference, 2011

Page 15: Big Data at #WADAY11

source: Evelson and Hopkins, “Expand Your Digital Horizon with Big Data”, Forrester Research, 2011

If you just have high volume or velocity, then big data may not be appropriate. As characteristics accumulate, however, big data becomes attractive by way of cost. The two main drivers are volume and velocity, while variety and variability shift the curve.

Variety & Variability

Page 16: Big Data at #WADAY11

source: Hadoop, 2011

Page 17: Big Data at #WADAY11

source: McKinsey Quarterly, “Are you ready for the era of ‘Big Data’?”, October 2011

Not all industries are created equal

Page 18: Big Data at #WADAY11

Not all retail subsectors are created equal

source: McKinsey, “Big Data: The Next Frontier for Innovation, Competition, and Productivity”, May 2011, p. 82

Page 19: Big Data at #WADAY11

source: IBM CMO C-suite studies, “From Stretched to Strengthened”, 2011, p. 16

Page 20: Big Data at #WADAY11

source: Economist Intelligence Unit, “Big Data: Harnessing a game-changing asset”, 2011, p. 17

Growing Pains

Storing, securing and reconciling data are the most fundamental aspects of any data management strategy. But the heavy lifting starts when companies begin extracting meaningful insights from the data and disseminating them throughout the organization.

Page 21: Big Data at #WADAY11

From Stretched to Strengthened

source: Economist Intelligence Unit, “Big Data: Harnessing a game-changing asset”, 2011, p. 11

Page 22: Big Data at #WADAY11

source: IBM CMO C-suite studies, “From Stretched to Strengthened”, 2011, p. 16

If the CMO's no. 1 challenge is the data deluge, the CIO's no. 1 planned investment for 2012 is BI and analytics

Page 23: Big Data at #WADAY11

“Big Data” & “Analytics” Together?

big data analytics is the application of advanced analytic techniques to very big data sets

source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011, p. 5

advanced analytics as a discovery mission

… and a data products builder

Page 24: Big Data at #WADAY11

source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011, p. 23

Big Data Analytics

Page 25: Big Data at #WADAY11

source: TDWI Research, “Big Data Analytics”, Fourth Quarter, 2011, p. 25

Growth/Commitment

Data visualization/discovery BI/Predictive analytics

Data/Text/Content Mining Pattern Recognition

In-memory/real-time analytics Machine Learning

Page 26: Big Data at #WADAY11

source: Iliinsky and Steele, “Designing Data Visualizations”, 2011, p. 5, 6, 7

Datavis vs Infographics

Infographics is useful for referring to any visual representation of data that is:

•  manually drawn (and therefore a custom treatment of the information);

•  specific to the data at hand (and therefore nontrivial to recreate with different data);

•  aesthetically rich (strong visual content meant to draw the eye and hold interest).

Data visualization and information visualization (casually, data viz and info viz) are useful for referring to any visual representation of data that is:

•  algorithmically drawn (may have custom touches but is largely rendered with the help of computerized methods);

•  easy to regenerate with different data (the same form may be repurposed to represent different datasets with similar dimensions or characteristics);

•  often aesthetically barren (data is not decorated); and

•  relatively data-rich (large volumes of data are welcome and viable, in contrast to infographics).

Page 27: Big Data at #WADAY11

Being (Big) Data-Driven

A data-driven organization acquires, processes, and leverages data in a timely fashion to create efficiencies, iterate on and develop new products and navigate the competitive landscape.

source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011, p. 2

Page 28: Big Data at #WADAY11

Being (Big) Data-Driven

source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011, p. 2

Zynga constantly monitors who their users are and what they are doing, generating an incredible amount of data in the process. By analyzing how people interact with a game over time, they have identified tipping points that lead to a successful game. They know how the probability that users will become long-term changes based on the number of interactions they have with others, the number of buildings they build in the first n days, the number of mobsters they kill in the first m hours, etc. They have figured out the keys to the engagement challenge and have built their product to encourage users to reach those goals.
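One simple way to look for such tipping points is to compare long-term retention across buckets of early activity. The sketch below is illustrative only: the `retention_by_bucket` helper, the bucket size and the sample records are hypothetical and synthetic, not Zynga's data or method.

```python
from collections import defaultdict

def retention_by_bucket(users, bucket_size=5):
    """Return {bucket_start: share of users in that bucket who became long-term}."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for interactions, became_long_term in users:
        # Group users by how many early interactions they had
        bucket = (interactions // bucket_size) * bucket_size
        totals[bucket] += 1
        if became_long_term:
            retained[bucket] += 1
    return {b: retained[b] / totals[b] for b in sorted(totals)}

# Synthetic, made-up records: (interactions in the first n days, became long-term?)
sample_users = [
    (1, False), (2, False), (4, True),
    (6, False), (7, True), (8, True),
    (12, True), (13, True),
]
rates = retention_by_bucket(sample_users)
for bucket, rate in rates.items():
    print(f"{bucket}+ interactions: {rate:.0%} long-term")
```

A sharp jump in the retention share between two adjacent buckets would be a candidate tipping point worth testing further.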

Page 29: Big Data at #WADAY11

Data Scientists and “Data Products”

•  Products that provide highly personalized content (e.g., the ordering/ranking of information in a news feed).

•  Products that help drive the company’s value proposition (e.g., “People You May Know” and other applications that suggest friends or other types of connections).

•  Products that facilitate the introduction into other products (e.g., “Groups You May Like,” which funnels you into LinkedIn’s Groups product area).

•  Products that prevent dead ends (e.g., collaborative filters that suggest further purchases, such as Amazon’s “People who viewed this item also viewed ...”).

•  Products that are stand alone (e.g., news relevancy products like Google News, LinkedIn Today, etc.).

source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011, p. 2
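To make the “also viewed” idea concrete, here is a minimal sketch of item co-occurrence counting, one of the simplest forms of item-to-item collaborative filtering. It is not Amazon's actual algorithm; the function names and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_also_viewed(sessions):
    """Count, for every item, how often each other item appears in the same session."""
    co_views = defaultdict(Counter)
    for items in sessions:
        for a, b in permutations(set(items), 2):
            co_views[a][b] += 1
    return co_views

def also_viewed(co_views, item, k=2):
    """Return up to k items most often viewed together with `item`."""
    counts = co_views.get(item, Counter())
    # Sort by count (descending), then by name so ties are deterministic
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]

# Made-up browsing sessions for illustration
sessions = [
    ["camera", "tripod", "sd_card"],
    ["camera", "sd_card"],
    ["camera", "lens"],
    ["tripod", "lens"],
]
co_views = build_also_viewed(sessions)
print(also_viewed(co_views, "camera"))
```

Production recommenders typically normalize these counts by item popularity, which this sketch leaves out for brevity.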

Page 30: Big Data at #WADAY11

The roles of a data scientist

•  Decision sciences and business intelligence

•  Product and marketing analytics

•  Fraud, abuse, risk and security

•  Data services and operations

•  Data engineering and infrastructure

•  Organizational and reporting alignment

source: Patil D.J., “Building Data Science Teams”, O’Reilly, 2011

Page 31: Big Data at #WADAY11

Being (Big) Data-Driven

…Hey look!!! I’m not Zynga, Google or FB… I’m a retailer, I’m a bank, I’m an insurer, I’m a publisher, I’m a fashion brand… What does it mean to me?

Page 32: Big Data at #WADAY11

Big Data in Consumer Electronics

HP and the “Project Fusion” - To correlate social media conversations about specific product features to actual customer transactions in real-time

1. “unstructured data” (Amazon.com reviews, customer surveys, customer support logs, and other natural-language text);

2. “structured data” (customer support tickets, sales transactions, customer demographics)

source: Prasanna Dhore “Customer Intelligence at HP”, 2010

Page 33: Big Data at #WADAY11

Big Data in Consumer Electronics

source: Prasanna Dhore “Customer Intelligence at HP”, 2010

customer sentiment analysis (unstructured) + product profile (structured)

Page 34: Big Data at #WADAY11

source: Prasanna Dhore “Customer Intelligence at HP”, 2010

Big Data in Consumer Electronics

Page 35: Big Data at #WADAY11

Machine Learning in Travel Services

Improve the customer experience (reduce latency, increase coverage) when searching for hotel rates while controlling impact on suppliers (maintain “look-to-book”).

•  Hotel sort optimization: how can we improve the ranking of hotel search results in order to show consumers hotels that more closely match their preferences?

•  Cache optimization: can we intelligently cache hotel rates in order to optimize the performance of hotel searches?

•  Personalization/segmentation: can we show targeted search results to specific consumer segments?

source: Orbitz Worldwide, 2011

Page 36: Big Data at #WADAY11

Machine Learning in Travel Services

Data Driven Approaches:

•  Traffic Partitioning: identify the subset of traffic that is most efficient and optimize that subset through prefetching and increased bursting.

•  TTL Optimization: use historic logs of availability and rate change information to predict volatility of hotel rates and optimize cache TTL.

source: Orbitz Worldwide, 2011
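The TTL idea can be sketched as follows: hotels whose rates changed often in the past get a short cache lifetime, and stable ones get a longer one. This is a hypothetical heuristic for illustration, not the approach described by the source; `suggest_ttl`, its bounds and the `safety` factor are all assumptions.

```python
from statistics import mean

def suggest_ttl(change_timestamps, min_ttl=60, max_ttl=3600, safety=0.5):
    """Suggest a cache TTL in seconds from the times of past rate changes.

    change_timestamps: seconds at which a hotel's rate was observed to change.
    safety: fraction of the average gap between changes to use as the TTL.
    """
    if len(change_timestamps) < 2:
        return min_ttl  # too little history, so stay conservative
    ordered = sorted(change_timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    ttl = mean(gaps) * safety
    # Clamp the result to the allowed TTL range
    return int(max(min_ttl, min(max_ttl, ttl)))

# Made-up change logs (seconds since the start of the log)
volatile_hotel = [0, 100, 220, 300, 400]   # rate changes about every 100 s
stable_hotel = [0, 4000, 8000]             # rate changes about every 4000 s

print(suggest_ttl(volatile_hotel), suggest_ttl(stable_hotel))
```

A real system would more likely train a predictive model on the logs, as the slide suggests; the average-gap rule only shows the shape of the trade-off between freshness and supplier load.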

Page 37: Big Data at #WADAY11

Big Data Value for Car Rental

The goal is to identify car and equipment rental performance levels to enable pinpointing issues and making the necessary adjustments to improve customer satisfaction levels. Using analytics software, Hertz location managers are able to effectively monitor customer comments to deliver top customer satisfaction scores for this critical level of service. In Philadelphia, survey feedback led managers to discover that delays were occurring at the returns area during certain parts of the day. They quickly adjusted staffing levels and ensured a manager was always present in the area during these specific times.

source: IBM Big Data Cases, 2011

Page 38: Big Data at #WADAY11

Big Data in Location-Based Services

Customer supports/suggestions

source: Amazon Big Data Use Cases, 2011

Page 39: Big Data at #WADAY11

Big Data in Location-Based Services and more…

•  Analyze ad stats (reporting, billing, algorithm inputs)

•  Analyze A/B test results

•  Detect duplicate business listings

•  Email bounce processing

•  Identify bots based on traffic patterns

Not only Data Products…

source: Amazon Big Data Use Cases, 2011
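As an illustration of one task from the list above, duplicate business listings can be flagged by normalizing each listing and comparing the results pairwise. This is a small-scale sketch using Python's standard `difflib`; the `normalize` and `find_duplicates` helpers, the threshold and the sample listings are assumptions, and at real scale the pairwise comparison would need blocking or a distributed job.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

def normalize(text):
    """Lowercase and drop punctuation so trivial differences do not matter."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def find_duplicates(listings, threshold=0.85):
    """Return pairs of listing ids whose normalized text is at least `threshold` similar."""
    pairs = []
    for (id_a, a), (id_b, b) in combinations(listings.items(), 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b))
    return pairs

# Made-up listings for illustration
listings = {
    1: "Joe's Pizza, 12 Main St.",
    2: "Joes Pizza 12 Main St",
    3: "Blue Bottle Coffee, 5 Market St",
}
duplicates = find_duplicates(listings)
print(duplicates)
```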

Page 40: Big Data at #WADAY11

Are you ready to be a Big Data Leader? ;-)

Page 41: Big Data at #WADAY11

Thanks

@CosimoAccoto

This research is part of a more general project on control and management in digital markets, on which the author collaborates as field expert with Prof. Andreina Mandelli, SDA Bocconi Milan and USI Lugano, within the framework of the project BIT (Business Information Technology)

http://www.anderson.ucla.edu

For more information on the project: [email protected]
