Heavy, Messy, Misleading. Why Big Data is a human problem, not a technology one

Francesco D’Orazio, @abc3d VP Product, PulsarPlatform.com Heavy, Messy, Misleading Why Big Data is a human problem, not a technology one

Upload: francesco-dorazio

Post on 02-Jul-2015

Category: Data & Analytics


DESCRIPTION

"Big data" has been around for a few years now but for every hundred people talking about it there’s probably only one actually doing it. As a result Big Data has become the preferred vehicle for inflated expectations and misguided strategy. As always, language holds the key and the seed of the issue is reflected in the expression itself. "Big Data" is not so much about a quality of the data or the tools to mine it, it’s about a new approach to product, policy or business strategy design. And that’s way harder and trickier to implement than any new technology stack. In this talk I look at where Big Data is going, what are the real opportunities, limitations and dangers and what can we do to stop talking about it and start doing it today.

TRANSCRIPT

Every talk about Big Data should start with Twin Peaks

There’s more to big data than the technology behind it.

And the best way to find out what it is, is to start from the metaphors of Big Data

stream of data

ocean of data

river of data

a data leak

data firehose

data flood

data tsunami

data “is” fluid

data “is” huge

data “is” powerful

data “is” unpredictable

data “is” uncontrollable

Data is the new oil(?!)

We are not going to war for it (yet)

Data is not a scarce resource

The abundance of data is the result of the instrumentalisation of the natural, industrial and social worlds

The Large Hadron Collider can record up to 40 million particle interactions per second

The Square Kilometre Array will collect data from deep space dating back more than 13 billion years

Wolfram Data Science on Facebook Data: how our topics of discussion change by age and gender

Carna Botnet: in just 60 seconds nearly 640 terabytes of IP data are transferred across the globe via the Internet

Machine Sensing

The sensors on the new Airbus A380 generate 10 terabytes of data every 30 minutes. That’s 120 TB on every London to New York flight
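
The 120 TB figure follows from the stated rate only if the flight lasts about six hours. That duration is not given on the slide; it is an assumption used here to make the arithmetic explicit, and the names below are illustrative.

```python
# Back-of-the-envelope check of the slide's figure.
TB_PER_INTERVAL = 10        # terabytes per interval, from the slide
INTERVAL_MINUTES = 30       # interval length, from the slide
ASSUMED_FLIGHT_HOURS = 6    # assumption: duration is not stated on the slide


def flight_data_tb(flight_hours: float) -> float:
    """Terabytes generated over a flight at the slide's stated rate."""
    intervals = flight_hours * 60 / INTERVAL_MINUTES
    return intervals * TB_PER_INTERVAL


print(flight_data_tb(ASSUMED_FLIGHT_HOURS))  # 120.0
```

A longer crossing scales the total linearly, so the exact figure depends on the flight time you plug in.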

And yet, another reason why data is not the new oil is that we are not actually using it much…

99.5%: percentage of newly created digital data that’s never analysed

But that’s not strictly true either…

0.5%: percentage of newly created digital data that’s actually being used

Higher % of teenagers having sex vs % of new data being analysed

Credit Scores have replaced the handshake with the bank manager

Fair and Isaac came along in 1956. Today they crunch around 10 billion scores each year

Buying advertising used to be about smiles and jokes over Martini lunches

Now it looks more like this…

11 seconds of trading in FB shares. By 2006, one third of all transactions in the EU and US were already algorithmic

Walmart handles more than 1M customer transactions per hour, all affected by price elasticity

Price discrimination based on login info, browser history, device and A/B testing is common practice for most online retailers

75% of the content Netflix serves is chosen based on a Netflix recommendation

At BuzzFeed every item of content has its own dashboard showing how it spreads from ‘seed views’ to ‘social views’ and by what ‘viral lift’

Upworthy

Systematic experimentation: 15% of the top 10,000 websites use A/B testing

Crowdpac matches candidates and funders based on analysis of public speeches, contributions and other sources of public data on the candidate

The LAPD ran a pilot to predict where the next crime is going to happen (‘crime aftershocks’), based on data from 13 million crimes over 80 years

The Dubai police are equipping officers with face-recognition-enabled Google Glass to identify potential wanted criminals

Target Data can predict with 75% accuracy the likelihood that a home will sell in the next 30, 60 or 90 days

LinkedIn had a student problem, so they rearranged the data they already had for a student audience

99.5%: why then are we throwing away this much data?

We are still learning to recognize problems as data problems

Big Data changes the very definition of how we produce knowledge

Less → More
Exact → Messy
Causation → Correlation

Significant correlation requires scale. And scale is hard to handle.
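
One way to read this claim: the weaker a correlation is, the more observations it takes to tell it apart from zero. A minimal sketch, assuming the standard Fisher z-transformation approximation and a two-sided 95% threshold (the function name and the sample sizes are illustrative, not from the talk):

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def min_significant_r(n: int) -> float:
    """Smallest |r| distinguishable from zero at ~95% confidence for n pairs.

    Uses the Fisher z-transformation approximation, under which atanh(r)
    has standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    return math.tanh(Z_95 / math.sqrt(n - 3))


for n in (10, 100, 10_000, 1_000_000):
    print(f"n={n:>9,}  |r| must exceed {min_significant_r(n):.4f}")
```

The threshold shrinks roughly with the square root of the sample size, which is why faint patterns only become detectable in very large datasets.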

DNA research is a case in point: DNA data is hard to manipulate and there’s not enough sequenced DNA available to establish significant patterns

Big Data comes with Big Errors

Data is rarely normalised.

Data is siloed and not verifiable.

Big does not equal whole.

Big does not equal representative.
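
A small simulation can make this concrete. The sketch below uses an entirely hypothetical population of ages and a hypothetical platform whose users are all under 40; it compares a large sample drawn only from that platform with a small random sample of everyone.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 ages, uniform between 18 and 90.
population = [random.uniform(18, 90) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# "Big" sample: 20,000 records, but only from people under 40
# (standing in for a data source whose users skew young).
under_40 = [age for age in population if age < 40]
big_biased = random.sample(under_40, 20_000)

# "Small" sample: 500 records drawn at random from everyone.
small_random = random.sample(population, 500)

big_error = abs(statistics.fmean(big_biased) - true_mean)
small_error = abs(statistics.fmean(small_random) - true_mean)

print(f"true mean age:            {true_mean:.1f}")
print(f"error, 20,000 biased:     {big_error:.1f} years")
print(f"error, 500 random:        {small_error:.1f} years")
```

Adding more records from the same skewed source would not shrink the first error; only changing who gets sampled does.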

Data doesn’t speak for itself. We speak for it.

Big Data is still biased and the result of interpretation.

Correlation doesn’t imply causality.
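
A common way this goes wrong is a hidden common cause. The sketch below is a made-up illustration, not data from the talk: two series (hypothetically, ice cream sales and sunburn cases) both depend on temperature and not on each other, yet they correlate strongly until the shared driver is subtracted out. The coefficient of 1 on temperature is known here only because the data is simulated.

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is repeatable


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


N = 5_000
# Hypothetical hidden driver (standardised daily temperature).
temperature = [random.gauss(0, 1) for _ in range(N)]
# Two outcomes that each depend on temperature but not on each other.
ice_cream_sales = [t + random.gauss(0, 0.3) for t in temperature]
sunburn_cases = [t + random.gauss(0, 0.3) for t in temperature]

raw_r = pearson(ice_cream_sales, sunburn_cases)

# Subtract the (known, simulated) effect of temperature and correlate the rest.
ice_resid = [s - t for s, t in zip(ice_cream_sales, temperature)]
burn_resid = [s - t for s, t in zip(sunburn_cases, temperature)]
adjusted_r = pearson(ice_resid, burn_resid)

print(f"raw correlation:      {raw_r:.2f}")
print(f"after removing cause: {adjusted_r:.2f}")
```

With real data the confounder is rarely known in advance, which is the point of the slide: the correlation alone cannot tell you which story is true.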

Models are often too simple and not peer-reviewed.

Context is hard to interpret at scale. Traditional Qual & Quant have to work with big data.

Google Flu Trends:
3 billion queries/day
50 million top keywords identified
5 years of data on flu spread matched
Overestimates by 50%
Didn’t predict pandemics

Big Data also means a big new digital divide.

Accessible doesn’t mean ethical.

The problems slowing down the adoption of Big Data are human problems

And that’s because the biggest innovation in Big Data is a human innovation

An innovation in decision-making: framing, solving and actioning a problem

“Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc., to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value” (Michael Palmer)

The opportunity in Big Data is data middleware: turning crude into gas, plastic, chemicals

But until we invent the new plastic, the new gas, the new chemicals, we are stuck with the smokescreen. Or even the smoke monster.

Francesco D’Orazio, @abc3d VP Product, PulsarPlatform.com

Thank You