Big Data: A critical appraisal

Thomas Debeauvais and Bart Knijnenburg

Upload: bart-knijnenburg

Post on 30-Jun-2015


DESCRIPTION

Invited talk by Bart Knijnenburg and Thomas Debeauvais at the IIBA OC dinner meeting

TRANSCRIPT

Page 1: Big data - A critical appraisal

Big Data: A critical appraisal

Thomas Debeauvais

Bart Knijnenburg

Page 2: Big data - A critical appraisal

Outline

The Wonders of Big Data

The Perils of Big Data

User Experiments

A Note on Privacy

Page 3: Big data - A critical appraisal

The Wonders of Big Data

How Big Data will put the personal back in e-commerce

Page 4: Big data - A critical appraisal

Large vs small datasets

Everything is significant!

Data from most/all of your customers
  More than just an educated guess
  This is what really happens!

Large datasets can improve business intelligence

Page 5: Big data - A critical appraisal

The Netflix challenge

Recommendations seen as Netflix’s strongest asset

2006-2009

$1M prize if 10% better than Netflix’s Cinematch

Data: 18k movies, 500k users, 100M ratings

Page 6: Big data - A critical appraisal

The Netflix challenge

Netflix’s rationale:
  “Improve our ability to connect people to the movies they love”
  Improve recommendations = improve satisfaction and retention
  Small R&D team, slow progress
  $1M will pay for itself

Based on Padhraic Smyth’s report at http://www.ics.uci.edu/~smyth/courses/cs277/slides/netflix_overview.pdf

Page 7: Big data - A critical appraisal

Matrix approximation

Distinguish noise from signal: variance and eigenvalues

Singular value decomposition
  Ratings (m × n) = U (m × n) · E (n × n) · V (n × n)

Rank-k approximation
  Ratings (m × n) ≈ U (m × k) · E (k × k) · V (k × n)

[Diagram: the Ratings matrix (m users × n movies) approximated by the product of U (m users × k), E (k × k), and V (k × n movies)]
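The rank-k approximation on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the method used by any Netflix Prize team: it finds singular triples one at a time with power iteration and deflation, where a real system would call a linear-algebra library. The function names and the tiny 4 × 4 ratings matrix are made up for the example; the slide's E corresponds to the singular values (sigma) and its V to the right singular vectors.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_singular_triple(matrix, iterations=200, seed=0):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest singular value."""
    transposed = [list(col) for col in zip(*matrix)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(len(matrix[0]))]
    for _ in range(iterations):
        w = matvec(transposed, matvec(matrix, v))
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    v_norm = math.sqrt(sum(x * x for x in v))
    v = [x / v_norm for x in v]
    av = matvec(matrix, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av] if sigma > 0.0 else [0.0] * len(matrix)
    return sigma, u, v


def rank_k_approximation(matrix, k):
    """Sum of the k largest singular components, found by repeated deflation."""
    m, n = len(matrix), len(matrix[0])
    residual = [row[:] for row in matrix]
    approx = [[0.0] * n for _ in range(m)]
    for _ in range(k):
        sigma, u, v = top_singular_triple(residual)
        for i in range(m):
            for j in range(n):
                term = sigma * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term
    return approx


def frobenius_error(a, b):
    """Square root of the summed squared differences between two matrices."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


# Hypothetical ratings: 4 users (rows) x 4 movies (columns), two taste groups.
ratings = [
    [5.0, 5.0, 1.0, 1.0],
    [4.0, 4.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 4.0, 4.0],
]
approx_1 = rank_k_approximation(ratings, 1)
approx_2 = rank_k_approximation(ratings, 2)
print("rank-1 error:", frobenius_error(ratings, approx_1))
print("rank-2 error:", frobenius_error(ratings, approx_2))
```

Because the toy matrix has only two independent taste patterns, two components reproduce it almost exactly, while one component leaves a visible error. Real rating matrices are also mostly empty, which this sketch ignores.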

Page 8: Big data - A critical appraisal

[Figure: plot of V with k=2 (Koren et al. 2009). One dimension runs from lowbrow comedies and horror for a male or adolescent audience to drama and serious comedy with a strong female lead; the other runs from independent, quirky, critically acclaimed to mainstream, formulaic.]

Page 9: Big data - A critical appraisal

Bias is information

[Smyth 2010]

Page 10: Big data - A critical appraisal

Take-aways

Matrix decomposition
  Meaningful movie categories!
  For example: lowbrow, quirky, indie, strong female lead

Older movies are rated higher
  So ...?
  Should recommend older movies more often or less often?
  Why are they rated higher?

Page 11: Big data - A critical appraisal

The Perils of Big Data

How overfitting and a lack of domain knowledge can lead to suboptimal solutions

Page 12: Big data - A critical appraisal

What about random?

“We were demonstrating our new recommender to a client. They were amazed by how well it predicted their preferences!”

“Later we found out that we forgot to activate the algorithm: the system was giving completely random recommendations.”

Page 13: Big data - A critical appraisal

Tradeoffs

Page 14: Big data - A critical appraisal

Model complexity

“Our winning entries consist of more than 100 different predictor sets” [Koren et al. 2009]

Only 10% better than Netflix
  Why?

Intrinsic noise
  Example: children watch cartoons, Mum is recommended cartoons
  Should Netflix implement a “switch user” feature?
  Domain knowledge!

Page 15: Big data - A critical appraisal

More gotchas

Obvious truisms and correlation fallacies
  Still present in large datasets
  Domain knowledge!

Overfitting: simple models that make sense vs complex models that fit the data
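The overfitting point can be illustrated with a small, deterministic sketch. All data below is invented: the training points follow y = 2x plus alternating noise, and the held-out points come from the same trend. A straight line plays the "simple model that makes sense"; a polynomial that passes through every training point plays the "complex model that fits the data".

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolating_polynomial(xs, ys):
    """Lagrange polynomial through every training point (zero training error)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y = 2x plus noise.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 1.0, 5.0, 5.0, 9.0]
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = [1.5, 2.5, 5.5, 6.5]

line = fit_line(train_x, train_y)
poly = fit_interpolating_polynomial(train_x, train_y)

print("line: train MSE", mse(line, train_x, train_y), "test MSE", mse(line, test_x, test_y))
print("poly: train MSE", mse(poly, train_x, train_y), "test MSE", mse(poly, test_x, test_y))
```

The polynomial wins on the training data and loses on the held-out data, which is the pattern to watch for when a complex model looks impressive on the data it was tuned on.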

Page 16: Big data - A critical appraisal

User Experiments

How user evaluations can be used to create meaningful experiences

Page 17: Big data - A critical appraisal

Offline evaluations

Calibration/Evaluation
  Gather rating data
  Remove 10% of the ratings of each user
  Optimize the algorithm to predict those 10%

Execution
  Predict the rating of unknown items
  Recommend items with highest predicted rating
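The offline procedure above can be sketched as follows. Everything here is an assumption made for illustration: the synthetic ratings, the function names, and the deliberately naive item-mean predictor that stands in for "the algorithm". The sketch holds out roughly 10% of each user's ratings, scores the predictor on them with RMSE, and then recommends the items with the highest predicted rating.

```python
import math
import random
from collections import defaultdict


def split_per_user(ratings, holdout_fraction=0.1, seed=0):
    """Hold out about 10% of each user's (user, item, rating) triples."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for triple in ratings:
        by_user[triple[0]].append(triple)
    train, test = [], []
    for user_ratings in by_user.values():
        shuffled = user_ratings[:]
        rng.shuffle(shuffled)
        # Users with a single rating keep it in the training set.
        n_test = max(1, round(holdout_fraction * len(shuffled))) if len(shuffled) > 1 else 0
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


def fit_item_means(train):
    """Naive stand-in predictor: an item's mean rating, else the global mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, rating in train:
        sums[item] += rating
        counts[item] += 1
    global_mean = sum(r for _, _, r in train) / len(train)
    means = {item: sums[item] / counts[item] for item in sums}
    return lambda user, item: means.get(item, global_mean)


def rmse(predict, test):
    """Root mean squared error on the held-out ratings."""
    return math.sqrt(sum((predict(u, i) - r) ** 2 for u, i, r in test) / len(test))


def recommend(predict, user, all_items, seen_items, n=3):
    """Items the user has not rated in training, ordered by predicted rating."""
    candidates = [item for item in all_items if item not in seen_items]
    return sorted(candidates, key=lambda item: predict(user, item), reverse=True)[:n]


# Synthetic data: 5 users, 12 movies, each user rates 10 of them on a 1-5 scale.
ratings = [(f"u{u}", f"m{m}", 1 + (u * m) % 5)
           for u in range(5) for m in range(12) if (u + m) % 6 != 0]
all_items = [f"m{m}" for m in range(12)]

train, test = split_per_user(ratings)
predict = fit_item_means(train)
error = rmse(predict, test)
seen = {item for user, item, _ in train if user == "u0"}
recs = recommend(predict, "u0", all_items, seen)
print("held-out RMSE:", error)
print("recommendations for u0:", recs)
```

In a real calibration step the predictor's parameters would be tuned to lower the held-out RMSE; the item-mean baseline has nothing to tune, which keeps the sketch short.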

Page 18: Big data - A critical appraisal

Offline evaluations

Problems
  Offline evaluations may not give the same outcome as online evaluations (Cosley et al., 2002; McNee et al., 2002)
  Higher rating does not mean good recommendation (McNee et al., 2006)
  The algorithm counts for only 5% of the relevance of a recommender system (Francisco Martin, 2009)

Solutions
  Test with real users (A/B testing)
  Consider other behaviors (consumption, retention)
  A/B test other aspects (interaction, presentation)

http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html

Page 19: Big data - A critical appraisal

Online evaluations

Testing a recommender against a random videoclip system (A/B test)
  Expectation: Consumption will increase
  Reality: The number of clicked clips and total viewing time went down!

Insight: Recommender is more effective
  More clips watched from beginning to end
  Users browse less, consume more
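One way to check an insight like this in an A/B test is to compare the share of clicked clips that were watched to the end in each condition. The sketch below uses a standard two-proportion z-test with a normal approximation. The counts are invented for illustration and are not the results of the study described on this slide.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two proportions (B minus A)."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b))
    z = (p_b - p_a) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided, normal approximation
    return z, p_value


# Hypothetical counts: clips clicked and clips watched to the end per condition.
random_clicked, random_completed = 1000, 300
recommender_clicked, recommender_completed = 800, 400

z, p_value = two_proportion_z_test(
    random_completed, random_clicked,
    recommender_completed, recommender_clicked,
)
print("completion rate, random:", random_completed / random_clicked)
print("completion rate, recommender:", recommender_completed / recommender_clicked)
print("z =", z, "p =", p_value)
```

In these made-up numbers the recommender condition has fewer clicks but a higher completion rate, so a metric based on clicks alone would point in the wrong direction.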

Page 20: Big data - A critical appraisal

Behavior vs Questionnaires

Behavior is hard to interpret
  Relationship between behavior and satisfaction is not always trivial

Questionnaires are a better predictor of long-term retention
  With behavior only, you will need to run for a long time

Questionnaire data is more robust
  Fewer participants needed

Page 21: Big data - A critical appraisal

A guide to user experiments

“Is my system good?”
  What does good mean?
  We need to define measures

“Does my system score high on this satisfaction scale?”
  What does high mean?
  We need to compare it against something

“Does my system score higher than this other system?”
  Say we find that it scores higher on satisfaction... why does it?
  Apply the concept of ceteris paribus

http://bit.ly/recsys2011short
http://bit.ly/recsystutorialhandout

Page 22: Big data - A critical appraisal

An example…

We compared three recommender systems
  Three different algorithms

System effectiveness scale:
  The system has no real benefit for me.
  I would recommend the system to others.
  The system is useful.
  I can save time using the system.
  I can find better TV programs without the help of the system.
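A scale like this mixes positively and negatively worded items, so the negative ones have to be reverse-scored before the answers are combined. The sketch below assumes a 5-point agreement scale and a plain average. Both are assumptions made for illustration; the slide does not state the response format, and a published analysis would more likely use a factor model.

```python
# (item text, negatively worded?) for the system effectiveness scale.
ITEMS = [
    ("The system has no real benefit for me.", True),
    ("I would recommend the system to others.", False),
    ("The system is useful.", False),
    ("I can save time using the system.", False),
    ("I can find better TV programs without the help of the system.", True),
]


def scale_score(responses, points=5):
    """Average the answers after reverse-scoring the negatively worded items.

    responses: one integer per item, 1 (strongly disagree) to `points` (strongly agree).
    The 5-point default is an assumption, not something the slide specifies.
    """
    if len(responses) != len(ITEMS):
        raise ValueError("expected one response per item")
    total = 0
    for answer, (_, negatively_worded) in zip(responses, ITEMS):
        if not 1 <= answer <= points:
            raise ValueError("response out of range")
        total += (points + 1 - answer) if negatively_worded else answer
    return total / len(ITEMS)


print(scale_score([1, 5, 5, 5, 1]))  # most favorable answers on every item
print(scale_score([5, 1, 1, 1, 5]))  # least favorable answers on every item
```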

Page 23: Big data - A critical appraisal

An example…

The mediating variables tell the entire story

Page 24: Big data - A critical appraisal

An example…

Page 25: Big data - A critical appraisal

A Note on Privacy

How to avoid this looming danger of our Big Data future

Page 26: Big data - A critical appraisal

Personalization… with control

Page 27: Big data - A critical appraisal

Privacy concerns

Second Netflix challenge

Anonymized dataset

Lawsuit from Californian closeted lesbian Mum

Netflix withdraws their second challenge

http://arstechnica.com/tech-policy/2012/07/class-action-lawsuit-settlement-forces-netflix-privacy-changes/

Page 28: Big data - A critical appraisal

Privacy directive

Transparency
  “companies should provide clear descriptions of [...] why they need the data, how they will use it”
  Informed consent

Control
  “companies should offer consumers clear and simple choices [...] about personal data collection, use, and disclosure”
  User empowerment

Page 29: Big data - A critical appraisal

Transparency Paradox

Page 30: Big data - A critical appraisal

Control Paradox

“bewildering tangle of options” (New York Times, 2010)

“labyrinthian controls” (U.S. Consumer Magazine, 2012)

Researchers asked: “what do your privacy settings mean?”
  86% of Facebook users got it wrong!

Page 31: Big data - A critical appraisal

Control Paradox

Introducing an “extreme” sharing option
  Nothing - City - Block
  Add the option Exact

Expected: Some will choose Exact instead of Block

Unexpected: Sharing increases across the board!

http://bit.ly/chi2013privacy

[Figure: the options N, C, B, and E plotted by privacy against benefits]

Page 32: Big data - A critical appraisal

Bounded rationality

[Figure: options A, B, C, and D labeled 25%, 37%, 53%, and 0%, each marked with a question mark]

Page 33: Big data - A critical appraisal

Idea: nudging

People do not always choose what is best for them

Idea: use defaults to “nudge” users in the right direction

Page 34: Big data - A critical appraisal

What is the right direction?

“More information = better, e.g. for personalization”
  Techniques to increase disclosure cause reactance in the more privacy-minded users

“Privacy is an absolute right”
  More difficult for less privacy-minded users to enjoy the benefits that disclosure would provide

Page 35: Big data - A critical appraisal

It depends on the user!

“What is best for consumers depends upon characteristics of the consumer

An outcome that maximizes consumer welfare may be suboptimal for some consumers in a context where there is heterogeneity in preferences” (Smith, Goldstein & Johnson, 2009)

Page 36: Big data - A critical appraisal

Privacy Adaptation Procedure

Idea: Personalize users’ privacy settings!
  Automatic defaults in line with “disclosure profile”
  Using big data to improve big data privacy

Relieves some of the burden of the privacy decision:
  The right privacy-related information
  The right amount of control

“Realistic empowerment”

http://bit.ly/privdim
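As a rough illustration of "automatic defaults in line with a disclosure profile", the sketch below picks a default location-sharing level from a user's earlier choices. It is a toy rule, not the Privacy Adaptation Procedure itself: the function name, the use of the median, and the conservative fallback are all assumptions. Only the four sharing levels come from the talk.

```python
from statistics import median_low

# Sharing levels from the location-sharing example, least to most revealing.
LEVELS = ["Nothing", "City", "Block", "Exact"]


def default_sharing_level(past_choices, fallback="Nothing"):
    """Suggest a default that matches what the user typically chose before.

    past_choices: levels the user picked in earlier, comparable decisions.
    With no history, fall back to the most conservative level (an assumption).
    """
    if not past_choices:
        return fallback
    ranks = [LEVELS.index(choice) for choice in past_choices]
    # median_low always returns one of the observed ranks, never a value in between.
    return LEVELS[median_low(ranks)]


print(default_sharing_level([]))
print(default_sharing_level(["City", "City", "Exact"]))
print(default_sharing_level(["Block", "Exact"]))
```

The user can still override the suggested default, so the control stays with the user while the number of decisions they must make goes down.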

Page 37: Big data - A critical appraisal

Conclusions

The Wonders of Big Data

Big Data can be used to create powerful personalized e-commerce experiences

The Perils of Big Data

Big Data solutions will only work if the developers have an adequate amount of domain knowledge

User Experiments

Big Data solutions need to be tested on real users, with a focus on user experience

A Note on Privacy

Big Data can raise privacy concerns, but it can at the same time be used to alleviate these concerns

Page 38: Big data - A critical appraisal

Questions?
