(Big) Data Science

Upload: danmallinger

Post on 26-Oct-2014

DESCRIPTION

This talk covers:
1. The current state of data science
2. The problems firms are having in transitioning their skills
3. The need to re-think the role of "data scientist"
4. How to achieve success at your firm

TRANSCRIPT

Page 1: (Big) Data Science

Page 2: (Big) Data Science

(Dan’s Introduction)

Page 3: (Big) Data Science

(Thanks to Think Big)

Page 4: (Big) Data Science

Page 5: (Big) Data Science

• Data science seen as a job role
• The magical insight of data
• But data DOES lie
• Companies can’t wait for data specialists to have all the hypotheses and answers
• Good data science is good research
• Research requires collaboration, is theory driven, and purposeful
• No one person will bring successful data science to an organization

Page 6: (Big) Data Science

• Data science is not the result, it is the process
• While data wrangling and statistics might go together, these folks need to work with business and domain experts
• Domain experts set theory and help create hypotheses
• Business helps keep a focus on meaningful questions
• Engineers, stats, biz, etc. working in concert IS data science, the product is insight

Page 7: (Big) Data Science

• Data science is still the wild west
• A lack of good tools for wrangling data
• A lack of good tools for performing analysis
• Need for estimators and creative solutions

Page 8: (Big) Data Science

• Big data is new
• Still lots of low hanging fruit
• BIG value in creative, not complex, solutions

Page 9: (Big) Data Science

Page 10: (Big) Data Science

• Let’s (quickly) review what tools are commonly used in the space

Page 11: (Big) Data Science

Page 12: (Big) Data Science

• Just because you have a big, fancy cluster, don’t forget about your technique-specific tools and best research practices

Page 13: (Big) Data Science

Page 14: (Big) Data Science

• We’ve seen the tools and some of the existing packages. What does the future of data science and Hadoop hold?

Page 15: (Big) Data Science

• If we look at the tools being developed/sold, we see the future through their eyes
• Focus seems to be on algorithms in a library
• Less emphasis on components which can be pulled together in novel ways
• E.g. do we start with “collaborative filtering” and build it, or do we start with a commonly used test or estimator and build that?
• Packaged algorithms, ultimately, will not satisfy a mature field

Page 16: (Big) Data Science

• Clients are still thinking about data science like it’s analytics over big data
• The power is in going deeper
• Behavioral patterns
• Local to the data topology

Page 17: (Big) Data Science

• Although the field continues to grow, we see patterns in the challenges our clients face as they move to Big Data

Page 18: (Big) Data Science

Page 19: (Big) Data Science

Page 20: (Big) Data Science

• So what will advance things?
• Much of it is organizational

Page 21: (Big) Data Science

Page 22: (Big) Data Science

Page 23: (Big) Data Science

• These are REAL (though censored) notes from a client call.
• Client looking to improve an ad impression system
• Three people, all taking notes, the notes look radically different
• If data science is the system, not the result, then cultural gaps abort it
• Engineers, data scientists, and biz folks all look at the world differently
• All understand problems differently
• Assign value to details differently
• Understand this, embrace it, create a collaborative culture

Page 24: (Big) Data Science

• Sears had a better understanding of the impact of employee/store behavior on sales than anyone
• Sophisticated research and statistical methods
• Things Sears knew to measure, they were the absolute best at
• But don’t forget about the things you aren’t measuring
• Exogenous to the model was the cultural change of the internet
• You know the rest…
• Start measuring that which you aren’t, but leave room for alternative sources of insight

Page 25: (Big) Data Science
