(big) data science
DESCRIPTION
This talk covers:
1. The current state of data science
2. The problems firms are having in transitioning their skills
3. The need to re-think the role of "data scientist"
4. How to achieve success at your firm.
TRANSCRIPT
1
(Dan’s Introduction)
2
(Thanks to Think Big)
3
4
• Data science seen as a job role
• The magical insight of data
• But data DOES lie
• Companies can’t wait for data specialists to have all the hypotheses and answers
• Good data science is good research
• Research requires collaboration, is theory driven, and purposeful
• No one person will bring successful data science to an organization
5
• Data science is not the result, it is the process
• While data wrangling and statistics might go together, these folks need to work with business and domain experts
• Domain experts set theory and help create hypotheses
• Business helps keep a focus on meaningful questions
• Engineers, stats, biz, etc. working in concert IS data science, the product is insight
6
• Data science is still the wild west
• A lack of good tools for wrangling data
• A lack of good tools for performing analysis
• Need for estimators and creative solutions
7
• Big data is new
• Still lots of low hanging fruit
• BIG value in creative, not complex, solutions
8
9
• Let’s (quickly) review what tools are commonly used in the space
10
11
• Just because you have a big, fancy cluster, don’t forget about your technique-specific tools and best research practices
12
13
• We’ve seen the tools, some of the existing packages. What does the future of data science and Hadoop hold?
14
• If we look at the tools being developed/sold, we see the future through their eyes
• Focus seems to be on algorithms in a library
• Less emphasis on components which can be pulled together in novel ways
• E.g. do we start with “collaborative filtering” and build it, or do we start with a commonly used test or estimator and build that?
• Packaged algorithms, ultimately, will not satisfy a mature field
15
• Clients are still thinking about data science like it’s analytics over big data
• The power is in going deeper
• Behavioral patterns
• Local to the data topology
16
• Although the field continues to grow, we see patterns in the challenges our clients face as they move to Big Data
17
18
19
• So what will advance things?
• Much of it is organizational
20
21
22
• These are REAL (though censored) notes from a client call.
• Client looking to improve an ad impression system
• Three people, all taking notes, the notes look radically different
• If data science is the system, not the result, then cultural gaps abort it
• Engineers, data scientists, and biz folks all look at the world differently
• All understand problems differently
• Assign value to details differently
• Understand this, embrace it, create a collaborative culture
23
• Sears had a better understanding of the impact of employee/store behavior on sales than anyone
• Sophisticated research and statistical methods
• Things Sears knew to measure, they were the absolute best at
• But don’t forget about the things you aren’t measuring
• Exogenous to the model was the cultural change of the internet
• You know the rest…
• Start measuring that which you aren’t, but leave room for alternative sources of insight
24
25