demystifying data science, machine learning and ai gemba … · 2019-07-30 · prof. anton...
TRANSCRIPT
Prof. Anton Ovchinnikov
Demystifying Data Science, Machine Learning and AI
GEMBA Reunion, July 13, 2019
• Some background and terminology
• Why now -- What’s now -- What’s next?
Why are we here?
• “All-things-digital/data” (AI, Machine Learning, …) is at the top of every leader’s agenda
• Forbes: “The Top 10 Business Trends That Will Drive Success” – AI is #1 https://www.forbes.com/sites/ianaltman/2017/12/05/the-top-business-trends-that-will-drive-success-in-2018/#66bebaf0701a
• Fortune “Five Big Business Trends to Watch” – AI is #2 http://fortune.com/2018/01/02/five-big-business-trends-to-watch-in-2018/
• Economist: “The world’s most valuable resource is no longer oil, but data” https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data
• WSJ, FT, … – “data” regularly on the cover/1st page
• Forbes:
• AI will generate $2.9 trillion in business value and recover 6.2 billion hours of worker productivity by 2021. https://www.forbes.com/sites/louiscolumbus/2017/10/03/gartners-top-10-predictions-for-it-in-2018-and-beyond/#39ec316f45bb
• Forrester:
• AI-driven companies will take $1.2 trillion from competitors by 2020. https://go.forrester.com/wp-content/uploads/Forrester_Predictions_2017_-Artificial_Intelligence_Will_Drive_The_Insights_Revolution.pdf
“Whoever becomes the leader in this sphere [artificial intelligence] will become the ruler of the world.”
VLADIMIR PUTIN
The perception of highly publicized new technologies tends to follow a consistent pattern: the Gartner hype cycle (www.gartner.com)
Is AI just hype?
Source: https://www.gartner.com/doc/3783465?ref=SiteSearch&sthkw=Hype%20Cycle%20for%20Analytics%20and%20Business%20Intelligence&fnl=search&srcId=1-3478922254#2048791703
Demystifying “Data” & AI
• Descriptive vs Predictive vs Prescriptive Analytics
• Big Data vs Smart Data
• Data Science
• AI vs Machine Learning vs Deep Learning
• Intelligence – Learning – Data & Science
• Supervised vs Unsupervised vs Reinforcement Learning
• Why did all this become so important NOW?
• How do machines learn?
• What’s next?
Big Data vs Smart Data: What makes data “Smart”?
Smart data is:
• Data that is right for the decision
• Data that supports (and is supported by) analytics, expertise and machines
• Data that hits your key business drivers: customer acquisition, loyalty, growth, risk optimization, etc.
Big Data vs Smart Data: Examples
“Big data”
• Full-motion video feed from security cameras at a bank branch
• Real-time website click-stream data
• Raw Twitter feed
• Your examples?
“Smart data”
• Customer arrival patterns by time of day; security alert
• Purchase behavior segmentation
• Sentiment analyses
• Your examples?
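To make the contrast concrete, here is a minimal sketch of turning a raw event feed ("big data") into an arrival pattern by time of day ("smart data"). The timestamps and the function name `arrivals_by_hour` are made up for illustration; a real feed would be far larger and messier.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw "big data": one timestamp per customer entering a branch.
raw_events = [
    "2019-07-01 09:05:12",
    "2019-07-01 09:47:30",
    "2019-07-01 12:15:02",
    "2019-07-01 12:20:41",
    "2019-07-01 12:58:09",
    "2019-07-01 16:31:55",
]

def arrivals_by_hour(events):
    """Count arrivals per hour of day from raw timestamp strings."""
    hours = (datetime.strptime(e, "%Y-%m-%d %H:%M:%S").hour for e in events)
    return dict(sorted(Counter(hours).items()))

pattern = arrivals_by_hour(raw_events)          # the "smart" summary
busiest_hour = max(pattern, key=pattern.get)    # the number a staffing decision needs
print(pattern)
print(busiest_hour)
```

The summary is tiny compared with the raw feed, but it is the part that is "right for the decision", for example how many tellers to schedule at noon.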
The driver of “Smart Data”: Data Science
[Venn diagram: Data Science at the intersection of Data Engineering, Analytics and Business Expertise]
Source: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Making sense of AI and ML
Artificial intelligence /ˌɑː.tɪ.fɪʃ.əl ɪnˈtel.ɪ.dʒəns/
the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings.
Making sense of AI and ML
• Artificial Intelligence: programs that can act, understand and interact (with other programs and non-programs, e.g., humans)
• Machine Learning: programs/algorithms that improve over time through exposure to more data. Data to machines is like experience for humans
• Deep Learning: a subset of Machine Learning that uses advanced Neural Networks with massive amounts of data to learn. A “sexy” rebranding of Neural Networks with certain clever structures/algorithms: Convolutional NN, Recursive NN
Intelligence – Learning – Data & Science
Three kinds of (Machine) Learning
• Supervised: Regression (predicting numbers); Classification (predicting events)
• Unsupervised: Clustering; Anomaly detection; Association rules
• Reinforcement: Learning to act based on feedback; Games (chess, Go), driverless cars [precise rules, stable environment]
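The difference between the first two kinds of learning can be sketched on the same toy numbers. In the supervised case the labels are given and the program learns to reproduce them; in the unsupervised case it must find the groups by itself. The data, the nearest-centroid classifier and the simple 2-means routine are illustrative choices, not methods named on the slide.

```python
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]  # known answers (supervised only)

def mean(values):
    return sum(values) / len(values)

# Supervised: learn one centroid per known label, then classify new points.
def fit_centroids(xs, ys):
    return {lab: mean([x for x, y in zip(xs, ys) if y == lab]) for lab in set(ys)}

def classify(x, centroids):
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))

# Unsupervised: 2-means clustering; the labels are never used.
def two_means(xs, iterations=10):
    centers = [min(xs), max(xs)]          # simple starting guess
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups

centroids = fit_centroids(points, labels)
print(classify(4.0, centroids))   # a new point gets one of the known labels
clusters = two_means(points)
print(clusters)                   # groups discovered without any labels
```

Both routines end up separating the small values from the large ones; only the supervised one can attach the names "low" and "high", because only it was shown them.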
Why did this become so important NOW?
Algorithms:
• 1960s: Rosenblatt (US), Ivakhnenko (UKR) - ANN
• 1986: Hinton (CAN) - backpropagation
• 1998: Brin (RUS/US) and Page (US) - PageRank
• 2006: Hinton - “deep learning”
Data:
• 1991: Internet
• 1997: Google
• 2000s: home PCs
• 2004: Facebook
• 2005: YouTube
• 2007: iPhone [one]
• …
Computing power:
• 1965: Moore’s law
• 1999: Nvidia GPU
• 2002: Amazon cloud
• 2004: MapReduce
• 2006: Hadoop (Yahoo)
• 2009: Spark
• …
2015+: Par-human performance in various “intelligent” tasks
Super-human performance in various games due to reinforcement learning (“machine teaching itself”)
Why did this become so important NOW? Translation
Source: https://www.eff.org/ai/metrics
• Why worse than images or text? More complex task
• Why French better than German? More data (Canada’s Parliament Records)
Why did this become so important NOW? “Games”
Source: https://www.newscientist.com/article/2133146-human-vs-machine-five-epic-fights-against-ai/
• Computers have beaten the best human chess players ever since IBM’s Deep Blue defeated Kasparov in 1997.
• AlphaGo by DeepMind/Google beat the best Go player in 2016.
• In 2017, AlphaZero beat the world’s best chess-playing computer program, having taught itself how to play in four hours.
• How? Why?
• Reinforcement Learning: works when clear “rules” are present and machines can play against themselves (creating their own data)
• Poker [2017], Starcraft [?]
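The idea of a machine creating its own data under clear rules can be sketched with tabular Q-learning, a classic reinforcement learning method (not necessarily what the systems above use). The "game" here is an invented five-cell corridor where only reaching the last cell is rewarded; the learning rate, discount factor and episode count are arbitrary illustrative values.

```python
import random

N_STATES = 5             # corridor cells 0..4; reaching cell 4 ends the game with reward 1
ACTIONS = [-1, +1]       # step left or step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (illustrative values)

def step(state, action):
    """The rules of the game: move, stay inside the corridor, reward only at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)                 # explore by acting at random
            nxt, reward, done = step(state, ACTIONS[a])
            # Feedback: reward now plus discounted value of the best next move.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

No human ever tells the program that "right" is the correct move; it plays random games, observes the feedback, and the learned values end up favoring "right" in every cell.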
Why did this become so important NOW? How do these algorithms learn?
• Do they learn from / like humans?
• Polanyi’s paradox https://en.wikipedia.org/wiki/Polanyi's_paradox
• Both Yes, and No, and “We don’t know”
1. Data: “features” / variables that describe the situation
• Structured alpha-numerical data (transaction and customer characteristics)
• Unstructured: images, sound, network links (“ImageNet”)
2. Feature Engineering: creating many new variables. What information is in your data, but not captured by the existing variables?
• “Retiree” (age>65), “single and male”, etc. There can be VERY MANY such variables
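The two engineered variables mentioned above can be sketched in a few lines. The customer records and field names (`age`, `marital_status`, `gender`) are hypothetical; the point is only that the new variables are computed from columns that already exist.

```python
# Hypothetical customer records; field names are assumptions for illustration.
customers = [
    {"age": 70, "marital_status": "married", "gender": "female"},
    {"age": 34, "marital_status": "single", "gender": "male"},
    {"age": 66, "marital_status": "single", "gender": "male"},
]

def engineer_features(record):
    """Return a copy of the record with derived yes/no variables added."""
    enriched = dict(record)
    enriched["retiree"] = record["age"] > 65
    enriched["single_and_male"] = (
        record["marital_status"] == "single" and record["gender"] == "male"
    )
    return enriched

enriched_customers = [engineer_features(c) for c in customers]
for c in enriched_customers:
    print(c["age"], c["retiree"], c["single_and_male"])
```

Each rule is trivial, but combinations of existing columns multiply quickly, which is why the list of candidate features grows so large.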
3. Indirect (non-linear) relationships / “representations”
• Regression: Y = f(X) = a + b * X
• Modern ML: Y = crazy complicated function (of functions (of other functions (of many-many Xs)))
[Diagram: inputs x_1, x_2, … feed intermediate functions g, whose outputs feed f(x_1, x_2, …)]
Why is this so powerful? Machine Learning methods “learn” (find) functions that cannot be expressed / explained with simple rules.
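A small sketch of the difference between Y = a + b * X and a "function of functions". The target is XOR (output 1 when exactly one input is 1), which no straight-line formula reproduces, while a two-layer composition does. The weights below are picked by hand for illustration; in real machine learning they would be found from data.

```python
def linear(x1, x2, a, b1, b2):
    """Regression-style model: Y = a + b1*x1 + b2*x2."""
    return a + b1 * x1 + b2 * x2

def relu(z):
    """A simple non-linear building block: negative values become zero."""
    return max(0.0, z)

def composed(x1, x2):
    """Function of functions: f(g1(x1, x2), g2(x1, x2)) with hand-picked weights."""
    g1 = relu(x1 + x2)          # inner function 1
    g2 = relu(x1 + x2 - 1.0)    # inner function 2
    return g1 - 2.0 * g2        # outer function f

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [composed(x1, x2) for x1, x2 in inputs]
print(xor_outputs)

# A linear model that matches the first three cases (a=0, b1=1, b2=1)
# is forced to answer 2 on the last one, where XOR is 0.
print(linear(1, 1, a=0.0, b1=1.0, b2=1.0))
```

The composed function is still built from very simple pieces; the expressive power comes from nesting them.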
4. Complexity control: not letting the ML overfit (“learn” what’s in the data it knows in a way that may not generalize beyond it)
Complexity Controls: Feature Engineering and Overfitting
• Karl Popper. Theory of Knowledge: Falsifiability. “All swans are white”
• Albert Einstein. Theory of Knowledge: Complexity. “Everything Should Be Made as Simple as Possible, But Not Simpler” (KISS), E=mc²
• Complexity control in practice: k-fold cross-validation, train-test-holdout splits, regularization
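A minimal sketch of why held-out data is needed to detect overfitting. Two models are compared on made-up data: a straight line, and a "memorizer" that simply repeats the closest example it has seen (a stand-in for an overly complex model, chosen for illustration). The data, the deterministic ±0.5 "noise" and the choice of 4 folds are all assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

def fit_memorizer(xs, ys):
    """Maximally complex model: repeat the y of the closest x seen in training."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_mse(fit, xs, ys, k=4):
    """k-fold cross-validation: every point is held out exactly once."""
    fold_errors = []
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold]
        model = fit([x for x, _ in train], [y for _, y in train])
        fold_errors.append(mse(model, [x for x, _ in test], [y for _, y in test]))
    return sum(fold_errors) / k

# Made-up data: y = 2x plus a small deterministic "noise" of +/-0.5.
xs = list(range(8))
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
ys = [2 * x + e for x, e in zip(xs, noise)]

for name, fit in [("line", fit_line), ("memorizer", fit_memorizer)]:
    train_err = mse(fit(xs, ys), xs, ys)
    cv_err = cross_val_mse(fit, xs, ys)
    print(f"{name}: train MSE = {train_err:.2f}, cross-validated MSE = {cv_err:.2f}")
```

On the data it has seen, the memorizer looks perfect (zero error); on held-out points it does worse than the simple line. Judging models only by their training error would pick the wrong one.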
Note 1: DL is not the best choice for all use cases
These algorithms are the “workhorses” for many firms:
• Random Forest
• Gradient Boosted Trees (xgboost)
• Support Vector Machines
• Regularized regressions (LASSO)
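As one illustration of the last item, here is a sketch of how a LASSO penalty shrinks coefficients and sets weak ones exactly to zero. It uses the closed-form solution that holds only in the special case of mutually orthogonal feature columns and no intercept; the data and the penalty value 0.5 are made up, and production libraries solve the general case iteratively.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; anything within lam of zero becomes exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_orthogonal(columns, y, lam):
    """LASSO coefficients when the feature columns are mutually orthogonal.

    Minimizes (1/(2n)) * sum((y - X b)^2) + lam * sum(|b_j|); with orthogonal
    columns each coefficient can be computed separately as below.
    """
    n = len(y)
    coefs = []
    for col in columns:
        rho = sum(x * t for x, t in zip(col, y)) / n   # feature/target co-movement
        scale = sum(x * x for x in col) / n            # feature scale
        coefs.append(soft_threshold(rho, lam) / scale)
    return coefs

x1 = [-1.0, 1.0, -1.0, 1.0]     # strongly related to y
x2 = [-1.0, -1.0, 1.0, 1.0]     # barely related to y
y = [3.0 * a + 0.1 * b for a, b in zip(x1, x2)]

plain = lasso_orthogonal([x1, x2], y, lam=0.0)   # no penalty: ordinary least squares
lasso = lasso_orthogonal([x1, x2], y, lam=0.5)   # penalty drops the weak feature
print(plain)
print(lasso)
```

With the penalty switched on, the weak variable is removed from the model entirely, which is one way such methods keep complexity under control.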
What’s now/next? [cont.]
• As we do more with ML/AI, it becomes easier to do harm
• Ethical issues in AI?
• Regulatory issues in AI?
• As ML/AI does more work, less work remains for humans
• The traditional link between jobs and incomes is being broken
• The economy of abundance can sustain all citizens in comfort and economic security whether or not they engage in what is commonly reckoned as work
• As machines continue to invade society, duplicating greater and greater numbers of social tasks, it is human labor itself—at least, as we now think of ‘labor’—that is gradually rendered redundant
• Ad Hoc Committee on the Triple Revolution, 1964 https://en.wikipedia.org/wiki/The_Triple_Revolution
To wrap-up
• ML/AI is not just hype: it is a transformative new technology
• Algorithms + data + compute power allow for widespread data-driven decision-making applications (“AI”) in business and society
• Train your talent to understand (the possibilities of) machine learning/AI and look for opportunities: more / more creative, better, faster, …
• Implementing them will not be easy, and many processes / people will need to change; there will be winners and losers
• The future is not about Humans vs AI, but rather Humans and AI
• The key management question of the 21st century will be: how can we get humans and machines to work best together?
• We need to learn how to innovate with data/ML/AI, much as we did with earlier tech (I’m optimistic!)
Easter morning, 1900: 5th Ave, New York. Spot the automobile
Easter morning, 1913: 5th Ave, New York. Spot the horse
Thank you!
https://www.linkedin.com/in/antonovchinnikov