An explanation of machine learning for business

MK99 – Big Data

Big data & cross-platform analytics

MOOC lectures, Pr. Clement Levallois

A short note on machine learning for business

Machine Learning

• A family of techniques for formulating predictions based on data

• Why is it called machine learning?

– Machine: it is about algorithms running on computers, not equations solved with pen and paper

– Learning: the algorithms start out with untuned parameters and poor accuracy. As they are fed data, they refine their parameters and become more accurate: they “learn”.
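
To make the idea of "learning" concrete, here is a minimal sketch in Python. It is not taken from the lecture: the toy data, the one-parameter model y = w * x and the use of gradient descent are all assumptions chosen for illustration. The parameter starts at zero, and each pass over the data nudges it so that the prediction error shrinks.

```python
# Hypothetical toy data, invented for illustration: y is roughly 3 * x.
data = [(1.0, 3.1), (2.0, 5.9), (3.0, 9.2), (4.0, 11.8)]


def mean_squared_error(w, points):
    """Average squared gap between the predictions w * x and the actual y."""
    return sum((w * x - y) ** 2 for x, y in points) / len(points)


def learn(points, steps=50, learning_rate=0.01):
    """Refine the single parameter w by gradient descent."""
    w = 0.0  # the algorithm starts with an uninformed parameter
    errors = [mean_squared_error(w, points)]
    for _ in range(steps):
        # Slope of the error with respect to w, averaged over the data.
        gradient = sum(2 * (w * x - y) * x for x, y in points) / len(points)
        w -= learning_rate * gradient  # small step that reduces the error
        errors.append(mean_squared_error(w, points))
    return w, errors


w, errors = learn(data)
print(f"learned parameter w = {w:.2f}")
print(f"error before learning: {errors[0]:.2f}, after: {errors[-1]:.4f}")
```

The list `errors` records the error after every pass, so it shows the model getting progressively more accurate as its parameter is adjusted.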

Typical setup

1. We start with a training set
-> Data already collected, for which we know the actual values to be found. Ex: a list of consumers, their characteristics and their associated credit scores.

2. The algorithms are trained on this set
-> A series of algorithms run on the training set. Their parameters are adjusted so that the actual values are predicted as accurately as possible.

3. A test set (“fresh data”) is brought in
-> A list of consumer characteristics. Their credit scores are known but hidden from the algorithms.

4. The trained algorithms are run on the test set
-> Predict the credit score of each consumer in the test set, using the algorithms trained in step 2.

5. A measure of accuracy
-> Given the correct values to be predicted in the test set, how accurate were the algorithms? Were the credit scores accurately predicted?
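
The five steps above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the consumers are randomly generated, income is their only characteristic, the model is a simple least-squares line, and accuracy is measured as the mean absolute error. A real project would use real data and usually a dedicated library.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def make_consumer():
    """Invent one consumer: (annual income in thousands, credit score)."""
    income = random.uniform(20, 100)
    score = 300 + 4 * income + random.gauss(0, 15)  # made-up rule plus noise
    return income, score


consumers = [make_consumer() for _ in range(200)]

# Step 1: the training set, where characteristics and scores are both visible.
training_set = consumers[:150]
# Step 3: the test set, whose scores are known to us but hidden from the model.
test_set = consumers[150:]


def train(points):
    """Step 2: fit score = a + b * income by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    b = covariance / variance
    a = mean_y - b * mean_x
    return a, b


a, b = train(training_set)

# Step 4: predict the score of each test consumer from the income alone.
predictions = [a + b * income for income, _ in test_set]

# Step 5: compare the predictions with the hidden actual scores.
mae = sum(abs(p - s) for p, (_, s) in zip(predictions, test_set)) / len(test_set)
print(f"fitted model: score = {a:.1f} + {b:.2f} * income")
print(f"mean absolute error on the test set: {mae:.1f} points")
```

Note that the test consumers play no part in `train`: their scores are only used in the last step, to grade the predictions.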

Vocabulary

• Data scientists “train” their model and then test it

• They are concerned with “out-of-sample” prediction

– That their model accurately predicts data points in the training set (the “sample”) is trivial: it has already seen the answers

– It is the accuracy on the test set that matters!

– Predicting on the test set is called “out-of-sample” prediction
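
A small Python sketch shows why in-sample accuracy says little. The data and the model are assumptions made for illustration: consumers are randomly generated, 20% of their labels are deliberately noisy, and the model is a 1-nearest-neighbour rule that simply copies the label of the closest consumer it has seen. On the training set it is always right, because each consumer is its own closest match. On fresh data it does noticeably worse.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def make_consumer():
    """Invent one consumer: (annual income in thousands, good payer or not)."""
    income = random.uniform(20, 100)
    good_payer = income > 50          # made-up underlying rule
    if random.random() < 0.2:         # 20% of the labels are flipped (noise)
        good_payer = not good_payer
    return income, good_payer


training_set = [make_consumer() for _ in range(200)]
test_set = [make_consumer() for _ in range(100)]


def predict(income):
    """1-nearest-neighbour: copy the label of the closest training consumer."""
    _, label = min(training_set, key=lambda c: abs(c[0] - income))
    return label


def accuracy(points):
    """Share of consumers whose label is predicted correctly."""
    return sum(predict(income) == label for income, label in points) / len(points)


in_sample = accuracy(training_set)
out_of_sample = accuracy(test_set)
print(f"in-sample accuracy:     {in_sample:.2f}")
print(f"out-of-sample accuracy: {out_of_sample:.2f}")
```

The gap between the two figures is what over-fitting looks like: the model has memorised the noise in the training set instead of the underlying rule.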

Why is machine learning (ML) so different from statistics?

• ML does not focus on causality, just prediction!

– Note: for this reason, ML cannot predict the effect of an intervention, as it has no causal model.

• ML has a special concern for out-of-sample prediction

– It is especially careful about over-fitting

• ML picks its algorithms from different academic disciplines

– Text, network relations, clustering, not just traditional statistics

• Coming from computer science, ML has affinities with big data

– Procedures optimized for speed and scale

But the best data scientists often started as statisticians or econometricians. See Hal Varian, Chief Economist at Google.

• Kaggle is a website hosting ML competitions that anybody can join

• Goal: make the best prediction on a dataset, with cash prizes

• From predicting clicks on ads to epileptic seizures

• Always the same setup: a training set, a test set, a scoring based on accuracy.

This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com)

Contact Clement Levallois (levallois [at] em-lyon.com) for more information.
