Machine learning explained and how to apply lean startup to develop an MVP tool
TRANSCRIPT
Imagine you wish to predict the quality of any banana at will. With machine learning this is possible. The first step is to acquire a large sample of bananas, assess their characteristics, and use them to create a large dataset. From this dataset, you determine which features (e.g. colour, size, weight, shape, area grown, etc.) are the most important ones for predicting the quality of each banana. This process, called Feature Engineering, provides a set of input variables. Secondly, you may decide that the measures of quality you wish to predict are sweetness, softness, and storage life. These are called output variables. The task of the machine learning algorithm is to predict the output variables, based on the input variables.
To develop the machine learning model, we split the dataset into two groups: a Learning set (around 90% containing both input and output variables) and a smaller Validation dataset (around 10% also containing both input and output variables).
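As a rough illustration of the split described above, here is a minimal Python sketch. The banana records, the field names (weight_g, colour_score, sweetness) and the split_dataset helper are all made up for this example; only the 90/10 proportion comes from the text.

```python
import random

def split_dataset(rows, validation_fraction=0.1, seed=0):
    """Shuffle the rows and split them into (learning, validation) lists."""
    shuffled = list(rows)
    # Shuffle with a fixed seed so the split is repeatable.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical dataset: 100 bananas with invented input and output values.
bananas = [
    {"weight_g": 100 + i, "colour_score": i % 10, "sweetness": (i % 10) / 10}
    for i in range(100)
]

learning_set, validation_set = split_dataset(bananas)
print(len(learning_set), len(validation_set))  # prints: 90 10
```

Shuffling before splitting is a common precaution so that neither set ends up with, say, only the bananas that were measured first.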
Using the larger Learning dataset only, we start to “train” the machine learning algorithm by feeding it both the input and output variables. The algorithm uses internal rules (or parameters) to predict the output based on the input, and adjusts them each time it makes a mistake (predicts the wrong output value). This allows the algorithm to start to experience the data and learn how the input variables impact the output variables. It begins to create its own framework of how it views bananas. This framework models the link between a typical banana's physical characteristics (input), and its quality (output).
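One way to picture "adjusting the parameters each time it makes a mistake" is the classic perceptron learning rule, sketched below. This is only an illustrative stand-in, not necessarily the algorithm used in practice; the two input features (ripeness, bruising), the good/bad label and all the numbers are assumptions invented for the example.

```python
def predict(weights, bias, features):
    """Return 1 (good banana) if the weighted score is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train(examples, epochs=20, learning_rate=0.1):
    """Perceptron rule: nudge the parameters only when a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no mistakes: stop early
            break
    return weights, bias

# Hypothetical learning set: (ripeness, bruising) -> 1 = good, 0 = bad.
learning_set = [
    ((0.9, 0.1), 1), ((0.8, 0.2), 1), ((0.7, 0.1), 1),
    ((0.3, 0.8), 0), ((0.2, 0.9), 0), ((0.4, 0.7), 0),
]

weights, bias = train(learning_set)
print(weights, bias)
```

After training, the learned weights play the role of the "framework" mentioned above: a positive weight on ripeness and a negative weight on bruising.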
After training, we must test the model's accuracy. To do this, we use the remaining Validation dataset and hide the answers (output) from the algorithm. This way we can assess the algorithm’s accuracy on data for which we know the answers, but the algorithm does not. Hence, we ask the model to predict the output and compare its answers (output) to the true ones.
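The validation step can be sketched in a few lines of Python. Here trained_model is a hypothetical stand-in for whatever model came out of training, and the validation records are invented; the point is only that the model receives the inputs, never the true answers, and its predictions are compared with those answers afterwards.

```python
def accuracy(model, validation_set):
    """Fraction of validation examples the model predicts correctly."""
    correct = 0
    for features, true_label in validation_set:
        predicted = model(features)  # the model sees the inputs only
        if predicted == true_label:
            correct += 1
    return correct / len(validation_set)

# Hypothetical stand-in for a trained model: "good" if ripeness > bruising.
def trained_model(features):
    ripeness, bruising = features
    return 1 if ripeness > bruising else 0

# Hypothetical validation set: (ripeness, bruising) -> true quality label.
validation_set = [
    ((0.85, 0.15), 1),
    ((0.25, 0.75), 0),
    ((0.60, 0.30), 1),
    ((0.55, 0.45), 0),  # a case the stand-in rule gets wrong
]

print(accuracy(trained_model, validation_set))  # prints: 0.75
```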
What's more, the algorithm’s prediction accuracy improves as more data becomes available; it continues to modify itself and gets better. The machine learns!
Case study on Machine Learning. Let's talk bananas…
Got questions or want to learn more? Contact [email protected] Page 1
STEP
WHAT
Dataset
CCA/CCSP promotional effectiveness “historical” dataset is received
Data is split into “Training set” (90%) & “Validation set” (10%)
Learnt Model
The model's “parameters” or demand signals get adjusted so it progressively gets better at predicting.
We also apply “feature engineering” to help the model understand the most important “features” of the “promotional effectiveness” data it needs to learn.
Training set
Using the training data set, we show the model the ‘answers’ within the data so it learns
E.g. When we ran a promotion Y, during time Z, the result was X
Validation set
We now test the model using the validation set but hide the “answers” by asking the model to predict the “answers”. We compare the model’s predictions with the hidden answers to determine accuracy
E.g. If we ran a promotion Y, during time Z, what will be the result?
Idea Model
Once the model is predicting with a high degree of accuracy, we are ready for live ‘market’ data
Machine Learning, a subset of Artificial Intelligence, is the science that involves developing self-learning algorithms. The "learning" part of machine learning is an algorithm that optimizes predictive accuracy through “training” and “validation”.
Step by step flow of machine learning
[Flow diagram labels: Complete dataset; Split into two datasets to train model; High degree predicting model; Experiment; Deployment]
STEP
MVP
Once the experiment has validated ROI, we proceed to develop an MVP tool (i.e. Idea Model + UX interface) to allow end-users to interact with the model.
Experiment
We apply our freshly-developed model to the real-world data and assess the results/business impact. We continue to refine the parameters.
Product
The product, often called the “Beta Product” because it’s the first version, is constantly refined and improved based on user and business (i.e. security) feedback and needs
WHAT
Dataset
Source dataset
Data is split into “Training set” (90%) & “Validation set” (10%)
Learnt Model
The model's “parameters” or demand signals get adjusted so it progressively gets better at predicting.
We also apply “feature engineering” to help the model understand the most important “features” of the “emailing” data it needs to learn.
Training set
Using the training data set, we show the model the ‘answers’ within the data so it learns
E.g. This is what a spam email looks like; this is not a spam email
Validation set
We now test the model using the validation set but hide the “answers” by asking the model to predict the “answers”. We compare the model’s predictions with the hidden answers to determine accuracy
E.g. Is this email spam or not?
Idea Model
Once the model is predicting with a high degree of accuracy, we are ready for live ‘market’ data
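The spam steps above can be sketched end to end in Python. This is a deliberately simple word-counting approach chosen for illustration, not a description of any production model; the emails, the labels and the train/predict helper names are all invented for the example.

```python
from collections import Counter

def train(labelled_emails):
    """Count how often each word appears in spam and in non-spam emails."""
    counts = {True: Counter(), False: Counter()}
    for text, is_spam in labelled_emails:
        counts[is_spam].update(text.lower().split())
    return counts

def predict(counts, text):
    """Predict spam if the email's words were seen more often in spam."""
    words = text.lower().split()
    spam_score = sum(counts[True][w] for w in words)
    not_spam_score = sum(counts[False][w] for w in words)
    return spam_score > not_spam_score

# Training set: the model is shown the emails together with the "answers".
training_set = [
    ("win a free prize now", True),
    ("free money claim your prize", True),
    ("meeting agenda for monday", False),
    ("please review the monday report", False),
]

# Validation set: the answers are held back and used only for comparison.
validation_set = [
    ("claim your free prize", True),
    ("agenda for the review meeting", False),
]

model = train(training_set)
correct = sum(predict(model, text) == answer for text, answer in validation_set)
print(f"{correct} of {len(validation_set)} validation emails predicted correctly")
```

With a dataset this small the result says little about real accuracy; it only shows the order of the steps: learn from labelled examples, then predict on examples whose labels are hidden.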
HIVERY
Using HIVERY’s Discovery, Experiment and Deployment methodology, a product development cycle is added once the model has been validated (step 5), where we experiment (test the model) and build an MVP that allows business users ongoing use of the model in a simple yet powerful interface
From Machine Learning to custom Product Solutions
Discovery