Classification using Weka (Brain, Computation, and Neural Learning)
Jung-Woo Ha
Agenda
Classification
General Concept
Terminology
Introduction to Weka
Classification practice with Weka
Problems: Pima Indians diabetes, handwritten digit recognition
Algorithms: Neural Networks, Decision Trees, Support Vector Machines
Evaluation criteria
Using Experimenter for batch experiments
Building a committee machine
Mini-project
(C) 2010, SNU Biointelligence Lab, http://bi.snu.ac.kr/
Machine Classification
Sorting fish on a conveyor belt: salmon vs. sea bass
Set up a camera, take images, and use physical differences (length, lightness, width, fin shape, mouth position, etc.) as features to explore.
Concept of Classification
<Notations>
n = # training examples
x = “input” variables (features or attributes)
y = “output” variable / “target” variable
(x, y) – training example
The i-th training example = (x(i), y(i))
Training Set → Learning Algorithm → h (hypothesis)
Input features → h → Output / prediction
e.g. pixels in a picture of a handwritten digit → ‘3’ or ‘8’
$f(\mathbf{x}) = w_0 + w_1 x_1 + \cdots + w_n x_n$
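A linear hypothesis of the form f(x) = w0 + w1*x1 + ... + wn*xn can be sketched in a few lines of Python. This is only an illustration: the function names and the decision threshold of 0 are assumptions made here, not part of the slides.

```python
def linear_hypothesis(weights, x):
    """Return w0 + w1*x1 + ... + wn*xn, where weights[0] is the bias w0."""
    if len(weights) != len(x) + 1:
        raise ValueError("expected one weight per feature plus a bias term")
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


def classify(weights, x, threshold=0.0):
    """Turn the real-valued score into a class label (1 or 0).

    The threshold is an assumption chosen for this sketch.
    """
    return 1 if linear_hypothesis(weights, x) > threshold else 0
```

For example, `classify([-1.0, 1.0], [2.0])` scores the single feature 2.0 as -1.0 + 1.0 * 2.0 = 1.0 and therefore predicts class 1.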
Terminology
Features or Attributes
Features are the individual measurable properties of the phenomena being observed
Choosing discriminating and independent features is key to any pattern recognition algorithm being successful in classification
Training set / Test set
Training set: A set of examples used for learning, that is, to fit the parameters (i.e., weights) of the classifier
Test set: A set of examples used only to assess the performance [generalization] of a fully-specified classifier
Introduction to Weka
Weka: Data Mining Software in Java
Weka is a collection of machine learning algorithms for data mining & machine learning tasks
What can you do with Weka?
data pre-processing, feature selection, classification, regression, clustering, association rules, and visualization
Weka is open-source software released under the GNU General Public License
How to get it: http://www.cs.waikato.ac.nz/ml/weka/ or just search for ‘Weka’ on Google.
Dataset #1: Pima Indians Diabetes
Description
Pima Indians have the highest prevalence of diabetes in the world
We will build classification models that diagnose if the patient shows signs of diabetes
http://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes
Configuration of the data set
768 instances
8 attributes
age, number of times pregnant, results of medical tests/analysis
all numeric (integer or real-valued)
Also, a discretized set will be provided
Class value = 1 (Positive example)
Interpreted as "tested positive for diabetes"
268 instances
Class value = 0 (Negative example)
500 instances
Dataset #2: Handwritten Digits (MNIST)
Description
The MNIST database of handwritten digits contains digits written by Census Bureau employees and high-school students
We will build a recognition model based on classifiers with the reduced set of MNIST
http://yann.lecun.com/exdb/mnist/
Configuration of the data set
Attributes
pixel values in gray level in a 28x28 image
784 attributes (all integers from 0 to 255)
Full MNIST set
Training set: 60,000 examples
Test set: 10,000 examples
For our practice, a reduced set is used (800 training and 200 test examples)
Class value: 0 to 9, representing the digits 0 through 9
Artificial Neural Networks
MLP (Multilayer Perceptron)
In Weka, classifiers-functions-MultilayerPerceptron
Artificial Neural Networks: Four main parameters for learning MLPs
Reviews on BP algorithm
The number of iterations
Learning rate
Momentum
The number of hidden layers and hidden nodes
Reviews on MLPs
Expression power of MLPs
Decision Trees
J48 (Java implementation of C4.5)
In Weka, classifiers-trees-J48
Support Vector Machines
SMO (sequential minimal optimization) for training SVM
In Weka, classifiers-functions-SMO
Practice
Basic
Comparing the performances of algorithms
MultilayerPerceptron vs. J48 vs. SVM
Checking the trained model (structure & parameter)
Tuning parameters to get better models
Understanding ‘Test options’ & ‘Classifier output’ in Weka
Advanced
Building committee machines using ‘meta’ algorithms for classification
Preprocessing / data manipulation – applying ‘Filter’
Batch experiment with ‘Experimenter’
Design & run a batch process with ‘KnowledgeFlow’
Dataset for Practice with Weka
Pima Indians diabetes
Original data: pima_diabetes.arff
Discretized data: pima_diabetes_supervised_discretized.arff
Handwritten Digit (MNIST)
Training/test pair
mnist_reduced_training.arff, mnist_reduced_test.arff
800 & 200 instances, respectively
Total set (1,000 instances)
mnist_reduced_total.arff
Can be used for cross-validation
Data format for Weka (.ARFF)
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
Header: the @relation and @attribute lines
Data (CSV format): the lines after @data
Note: You can easily generate an ‘arff’ file by adding a header to an ordinary CSV text file
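Adding an ARFF header to CSV text can be sketched as follows. This is a minimal illustration, not a Weka utility: the function name `csv_to_arff`, its parameters, and the rule that every column not listed as nominal is declared numeric are assumptions made for this example.

```python
import csv
import io


def csv_to_arff(csv_text, relation, nominal_values=None):
    """Build ARFF text by putting a header in front of CSV data rows.

    The first CSV row is taken as the attribute names. Columns listed in
    nominal_values (name -> list of allowed values) are declared as nominal;
    all other columns are declared numeric (an assumption of this sketch).
    """
    nominal_values = nominal_values or {}
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    names, data = rows[0], rows[1:]

    lines = ["@relation " + relation]
    for name in names:
        if name in nominal_values:
            attr_type = "{" + ",".join(nominal_values[name]) + "}"
        else:
            attr_type = "numeric"
        lines.append("@attribute " + name + " " + attr_type)

    lines.append("@data")
    # Data rows are copied unchanged, so a missing value written as ? stays ?.
    lines.extend(",".join(row) for row in data)
    return "\n".join(lines) + "\n"
```

Calling it with a CSV whose first row is `age,sex,class` and with `nominal_values={"sex": ["female", "male"], "class": ["present", "not_present"]}` produces a header in the same style as the heart-disease example above.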
Neural Networks in Weka
• Load a file that contains the training data by clicking the ‘Open file’ button
• ‘ARFF’ or ‘CSV’ formats are readable
• Click the ‘Classify’ tab
• Click the ‘Choose’ button
• Select ‘weka – classifiers – functions – MultilayerPerceptron’
• Click ‘MultilayerPerceptron’
• Set parameters for MLP
• Set parameters for Test
• Click ‘Start’ for learning
Some Notes on the Parameter Setting
Parameter Setting = Car Tuning
needs much experience or many trials
you may get worse results if you are unlucky
Multilayer Perceptron (MLP)
Main parameters for learning: hiddenLayers, learningRate, momentum, trainingTime (epoch), seed
J48
Main parameters: unpruned, numFolds, minNumObj
Many parameters control the size of the resulting tree, e.g. confidenceFactor, pruning
SMO (SVM)
Main parameters: c (complexity parameter), kernel, kernel parameters
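To give a feel for how a learning rate and a momentum term interact, here is a minimal sketch of one common form of the gradient-descent update with momentum. It is an illustration only: Weka's MultilayerPerceptron may implement the update differently in detail, and the function name and the default values used here are assumptions chosen for the example.

```python
def momentum_step(weights, gradients, velocity, learning_rate=0.3, momentum=0.2):
    """Apply one gradient-descent update with momentum.

    velocity holds the previous weight change. The new change mixes the
    previous change (scaled by momentum) with the current negative gradient
    (scaled by learning_rate).
    """
    new_velocity = [
        momentum * v - learning_rate * g for v, g in zip(velocity, gradients)
    ]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

Repeating this step for a fixed number of passes over the data corresponds to the number of iterations (epochs). A larger learning rate takes bigger steps, while a larger momentum carries more of the previous step into the next one.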
Test Options and Classifier Output
Test options: setting the data set used for evaluation
Classifier output: there are various metrics for evaluation
How to Evaluate the Performance? (1/2)
Usually, build a Confusion Matrix out of the given data
Evaluation Metrics
Accuracy (percent correct)
Precision
Recall
Many other metrics: F-measure, Kappa score, etc.
For fair evaluation, the ‘cross-validation’ scheme is used
How to Evaluate the Performance? (2/2)
Confusion Matrix
                      Real Positive      Real Negative
Prediction Positive   TP                 FP                    All with positive test
Prediction Negative   FN                 TN                    All with negative test
                      All with disease   All without disease   Everyone
$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$
$\text{Recall} = \frac{TP}{TP + FN}$
$\text{Precision} = \frac{TP}{TP + FP}$
As recall ↑, precision ↓; conversely, as recall ↓, precision ↑
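Accuracy, recall, and precision follow directly from the four confusion-matrix counts TP, FP, FN, and TN. A minimal sketch, in which the function name and the choice to return 0.0 for an empty denominator are assumptions made for this example:

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Returning 0.0 for an empty denominator is a convention chosen here.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }
```

With 40 true positives, 10 false positives, 20 false negatives, and 30 true negatives, accuracy is 70/100, precision is 40/50, and recall is 40/60.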
Evaluation Method - Cross Validation
K-fold Cross Validation
The data set is randomly divided into k subsets.
One of the k subsets is used as the ‘test set’ and the other k-1 subsets are put together to form a ‘training set’.
Example: 6-fold cross validation (six subsets D1 to D6 of 128 instances each)
Training: D1 D2 D3 D4 D5, Test: D6
Training: D1 D2 D3 D4 D6, Test: D5
...
Training: D2 D3 D4 D5 D6, Test: D1
$\text{Error} = \frac{1}{k} \sum_{i=1}^{k} \text{Error}_i$
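The k-fold procedure can be sketched with the standard library alone. This is an illustration, not how Weka implements it: the function names, the fixed random seed, and the way instances are dealt into folds are assumptions made for this example.

```python
import random


def k_fold_splits(n_instances, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Instance indices are shuffled once and dealt into k subsets. Each subset
    serves as the test set exactly once while the other k-1 subsets together
    form the training set.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validation_error(fold_errors):
    """Average the per-fold errors: Error = (1/k) * sum of Error_i."""
    return sum(fold_errors) / len(fold_errors)
```

With 768 instances and k = 6, every test set has 128 instances and every training set has the remaining 640.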
Committee Machine in Weka
Using committee machine / ensemble learning in Weka
Boosting: AdaBoostM1
Voting committee: Vote
Bagging
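The idea behind a voting committee can be sketched as plain majority voting over the predictions of several classifiers. This is only an illustration of the concept: it is not Weka's Vote implementation, which offers several combination rules, and the function names and the tie-breaking rule are assumptions made here.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most committee members.

    Ties are broken in favour of the earliest member's label, which is a
    convention chosen for this sketch.
    """
    if not predictions:
        raise ValueError("the committee made no predictions")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def committee_predict(classifiers, x):
    """Ask every classifier (a callable taking x) and combine by majority vote."""
    return majority_vote([classifier(x) for classifier in classifiers])
```

For example, if three members predict ‘3’, ‘8’, and ‘3’ for the same digit image, the committee outputs ‘3’.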
Data Manipulation with Filter in Weka
Attribute: selection, discretization
Instance: re-sampling, selecting specified folds
Using Experimenter in Weka
Tool for ‘Batch’ experiments
• Click ‘New’
• Set experiment type / iteration control
• Set datasets / algorithms
• Select the ‘Run’ tab and click ‘Start’
• If it has finished successfully, click the ‘Analyse’ tab and see the summary
KnowledgeFlow for Analysis Process Design
(comparable to the ‘Process Flow Diagram’ of SAS® Enterprise Miner)
Mini-project
Make an arff file
Make a csv file with MS Excel.
Open the csv file with Weka
Save the csv file as an arff file
With any text editor, change the type of the ‘class’ attribute to a discrete (nominal) value set
Save the arff file
Reload the arff file with Weka
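The text-editor step, changing the ‘class’ attribute to a discrete value set, can also be scripted. A minimal sketch, assuming the attribute is literally named `class` and that its declaration sits on a single line; the function name is hypothetical:

```python
def make_class_nominal(arff_text, values, attribute="class"):
    """Rewrite the declaration of one attribute as a nominal value set.

    Only the matching @attribute line is changed; every other line,
    including the data rows, is copied unchanged.
    """
    nominal_type = "{" + ",".join(values) + "}"
    out = []
    for line in arff_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() == "@attribute" and parts[1] == attribute:
            line = "@attribute " + attribute + " " + nominal_type
        out.append(line)
    return "\n".join(out) + "\n"
```

For a file whose class column holds 0 and 1, `make_class_nominal(text, ["0", "1"])` turns `@attribute class numeric` into `@attribute class {0,1}`, so that Weka treats the task as classification rather than regression.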
Mini-project
• Load a file that contains the training data by clicking the ‘Open file’ button
• ‘ARFF’ or ‘CSV’ formats are readable
• Click the ‘Classify’ tab
• Click the ‘Choose’ button
• Select ‘weka – classifiers – functions – MultilayerPerceptron’
• Click ‘MultilayerPerceptron’
• Set parameters for MLP
• Set parameters for Test
• Click ‘Start’ for learning
Mini-project
Parameter setting of MLPs
More explanations on the parameters
Test Options and Classifier Output
Test options: setting the data set used for evaluation
Classifier output: there are various metrics for evaluation
Mini-project
Build an MLP yourself with the GUI option
You can lay out the hidden layers yourself.
Clicking the ‘More’ button shows a detailed explanation of the GUI option.
Mini-project
J48
Mini-project
Experiments
Convenient comparisons on data and methods
Experiments
Mini-project
Classification problem with Weka
Data set
3 different data sets
Include at least one set from the UCI ML repository (http://archive.ics.uci.edu/ml/) and the MNIST set
Classification methods
MLP: number of iterations, learning rate, momentum, # of hidden nodes
SVM: will be addressed next time
J48: Default options only
Mini term-project
Contents in the report
You should
compare the results of various parameter settings for MLPs
find the optimal parameter setting for the MLP and report the classification performance with that setting on all data sets
compare the best MLP result to the result of J48 on the three data sets (classification performance and time)
include discussions
At most four A4 pages
Due date: 24 Nov. 2011 (302-314-1)