weka - unipi.it · 2016-11-21 · weka waikato environment for knowledge analysis performing...

16
Università di Pisa 1 WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange

Upload: others

Post on 22-May-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

1

WEKA Waikato Environment for Knowledge

Analysis

Performing Classification Experiments

Prof. Pietro Ducange

Page 2: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

2

n  It provides an alternative to the Explorer interface

n The user can select WEKA components from a palette, place them on a layout canvas and connect them together in order to form a knowledge flow for processing and analyzing data.

The Knowledge Flow Interface

Page 3: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

3

Setting up a flow to load an ARFF file and perform a cross-validation using J48

n  Create a source of data (DataSources tab - ARFFLoader)

n  Connect it to a ARFF file (right click over the ARFFLoader icon - Configure)

n  Specify which attribute is the class (Evaluation tab – ClassAssigner)

n  Connect the ArffLoader to the ClassAssigner (right click over the ArffLoader, select the dataSet under Connections and link with the ClassAssigner component with a left click

n  Specify which column is the class (right click over the ClassAssigner - choose Configure)

n  Add a CrossValidationFoldMaker component (Evaluation)

n  Connect the ClassAssigner to the CrossValidationFoldMaker (right click over ClassAssigner, select dataSet, left click over CrossValidationFoldMaker

Knowledge Flow example (1)

Page 4: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

4

n  Select the J48 component (classifiers tab)

n  Connect the CrossValidationFoldMaker to J48 TWICE (right click over CrossValidationFoldMaker, first choose trainingSet and then testSet)

n  Select ClassifierPerformanceEvaluator component (Evaluation tab)

n  Connect J48 to this component (right click over J48, select batchClassifier left click over by ClassifierPerformanceEvaluator

n  Select TextViewer component (Visualization tab)

n  Connect the ClassifierPerformanceEvaluator to the TextViewer (select the text entry from the pop-up menu for ClassifierPerformanceEvaluator)

n  Select GraphViewer component (Vizualization tab) and link to J48 (select the graph entry from the pop-up menu for J48)

n  Start the flow (select start loading from the pop-up menu for the loader)

Knowledge Flow example (2)

Page 5: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

5

Knowledge Flow example (3)

Page 6: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

6

n  Select show results from the pop-up menu for the graph viewer

n  Select show results from the pop-up menu for the text viewer

Knowledge Flow example (4)

Page 7: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

Knowledge Flow: attribute selection

Page 8: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

8

Knowledge Flow: attribute selection

Select show results from the pop-up menu for the text viewer connected to the Attribute Selection Block

For each fold we can extracted the actual filtered test set!!!

Page 9: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

Knowledge Flow: metaclassification

Page 10: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

10

Knowledge Flow: attribute selection

Select show results from the pop-up menu for the text viewer connected to the Meta Classifier Block

For each fold we can extracted the actual model along with the selected features!

Page 11: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

11

n  A robust experimental part involves running several learning schemes on different datasets.

n  The Experimenter interface enables us to set-up large scale experiments.

n  The user can create an experiment that runs several schemes against a series of datasets and then analyze the results to determine if one of the schemes is (statistically) better than the other schemes.

The Experimenter

Page 12: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

12

Experiment type: n  Cross-validation (default),

Train/Test Percentage Split (data randomized or order preserved)

n  Number of folds n  Classification/Regression

Iteration control n  Set the number of repetition and change the

order of iterations

Datasets

Algorithms

Simple setup

Page 13: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

13

The Analyze panel The number of result lines available

Type of results to load: à from the current experiment à from an earlier experiment file à from the database . Type of comparison

How to perform and show the results of the test

Significance level

Page 14: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

14

The paired T-test results respect to a control algorithm (C4.5)

và the results are statistically better than the control algorithm *à the results are statistically worse than the control algorithm (x/y/z) à counts of the number of times the scheme was better than (x), the same as (y), or worse than (z) the control algorithm

Page 15: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

15

The paired T-test results respect to a control algorithm (Random Forest)

Page 16: WEKA - unipi.it · 2016-11-21 · WEKA Waikato Environment for Knowledge Analysis Performing Classification Experiments Prof. Pietro Ducange . Università di Pisa 2 ! It provides

Università di Pisa

16

n  Load the ionosphere dataset and prepare a 5 fold cross validation

n  Perform the classification by using the three different classifiers and identify the most performing one

n  Once selected the best classifier, perform the classification by using a metaclassifier with three different attribute selection methods

n  Which is the best attribute selection method?

n  Which are the most relevant selected attributes?

Exercise