WEKA 3.5.5 (source: Machine Learning with WEKA)

Upload: preston-boyd

Post on 19-Dec-2015

What is WEKA?

• Weka is a collection of machine learning algorithms for data mining tasks.

• Weka contains tools for
  – data pre-processing,
  – classification,
  – regression,
  – clustering,
  – association rules, and
  – visualization.

• It is also well-suited for developing new machine learning schemes.

Dataset

• A dataset is roughly equivalent to a two-dimensional spreadsheet or database table.

• A dataset is a collection of examples.

• The external representation of an Instances class is an ARFF file, which consists of a header describing the attribute types and the data as a comma-separated list.
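
The header/data layout described above can be illustrated with a minimal sketch. The sample file contents and the helper name `split_arff` are made up for illustration, and the code assumes (beyond what the slides state) that lines starting with `%` are comments. It is not WEKA's own loader.

```python
# A tiny hand-written ARFF document (illustrative data, not from the slides).
ARFF_TEXT = """\
% comment lines start with a percent sign
@relation weather
@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@data
sunny,85
overcast,83
"""


def split_arff(text):
    """Return (header_lines, data_lines), splitting at the @data line."""
    header, data = [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        if not in_data and line.lower() == "@data":
            in_data = True  # everything after this line is instance data
            continue
        (data if in_data else header).append(line)
    return header, data


header, data = split_arff(ARFF_TEXT)
```

Here `header` holds the three declaration lines and `data` holds the two comma-separated rows, which mirrors the spreadsheet view of a dataset: declarations name the columns, rows are the examples.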

Dataset - ARFF

• The ARFF Header Section
  The ARFF Header section of the file contains the relation declaration and the attribute declarations.
  – The @relation Declaration
    The relation name is defined on the first line.
  – The @attribute Declarations
    Each attribute in the data set has its own @attribute statement, which uniquely defines the attribute's name and its data type. The order in which the attributes are declared determines their column positions in the data section of the file.
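
As a rough sketch of how declaration order maps to column position, the following reads a header and builds a name-to-column index. The function name `parse_header` and the sample header are hypothetical, and the sketch assumes attribute names contain no spaces (quoted names are not handled).

```python
HEADER = """\
@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
"""


def parse_header(text):
    """Return (relation_name, [(attribute_name, datatype), ...])."""
    relation = None
    attributes = []  # kept in declaration order
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) < 2:
            continue  # blank line or a keyword with nothing after it
        keyword, rest = parts[0].lower(), parts[1].strip()
        if keyword == "@relation":
            relation = rest
        elif keyword == "@attribute":
            # Assumption: the name is a single unquoted token.
            name, datatype = rest.split(None, 1)
            attributes.append((name, datatype.strip()))
    return relation, attributes


relation, attributes = parse_header(HEADER)
# The position in the list is the column position in the data section.
column_of = {name: index for index, (name, _) in enumerate(attributes)}
```

With this header, `column_of["class"]` is 2, so the third comma-separated value of every instance line is the class label.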

ARFF - Header Section

ARFF - Data Types

• The <datatype> can be any of the types:
  – Numeric: can be real or integer numbers.
    • integer is treated as numeric
    • real is treated as numeric
  – Nominal
  – String
  – Date

• The keywords numeric, real, integer, string and date are case insensitive.
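
The type rules above can be sketched as a small classifier. The function name `attribute_kind` is hypothetical, and treating any `{...}` specification as nominal is an assumption based on the example slide rather than a statement of how WEKA itself parses types.

```python
def attribute_kind(datatype):
    """Map an ARFF <datatype> specification to one of four kinds."""
    spec = datatype.strip()
    # Assumption: a brace-enclosed value list marks a nominal attribute.
    if spec.startswith("{") and spec.endswith("}"):
        return "nominal"
    parts = spec.split(None, 1)
    if not parts:
        raise ValueError("empty datatype")
    keyword = parts[0].lower()  # keywords are case insensitive
    if keyword in ("numeric", "real", "integer"):
        return "numeric"  # real and integer are treated as numeric
    if keyword in ("string", "date"):
        return keyword
    raise ValueError("unknown datatype: " + spec)
```

For example, `attribute_kind("NUMERIC")`, `attribute_kind("Real")` and `attribute_kind("integer")` all give `"numeric"`.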

ARFF - Data Types Example

• @ATTRIBUTE sepallength NUMERIC
• @ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
• @ATTRIBUTE LCC string
• @attribute <name> date [<date-format>]
  default format: yyyy-MM-dd'T'HH:mm:ss
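
To show what a value in the default date format looks like, here is a small sketch that parses one in Python. The pattern on the slide is written in Java's date-pattern notation; the `strptime` string below is an assumed Python equivalent, and `parse_arff_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed Python equivalent of the pattern yyyy-MM-dd'T'HH:mm:ss
DEFAULT_ARFF_DATE = "%Y-%m-%dT%H:%M:%S"


def parse_arff_date(value, fmt=DEFAULT_ARFF_DATE):
    """Parse a date value, tolerating surrounding quotes and spaces."""
    cleaned = value.strip().strip("\"'")
    return datetime.strptime(cleaned, fmt)


stamp = parse_arff_date('"2015-12-19T08:30:00"')
```

A value that does not match the format raises `ValueError`, which is a convenient way to catch malformed dates before loading a file.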

ARFF - Data Section

ARFF - Data Section ..

• The ARFF Data section of the file contains the data declaration line and the actual instance lines.
  – The @data Declaration
    The @data declaration is a single line denoting the start of the data segment in the file.
  – The instance data
    • Each instance is on a single line.
    • Attribute values are delimited by commas.
    • Values must appear in the order in which the attributes were declared in the header section.
    • Missing values are represented by a single question mark.
    • Values of string and nominal attributes are case sensitive, and any that contain a space must be quoted.
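
The instance-line rules above can be sketched with Python's `csv` module. The helper name `parse_instance` is hypothetical, and the sketch assumes values are quoted with single quotes; a quoted question mark would also be read as missing here, which a fuller parser would need to distinguish.

```python
import csv


def parse_instance(line, quotechar="'"):
    """Split one instance line into values; '?' becomes None (missing)."""
    reader = csv.reader([line], quotechar=quotechar, skipinitialspace=True)
    row = next(reader)
    return [None if value == "?" else value for value in row]


values = parse_instance("5.1,?,'Iris setosa'")
```

Because string and nominal values are case sensitive, a comparison such as `"iris-setosa" == "Iris-setosa"` is false, so values should be compared exactly as written in the header.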

Create an ARFF file

Create an ARFF file ..
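
One way to create an ARFF file programmatically is sketched below. It follows the header and data rules from the earlier slides; the function names `quote` and `to_arff` and the sample data are hypothetical, and the sketch assumes values never contain a single quote.

```python
def quote(value):
    """Render one value: None becomes '?', values with spaces or commas are quoted."""
    if value is None:
        return "?"
    text = str(value)
    if " " in text or "," in text:
        return "'" + text + "'"  # assumption: no single quotes inside text
    return text


def to_arff(relation, attributes, rows):
    """Build ARFF text from a relation name, (name, datatype) pairs and rows."""
    lines = ["@relation " + quote(relation), ""]
    for name, datatype in attributes:
        lines.append("@attribute " + quote(name) + " " + datatype)
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(quote(value) for value in row))
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "weather",
    [("outlook", "{sunny,rainy}"), ("temperature", "numeric")],
    [["sunny", 85], ["rainy", None]],
)
```

Writing `arff_text` to a file with an `.arff` extension would give a file whose rows follow the column order of the two declared attributes, with the missing temperature written as `?`.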

WEKA 3.5.5

Program

• LogWindow Opens a log window that captures all that is printed to stdout or stderr. Useful for environments like MS Windows, where WEKA is not started from a terminal.

• Exit Closes WEKA.

Program .. LogWindow

Applications

• Explorer: for exploring data with WEKA.

• Experimenter: for performing experiments and conducting statistical tests between learning schemes.

• KnowledgeFlow: supports essentially the same functions as the Explorer but with a drag-and-drop interface. One advantage is that it supports incremental learning.

• SimpleCLI: Provides a simple command-line interface that allows direct execution of WEKA commands for operating systems that do not provide their own command line interface.

Tools

• ArffViewer An MDI application for viewing ARFF files in spreadsheet format.

• SqlViewer represents an SQL worksheet, for querying databases via JDBC.

• EnsembleLibrary An interface for generating setups for Ensemble Selection.

ArffViewer

SqlViewer

EnsembleLibrary

Visualization

• Plot For plotting a 2D plot of a dataset.

• ROC Displays a previously saved ROC curve.

• TreeVisualizer For displaying directed graphs, e.g., a decision tree.

• GraphVisualizer Visualizes XML BIF or DOT format graphs, e.g., for Bayesian networks.

• BoundaryVisualizer Allows the visualization of classifier decision boundaries in two dimensions.

Windows

• Minimize Minimizes all current windows.

• Restore Restores all minimized windows again.