shraddha weka

Prepared By : Shraddha Mehta

Upload: shraddha18

Post on 04-Aug-2015


Page 1: Shraddha weka

Prepared By : Shraddha Mehta

Page 2: Shraddha weka

Weka was developed at the University of Waikato in New Zealand.

Weka is an open-source data mining tool written in Java. It is used for research, education, and applications, and it runs on Windows, Linux and Mac.

Page 3: Shraddha weka
Page 4: Shraddha weka

Main features:

• Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods

• Graphical user interfaces (incl. data visualization)

• Environment for comparing learning algorithms

Page 5: Shraddha weka

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset (using the GUI) or called from your own Java code (using the Weka Java library).

Page 6: Shraddha weka

Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

Page 7: Shraddha weka

Input
• Raw data

Data Mining by Weka
• Pre-processing
• Classification
• Regression
• Clustering
• Association Rules
• Visualization

Output
• Result

Page 8: Shraddha weka

There are two main ways to use Weka for your data mining tasks.

Use the Weka graphical user interfaces (GUI): The GUI is straightforward and easy to use, but it is not flexible and it cannot be called from your own application.

Page 9: Shraddha weka

Import the Weka Java library into your own Java application: Developers can build software on top of the Weka Java library or modify its source code to meet special requirements. This approach is more flexible and advanced, but it is not as easy to use as the GUI.

Page 10: Shraddha weka

Tools (or functions) in Weka include:

• Data preprocessing (e.g., Data Filters)

• Classification (e.g., BayesNet, KNN, C4.5 Decision Tree, Neural Networks, SVM)

• Regression (e.g., Linear Regression, Isotonic Regression, SVM for Regression)

• Clustering (e.g., Simple K-means, Expectation Maximization (EM))

• Association rules (e.g., Apriori Algorithm, Predictive Accuracy, Confirmation Guided)

• Feature Selection (e.g., Cfs Subset Evaluation, Information Gain, Chi-squared Statistic)

• Visualization (e.g., View different two-dimensional plots of the data)

Page 11: Shraddha weka

• Weka Data File Format (Input)

• Weka for Data Mining

• Sample Output from Weka (Output)

Page 12: Shraddha weka

FILE FORMAT

@relation RELATION_NAME

@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE
@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE
@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE
@attribute ATTRIBUTE_NAME ATTRIBUTE_TYPE

@data
DATAROW1
DATAROW2
DATAROW3

The most popular data input format of Weka is “arff” (with “arff” being the extension name of your input data file).
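To make the file format template concrete, the following plain-Java sketch assembles a small ARFF document as text. It does not use Weka itself. The class name `ArffSketch`, the relation name `weather`, and the attribute names and data rows are all illustrative assumptions, not taken from the slides; only the `@relation` / `@attribute` / `@data` layout comes from the template.

```java
import java.util.Arrays;
import java.util.List;

public class ArffSketch {

    /** Builds ARFF text: a header (@relation, @attribute lines) followed by @data rows. */
    static String buildArff(String relation, List<String[]> attributes, List<String> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String[] attr : attributes) {
            // attr[0] is the attribute name, attr[1] is the attribute type
            sb.append("@attribute ").append(attr[0]).append(' ').append(attr[1]).append('\n');
        }
        sb.append("\n@data\n");
        for (String row : rows) {
            // one comma-separated line per data row, values in attribute order
            sb.append(row).append('\n');
        }
        return sb.toString();
    }

    /** Hypothetical sample: two nominal attributes and one numeric attribute. */
    static String sample() {
        List<String[]> attributes = Arrays.asList(
                new String[] {"outlook", "{sunny,overcast,rainy}"},
                new String[] {"temperature", "numeric"},
                new String[] {"play", "{yes,no}"});
        List<String> rows = Arrays.asList("sunny,85,no", "overcast,83,yes", "rainy,70,yes");
        return buildArff("weather", attributes, rows);
    }

    public static void main(String[] args) {
        // Print the ARFF text; saving it with an .arff extension gives a file Weka can load.
        System.out.print(sample());
    }
}
```

Nominal attributes list their allowed values in braces, while `numeric` marks a number-valued attribute; each data row then supplies one value per declared attribute, in the same order.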

Page 13: Shraddha weka

Different analysis tools/functions

Different attributes to choose

The value set of the chosen attribute and the # of input items with each value

Page 14: Shraddha weka

Weka GUI

Classification Algorithms

Page 15: Shraddha weka

Three sets of classes you may need to use when developing your own application:

• Classes for Loading Data

• Classes for Classifiers

• Classes for Evaluation
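As a rough illustration of how these three sets of classes might fit together, here is a minimal sketch that loads a dataset, trains a classifier, and evaluates it. It assumes the Weka jar is on the classpath, that a file named `weather.arff` exists in the working directory (a hypothetical name), and that the class attribute is the last one in the file. J48 (Weka's C4.5 decision tree) and 10-fold cross-validation are chosen only as examples; the slides do not prescribe either.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaSketch {
    public static void main(String[] args) throws Exception {
        // Loading data: read an ARFF file (hypothetical file name).
        Instances data = DataSource.read("weather.arff");
        // Assumption: the attribute to predict is the last one.
        data.setClassIndex(data.numAttributes() - 1);

        // Classifier: build a C4.5-style decision tree on the full dataset.
        J48 tree = new J48();
        tree.buildClassifier(data);
        System.out.println(tree);

        // Evaluation: 10-fold cross-validation on a fresh, untrained J48,
        // with a fixed random seed so the fold split is repeatable.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new J48(), data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}
```

Swapping in another classifier should only require changing the two `J48` lines, since the loading and evaluation steps work against the general classifier interface.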

Page 16: Shraddha weka

In sum, the overall goal of Weka is to build a state-of-the-art facility for developing machine learning (ML) techniques and to allow people to apply them to real-world data mining problems.

Page 17: Shraddha weka

Thank you