machine learning with weka - university of illinois at chicagocornelia/russir14/lectures/... ·...

Machine Learning with Weka Sujatha Das Gollapalli Cornelia Caragea Thanks to Eibe Frank for some of the slides August 19, 2014


Page 1: Machine Learning with Weka - University of Illinois at Chicagocornelia/russir14/lectures/... · WEKA: resources!! API Documentation, Tutorials, Source code.!! WEKA mailing list !!

Machine Learning with Weka

Sujatha Das Gollapalli
Cornelia Caragea

Thanks to Eibe Frank for some of the slides

August 19, 2014

Page 2:

WEKA: the software
- Machine learning/data mining software written in Java (distributed under the GNU General Public License)
- Used for research, education, and applications
- Main features:
  - Comprehensive set of data pre-processing tools, learning algorithms, and evaluation methods
  - Graphical user interfaces (incl. data visualization)
  - Environment for comparing learning algorithms
- WEKA website:
  - http://www.cs.waikato.ac.nz/ml/weka/

Page 3:

WEKA: resources
- API Documentation, Tutorials, Source code.
- WEKA mailing list
- Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
- Weka-related Projects:
  - Weka-Parallel - parallel processing for Weka
  - RWeka - linking R and Weka
  - YALE - Yet Another Learning Environment
  - Many others…

Page 4:

WEKA: launching
- java -jar weka.jar

Page 5:

Data Preparation and Loading

Page 6:

Data Preparation: WEKA only deals with “flat” files

@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
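To make the "flat" file layout concrete, here is a minimal sketch of how such a file could be read. It is not WEKA's own loader: the class ArffSketch and its methods are hypothetical names introduced for illustration, and the sketch handles only the subset of ARFF shown on the slide (@relation, @attribute, @data, comma-separated rows, and "?" for a missing value).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of WEKA.
class ArffSketch {
    String relation;
    final List<String> attributeNames = new ArrayList<>();
    final List<String[]> rows = new ArrayList<>();

    // Parse the header lines, then treat every line after @data as one instance.
    static ArffSketch parse(String text) {
        ArffSketch arff = new ArffSketch();
        boolean inData = false;
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) continue; // blank line or comment
            String lower = line.toLowerCase();
            if (inData) {
                String[] cells = line.split(",");
                for (int i = 0; i < cells.length; i++) cells[i] = cells[i].trim();
                arff.rows.add(cells);
            } else if (lower.startsWith("@relation")) {
                arff.relation = line.substring("@relation".length()).trim();
            } else if (lower.startsWith("@attribute")) {
                // The attribute name is the first token after the keyword.
                String rest = line.substring("@attribute".length()).trim();
                arff.attributeNames.add(rest.split("\\s+")[0]);
            } else if (lower.startsWith("@data")) {
                inData = true;
            }
        }
        return arff;
    }

    // A single "?" marks a missing value in the data section.
    static boolean isMissing(String cell) {
        return "?".equals(cell);
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "@relation heart-disease-simplified",
            "@attribute age numeric",
            "@attribute sex { female, male}",
            "@attribute cholesterol numeric",
            "@attribute class { present, not_present}",
            "@data",
            "63,male,233,not_present",
            "38,female,?,not_present");
        ArffSketch arff = parse(text);
        System.out.println(arff.relation + ": " + arff.attributeNames.size()
            + " attributes, " + arff.rows.size() + " rows");
        System.out.println("cholesterol missing in row 2: " + isMissing(arff.rows.get(1)[2]));
    }
}
```

Each row has exactly one value per declared attribute, which is what "flat" means here: one table, one instance per line, no nested or linked records.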


Page 8:

Explorer: pre-processing the data
- Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
- Data can also be read from a URL or from an SQL database (using JDBC)
- Pre-processing tools in WEKA are called “filters”
- WEKA contains filters for:
  - Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
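As a rough illustration of what two of these filter types compute, the sketch below implements min-max normalization and equal-width discretization for one numeric attribute. It is plain Java rather than WEKA's filter classes; the class FilterSketch and its method names are assumptions made for this example, and missing values are not handled.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not WEKA's filter implementation.
class FilterSketch {
    // Rescale values linearly to [0, 1]; a constant column maps to 0.
    static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    // Equal-width discretization: map each value to a bin index in [0, bins - 1].
    static int[] discretize(double[] values, int bins) {
        double[] scaled = normalize(values);
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // The maximum scales to exactly 1.0, so clamp it into the last bin.
            out[i] = Math.min((int) (scaled[i] * bins), bins - 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] cholesterol = {233, 286, 229}; // values from the ARFF example
        System.out.println(Arrays.toString(normalize(cholesterol)));
        System.out.println(Arrays.toString(discretize(cholesterol, 3)));
    }
}
```

Both functions map a column to a new column of the same length, which is the general shape of an attribute filter: the dataset goes in, a transformed dataset with the same instances comes out.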

Page 9:

Import Datasets into WEKA

Page 15:

Building Classifiers

Page 16:

Explorer: building “classifiers”
- Classifiers in WEKA are models for predicting nominal or numeric quantities
- Implemented learning schemes include:
  - Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
- “Meta”-classifiers include:
  - Bagging, boosting, stacking, etc.
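To show what "a model for predicting a nominal quantity" means in practice, here is a minimal sketch of one of the listed scheme types, an instance-based (1-nearest-neighbor) classifier. It is written in plain Java and is not WEKA's implementation; the class NearestNeighborSketch is a hypothetical name, and the training rows reuse the age and cholesterol values from the ARFF example.

```java
// Hypothetical helper for illustration only; not WEKA's implementation.
class NearestNeighborSketch {
    private final double[][] trainX; // numeric attribute values, one row per instance
    private final String[] trainY;   // nominal class label of each instance

    // "Training" an instance-based classifier just stores the labeled instances.
    NearestNeighborSketch(double[][] trainX, String[] trainY) {
        this.trainX = trainX;
        this.trainY = trainY;
    }

    // Predict the label of the stored instance closest to the query
    // (squared Euclidean distance; ties go to the earlier instance).
    String classify(double[] query) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < trainX.length; i++) {
            double dist = 0.0;
            for (int j = 0; j < query.length; j++) {
                double diff = trainX[i][j] - query[j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return trainY[best];
    }

    public static void main(String[] args) {
        // Columns: age, cholesterol.
        double[][] x = {{63, 233}, {67, 286}, {67, 229}};
        String[] y = {"not_present", "present", "present"};
        NearestNeighborSketch model = new NearestNeighborSketch(x, y);
        System.out.println(model.classify(new double[]{65, 280}));
    }
}
```

Because the distance sums raw attribute differences, an attribute with a larger numeric range (here cholesterol) dominates the result, which is one reason normalization is commonly applied before distance-based schemes.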

Page 32:

To Do List
- Try Decision Tree, Naïve Bayes, Logistic Regression, and Support Vector Machine classifiers on a CiteSeerX dataset
  - The dataset contains titles and abstracts of Computer Science papers that are available in the CiteSeer digital library.
  - The class for each example in the dataset is the topic of the paper. There are six possible classes.
- The dataset is available in ARFF format.
- Use various model parameters!