RapidMiner

http://rapid-i.com/
Open-Source Data Mining with the Java Software RapidMiner

“RapidMiner is the world-wide leading open-source data mining solution due to the combination of its leading-edge technologies and its functional range. Applications of RapidMiner cover a wide range of real-world data mining tasks.”

lab 05‐risk.xls

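We use k-medoids because k-means only works with numerical attributes: it represents each cluster by the mean of its members, which is undefined for nominal values. k-medoids instead uses an actual data point (the medoid) as the cluster representative and needs only a pairwise dissimilarity measure.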
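To make the difference concrete, here is a minimal sketch of k-medoids on records that mix numeric and nominal fields. It is not RapidMiner's implementation: it uses the simple alternating (assign, then re-pick medoid) variant rather than PAM, a Gower-style dissimilarity chosen for illustration, and a naive initialisation with the first k records. The records themselves are made up and are not taken from the lab data file.

```python
def make_mixed_distance(records):
    """Build a Gower-style dissimilarity for records with numeric and nominal fields."""
    ranges = []
    for column in zip(*records):
        if all(isinstance(v, (int, float)) for v in column):
            # Numeric column: remember its range (fall back to 1.0 if constant).
            ranges.append((max(column) - min(column)) or 1.0)
        else:
            ranges.append(None)  # nominal column

    def distance(a, b):
        total = 0.0
        for x, y, value_range in zip(a, b, ranges):
            if value_range is None:
                total += 0.0 if x == y else 1.0      # simple matching
            else:
                total += abs(x - y) / value_range    # range-normalised difference
        return total

    return distance


def k_medoids(records, k, distance, max_iter=100):
    """Alternating k-medoids; returns (medoid indices, cluster label per record)."""
    medoids = list(range(k))  # assumption: naive initialisation with the first k records
    for _ in range(max_iter):
        # Assignment step: attach every record to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, record in enumerate(records):
            nearest = min(medoids, key=lambda m: distance(record, records[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member with the lowest total
        # dissimilarity to the other members of its cluster.
        new_medoids = []
        for old, members in clusters.items():
            if not members:
                new_medoids.append(old)  # keep the old medoid for an empty cluster
                continue
            new_medoids.append(min(
                members,
                key=lambda c: sum(distance(records[c], records[j]) for j in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    labels = [
        min(range(k), key=lambda c: distance(record, records[medoids[c]]))
        for record in records
    ]
    return medoids, labels


# Hypothetical records: (age, income in thousands, marital status).
records = [
    (25, 20, "single"),
    (30, 24, "single"),
    (28, 18, "single"),
    (55, 60, "married"),
    (60, 70, "married"),
    (52, 66, "married"),
]
medoids, labels = k_medoids(records, k=2, distance=make_mixed_distance(records))
print(medoids, labels)
```

No mean is ever computed, so the nominal third field causes no difficulty; k-means would have no way to average "single" and "married".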

k‐means example

NAME                 Calories  Protein  Fat  Calcium  Iron  Label
BEEF BRAISED              340       20   28        9   2.6      1
HAMBURGER                 245       21   17        9   2.7      1
BEEF ROAST                420       15   39        7     2      1
BEEF STEAK                375       19   32        9   2.6      1
BEEF CANNED               180       22   10       17   3.7      1
CHICKEN BROILED           115       20    3        8   1.4      2
CHICKEN CANNED            170       25    7       12   1.5      2
BEEF HEART                160       26    5       14   5.9      3
LAMB LEG ROAST            265       20   20        9   2.6      1
LAMB SHOULDER ROAST       300       18   25        9   2.3      1
SMOKED HAM                340       20   28        9   2.5      1
PORK ROAST                340       19   29        9   2.5      1
PORK SIMMERED             355       19   30        9   2.4      1
BEEF TONGUE               205       18   14        7   2.5      1
VEAL CUTLET               185       23    9        9   2.7      1
BLUEFISH BAKED            135       22    4       25   0.6      2
CLAMS RAW                  70       11    1       82     6      3
CLAMS CANNED               45        7    1       74   5.4      3
CRABMEAT CANNED            90       14    2       38   0.8      2
HADDOCK FRIED             135       16    5       15   0.5      2
MACKEREL BROILED          200       19   13        5     1      2
MACKEREL CANNED           155       16    9      157   1.8      3
PERCH FRIED               195       16   11       14   1.3      2
SALMON CANNED             120       17    5      159   0.7      3
SARDINES CANNED           180       22    9      367   2.5      3
TUNA CANNED               170       25    7        7   1.2      2
SHRIMP CANNED             110       23    1       98   2.6      3
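As a rough illustration of what k-means does with numeric data like this, the sketch below runs a plain Lloyd-style k-means on six rows copied from the table (Calories, Protein, Fat, Calcium, Iron). It is a hand-written sketch, not the RapidMiner operator: the choice of rows, k = 2, and the initialisation with the first k points are assumptions made to keep the example small. The attributes are left unscaled, so Calories and Calcium dominate the distances.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100):
    """Plain k-means; returns (centroids, cluster label per point)."""
    centroids = [list(p) for p in points[:k]]  # assumption: first k points as seeds
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Six rows from the table: (Calories, Protein, Fat, Calcium, Iron).
names = ["BEEF ROAST", "BEEF STEAK", "PORK SIMMERED",
         "CLAMS RAW", "CLAMS CANNED", "CRABMEAT CANNED"]
foods = [
    (420, 15, 39, 7, 2),
    (375, 19, 32, 9, 2.6),
    (355, 19, 30, 9, 2.4),
    (70, 11, 1, 82, 6),
    (45, 7, 1, 74, 5.4),
    (90, 14, 2, 38, 0.8),
]
centroids, labels = k_means(foods, k=2)
for name, label in zip(names, labels):
    print(f"{name}: cluster {label}")
```

On this subset the high-calorie meats end up in one cluster and the low-calorie shellfish in the other. Normalising the attributes first would give Protein and Iron more influence on the result.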

DBSCAN example

Labeled data

Results with k‐means

DBSCAN
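For readers without the slides' figures, the following is a minimal sketch of the DBSCAN idea: a point with at least `min_pts` neighbours within radius `eps` (counting itself) is a core point, clusters grow outward from core points, and anything unreachable is labelled noise. The points, `eps` and `min_pts` below are invented for illustration and are not the data set used in the lab.

```python
import math
from collections import deque

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return labels


# Hypothetical 2-D points: two dense groups and one isolated point.
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (8, 8), (8, 9), (9, 8), (9, 9),
    (5, 15),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike k-means, the number of clusters is not given in advance, and the isolated point is reported as noise instead of being forced into the nearest cluster.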

References

Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber (Morgan Kaufmann ‐ 2000)

Data Mining: Introductory and Advanced Topics, Margaret Dunham (Prentice Hall, 2002)

A Tutorial on Clustering Algorithms http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/index.html

Clustering Web Search Results, Iwona Białynicka‐Birula, http://www.di.unipi.it/~iwona/Clustering.ppt

Solutions nearly always come from the direction you least expect, which means there's no point in trying to look in that direction because it won't be coming from there.

Douglas Adams