Post on 07-Jul-2020


AI Programming – Data Mining (Plug in Weka to Eclipse)

• Review of Identification Tree
• Run bouncing ball in Weka
• Run bouncing ball in Eclipse

What about Color, Weight, and Rubber? Write down the formula for each.

Color: 0.69
Weight: 0.94
Rubber: 0.61
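The three values above agree with the weighted average entropy (average disorder) of the Bounce label after splitting the eight-ball table from these slides on each attribute: for an attribute A, sum over its values v of (n_v / n) × entropy(labels with A = v). The following is a minimal sketch under that assumption; the function names `entropy` and `average_disorder` are illustrative, not taken from the slides.

```python
from collections import Counter
from math import log2

# Bouncing ball table from the slides: (Size, Color, Weight, Rubber, Bounce)
ATTRIBUTES = ["Size", "Color", "Weight", "Rubber"]
BALLS = [
    ("Small", "Green", "Light", "Yes", "Yes"),
    ("Small", "Blue", "Medium", "No", "No"),
    ("Medium", "Red", "Medium", "No", "No"),
    ("Small", "Red", "Medium", "Yes", "Yes"),
    ("Large", "Green", "Heavy", "Yes", "Yes"),
    ("Medium", "Blue", "Heavy", "Yes", "No"),
    ("Medium", "Green", "Heavy", "Yes", "No"),
    ("Small", "Red", "Light", "No", "No"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def average_disorder(rows, column):
    """Entropy of the class label, averaged over the branches of one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


disorder = {name: average_disorder(BALLS, i) for i, name in enumerate(ATTRIBUTES)}
for name, value in disorder.items():
    print(f"{name}: {value:.2f}")
```

Under this reading, the attribute with the smallest average disorder is the one chosen for the split.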

For the case of Size = small, continue to split this node.

What about the other two cases? Should they be split or not, and why?
- Medium?
- Large?

Is the splitting finished? Why?
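One way to check these questions numerically is to compute the entropy of each Size branch and then repeat the disorder calculation on the Size = Small subset only. This is a sketch that assumes the eight-ball table from these slides and the weighted average entropy measure; the helper names are illustrative.

```python
from collections import Counter
from math import log2

# Bouncing ball table from the slides: (Size, Color, Weight, Rubber, Bounce)
ATTRIBUTES = ["Size", "Color", "Weight", "Rubber"]
BALLS = [
    ("Small", "Green", "Light", "Yes", "Yes"),
    ("Small", "Blue", "Medium", "No", "No"),
    ("Medium", "Red", "Medium", "No", "No"),
    ("Small", "Red", "Medium", "Yes", "Yes"),
    ("Large", "Green", "Heavy", "Yes", "Yes"),
    ("Medium", "Blue", "Heavy", "Yes", "No"),
    ("Medium", "Green", "Heavy", "Yes", "No"),
    ("Small", "Red", "Light", "No", "No"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def average_disorder(rows, column):
    """Entropy of the class label, averaged over the branches of one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


# Entropy of the Bounce label inside each Size branch (0 means the branch is pure).
branch_entropy = {
    size: entropy([row[-1] for row in BALLS if row[0] == size])
    for size in ("Small", "Medium", "Large")
}

# Disorder of the remaining attributes, measured on the Size = Small balls only.
small = [row for row in BALLS if row[0] == "Small"]
small_disorder = {
    name: average_disorder(small, i)
    for i, name in enumerate(ATTRIBUTES)
    if name != "Size"
}

print(branch_entropy)
print(small_disorder)
```

A branch with entropy 0 contains only one class and needs no further split; a branch whose best remaining attribute reaches disorder 0 is fully separated after one more split.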

Homework

1. Read the article decision-tree-article.
2. Generate an ID tree for the bouncing ball dataset, ball.arff, using Weka. (Create the ball.arff file correctly first, then load it in Weka.)

Running Weka

• This is the Weka startup screen.
• Click Explorer.

Running Weka

• From this screen you can open data and run data mining.

Select the attribute with the maximum information gain for splitting; here that is 'outlook'.
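As a rough check of this choice, the sketch below computes information gain (entropy of the class before the split minus the weighted entropy after it) for the two nominal attributes of the weather data listed on the Weather.arff slide. It deliberately skips the numeric attributes temperature and humidity, which would need a threshold search; the function names are illustrative.

```python
from collections import Counter
from math import log2

# (outlook, windy, play) taken from the weather data in these slides;
# the numeric columns temperature and humidity are left out of this sketch.
WEATHER = [
    ("sunny", "FALSE", "no"), ("sunny", "TRUE", "no"),
    ("overcast", "FALSE", "yes"), ("rainy", "FALSE", "yes"),
    ("rainy", "FALSE", "yes"), ("rainy", "TRUE", "no"),
    ("overcast", "TRUE", "yes"), ("sunny", "FALSE", "no"),
    ("sunny", "FALSE", "yes"), ("rainy", "FALSE", "yes"),
    ("sunny", "TRUE", "yes"), ("overcast", "TRUE", "yes"),
    ("overcast", "FALSE", "yes"), ("rainy", "TRUE", "no"),
]
NOMINAL = ["outlook", "windy"]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Class entropy before the split minus the weighted entropy after it."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - after


gain = {name: information_gain(WEATHER, i) for i, name in enumerate(NOMINAL)}
for name, value in gain.items():
    print(f"{name}: {value:.3f}")
```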

Apply ID3 to each child node of this root until a leaf node, or a node with entropy = 0, is reached.
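The recursion can be sketched as follows. To keep the example free of numeric attributes, it runs on the bouncing ball table from these slides instead of the weather data. Choosing the attribute with the lowest weighted entropy is equivalent to choosing the one with the highest information gain, because the parent entropy is the same for every candidate. The nested-dict tree representation and the function names are assumptions of this sketch, not Weka's implementation.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Size", "Color", "Weight", "Rubber"]
TARGET = "Bounce"
RAW = [
    ("Small", "Green", "Light", "Yes", "Yes"),
    ("Small", "Blue", "Medium", "No", "No"),
    ("Medium", "Red", "Medium", "No", "No"),
    ("Small", "Red", "Medium", "Yes", "Yes"),
    ("Large", "Green", "Heavy", "Yes", "Yes"),
    ("Medium", "Blue", "Heavy", "Yes", "No"),
    ("Medium", "Green", "Heavy", "Yes", "No"),
    ("Small", "Red", "Light", "No", "No"),
]
BALLS = [dict(zip(ATTRIBUTES + [TARGET], row)) for row in RAW]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def average_disorder(rows, attribute):
    """Weighted entropy of the target after splitting on one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[TARGET])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


def id3(rows, attributes):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    labels = [row[TARGET] for row in rows]
    if len(set(labels)) == 1:      # entropy = 0: stop with a leaf
        return labels[0]
    if not attributes:             # nothing left to split on: majority label
        return Counter(labels).most_common(1)[0][0]
    best = min(attributes, key=lambda a: average_disorder(rows, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining)
    return {best: branches}


tree = id3(BALLS, ATTRIBUTES)
print(tree)
```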

@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no

Weather.arff

Creating a dataset

• As an example, let's create a bouncing ball dataset and run data mining on it.

Ball  Size    Color  Weight  Rubber?  Result (Bounce?)
1     Small   Green  Light   Yes      Yes
2     Small   Blue   Medium  No       No
3     Medium  Red    Medium  No       No
4     Small   Red    Medium  Yes      Yes
5     Large   Green  Heavy   Yes      Yes
6     Medium  Blue   Heavy   Yes      No
7     Medium  Green  Heavy   Yes      No
8     Small   Red    Light   No       No

Dataset format

• Weka recommends files in ARFF format.

@relation BounceBall
@attribute Size {Small,Medium,Large}
@attribute Color {Red,Blue,Green}
@attribute Weight {Light,Medium,Heavy}
@attribute Rubber {Yes,No}
@attribute Bounce {Yes,No}
@data
Small,Green,Light,Yes,Yes

• @relation: the name of the dataset.
• @attribute: the attributes. Nominal attributes are enclosed in {} and comma-separated; numeric attributes are declared as numeric, integer, or real; string attributes as string, and so on.
• @data: list the data below this line in CSV format.

Notes:
• You may also refer to the samples in the data folder.
• Weka is case-sensitive.
• CSV files can also be loaded, but this is not recommended.
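For the homework step of generating ball.arff, one option is to build the file text from the table programmatically rather than typing it by hand. The following is a minimal sketch for nominal attributes only; the helper name `to_arff` is hypothetical and is not part of Weka.

```python
# Attribute declarations and rows follow the bouncing ball table in these slides.
ATTRIBUTES = [
    ("Size", ["Small", "Medium", "Large"]),
    ("Color", ["Red", "Blue", "Green"]),
    ("Weight", ["Light", "Medium", "Heavy"]),
    ("Rubber", ["Yes", "No"]),
    ("Bounce", ["Yes", "No"]),
]
ROWS = [
    ("Small", "Green", "Light", "Yes", "Yes"),
    ("Small", "Blue", "Medium", "No", "No"),
    ("Medium", "Red", "Medium", "No", "No"),
    ("Small", "Red", "Medium", "Yes", "Yes"),
    ("Large", "Green", "Heavy", "Yes", "Yes"),
    ("Medium", "Blue", "Heavy", "Yes", "No"),
    ("Medium", "Green", "Heavy", "Yes", "No"),
    ("Small", "Red", "Light", "No", "No"),
]


def to_arff(relation, attributes, rows):
    """Build ARFF text for a dataset whose attributes are all nominal."""
    lines = ["@relation " + relation]
    for name, values in attributes:
        lines.append("@attribute " + name + " {" + ",".join(values) + "}")
    lines.append("@data")
    lines.extend(",".join(row) for row in rows)
    return "\n".join(lines) + "\n"


arff_text = to_arff("BounceBall", ATTRIBUTES, ROWS)
print(arff_text)
```

Saving `arff_text` to a file named ball.arff gives a file that can be opened from the Preprocess tab.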

Loading the dataset

• Open the Preprocess tab and click Open file.

Loading the dataset

• Sample datasets are located in (installation directory)\data.

Loading the dataset

• If the file format is wrong, Weka reports an error when loading it; correct the part of the file that is wrong.
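Two common causes of such errors are a value whose capitalization differs from its declaration and a row with the wrong number of values. The sketch below checks a nominal-only ARFF text for both before loading it. It is a hypothetical helper, not Weka's own parser, and does not reproduce Weka's error messages; the sample text is an abbreviated, deliberately broken version of the ball data.

```python
import re

SAMPLE = """\
@relation BounceBall
@attribute Size {Small,Medium,Large}
@attribute Rubber {Yes,No}
@attribute Bounce {Yes,No}
@data
Small,Yes,Yes
small,No,No
Medium,Yes
"""


def check_arff(text):
    """Return a list of problems found in a nominal-only ARFF text."""
    attributes = []   # (name, allowed values) in declaration order
    problems = []
    in_data = False
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("%"):   # blank line or ARFF comment
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            if len(values) != len(attributes):
                problems.append(
                    f"line {number}: expected {len(attributes)} values, got {len(values)}"
                )
                continue
            for (name, allowed), value in zip(attributes, values):
                # "?" marks a missing value; everything else must match exactly.
                if value != "?" and value not in allowed:
                    problems.append(
                        f"line {number}: '{value}' is not a declared value of {name}"
                    )
        elif line.lower().startswith("@attribute"):
            match = re.match(r"@attribute\s+(\S+)\s+\{(.*)\}$", line, re.IGNORECASE)
            if match is None:
                problems.append(f"line {number}: not a nominal attribute declaration")
            else:
                allowed = [v.strip() for v in match.group(2).split(",")]
                attributes.append((match.group(1), allowed))
        elif line.lower().startswith("@data"):
            in_data = True
    return problems


problems = check_arff(SAMPLE)
for problem in problems:
    print(problem)
```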

Running Weka

• Open the Classify tab and click Choose.

Choosing the algorithm

• A list of algorithms is displayed; select RandomTree in the trees folder.

Choosing the evaluation method

• The default setting is Cross-validation; change it to Use training set (all of the data is used as training data).

Running Weka

• Click Start to run the analysis.

Running Weka

• When the result appears on the right, right-click the entry in the result list and select Visualize tree.

Analysis with J48

• Result of the analysis with the J48 algorithm

Analysis with RandomTree

• Result of the analysis with the RandomTree algorithm

Predicting new data

• Add the unknown instance to the dataset. Write ? for the unknown value.

Additional settings

• Open More options and change Output predictions to PlainText.

Prediction result

• After running the mining, the unknown value was predicted as Yes.

ID Trees to Rules

Once an ID tree has been constructed, it can be used to generate a rule set that performs the same classifications as the tree. This is done by creating a single rule for each path from the root to a leaf in the ID tree.

R1: if (size = large) then (ball does bounce)
R2: if (size = medium) then (ball does not bounce)
R3: if (size = small) and (rubber = no) then (ball does not bounce)
R4: if (size = small) and (rubber = yes) then (ball does bounce)

Refined Rules

Refined rule set (R3 no longer tests size):

R1: if (size = large) then (ball does bounce)
R2: if (size = medium) then (ball does not bounce)
R3: if (rubber = no) then (ball does not bounce)
R4: if (size = small) and (rubber = yes) then (ball does bounce)

Original rule set:

R1: if (size = large) then (ball does bounce)
R2: if (size = medium) then (ball does not bounce)
R3: if (size = small) and (rubber = no) then (ball does not bounce)
R4: if (size = small) and (rubber = yes) then (ball does bounce)

Rules are used in rule-based (forward chaining or backward chaining) systems.
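A quick way to check that the refined rule set still agrees with the training data is to apply the rules to all eight balls. The sketch below uses a simple first-match evaluation, which is an assumption made for brevity and not a full forward- or backward-chaining engine; the data structures and the `classify` helper are illustrative.

```python
# Refined rules: (conditions, conclusion for Bounce), tried in order R1..R4.
REFINED_RULES = [
    ({"Size": "Large"}, "Yes"),                    # R1
    ({"Size": "Medium"}, "No"),                    # R2
    ({"Rubber": "No"}, "No"),                      # R3 (size test dropped)
    ({"Size": "Small", "Rubber": "Yes"}, "Yes"),   # R4
]

COLUMNS = ["Size", "Color", "Weight", "Rubber", "Bounce"]
RAW = [
    ("Small", "Green", "Light", "Yes", "Yes"),
    ("Small", "Blue", "Medium", "No", "No"),
    ("Medium", "Red", "Medium", "No", "No"),
    ("Small", "Red", "Medium", "Yes", "Yes"),
    ("Large", "Green", "Heavy", "Yes", "Yes"),
    ("Medium", "Blue", "Heavy", "Yes", "No"),
    ("Medium", "Green", "Heavy", "Yes", "No"),
    ("Small", "Red", "Light", "No", "No"),
]
BALLS = [dict(zip(COLUMNS, row)) for row in RAW]


def classify(ball, rules):
    """Return the conclusion of the first rule whose conditions all hold."""
    for conditions, conclusion in rules:
        if all(ball[attribute] == value for attribute, value in conditions.items()):
            return conclusion
    return None   # no rule fired


predictions = [classify(ball, REFINED_RULES) for ball in BALLS]
matches = [p == ball["Bounce"] for p, ball in zip(predictions, BALLS)]
print(f"{sum(matches)} of {len(BALLS)} training balls classified correctly")
```

With first-match evaluation the rule order matters: a large ball with rubber = no, a combination that does not occur in the table, would be caught by R1 before R3 is tried.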

ball.csv

@relation BounceBall
@attribute Size {Small,Medium,Large}
@attribute Color {Red,Blue,Green}
@attribute Weight {Light,Medium,Heavy}
@attribute Rubber {Yes,No}
@attribute Bounce {Yes,No}
@data
Small,Green,Light,Yes,Yes
Small,Blue,Medium,No,No
Medium,Red,Medium,No,No
Small,Red,Medium,Yes,Yes
Large,Green,Heavy,Yes,Yes
Medium,Blue,Heavy,Yes,No
Medium,Green,Heavy,Yes,No
Small,Red,Light,No,No

ball.arff