dmkdd-5
7/24/2019 DMKDD-5
1/28
Knowledge Discovery in Databases (IS704)
and Data Mining (CS704)
Lecture #5: Classification (Part 1)
Gunawan
Department of Informatics Engineering
Sekolah Tinggi Teknik Surabaya
Revised 23 August 2015
November 23, 2015 Gunawan, Teknik Informatika STTS 2
Classification (1)
Before going too far, it is worth looking at a few captured slides that help in understanding the basic nature of Machine Learning.
Source: http://www.cs.ucr.edu/~eamonn/205/MachineLearning.ppt
- "The Pigeon Problem"
Do not forget that Data Mining overlaps with one of the subdisciplines of Artificial Intelligence:
- Machine Learning.
Classification (2)
Belongs to the Data Mining task: Predictive Modeling.
Given a set of examples (instances) that have been pre-classified (labelled beforehand), {p1, t1}, {p2, t2}, ..., {pn, tn}, construct a model that distils the knowledge in those examples and can be used for prediction, that is, to predict the class/target of an instance that has not been labelled yet.
- pi is the input vector, while ti is the associated target (class).
- The examples used to build the model are called the training set.
Classification (3)
The success of a classifier is usually measured by testing it on new data (fresh data) that was not used as the training set, but whose labels are also known in advance.
The proportion of data/instances that are classified correctly (the accuracy) is then computed.
Terms such as 80:20, 75:25, or 70:30 are common; they give the percentage ratio (training set : testing set).
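The split-then-score procedure described above can be sketched as follows. This is only an illustration: the helper names `holdout_split` and `accuracy`, the toy data, and the hand-written `model` are assumptions made for this example and do not come from the lecture.

```python
import random

def holdout_split(examples, train_fraction=0.8, seed=0):
    """Shuffle labelled examples and split them into (training set, testing set)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, test_set):
    """Proportion of test instances whose predicted class equals the known label."""
    correct = sum(1 for p, t in test_set if model(p) == t)
    return correct / len(test_set)

# Hypothetical toy data: the input vector p holds one number, the target t is its sign.
examples = [((x,), "pos" if x >= 0 else "neg") for x in range(-10, 10)]
train, test = holdout_split(examples, train_fraction=0.8)  # an 80:20 split

# A hand-written stand-in for a learned model (no training happens here).
model = lambda p: "pos" if p[0] >= 0 else "neg"
print(len(train), len(test), accuracy(model, test))
```

Changing `train_fraction` to 0.75 or 0.7 gives the 75:25 and 70:30 variants.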
Classification (4)
A definition that makes the role of classification in database mining clearer:
- Given a database P = {p1, p2, ..., pn} of tuples (or items / records / examples / instances) and a set of classes T = {t1, t2, ..., tm}, the classification problem is to define a mapping f : P → T that assigns each pi to one of the classes tj.
Although other models exist (e.g. Naïve Bayes), the model that defines this mapping is usually a variant of:
- Classification rules
- Classification tree
Classification Example
Given the following examples:
Build a model that can predict the class of the instance outlook=sunny, temperature=hot, humidity=low, and windy=true.
Issues with the Types of Attributes Being Classified
There are 2 (two) main types of attribute:
- Numeric - the values are numbers.
- Nominal - the values belong to a limited number of possibilities that are specified in advance.
Including the Target Attribute
Input attributes and target attribute, all numeric:

       Cycle       Main memory (Kb)   Cache    Channels        Performance
       time (ns)   Min      Max       (Kb)     Min     Max
       MYCT        MMIN     MMAX      CACH     CHMIN   CHMAX   PRP
1      125         256      6000      256      16      128     198
2      29          8000     32000     32       8       32      269
...
207    125         2000     8000      0        2       14      52
208    480         512      8000      32       0       0       67
209    480         1000     4000      0        0       0       45

A "Not-So-Simple" Solution: Linear Regression
PRP = -55.9 + 0.0489 MYCT + 0.0153 MMIN + 0.0056 MMAX + 0.6410 CACH - 0.2700 CHMIN + 1.480 CHMAX
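To make the regression equation concrete, the sketch below evaluates it for one machine. The function name `predict_prp` is an assumption for this example, the CHMAX coefficient is read as 1.480, and the input values are simply one illustrative configuration.

```python
def predict_prp(myct, mmin, mmax, cach, chmin, chmax):
    """Evaluate the slide's linear regression equation for relative performance (PRP)."""
    return (-55.9
            + 0.0489 * myct
            + 0.0153 * mmin
            + 0.0056 * mmax
            + 0.6410 * cach
            - 0.2700 * chmin
            + 1.480 * chmax)  # assumption: CHMAX coefficient read as 1.480

# Illustrative input: MYCT=125 ns, MMIN=256 Kb, MMAX=6000 Kb, CACH=256 Kb, CHMIN=16, CHMAX=128
estimate = predict_prp(125, 256, 6000, 256, 16, 128)
print(round(estimate, 2))
```

The estimate is a weighted sum of all six inputs plus a constant, which is why this kind of model is harder to read at a glance than a small rule set.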
1R
Introduced by Robert C. Holte (1993). (*)
Also known as "1R-Holte".
1R itself is short for "1-rule" or "Inferring Rudimentary Rule".
The output of the 1R algorithm is a one-level decision tree, which can also be presented as a set of classification rules.
1R does not insist on perfect accuracy (100% correct) for the rule sets it produces.
1R has never been regarded as one of the formal approaches in Machine Learning or Data Mining.
Still, it is worth remembering: why bother with a complex decision tree when a simple rule set can do the job?
(*) Holte, Robert C. 1993. Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11:63-90. Holte works at the Computer Science Department, University of Ottawa.
The Original 1R Algorithm
Source: The Development of Holte's 1R Classifier, Craig Nevill-Manning, Geoffrey Holmes, and Ian H. Witten. This is not the 1R paper written by Holte, but a paper that develops Holte's algorithm.
The 1R Algorithm
FOR EACH attribute
    FOR EACH value of this attribute, build a rule set as follows:
        Count how often each class occurs for this attribute=value pair
        Find the class that occurs most often for this attribute value
        Add the rule IF attribute = value THEN class to the rule set
    Compute the error rate of this rule set
Choose the rule set with the smallest error rate
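The pseudocode above can be sketched in Python as follows, applied to the weather data that appears in this deck. The function name `one_r` and the tie-breaking choice (the first attribute with the lowest error count wins) are assumptions of this sketch, not part of the original algorithm description.

```python
from collections import Counter, defaultdict

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Windy"]
# Weather data from this deck; the last column is the class (Play).
ROWS = [r.split() for r in """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
""".strip().splitlines()]

def one_r(rows, attributes):
    """Return (best attribute, its rule set, error counts for every attribute)."""
    errors, rule_sets = {}, {}
    for i, attr in enumerate(attributes):
        # Count how often each class occurs for every value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[i]][row[-1]] += 1
        # One rule per value: IF attr = value THEN most frequent class.
        rule_sets[attr] = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Errors: instances outside the majority class of their attribute value.
        errors[attr] = sum(sum(c.values()) - c.most_common(1)[0][1]
                           for c in counts.values())
    best = min(attributes, key=lambda a: errors[a])  # ties: first attribute wins
    return best, rule_sets[best], errors

best, rules, errors = one_r(ROWS, ATTRIBUTES)
print(errors)
print(best, rules)
```

On this data two attributes tie for the lowest error count, so a different tie-breaking policy could legitimately return the other one.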
Weather Problem Datasets
Outlook   Temperature  Humidity  Windy   Play
Sunny     Hot          High      Weak    No
Sunny     Hot          High      Strong  No
Overcast  Hot          High      Weak    Yes
Rain      Mild         High      Weak    Yes
Rain      Cool         Normal    Weak    Yes
Rain      Cool         Normal    Strong  No
Overcast  Cool         Normal    Strong  Yes
Sunny     Mild         High      Weak    No
Sunny     Cool         Normal    Weak    Yes
Rain      Mild         Normal    Weak    Yes
Sunny     Mild         Normal    Strong  Yes
Overcast  Mild         High      Strong  Yes
Overcast  Hot          Normal    Weak    Yes
Rain      Mild         High      Strong  No
1R-Holte for the Weather Problem
Mark * = random choice
PRISM
Introduced by J. Cendrowska (1987). (*)
It belongs to the category of covering algorithms, unlike ID3, which belongs to the category of divide-and-conquer algorithms.
It is called a covering approach because at every stage a rule is identified that covers a number of instances.
The output of the PRISM algorithm is a set of classification rules.
PRISM only produces rules that are perfect, i.e. 100% correct.
Principle of Covering Algorithms
Figure: space of all instances/examples; the region covered by the rules so far; rules to be added later.
Classification rules for the same problem:
IF x <= 1.2 THEN class = b
IF x > 1.2 AND y > 2.6 THEN class = a
IF x > 1.2 AND y <= 2.6 THEN class = b
What about its decision tree?
(*) Cendrowska, J. (1987). PRISM: an Algorithm for Inducing Modular Rules. International Journal of Man-Machine Studies, Vol. 27, No. 4, pp. 349-370.
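As a small illustration of the question about the decision tree, the sketch below writes the three rules once as independent rules and once as a two-level tree. The thresholds 1.2 and 2.6 are this example's reading of the slide and should be treated as assumptions, as are the function names.

```python
def classify_rules(x, y):
    """Apply the three rules one by one; each rule can be read on its own."""
    if x <= 1.2:
        return "b"
    if x > 1.2 and y > 2.6:
        return "a"
    if x > 1.2 and y <= 2.6:
        return "b"

def classify_tree(x, y):
    """The same decision boundaries written as a two-level decision tree."""
    if x <= 1.2:            # root node tests x
        return "b"
    return "a" if y > 2.6 else "b"   # second level tests y

# Hypothetical sample points, one per region.
for point in [(1.0, 3.0), (2.0, 3.0), (2.0, 2.0)]:
    print(point, classify_rules(*point), classify_tree(*point))
```

For this problem the rules and the tree are equivalent; the difference is that each rule carries its full condition, while the tree shares the test on x among its branches.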
The PRISM Algorithm
FOR EACH class C
    Initialise E with the instance set
    WHILE E contains instances of class C
        Create a rule R with an empty LHS that predicts class C
        UNTIL R is perfect (or no attributes are left to use), do:
            FOR EACH attribute A not yet in R, and each value v,
                Consider adding the condition A=v to the LHS of R
            Choose A and v to maximise the accuracy p/t
            (tip: choose the condition with the largest p)
            Add A=v to R
        Remove all instances covered by R from E
Notes:
p = positive examples of a class
t = total instances
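The pseudocode above can be sketched in Python as follows, using the contact lens data that appears in this deck. The names `prism` and `covers`, the order in which attributes are tried, and the reading of the tip as a tie-break (equal p/t is resolved by the larger p, then by first occurrence) are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product

ATTRS = ["age", "spectacle prescription", "astigmatism", "tear production rate"]
VALUES = [["young", "pre-presbyopic", "presbyopic"], ["myope", "hypermetrope"],
          ["no", "yes"], ["reduced", "normal"]]
LABELS = ("none soft none hard none soft none hard none soft none hard "
          "none soft none none none none none hard none soft none none").split()
# Each instance: (dict attribute -> value, class), in the order of the deck's table.
DATA = [(dict(zip(ATTRS, combo)), label)
        for combo, label in zip(product(*VALUES), LABELS)]

def covers(rule, x):
    """True if instance x satisfies every condition of the rule."""
    return all(x[a] == v for a, v in rule.items())

def prism(data, attrs):
    rules = []
    for cls in sorted({c for _, c in data}):
        E = list(data)                        # initialise E with the instance set
        while any(c == cls for _, c in E):
            rule, covered = {}, E             # rule with an empty LHS
            # Refine until the rule is perfect or no attribute is left.
            while any(c != cls for _, c in covered) and len(rule) < len(attrs):
                best = None
                for a in attrs:
                    if a in rule:
                        continue
                    for v in sorted({x[a] for x, _ in covered}):
                        sub = [(x, c) for x, c in covered if x[a] == v]
                        p = sum(c == cls for _, c in sub)
                        score = (Fraction(p, len(sub)), p)  # p/t, then larger p
                        if best is None or score > best[0]:
                            best = (score, a, v, sub)
                _, a, v, covered = best
                rule[a] = v
            rules.append((dict(rule), cls))
            E = [(x, c) for x, c in E if not covers(rule, x)]  # drop covered instances
    return rules

rules = prism(DATA, ATTRS)
for cond, cls in rules:
    print("IF", " AND ".join(f"{a} = {v}" for a, v in cond.items()), "THEN", cls)
```

Exact fractions are used for p/t so that ties such as 2/2 versus 3/3 compare as equal before the larger p decides.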
Contact Lens Dataset
Under which conditions will an optician advise someone with an eye disorder to use soft contact lenses, to use hard contact lenses, or not to use contact lenses at all?
Also introduced by J. Cendrowska.
Notes on terms:
- spectacle prescription = the known spectacle prescription
- myope = nearsighted
- hypermetrope = farsighted
- astigmatism = blurred vision, "lack of point focus"; an astigmatism refers to an irregular curvature of the cornea.
- presbyopic (>= about 45 years) = the usual age (45 years) from which the eye loses its ability to focus on objects at near distance
- pre-presbyopic (< about 45 years) = before the presbyopic period
Source of terms: WordNet; http://www.allaboutvision.com/askdoc/astigmatism.htm
Contact Lens Dataset
age             spectacle prescription  astigmatism  tear production rate  recommended lenses
young           myope                   no           reduced               none
young           myope                   no           normal                soft
young           myope                   yes          reduced               none
young           myope                   yes          normal                hard
young           hypermetrope            no           reduced               none
young           hypermetrope            no           normal                soft
young           hypermetrope            yes          reduced               none
young           hypermetrope            yes          normal                hard
pre-presbyopic  myope                   no           reduced               none
pre-presbyopic  myope                   no           normal                soft
pre-presbyopic  myope                   yes          reduced               none
pre-presbyopic  myope                   yes          normal                hard
pre-presbyopic  hypermetrope            no           reduced               none
pre-presbyopic  hypermetrope            no           normal                soft
pre-presbyopic  hypermetrope            yes          reduced               none
pre-presbyopic  hypermetrope            yes          normal                none
presbyopic      myope                   no           reduced               none
presbyopic      myope                   no           normal                none
presbyopic      myope                   yes          reduced               none
presbyopic      myope                   yes          normal                hard
presbyopic      hypermetrope            no           reduced               none
presbyopic      hypermetrope            no           normal                soft
presbyopic      hypermetrope            yes          reduced               none
presbyopic      hypermetrope            yes          normal                none
PRISM for Contact Lens (#1)
A rule is built to cover each class: hard, soft, and none.
Suppose we start with hard.
IF ? THEN recommended = hard
For the condition on the LHS, which is still empty, there are 9 choices:
1. age = young 2/8
2. age = pre-presbyopic 1/8
3. age = presbyopic 1/8
4. spectacle prescription = myope 3/12
5. spectacle prescription = hypermetrope 1/12
6. astigmatism = no 0/12
7. astigmatism = yes 4/12
8. tear production rate = reduced 0/12
9. tear production rate = normal 4/12
The largest fraction, 4/12, is chosen. One of number 7 and number 9 in the list above is picked at random, say number 7:
IF astigmatism = yes THEN recommended = hard
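The p/t values listed above can be recomputed from the contact lens table. The sketch below does so; the helper name `candidates` and the compact encoding of the table are assumptions made for this example.

```python
from itertools import product

ATTRS = ["age", "spectacle prescription", "astigmatism", "tear production rate"]
VALUES = [["young", "pre-presbyopic", "presbyopic"], ["myope", "hypermetrope"],
          ["no", "yes"], ["reduced", "normal"]]
LABELS = ("none soft none hard none soft none hard none soft none hard "
          "none soft none none none none none hard none soft none none").split()
# Each instance: (dict attribute -> value, class), in the order of the deck's table.
DATA = [(dict(zip(ATTRS, combo)), label)
        for combo, label in zip(product(*VALUES), LABELS)]

def candidates(data, cls, used=()):
    """List (attribute, value, p, t) for every condition on an attribute not yet used.

    p = instances with attribute=value that belong to cls, t = all instances
    with attribute=value.
    """
    out = []
    for a, values in zip(ATTRS, VALUES):
        if a in used:
            continue
        for v in values:
            classes = [c for x, c in data if x[a] == v]
            out.append((a, v, sum(c == cls for c in classes), len(classes)))
    return out

for i, (a, v, p, t) in enumerate(candidates(DATA, "hard"), start=1):
    print(f"{i}. {a} = {v}: {p}/{t}")
```

Passing only the instances already covered by the rule, together with the attributes already used, gives the candidate list for a later refinement step.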
PRISM for Contact Lens (#2)
The rule IF astigmatism = yes THEN recommended = hard is clearly not accurate. See the following table:
age             spectacle prescription  astigmatism  tear production rate  recommended lenses
young           myope                   yes          reduced               none
young           myope                   yes          normal                hard
young           hypermetrope            yes          reduced               none
young           hypermetrope            yes          normal                hard
pre-presbyopic  myope                   yes          reduced               none
pre-presbyopic  myope                   yes          normal                hard
pre-presbyopic  hypermetrope            yes          reduced               none
pre-presbyopic  hypermetrope            yes          normal                none
presbyopic      myope                   yes          reduced               none
presbyopic      myope                   yes          normal                hard
presbyopic      hypermetrope            yes          reduced               none
presbyopic      hypermetrope            yes          normal                none
The rule covers only 4 correct instances out of a total of 12 instances, so the rule needs to be refined:
IF astigmatism = yes AND ? THEN recommended = hard
PRISM for Contact Lens (#3)
IF astigmatism = yes AND ? THEN recommended = hard
For the condition on the LHS that is still empty, there are 7 choices:
1. age = young 2/4
2. age = pre-presbyopic 1/4
3. age = presbyopic 1/4
4. spectacle prescription = myope 3/6
5. spectacle prescription = hypermetrope 1/6
6. tear production rate = reduced 0/6
7. tear production rate = normal 4/6
The largest fraction, 4/6, is chosen, which is number 7:
IF astigmatism = yes AND tear production rate = normal THEN recommended = hard
For a given class (for example hard), the algorithm could in fact be forced to stop here in the contact lens case. But what if an exact rule must be obtained, no matter how complex the rule becomes?
PRISM for Contact Lens (#4)
The rule IF astigmatism = yes AND tear production rate = normal THEN recommended = hard is clearly still not accurate. See the following table:
age             spectacle prescription  astigmatism  tear production rate  recommended lenses
young           myope                   yes          normal                hard
young           hypermetrope            yes          normal                hard
pre-presbyopic  myope                   yes          normal                hard
pre-presbyopic  hypermetrope            yes          normal                none
presbyopic      myope                   yes          normal                hard
presbyopic      hypermetrope            yes          normal                none
The rule covers only 4 correct instances out of a total of 6 instances, so the rule needs to be refined again:
IF astigmatism = yes AND tear production rate = normal AND ? THEN recommended = hard
PRISM for Contact Lens (#5)
IF astigmatism = yes AND tear production rate = normal AND ? THEN recommended = hard
For the condition on the LHS that is still empty, there are 5 choices:
1. age = young 2/2
2. age = pre-presbyopic 1/2
3. age = presbyopic 1/2
4. spectacle prescription = myope 3/3
5. spectacle prescription = hypermetrope 1/3
The largest fraction must be chosen, but which one: 2/2 or 3/3? In this case it is "better to choose the one that covers more instances", i.e. number 4, so the rule so far is:
IF astigmatism = yes AND tear production rate = normal AND spectacle prescription = myope THEN recommended = hard
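The choice between 2/2 and 3/3 can be expressed as a two-part sort key: first the accuracy p/t, then the number of covered positives p. The snippet below is only a sketch of that tie-break, using the p and t values from the list above; the variable names are assumptions for this example.

```python
from fractions import Fraction

# (condition, p, t) as listed for this refinement step.
candidates = [
    ("age = young", 2, 2),
    ("age = pre-presbyopic", 1, 2),
    ("age = presbyopic", 1, 2),
    ("spectacle prescription = myope", 3, 3),
    ("spectacle prescription = hypermetrope", 1, 3),
]

# Compare by exact accuracy p/t first; equal accuracies are decided by the larger p.
best = max(candidates, key=lambda c: (Fraction(c[1], c[2]), c[1]))
print(best[0])
```

Both 2/2 and 3/3 have accuracy 1, so the second element of the key decides in favour of the condition that covers three instances.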
PRISM for Contact Lens (#6)
The rule IF astigmatism = yes AND tear production rate = normal AND spectacle prescription = myope THEN recommended = hard is now accurate. See the following table:
age             spectacle prescription  astigmatism  tear production rate  recommended lenses
young           myope                   yes          normal                hard
pre-presbyopic  myope                   yes          normal                hard
presbyopic      myope                   yes          normal                hard
The rule covers only 3 correct instances out of the total of 24 instances, and so far only 3 of the 4 instances with recommended = hard. Next, the 3 instances in the table above are removed from the 24 instances, and another rule is sought of the form:
IF ? THEN recommended = hard
PRISM for Contact Lens (#7)
Proceeding in the same way as in the previous stages, the following are obtained in turn:
age = young is the best choice for the first condition (covers 1 of 7). Why the number 7?
astigmatism = yes is the best choice for the second condition (1/3 is chosen). Is this choice due to the tip?
tear production rate = normal is the best choice for the third condition (1/1 is chosen).
The resulting rule, IF age = young AND astigmatism = yes AND tear production rate = normal THEN recommended = hard, is now accurate. Does it cover 1/1? Or 2/2?
All hard cases are now covered, so the next steps deal with the soft and none cases.
PRISM for Contact Lens (#8)
Complete Rule Collection at the End of the Process (9 Rules)
IF spectacle prescription = myope and astigmatism = yes and tear production rate = normal then recommendation = hard
IF age = young and astigmatism = yes and tear production rate = normal then recommendation = hard
IF age = young and astigmatism = no and tear production rate = normal then recommendation = soft
IF age = pre-presbyopic and astigmatism = no and tear production rate = normal then recommendation = soft
IF spectacle prescription = hypermetrope and astigmatism = no and tear production rate = normal then recommendation = soft
IF tear production rate = reduced then recommendation = none
IF age = presbyopic and spectacle prescription = myope and astigmatism = no then recommendation = none
IF age = pre-presbyopic and spectacle prescription = hypermetrope and astigmatism = yes then recommendation = none
IF age = presbyopic and spectacle prescription = hypermetrope and astigmatism = yes then recommendation = none
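As a sanity check on the nine rules, the sketch below encodes them as condition dictionaries and applies them to the 24 instances of the contact lens table. The encoding, the short keys, and the first-match `classify` function are assumptions made for this example, not part of the lecture.

```python
from itertools import product

ATTRS = ["age", "prescription", "astigmatism", "tear"]   # shortened names for this sketch
VALUES = [["young", "pre-presbyopic", "presbyopic"], ["myope", "hypermetrope"],
          ["no", "yes"], ["reduced", "normal"]]
LABELS = ("none soft none hard none soft none hard none soft none hard "
          "none soft none none none none none hard none soft none none").split()
# Each instance: (dict attribute -> value, class), in the order of the deck's table.
DATA = [(dict(zip(ATTRS, combo)), label)
        for combo, label in zip(product(*VALUES), LABELS)]

# The nine rules in the order listed: (conditions, recommendation).
RULES = [
    ({"prescription": "myope", "astigmatism": "yes", "tear": "normal"}, "hard"),
    ({"age": "young", "astigmatism": "yes", "tear": "normal"}, "hard"),
    ({"age": "young", "astigmatism": "no", "tear": "normal"}, "soft"),
    ({"age": "pre-presbyopic", "astigmatism": "no", "tear": "normal"}, "soft"),
    ({"prescription": "hypermetrope", "astigmatism": "no", "tear": "normal"}, "soft"),
    ({"tear": "reduced"}, "none"),
    ({"age": "presbyopic", "prescription": "myope", "astigmatism": "no"}, "none"),
    ({"age": "pre-presbyopic", "prescription": "hypermetrope", "astigmatism": "yes"}, "none"),
    ({"age": "presbyopic", "prescription": "hypermetrope", "astigmatism": "yes"}, "none"),
]

def classify(x):
    """Return the recommendation of the first rule whose conditions all hold."""
    for cond, cls in RULES:
        if all(x[a] == v for a, v in cond.items()):
            return cls
    return None  # no rule fires

correct = sum(classify(x) == c for x, c in DATA)
print(f"{correct}/{len(DATA)} instances classified correctly")
```

Because every PRISM rule is 100% correct on the training data, the order in which the rules are tried does not change the result on these 24 instances.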
PRISM for Contact Lens (#9)
finally.....
WARNING!!!
DO NOT TRY THIS to treat your actual eye disorders.