frida pres
Post on 03-Jun-2018
TRANSCRIPT
-
8/12/2019 Frida Pres
1/30
On Applications of Rough Sets Theory to Knowledge Discovery
Frida Coaquira
UNIVERSITY OF PUERTO RICO, MAYAGÜEZ CAMPUS
frida_cn@math.uprm.edu
-
Introduction
One goal of Knowledge Discovery is to extract meaningful knowledge.
Rough Sets theory was introduced by Z. Pawlak (1982) as a mathematical tool for data analysis.
Rough sets have many applications in the field of Knowledge Discovery: feature selection, discretization, data imputation, and the creation of decision rules.
Rough sets have also been introduced as a tool to deal with uncertain knowledge in Artificial Intelligence applications.
-
Equivalence Relation
Let X be a set and let x, y, and z be elements of X.
An equivalence relation R on X is a relation on X
such that:
Reflexive Property: xRx for all x in X.
Symmetric Property: if xRy, then yRx.
Transitive Property: if xRy and yRz, then xRz.
-
Rough Sets Theory
Let T = (U, A, C, D) be a decision system, where:
U is a non-empty, finite set called the universe;
A is a non-empty, finite set of attributes;
C and D are subsets of A, the condition and decision attribute subsets respectively.
Every attribute a ∈ A is a function a : U → V_a, where V_a is called the value set of a.
The elements of U are objects, cases, states, or observations. The attributes are interpreted as features, variables, characteristics, conditions, etc.
-
Indiscernibility Relation
The indiscernibility relation IND(P) is an equivalence relation.
Let a ∈ A and P ⊆ A. The indiscernibility relation IND(P) is defined as follows:
IND(P) = {(x, y) ∈ U × U : a(x) = a(y) for all a ∈ P}.
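As a concrete sketch, the equivalence classes of IND(P) can be computed by grouping objects on their attribute-value vectors. The toy table and attribute names below are illustrative, not from the slides:

```python
from collections import defaultdict

# Toy decision table: each object maps attribute name -> value (hypothetical data).
U = {
    "x1": {"a1": 1, "a2": 0},
    "x2": {"a1": 1, "a2": 0},
    "x3": {"a1": 2, "a2": 1},
}

def ind_classes(U, P):
    """Equivalence classes of IND(P): group objects by their value vector on P."""
    classes = defaultdict(set)
    for x, row in U.items():
        key = tuple(row[a] for a in P)
        classes[key].add(x)
    return [frozenset(c) for c in classes.values()]

print(ind_classes(U, ["a1", "a2"]))  # two classes: {x1, x2} and {x3}
```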
-
Indiscernibility Relation
The indiscernibility relation defines a partition of U.
Let P ⊆ A. U/IND(P) denotes the family of all equivalence classes of the relation IND(P), called elementary sets.
Two other families of equivalence classes, U/IND(C) and U/IND(D), called condition and decision equivalence classes respectively, can also be defined.
-
R-lower approximation
Let X ⊆ U and R ⊆ C, where R is a subset of conditional features. The R-lower approximation of X is the set of all elements of U which can with certainty be classified as elements of X:
R_*(X) = ∪{Y ∈ U/R : Y ⊆ X}.
The R-lower approximation of X is a subset of X.
-
R-upper approximation
The R-upper approximation of X is the set of all elements of U which can possibly be classified as belonging to X:
R^*(X) = ∪{Y ∈ U/R : Y ∩ X ≠ ∅}.
X is a subset of its R-upper approximation.
The R-boundary set of X is defined as:
BN_R(X) = R^*(X) − R_*(X).
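A minimal sketch of both approximations, given a partition U/R as a list of sets. The example classes and target set are hypothetical:

```python
def lower_upper(partition, X):
    """Lower/upper approximation of X w.r.t. a partition U/R (list of sets)."""
    lower = set().union(*([Y for Y in partition if Y <= X] or [set()]))
    upper = set().union(*([Y for Y in partition if Y & X] or [set()]))
    return lower, upper

# Illustrative equivalence classes and target set (not from the slides):
UR = [{"x1", "x2"}, {"x3"}, {"x4", "x5"}]
X = {"x1", "x2", "x4"}
lo, up = lower_upper(UR, X)
print(lo)  # {'x1', 'x2'}: the only class entirely inside X
print(up)  # {'x1', 'x2', 'x4', 'x5'}: every class meeting X
```

The boundary set is then simply `up - lo`.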
-
Representation of the approximation sets
If R_*(X) = R^*(X), then X is R-definable (the boundary set is empty).
If R_*(X) ≠ R^*(X), then X is rough with respect to R.
ACCURACY := Card(R_*(X)) / Card(R^*(X))
-
Decision Class
The decision d determines the partition
CLASS_T(d) = {X_1, ..., X_{r(d)}}
of the universe U, where X_k = {x ∈ U : d(x) = k} for 1 ≤ k ≤ r(d).
CLASS_T(d) will be called the classification of objects in T determined by the decision d.
The set X_k is called the k-th decision class of T.
-
Decision Class
This information system has 3 classes. We represent the partition: lower approximation, upper approximation, and boundary set.
-
Rough Sets Theory
Let us consider U = {x1, x2, x3, x4, x5, x6, x7, x8} and the equivalence relation R with the equivalence classes X1 = {x1, x3, x5}, X2 = {x2, x4} and X3 = {x6, x7, x8}, which form a partition.
Let the classification C = {Y1, Y2, Y3} be such that
Y1 = {x1, x2, x4}, Y2 = {x3, x5, x8}, Y3 = {x6, x7}.
Only Y1 has a non-empty lower approximation, i.e. R_*(Y1) = X2.
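The claim can be checked mechanically; this sketch recomputes the lower approximation of each class of C against the partition:

```python
# The slide's partition and classification:
partition = [{"x1", "x3", "x5"}, {"x2", "x4"}, {"x6", "x7", "x8"}]
classification = {
    "Y1": {"x1", "x2", "x4"},
    "Y2": {"x3", "x5", "x8"},
    "Y3": {"x6", "x7"},
}
lowers = {}
for name, Y in classification.items():
    # Union of equivalence classes fully contained in Y:
    lowers[name] = set().union(*([B for B in partition if B <= Y] or [set()]))
    print(name, lowers[name])
# Only Y1 has a non-empty lower approximation: {x2, x4}.
```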
-
Positive region and Reduct
POS_R(d), the positive region of the classification CLASS_T(d), is equal to the union of the lower approximations of all decision classes.
Reducts are defined as minimal subsets of condition attributes which preserve the positive region defined by the set of all condition attributes, i.e.:
A subset R ⊆ C is a relative reduct iff
1. POS_R(D) = POS_C(D),
2. for every proper subset R' ⊂ R, condition 1 is not true.
-
Dependency coefficient
The dependency coefficient is a measure of association. The dependency coefficient between condition attributes A and a decision attribute d is defined by the formula:
γ(A, d) = Card(POS_A(d)) / Card(U),
where Card represents the cardinality of a set.
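A sketch of the dependency coefficient under these definitions, on a small hypothetical table (attribute names and values assumed):

```python
from collections import defaultdict

def partition_by(U, attrs):
    """Equivalence classes of IND(attrs)."""
    classes = defaultdict(set)
    for x, row in U.items():
        classes[tuple(row[a] for a in attrs)].add(x)
    return list(classes.values())

def dependency(U, cond, dec):
    """gamma(cond, dec) = |POS_cond(dec)| / |U|: fraction of objects whose
    cond-class lies entirely inside one decision class."""
    dec_classes = partition_by(U, dec)
    pos = set()
    for B in partition_by(U, cond):
        if any(B <= D for D in dec_classes):
            pos |= B
    return len(pos) / len(U)

# Hypothetical 4-object table:
U = {
    "x1": {"a": 0, "d": 0},
    "x2": {"a": 0, "d": 1},
    "x3": {"a": 1, "d": 1},
    "x4": {"a": 2, "d": 0},
}
print(dependency(U, ["a"], ["d"]))  # 0.5: only x3 and x4 are in the positive region
```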
-
Discernibility matrix
Let U = {x1, x2, x3, ..., xn} be the universe of a decision system. The discernibility matrix is defined by:
m_ij = {a ∈ C : a(x_i) ≠ a(x_j) and d(x_i) ≠ d(x_j) for some d ∈ D}, for i, j = 1, 2, 3, ..., n,
where m_ij is the set of all attributes that classify objects x_i and x_j into different decision classes in the U/D partition.
CORE(C) = {a ∈ C : m_ij = {a} for some i, j}.
-
Dispensable feature
Let R be a family of equivalence relations and let P ∈ R.
P is dispensable in R if IND(R) = IND(R − {P}); otherwise P is indispensable in R.
CORE: the set of all indispensable relations in C will be called the core of C.
CORE(C) = ∩ RED(C), where RED(C) is the family of all
reducts of C.
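Under the IND-preservation definition above, reducts of a tiny hypothetical table can be found by complete search (feasible only for very small attribute sets; the table is an assumption, not from the slides):

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical 3-object table:
U = {
    "x1": {"a1": 1, "a2": 0, "d": 0},
    "x2": {"a1": 1, "a2": 1, "d": 1},
    "x3": {"a1": 0, "a2": 1, "d": 1},
}

def ind(attrs):
    """Canonical form of the partition U/IND(attrs), for comparison."""
    classes = defaultdict(set)
    for x, row in U.items():
        classes[tuple(row[a] for a in attrs)].add(x)
    return sorted(sorted(c) for c in classes.values())

C = ["a1", "a2"]
reducts = []
for r in range(1, len(C) + 1):
    for subset in combinations(C, r):
        if ind(list(subset)) == ind(C):
            # Keep only minimal subsets: skip supersets of a found reduct.
            if not any(set(s) <= set(subset) for s in reducts):
                reducts.append(subset)
print(reducts)  # here neither a1 nor a2 alone preserves IND(C)
```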
-
Small Example
Let U = {x1, x2, x3, x4, x5, x6, x7} be the universe set, C = {a1, a2, a3, a4} the conditional feature set, and D = {d} the decision feature set.

      a1  a2  a3  a4 |  d
x1:    1   0   2   1 |  1
x2:    1   0   2   0 |  1
x3:    1   2   0   0 |  2
x4:    1   2   2   1 |  0
x5:    2   1   0   0 |  2
x6:    2   1   1   0 |  2
x7:    2   1   2   1 |  1
-
Discernibility Matrix
      x1             x2           x3             x4             x5        x6
x2    -
x3    {a2,a3,a4}    {a2,a3}
x4    {a2}          {a2,a4}      {a3,a4}
x5    {a1,a2,a3,a4} {a1,a2,a3}   -              {a1,a2,a3,a4}
x6    {a1,a2,a3,a4} {a1,a2,a3}   -              {a1,a2,a3,a4}  -
x7    -             -            {a1,a2,a3,a4}  {a1,a2}        {a3,a4}   {a3,a4}

(An empty entry "-" means the two objects belong to the same decision class.)
-
Example
Then the Core(C) = {a2}.
The partition produced by the Core is
U/{a2} = {{x1, x2}, {x5, x6, x7}, {x3, x4}},
and the partition produced by the decision feature d is
U/{d} = {{x4}, {x1, x2, x7}, {x3, x5, x6}}.
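The discernibility matrix and the core can be recomputed from the example's decision table; this sketch follows the definition of m_ij as the attributes on which x_i and x_j differ when their decisions differ:

```python
from itertools import combinations

# The decision table from the example (attributes a1..a4, decision d):
rows = {
    "x1": ((1, 0, 2, 1), 1),
    "x2": ((1, 0, 2, 0), 1),
    "x3": ((1, 2, 0, 0), 2),
    "x4": ((1, 2, 2, 1), 0),
    "x5": ((2, 1, 0, 0), 2),
    "x6": ((2, 1, 1, 0), 2),
    "x7": ((2, 1, 2, 1), 1),
}
attrs = ["a1", "a2", "a3", "a4"]

def discernibility_matrix(rows):
    """m_ij = attributes differing on x_i, x_j when their decisions differ."""
    m = {}
    for (xi, (vi, di)), (xj, (vj, dj)) in combinations(rows.items(), 2):
        if di != dj:
            m[(xi, xj)] = {a for a, u, v in zip(attrs, vi, vj) if u != v}
    return m

m = discernibility_matrix(rows)
# The core is the union of all singleton entries of the matrix.
core = set().union(*[e for e in m.values() if len(e) == 1])
print(core)  # {'a2'}, matching the slide
```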
-
Similarity relation
A similarity relation on the set of objects is
SIM_T(x) = {y ∈ U : y SIM_T x};
it contains all objects similar to x.
Lower approximation:
SIM_T_*(X) = {x ∈ X : SIM_T(x) ⊆ X}
is the set of all elements of U which can with certainty be classified as elements of X.
Upper approximation, for X ⊆ U:
SIM_T^*(X) = ∪_{x ∈ X} SIM_T(x).
SIM-positive region of the partition {X_i : i = 1, ..., r(d)}, where X_i = {x ∈ U : d(x) = i}:
POS_SIM_T({d}) = ∪_{i=1}^{r(d)} SIM_T_*(X_i).
-
Similarity measures
Similarity for a numerical attribute a:
S_a(v_i, v_j) = 1 − |v_i − v_j| / (max_a − min_a),
where α_a and β_a are parameters; this measure is not symmetric.
Similarity for a nominal attribute:
S_a(v_i, v_j) = 1 if v_i = v_j, 0 otherwise.
A decision-conditioned similarity for nominal values combines, for each decision class k = 1, ..., r(d), the conditional probabilities P(d_k | a = v_i) and P(d_k | a = v_j).
-
8/12/2019 Frida Pres
22/30
-
8/12/2019 Frida Pres
23/30
Attribute Reduction
The purpose is to select a subset of attributes from the original set of attributes to use in the rest of the process.
Selection criterion: the reduct concept.
A reduct is the essential part of the knowledge, which defines all basic concepts.
Other methods are:
Discernibility matrix (n×n).
Generate all combinations of attributes and then evaluate their classification power or dependency coefficient (complete search).
-
Discretization Methods
The purpose is to develop an algorithm that finds a consistent set of cut points which minimizes the number of regions that are consistent. Discretization methods based on Rough Set theory try to find these cut points.
Given a set S of points P1, ..., Pn in the plane R^2, partitioned into two disjoint categories S1, S2, and a natural number T:
Is there a consistent set of lines such that the partition of the plane into regions defined by them consists of at most T regions?
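One common way to generate candidate cut points, sketched below, is not necessarily the slides' algorithm: take midpoints between consecutive values of a numerical attribute whose decision classes differ, since only those boundaries can affect consistency.

```python
def candidate_cuts(values, decisions):
    """Midpoints between consecutive attribute values with differing decisions."""
    pairs = sorted(zip(values, decisions))
    cuts = []
    for (v1, d1), (v2, d2) in zip(pairs, pairs[1:]):
        if v1 != v2 and d1 != d2:
            cuts.append((v1 + v2) / 2)
    return cuts

# Illustrative attribute values and decision labels (hypothetical):
print(candidate_cuts([1.0, 2.0, 3.0, 4.0], ["L", "L", "H", "H"]))  # [2.5]
```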
-
Consistent
Def. A set of cuts P is consistent with A (or A-consistent) iff ∂_A = ∂_{A_P}, where ∂_A and ∂_{A_P} are the generalized decisions of A and A_P respectively.
Def. A set P_irr of cuts is A-irreducible iff P_irr is A-consistent and any of its proper subfamilies P' (P' ⊂ P_irr) is not A-consistent.
-
Level of Inconsistency
Let B be a subset of A, and let X_1, ..., X_n be a classification of U, where X_i ∩ X_j = ∅ for i ≠ j and X_i ⊆ U, i = 1, 2, ..., n.
L_c = Card(B_*(X_i)) / Card(U)
represents the percentage of instances which can be correctly classified into class X_i with respect to subset B.
-
Imputation Data
The rules of the system should have maximum consistency.
The relevant attributes for x are defined by
rel_R(x) = {a ∈ R : a(x) is defined},
and the relation R_c by: x R_c y iff a(x) = a(y) for all a ∈ rel_R(x) ∩ rel_R(y).
x and y are consistent if x R_c y.
Example: Let x = (1, 3, ?, 4), y = (2, ?, 5, 4) and z = (1, ?, 5, 4).
x and z are consistent (x R_c z);
x and y are not consistent.
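The consistency relation R_c can be sketched directly, treating "?" as an undefined value (None); the vectors are the slide's example:

```python
def consistent(x, y):
    """x R_c y: x and y agree on every attribute defined in both."""
    return all(u == v for u, v in zip(x, y) if u is not None and v is not None)

x = (1, 3, None, 4)
y = (2, None, 5, 4)
z = (1, None, 5, 4)
print(consistent(x, z))  # True
print(consistent(x, y))  # False: they differ on the first attribute
```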
-
Decision rules
      F1  F2  F3  F4 | D | Rule
O3:    0   0   0   1 | L | R1
O5:    0   0   1   3 | L | R1
O1:    0   1   0   2 | L | R2
O4:    0   1   1   0 | M | R3
O2:    1   1   0   2 | H | R4

Rule 1: if (F2 = 0) then (D = L)
Rule 2: if (F1 = 0) then (D = L)
Rule 3: if (F4 = 0) then (D = M)
Rule 4: if (F1 = 1) then (D = H)
The algorithm should minimize the number of features included in decision rules.
-
References
[1] Gediga, G. and Düntsch, I. (2002) Maximum Consistency of Incomplete Data via Non-invasive Imputation. Artificial Intelligence.
[2] Grzymala-Busse, J. and Siddhaye, S. (2004) Rough Set Approach to Rule Induction from Incomplete Data. Proceedings of IPMU 2004, the 10th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems.
[3] Pawlak, Z. (1995) Rough Sets. Proceedings of the 1995 ACM 23rd Annual Conference on Computer Science.
[4] Tay, F. and Shen, L. (2002) A Modified Chi2 Algorithm for Discretization. IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 3, May/June.
[5] Zhong, N. (2001) Using Rough Sets with Heuristics for Feature Selection. Journal of Intelligent Information Systems, 16, 199-214, Kluwer Academic Publishers.
-
THANK YOU!