ACTION RULES /Lecture II/

presented by

Zbigniew Ras
UNC-Charlotte, Computer Science

Decision table

Any information system of the form

$S = (U, A_{Fl} \cup A_{St} \cup \{d\})$, where $d \notin A_{Fl} \cup A_{St}$ is a distinguished attribute called the decision.

The elements of $A_{St}$ are called stable conditions; the elements of $A_{Fl} \cup \{d\}$ are called flexible conditions.

Example of action rule:

$[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow [(d, k_1 \to k_2)](x)$

Action Rules [Z. Ras & A. Wieczorkowska]

Assumption: $(\forall i)[(1 \le i \le p) \Rightarrow (b_i \in A_{Fl})]$

X    a   b   c   d
x1   0   S   0   L
x2   0   R   1   L
x3   0   S   1   L
x4   0   R   1   L
x5   2   P   2   L
x6   2   P   2   L
x7   2   S   2   H

{a, c} - stable attributes,
{b, d} - flexible attributes,
d - decision attribute.

Action Rules

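Rules discovered:
$r_1 = [(b, P) \to (d, L)]$
$r_2 = [(a, 2) \wedge (b, S) \to (d, H)]$

Notation: $L(r_2) = \{a, b\}$ is the set of attributes on the left-hand side of $r_2$, and $R(r_2) = d$ is its decision attribute.

Decision Table

$(r_1, r_2)$-action rule: $[(b, P \to S)](x) \Rightarrow [(d, L \to H)](x)$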

St   Flex  St   Flex  St   Flex  Decision
A    B     C    D     E    F     G
a1   b1    c1   d1    *    *     g1
a1   b2    *    *     e2   f2    g2

E-Action rule: $(B, b_1 \to b_2) \wedge (E = e_2) \wedge (F, \to f_2) \Rightarrow (G, g_1 \to g_2)$

What about support & confidence of action rules?

E-Action Rules [L.-S. Tsay & Z. Ras]

Action rule r:

$[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow [(d, k_1 \to k_2)](x)$

Object x certainly supports rule r in S = (X, A) if:

1) $(\forall i \le p)[b_i(x) = v_i]$ and $d(x) = k_1$
2) $(\exists y \in X)(\forall i \le p)[b_i(y) = w_i]$ and $d(y) = k_2$
3) $(\forall b \in A - [\{b_i : 1 \le i \le p\} \cup \{d\}])[b(x) = b(y)]$

[Object-Based] Support of Action Rules

     a1   a2   b1   b2   a3   a4   d
x    n1   n2   v1   v2   u1   u2   k1
y    n1   n2   w1   w2   u1   u2   k2

$CSup_S(r) = card\{x : x \text{ certainly supports } r \text{ in } S\}$
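The three conditions for certain support can be checked mechanically. The following is a minimal sketch, assuming the table is stored as a dictionary of rows; the function names `certainly_supports` and `csup` are hypothetical, and the data is the decision table from the earlier slide with the rule $[(b, P \to S)](x) \Rightarrow [(d, L \to H)](x)$.

```python
def certainly_supports(table, x, changes, d, k1, k2):
    """changes maps each attribute b_i to its pair (v_i, w_i); d is the decision."""
    row = table[x]
    # Condition 1: x carries the "from" values and decision value k1.
    if any(row[b] != v for b, (v, _) in changes.items()) or row[d] != k1:
        return False
    # Attributes the rule does not mention (everything except the b_i and d).
    untouched = [a for a in row if a not in changes and a != d]
    for other in table.values():
        # Condition 2: y carries the "to" values and decision value k2.
        if any(other[b] != w for b, (_, w) in changes.items()) or other[d] != k2:
            continue
        # Condition 3: x and y agree on every attribute outside the rule.
        if all(row[a] == other[a] for a in untouched):
            return True
    return False


def csup(table, changes, d, k1, k2):
    """Number of objects that certainly support the rule."""
    return sum(certainly_supports(table, x, changes, d, k1, k2) for x in table)


rows = [
    ("x1", 0, "S", 0, "L"), ("x2", 0, "R", 1, "L"), ("x3", 0, "S", 1, "L"),
    ("x4", 0, "R", 1, "L"), ("x5", 2, "P", 2, "L"), ("x6", 2, "P", 2, "L"),
    ("x7", 2, "S", 2, "H"),
]
table = {name: {"a": a, "b": b, "c": c, "d": d} for name, a, b, c, d in rows}
count = csup(table, {"b": ("P", "S")}, "d", "L", "H")
print(count)  # 2: objects x5 and x6, each matched by y = x7
```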

Action rule r:

$[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow [(d, k_1 \to k_2)](x)$

Object x possibly supports rule r in S = (X, A) if:

1) $(\forall i \le p)[b_i(x) = v_i]$ and $d(x) = k_1$
2) $(\exists y \in X)(\forall i \le p)[b_i(y) = w_i]$ and $d(y) = k_2$
3) $(\forall c \in A_{St})[c(x) = c(y)]$

[Object-Based] Support of Action Rules

     a1   a2   b1   b2   c1   c2   d
y    m1   m2   w1   w2   u1   u2   k2

$PSup_S(r) = card\{x : x \text{ possibly supports } r \text{ in } S\}$
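Possible support differs from certain support only in the third condition: x and y must agree on the stable attributes rather than on every attribute outside the rule. A minimal sketch under that reading follows; the table, the attribute roles, and the names `possibly_supports` and `psup` are hypothetical.

```python
def possibly_supports(table, stable, x, changes, d, k1, k2):
    """changes maps each attribute b_i to its pair (v_i, w_i); d is the decision."""
    row = table[x]
    # Condition 1: x carries the "from" values and decision value k1.
    if any(row[b] != v for b, (v, _) in changes.items()) or row[d] != k1:
        return False
    for other in table.values():
        # Condition 2: y carries the "to" values and decision value k2.
        if any(other[b] != w for b, (_, w) in changes.items()) or other[d] != k2:
            continue
        # Condition 3: agreement is required on stable attributes only.
        if all(row[c] == other[c] for c in stable):
            return True
    return False


def psup(table, stable, changes, d, k1, k2):
    """Number of objects that possibly support the rule."""
    return sum(possibly_supports(table, stable, x, changes, d, k1, k2) for x in table)


# Hypothetical table: a is stable, b and e are flexible, d is the decision.
table = {
    "x1": {"a": 1, "b": "v", "e": "p", "d": "k1"},
    "x2": {"a": 1, "b": "w", "e": "q", "d": "k2"},
    "x3": {"a": 2, "b": "v", "e": "p", "d": "k1"},
}
count = psup(table, ["a"], {"b": ("v", "w")}, "d", "k1", "k2")
print(count)  # 1: only x1 agrees with x2 on the stable attribute a
```

In this table x1 differs from x2 on the flexible attribute e, so it would fail the stricter third condition of certain support while still counting as possible support.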

Action rule r:

$[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow [(d, k_1 \to k_2)](x)$

Object $x \in X$ supports rule r in S = (X, A) if there are two rules $r_1$, $r_2$ extracted from S and there exists an object $y \in X$ satisfying the following conditions, where $L(r_i)$ is the set of attributes on the left-hand side of $r_i$ and $R(r_i)$ is its decision attribute:

$(\forall i \le p)[[b_i \in L(r_1)] \Rightarrow [b_i(x) = v_i]] \wedge R(r_1) = d \wedge d(x) = k_1$

$(\forall i \le p)[[b_i \in L(r_2)] \Rightarrow [b_i(y) = w_i]] \wedge R(r_2) = d \wedge d(y) = k_2$

$(\forall b)[[b \in A_{St}] \Rightarrow [b(x) = b(y)]]$

[Rule-Based] Support of Action Rules

Confidence: $Conf_S(r) = RSup_S(r) / Sup_S(r_1)$

$RSup_S(r) = card\{x : x \text{ supports } r \text{ in } S\}$

[Figure: table over attributes a1, a2, b1, b2, c1, c2, d with a row y containing the values w1, w2, u1, u2, k2, marking the attributes used by rules r1 and r2; L(r2) = {a1, a2, b1, b2, c1, c2}]

Assumption: S = (X, A, V) is an information system, $Y \subseteq X$. Attribute $b \in A$ is flexible in S and $b_1, b_2 \in V_b$.

By $\rho_S(Y, b_1, b_2)$ we mean a number from $(0, +\infty]$ which describes the average predicted cost of an approved action associated with a possible re-classification of qualifying objects in Y from class $b_1$ to $b_2$. Object $x \in Y$ qualifies for re-classification from $b_1$ to $b_2$ if $b(x) = b_1$.

$\rho_S(Y, b_1, b_2) = +\infty$ if no action required for a possible re-classification of qualifying objects in Y from class $b_1$ to $b_2$ has been approved.

Cost of Action Rule [Ras & Tzacheva]

If Y is uniquely defined, we often write $\rho_S(b_1, b_2)$ instead of $\rho_S(Y, b_1, b_2)$.

Cost of Action Rule

Action rule r:

$[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$

The cost of r in S:

$cost_S(r) = \sum \{\rho_S(v_i, w_i) : 1 \le i \le p\}$

Action rule r is feasible in S if $cost_S(r) < \rho_S(k_1, k_2)$.

For any feasible action rule r, the cost of the conditional part of r is lower than the cost of its decision part.
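The cost and feasibility definitions translate directly into a sum and a comparison. The sketch below assumes the cost function is given as a dictionary keyed by (attribute, from, to) triples, with a missing entry standing for an unapproved action of cost $+\infty$; the cost values and the names `rule_cost` and `is_feasible` are hypothetical.

```python
import math


def rule_cost(rho, changes):
    """Sum the cost of every (attribute, from, to) term on the left-hand side."""
    return sum(rho.get(term, math.inf) for term in changes)


def is_feasible(rho, changes, decision_change):
    """A rule is feasible if its left-hand side is cheaper than its decision part."""
    return rule_cost(rho, changes) < rho.get(decision_change, math.inf)


# Hypothetical costs; a missing entry means no approved action (+infinity).
rho = {
    ("b1", "v1", "w1"): 3.0,
    ("b2", "v2", "w2"): 4.0,
    ("d", "k1", "k2"): 10.0,
}
lhs = [("b1", "v1", "w1"), ("b2", "v2", "w2")]
cost = rule_cost(rho, lhs)
feasible = is_feasible(rho, lhs, ("d", "k1", "k2"))
print(cost, feasible)  # 7.0 True
```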

Extension: Cost of Action Rule

$R_S[(d, k_1 \to k_2)]$ denotes the set of feasible action rules in S having the term $(d, k_1 \to k_2)$ as their decision part.

Assumption: Among the action rules in $R_S[(d, k_1 \to k_2)]$, the user identifies a rule r of minimal cost. That cost may still be too high for the user to approve the implementation of r.

The cost of r might be high because of the high cost of one of its sub-terms $(b_j, v_j \to w_j)$.

In such a case, we may look for an action rule in $R_S[(b_j, v_j \to w_j)]$ of minimal cost needed to re-classify qualifying objects from $v_j$ to $w_j$.

Rules short on the left side: it was observed that such rules were not interesting (active mining).

Example:

$r = [(b_1, v_1 \to w_1) \wedge \dots \wedge (b_j, v_j \to w_j) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$

In $R_S[(b_j, v_j \to w_j)]$ we find

$r_1 = [(b_{j1}, v_{j1} \to w_{j1}) \wedge (b_{j2}, v_{j2} \to w_{j2}) \wedge \dots \wedge (b_{jq}, v_{jq} \to w_{jq})](x) \Rightarrow (b_j, v_j \to w_j)(x)$

Then we can compose r with $r_1$ and thereby replace the term $(b_j, v_j \to w_j)$ by the term from the left-hand side of $r_1$:

$[(b_1, v_1 \to w_1) \wedge \dots \wedge [(b_{j1}, v_{j1} \to w_{j1}) \wedge (b_{j2}, v_{j2} \to w_{j2}) \wedge \dots \wedge (b_{jq}, v_{jq} \to w_{jq})] \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$
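Composition amounts to substituting one term of a rule by the left-hand side of another rule that concludes that term. A minimal sketch, assuming a rule is represented as a pair (list of terms, decision term) and each term as an (attribute, from, to) triple; the representation and the name `compose` are hypothetical.

```python
def compose(rule, sub_rule):
    """Replace the term of `rule` that equals the decision part of `sub_rule`
    by the left-hand side of `sub_rule`."""
    lhs, decision = rule
    sub_lhs, sub_decision = sub_rule
    if sub_decision not in lhs:
        raise ValueError("sub_rule does not conclude any term of rule")
    new_lhs = []
    for term in lhs:
        # Splice in the cheaper sub-terms where the expensive term used to be.
        new_lhs.extend(sub_lhs if term == sub_decision else [term])
    return new_lhs, decision


r = ([("b1", "v1", "w1"), ("bj", "vj", "wj"), ("bp", "vp", "wp")], ("d", "k1", "k2"))
r1 = ([("bj1", "vj1", "wj1"), ("bj2", "vj2", "wj2")], ("bj", "vj", "wj"))
composed = compose(r, r1)
print(composed[0])
```

The composed rule keeps the decision part of r, so its cost can be recomputed as the sum over the new left-hand side.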

Cost of Action Rule

In order to construct action rules of the lowest cost, we build a Search Graph $G_S$, a directed graph that is built dynamically by applying action rules discovered from S to its nodes.

The initial node $n_0$ of the graph $G_S$ contains information coming from the user, associated with the system S, about which objects he/she would like to reclassify (e.g. from the class described by value $k_1$ of the attribute d to the class $k_2$) and what the current cost, $\rho_S(k_1, k_2)$, of the reclassification $k_1 \to k_2$ is.

Any other node n in $G_S$ shows an alternative way to achieve the same reclassification with a cost that is lower than the cost assigned to all nodes preceding n in $G_S$.

Search Graph [Tzacheva & Ras]

Search Graph

Assume that N is the set of nodes in graph $G_S$ and $n_0$ is its initial node. For any node $n \in N$, by

$f(n) = (Y_n, \{[v_{n,j} \to w_{n,j}, \rho_S(v_{n,j}, w_{n,j})]\}_{j \in I_n})$

we mean its domain (a set of objects in S), the set of actions needed to reclassify objects from $Y_n$, and their cost, where $Y_n \subseteq X$.

We say that action rule r, discovered from S, is applicable to node n if:

$Y_n \cap RSup_S(r) \ne \emptyset$
$(\exists k \in I_n)[r \in R_S[v_{n,k} \to w_{n,k}]]$
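The applicability test combines a set intersection with a membership check. The sketch below assumes the rule passed in is already known to be feasible, so that membership in $R_S[v_{n,k} \to w_{n,k}]$ reduces to its decision part matching one of the actions stored in the node; the name `is_applicable` and the sample data are hypothetical.

```python
def is_applicable(node_objects, node_actions, rule_support, rule_decision):
    """node_objects: the domain Y_n; node_actions: the changes listed in node n;
    rule_support: objects supporting r; rule_decision: the decision part of r."""
    # First condition: some object of the node supports the rule.
    shares_objects = bool(node_objects & rule_support)
    # Second condition: the rule concludes one of the node's actions.
    concludes_node_action = rule_decision in node_actions
    return shares_objects and concludes_node_action


y_n = {"x1", "x2", "x3"}
actions = {("b", "v", "w"), ("c", "p", "q")}
ok = is_applicable(y_n, actions, {"x2", "x5"}, ("b", "v", "w"))
no_overlap = is_applicable(y_n, actions, {"x9"}, ("b", "v", "w"))
print(ok, no_overlap)  # True False
```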

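$n_0 = \{[k_1 \to k_2, \rho_S(k_1, k_2)]\}$

$r = [(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$

$n_1 = \{[v_1 \to w_1, \rho_S(v_1, w_1)], [v_2 \to w_2, \rho_S(v_2, w_2)], \dots, [v_p \to w_p, \rho_S(v_p, w_p)]\}$

[Figure: Minimal Cost Reclassification Search Graph for S. Labels: Information System S; $R_S[(d, k_1 \to k_2)]$ with rules r1, r2, r3, ..., rn; nodes n1, n2, n3, ..., nn; edge rules r1, r4, rj, rn.]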

Property 1.
Let $f(n_0) = (Y, \{[k_1 \to k_2, \rho_S(k_1, k_2)]\})$ and $f(n) = (Y_n, \{[v_{n,k} \to w_{n,k}, \rho_S(v_{n,k}, w_{n,k})]\}_{k \in I_n})$. The cost assigned to the node n for reclassifying $x \in Y_n$ from $k_1$ to $k_2$ is equal to:

$Cost_{k_1 \to k_2}(n, x) = \sum \{\rho_S(v_{n,k}, w_{n,k}) : k \in I_n\}$

Property 2. If node $n_2$ is a successor of the node $n_1$, then

$Conf_{k_1 \to k_2}(n_2, x) \le Conf_{k_1 \to k_2}(n_1, x)$

Property 3. If node $n_2$ is a successor of the node $n_1$, then

$Cost_{k_1 \to k_2}(n_2, x) \le Cost_{k_1 \to k_2}(n_1, x)$

Search Graph Properties

Search for Action Rules [Tzacheva & Ras]

We propose an A*-type algorithm for speeding up the construction of the shortest path from the root to the goal node in graph $G_S$.

A* is probably one of the most popular search algorithms in AI. It is an informed, optimal search algorithm, which uses a heuristic estimate of the remaining distance to the goal by means of a heuristic function h(N).

We assume that the user provides three threshold values:

$\lambda_1$ - threshold for minimum confidence of action rules.
$\lambda_2$ - threshold for maximum cost of action rules.
$\lambda_3$ - threshold for minimum feasibility of action rules.

Heuristic Method - A*

We assume that:

$h(n_i) = [cost(n_i, Y_i) - \lambda_2] / \lambda_3$

The heuristic value $h(n_i)$ is associated with any node $n_i$ in G. It shows the maximal number of steps that might be needed to reach the goal.

Also, we assume that:

$g(n_i)$ is the number of edges to the current node.

Then we associate an estimated path length to the goal with each node as follows:

$f(n_i) = h(n_i) + g(n_i)$
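A small worked example of the three quantities, as a sketch: the thresholds are passed as plain numbers, and the parameter names `max_cost` (for the maximum-cost threshold) and `feasibility` (for the feasibility threshold) as well as the sample values are hypothetical.

```python
def h(node_cost, max_cost, feasibility):
    """Heuristic: cost still above the maximum-cost threshold, divided by the
    feasibility threshold."""
    return (node_cost - max_cost) / feasibility


def f(node_cost, depth, max_cost, feasibility):
    """Estimated path length: heuristic plus number of edges walked so far."""
    return h(node_cost, max_cost, feasibility) + depth


# A node of cost 20 reached after 2 edges, with thresholds 8 and 4.
value = f(node_cost=20.0, depth=2, max_cost=8.0, feasibility=4.0)
print(value)  # (20 - 8) / 4 + 2 = 5.0
```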

Proposed Algorithm - A*

1. Initialize Q with search node $[([conf(n_0), h(n_0)], [n_0])]$ as the only entry; initialize the domain of $n_0$ (given by the user) as $Y_0$.

2. If Q is empty, fail. Else, pick search node s from Q with the least value of f. If two search nodes in Q have the same least value of f assigned to them and an ontology is available, pick the search node s from Q with the highest value of Ont(s).

3. If state(s) is a goal and $conf(s) \ge \lambda_1$, return s (we have reached the goal). Otherwise remove s from Q.

4. Find all children of state(s) and create all the one-step extensions of s to each descendant.

5. If state(s1) is a child of state(s) and r is the action rule applied to s in order to move from s to s1, then initialize $Y_{state(s1)}$ as $Y_{state(s)} \cap Dom_S(r)$ and, if an ontology is available, Ont(s1) as Ont(r).

6. Add all the extended paths to Q.

7. Go to step 2.
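The loop in steps 1 to 7 can be sketched with a priority queue ordered by f = h + g. This is a simplified illustration, not the LowestCostReclassifier implementation: it assumes a state counts as a goal when its cost does not exceed the maximum-cost threshold, it omits the ontology tie-break and the domain update of step 5, and the callbacks `children`, `cost`, `conf` and the toy graph are hypothetical.

```python
import heapq
from itertools import count


def lowest_cost_search(start, children, cost, conf, min_conf, max_cost, feasibility):
    """Best-first search over states ordered by f = h + g."""
    def h(state):
        return (cost(state) - max_cost) / feasibility

    tie = count()  # insertion order breaks ties, so states are never compared
    queue = [(h(start), next(tie), 0, start)]  # step 1
    while queue:  # step 2: an empty queue means failure
        _, _, depth, state = heapq.heappop(queue)  # least f; removed from Q
        # Step 3 (assumed goal test: cost within the maximum-cost threshold).
        if cost(state) <= max_cost and conf(state) >= min_conf:
            return state
        # Steps 4 and 6: extend by one edge and queue every child.
        for child in children(state):
            heapq.heappush(queue, (h(child) + depth + 1, next(tie), depth + 1, child))
    return None


# Toy graph: costs fall and confidences drop along successors.
graph = {"n0": ["n1", "n2"], "n1": ["n3"], "n2": [], "n3": []}
costs = {"n0": 20.0, "n1": 12.0, "n2": 14.0, "n3": 6.0}
confs = {"n0": 1.0, "n1": 0.9, "n2": 0.95, "n3": 0.8}

goal = lowest_cost_search(
    "n0", graph.__getitem__, costs.__getitem__, confs.__getitem__,
    min_conf=0.7, max_cost=8.0, feasibility=2.0,
)
print(goal)  # n3
```

With a stricter confidence threshold such as 0.85 no state in this toy graph qualifies, the queue empties, and the function returns `None`, matching the "fail" branch of step 2.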

Implementation and Testing

The heuristic strategy for lowest cost reclassification, the LowestCostReclassifier software, is implemented in C++ using the Microsoft Visual Studio 7.0 IDE and compiler.

The user is asked to enter the attribute he/she is interested in reclassifying, together with its current and desired values. The user also chooses the following 3 thresholds:

$\lambda_1$ - minimum confidence of action rules
$\lambda_2$ - maximum cost of action rules
$\lambda_3$ - minimum feasibility of action rules

and enters the cost of reclassification currently known to the user.

The action rules have the following form:

(attribute, valueFrom -> valueTo | cost) => (attribute, valueFrom -> valueTo | cost) confidence

The LowestCostReclassifier software was tested and applied to three different databases: two in the medical domain and one in the financial domain.

Conclusions

We extract action rules as per the original algorithm presented in [62]. Next, we proposed a heuristic approach, using the A* algorithm, of building a search graph G which will identify an action rule of the lowest cost considering three thresholds the user provides: min confidence, max cost, and min feasibility.

Further, we observed that even if the maximum cost threshold is not reachable, we still return the best node found thus far, whose cost is still lower than the cost currently known to the user.

In that sense, the leaves in our graph G and the nodes close to them represent the most actionable knowledge and, at the same time, the most unexpected/interesting knowledge related to a desired reclassification of objects.

Subjective measure: user-driven, domain-dependent. Includes unexpectedness [Silberschatz and Tuzhilin, 1995], novelty, actionability [Piatetsky-Shapiro & Matheus, 1994].

Claim 1 [Suzuki, Padmanabhan & Tuzhilin]
Unexpectedness is partially an objective concept.
$A \to B$ is unexpected with respect to the belief $\alpha \to \beta$ on the dataset D if the following conditions hold:

$B \wedge \beta \models False$ [B and $\beta$ logically contradict each other]
$A \wedge \alpha$ holds on a large subset of D
$A \wedge \alpha \to B$ holds, which means $A \wedge \alpha \to \neg\beta$

Our Claim: Actionability is partially an objective concept.

Actionability measure = Cost of an action rule

Final Claims

Questions?

Thank You

Our Claim: the cheapest rules are the most actionable.

Claim 2 [Silberschatz & Tuzhilin]: the most actionable rules are unexpected.

Our Claim: the cheapest rules are unexpected.

References:

Z. Ras, A. Tzacheva, L.-S. Tsay, "Action Rules", in Encyclopedia of Data Warehousing and Mining (Ed. J. Wang), Idea Group Inc., 2005, to appear

A. Tzacheva, Z. Ras, "Action rules mining", in the Special Issue on Knowledge Discovery, International Journal of Intelligent Systems, Wiley, 2005, to appear

A. Tzacheva, Z. Ras, "Discovering non-standard semantics of semi-stable attributes", Proceedings of FLAIRS-2003, St. Augustine, Florida, AAAI Press, 2003, 330-334

Final Claims
