Block-Structured Process Discovery: Filtering Infrequent Behaviour Sander Leemans Dirk Fahland Wil van der Aalst Eindhoven University of Technology


Post on 18-Dec-2015



Sander Leemans 2

Process discovery

• Fast
• Fitting
• Precise
• General
• Simple
• Sound

Trade-off

• ILP
• α
• Heuristics Miner
• Evolutionary Tree Miner

• not fast, not fitting, not sound
• not fitting, not sound
• not simple, not sound

• Flower model: not precise

[Figure: flower model over activities a, b, c, d, e, f with two τ transitions]


Infrequent behaviour

• 80% model
• Filtering beforehand is difficult
• Filter during discovery


Process trees

[Figure: two diagrams over activities a, b, c, d, e]


Divide & conquer

{<a,c,d,e,b>, <a,b,e,d,c>, <a,e,c,b,d>, <a,d,b,c,e>}

{<a>, <a>, <a>, <a>} (recurse)

{<c,d,e,b>, <b,e,d,c>, <e,c,b,d>, <d,b,c,e>} (recurse)

a
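The split shown on this slide can be reproduced with a minimal sketch that projects every trace onto each part of an ordered activity partition. The function name `split_by_projection` and the representation of a log as a list of tuples are assumptions made for illustration, not the authors' implementation.

```python
def split_by_projection(log, parts):
    """Return one sublog per part; each trace keeps only the events of that part.

    Assumption: `log` is a list of traces (tuples of activity names) and
    `parts` is an ordered list of disjoint activity sets.
    """
    return [
        [tuple(event for event in trace if event in part) for trace in log]
        for part in parts
    ]


log = [
    ("a", "c", "d", "e", "b"),
    ("a", "b", "e", "d", "c"),
    ("a", "e", "c", "b", "d"),
    ("a", "d", "b", "c", "e"),
]
first, second = split_by_projection(log, [{"a"}, {"b", "c", "d", "e"}])
```

Here `first` holds four copies of the trace <a> and `second` holds the four remaining suffixes; each sublog would then be mined recursively.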


Finding operator

{<a,c,d,e,b>, <a,b,e,d,c>, <a,e,c,b,d>, <a,d,b,c,e>}

a

{<c,d,e,b>, <b,e,d,c>, <e,c,b,d>, <d,b,c,e>} (recurse)

• Find cut in directly-follows graph
• Sequence: edges crossing one-way only

[Figure: directly-follows graph over activities a, b, c, d, e]
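A small sketch of the idea on this slide: count the directly-follows edges of the log, then check whether the edges between two activity sets cross in one direction only. The helper names `directly_follows` and `crosses_one_way` are hypothetical, and the check looks at direct edges only, as worded on the slide; a complete cut detection would have to consider more than this.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity x is immediately followed by activity y."""
    dfg = Counter()
    for trace in log:
        for x, y in zip(trace, trace[1:]):
            dfg[(x, y)] += 1
    return dfg


def crosses_one_way(dfg, left, right):
    """True if edges go from `left` to `right` and none go back."""
    forward = any(x in left and y in right for x, y in dfg)
    backward = any(x in right and y in left for x, y in dfg)
    return forward and not backward


log = [
    ("a", "c", "d", "e", "b"),
    ("a", "b", "e", "d", "c"),
    ("a", "e", "c", "b", "d"),
    ("a", "d", "b", "c", "e"),
]
dfg = directly_follows(log)
sequence_found = crosses_one_way(dfg, {"a"}, {"b", "c", "d", "e"})
```

For the log on the slide, every edge touching a leaves a, so {a} versus {b, c, d, e} qualifies as a sequence cut, whereas {b} versus {c, d, e} does not because edges run in both directions.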


Inductive Miner

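• Divide activities, select operator
• Split log
• Recurse until base case

[Figure: recursion tree with undetermined operators (?), dividing the activities into {c,d} and {a,b}, then {c,d} into {c} and {d}]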
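The three steps above can be sketched as a recursive function. To stay short, this sketch only knows one operator, exclusive choice, found by grouping activities that occur together in a trace; anything else falls back to a flower model. The names `xor_cut` and `discover`, the tuple encoding of the tree, and the fallback are all assumptions for illustration and cover far less than the Inductive Miner itself.

```python
def xor_cut(log):
    """Group activities that share a trace; two or more groups form a cut."""
    groups = []  # pairwise disjoint sets of activities
    for trace in log:
        if not trace:
            continue
        merged, rest = set(trace), []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                rest.append(group)
        groups = rest + [merged]
    return sorted(groups, key=min) if len(groups) > 1 else None


def discover(log):
    """Divide activities, split the log, recurse until a base case."""
    activities = {event for trace in log for event in trace}
    # Base case: a single activity that occurs exactly once per trace.
    if len(activities) == 1 and all(len(trace) == 1 for trace in log):
        return next(iter(activities))
    parts = xor_cut(log)
    if parts is None:
        # Assumed fallback: give up and allow any behaviour.
        return ("flower", sorted(activities))
    sublogs = [[t for t in log if t and t[0] in part] for part in parts]
    return ("xor", [discover(sublog) for sublog in sublogs])


tree = discover([("a",), ("b",), ("a",)])
mixed = discover([("a", "b"), ("c",)])
```

The first call yields a choice between a and b; the second yields a choice between a flower model over {a, b} and the single activity c, because no further operator is implemented here.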


Inductive Miner - infrequent

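• Divide activities, select operator
• Split log
• Recurse until base case

[Figure: recursion tree with undetermined operators (?), dividing the activities into {c,d} and {a,b}, then {c,d} into {c} and {d}]

Threshold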


Divide activities, select operator

• Filter infrequent edges

(b,c) 100
(b,d) 100
(b,e) 100
(b,a) 1

[Figure: directly-follows graph over activities a, b, c, d, e; every edge is labelled 100 except one edge labelled 1]
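One way to filter the edge list above is to compare each edge with the strongest outgoing edge of the same source activity and drop it when it falls below a fraction of that maximum. This is a sketch under that assumption; the function name `filter_infrequent_edges` and the threshold value 0.2 are illustrative choices, not values taken from the slides.

```python
def filter_infrequent_edges(dfg, threshold):
    """Keep edge (x, y) only if its count is at least
    threshold * (count of the strongest outgoing edge of x)."""
    strongest = {}
    for (x, _), count in dfg.items():
        strongest[x] = max(strongest.get(x, 0), count)
    return {
        (x, y): count
        for (x, y), count in dfg.items()
        if count >= threshold * strongest[x]
    }


# Outgoing edges of b as listed on the slide.
dfg = {("b", "c"): 100, ("b", "d"): 100, ("b", "e"): 100, ("b", "a"): 1}
kept = filter_infrequent_edges(dfg, threshold=0.2)
```

With these counts the edge (b,a) is removed and the three edges with count 100 survive, so a cut that was blocked by (b,a) becomes visible again.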

Divide activities, select operator

• Weaker log
• Use eventually-follows relation instead
• Amplifies “correct” edges

(b,c) 1
(b,d) 1
(b,e) 1
(b,a) 1

Example trace: <a, b, c>

[Figure: graph over activities a, b, c, d, e with every edge labelled 1, and a second set of edge labels ranging from 2 to 5]
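To illustrate the eventually-follows relation, the sketch below counts, for every trace, each pair of events where the second occurs anywhere after the first. Counting every such position pair is an assumption about how the weights are defined, and `eventually_follows` is a hypothetical helper name.

```python
from collections import Counter


def eventually_follows(log):
    """Count pairs (x, y) where y occurs at any later position than x."""
    efg = Counter()
    for trace in log:
        for i, x in enumerate(trace):
            for y in trace[i + 1:]:
                efg[(x, y)] += 1
    return efg


efg = eventually_follows([("a", "b", "c")])
```

For the trace <a, b, c> this adds the pair (a,c) on top of the directly-follows pairs (a,b) and (b,c), which is how edges in the dominant direction gain weight relative to a single deviating edge.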


Split log

{a} {b}

<a,a,a,b,b,b> splits into <a,a,a> <b,b,b>

<a,a,b,b,a,b> splits into <a,a,a> <b> or <a,a> <b,b,b>
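When a trace does not fit the partition, a split point has to be chosen. The sketch below assumes one plausible rule: pick the position that discards the fewest events, then drop the events that ended up on the wrong side. The function name `split_trace` and the tie-breaking (earliest position wins) are assumptions for illustration.

```python
def split_trace(trace, left, right):
    """Split `trace` into a `left` part and a `right` part at the position
    that removes the fewest misplaced events."""

    def removed(position):
        # Events before the split must be in `left`, events after it in `right`.
        before = sum(event not in left for event in trace[:position])
        after = sum(event not in right for event in trace[position:])
        return before + after

    position = min(range(len(trace) + 1), key=removed)
    return (
        tuple(event for event in trace[:position] if event in left),
        tuple(event for event in trace[position:] if event in right),
    )


fitting = split_trace(("a", "a", "a", "b", "b", "b"), {"a"}, {"b"})
deviating = split_trace(("a", "a", "b", "b", "a", "b"), {"a"}, {"b"})
```

Under this rule the fitting trace splits cleanly, and the deviating trace <a,a,b,b,a,b> becomes <a,a> and <b,b,b>, since that split removes a single event while <a,a,a> and <b> would remove two.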


Comparison

• 12 logs, 5 miners

• Discover (< 2 hours)
• (Convert to Petri net)
• Measure

• Inductive Miner
• Inductive Miner - infrequent
• Heuristics Miner
• Integer Linear Programming Miner
• Evolutionary Tree Miner
• (Flower model)
• (Trace model)


Comparison

[Figure: comparison of IM, IMi, HM, ILP and ETM on fitness, precision, generalisation, simplicity, mining time, completed and soundness]


You have been watching

In order of appearance


Base cases

• Enough empty traces: filter
• Average number of events per trace close enough to 1

Log [<a>^1000, <a,a>^1, < >^1]: a

Log [<a>^1000, <a,a>^1000, < >^1000]: a, τ

Log [<a>^1000, <a,a>^1, < >^1000]: x with τ and the remaining log [<a>^1000, <a,a>^1]
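The two base-case checks can be sketched as follows, handling empty traces first and then testing whether a single-activity log averages about one event per trace. How "enough" and "close enough" are measured is an assumption here (a fraction of the log, and an absolute distance from 1, both tied to one `threshold` parameter); the function name `base_case` and the returned labels are hypothetical.

```python
from collections import Counter


def base_case(log, threshold=0.2):
    """`log` maps each trace (a tuple) to its multiplicity."""
    total = sum(log.values())
    empty = log.get((), 0)
    non_empty = Counter({t: n for t, n in log.items() if t})
    if empty:
        if empty / total >= threshold:
            # Too many empty traces to ignore: choice between tau and the rest.
            return ("xor_tau", non_empty)
        # Few empty traces: treat them as noise and drop them.
        log, total = non_empty, total - empty
    activities = {event for trace in log for event in trace}
    if len(activities) == 1:
        events = sum(len(trace) * n for trace, n in log.items())
        if abs(events / total - 1) <= threshold:
            return ("activity", next(iter(activities)))
    return ("no_base_case", log)


first = base_case(Counter({("a",): 1000, ("a", "a"): 1, (): 1}))
third = base_case(Counter({("a",): 1000, ("a", "a"): 1, (): 1000}))
```

Under these assumptions the first log on the slide reduces to the single activity a, while the third keeps its empty traces as a τ branch and continues with the remaining log.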