/faculteit technologie management

Genetic Process Mining

Ana Karla Medeiros, Ton Weijters, Wil van der Aalst

Eindhoven University of Technology
Department of Information Systems
[email protected]

Post on 19-Dec-2015



Outline

• Process Mining
• Genetic Algorithms
• Genetic Process Mining
  – Internal Representation
  – Fitness measure
  – Genetic Operators
• Experiments and Results
• Conclusion and Future Work


Process Mining

X = apply for license
A = classes motorbike
B = classes car
C = theoretical exam
D = practical motorbike exam
E = practical car exam
Y = get result

Process Mining (cont.)

• Most of the current techniques cannot handle
  – Structural constructs: non-free choice, duplicate tasks and invisible tasks
  – Noisy logs
  – Reason: local approach


Genetic Algorithms

– Global approach

[Figure: curve marking a local optimum and the global optimum]


Genetic Process Mining (GPM)

Aim: Use a genetic algorithm to tackle non-free choice, invisible tasks, duplicate tasks and noise.

• Internal Representation
• Fitness Measure
• Genetic Operators


GPM – Build the Initial Population

• Causal Matrix

Input    True     X        X        A \/ B   A /\ C   B /\ C   D \/ E
         X        A        B        C        D        E        Y        Output
X        0        1        1        0        0        0        0        A \/ B
A        0        0        0        1        1        0        0        C /\ D
B        0        0        0        1        0        1        0        C /\ E
C        0        0        0        0        1        1        0        D \/ E
D        0        0        0        0        0        0        1        Y
E        0        0        0        0        0        0        1        Y
Y        0        0        0        0        0        0        0        True

[Figure: net of the example process over tasks X, A, B, C, D, E, Y]
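The causal matrix on this slide can be written down as plain data. The sketch below is one possible encoding, not the representation used in the ProM plug-in: the names `TASKS`, `CAUSAL`, `INPUT`, `OUTPUT` and the helper functions are hypothetical, and the choice to store each condition as a list of OR-groups that are AND-ed together is an assumption made for illustration.

```python
TASKS = ["X", "A", "B", "C", "D", "E", "Y"]

# CAUSAL[a][j] == 1 means task a directly causes TASKS[j] (rows copied from the slide).
CAUSAL = {
    "X": [0, 1, 1, 0, 0, 0, 0],
    "A": [0, 0, 0, 1, 1, 0, 0],
    "B": [0, 0, 0, 1, 0, 1, 0],
    "C": [0, 0, 0, 0, 1, 1, 0],
    "D": [0, 0, 0, 0, 0, 0, 1],
    "E": [0, 0, 0, 0, 0, 0, 1],
    "Y": [0, 0, 0, 0, 0, 0, 0],
}

# Assumed encoding: a condition is a list of OR-groups that are AND-ed together.
# "A \/ B" -> [{"A", "B"}], "C /\ D" -> [{"C"}, {"D"}], "True" -> [].
INPUT = {
    "X": [], "A": [{"X"}], "B": [{"X"}], "C": [{"A", "B"}],
    "D": [{"A"}, {"C"}], "E": [{"B"}, {"C"}], "Y": [{"D", "E"}],
}
OUTPUT = {
    "X": [{"A", "B"}], "A": [{"C"}, {"D"}], "B": [{"C"}, {"E"}],
    "C": [{"D", "E"}], "D": [{"Y"}], "E": [{"Y"}], "Y": [],
}


def successors(task):
    """Tasks marked with 1 in the row of `task`."""
    return {TASKS[j] for j, bit in enumerate(CAUSAL[task]) if bit}


def predecessors(task):
    """Tasks whose row has a 1 in the column of `task`."""
    column = TASKS.index(task)
    return {t for t in TASKS if CAUSAL[t][column]}


def is_consistent():
    """Check that the conditions mention exactly the tasks in the 0/1 matrix."""
    return all(
        set().union(*OUTPUT[t]) == successors(t)
        and set().union(*INPUT[t]) == predecessors(t)
        for t in TASKS
    )


print(is_consistent())
```

The consistency check reflects the way the slide reads: the Input row and the Output column carry the same causal relation as the 0/1 entries, plus the AND/OR structure.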

GPM – Build the Initial Population

• Every individual has the same number of tasks

[Figure: (1) log, (2) set of tasks X, A, B, C, D, E, Y, (3) randomly created individuals over those tasks]
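The three steps on this slide (log, set of tasks, randomly created individuals) could be sketched roughly as follows. This is a minimal illustration, not the plug-in's procedure: the function names, the parameter `p_causal`, the 0.5 grouping probability, and the encoding of conditions as AND-ed OR-groups are all assumptions, and the two-trace `LOG` is made up from the running example.

```python
import random

# Hypothetical two-trace log over the tasks of the running example.
LOG = [["X", "A", "C", "D", "Y"], ["X", "B", "C", "E", "Y"]]


def tasks_in_log(log):
    """Step (2): collect the tasks that occur in the log, in first-seen order."""
    tasks = []
    for trace in log:
        for task in trace:
            if task not in tasks:
                tasks.append(task)
    return tasks


def random_groups(members, rng):
    """Split members into random non-empty OR-groups (groups are AND-ed)."""
    groups = []
    for member in members:
        if groups and rng.random() < 0.5:
            rng.choice(groups).add(member)
        else:
            groups.append({member})
    return groups


def random_individual(tasks, rng, p_causal=0.3):
    """Step (3): draw a random causal relation, then random conditions over it."""
    causal = [(a, b) for a in tasks for b in tasks if rng.random() < p_causal]
    return {
        "input": {t: random_groups([a for a, b in causal if b == t], rng) for t in tasks},
        "output": {t: random_groups([b for a, b in causal if a == t], rng) for t in tasks},
    }


def initial_population(log, size, seed=0):
    """Every individual is built over the same set of tasks taken from the log."""
    rng = random.Random(seed)
    tasks = tasks_in_log(log)
    return [random_individual(tasks, rng) for _ in range(size)]


population = initial_population(LOG, size=4)
print(len(population), sorted(population[0]["input"]))
```

Because input and output conditions are both derived from one causal relation, each individual stays internally consistent even though its structure is random.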

GPM – Calculate Fitness

• Main idea
  – Benefit the individuals that can parse more frequent material in the log
• Challenges
  – How to assess an individual’s fitness?
  – How to punish individuals that allow for undesired extra behavior?

Fitness - How to assess an individual’s fitness?

- Use a continuous semantics parser and register problems

[Formula not preserved in this transcript; L = log and CM = causal matrix]

Trace: X,A,C,D,Y

For noise-free logs, fitness punishes: AND-join, OR-join, OR-split, AND-split

[Figure: the original net and an individual, both over tasks X, A, B, C, D, E, Y]
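A continuous semantics parser keeps replaying a trace even when a task is not properly enabled, and records what went wrong. The sketch below shows that idea on the example individual under simplifying assumptions: a token is kept per output group of a fired task, each input group of a task needs one matching token, and the names `parse_trace`, `missing` and `left_behind` are hypothetical. It does not reproduce the fitness formula from the slides.

```python
from collections import Counter

# Example individual; conditions are AND-ed lists of OR-groups (assumed encoding).
INPUT = {
    "X": [], "A": [{"X"}], "B": [{"X"}], "C": [{"A", "B"}],
    "D": [{"A"}, {"C"}], "E": [{"B"}, {"C"}], "Y": [{"D", "E"}],
}
OUTPUT = {
    "X": [{"A", "B"}], "A": [{"C"}, {"D"}], "B": [{"C"}, {"E"}],
    "C": [{"D", "E"}], "D": [{"Y"}], "E": [{"Y"}], "Y": [],
}


def parse_trace(trace, inp, out):
    """Replay one trace without stopping at errors and count the problems."""
    tokens = Counter()   # (source task, frozenset of possible targets) -> count
    missing = 0          # input groups that had no token when a task fired
    parsed_ok = 0        # tasks that fired with all input groups satisfied
    for task in trace:
        problems = 0
        for group in inp[task]:
            place = next(
                (p for p, n in tokens.items()
                 if n > 0 and p[0] in group and task in p[1]),
                None,
            )
            if place is None:
                problems += 1          # register the problem, keep parsing
            else:
                tokens[place] -= 1     # consume the token
        for group in out[task]:
            tokens[(task, frozenset(group))] += 1
        missing += problems
        parsed_ok += int(problems == 0)
    left_behind = sum(tokens.values())  # tokens never consumed
    return {"parsed_ok": parsed_ok, "missing": missing, "left_behind": left_behind}


good = parse_trace(["X", "A", "C", "D", "Y"], INPUT, OUTPUT)
bad = parse_trace(["X", "A", "C", "E", "Y"], INPUT, OUTPUT)
print(good)  # all five tasks parse, nothing missing, nothing left behind
print(bad)   # E fires without a token from B; the token from A to D is left behind
```

Counting both missing and left-behind tokens is one way to make wrong splits and joins visible: the second trace is penalized twice, once at E and once for the unused token from A.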


Fitness - How to punish individuals that allow for undesired extra behavior?

[Figure: three individuals over tasks X, A, B, C, D, E, Y; label: Fitness = 1]

- Count the number of enabled tasks at every reachable marking
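To illustrate why counting enabled tasks separates individuals that parse the log equally well, the sketch below replays a small log on two individuals and sums the tasks enabled at each marking visited. Several things here are assumptions: the log, the `LOOSE` individual (the example net without the dependencies from A to D and from B to E), the token semantics, and the choice to look only at markings reached while replaying the log.

```python
from collections import Counter

LOG = [["X", "A", "C", "D", "Y"], ["X", "B", "C", "E", "Y"]]

# Example individual from the slides (conditions: AND-ed lists of OR-groups).
ORIGINAL = {
    "input": {"X": [], "A": [{"X"}], "B": [{"X"}], "C": [{"A", "B"}],
              "D": [{"A"}, {"C"}], "E": [{"B"}, {"C"}], "Y": [{"D", "E"}]},
    "output": {"X": [{"A", "B"}], "A": [{"C"}, {"D"}], "B": [{"C"}, {"E"}],
               "C": [{"D", "E"}], "D": [{"Y"}], "E": [{"Y"}], "Y": []},
}
# Hypothetical individual that also parses LOG but allows X,A,C,E,Y as well.
LOOSE = {
    "input": {"X": [], "A": [{"X"}], "B": [{"X"}], "C": [{"A", "B"}],
              "D": [{"C"}], "E": [{"C"}], "Y": [{"D", "E"}]},
    "output": {"X": [{"A", "B"}], "A": [{"C"}], "B": [{"C"}],
               "C": [{"D", "E"}], "D": [{"Y"}], "E": [{"Y"}], "Y": []},
}


def find_token(tokens, group, task):
    """Return a marked place that feeds `task` from some task in `group`."""
    for (source, targets), count in tokens.items():
        if count > 0 and source in group and task in targets:
            return (source, targets)
    return None


def is_enabled(task, tokens, individual, at_start):
    groups = individual["input"][task]
    if not groups:
        return at_start  # assumption: a start task is enabled only initially
    return all(find_token(tokens, g, task) is not None for g in groups)


def fire(task, tokens, individual):
    for group in individual["input"][task]:
        place = find_token(tokens, group, task)
        if place is not None:
            tokens[place] -= 1
    for group in individual["output"][task]:
        tokens[(task, frozenset(group))] += 1


def count_enabled(log, individual):
    """Sum the number of enabled tasks over the markings visited during replay."""
    total = 0
    tasks = list(individual["input"])
    for trace in log:
        tokens = Counter()
        for step, task in enumerate(trace):
            total += sum(is_enabled(t, tokens, individual, step == 0) for t in tasks)
            fire(task, tokens, individual)
    return total


print(count_enabled(LOG, ORIGINAL))  # 12
print(count_enabled(LOG, LOOSE))     # 14
```

Both individuals replay the log without problems, but the looser one enables both D and E after C, so its count is higher. A fitness measure can use that difference to prefer the individual with less extra behavior.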

GPM – Calculate Fitness

[Fitness formula not preserved in this transcript]

where L = log, CM = causal matrix, and CM[] = population

GPM – Create next population

• Genetic operators
  – Crossover
    • Recombines existing material in the population
    • Crossover point = task
    • Crossover probability
    • Subsets are swapped
  – Mutation
    • Introduces new material in the population
    • Every task of an individual can be mutated
    • Mutation probability
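The two operators listed above might look roughly like this in code. It is a simplified sketch, not the plug-in's implementation: crossover only swaps the tails of the input subsets of one randomly chosen task, mutation only adds or removes one task in an input subset, and `repair_outputs`, `p_cross`, `p_mut` and the `LOOSE` parent are hypothetical names and values introduced for illustration.

```python
import copy
import random

ORIGINAL = {
    "input": {"X": [], "A": [{"X"}], "B": [{"X"}], "C": [{"A", "B"}],
              "D": [{"A"}, {"C"}], "E": [{"B"}, {"C"}], "Y": [{"D", "E"}]},
    "output": {"X": [{"A", "B"}], "A": [{"C"}, {"D"}], "B": [{"C"}, {"E"}],
               "C": [{"D", "E"}], "D": [{"Y"}], "E": [{"Y"}], "Y": []},
}
LOOSE = {
    "input": {"X": [], "A": [{"X"}], "B": [{"X"}], "C": [{"A", "B"}],
              "D": [{"C"}], "E": [{"C"}], "Y": [{"D", "E"}]},
    "output": {"X": [{"A", "B"}], "A": [{"C"}], "B": [{"C"}],
               "C": [{"D", "E"}], "D": [{"Y"}], "E": [{"Y"}], "Y": []},
}


def sources(groups):
    """All tasks mentioned in a list of OR-groups."""
    return set().union(*groups)


def repair_outputs(ind, task):
    """Make every OUTPUT agree with the (possibly changed) INPUT of `task`."""
    wanted = sources(ind["input"][task])
    for other in list(ind["output"]):
        groups = ind["output"][other]
        has = task in sources(groups)
        if other in wanted and not has:
            groups.append({task})
        elif other not in wanted and has:
            ind["output"][other] = [g - {task} for g in groups if g - {task}]


def crossover(parent1, parent2, rng, p_cross=0.8):
    """Crossover point = task; tails of its input subsets are swapped."""
    child1, child2 = copy.deepcopy(parent1), copy.deepcopy(parent2)
    if rng.random() >= p_cross:
        return child1, child2
    task = rng.choice(sorted(child1["input"]))
    in1, in2 = child1["input"][task], child2["input"][task]
    cut1, cut2 = rng.randint(0, len(in1)), rng.randint(0, len(in2))
    child1["input"][task] = [set(g) for g in in1[:cut1] + in2[cut2:]]
    child2["input"][task] = [set(g) for g in in2[:cut2] + in1[cut1:]]
    repair_outputs(child1, task)
    repair_outputs(child2, task)
    return child1, child2


def mutate(individual, rng, p_mut=0.2):
    """Every task can be mutated: add or remove one task in its input subsets."""
    child = copy.deepcopy(individual)
    tasks = sorted(child["input"])
    for task in tasks:
        if rng.random() >= p_mut:
            continue
        groups = child["input"][task]
        if groups and rng.random() < 0.5:
            group = rng.choice(groups)
            group.discard(rng.choice(sorted(group)))
            child["input"][task] = [g for g in groups if g]
        else:
            groups.append({rng.choice(tasks)})
        repair_outputs(child, task)
    return child


rng = random.Random(42)
offspring1, offspring2 = crossover(ORIGINAL, LOOSE, rng, p_cross=1.0)
mutant = mutate(offspring1, rng)
```

The repair step is a design choice of this sketch: after an input condition changes, the output conditions of the other tasks are adjusted so that both sides still describe the same causal relation.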

GPM – Create next population

• Genetic operators - Crossover

[Figure: Parent 1 and Parent 2 recombined into Offspring 1 and Offspring 2, each over tasks X, A, B, C, D, E, Y]


Experiments and Results

• Experiments
  – ProM framework
    • Genetic Algorithm Plug-in
    • http://www.processmining.org
  – Simulated data
• Results
  – The genetic algorithm found models that could parse all the traces in the log

ProM framework – Genetic Algorithm Plug-in

[Figures not preserved in this transcript]


Conclusion and Future Work

• Conclusion
  – Genetic algorithms can be used to mine process models
• Future Work
  – Tackle duplicate tasks
    • How to detect the right level of abstraction?
  – Apply the genetic process mining to "real-life" logs
    • How to deal with noise?

http://www.processmining.org