Post on 21-Dec-2015


/faculteit technologie management

Genetic Process Mining

Wil van der Aalst, Ana Karla Medeiros, Ton Weijters

Eindhoven University of Technology

Department of Information Systems

[email protected]


Outline

• Process Mining

• Genetic Algorithms

• Genetic Process Mining
  – Internal Representation
  – Fitness measure
  – Genetic Operators

• Experiments and Results

• Conclusion and Future Work


Process Mining

X = apply for license
A = classes motorbike
B = classes car
C = theoretical exam
D = practical motorbike exam
E = practical car exam
Y = get result

Process Mining (cont.)

• Most of the current techniques cannot handle
  – Structural constructs: non-free choice, duplicate tasks and invisible tasks
  – Noisy logs
  – Reason: local approach


Genetic Algorithms

– Global approach

[Figure: local optimum versus global optimum]


Genetic Process Mining (GPM)

Aim: use a genetic algorithm to tackle noise, duplicate tasks, non-free choice and invisible tasks

Internal Representation

Fitness Measure

Genetic Operators
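How these three ingredients plug into a genetic algorithm can be sketched generically. This is a minimal sketch, not the ProM plug-in: the function `evolve`, the tournament selection, the elitism setting and the toy bit-string problem are all illustrative assumptions.

```python
import random


def evolve(population, fitness, crossover, mutate, generations, rng, elite=1):
    """Generic GA loop: representation, fitness and operators are passed in."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:elite]  # elitism: the best individuals survive unchanged
        while len(next_gen) < len(population):
            # Tournament selection of size 2 (an assumption of this sketch).
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            next_gen.append(mutate(crossover(p1, p2, rng), rng))
        population = next_gen
    return max(population, key=fitness)


# Toy stand-ins: individuals are bit lists and fitness is the number of ones.
def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def flip(bits, rng, rate=0.05):
    return [1 - b if rng.random() < rate else b for b in bits]


rng = random.Random(42)
start = [[rng.randint(0, 1) for _ in range(12)] for _ in range(20)]
best = evolve(start, sum, one_point, flip, generations=30, rng=rng)
```

In genetic process mining the bit lists would be replaced by causal matrices, and `sum` by a fitness computed against the event log.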


GPM – Internal Representation

• Causal Matrix
  – Compact representation

Causal matrix (row = source task, column = target task; the Input row and the Output column give each task's input and output condition):

        X       A       B       C       D       E       Y       Output
Input   True    X       X       A \/ B  A /\ C  B /\ C  D \/ E
X       0       1       1       0       0       0       0       A \/ B
A       0       0       0       1       1       0       0       C /\ D
B       0       0       0       1       0       1       0       C /\ E
C       0       0       0       0       1       1       0       D \/ E
D       0       0       0       0       0       0       1       Y
E       0       0       0       0       0       0       1       Y
Y       0       0       0       0       0       0       0       True

Compact representation (subsets are AND-ed; tasks inside one subset are OR-ed):

Task    Input           Output
X       {}              {{A,B}}
A       {{X}}           {{C},{D}}
B       {{X}}           {{C},{E}}
C       {{A,B}}         {{D,E}}
D       {{A},{C}}       {{Y}}
E       {{B},{C}}       {{Y}}
Y       {{D,E}}         {}
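The relation between the compact representation and the 0/1 matrix can be sketched in a few lines. This is an illustrative sketch only: the dictionary layout and the helper names `to_expression` and `to_matrix` are assumptions, and Y's input is taken as {{D,E}} so that it matches the condition D \/ E in the matrix.

```python
# Compact causal matrix: task -> (input subsets, output subsets).
# Subsets are AND-ed together; tasks inside one subset are OR-ed.
CAUSAL_MATRIX = {
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),  # assumed {{D,E}}, i.e. D \/ E
}


def to_expression(subsets):
    """Render a list of subsets as a boolean condition such as 'A /\\ C'."""
    if not subsets:
        return "True"
    parts = []
    for subset in subsets:
        text = " \\/ ".join(sorted(subset))
        if len(subset) > 1 and len(subsets) > 1:
            text = f"({text})"  # parenthesise an OR inside an AND
        parts.append(text)
    return " /\\ ".join(parts)


def to_matrix(cm):
    """Derive the 0/1 causal matrix rows from the output subsets."""
    tasks = list(cm)
    return {
        src: [1 if any(dst in s for s in cm[src][1]) else 0 for dst in tasks]
        for src in tasks
    }


for task, (inputs, outputs) in CAUSAL_MATRIX.items():
    print(task, to_matrix(CAUSAL_MATRIX)[task],
          "| in:", to_expression(inputs), "| out:", to_expression(outputs))
```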


GPM – Internal Representation

• Causal Matrix
  – Semantics

Task    Input           Output
A       {}              {{B},{C,D}}
B       {{A}}           {{E,F}}
C       {{A}}           {{E}}
D       {{A}}           {{F}}
E       {{B},{C}}       {{G}}
F       {{B},{D}}       {{G}}
G       {{E},{F}}       {}

Deadlock!

Invisible tasks only fire to enable visible tasks!
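The deadlock in this table can be reproduced with a small token-game sketch. It rests on simplifying assumptions that are not taken from the slides: every output subset of a fired task holds one token, a task needs one matching token per input subset, tokens are chosen greedily, and the `START` marker and all function names are hypothetical.

```python
from collections import Counter

START = "<start>"  # hypothetical marker for the initial token

CM = {
    "A": ([], [{"B"}, {"C", "D"}]),
    "B": ([{"A"}], [{"E", "F"}]),
    "C": ([{"A"}], [{"E"}]),
    "D": ([{"A"}], [{"F"}]),
    "E": ([{"B"}, {"C"}], [{"G"}]),
    "F": ([{"B"}, {"D"}], [{"G"}]),
    "G": ([{"E"}, {"F"}], []),
}


def initial_marking(cm):
    """One start token for every task without input subsets."""
    return Counter({(START, frozenset({t})): 1 for t in cm if not cm[t][0]})


def try_consume(cm, marking, task):
    """Return the marking after taking one token per input subset, or None."""
    m = marking.copy()
    for subset in (cm[task][0] or [{START}]):
        token = next((k for k, n in m.items()
                      if n > 0 and k[0] in subset and task in k[1]), None)
        if token is None:
            return None  # some input subset cannot be satisfied
        m[token] -= 1
    return m


def fire(cm, marking, task):
    """Fire an enabled task: consume its inputs, then fill its output subsets."""
    m = try_consume(cm, marking, task)
    if m is None:
        raise ValueError(f"{task} is not enabled")
    for subset in cm[task][1]:
        m[(task, frozenset(subset))] += 1
    return m


def enabled_tasks(cm, marking):
    return [t for t in cm if try_consume(cm, marking, t) is not None]


marking = initial_marking(CM)
for task in ["A", "B", "C", "E"]:
    marking = fire(CM, marking, task)
print(enabled_tasks(CM, marking))  # nothing is enabled, yet G has not fired
```

After A, B, C and E have fired, G still waits for F, and F waits for D, which lost the choice against C: no task is enabled although the end task has not been reached.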

GPM – Internal Representation

• Causal Matrix
  – Mappings

Task    Input           Output
A       {}              {{B},{C,D}}
B       {{A}}           {{E,F}}
C       {{A}}           {{E}}
D       {{A}}           {{F}}
E       {{B},{C}}       {{G}}
F       {{B},{D}}       {{G}}
G       {{E},{F}}       {}

GPM – Internal Representation

• Causal Matrix
  – Mappings

Task    Input           Output
A       {}              {{C,D}}
B       {}              {{D}}
C       {{A}}           {}
D       {{A,B}}         {}

GPM – Fitness Measure

• Main idea
  – Benefit the individuals that can parse more frequent material in the log

• Challenges
  – How to assess an individual’s fitness?
  – How to punish individuals that allow for undesired extra behavior?

Fitness - How to assess an individual’s fitness?

- Use continuous semantics parser and register problems

[Formula not preserved in the transcript; in it, L = log and CM = causal matrix]
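A continuous-semantics parse can be sketched as a replay that never stops at a problem but records it instead. The sketch below is an assumption-laden illustration: the token game, the `START` marker, the `replay` function and the three counters it reports are hypothetical, and it does not reproduce the fitness formula from the slides.

```python
from collections import Counter

START = "<start>"  # hypothetical marker for the initial token

# Driving-licence model in compact form (Y's input assumed to be {{D,E}}).
CM = {
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),
}


def consume(cm, marking, task):
    """Take one token per input subset; return how many tokens were missing."""
    missing = 0
    for subset in (cm[task][0] or [{START}]):
        token = next((k for k, n in marking.items()
                      if n > 0 and k[0] in subset and task in k[1]), None)
        if token is None:
            missing += 1  # register the problem and keep going
        else:
            marking[token] -= 1
    return missing


def replay(cm, trace):
    """Replay a trace without stopping at errors and report what went wrong."""
    marking = Counter({(START, frozenset({t})): 1 for t in cm if not cm[t][0]})
    parsed = missing = 0
    for task in trace:
        lacking = consume(cm, marking, task)
        parsed += lacking == 0  # counted only when the task was enabled
        missing += lacking
        for subset in cm[task][1]:
            marking[(task, frozenset(subset))] += 1
    return {"parsed": parsed, "missing": missing,
            "left_behind": sum(marking.values())}


print(replay(CM, ["X", "A", "C", "D", "Y"]))  # fits the model
print(replay(CM, ["X", "C", "D", "Y"]))       # noisy: A was skipped
```

The second trace is still parsed to the end; the skipped task shows up as missing tokens and as a token left behind, which a fitness measure can then weigh.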

Trace: SS,A,B,C,D,EE

[Figure: the original net and an individual, each drawn over the tasks SS, A, B, C, D, E, EE]

For noise-free logs, fitness punishes: OR-split, AND-split, AND-join, OR-join


Fitness - How to punish individuals that allow for undesired extra behavior?

Fitness = 1

- Count the number of enabled tasks at every reachable marking
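Counting enabled tasks separates a precise model from an over-general one even when both parse the same trace. The sketch below is illustrative and makes several assumptions: it only visits the markings reached while replaying one trace, it reuses a simplified token game, and the `FLOWER` individual and all function names are hypothetical.

```python
from collections import Counter

START = "<start>"  # hypothetical marker for the initial token

# Driving-licence model (Y's input assumed to be {{D,E}}).
LICENCE = {
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),
}

# Over-general individual: after X, any task may follow any task.
REST = {"A", "B", "C", "D", "E", "Y"}
FLOWER = {"X": ([], [set(REST)])}
FLOWER.update({t: ([REST | {"X"}], [set(REST)]) for t in sorted(REST)})


def consume(cm, marking, task):
    """Take one token per input subset; return how many tokens were missing."""
    missing = 0
    for subset in (cm[task][0] or [{START}]):
        token = next((k for k, n in marking.items()
                      if n > 0 and k[0] in subset and task in k[1]), None)
        if token is None:
            missing += 1
        else:
            marking[token] -= 1
    return missing


def count_enabled(cm, trace):
    """Sum the number of enabled tasks over the markings visited by the trace."""
    marking = Counter({(START, frozenset({t})): 1 for t in cm if not cm[t][0]})
    total = 0
    for task in trace:
        # A task is enabled if consuming from a copy leaves nothing missing.
        total += sum(consume(cm, marking.copy(), t) == 0 for t in cm)
        consume(cm, marking, task)
        for subset in cm[task][1]:
            marking[(task, frozenset(subset))] += 1
    return total


trace = ["X", "A", "C", "D", "Y"]
print(count_enabled(LICENCE, trace), count_enabled(FLOWER, trace))
```

Both individuals parse the trace, but the over-general one keeps six tasks enabled after every step, so a penalty based on this count pushes the search toward the more precise model.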

Fitness Measure

[Formula not preserved in the transcript; in it, L = log, CM = causal matrix and CM[] = population]

Genetic Operators

• Crossover
  – Recombines existing material in the population
  – Crossover probability
  – Crossover point = task
  – Subsets are swapped

• Mutation
  – Introduces new material in the population
  – Mutation probability
  – Every task of an individual can be mutated
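Both operators can be sketched on the compact representation. This is a simplified illustration under stated assumptions, not the plug-in's operators: only INPUT subsets are touched, the swap points and the regrouping rule are invented for the example, and no repair step keeps the other tasks' OUTPUT sets consistent afterwards. The names `crossover`, `mutate`, `P1` and `P2` are hypothetical.

```python
import copy
import random

# Two parents over the same tasks; they differ in how D joins A and C.
P1 = {
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),
}
P2 = copy.deepcopy(P1)
P2["D"] = ([{"A", "C"}], [{"Y"}])


def crossover(parent1, parent2, rng):
    """Pick one task and swap the tails of its INPUT subsets between parents."""
    child1, child2 = copy.deepcopy(parent1), copy.deepcopy(parent2)
    task = rng.choice(sorted(parent1))  # crossover point = task
    in1, in2 = child1[task][0], child2[task][0]
    cut1, cut2 = rng.randint(0, len(in1)), rng.randint(0, len(in2))
    child1[task] = (in1[:cut1] + in2[cut2:], child1[task][1])
    child2[task] = (in2[:cut2] + in1[cut1:], child2[task][1])
    return child1, child2, task


def mutate(individual, rate, rng):
    """With probability `rate` per task, regroup that task's INPUT subsets."""
    mutant = copy.deepcopy(individual)
    for task, (inputs, outputs) in individual.items():
        causes = sorted(set().union(*inputs)) if inputs else []
        if causes and rng.random() < rate:
            groups = [set() for _ in range(rng.randint(1, len(causes)))]
            for cause in causes:
                rng.choice(groups).add(cause)
            mutant[task] = ([g for g in groups if g], outputs)
    return mutant


rng = random.Random(7)
child1, child2, point = crossover(P1, P2, rng)
mutant = mutate(child1, 0.2, rng)
```

Regrouping keeps the same causes but changes how they are AND-ed and OR-ed, which is one way to introduce material that neither parent contains.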


Experiments and Results

• Experiments
  – ProM framework
    • Genetic Algorithm Plug-in
    • http://www.processmining.org
  – Simulated data

• Results
  – The genetic algorithm found models that could parse all the traces in the log


ProM framework – Genetic Algorithm Plug-in


Conclusion and Future Work

• Conclusion
  – Genetic algorithms can be used to mine process models

• Future Work
  – Tackle duplicate tasks
  – Apply genetic process mining to "real-life" logs

http://www.processmining.org