process mining - chapter 7 - conformance checking
DESCRIPTION
Slides supporting the book "Process Mining: Discovery, Conformance, and Enhancement of Business Processes" by Wil van der Aalst. See also http://springer.com/978-3-642-19344-6 (ISBN 978-3-642-19344-6) and the website http://www.processmining.org/book/start providing sample logs.TRANSCRIPT
Chapter 7Conformance Checking
prof.dr.ir. Wil van der Aalstwww.processmining.org
Overview
PAGE 1
Part I: Preliminaries
Chapter 2 Process Modeling and Analysis
Chapter 3Data Mining
Part II: From Event Logs to Process Models
Chapter 4 Getting the Data
Chapter 5 Process Discovery: An Introduction
Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
Chapter 7 Conformance Checking
Chapter 8 Mining Additional Perspectives
Chapter 9 Operational Support
Part IV: Putting Process Mining to Work
Chapter 10 Tool Support
Chapter 11 Analyzing “Lasagna Processes”
Chapter 12 Analyzing “Spaghetti Processes”
Part V: Reflection
Chapter 13Cartography and Navigation
Chapter 14Epilogue
Chapter 1 Introduction
Conformance checking
PAGE 2
software system
(process)model
eventlogs
modelsanalyzes
discovery
records events, e.g., messages,
transactions, etc.
specifies configures implements
analyzes
supports/controls
enhancement
conformance
“world”
people machines
organizationscomponents
businessprocesses
Using conformance checking
PAGE 3
Context
• Corporate governance, risk, compliance, and legislation such as the Sarbanes-Oxley (US), Basel II/III (EU), J-SOX (Japan), C-SOX (Canada), 8th EU Directive (EURO-SOX), BilMoG (Germany), MiFID (EU), Law 262/05 (Italy), Code Lippens (Belgium), and Code Tabaksblat (Netherlands).
• ISO 9001:2008 requires organizations to model their operational processes.
• Business alignment: make sure that the information systems and the real business processes are well aligned.
PAGE 4
Auditing
• The term auditing refers to the evaluation of organizations and their processes.
• Audits are performed to ascertain the validity and reliability of information about these organizations and associated processes.
• This is done to check whether business processes are executed within certain boundaries set by managers, governments, and other stakeholders.
• Obviously, process mining can help to detect fraud, malpractice, risks, and inefficiencies.
• All events in a business process can be evaluated and this can be done while the process is still running.
PAGE 5
Deviations?
• Is the model or the log “wrong”?• “Desirable” or “undesirable” deviations?• “Breaking the glass” may save lives!
PAGE 6
Replay: Connecting events to model elements is essential for process mining
PAGE 7
event log process model
Play-In
event logprocess model
Play-Out
event log process model
Replay
• extended model showing times, frequencies, etc.
• diagnostics• predictions• recommendations
A
B
C
DE
p2
end
p4
p3p1
start
Play Out (Classical use of models)
PAGE 8
A B C D
A C B DA B C D
A E D
A C B DA C B D
A E D
A E D
Play In (Process Discovery)
PAGE 9
A
B
C
DE
p2
end
p4
p3p1
start
ABCDACBDAED
ACBDAED
ABCD…
a process discovery algorithm like the αalgorithm
A
B
C
DE
p2
end
p4
p3p1
start
Replay
PAGE 10
A B C D
A
B
C
DE
p2
end
p4
p3p1
start
Replay can detect problems
PAGE 11
AC D
Problem!missing token
Problem!token left behind
A
B
C
DE
p2
end
p4
p3p1
start
Replay can extract timing information
PAGE 12
A5B8 C9 D13
5
8
9
13
3
4
5
43
265
8
764
7
74
3
Let us now focus on conformance checking based on Replay
PAGE 13
event log process model
Play-In
event logprocess model
Play-Out
event log process model
Replay
• extended model showing times, frequencies, etc.
• diagnostics• predictions• recommendations
Four models, one log
PAGE 14
astart register
request
bexamine
thoroughly
cexamine casually
dcheck ticket
decide
pay compensation
reject request
reinitiate request
e
g
h
f
end
p3p1
p2 p4
N1
p5
astart register
request
bexamine
thoroughly
cexamine casually
checkticket
decide
pay compensation
reject request
reinitiate request
e
g
hf
endp1 p2
N2
dp3
p4
astart register
request
cexamine casually
dcheckticket
decide reject request
e hend
p1
p2
p3
p4
p5
N3
astart register
request
bexamine thoroughly
cexamine casually
d checkticket
decide
pay compensation
reject request
reinitiate requeste
g
hfend
N4
p1
PAGE 15
Replaying (1/3)σ1 on N1
PAGE 16
astart
b
c
d
e
g
h
f
end
p3p1
p2 p4
p5
p=0c=0m=0r=0
p=1c=0m=0r=0
astart
b
c
d
e
g
h
f
end
p3p1
p2 p4
p5
p=3c=1m=0r=0
Replaying (2/3)
PAGE 17
astart
b
c
d
e
g
h
f
end
p3p1
p2 p4
p5
p=4c=2m=0r=0
astart
b
c
d
e
g
h
f
end
p3p1
p2 p4
p5
p=5c=3m=0r=0
Replaying (3/3)
PAGE 18
astart
b
c
d
e
g
h
f
end
p3p1
p2 p4
p5
p=6c=5m=0r=0
astart
b
c
d
e
g
h
f
end
p3p1
p2 p4
p5
p=7c=6m=0r=0
p=7c=7m=0r=0
No problems found!
Replaying (1/3)σ3 on N2
PAGE 19
astart
b
c e
g
hf
endp1 p2d
p3p4
p=0c=0m=0r=0
p=1c=0m=0r=0
astart
b
c e
g
hf
p1 p2d
p3p4
p=2c=1m=0r=0
end
Replaying (2/3)
PAGE 20
astart
b
c e
g
hf
p1 p2d
p3p4
p=3c=2m=1r=0
astart
b
c e
g
hf
endp1 p2d
p3p4
p=4c=3m=1r=0
m
m
end
Replaying (3/3)
PAGE 21
astart
b
c e
g
hf
p1 p2d
p3p4
p=5c=4m=1r=0
astart
b
c e
g
hf
p1 p2d
p3p4
p=6c=5m=1r=0
p=6c=6m=1r=1
m
mr
end
end
Problems encountered when replaying σ3 on N2
• One missing token (of 6 consumed tokens)• One remaining token (of 6 produced tokens)
PAGE 22
astart
b
c e
g
hf
p1 p2d
p3p4
p=6c=6m=1r=1
mr
end
Computing fitness at trace level
PAGE 23
astart
b
c e
g
hf
p1 p2d
p3p4
p=6c=6m=1r=1
mr
end
Replaying (1/3)σ2 on N3
PAGE 24
astart
c
d
e hend
p1
p2
p3
p4
p5
p=0c=0m=0r=0
p=1c=0m=0r=0
astart
c
d
e hend
p1
p2
p3
p4
p5
p=3c=1m=0r=0
Replaying (2/3)
PAGE 25
astart
c
d
e hend
p1
p2
p3
p4
p5
p=4c=2m=0r=0
astart
c
d
e hend
p1
p2
p3
p4
p5
p=5c=4m=1r=0
m
Replaying (3/3)
PAGE 26
astart
c
d
e hend
p1
p2
p3
p4
p5
p=5c=5m=2r=2
m
m
r
r
p = 5, c = 5, m = 2, and r = 2
Computing fitness at the log level
PAGE 27
Example values
PAGE 28
astart register
request
bexamine
thoroughly
cexamine casually
dcheck ticket
decide
pay compensation
reject request
reinitiate request
e
g
h
f
end
p3p1
p2 p4
N1
p5
astart register
request
bexamine
thoroughly
cexamine casually
checkticket
decide
pay compensation
reject request
reinitiate request
e
g
hf
endp1 p2
N2
dp3
p4
astart register
request
cexamine casually
dcheckticket
decide reject request
e hend
p1
p2
p3
p4
p5
N3
astart register
request
bexamine thoroughly
cexamine casually
d checkticket
decide
pay compensation
reject request
reinitiate requeste
g
hfend
N4
p1
Diagnostics
PAGE 29
astart register
request
bexamine
thoroughly
cexamine casually
checkticket
decide
pay compensation
reject request
reinitiate request
e
g
hf
endp1 p2d
p3p4
1391 1391
566 566
1537
1537
1537
1537 461461
930
930146
146
971 971
problem443 tokens remain in place p2,
because d did not occur although the model expected d to happen
problem443 tokens were missing in place p2 during replay, because d happened even though
this was not possible according to the model
+443-443
Diagnostics
PAGE 30
astart register
request
cexamine casually
dcheckticket
decide reject request
e hend
p1
p2
p3
p4
p5
1391
problem430 tokens remain in place p1, because c did not happen while the model expected c to happen
1391
971 971
1537
1537 930 930
1537
153715371391
problem146 tokens were missing in
place p2 during replay, because d happened while this was not possible according to the model
problem10 tokens were missing in place p1 during
replay, because c happened while this was not possible according to the model
problem566 tokens were missing in
place p3 during replay, because e happened
while this was not possible according to the model
problem607 tokens remain in place p5, because h did not happen while the model expected h to happen
problem461 of the 1391 cases did not
reach place end
+430-10 -566
-461+607
-146
Drilling down
PAGE 31
Comparing footprints
PAGE 32
PAGE 33
Differences quantified
PAGE 34
(x:y where x is in log or N1 and y in N2)
Diagnostics(x:y where x is in log or N1 and y in N2)
PAGE 35
astart register
request
bexamine
thoroughly
cexamine casually
dcheck ticket
decide
pay compensation
reject request
reinitiate request
e
g
h
f
end
p3p1
p2 p4
N1
p5
astart register
request
bexamine
thoroughly
cexamine casually
checkticket
decide
pay compensation
reject request
reinitiate request
e
g
hf
endp1 p2
N2
dp3
p4
Checking Declare specifications
PAGE 36
register request
decide
pay compensation
reject request
g
h
a e
non co-existence
precedencebranched response
See Declare and LTL checker in ProM
Connecting event log and model
• Very important!• Model may be discovered or hand-made.• Connected during replay.• Starting point for other types of process mining!
PAGE 37
...
process case
activity activity instance
event
attribute
timestamp
resource*
1
*1
*
1
*1
1
**1
1
*a
b
c
d
e
f g
h
i
event level
costs
trans- action
model level
instance level
j
k