
Page 1: Synthesizing Object Life Cycles from Business Process Modelsis.tm.tue.nl/staff/heshuis/EshuisVanGorp-SoSyM.pdf · Noname manuscript No. (will be inserted by the editor) Synthesizing

Noname manuscript No.(will be inserted by the editor)

Synthesizing Object Life Cycles from Business Process Models

Rik Eshuis · Pieter Van Gorp

the date of receipt and acceptance should be inserted later

Abstract UML activity diagrams can model the flow of stateful business objects among activities, implicitly specifying the life cycles of those objects. The actual object life cycles are typically expressed in UML state machines. The implicit life cycles in UML activity diagrams need to be discovered in order to derive the actual object life cycles or to check the consistency with an existing life cycle. This paper presents an automated approach for synthesizing a UML state machine modeling the life cycle of an object that occurs in different states in a UML activity diagram. The generated state machines can contain parallelism, loops and cross-synchronization. The approach makes life cycles that have been modeled implicitly in activity diagrams explicit. The synthesis approach has been implemented using a graph transformation tool and has been applied in several case studies.

Keywords process models, state machines, UML, model transformation

1 Introduction

Modern business information systems support the coordination and execution of business processes in which business objects are manipulated. To develop a business information system, both the business process and business object perspective need to be modeled. The Unified Modeling Language [39] is an industrial standard that can be used for modeling such information systems [7].

UML offers two different behavioral notations for the business process and business object perspectives. First, UML activity diagrams can specify business processes, containing both business activities and their ordering as well as the flow of stateful business objects among these activities. State changes of these business objects are due to business activities. Second, UML state machines can express the dynamic behavior of business objects, i.e. their life cycles or life histories consisting of object states connected by transitions [33,34,39].

Rik Eshuis · Pieter Van Gorp
Eindhoven University of Technology
P.O. Box 513, NL-5600 MB, Eindhoven, The Netherlands
{h.eshuis,p.m.e.v.gorp}@tue.nl


Though both notations overlap, they are complementary. A UML activity diagram gives a global view of a business process, addressing different objects, while a UML state machine of an object life cycle offers a local view, linked to one aspect of the global process view [12]. There is a need to relate both views, for instance to check and ensure consistency, or to support traceability. However, an important obstacle is that object life cycles are only implicitly specified in business process models. Moreover, UML state machines have a syntax that differs considerably from the Petri net-inspired syntax of UML activity diagrams. For instance, UML state machines use state hierarchy to express intra-object parallelism, while UML activity diagrams express parallelism using fork and join nodes. This complicates defining a mapping between activity diagrams and state machines.

This paper defines an automated approach for synthesizing a hierarchical state machine from a UML activity diagram that specifies a business process model referencing a stateful object. This way, the approach discovers an object life cycle that is hidden in a business process model. As we discuss in Section 2, the synthesized state machine can be used to generate a software system that coordinates the execution of the activities to which the business object pertains. Alternatively, the synthesized state machine can be used to check consistency with existing state machine descriptions [24].

The synthesis approach consists of two phases. In the first phase, nodes not relevant for the object life cycle are filtered from the activity diagram. In the second phase, the remaining part of the activity diagram is translated into a hierarchical state machine specifying the life cycle of the object referenced by the process model. In particular, the state hierarchy of the state machine is inferred from the structure of the activity diagram. The approach is based on graph transformations and is fully automated. It has been implemented using the graph transformation tool GrGen [14]. Section 6 gives more details on the prototype and our experiences in applying the prototype to case studies.

The remainder of this paper is structured as follows. Section 2 gives an overview of the approach using a running example. Section 3 discusses the first phase of the synthesis approach, in which action nodes and irrelevant control nodes are filtered from the activity diagram. Section 4 details the second phase, in which a filtered activity diagram is translated into a hierarchical state machine with the same control flow. The translation may fail, since activity diagrams allow synchronization using forks and joins in ways not allowed in the control flow of state machines. Section 5 extends the state machine translation to deal with activity diagrams with fine-grained synchronization, which is mimicked using event synchronization in the constructed state machines. Section 6 presents a prototype that implements the approach and discusses our experiences in applying the prototype in case studies. Section 7 discusses related work. Section 8 ends the paper with the conclusion. Appendix A contains formal definitions for the different phases of the synthesis approach that are explained in Sections 3, 4, and 5.

This paper is an extended and revised version of a previous conference paper [12]. Novel parts include an extension of the approach to activity diagrams with cross-synchronization (Section 5), a more elaborate description of the prototype, more case studies, and a formalization of the approach.


[Figure: activity diagram with action nodes Receive, Confirm receipt, Check policy, Check damage, Decide, Reject, Determine costs, Pay, Update client contribution and Notify client, a decision with guards [ok] and [else], and Claim object nodes in states received, confirmed, policy not checked, policy checked, damage not checked, damage checked, decided, rejected, assessed, settled and archived.]

Fig. 1: Activity diagram specifying process for handling insurance claims

2 Overview

To introduce the salient features of the approach, we show in Fig. 1 an example business process model in a UML2 activity diagram. The process specifies handling an insurance claim. Atomic activities are represented by action nodes (ovals) like Receive. After receiving the claim, in parallel (bar symbol) the receipt is confirmed and the policy and damage are checked. After the policy and damage have been checked, a decision (diamond symbol) is made to either reject the claim or to accept the claim. In the latter case, the costs are calculated and paid to the client, and in parallel the periodic contribution to be paid by the client is updated. Finally, if the claim receipt has been confirmed and the claim is fully processed, the client is notified about the decision.

The process updates stateful object Claim. Each state of Claim is represented by an object node (rectangle). The Claim object can be in multiple states at the same time. For instance, after Receive has completed, the Claim object is in three parallel states: received, policy not checked, and damage not checked. Certain activities change the local state of the Claim object. For instance, Check policy changes the state from policy not checked to policy checked, but does not affect the other states.
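The notion of parallel local states can be made concrete with a small sketch. It assumes, purely for illustration, that the overall state of the Claim object is encoded as a set of local state names and that an activity replaces the local states it consumes with the ones it produces. The function name `apply_activity` and this encoding are hypothetical and not part of the approach itself.

```python
def apply_activity(state, consumed, produced):
    """Return the new set of local states after one activity has run.

    state, consumed and produced are sets of local state names
    (an assumed encoding, chosen only for this illustration).
    """
    if not consumed <= state:
        missing = sorted(consumed - state)
        raise ValueError(f"object is not in local state(s) {missing}")
    # Local states that the activity does not touch are carried over unchanged.
    return (state - consumed) | produced


# After Receive has completed, the Claim object is in three parallel states.
after_receive = frozenset({"received", "policy not checked", "damage not checked"})

# Check policy only changes the policy-related local state.
after_check_policy = apply_activity(
    after_receive, {"policy not checked"}, {"policy checked"}
)
print(sorted(after_check_policy))
```

The other two local states, received and damage not checked, remain active after Check policy, which matches the description above.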

Implicitly, the process model specifies the life cycle of the Claim object. Each object node in the activity diagram references a local state of the life cycle. That object life cycle can contain sequence, parallelism, choice, and loops. An example of a choice is the Claim[decided] object node in Fig. 1, which specifies according to the UML2 [39] semantics that object Claim in state decided is either input to Reject or to Determine costs but not both. An example of parallelism is the pair of object nodes Claim[policy checked] and Claim[damage checked], which specify that the Claim object has to be in both states before activity Decide can start.

In UML [39], the behavior of objects is by default specified in state machines, also called statecharts. These state machines use hierarchical (composite) states to express parallelism. Fig. 2 shows a hierarchical state machine modeling the life cycle of a Claim object. Note that states received, policy not checked and damage not checked can be active in parallel, just as in Fig. 1. The UML state machine explicitly models the object life cycle specified implicitly in Fig. 1.

[Figure: state machine with composite states process and O1 to O4, and states received, confirmed, policy not checked, policy checked, damage not checked, damage checked, decided, rejected, assessed, settled and archived; transitions out of decided carry guards [ok] and [else].]

Fig. 2: State machine specifying life cycle of object Claim in Fig. 1

There are two main reasons why it is desirable to automatically derive from a process model with an implicit object life cycle an explicit description of that life cycle. First, such a description can be used to generate software code. In the context of UML [28,39], hierarchical state machines are a core technique from which tools like Rational [35] and Rhapsody [19] generate software code.

Second, a derived object life cycle can be used to check consistency with an existing life cycle description. For instance, Küster et al. [24] explain that the IBM Insurance Application Architecture [5] uses life cycles to define the behavior of business objects in the insurance domain, based on the Acord standard (http://www.acord.org), and that business process models in which these stateful objects are manipulated need to be consistent with those life cycles. Consistency can be checked by first deriving an explicit object life cycle for an object from the process model, and next comparing the object life cycle with the reference object life cycle using techniques like matching [30], consistency checking [8] or equivalence testing [27].

There are three problems that have to be solved in order to derive a hierarchical state machine description from a process model such as the one in Fig. 1. First, some parts of the process model are not relevant for the object life cycle. For instance, the Update client contribution activity does not affect any state of the Claim object, and therefore does not occur in the state machine. These irrelevant parts need to be removed from the process model, but the indirect flows between different object nodes need to be preserved.

Second, UML state machines use hierarchical (composite) states to express parallelism. These composite states have no counterpart in process models like activity diagrams. An example of a composite state is process in Fig. 2, which contains other states like O1 and received. To derive a hierarchical state machine, composite states need to be inferred from the activity diagram syntax. UML 1.5 [38] proposes to translate each pair of fork-join bars to an AND state in order to map activity diagrams to state machines. (A fork is a bar with one incoming and multiple outgoing edges while a join is a bar with multiple incoming and one outgoing edge.) However, an activity diagram in which forks and joins are not paired can also be translated into a hierarchical state machine [12], provided the activity diagram does not contain cross-synchronization, as explained in the next point. For instance, the activity diagram in Fig. 1 does not have paired forks and joins, but it can be translated to the state machine in Fig. 2.

Third, the control flow of activity diagrams can express cross-synchronization between parallel branches, which is impossible to express in state machine control flow [39]. Fig. 3 shows a typical example of an activity diagram that cannot be translated directly into state machines, assuming action nodes map to state machine BASIC states and no filtering rules are applied. After the Withdraw cash action, two parallel branches are started. However, there is a cross-synchronization between the two branches: Print receipt requires that both Check balance and Log entry have been completed. Such a synchronization cannot be expressed in UML 2.3 state machine control flow [39]. For this example, the cross-synchronization construct that impedes the translation occurs in a part of the activity diagram that is irrelevant to the object life cycle, so a state machine containing a sequence of three BASIC nodes can be constructed. However, if the cross-synchronization construct is relevant for the object life cycle, the resulting activity diagram cannot be synthesized into a state machine with equivalent control flow.

[Figure: activity diagram with action nodes Withdraw cash, Check balance, Dispense cash, Log entry and Print receipt, and Account object nodes in states uncredited, checked and credited.]

Fig. 3: Process of withdrawing money from ATM

We define an approach that solves these three problems. Input is an activity diagram specifying the behavior of a stateful object. The approach first filters irrelevant nodes from the activity diagram (Section 3), and next synthesizes a hierarchical state machine from the filtered activity diagram (Section 4) that preserves the control flow of the activity diagram. Both steps are fully automated and do not require any user interaction. Applying the approach to the activity diagram in Fig. 1 results in the state machine shown in Fig. 2, modulo the names of the composite states. If the filtered activity diagram contains cross-synchronization, the approach can construct a hierarchical state machine with similar behavior, but then control flow is no longer preserved; we therefore use event-synchronization to mimic the behavior of the original activity diagram. If the approach fails for the remaining corner cases, diagnostics can be provided giving precise feedback on which part of the activity diagram causes the failure. Section 4.2 explains how the obtained state machine can be further refined into an executable state machine that coordinates the execution of activities from the process model.

We focus in this paper on a single activity diagram with object nodes that reference the same object type. If a single activity diagram references multiple object types, then for each object type a version of the activity diagram can be created that references only that object type, by removing from the original activity diagram those object flows and object nodes that do not refer to the object type. If multiple activity diagrams reference the same object type, they can be grouped into a single activity diagram by adding relevant control nodes to connect the different diagrams.
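The projection onto a single object type can be sketched as follows. The graph encoding (a dictionary of nodes and a list of edge pairs), the function name `project_on_object_type`, and the second object type Invoice in the example are assumptions made only for illustration.

```python
def project_on_object_type(nodes, edges, object_type):
    """Keep all non-object nodes and only the object nodes of one type.

    nodes maps a node name to (kind, type); type is None for non-object nodes.
    edges is a list of (source, target) pairs.
    """
    kept = {
        name: (kind, typ)
        for name, (kind, typ) in nodes.items()
        if kind != "object" or typ == object_type
    }
    # Flows that touch a removed object node are dropped as well.
    kept_edges = [(s, t) for s, t in edges if s in kept and t in kept]
    return kept, kept_edges


# Hypothetical fragment: Pay produces both a Claim and an Invoice object.
nodes = {
    "Pay": ("action", None),
    "Claim[settled]": ("object", "Claim"),
    "Invoice[sent]": ("object", "Invoice"),
}
edges = [("Pay", "Claim[settled]"), ("Pay", "Invoice[sent]")]

claim_nodes, claim_edges = project_on_object_type(nodes, edges, "Claim")
print(sorted(claim_nodes), claim_edges)
```

Running the projection once per object type yields one activity diagram per type, each of which can then be processed separately.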

Activity diagrams also support a pin-style modeling of object flows, in which input and output objects of each activity are modeled with input and output pins that are attached to the activity. The basic pin style is equivalent to the object node style [39]: each object node translates into an output pin for each activity that produces objects for the object node and into an input pin for each activity that consumes objects from the object node. However, the general pin-style notation is more expressive, since it allows the specification of alternative pin sets, called parameter sets [39]. The results in this paper carry over to the basic pin-style notation. We plan to study the general pin-style notation in future work.

Phase  Stage                                                                  Section
1      Preprocessing activity diagram                                         3.1
1      Filtering activity diagram                                             3.2
2      Constructing state hierarchy                                           4.1
2      Constructing hierarchical state machine without cross-synchronization  4.2
2      Constructing hierarchical state machine with cross-synchronization     5

Fig. 4: Outline of approach

As explained in the introduction, the approach has two phases: filtering an activity diagram and translating the filtered activity diagram into a state machine. Each phase consists of several stages, which are listed in Fig. 4, and are defined in the next sections.

3 Filtering activity diagrams

In the first phase of the synthesis approach, nodes not relevant for the object life cycle are filtered from the activity diagram. For the input activity diagram, we require that every action node has one incoming and one outgoing control flow, and every object node has at least one incoming or outgoing object flow. An object node can have multiple incoming or outgoing object flows. The resulting activity diagram contains only object nodes and relevant control nodes and is translated into a hierarchical state machine in the second phase, which is explained in the next section.

The filtering phase consists of two stages, which are explained in the sequel of this section. First, the activity diagram is preprocessed and transformed into a normal form. Second, several filtering rules are applied in arbitrary order to the activity diagram. The filtering stage stops if no filtering rule can be applied anymore to the activity diagram.

3.1 Preprocessing

In the preprocessing stage, we transform each activity diagram into a normal form by ensuring that each action node has one incoming and one outgoing edge. An activity diagram in normal form has no dangling (object) nodes: each node is on a directed path from the initial to a final node. Preprocessing consists of two steps that are performed iteratively in random order until the process model does not change anymore.

In the first step, we ensure that each object node has at least one incoming and one outgoing edge. If an object node o has no incoming edge, then we take the action node a to which o is input. If a is not unique since o is input to multiple action nodes, this step fails. Otherwise, a is unique and there is a unique control flow that enters a. We change the target of the control flow from a to o. A symmetrical rule is used for object nodes that have no outgoing edges. For instance, in Fig. 1 object node Claim[rejected] has no outgoing object flow. The preceding activity Reject is source of one outgoing control flow that targets a merge node. The source of this control flow is changed to Claim[rejected].

[Figure: fragment with action nodes Check policy, Check damage, Decide, Reject, Determine costs and Update client contribution, guards [ok] and [else], and Claim object nodes in states policy checked, damage checked and decided.]

Fig. 5: Part of activity diagram in Fig. 1 after preprocessing

[Figure: activity diagram containing only control nodes, guards [ok] and [else], and Claim object nodes in states received, confirmed, policy not checked, policy checked, damage not checked, damage checked, decided, rejected, assessed, settled and archived.]

Fig. 6: Activity diagram of insurance claim process (Fig. 1) after filtering

In the second step, we ensure that each action node gets exactly one incoming and one outgoing edge. By constraint, an action node has one incoming and one outgoing control flow. If an action node also has one or more incoming object flows, a join is inserted that synchronizes the control flow and the object flows. This synchronization denotes that all inputs of an activity need to be present before it can start, which is in line with the UML2 standard [39]. For instance, this step ensures that before both Reject and Determine costs in Fig. 5 joins are inserted that synchronize object flow and control flow. Similarly, if an action node has one or more outgoing object flows, a fork is inserted that splits the control flow and the object flows. For instance, in Fig. 5 after Decide an extra fork has been inserted.

Both steps may introduce control nodes that mix object flow and control flow, which is not allowed by the UML standard. However, this is harmless for the synthesis approach, since eventually object flows have to be translated into state machine control flows anyway.
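The two preprocessing steps can be sketched on a simple edge-list encoding. This is a minimal illustration under stated assumptions, not the formal definition: edges are assumed to be triples (source, target, flow) with flow either "control" or "object", the function names and the `join:` naming scheme are hypothetical, and only the incoming-edge variants of both steps are shown (the outgoing-edge variants are symmetrical).

```python
def redirect_control_flow(edges, kinds, obj):
    """Step 1: give an object node without incoming edge an incoming control flow."""
    if any(target == obj for _, target, _ in edges):
        return edges  # obj already has an incoming edge
    consumers = {
        t for s, t, flow in edges
        if s == obj and flow == "object" and kinds[t] == "action"
    }
    if len(consumers) != 1:
        raise ValueError(f"{obj} is not input to a unique action node")
    action = consumers.pop()
    # Retarget the control flow that entered the action to the object node.
    return [
        (s, obj, flow) if t == action and flow == "control" else (s, t, flow)
        for s, t, flow in edges
    ]


def insert_join(edges, action):
    """Step 2: synchronize all incoming flows of an action node in one join."""
    incoming = [e for e in edges if e[1] == action]
    if len(incoming) <= 1:
        return edges  # already exactly one incoming edge (or none)
    join = f"join:{action}"  # assumed naming scheme for the new control node
    others = [e for e in edges if e[1] != action]
    rewired = [(s, join, flow) for s, _, flow in incoming]
    return others + rewired + [(join, action, "control")]


kinds = {"Prev": "action", "A": "action", "O[s]": "object"}
step1 = redirect_control_flow(
    [("Prev", "A", "control"), ("O[s]", "A", "object")], kinds, "O[s]"
)
step2 = insert_join(
    [("Prev", "A", "control"), ("P", "O1", "object"), ("O1", "A", "object")], "A"
)
print(step1)
print(step2)
```

In the second example the join receives one control flow and one object flow, which illustrates how preprocessing introduces control nodes that mix both kinds of flow.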

3.2 Filtering

In the filtering stage, nodes that do not refer to the object life cycle are removed from the activity diagram. Fig. 6 shows the activity diagram that results from filtering the activity diagram in Fig. 1. Object nodes refer to states of the object life cycle and are therefore preserved. Action nodes do not explicitly refer to the object life cycle and are therefore removed. However, the final constructed state machine may be refined to include call actions from the activity diagram, as explained in Section 4.2. Furthermore, control nodes that do not influence object nodes are removed. For instance, the rightmost pair of fork/join in Fig. 1 is removed, since only one of the two parallel branches between the fork and join references object nodes, not both. As Fig. 6 illustrates, control nodes are only included if they influence object nodes.

[Figure: graphical definitions of filtering rules R1 to R10, with a legend for a generic non-object node (action node, bar or diamond) and guard labels such as [g1], [g2], [g1&g3] and [g1&g4].]

Fig. 7: Filtering rules for activity diagrams

The actual filtering is realized by applying ten different filtering rules, graphically specified in Fig. 7. Each filtering rule has a left-hand side, specifying its precondition, and a right-hand side, specifying its effect. A rule can only be applied to a subgraph of the activity diagram if the subgraph satisfies the left-hand side pattern, i.e., all elements should be present and all negative elements (denoted with a cross) should be absent. The rules are designed to be confluent, i.e., each pair of disjoint rules has mutually exclusive preconditions. Formal definitions of the filtering rules are listed in Appendix A.2.

Each filtering rule eliminates irrelevant nodes and edges from an activity diagram in normal form. Rule R1 removes non-object nodes with a single predecessor and a single successor. Object nodes are kept since they need to be preserved. Action nodes are removed, but the invoked activities that change the state of the object can be incorporated in the synthesized state machine, as explained in Section 4.2. Rules R2 and R3 merge decisions and merges, respectively, and rules R7 and R8 merge forks and joins, respectively. Rule R4 removes a self loop. Rule R5 removes a redundant edge between a fork and join. Rule R6 removes a final activity node that is part of a parallel branch. Rules R9 and R10 remove a superfluous decision or merge that is in parallel to an object node that specifies the same decision or merge behavior. In R2 and R9, guard conditions on removed edges are moved to remaining edges to preserve behavior.

[Figure: two example filterings, (a) and (b), each showing a model around object node Order[accepted] on the left and the filtered result on the right.]

Fig. 8: Example to illustrate different effect of filtering rules

Reduction rules R2 and R4 on decisions and merges resemble transformation rules on flow graphs [21] that test whether a flow graph is reducible [1], i.e., each loop has a single point of entry. However, flow graphs are sequential, so they do not contain parallelism, expressed with forks and joins in activity diagrams. Another difference is that reducible flow graphs are transformed to a single node, while the filter approach results in a graph that contains all object nodes and relevant control nodes (cf. Fig. 6).

The most interesting feature of the rules is that decisions and merges are treated differently from forks and joins, as illustrated by the example filterings shown in Fig. 8. The models on the right are the result of iteratively applying all filtering rules to the corresponding models on the left. Fig. 8(a) results by applying first rule R5 and next rule R1 two times. In Fig. 8(b), rules R2 and R3 are not applicable. We experimented with several other alternative definitions for rules R2 and R3, for instance by weakening the preconditions. However, such rules would merge the decision and merge in Fig. 8(b), which is undesirable as explained before.

Rules R7 and R8 together replace a previously defined rule [12] that merged forks or joins. That old rule generalized R7 and R8 by merging two neighboring forks or joins provided they are not contained in a loop. But the previous definition turns out to violate confluence in certain cases. Also, that rule merges a pair of a fork and a join that specify cross-synchronization, i.e., the fork and join are in parallel branches and the join is successor of the fork; for instance, the inner fork-join in Fig. 3 specifies such cross-synchronization. But then cross-synchronization is destroyed and the behavior of the activity diagram is changed, which is undesirable.

The different filtering rules are applied iteratively in arbitrary order. The filtering step stops if no filtering rule can be applied anymore to the activity diagram. After the filtering rules have been applied, the activity diagram contains no action nodes, but only object nodes and control nodes. The resulting activity diagram is input to the translation from activity diagrams to hierarchical state machines, detailed in the next section.
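The iterative application of rules until none applies can be sketched as a fixpoint loop. The sketch below is an illustration under simplifying assumptions, not the implementation based on GrGen: it encodes the diagram as a dictionary of node kinds plus a list of edge pairs, and it includes simplified stand-ins for only two of the ten rules (R1 and R4), whose exact preconditions are given in Appendix A.2. All function names are hypothetical.

```python
import random


def remove_self_loop(kinds, edges):
    """Simplified stand-in for R4: drop one edge whose source equals its target."""
    for edge in edges:
        if edge[0] == edge[1]:
            rest = list(edges)
            rest.remove(edge)
            return kinds, rest
    return None  # rule not applicable


def bypass_node(kinds, edges):
    """Simplified stand-in for R1: remove a non-object node that has a
    single predecessor and a single successor, connecting the two directly."""
    for node, kind in kinds.items():
        if kind == "object":
            continue  # object nodes are always preserved
        preds = [s for s, t in edges if t == node]
        succs = [t for s, t in edges if s == node]
        if len(preds) == 1 and len(succs) == 1 and preds[0] != node:
            rest = [e for e in edges if node not in e]
            remaining = {n: k for n, k in kinds.items() if n != node}
            return remaining, rest + [(preds[0], succs[0])]
    return None  # rule not applicable


def filter_until_stable(kinds, edges, rules, seed=None):
    """Apply the rules in arbitrary order until no rule is applicable."""
    rng = random.Random(seed)
    while True:
        order = list(rules)
        rng.shuffle(order)
        for rule in order:
            result = rule(kinds, edges)
            if result is not None:
                kinds, edges = result
                break
        else:
            return kinds, edges  # no rule applied: filtering stops


kinds = {"O1": "object", "m": "merge", "O2": "object"}
edges = [("O1", "m"), ("m", "m"), ("m", "O2")]
filtered = filter_until_stable(kinds, edges, [remove_self_loop, bypass_node], seed=1)
print(filtered)
```

In this toy input the merge node can only be bypassed after its self loop has been removed, so the loop needs two iterations regardless of the order in which the rules are tried. Both stand-in rules strictly reduce the number of edges, which is why the loop terminates.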

3.3 Discussion

The filtering rules are heuristics that help to obtain an activity diagram only containing object nodes plus relevant control nodes. Alternatively, filtering could be applied after constructing a state machine directly from an input activity diagram. However, such a procedure can yield state machines whose control flow is unnecessarily complex, if the extended translation of Section 5 needs to be applied rather than the base translation of Section 4.

To illustrate this point, consider the activity diagram in Fig. 3, which contains cross-synchronization. By first filtering the activity diagram, the cross-synchronization construct is removed, and an activity diagram containing a sequence of three object nodes is obtained, which maps to a state machine containing a sequence of three states using the translation of Section 4. But if the activity diagram is translated to a state machine before filtering, the extended translation of Section 5 is needed due to the cross-synchronization. The resulting state machine has equivalent behavior, but does not preserve the control flow of the activity diagram. In contrast, the state machine obtained using the proposed approach does preserve the control flow of the filtered input activity diagram.

4 Translation to state machines

Output of the filtering phase is an activity diagram containing only object nodes and control nodes. In the next phase, a hierarchical state machine is synthesized from the activity diagram. The basis for the synthesis is an existing, formally defined translation from Petri nets to state machines [10]. The syntax of Petri nets closely resembles that of activity diagrams. The key difficulty in synthesizing state machines is the construction of hierarchical (AND and OR) states, which have no counterpart in activity diagram syntax. The constructed state machine preserves the control flow of the input activity diagram.

We first explain how the state hierarchy, consisting of AND and OR states, is built from a filtered activity diagram. Next, we explain how the state hierarchy and the filtered activity diagram are used to construct the complete state machine.

4.1 Constructing the state hierarchy

The state hierarchy is a tree of states. Leaves of the tree are BASIC states while internal nodes are AND and OR states. The tree is visualized by nesting child nodes inside parent nodes. For instance, in Fig. 2 BASIC node policy checked is a child of OR node O3, which is in turn a child of AND node check.
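Such an AND/OR tree can be represented with a small recursive data structure. The sketch below is only an illustration: the class `State`, the helper `path_to`, and the choice to validate only that BASIC states are leaves are assumptions, and the example builds just the fragment named in the text, omitting all sibling states.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    name: str
    kind: str  # "BASIC", "AND" or "OR"
    children: list = field(default_factory=list)

    def __post_init__(self):
        if self.kind not in ("BASIC", "AND", "OR"):
            raise ValueError(f"unknown state kind: {self.kind}")
        if self.kind == "BASIC" and self.children:
            raise ValueError("BASIC states are leaves and cannot have children")


def path_to(root, name):
    """Return the state names from root down to the named state, or []."""
    if root.name == name:
        return [root.name]
    for child in root.children:
        sub = path_to(child, name)
        if sub:
            return [root.name] + sub
    return []


# Fragment of the hierarchy described in the text (siblings omitted).
policy_checked = State("policy checked", "BASIC")
o3 = State("O3", "OR", [policy_checked])
check = State("check", "AND", [o3])

print(path_to(check, "policy checked"))
```

The returned path lists the nesting from the outermost composite state to the BASIC state, which corresponds to the visual nesting of child nodes inside parent nodes.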

The AND/OR tree is built in a stepwise fashion, by applying transformation rules to a structure which is a hybrid of activity diagrams and state machines. The structure contains states and edges resembling activity diagrams, but each state that is source or target of an edge is the root of an AND/OR tree; the states inside this tree are not incident to any edge. Two transformation rules (T1 and T2) are used: one for merging OR states (T1) and one for creating AND states (T2). Each transformation rule reduces edges from the structure but adds state hierarchy. The procedure stops if the structure contains no edges and one state that contains all other states. That state is the root of the state hierarchy. The procedure may fail, in which case the activity diagram cannot be translated into a state machine.

We now elaborate the initialization step, in which the initial hybrid structure is created, and the two rules T1 and T2. The next subsection explains how the rules are used to translate an activity diagram into a hierarchical state machine. To simplify the exposition, we do not show the formal specifications of the rules but we apply the rules to the claim processing example. Appendix A.3 contains formal definitions of the initialization step and the two transformation rules.


Initialization. The filtered activity diagram is copied into a new structure that extends activity diagrams with AND/OR trees. Only roots of the AND/OR trees can be used in the activity diagram flow relation.

The new expanded activity diagram is created as follows. The nodes of the expanded activity diagram are the nodes of the filtered activity diagram plus a set of OR states. For each object node and for each control node, except forks and joins, an OR state is created in the new structure. The OR state becomes parent of the node for which it is created.

For each edge in the filtered activity diagram, an edge in the expanded activity diagram is created. If the edge is incident to a node that has an OR parent, that node is replaced with its OR parent in the edge, i.e., a source or target of an edge that is not a fork or join is replaced with its OR parent. Fig. 9 shows the initialization of the filtered activity diagram in Fig. 6.
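The initialization step can be sketched in Python as follows. This is an illustrative sketch only: the function name `initialize`, the encoding of nodes as a name-to-kind dictionary, the generated names O1, O2, ... and the small example fragment are all assumptions; the formal definition is the one in Appendix A.3.

```python
FORK_JOIN = {"fork", "join"}


def initialize(nodes, edges):
    """Build the expanded activity diagram.

    nodes: dict mapping node name -> kind ("object", "decision", "fork", ...)
    edges: list of (source, target) pairs of the filtered activity diagram
    Returns (or_children, expanded_edges).
    """
    or_children = {}  # OR state -> list of child nodes
    or_parent = {}    # node -> its OR parent
    for name, kind in nodes.items():
        if kind not in FORK_JOIN:
            or_state = f"O{len(or_children) + 1}"
            or_children[or_state] = [name]
            or_parent[name] = or_state
    # Every endpoint that has an OR parent is replaced by that parent;
    # forks and joins have no OR parent and stay as they are.
    expanded_edges = [(or_parent.get(s, s), or_parent.get(t, t))
                      for s, t in edges]
    return or_children, expanded_edges


# Hypothetical fragment (not the complete diagram of Fig. 6).
nodes = {
    "received": "object",
    "f1": "fork",
    "damage not checked": "object",
    "policy not checked": "object",
}
edges = [("received", "f1"),
         ("f1", "damage not checked"),
         ("f1", "policy not checked")]

or_children, expanded_edges = initialize(nodes, edges)
print(expanded_edges)  # [('O1', 'f1'), ('f1', 'O2'), ('f1', 'O3')]
```

After this step only OR states, forks and joins are incident to edges, which is the invariant the transformation rules rely on.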

Fig. 9: Initialization of synthesis procedure for filtered activity diagram in Fig. 6

Fig. 10: Example applications of transformation rules for example model. (a) Example application of rule T1. (b) Example application of rule T2.

T1) Merging OR states. If an edge connects two OR states and both states are not predecessor of a join or successor of a fork, then the two OR states can be merged into a new OR state which becomes parent of all children of the two merged OR states. The new OR state replaces the old OR states and the edge connecting the two old OR states is removed from the structure. Note that this does not imply that the edge is not present in the final state machine, since state machine edges are defined separately (see Section 4.2). Fig. 10a shows how the rule is applied in the synthesis of the state hierarchy for the structure in Fig. 9: OR nodes O6 and O7 are merged. New node O67 specifies a tree, so the two BASIC states it contains are not connected by an edge.
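The merge performed by rule T1 can be sketched as below. The sketch covers only the merge itself and leaves the applicability check on forks and joins to the caller. The function name, the naming scheme for the merged state, the neighbours X and Y, and the assignment of decided and confirmed to O6 and O7 are assumptions made for illustration.

```python
def merge_or_states(children, edges, a, b):
    """Merge OR states a and b that are connected by the edge (a, b).

    children: dict mapping each root state -> list of its children
    edges:    list of (source, target) pairs of the hybrid structure
    The caller is assumed to have checked the precondition of rule T1.
    """
    if (a, b) not in edges:
        raise ValueError("a and b must be connected by an edge")
    merged = a + b.lstrip("O")  # e.g. "O6" and "O7" give "O67"
    new_children = {k: v for k, v in children.items() if k not in (a, b)}
    new_children[merged] = children[a] + children[b]
    new_edges = []
    for s, t in edges:
        if (s, t) == (a, b):
            continue  # the connecting edge is removed from the structure
        # Any other edge is only re-pointed to the merged state.
        new_edges.append((merged if s in (a, b) else s,
                          merged if t in (a, b) else t))
    return new_children, new_edges


# Hypothetical neighbourhood of O6 and O7; X and Y stand for whatever
# nodes precede O6 and follow O7 in the actual structure.
children = {"O6": ["decided"], "O7": ["confirmed"]}
edges = [("X", "O6"), ("O6", "O7"), ("O7", "Y")]

children, edges = merge_or_states(children, edges, "O6", "O7")
print(children)  # {'O67': ['decided', 'confirmed']}
print(edges)     # [('X', 'O67'), ('O67', 'Y')]
```

The merged state keeps both BASIC states as children, but no edge between them remains in the structure, in line with the remark that state machine edges are derived separately.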

T2) Creating AND states. Each set of OR states that is input (output) to a join (fork) translates into an AND state, which becomes parent of the OR states. Each pair of OR states in the set must have the same predecessors and successors. The set of OR states is replaced with a new OR state that becomes input (output) to the join (fork). The OR state becomes parent of the created AND state. Fig. 10b shows an example application of this rule. OR nodes O45 and O67 have the same predecessors and successors and can be grouped under parent AND node A1, which is new. Note that O2 cannot be grouped since it has different successors than O45 and O67. Furthermore, rule T2 is not applicable to the two activity diagram fragments shown in Fig. 10a, since for instance O4 and O6/O67 have different successors.
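A rough Python sketch of rule T2 is given below. The function names, the name O4567 for the new OR state, the placeholder BASIC states s2 to s7 and the edge set are assumptions; the edge set is only loosely modelled on Fig. 10b and is not a transcription of it.

```python
def neighbours(state, edges):
    """Return (predecessors, successors) of a state as two sets."""
    preds = {s for s, t in edges if t == state}
    succs = {t for s, t in edges if s == state}
    return preds, succs


def can_group(group, edges, forks, joins):
    """All states in the group share predecessors and successors, and
    the group is output of a fork or input to a join."""
    preds, succs = neighbours(group[0], edges)
    same = all(neighbours(g, edges) == (preds, succs) for g in group)
    return len(group) > 1 and same and bool(preds & forks or succs & joins)


def apply_t2(children, edges, group, and_name, or_name):
    """Put the group under a new AND state, itself under a new OR state."""
    new_children = dict(children)
    new_children[and_name] = list(group)
    new_children[or_name] = [and_name]
    new_edges = []
    for s, t in edges:
        edge = (or_name if s in group else s, or_name if t in group else t)
        if edge not in new_edges:  # parallel edges collapse into one
            new_edges.append(edge)
    return new_children, new_edges


children = {"O2": ["s2"], "O45": ["s4", "s5"], "O67": ["s6", "s7"]}
edges = [("fork", "O2"), ("fork", "O45"), ("fork", "O67"),
         ("O2", "O3"), ("O45", "join"), ("O67", "join")]
forks, joins = {"fork"}, {"join"}

print(can_group(["O45", "O67"], edges, forks, joins))  # True
print(can_group(["O2", "O45"], edges, forks, joins))   # False

children, edges = apply_t2(children, edges, ["O45", "O67"], "A1", "O4567")
print(edges)
# [('fork', 'O2'), ('fork', 'O4567'), ('O2', 'O3'), ('O4567', 'join')]
```

In the sketch, O2 is rejected for the same reason as in the text: its successor differs from that of O45 and O67.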

4.2 Constructing the state machine

The previous step has resulted in an AND/OR tree of states. We now explain how a hierarchical state machine can be constructed from this tree plus the input activity diagram. States of the state machine are the states in the AND/OR tree plus additional fork and join pseudo states.

Each node in the filtered activity diagram has a unique counterpart in the state machine. If two nodes in the activity diagram are connected by an edge, in the state machine the counterparts of these nodes are also connected by an edge. For instance, object nodes Claim[policy not checked] and Claim[policy checked] are connected by an edge in the activity diagram in Fig. 6, so in the state machine BASIC states policy not checked and policy checked are connected by an edge; see Fig. 2.

Since composite state machine states have no counterpart in activity diagram syntax, edges in the generated state machines only connect BASIC states. We have defined postprocessing rules that rewrite the state machine edges into edges between composite states by eliminating interlevel edges and fork and join pseudo nodes. Applying the translation to the activity diagram of Fig. 6 results in the state machine shown in Fig. 2. The initial and final states inside the AND state process are due to postprocessing. The names of composite states have been manually defined.

The synthesized state machine is a skeleton that can be further refined into an executable state machine as follows. Each non-initial state s in the state machine is reached due to completion of a previous activity A. Therefore, each incoming transition of s has trigger cpl(A). For instance, state policy checked is entered if activity check policy completes. The predecessor states of s are responsible for invoking the activity A, so policy not checked is responsible for invoking check policy. We model the invocation by decomposing each predecessor state into two substates. Upon entering the initial state init of the predecessor state, the activity is invoked, and next a state busy is entered in which the system waits for the completion of the invoked activity. To illustrate the construction, Fig. 11 shows the refined states and transitions within composite state O3 in Fig. 2.
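The refinement just described can be sketched as follows. The tuple layout (source, trigger, action, target), the dotted substate names and the assumption that each predecessor state invokes exactly one activity are simplifications made for this sketch.

```python
def refine(transitions):
    """Refine skeleton transitions into init/busy substates.

    transitions: list of (source_state, activity, target_state), where
    the activity is the one whose completion leads to the target state.
    Returns a list of (source, trigger, action, target) tuples.
    """
    refined = []
    for source, activity, target in transitions:
        init = f"{source}.init"
        busy = f"{source}.busy {activity}"
        # Entering init invokes the activity and moves on to busy.
        refined.append((init, "", f"invoke {activity}", busy))
        # Completion of the activity triggers entry of the target state.
        refined.append((busy, f"cpl({activity})", "", target))
    return refined


for row in refine([("policy not checked", "check policy", "policy checked")]):
    print(row)
# ('policy not checked.init', '', 'invoke check policy',
#  'policy not checked.busy check policy')
# ('policy not checked.busy check policy', 'cpl(check policy)', '',
#  'policy checked')
```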

Fig. 11: Refined state O3 of state machine in Fig. 2

The synthesis procedure constructs hierarchical state machines whose behavior is equivalent to the filtered input activity diagrams (see Appendix A.5). The state machine translation preserves the syntactic structure of the filtered input activity diagram: each BASIC node corresponds to an object node. Since state machine control flow cannot express cross-synchronization, as explained in Section 2, the procedure fails for activity diagrams that contain cross-synchronization. The next section extends the translation to deal with those activity diagrams.

5 Cross-synchronization

In this section, we extend the synthesis procedure by defining a new transformation rule that removes cross-synchronization from a filtered activity diagram. Since the behavior of the resulting state machine differs from that of the original activity diagram, we define how the resulting state machine can be refactored into a state machine whose behavior is equivalent to the filtered activity diagram. The refactored state machine uses internal events and guards, which compensate for the change in behavior due to applying the new transformation rule.

5.1 Removing cross-synchronization

The new transformation rule T3 takes an OR state that has only a fork as predecessor and only a join as successor. Rule T3 removes all incoming and outgoing edges of the OR state, so after applying the transformation rule, no edge is incident to the OR state.
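A minimal sketch of rule T3 is shown below. It reads the precondition as "every predecessor is a fork and every successor is a join, with at least one of each", which is an interpretation; the function names and the example edges are likewise assumptions.

```python
def t3_applicable(state, edges, forks, joins):
    """True if the state has only forks as predecessors and only joins
    as successors (and at least one of each)."""
    preds = {s for s, t in edges if t == state}
    succs = {t for s, t in edges if s == state}
    return bool(preds) and bool(succs) and preds <= forks and succs <= joins


def apply_t3(state, edges):
    """Remove every edge that is incident to the given OR state."""
    return [(s, t) for s, t in edges if state not in (s, t)]


# Hypothetical structure: Ox sits between a fork of one branch and a
# join of another branch, i.e. it cross-synchronizes the two branches.
forks, joins = {"f1"}, {"j1"}
edges = [("O1", "f1"), ("f1", "O2"), ("f1", "Ox"),
         ("O3", "j1"), ("Ox", "j1"), ("j1", "O4")]

print(t3_applicable("Ox", edges, forks, joins))  # True
print(t3_applicable("O2", edges, forks, joins))  # False
print(apply_t3("Ox", edges))
# [('O1', 'f1'), ('f1', 'O2'), ('O3', 'j1'), ('j1', 'O4')]
```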

Rule T3 is only applied if transformation rules T1 and T2 are not applicable, so that T3 does not prevent the creation of OR and AND states for activity diagram parts that do not contain cross-synchronization. For instance, applying T3 to O35 in Fig. 10b would prevent the creation of AND state A1, which is not desirable.

Rule T2 is slightly extended: if the activity diagram only contains OR states and no edges, T2 will create an AND state of which the OR states are children. This ensures that the OR state to which T3 is applied is put in parallel with the other states.

Both rules are formally defined in Appendix A.3.

5.2 Extended mapping to state machines

Applying the three transformation rules results in a state hierarchy, for which the procedure outlined in Section 4.1 constructs a state machine. However, if T3 is applied, the constructed state machine is not consistent with the input activity diagram, since they have different behavior.

Fig. 12: Activity diagram of production process

Fig. 13: Inconsistent state machine constructed for Fig. 12 by applying rules T1, T2 and T3

We illustrate this with the activity diagram in Fig. 12, which is based on a workflow scenario of the Workflow Management Coalition (http://wfmc.org). Object node Order[costs calculated] connects two parallel branches. The procedure constructs for the activity diagram in Fig. 12 the state machine in Fig. 13, in which BASIC state costs calculated is put in parallel with the BASIC states for the other object nodes. While the state machine preserves the syntactic control flow of the activity diagram in Fig. 12, the actual behavior is quite different due to state machine transitions t1 and t2. For instance, if t1 is taken then according to state machine semantics [20,39] not only BASIC state unplanned is left, but the complete AND state is left and its descendant states. Transition t1 re-enters the AND node but does not specify any target state for O3. Therefore default state unchecked is entered as a result of taking t1. Similarly, if t2 is taken, the AND state is completely left, and default states unplanned and costs unknown are re-entered. In all cases, the state machine behavior is not consistent with that of the activity diagram in Fig. 12, which does not specify that object nodes are re-entered.

Fig. 14: Refactored state machine for state machine in Fig. 13

This difference can be resolved by refactoring the constructed state machine into a state machine with events and guards. Appendix A.4 specifies the refactoring rules formally. The basic idea is to split and modify the state machine transitions that cause the inconsistent behavior. First, for every OR state O that has been refactored using T3, a BASIC state init_O is created that becomes the default state of O. For every state machine transition t, visualized as a bar, that leaves a BASIC state outside O and enters a BASIC state b inside O, remove the edge from t to b, and let the incoming edge of t generate event e_t. Next, add a state machine transition from init_O to b with trigger event e_t. Thus, if modified t is taken in the refactored state machine, the new transition is taken and b is entered.

Next, for every state machine transition t, visualized as a bar, that leaves a BASIC state b inside O and enters a BASIC state outside O, remove the edge from b to t, and add guard condition [in(b)] to t. Let t generate event e_t. Add an edge from b to init_O with trigger event e_t. Thus, modified t is only taken in the refactored state machine if b is active, and when it is taken b is left.
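The two refactoring steps can be sketched in Python as follows. This is only an illustration of the idea, not the formal rules of Appendix A.4: the `Transition` class, the generated event names `e_t1` and `e_t2`, the names of the added transitions, and the encoding of transitions t1 and t2 of the production example are assumptions based on the description in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    sources: set
    targets: set
    trigger: str = ""
    guards: list = field(default_factory=list)
    events: list = field(default_factory=list)  # events generated when taken


def refactor(transitions, inside, init="init"):
    """inside: BASIC states of the OR state O to which T3 was applied."""
    result = []
    for t in transitions:
        src_in, tgt_in = t.sources & inside, t.targets & inside
        enters = tgt_in and (t.sources - inside)  # from outside O into O
        leaves = src_in and (t.targets - inside)  # from inside O out of O
        if not (enters or leaves):
            result.append(t)  # transitions not crossing O stay unchanged
            continue
        event = f"e_{t.name}"
        new_t = Transition(t.name, set(t.sources), set(t.targets),
                           t.trigger, list(t.guards), t.events + [event])
        if enters:
            for b in sorted(tgt_in):
                new_t.targets.discard(b)
                result.append(Transition(f"{t.name}_enter_{b}",
                                         {init}, {b}, trigger=event))
        if leaves:
            for b in sorted(src_in):
                new_t.sources.discard(b)
                new_t.guards.append(f"in({b})")
                result.append(Transition(f"{t.name}_leave_{b}",
                                         {b}, {init}, trigger=event))
        result.append(new_t)
    return result


inside = {"costs unknown", "costs calculated"}
refactored = refactor([
    Transition("t1", {"unplanned"}, {"planned", "costs unknown"}),
    Transition("t2", {"checked", "costs calculated"}, {"billed"}),
    Transition("t3", {"costs unknown"}, {"costs calculated"}),
], inside)
for t in refactored:
    print(t.name, sorted(t.sources), sorted(t.targets),
          t.trigger, t.guards, t.events)
```

In this sketch the events `e_t1` and `e_t2` play the role of the events generated by the modified transitions, and the guard `in(costs calculated)` on t2 expresses that t2 can only be taken while that state is active.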

To illustrate the definition, Fig. 14 shows the refactored state machine for Fig. 13. First, we introduce an auxiliary BASIC state init for the OR state O4. Auxiliary node init is the default node of O2, so init belongs to the initial state configuration of the state machine.

The transition from init to costs unknown is taken when the transition from unplanned to planned is taken and generates event e. The transition from checked to billed can only be taken if state costs calculated is active. If the transition is taken, event f is generated that triggers the transition leaving costs calculated. The behavior of the refactored state machine is thus consistent with the activity diagram in Fig. 12.

5.3 Discussion

For some corner cases, the extended translation outputs state machines that are inconsistent with the input filtered activity diagram, namely if an object node that is processed in T3 is active multiple times simultaneously. In the state machine translation, the corresponding state node can be active at most once by definition of state machines. Consequently, the state machine behaves differently from the activity diagram.

Fig. 15: Filtered activity diagram with unsafe object node name[S4]

Fig. 16: Inconsistent refactored state machine generated for Fig. 15

For instance, the activity diagram in Fig. 15 allows that name[S4] is active twice at the same time, if states name[S1] and name[S5] have been left and name[S3] and name[S7] have not yet been entered. Consequently, if the fork entering name[S3] is taken, name[S4] still remains active, so the fork entering name[S7] can be taken next, and in the end both name[S4] and name[S7] are active. The synthesis approach outlined in Section 4 fails since no AND states can be created that contain the OR parent of :O[S4]. For the state machine, BASIC states S2, S4, and S6 can become active by leaving S1 and S5, triggering O2 to move to S4. If next S3 is entered and g is generated then S4 is left. However, this means that the transition from S6 to S7 gets blocked, so the state machine cannot reach S7, which is not consistent with the activity diagram.

Another important feature of the extended synthesis approach is that rule T3 is not confluent, since T3 can be applied to different parts of the same activity diagram. For instance, for the example in Fig. 15, rule T3 can be applied to the OR parents of S2 and S6, rather than to the OR parent of S4. A different state machine is then generated. It seems useful to let end users guide the extended synthesis approach, allowing them to choose during the execution to which particular states rule T3 is applied. That way, they can control the outcome of the approach.

Finally, in some cases the extended translation fails, for instance on the filtered activity diagram in Fig. 17(a). The fork starts two parallel branches with states S2 and S3, but if the fork is not chosen, then the branch with state S3 is active while the branch with state S2 is inactive. An AND node that contains both S2 and S3 does not have similar behavior, since all child nodes of an AND node are active simultaneously. The translation therefore fails: repeatedly applying T1 results in an expanded activity diagram in which S1 and S3 share the same OR parent. This OR parent is both input and output to the fork, which prevents the application of rules T2 and T3. Generalizing, the extended translation fails for activity diagrams in which some parallel branches are not activated simultaneously, such as in Fig. 17. A modeler can repair these activity diagrams by adding control flow such that all branches are activated simultaneously (see Fig. 17(b)). Another repair option is to duplicate object nodes that are used in multiple, exclusive branches (see Fig. 17(c)). The feedback provided by the reduction procedure can provide input for the repair.

Appendix A.5 formally characterizes the subclass of filtered activity diagrams for which the translation constructs a consistent state machine. The activity diagrams in Fig. 15 and Fig. 17(a) are not in this subclass. The appendix also contains a proof that the translation constructs for an activity diagram in this subclass a state machine whose behavior corresponds to the behavior of the filtered input activity diagram, so the filtered activity diagram and the state machine are consistent.

6 Validation

To evaluate the feasibility of the approach, we have realized a prototype tool that implements the approach and we have tested the tool on several synthetic examples and on process models that are either based on real-life scenarios or taken directly from the literature. In this section, we explain the architecture of the tool and discuss the results of applying the approach to the case study process models. The developed prototype plus example models are available in an online virtual machine at http://is.ieis.tue.nl/staff/pvgorp/research/#ad2sc. Instructive screencasts are also provided there.

6.1 Architecture

We have decided to implement the rules in Sections 3 and 4 as graph transformation rules using the general purpose graph transformation engine GrGen [14]. This engine provides a scalable implementation for state-of-the-art matching and rewriting constructs and provides especially useful support for visual debugging. Fig. 18 shows the overall architecture of the resulting implementation. The figure’s left-hand side shows that the tool reads activity diagrams expressed in XMI syntax according to the UML 2.3 standard. Such XMI code is generated by mainstream tools like MagicDraw or the Eclipse UML 2 plugin. The rectangle at the top of the figure represents the GrGen platform [14].

Fig. 17: Fail case for extended translation and possible repairs. (a) Filtered activity diagram on which the extended translation fails. (b) Repaired filtered activity diagram. (c) Another repaired filtered activity diagram.

Fig. 18: Implementation architecture. Components shown: GrGen Graph Rewriting Tool with optional Visualization; Filtering rules (Section 3); Synthesis rules (Section 4); UML Activity Diagram XMI; UML Statechart XMI; Eclipse based editors (Eugenia/Kieler and Official UML 2 Tools).

The right-hand and bottom side of Fig. 18 show that our transformation implementation produces output models that can be consumed by many interesting tools, including Eclipse-based editors. The implementation produces hierarchical state machines expressed in UML XMI that conforms to the official UML 2.3 standard. Our toolchain generates two types of XMI files. First of all, it generates XMI based on the OMG standardized metamodel for UML 2. This type of XMI output enables users to open the generated state machines in commercial UML editors such as MagicDraw [31] or Enterprise Architect [36]. Secondly, the toolchain generates XMI based on a minimalistic metamodel for state machines. For that type of XMI, the toolchain provides an academic Eclipse EMF/GMF based editor that provides advanced automatic layouting features for state machines based on KIELER [16].

6.2 Example Rewrite Rule in GrGen syntax

Besides reusing transformation code related to [10] and [41], the code supporting this paper consists of (1) graph transformation rules that implement the filtering process from Section 3 and (2) integration code supporting the overall transformation chain. The transformation challenges related to (2) are discussed in a separate paper [40] while an example fragment of (1) is discussed below.

Fig. 19 shows a fragment in the textual graph transformation rule syntax of GrGen. Lines 1 to 10 implement rule R1 from Section 3. The rule matches the example from Fig. 1 as follows: on line 2, node variable n matches the Update client contribution node while the src and trg nodes match the neighboring fork and join. The two anonymous edge variables on line 2 (i.e., the two occurrences of “-:ActivityEdge->”) match the arcs between these nodes. By default, GrGen uses isomorphic matching, which means that src and trg must match different nodes of the activity diagram. But for this rule, we do allow that src and trg refer to the same node, i.e., we use homomorphic matching (line 3). The patterns on lines 4 and 5 express that n should have exactly one input and output edge. If for instance for n node Receive is chosen, these negative patterns prevent the rule from matching, since Receive has four outgoing edges. Lines 7 and 8 from Fig. 19 realize the side-effects (i.e., the right-hand side) of rule R1: line 7 removes n and its attached edges while line 8 establishes a new edge between the original source and target elements src and trg. The meaning of line 8 is based on the convention that elements that are declared in the right-hand side (in this case an anonymous edge of type ControlFlow) should be created when the left-hand side pattern is matched in the host graph.

6.3 Case studies

To further evaluate the feasibility of the approach, we applied the prototype to several process models (Table 1) using a Windows 7 desktop machine with Intel Core 2 Duo processor of 2.93 GHz and 4GB RAM.

The considered process models are divided into two classes. The first class contains three real-life industrial processes that we modeled ourselves: ordering and delivery of bikes (P1), handling of dermatology patients (P2), and requesting a construction permit (P3). As with the running example, the control flow of P1 and P2 is block-structured (each fork that starts parallel branches has one matching join that synchronizes the branches) but by preprocessing the object flows, extra forks and joins are introduced that destroy the block-structure (cf. Fig. 5). Process P3 is not block-structured and contains cross-synchronization that is not removed by the filtering rules; hence T3 is applied.


     1  rule oneinputoutput {
     2    src:ActivityNode -:ActivityEdge-> n:ActivityNode\(ObjectNode+CentralBufferNode) -:ActivityEdge-> trg:ActivityNode;
     3    hom(src, trg);
     4    negative { :TwoOrMoreInputs(n); }
     5    negative { :TwoOrMoreOutputs(n); }
     6    modify {
     7      delete(n);
     8      src -:ControlFlow-> trg;
     9    }
    10  }
    11  pattern TwoOrMoreInputs(n:ActivityNode) {
    12    multiple { -:ActivityEdge-> n; }
    13  }
    14  pattern TwoOrMoreOutputs(n:ActivityNode) {
    15    multiple { n -:ActivityEdge->; }
    16  }

Fig. 19: Code fragment of the GrGen based implementation.

Table 1: Characteristics of the cases

Case study                  #action  #object  #forks/  #decisions/  #control  #object  preproc.  filter  translation  T3
                            nodes    nodes    joins    merges       flows     flows    (ms)      (ms)    (ms)         applied?
P1 Bikeshop                 15       12       7        4            33        15       78        47      156          no
P2 Dermatology              17       5        4        10           39        6        94        78      187          no
P3 Construction permit      17       12       2        2            27        22       94        93      312          yes
P4 CAD CAM [25]             15       4        3        10           38        6        93        94      218          no
P5 Create catalogue [37]    10       6        3        4            17        13       78        62      220          no
P6 Media store [15]         10       3        3        0            16        6        94        78      187          no
P7 Tax collection [29]      13       6        2        1            19        12       109       63      234          no

The second class contains activity diagrams with stateful object nodes taken from the literature (P4–P7). To ensure that the activity diagrams have sufficient complexity, we only selected activity diagrams that contain parallelism in their control flow. If an activity references multiple stateful objects, we selected the stateful object with the most states. Processes P5 and P7 model a protocol between two parties in which object nodes denote messages exchanged among the parties. For these activity diagrams, we modeled the protocol as a stateful object and considered the different messages as states of the protocol. Processes P4 and P5 contained minor execution errors that we have repaired, as we explain below. Process P4 contains cross-synchronization, but the filtering rules remove the part containing cross-synchronization, hence T3 is not applied.

Table 1 shows that the prototype constructs for each activity diagram a hierarchical state machine in less than a second. The translation to state machines is the most time-consuming step, especially if T3 is applied. We have shown before that the synthesis procedure defined in Section 4 runs in polynomial time [10] and that the GrGen-implementation scales well for large input models [41].
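The two timing claims can be checked directly against the figures in Table 1. The short Python check below assumes that the total construction time of a case is the sum of its preprocessing, filtering and translation times; the variable names are illustrative.

```python
# (preprocessing, filtering, translation) times in ms, from Table 1.
TIMINGS_MS = {
    "P1": (78, 47, 156),
    "P2": (94, 78, 187),
    "P3": (94, 93, 312),
    "P4": (93, 94, 218),
    "P5": (78, 62, 220),
    "P6": (94, 78, 187),
    "P7": (109, 63, 234),
}

totals = {case: sum(steps) for case, steps in TIMINGS_MS.items()}
print(totals)
# {'P1': 281, 'P2': 359, 'P3': 499, 'P4': 405, 'P5': 360, 'P6': 359, 'P7': 406}

# Every case stays below one second in total.
print(all(total < 1000 for total in totals.values()))  # True

# Translation is the slowest of the three steps in every case.
print(all(steps[2] == max(steps) for steps in TIMINGS_MS.values()))  # True
```

The largest total, 499 ms, belongs to P3, the only case in which T3 is applied.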


Applying the prototype to the examples revealed that process modeling with object flows can be intricate due to the peculiar semantics of activity and object nodes. According to UML [39], all incoming edges of an activity need to be active in order for the activity to start. If the activity is part of a parallel branch that is embedded in a loop, this property can easily lead to errors such as in P4 and P5. Adding an object flow from for instance Claim[settled] to Notify client also results in an execution error: then Claim[settled] is required as mandatory input whereas Claim[settled] is not entered if the claim is accepted. However, the approach can still construct a state machine, which contains the same error.

7 Related work

Our work is most closely related to research on relating object life cycles and business process models (i) and to approaches for synthesizing hierarchical state machines (ii). We also briefly discuss other approaches that translate activity diagrams into state machines.

Object life cycles and business process models. Only a few approaches [4,23,24,32] consider the relation between business process models that reference business objects and object life cycles of these objects. All these approaches consider flat finite state machines whereas we consider hierarchical state machines that can have parallelism. Moreover, some approaches [24,32] generate a process model from a set of object life cycles, where each object life cycle is specified by a flat finite state machine. In contrast, we study the reverse direction: how can an object life cycle be generated from a process model?

Cabanillas et al. [4] define an algorithm to derive object life cycles from business process models with object flows. The algorithm only deals with sequential process models and generates sequential object life cycles. In contrast, the approach developed in this paper allows parallelism in both process model and object life cycles, which complicates the definitions of the filtering and synthesis of the approach. In addition, they do not consider filtering rules, which are essential to enable the discovery of implicitly specified object life cycles from business process models.

Kumaran et al. [23] give algorithms for deriving sequential, flat state machines from a business process model with object flow. The process model does not specify any explicit object states; the derived state machines contain the activities of the process model. Consequently, the life cycles in the approach of Kumaran et al. offer an alternative, distributed view on the process model. In this paper, in contrast, process models do explicitly specify object states but do not contain any explicit life cycles. The synthesis approach discovers these life cycles, which requires filtering rules that are not needed by Kumaran et al.

State machine synthesis. There are several works that study how to generate a state machine from a set of scenarios specified as either MSC or LSC, e.g. [18,42,43]. The constructed state machine satisfies all scenarios, i.e., each scenario is playable with the state machine. A major difference between scenarios and activity diagrams is that scenarios reference state machine events but not states, whereas activity diagrams reference state machine states (object nodes) but not events.


There are three important differences with our work. First, these approaches translate the complete control flow of the scenarios to a state machine. In our approach, we translate object flows to a state machine. Since an activity diagram is a mixture of object flow and control flow, the object flows need to be filtered from the activity diagram. This step is not present in the scenario-based synthesis approaches. The filtering phase is a key element of our approach to discover a hidden object life cycle from a process model.

Second, only a limited set of state machines can be synthesized from scenarios, compared to the state machines constructible with our approach. In most approaches [18,43], each scenario-based state machine consists of communicating sequential state machines, so there is one top-level AND state that contains sequential finite state machines that execute in parallel. In our work, by contrast, constructed state machines can have parallelism at arbitrary levels of nesting, not just the top level.

Third, input scenarios can be inconsistent [18]. In that case, no state machine can be constructed and the scenarios need to be repaired. The input for our approach is a single activity diagram, so the consistency issue is not relevant. The synthesis approach allows that a filtered activity diagram contains a deadlock, in which case the deadlock is preserved in the generated state machine.

Whittle and Jayaraman [42] study synthesis of hierarchical state machines from UML 2.0 interaction diagrams, which contain activity diagram constructs to specify complex parallel behavior. However, the interaction diagrams are required to be (block-)structured [26]: each fork matches with a join, pairs of matching nodes are properly nested, and loops with multiple exits are not allowed. Consequently, the synthesized state machines are also block-structured. In contrast, the translation defined in Sect. 4 takes as input unstructured activity diagrams and constructs state machines that can be unstructured, for instance containing loops with multiple exits or unbalanced forks and joins, and that can contain cross-synchronization.

Translations from activity diagrams into state machines. State machines have been used to define a formal semantics for activity diagrams without object nodes, e.g. [2,6,9]. The main difference with our approach is that these translations focus on control flow only, ignoring object nodes. Furthermore, these translations do not define hierarchical state machines.

8 Conclusion

We have defined an approach for synthesizing an object life cycle from a business process model, making the implicit life cycle contained in the process model explicit. The approach is fully automated and has been implemented with the graph transformation tool GrGen [14]. Synthesized hierarchical state machines can be used to generate a software system supporting the business process, for instance using UML case tools [22,35], or to assess consistency with existing state machine descriptions [8,30].

Future work is to enlarge the scope of the translation in several ways. First, we plan to define a translation for BPMN models [3], whose notation for data (object) flows is similar to that of activity diagrams but whose semantics is somewhat different [3]. Next, we plan to further extend the translation to generate object-centric process designs, in which objects of different object types interact with each other.

References

1. Aho, A., Sethi, R., Ullman, J.: Compilers: Principles, Techniques, and Tools. Addison-Wesley (1986)

2. Börger, E., Cavarra, A., Riccobene, E.: An ASM Semantics for UML Activity Diagrams. In: T. Rus (ed.) Proc. International Conference on Algebraic Methodology and Software Technology (AMAST 2000), Lecture Notes in Computer Science 1826, pp. 293–308. Springer (2000)

3. BPMN Task Force: Business Process Model and Notation (BPMN) Version 2.0. Object Management Group (2011). OMG Document Number formal/2011-01-03.

4. Cabanillas, C., Resinas, M., Cortes, A.R., Awad, A.: Automatic generation of a data-centered view of business processes. In: H. Mouratidis, C. Rolland (eds.) CAiSE 2011, Lecture Notes in Computer Science, vol. 6741, pp. 352–366. Springer (2011)

5. Dick, N., Huschens, J.: IAA - the IBM insurance application architecture. In: P. Bernus, K. Mertins, G. Schmidt (eds.) Handbook on Architectures of Information Systems, International Handbooks on Information Systems, pp. 619–637. Springer Berlin Heidelberg (1998)

6. Dumas, M., Fjellheim, T., Milliner, S., Vayssiere, J.: Event-based coordination of process-oriented composite applications. In: W.M.P. van der Aalst, B. Benatallah, F. Casati, F. Curbera (eds.) Business Process Management, vol. 3649, pp. 236–251 (2005)

7. Engels, G., Förster, A., Heckel, R., Thöne, S.: Process modeling using UML. In: M. Dumas, W. van der Aalst, A. ter Hofstede (eds.) Process-Aware Information Systems, pp. 85–117. Wiley (2005)

8. Engels, G., Küster, J.M., Heckel, R., Groenewegen, L.: A methodology for specifying and analyzing consistency of object-oriented behavioral models. In: Proc. ESEC / SIGSOFT FSE, pp. 186–195 (2001)

9. Eshuis, R.: Symbolic model checking of UML activity diagrams. ACM Transactions on Software Engineering and Methodology 15(1), 1–38 (2006)

10. Eshuis, R.: Translating safe Petri nets to statecharts in a structure-preserving way. In: A. Cavalcanti, D. Dams (eds.) Proc. FM 2009, Lecture Notes in Computer Science, vol. 5850, pp. 239–255. Springer (2009)

11. Eshuis, R.: Statechartable Petri nets. Formal Aspects of Computing 25(5), 659–681 (2013)

12. Eshuis, R., Van Gorp, P.: Synthesizing object life cycles from business process models. In: P. Atzeni, D.W. Cheung, S. Ram (eds.) Proc. ER 2012, Lecture Notes in Computer Science, vol. 7532, pp. 307–320. Springer (2012)

13. Esparza, J.: Reduction and synthesis of live and bounded free choice Petri nets. Information and Computation 114(1), 50–87 (1994)

14. Geiß, R., Batz, G.V., Grund, D., Hack, S., Szalkowski, A.: GrGen: A fast SPO-based graph rewriting tool. In: A. Corradini, H. Ehrig, U. Montanari, L. Ribeiro, G. Rozenberg (eds.) Proc. ICGT 2006, Lecture Notes in Computer Science, vol. 4178, pp. 383–397. Springer (2006). URL http://dx.doi.org/10.1007/11841883_27

15. Giese, H., Graf, J., Wirtz, G.: Closing the gap between object-oriented modeling of structure and behavior. In: R.B. France, B. Rumpe (eds.) Proc. UML'99, Lecture Notes in Computer Science, vol. 1723, pp. 534–549. Springer (1999)

16. von Hanxleden, R., Fuhrmann, H., Spönemann, M.: KIELER - The KIEL Integrated Environment for Layout Eclipse Rich Client. In: Proceedings of the Design, Automation and Test in Europe University Booth (DATE'11). Grenoble, France (2011)

17. Harel, D.: On visual formalisms. Communications of the ACM 31(5), 514–530 (1988)

18. Harel, D., Kugler, H.: Synthesizing state-based object systems from LSC specifications. Int. Journal of Foundations of Computer Science 13(1), 5–51 (2002)

19. Harel, D., Kugler, H.: The Rhapsody semantics of statecharts (or, on the executable core of the UML) - preliminary version. In: H. Ehrig, W. Damm, J. Desel, M. Große-Rhode, W. Reif, E. Schnieder, E. Westkämper (eds.) Integration of Software Specification Techniques for Applications in Engineering, Lecture Notes in Computer Science 3147, pp. 325–354. Springer (2004)

20. Harel, D., Naamad, A.: The STATEMATE semantics of statecharts. ACM Transactions on Software Engineering and Methodology 5(4), 293–333 (1996)

21. Hecht, M., Ullman, J.: Characterizations of reducible flow graphs. J. ACM 21, 367–375 (1974)

22. I-Logix: Rhapsody (2005). Available at http://www.ilogix.com

23. Kumaran, S., Liu, R., Wu, F.Y.: On the duality of information-centric and activity-centric models of business processes. In: Z. Bellahsene, M. Leonard (eds.) CAiSE, Lecture Notes in Computer Science, vol. 5074, pp. 32–47. Springer (2008)

24. Küster, J.M., Ryndina, K., Gall, H.: Generation of business process models for object life cycle compliance. In: G. Alonso, P. Dadam, M. Rosemann (eds.) Proc. BPM, Lecture Notes in Computer Science, vol. 4714, pp. 165–181. Springer (2007)

25. Lin, C.P., Jeng, L.D., Lin, Y.P., Jeng, M.: Management and control of information flow in CIM systems using UML and Petri nets. Int. J. Computer Integrated Manufacturing 18(2&3), 107–121 (2005)

26. Liu, R., Kumar, A.: An analysis and taxonomy of unstructured workflows. In: W. van der Aalst, B. Benatallah, F. Casati, F. Curbera (eds.) Proc. 3rd Conference on Business Process Management (BPM 2005), Lecture Notes in Computer Science, vol. 3649, pp. 268–284 (2005)

27. Massink, M., Latella, D., Gnesi, S.: On testing UML statecharts. The Journal of Logic and Algebraic Programming 69(1–2), 1–74 (2006). DOI http://dx.doi.org/10.1016/j.jlap.2006.03.001. URL http://www.sciencedirect.com/science/article/pii/S1567832606000257

28. Mellor, S.J., Balcer, M.J.: Executable UML - A Foundation for Model-Driven Architecture. Addison Wesley object technology series. Addison-Wesley (2002)

29. Mendling, J., Hafner, M.: From WS-CDL choreography to BPEL process orchestration. J. Enterprise Inf. Management 21(5), 525–542 (2008)

30. Nejati, S., Sabetzadeh, M., Chechik, M., Easterbrook, S.M., Zave, P.: Matching and merging of statecharts specifications. In: Proc. ICSE, pp. 54–64. IEEE Computer Society (2007)

31. No Magic, Inc.: MagicDraw. http://www.magicdraw.com

32. Redding, G., Dumas, M., ter Hofstede, A.H.M., Iordachescu, A.: Generating business process models from object behavior models. IS Management 25(4), 319–331 (2008)

33. Schrefl, M., Stumptner, M.: On the design of behavior consistent specializations of object life cycles in OBD and UML. In: M.P. Papazoglou, S. Spaccapietra, Z. Tari (eds.) Advances in Object-Oriented Data Modeling, pp. 65–104. MIT Press (2000)

34. Shlaer, S., Mellor, S.: Object Oriented Life Cycles: Modeling the World in States. Prentice Hall (1991)

35. Software, I.R.: Rose (2005). Available at http://www.rational.com

36. Sparx Systems: Enterprise Architect. http://www.sparxsystems.eu/EnterpriseArchitect

37. Universal business language version 2.1. OASIS (2013)

38. UML Revision Taskforce: OMG UML Specification v. 1.5. Object Management Group (2003). OMG Document Number formal/2003-03-01. Available at http://www.uml.org

39. UML Revision Taskforce: UML 2.3 Superstructure Specification. Object Management Group (2010). OMG Document Number formal/2010-05-05.

40. Van Gorp, P.: Applying traceability and cloning techniques to compose input-destructive model transformations into an input-preserving chain. In: 1st Workshop on Composition and Evolution of Model Transformations, King's College, London, UK (2011)

41. Van Gorp, P., Eshuis, R.: Transforming process models: Executable rewrite rules versus a formalized Java program. In: D.C. Petriu, N. Rouquette, Ø. Haugen (eds.) Proc. MoDELS 2010, Lecture Notes in Computer Science, vol. 6395, pp. 258–272. Springer (2010)

42. Whittle, J., Jayaraman, P.K.: Synthesizing hierarchical state machines from expressive scenario descriptions. ACM Trans. Softw. Eng. Methodol. 19(3) (2010)

43. Whittle, J., Schumann, J.: Generating statechart designs from scenarios. In: ICSE, pp. 314–323 (2000)

A Formal definitions

This appendix contains formal definitions of the syntax of activity diagrams and state machines (Appendix A.1), the filtering rules (Appendix A.2), the transformation rules that result in state machines (Appendix A.3), the refactoring rules that deal with cross-synchronization (Appendix A.4), plus a proof of correctness (Appendix A.5).

A.1 Activity diagrams and state machines

An activity diagram is a graph that specifies the ordering of activities and the flow of objects that are used by the activities.

Definition 1 (Activity diagrams) An activity diagram is a tuple (A, O, C, E, guard, instate) where

– A is the set of activities;
– O is the set of object nodes. For this paper, we assume that all object nodes refer to the same object, which is therefore not formalized;
– C is the set of control nodes, also called pseudo nodes, partitioned into sets {i}, DM, FJ, F, where
    – i is the unique start node, which has no predecessor;
    – DM is the set of decisions and merges;
    – FJ is the set of forks and joins;
    – F is the set of final nodes, which have no successors;
– E ⊆ (A ∪ O ∪ C) × (A ∪ O ∪ C) is the set of flow edges, where an edge that enters or leaves an object node in O is called an object flow; all other edges are control flows;
– guard : E → L is a partial function assigning a guard label from the set of labels L to an edge. We only allow guard labels on edges leaving decision nodes;
– instate : O → S is a function that specifies for each object node the unique state in S that the object is in.

We put some constraints on the edge relation E:

– the induced graph is weakly connected, so for every pair of nodes n1, n2 ∈ A ∪ O ∪ C there is an undirected path between n1 and n2;
– i has no predecessor;
– each node f ∈ F has no successors;
– each activity a ∈ A has one incoming control flow and one outgoing control flow;
– each control node n ∈ C \ ({i} ∪ F), that is, each control node except the initial and final nodes, has a predecessor and a successor.
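The tuple of Definition 1 and the constraints on E can be sketched in Python as follows. This is a minimal illustration, not part of the implementation described in the paper; the class name `ActivityDiagram`, the field names, and the function `violations` are hypothetical, and nodes are assumed to be plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityDiagram:
    activities: set        # A
    objects: set           # O
    initial: str           # i, the unique start node
    decisions_merges: set  # DM
    forks_joins: set       # FJ
    finals: set            # F
    edges: set             # E, as (source, target) pairs
    guard: dict = field(default_factory=dict)    # partial map E -> L
    instate: dict = field(default_factory=dict)  # map O -> S

    def control(self):
        return ({self.initial} | self.decisions_merges
                | self.forks_joins | self.finals)

    def nodes(self):
        return self.activities | self.objects | self.control()

    def ins(self, n):
        return {x for (x, y) in self.edges if y == n}

    def outs(self, n):
        return {y for (x, y) in self.edges if x == n}


def violations(ad):
    """Return descriptions of the violated constraints (empty list if none)."""
    found = []
    # Weak connectivity: follow edges in both directions from the start node.
    seen, todo = {ad.initial}, [ad.initial]
    while todo:
        n = todo.pop()
        for m in (ad.ins(n) | ad.outs(n)) - seen:
            seen.add(m)
            todo.append(m)
    if seen != ad.nodes():
        found.append("not weakly connected")
    if ad.ins(ad.initial):
        found.append("start node has a predecessor")
    if any(ad.outs(f) for f in ad.finals):
        found.append("final node has a successor")
    for a in ad.activities:
        # Control flows are the edges that do not touch an object node.
        if len(ad.ins(a) - ad.objects) != 1 or len(ad.outs(a) - ad.objects) != 1:
            found.append(f"activity {a}: needs one control flow in and one out")
    for c in ad.control() - {ad.initial} - ad.finals:
        if not ad.ins(c) or not ad.outs(c):
            found.append(f"control node {c}: lacks predecessor or successor")
    if any(src not in ad.decisions_merges for (src, _) in ad.guard):
        found.append("guard on an edge that does not leave a decision")
    return found


ad = ActivityDiagram(
    activities={"check"}, objects={"claim"}, initial="i",
    decisions_merges=set(), forks_joins=set(), finals={"end"},
    edges={("i", "check"), ("check", "end"), ("check", "claim")},
    instate={"claim": "checked"},
)
bad = ActivityDiagram(
    activities={"check"}, objects={"claim"}, initial="i",
    decisions_merges=set(), forks_joins=set(), finals={"end"},
    edges=ad.edges | {("end", "i")},
)
```

In this sketch `violations(ad)` is empty, while `bad` breaks the constraints on the start and final nodes.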

As explained in Section 2, we consider in this paper only activity diagrams that process a single stateful object. Each different state of the object is represented with an object node. The stateful object has a life cycle that is represented in a state machine.

A state machine is a hierarchical hypergraph [17], consisting of nodes arranged in a tree and directed hyperedges. There are three types of nodes: BASIC, AND, and OR. BASIC nodes are leaves of the tree while AND and OR nodes are internal. The state machine formalization used here stems from earlier work [11]. That work also defines an execution semantics of state machines.

Definition 2 A state machine is a tuple (N, H, source, target, guard, child, default, r), where

– N is a set of nodes, which is partitioned into sets BN, AN, and ON, where
    – BN is a finite set of BASIC nodes, which are not decomposed into other nodes;
    – AN is a finite set of AND nodes, which specify parallel decomposition;
    – ON is a finite set of (X)OR nodes, which specify exclusive-or decomposition;
– H is a finite set of hyperedges, N ∩ H = ∅;
– source : H → P(N) is a function defining the non-empty set of source nodes for each hyperedge;
– target : H → P(N) is a function defining the non-empty set of target nodes for each hyperedge;
– guard : H → L is a partial function assigning a guard label from L to a hyperedge;
– child ⊆ N × N is a predicate that relates a child node to its parent node, so (n, n′) ∈ child means n is child of n′. We require that child arranges the nodes in N in a rooted tree, so every node in N, except the root, has one parent node, and every node is indirectly a child of the root. We require x ∈ BN if and only if {y | (y, x) ∈ child} = ∅, so only BASIC nodes have no children.
– default : ON → N is a function that identifies for each OR node n one of its children as the default node: default(n) ∈ children(n). As defined below in the semantics, if a hyperedge h enters n but does not explicitly enter any of its children, then h enters default(n).
– r ∈ N is the root of the tree induced by child. For technical reasons, r is required to be an OR node.
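The tree constraints that Definition 2 places on child can be checked mechanically. The following Python sketch is only an illustration under simplifying assumptions: the function name `is_valid_hierarchy` is hypothetical, nodes are strings, child is a set of (child, parent) pairs, and the requirement that r is an OR node is not checked.

```python
def is_valid_hierarchy(nodes, basic, child, root):
    """Check the tree constraints on child, given as (child, parent) pairs."""
    parents = {n: [p for (c, p) in child if c == n] for n in nodes}
    has_children = {p for (_, p) in child}
    if parents[root]:
        return False  # the root has no parent
    for n in nodes - {root}:
        if len(parents[n]) != 1:
            return False  # every other node has exactly one parent
        # Walk upwards: the walk must end in the root without cycling.
        seen, cur = set(), n
        while cur != root:
            if cur in seen or len(parents.get(cur, [])) != 1:
                return False
            seen.add(cur)
            cur = parents[cur][0]
    # x is a BASIC node if and only if x has no children.
    return all((n in basic) == (n not in has_children) for n in nodes)


nodes = {"r", "init", "a", "o1", "o2", "s1", "s2"}
basic = {"init", "s1", "s2"}
child = {("init", "r"), ("a", "r"), ("o1", "a"), ("o2", "a"),
         ("s1", "o1"), ("s2", "o2")}
```

Here `r` is the root OR node, `a` an AND node with two OR children, and `init`, `s1`, `s2` are BASIC leaves, so `is_valid_hierarchy(nodes, basic, child, "r")` holds.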

A.2 Filtering rules

Formal definitions are presented for the filtering rules presented in Section 3. Each rule operates on an activity diagram (A, O, C, E, guard, instate) and consists of a (pre)condition and an effect, which is the new activity diagram tuple (A′, O′, C′, E′, guard′, instate′). We only define the elements of the tuple that are changed, so if for instance A′ is omitted, then A′ = A.

To define the rules in a concise way, we introduce additional notation. Let n ∈ A ∪ O ∪ C be a node. Then in(n) = { n′ | (n′, n) ∈ E } and out(n) = { n′ | (n, n′) ∈ E }. For a binary relation R, the relation R(x ← y) is the relation obtained by replacing each appearance of y with x in each tuple of R [13].
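The three pieces of notation introduced here translate directly into set comprehensions. The sketch below is illustrative only; the function names `ins`, `outs`, and `replace` are hypothetical, and relations are assumed to be Python sets of tuples.

```python
def ins(E, n):
    """in(n): the predecessors of n in edge relation E."""
    return {x for (x, y) in E if y == n}


def outs(E, n):
    """out(n): the successors of n in edge relation E."""
    return {y for (x, y) in E if x == n}


def replace(R, x, y):
    """R(x <- y): replace each appearance of y with x in every tuple of R."""
    return {tuple(x if v == y else v for v in t) for t in R}


E = {("a", "b"), ("b", "c")}
renamed = replace(E, "z", "b")  # {("a", "z"), ("z", "c")}
```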

Definition 3 (Rule R1) Condition. There is a node n ∈ A ∪ C such that |in(n)| = |out(n)| = 1.

Effect. Let {n1} = in(n) and {n2} = out(n). Then

– A′ = A \ {n};
– C′ = C \ {n};
– E′ = (E ∩ ((A′ ∪ O′ ∪ C′) × (A′ ∪ O′ ∪ C′))) ∪ {(n1, n2)};
– instate′ = instate ∩ (O′ × S).
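Rule R1 can be sketched as a function that returns the changed components of the tuple, or None if the condition does not hold. This is an illustrative sketch, not the GrGen implementation: the name `apply_r1` is hypothetical, and the extra check that n is not its own predecessor is an added safeguard that is not part of Definition 3. Since O′ = O, instate is unchanged and therefore omitted.

```python
def apply_r1(A, C, E, n):
    """Remove node n (an activity or control node with exactly one
    predecessor and one successor) and bridge its neighbours."""
    preds = {x for (x, y) in E if y == n}
    succs = {y for (x, y) in E if x == n}
    if n not in A | C or len(preds) != 1 or len(succs) != 1 or n in preds:
        return None  # condition of R1 not satisfied (or n is a self-loop)
    (n1,), (n2,) = preds, succs
    A2, C2 = A - {n}, C - {n}
    # Keep only edges between remaining nodes, then add the bridging edge.
    E2 = {(x, y) for (x, y) in E if n not in (x, y)} | {(n1, n2)}
    return A2, C2, E2


A = {"check"}
C = set()
E = {("claim_new", "check"), ("check", "claim_checked")}
result = apply_r1(A, C, E, "check")
```

In this example the activity `check` is removed and the two object nodes become directly connected.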

Definition 4 (Rule R2) Condition. There are decisions/merges d1, d2 ∈ DM such that (d1, d2) ∈ E and |in(d2)| = 1.

Effect.

– C′ = C \ {d2};
– E′ = (E \ {(d1, d2)})(d1 ← d2);
– guard′ = (guard ∩ ((A ∪ O ∪ C′) × L)) ⊕ {(d2, x) ↦ g | x ∈ A ∪ O ∪ C′ ∧ g = guard((d1, d2)) ∧ guard((d2, x))},

where ⊕ denotes function overriding.

Definition 5 (Rule R3) Condition. There are decisions/merges d1, d2 ∈ DM such that (d1, d2) ∈ E and |out(d1)| = 1.

Effect.

– C′ = C \ {d1};
– E′ = (E \ {(d1, d2)})(d2 ← d1).

Definition 6 (Rule R4) Condition. There is a decision/merge n ∈ DM such that (n, n) ∈ E.

Effect.

– C′ = C \ {n};
– E′ = E \ {(n, n)}.

Definition 7 (Rule R5) Condition. There are fork/joins n1, n2 ∈ FJ such that (n1, n2) ∈ E and there is a simple path of length 2 or more from n1 to n2.

Effect.

– E′ = E \ {(n1, n2)}.
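The condition of Rule R5 asks for a simple path of length 2 or more from n1 to n2. One way to test this, sketched below, is to look for a successor m of n1 (other than n1 and n2) from which n2 is reachable without passing through n1; any such walk contains a simple path. The function names `reachable` and `apply_r5` are hypothetical, and the sketch assumes n1 ≠ n2.

```python
def reachable(E, start, goal, banned):
    """True if goal can be reached from start without visiting banned nodes."""
    seen, todo = {start}, [start]
    while todo:
        n = todo.pop()
        if n == goal:
            return True
        for (x, y) in E:
            if x == n and y not in seen and y not in banned:
                seen.add(y)
                todo.append(y)
    return False


def apply_r5(E, FJ, n1, n2):
    """Drop the direct edge (n1, n2) if a longer simple path makes it redundant."""
    if n1 not in FJ or n2 not in FJ or n1 == n2 or (n1, n2) not in E:
        return None
    detour = any(reachable(E, m, n2, {n1})
                 for (x, m) in E if x == n1 and m not in (n1, n2))
    return E - {(n1, n2)} if detour else None


FJ = {"f", "j"}
E = {("f", "j"), ("f", "o"), ("o", "j")}
reduced = apply_r5(E, FJ, "f", "j")  # {("f", "o"), ("o", "j")}
```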

Definition 8 (Rule R6) Condition. There is a fork or join n ∈ FJ and a final node f ∈ F such that (n, f) ∈ E and |out(f)| > 1.

Effect.

– C′ = C \ {f};
– E′ = E \ {(n, f)}.

Definition 9 (Rule R7) Condition. There are forks or joins n1, n2 ∈ FJ such that (n1, n2) ∈ E, |out(n1)| > 1 and |in(n2)| = 1.

Effect.

– C′ = C \ {n2};
– E′ = (E \ {(n1, n2)})(n1 ← n2).

Definition 10 (Rule R8) Condition. There are forks or joins n1, n2 ∈ FJ such that (n1, n2) ∈ E, |out(n1)| = 1 and |in(n2)| > 1.

Effect.

– C′ = C \ {n1};
– E′ = (E \ {(n1, n2)})(n2 ← n1).

Definition 11 (Rule R9) Condition. There is a decision d ∈ DM and object node o ∈ O such that

– there is a fork f ∈ FJ such that in(d) = {f} = in(o);
– out(d) = out(o);
– out(d), out(o) ⊆ FJ.

Effect.

– C′ = C \ {d};
– E′ = E ∩ ((A ∪ O ∪ C′) × (A ∪ O ∪ C′));
– guard′ = (guard ∩ ((A ∪ O ∪ C′) × L)) ⊕ {(o, x) ↦ g | x ∈ A ∪ O ∪ C′ ∧ g = guard((d, x))}.

Definition 12 (Rule R10) Condition. There is a merge m ∈ DM and object node o ∈ O such that

– there is a join j ∈ FJ such that out(m) = {j} = out(o);
– in(m) = in(o);
– in(m), in(o) ⊆ FJ.

Effect.

– C′ = C \ {m};
– E′ = E ∩ ((A ∪ O ∪ C′) × (A ∪ O ∪ C′)).

A.3 Transformation rules

There are three transformation rules. Each rule operates on an expanded activity diagram. An expanded activity diagram contains AND/OR trees, the leaves of which are object nodes or control nodes that are not bars. The internal nodes of each tree are AND and OR nodes. Edges only connect bars and OR nodes that root the AND/OR trees.

We first define expanded activity diagrams; next we define the initialization rule that transforms a filtered activity diagram into an expanded activity diagram.

Definition 13 (Expanded activity diagram) An expanded activity diagram is a tuple (A, O, C, E, guard, instate, AN, ON, parent) where

– A, O, C, guard, and instate are as defined for activity diagrams;
– E ⊆ (ON ∪ FJ) × (ON ∪ FJ) is the edge relation, now defined between OR nodes and forks and joins only;
– ON is the set of OR nodes;
– AN is the set of AND nodes;
– parent : (O ∪ C ∪ AN ∪ ON) → (AN ∪ ON) is a function that maps a child node to its unique parent node.

Definition 14 (Initialization) Condition. None of the filtering rules R1–R8 apply to the activity diagram (A, O, C, E, guard, instate), so A = ∅. There is no edge connecting two forks/joins: E ∩ (FJ × FJ) = ∅. If the latter constraint is violated, so there are forks or joins n1, n2 ∈ FJ and (n1, n2) ∈ E, then a dummy object node o can be inserted between the forks/joins, so (n1, n2) is replaced with (n1, o) and (o, n2) in E.

Effect. The expanded activity diagram is defined as the tuple (A′, O′, C′, E′, guard′, instate′, AN′, ON′, parent′), where

– A′ = A;
– O′ = O;
– C′ = C;
– E′ = E(on_n ← n) for each n ∈ A ∪ O ∪ (C \ FJ);
– guard′ = { (on_n, x) ↦ g | n ∈ DM ∧ x ∈ A ∪ O ∪ C ∧ g = guard(on_n, x) };
– instate′ = ∅;
– AN′ = AN;
– ON′ = { on_o | o ∈ O } ∪ { on_c | c ∈ C \ FJ };
– parent′ = { (n, on_n) | n ∈ O′ ∪ C′ }.

The transformation rules T1 and T2 operate on expanded activity diagrams. Like the filtering rules, each transformation rule has a condition and an effect, which is the definition of the new activity diagram (A′, O′, C′, E′, guard′, instate′, AN′, ON′, parent′).

We define two versions of T1. Version a operates on an edge connecting two OR nodes. Version b operates on a fork/join node that has a single predecessor and a single successor, both of which are OR nodes. Such a fork/join results after applying T2a to the source and target sets of the fork/join.

Definition 15 (Rule T1a) Condition. Let there be nodes o1, o2 ∈ ON such that out(o1) = {o2} and in(o2) = {o1}, and for every fork/join n ∈ FJ, o1 and o2 are not both input or both output for n, so {o1, o2} ⊈ in(n) and {o1, o2} ⊈ out(n).

Effect. Let o be a fresh OR node, so o ∉ ON.

– E′ = E \ {(o1, o2)};
– ON′ = (ON \ {o1, o2}) ∪ {o};
– parent′ = parent(o ← o1, o ← o2).

Definition 16 (Rule T1b) Condition. Let there be nodes o1, o2 ∈ ON such that there is a fork/join n ∈ FJ such that out(o1) = {n} and in(o2) = {n}, and for every other fork/join n′ ∈ FJ, o1 and o2 are not both input or both output for n′, so {o1, o2} ⊈ in(n′) and {o1, o2} ⊈ out(n′).

Effect. Let o be a fresh OR node, so o ∉ ON.

– C′ = C \ {n};
– E′ = E \ {(o1, n), (n, o2)};
– ON′ = (ON \ {o1, o2}) ∪ {o};
– parent′ = parent(o ← o1, o ← o2).

Definition 17 (Rule T2a) Condition. Let there be a fork/join n ∈ FJ and a set X of OR nodes such that in(n) = X or out(n) = X and such that for each pair of nodes x1, x2 ∈ X, in(x1) = in(x2) and out(x1) = out(x2).

Effect. Let a and o be a fresh AND and OR node, respectively, so a ∉ AN and o ∉ ON.

– E′ = E(o ← X);
– AN′ = AN ∪ {a};
– ON′ = ON ∪ {o};
– parent′ = parent ∪ { (x, a) | x ∈ X } ∪ { (a, o) }.

In the definition, E(o ← X) abbreviates E(o ← x1, . . . , o ← xn) where |X| = n.

Rule T2b is a minor variation on T2a that is needed if T3 has been applied. In that case, an expanded activity diagram results in which there are no edges but only OR nodes that root the AND/OR trees. One of these AND/OR trees is rooted by the node whose incident edges were removed in T3.

Definition 18 (Rule T2b) Condition. Let X be a set of OR nodes. Each node x ∈ X has no parent and every OR node o ∈ ON \ X does have a parent. The set of edges E is empty.

Effect. Let a and o be a fresh AND and OR node, respectively, so a ∉ AN and o ∉ ON.

– AN′ = AN ∪ {a};
– ON′ = ON ∪ {o};
– parent′ = parent ∪ { (x, a) | x ∈ X } ∪ { (a, o) }.
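Rules T2a and T2b both wrap a set X of OR nodes in a fresh AND node a, which is placed under a fresh OR node o. The edge-based variant T2a can be sketched in Python as follows. The function name `apply_t2a` is hypothetical, parent is modeled as a dictionary from child to parent, and the requirement |X| > 1 is an assumption added here so that the rule cannot be applied to its own result; it is not stated in Definition 17.

```python
def apply_t2a(E, AN, ON, parent, n, a, o):
    """Group the OR nodes around fork/join n under fresh nodes a (AND) and o (OR)."""
    ins = lambda m: frozenset(x for (x, y) in E if y == m)
    outs = lambda m: frozenset(y for (x, y) in E if x == m)
    for X in (ins(n), outs(n)):
        # All nodes in X must share the same predecessors and successors.
        same_context = len({(ins(x), outs(x)) for x in X}) == 1
        if len(X) > 1 and X <= ON and same_context:
            # E(o <- X): redirect every edge end in X to the fresh OR node o.
            E2 = {(o if x in X else x, o if y in X else y) for (x, y) in E}
            parent2 = {**parent, **{x: a for x in X}, a: o}
            return E2, AN | {a}, ON | {o}, parent2
    return None


E = {("f", "o1"), ("f", "o2"), ("o1", "j"), ("o2", "j")}
ON = {"o1", "o2"}
E2, AN2, ON2, parent2 = apply_t2a(E, set(), ON, {}, "f", "a", "o")
```

In the example, the two parallel branches `o1` and `o2` between fork `f` and join `j` collapse into a single OR node `o`, leaving the edges `(f, o)` and `(o, j)`, to which T1b could subsequently apply.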

Definition 19 (Rule T3) Let there be a fork f and join j, where f, j ∈ FJ, and an OR node o ∈ ON such that (f, o), (o, j) ∈ E.

Condition. T1 and T2 are not applicable, and |out(f)| > 1 and |in(j)| > 1.

Effect. All edges from and to o are removed, so

– E′ = E ∩ (((ON \ {o}) ∪ FJ) × ((ON \ {o}) ∪ FJ)).

Definition 20 (Finalization) Let AD = (A, O, C, E, guard, instate) be the filtered activity diagram and let EAD = (A, O, C, E, guard, instate, AN, ON, parent) be the expanded activity diagram that results after repeatedly applying T1, T2 and T3 to the expanded activity diagram derived from AD.

Condition. Rules T1, T2, and T3 are not applicable to EAD. Next, the set of edges of EAD must be empty, so E_EAD = ∅. This condition implies that the set of forks and joins of EAD is empty.

Effect. The state machine tuple becomes (N′, H′, source′, target′, guard′, child′, default′, r′), where

– N′ = BN′ ∪ AN′ ∪ ON′, where
    – BN′ = { instate(o) | o ∈ O_AD } ∪ (C_AD \ FJ_AD)
    – AN′ = AN_EAD
    – ON′ = ON_EAD
– H′ = FJ ∪ { e ∈ E_AD | |in(e)| = |out(e)| = 1 }
– source′ = { (x, y) ↦ {x} | (x, y) ∈ E_AD ∩ H′ } ∪ { n ↦ in(n) | n ∈ FJ_AD }(instate(o) ← o, for each o ∈ O_AD)
– target′ = { (x, y) ↦ {y} | (x, y) ∈ E_AD ∩ H′ } ∪ { n ↦ out(n) | n ∈ FJ_AD }(instate(o) ← o, for each o ∈ O_AD)
– guard′ = { h ↦ g | h ∈ H′ ∧ h ∈ E_AD ∧ g = guard_AD(h) }
– child′ = parent_EAD(instate(o) ← o, for each o ∈ O_AD)
– default′ = { (on_i, i) }
– r′ = on_i

Function default only specifies the default completion for the root node, which contains the initial node of the activity diagram as BASIC node. The default nodes are redundant for regular edges since the source set and target set of each hyperedge are complete: no default nodes need to be added to enter a valid configuration [11].

A.4 Refactoring

If T3 was applied to synthesize a state machine, then the generated state machine is not consistent with the filtered activity diagram, and the state machine needs to be refactored, as discussed in Section 5.

The refactoring rules use state machines with events and guards. Consequently, the state machine tuples are expanded with a set E of events and partial functions event, action : H → E. Next, we allow guard conditions to reference BASIC nodes using predicate in(n), where n ∈ BN, which is true if n is currently active. The intended meaning is that if for hyperedge h these functions are defined, h is enabled if event(h) occurs, the states in source(h) are active, and guard(h) evaluates to true. If h is taken, action(h) is generated.
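The intended enabling condition can be sketched as a small Python predicate. This is an illustration under stated assumptions rather than the execution semantics of [11]: the name `is_enabled` is hypothetical, guards are modeled as Python callables over the set of active BASIC nodes (so in(n) becomes `n in active`), and a hyperedge without an event or guard is assumed to need no trigger and to have a guard that is trivially true.

```python
def is_enabled(h, source, event, guard, active, occurring):
    """active: set of active BASIC nodes; occurring: set of current events."""
    if not source[h] <= active:
        return False  # some source state of h is not active
    if h in event and event[h] not in occurring:
        return False  # the trigger event of h does not occur
    return guard[h](active) if h in guard else True


source = {"h": {"s1"}}
event = {"h": "e"}
guard = {"h": lambda active: "s2" in active}  # models the predicate in(s2)

ok = is_enabled("h", source, event, guard, {"s1", "s2"}, {"e"})
blocked = is_enabled("h", source, event, guard, {"s1"}, {"e"})
```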

We now define an initialization rule and two refactoring rules. As before, when defining the new state machine tuples only the changed elements are defined.

Definition 21 (Initialization refactoring) Let (N, H, source, target, guard, child, default, r) be a state machine. Let O ∈ ON be an OR node in the state machine.

Condition. Node O was processed in T3.

Effect. The state machine tuple becomes (N′, H′, source′, target′, guard′, child′, default′, r′, E′, event′, action′), where

– N′ = BN′ ∪ AN′ ∪ ON′, where
    – BN′ = BN ∪ { init_O }
    – AN′ = AN
    – ON′ = ON
– child′ = child ∪ { (init_O, O) }
– default′ = default ∪ { (O, init_O) }
– E′ = ∅
– event′ = ∅
– action′ = ∅

The first refactoring rule processes hyperedges that enter the OR node O that was processed in T3.

Definition 22 (Refactoring-1) Let (N, H, source, target, guard, child, default, r, event, action) be a state machine. Let O ∈ ON be an OR node in the state machine. Let h ∈ H be a hyperedge.

Condition. The initialization rule has been applied. Node O was processed in T3. A target node of h is contained in O, so there is a BASIC node n ∈ target(h) such that child∗(n, O).

Effect. The state machine tuple becomes (N′, H′, source′, target′, guard′, child′, default′, r′, E′, event′, action′), where

– H′ = H ∪ { copy_h }
– source′ = source ∪ { (copy_h, init_O) }
– target′ = (target ∪ { (copy_h, X) }) ⊕ { (h, target(h) \ X) }
– E′ = E ∪ { event_h }
– event′ = event ∪ { (copy_h, event_h) }
– action′ = action ∪ { (h, event_h) }

where X = { n | n ∈ target(h) ∧ child∗(n, O) }.
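The effect of Refactoring-1 can be sketched as follows. The sketch is illustrative and makes several assumptions: the names `descendants` and `refactor_1` are hypothetical, source, target, event and action are dictionaries, copy_h and event_h are encoded as the tuples `("copy", h)` and `("event", h)`, and the source of copy_h is stored as the singleton set containing init_O.

```python
def descendants(child, top):
    """All nodes n with child*(n, top), excluding top itself."""
    found, todo = set(), [top]
    while todo:
        p = todo.pop()
        for (c, q) in child:
            if q == p and c not in found:
                found.add(c)
                todo.append(c)
    return found


def refactor_1(h, O, init_O, child, H, source, target, events, event, action):
    """Split off the targets of h inside O into a copy of h triggered by event_h."""
    X = target[h] & descendants(child, O)
    if not X:
        return None  # h does not enter O, so the rule does not apply
    copy_h, event_h = ("copy", h), ("event", h)
    return (
        H | {copy_h},
        {**source, copy_h: {init_O}},
        {**target, copy_h: X, h: target[h] - X},  # function overriding for h
        events | {event_h},
        {**event, copy_h: event_h},
        {**action, h: event_h},
    )


child = {("s", "O"), ("init_O", "O"), ("O", "a"), ("t", "P"), ("P", "a")}
out = refactor_1("h", "O", "init_O", child, {"h"},
                 {"h": {"u"}}, {"h": {"s", "t"}}, set(), {}, {})
H2, source2, target2, events2, event2, action2 = out
```

After the rule, `h` only enters `t`, and its action generates the event that triggers the copy, which moves from `init_O` to `s`.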

In the definition of target′, operator ⊕ denotes function overriding.

The second refactoring rule processes hyperedges that leave the OR node O that was processed in T3.

Definition 23 (Refactoring-2) Let (N, H, source, target, guard, child, default, r, event, action) be a state machine. Let O ∈ ON be an OR node in the state machine. Let h ∈ H be a hyperedge.

Condition. The initialization rule has been applied. Node O was processed in T3. A source node of h is contained in O, so there is a BASIC node n ∈ source(h) such that child∗(n, O).

Effect. The state machine tuple becomes (N′, H′, source′, target′, guard′, child′, default′, r′, E′, event′, action′), where

– H′ = H ∪ { copy_h }
– source′ = (source ∪ { (copy_h, X) }) ⊕ { (h, source(h) \ X) }
– target′ = target ∪ { (copy_h, init_O) }
– guard′ = guard ⊕ { h ↦ g | g = guard(h) ∧ ⋀_{x ∈ X} in(x) }
– E′ = E ∪ { event_h }
– event′ = event ∪ { (copy_h, event_h) }
– action′ = action ∪ { (h, event_h) }

where X = { n | n ∈ source(h) ∧ child∗(n, O) }.

A.5 Correctness

We prove the transformation from filtered activity diagrams to state machines correct for a subset of activity diagrams. We first introduce auxiliary definitions.

Definition 24 (Minimal areas) Let AD = (A, O, C, E, guard, instate) be a filtered activity diagram such that no edge (x, y) connects two fork/join nodes, so E ∩ (FJ × FJ) = ∅. An area is a set X ⊆ A ∪ O ∪ C such that for each x ∈ X,

– if x ∈ FJ then in(x) ⊆ X ⇔ out(x) ⊆ X;
– if x ∉ FJ then in(x) ∪ out(x) ⊆ X.

A minimal area for a set Y ⊆ A ∪ O ∪ C of nodes is an area X such that Y ⊆ X and X is minimal, so for each area Z, if Z ⊂ X then Y ⊈ Z.

For instance, for Fig. 6 the minimal area for {Claim[policy not checked]} is the set {Claim[policy not checked], Claim[policy checked]}, while the minimal area for {Claim[policy not checked], Claim[damage not checked]} is the set of successor nodes up to and including the merge node.

Definition 25 (Consistent areas) Let AD = (A, O, C, E, guard, instate) be an activity diagram such that no edge (x, y) connects two fork/join nodes, so E ∩ (FJ × FJ) = ∅. A fork/join node n ∈ FJ has consistent areas if for each pair of disjoint sets X, Y such that X, Y ⊆ in(n) or X, Y ⊆ out(n) the minimal areas of X and Y are disjoint. Activity diagram AD has consistent areas if each fork/join node n ∈ FJ has consistent areas.

For instance, the activity diagram in Fig. 6 has consistent areas. For example, the minimal areas for sets {Claim[received]} and {Claim[policy not checked], Claim[damage not checked]} are disjoint.

The activity diagram in Fig. 15 does not have consistent areas. The minimal areas for sets {name[S1]} and {name[S5]} overlap, since both contain name[S4]. Also, the activity diagram in Fig. 17(a) does not have consistent areas: the minimal area for {name[S2]} is a subset of the minimal area for {name[S3]}, which is the entire set of nodes minus the fork.

Theorem 1 Let AD = (A, O, C, E, guard, instate) be a filtered activity diagram, so A = ∅, and let SC = (N, H, source, target, guard, child, default, r) be the state machine that results by applying the transformation rules T1, T2, and T3. If AD has consistent areas, then AD and SC have equivalent behavior.

Proof Activity diagram AD maps into a marked Petri net PN = (P, T, F, M) where

– P = (O ∪ C) \ FJ
– T = FJ ∪ { t_(x,y) | (x, y) ∈ E ∧ x, y ∉ FJ }
– F = { (x, y) | (x, y) ∈ E ∧ ((x ∈ FJ ∧ y ∉ FJ) ∨ (x ∉ FJ ∧ y ∈ FJ)) } ∪ { (x, t_(x,y)), (t_(x,y), y) | (x, y) ∈ E ∧ x, y ∉ FJ }
– M = { (i, 1) } ∪ { (x, 0) | x ∈ P \ {i} }

The definition states that every node in (O ∪ C) \ FJ maps to a Petri net place. Every node in FJ and every edge e ∈ E that is not incident to a fork or a join maps to a Petri net transition. For every edge that connects two nodes in FJ a place is created. The flow relation F is defined accordingly.
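The mapping from a filtered activity diagram to a marked Petri net can be sketched in Python as follows. The function name `to_petri_net` is hypothetical, the transition t_(x,y) is encoded as the tuple `("t", x, y)`, and the sketch assumes, as in Theorem 1's setting, that FJ ⊆ C and that no edge connects two fork/join nodes.

```python
def to_petri_net(O, C, FJ, E, i):
    """Map object/control nodes to places and forks/joins and plain edges to transitions."""
    P = (O | C) - FJ
    # One fresh transition t_(x,y) per edge between two non-fork/join nodes.
    t = {(x, y): ("t", x, y) for (x, y) in E if x not in FJ and y not in FJ}
    T = FJ | set(t.values())
    # Edges with exactly one fork/join end are kept as arcs.
    F = {(x, y) for (x, y) in E if (x in FJ) != (y in FJ)}
    # Each remaining edge is split into an arc into and an arc out of t_(x,y).
    F |= {(x, t[x, y]) for (x, y) in t} | {(t[x, y], y) for (x, y) in t}
    # Initial marking: one token in the start node, none elsewhere.
    M = {p: int(p == i) for p in P}
    return P, T, F, M


O = {"a", "b", "c"}
C = {"i", "fk"}
FJ = {"fk"}
E = {("i", "fk"), ("fk", "a"), ("fk", "b"), ("a", "c")}
P, T, F, M = to_petri_net(O, C, FJ, E, "i")
```

In this example the fork `fk` becomes a transition with input place `i` and output places `a` and `b`, and the edge from `a` to `c` becomes the transition `("t", "a", "c")`.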

Elsewhere [10] we have defined transformation rules T1 and T2 plus the notion of areas in the context of Petri nets. There, we proved that if PN has consistent areas and the areas are nestable, so for every pair of overlapping areas X, Y either X ⊆ Y or Y ⊆ X, then applying transformation rules T1 and T2 to PN = (P, T, F, M) results in a state machine SC that has equivalent behavior, in the sense that the reachable markings of PN are isomorphic to the reachable configurations of SC.

We now argue that if rule T3 has been applied, the behavior of the filtered activity diagram is preserved in the refactored state machine. Let O be an OR node to which T3 is applied. Let f ∈ FJ be a fork node of which O is a target node in the reduced activity diagram. In the filtered activity diagram, if f is taken then all nodes in in(f) are left and all nodes in out(f) are entered. The state machine before refactoring contains a hyperedge f whose source set by construction (Def. 20) equals in(f) and whose target set equals out(f). Refactoring rule 1 ensures that when in the refactored state machine hyperedge f is taken, every BASIC node that is entered is in the target set of f before refactoring. Since AD has consistent areas, if in(f) is active, then no node in out(f) is active. Therefore f only leaves BASIC nodes that are sources of f before refactoring, plus the fresh node init_O. Therefore, when f is taken in the filtered activity diagram, the nodes left and entered correspond to the BASIC nodes left and entered when hyperedge f is taken in the refactored state machine.

Symmetrically, let j ∈ FJ be a join node of which O is a source node in the reduced activity diagram. Refactoring rule 2 ensures that when in the refactored state machine hyperedge j is taken, every BASIC node that is left is in the source set of j before refactoring, and every BASIC node that is entered is either in the target set of j before refactoring or init_O. Because of the guard condition, j only leaves BASIC nodes that are sources of j before refactoring. Therefore, when j is taken in the filtered activity diagram, the nodes left and entered correspond to the BASIC nodes left and entered when hyperedge j is taken in the refactored state machine.

Consequently, the behavior of the refactored state machine corresponds to that of the filtered activity diagram. □