Generating Structured Implementation Schemes from UML Sequence Diagrams

Petri Selonen, Tarja Systä, Kai Koskimies
Tampere University of Technology

Software Systems Laboratory, P.O. Box 553

FIN-33101 Tampere, Finland
{pselonen, tsysta, kk}@cs.tut.fi

Abstract

In the Unified Modeling Language (UML), a use case describes a particular functionality a system can perform by interacting with outside actors. A realization of a use case can be given as a set of sequence diagrams. This paper discusses how to generate structured implementation schemes, represented as pseudocode, from a set of sequence diagrams in UML. The proposed approach can be applied to any set of sequence diagrams, allowing the user to view the implementations of operation bodies as implied by this set of sequence diagrams, and to merge the different views into a single implementation scheme that can be used as a starting point for the actual implementation. We show how these techniques can be exploited in a UML-based CASE environment by augmenting an automatically generated class diagram with UML notes describing implementation schemes for individual operations. The described techniques have been implemented in a real CASE environment.

1. Introduction

UML [3, 16, 20, 21] has become an industrial standard for the presentation of various design artifacts in object-oriented software development. UML provides different diagram types that can be used to view a system from different perspectives or at different levels of abstraction. With use cases, the designer can specify sequences of actions a system can perform by interacting with outside actors [16]. A realization of a use case can be specified by sequence diagrams, which describe collaborations and interactions of objects.

This paper discusses how to generate structured pseudocode implementations for operations on the basis of a set of sequence diagrams, typically refining a use case. This technique offers the user a novel way of viewing automatically produced operation bodies as implied by chosen sequence diagrams. Hence, the pseudocode representation reduces the conceptual gap between sequence diagram models and the actual implementation. The technique is especially suitable in a use case driven approach, where a system model is built incrementally by adding new use cases to the model. It allows the designer to examine the implementations of operation bodies from the viewpoint of a single use case (which may represent a particular type of functionality), and then merge these implementations. This resembles the way aspects are woven in Aspect-Oriented Programming [9].

We show how operation descriptions are synthesized from the sequence diagrams as state machines. The state machines, in turn, are used as input for a pseudocode generator, which outputs a sketch of the operation body in a pseudocode format.

0-7695-1251-8/01 $10.00 © 2001 IEEE

The generation of executable code from state machines is a well-known technique, exploited in many UML-based tools. For example, with Rhapsody [8] and STATEMATE [7], the user can generate code from statecharts [6]. To be able to generate code that compiles and executes, the underlying statechart has to be exact, consistent, and complete. This, however, is not our goal: we are not aiming at executable models or conventional code generation (although, in principle, our approach could be refined to actual code generation). Instead, we aim at high-level, readable descriptions of operation bodies at early stages of design, generated from sequence diagram models that can be incomplete and imperfect. Note that for most operations of objects appearing in conventional OO systems, the most natural description format is textual. If, however, the dynamic behavior of an operation is such that a structured pseudocode representation cannot be formed, a statechart representation is shown instead.

A class diagram is used to specify the static structure of a system in UML. Class diagrams are often annotated with pseudocode descriptions of operation implementations, attached as notes to class symbols. For example, such descriptions are frequently used in design pattern documentation [5]. This information depends heavily on the dynamic model of the system, as illustrated by the sequence diagrams. We show how the pseudocode generation technique can be conveniently integrated with a technique for generating class diagrams from sequence diagrams, representing operation bodies in pseudocode notes attached to class symbols in a class diagram.

Our implementation platform, the Nokia TED [23], is a multi-user software development environment that has been implemented at the Nokia Research Center. TED supports most of the UML diagram types and a reasonably large subset of the UML metamodel as a component library. The transformation algorithms presented in this paper are implemented as COM components interoperable with TED. However, in principle, the proposed techniques can be implemented for any tool supporting UML and providing a reasonable API for accessing the model repository.

2. Model synthesis in UML

Automated support for synthesizing one UML model from another can provide significant help for the designer. Such synthesis operations help the designer to keep the models consistent, to speed up the design process, and to decrease the risk of errors. Normally, every iteration phase increases the information present in the design. The new information must somehow be translated and merged into existing models. Usually this is done manually, which is often tedious and can easily lead to mismatches and inconsistencies due to human errors. Model synthesis is also an efficient way of creating intermediate documentation and can thus be used for enhancing communication between designers and management.

Sequence and class diagrams have central roles in the software development process. While a class diagram is used to view a collection of static model elements and their relationships, a sequence diagram provides an example of the dynamic behavior of the system. Sequence and class diagrams are also useful in various automated model transformation operations [17]. For example, both class and statechart diagrams can be synthesized from a set of sequence diagrams [11, 12, 17, 22]. In what follows, we discuss two model transformations used in our work.

2.1 Generating state machines for operation calls

Koskimies and Mäkinen have demonstrated how a minimal state machine can be synthesized from sequence diagrams automatically [10]. This algorithm has been integrated with TED for synthesizing a state machine for a selected participating object from a set of sequence diagrams. The algorithm first extracts a trace from the sequence diagrams by traversing the lifeline of the object from top to bottom in each sequence diagram. The algorithm then maps items in the message trace to transitions and states in a state machine. Sent messages are regarded as primitive actions associated with states. Each received message is mapped to a transition. A synthesized state machine is deterministic, i.e., there cannot be two similarly labeled leaving transitions in any particular state. If the events causing the transitions are the same but the guards differ, it is legal to have them as leaving transitions of the same state. In other words, the synthesis algorithm does not allow two applicable transitions to be simultaneously satisfied. In addition, the algorithm does not allow a completion transition and a labeled transition to leave the same state unless their guards differ. Taking the conditions of determinism into account, the synthesis algorithm gives the minimal state machine with respect to the number of states.
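The trace extraction and the mapping of messages to states and transitions can be illustrated with a small Python sketch. It is not the minimization algorithm of [10]: it only merges common prefixes of traces and rejects non-deterministic input, so the result is deterministic but not necessarily minimal. The tuple format of a diagram and the names `State`, `extract_trace`, and `add_trace` are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    action: str = ""                          # sent message (primitive action), may be empty
    out: dict = field(default_factory=dict)   # received message (label) -> target State


def extract_trace(diagram, obj):
    """Traverse the lifeline of obj from top to bottom.

    diagram is a list of (sender, receiver, message) tuples in time order.
    """
    trace = []
    for sender, receiver, message in diagram:
        if sender == obj:
            trace.append(("sent", message))
        elif receiver == obj:
            trace.append(("recv", message))
    return trace


def _step(state, label, action):
    # Follow an existing transition with this label, or create a new one.
    target = state.out.get(label)
    if target is None:
        target = state.out[label] = State(action)
    elif target.action != action:
        raise ValueError(f"transition {label!r} would be non-deterministic")
    return target


def add_trace(start, trace):
    """Merge one trace into the machine rooted at start (prefix merging only)."""
    current, pending = start, None
    for kind, message in trace:
        if kind == "recv":
            if pending is not None:            # two received messages in a row
                current = _step(current, pending, "")
            pending = message
        else:                                  # a sent message becomes a state action
            current = _step(current, pending or "", message)
            pending = None
    if pending is not None:
        _step(current, pending, "")
    return start
```

Calling `add_trace` once per sequence diagram on the same start state accumulates the behavior of all diagrams into one machine.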

The synthesis algorithm can also be used for synthesizing a state machine for the behavior of an operation (instead of an object) from sequence diagrams, provided that the messages involved in the execution of an operation can be identified in the sequence diagrams. For simplicity, we assume that the object of interest is a conventional passive object providing services through its operations. In a sequence diagram, an operation call for an object is shown with a received message sent by the caller object. The corresponding return from the operation is shown with a sent message received by the caller object. Assuming that the called object does not receive another call before completing the call (for example, callbacks), all the leaving messages between the call and the return are internal calls of operations of other objects, and all arriving messages are returning counterparts of these calls. If the object does receive calls during the ongoing call, the events associated with such calls are excluded. We assume that this can be done in one way or another. Sufficient information for this is that the returns are fully marked or that the so-called focus of control [16] has been used. All other events associated with the object between the call and the corresponding return are read and given as input for the state machine synthesizer. The general synthesis algorithm is described in more detail in [10, 11, 18].
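A minimal sketch of this event selection, assuming that every return is explicitly marked: the events on the lifeline of the called object are given as hypothetical `(direction, kind, name)` tuples, where `direction` is `"in"` or `"out"` and `kind` is `"call"` or `"return"`. Events belonging to calls received during the ongoing call are skipped by tracking the nesting depth.

```python
def operation_traces(events, operation):
    """Return one trace per invocation of `operation` on the called object."""
    traces, current, depth = [], [], 0   # depth 0: outside the operation
    for direction, kind, name in events:
        if direction == "in" and kind == "call":
            # The call of interest opens depth 1; calls arriving during it nest deeper.
            if depth > 0 or name == operation:
                depth += 1
            continue
        if depth == 0:
            continue
        if direction == "out" and kind == "return":
            depth -= 1
            if depth == 0:               # return of the operation itself
                traces.append(current)
                current = []
            continue
        if depth == 1:                   # internal call or its returning counterpart
            current.append(("sent" if direction == "out" else "recv", name))
    return traces
```

Each returned trace has the same `("sent" | "recv", message)` shape that a trace-based state machine synthesizer would take as input.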

In this research, the state machine synthesis algorithm is used as an intermediate step in the implementation scheme generation. To enable the generation of descriptive and understandable implementation schemes from a state machine, we have modified the original synthesis algorithm to identify receivers for each operation call. Thus, for each operation call, the algorithm appends the name of the receiver to the name of the message. The original synthesis algorithm identifies the receivers only when they cannot be uniquely determined, i.e., when the same message is sent to several objects (e.g., a broadcast).

2.2 Generating class diagrams from sequence diagrams

A class diagram can be synthesized on the basis of sequence diagrams. In Section 4, we combine this transformation technique with the pseudocode synthesis of operations. We divide the transformation operation into two phases. First, we translate elements in a sequence diagram into a class diagram in the following way: we map classifier roles (participants) and messages of a sequence diagram to classes, associations, and operations in a class diagram. This mapping is fairly straightforward: if there is a message between two objects, then there must be an association between the classes of the objects, and the class of the receiving object must have the corresponding operation. Second, we use a collection of heuristic rules to generate interface hierarchies, composition relationships, and multiplicities. The suggestions given by these rules are guaranteed to be consistent, even though they do not necessarily match the intentions of the designer. However, the suggestions give useful hints for the designer and inform her about the appearance of certain types of patterns in the sequence diagrams. The suggestions concern patterns that may have certain implications in a class diagram. As an example, if an object interacts with several instances of the same class, the rules suggest adding a multiplicity symbol (*) to the association generated from this interaction. These rules are discussed in more detail in [17].
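The first phase and the multiplicity heuristic mentioned above can be sketched as follows. This is only an illustration under assumed data formats: each message is a hypothetical tuple `(sender_object, sender_class, receiver_object, receiver_class, operation)`, and an empty string stands for "no multiplicity suggestion". The other heuristic rules of [17] (interface hierarchies, composition) are not covered.

```python
from collections import defaultdict


def synthesize_class_diagram(messages):
    """Map sequence diagram messages to classes, operations, and associations."""
    classes = defaultdict(set)    # class name -> operation names
    partners = defaultdict(set)   # (sender object, sender class, receiver class) -> receiver objects
    for s_obj, s_cls, r_obj, r_cls, operation in messages:
        classes[s_cls]                      # make sure the sender's class exists
        classes[r_cls].add(operation)       # the receiver's class gets the operation
        partners[(s_obj, s_cls, r_cls)].add(r_obj)

    associations = {}             # (sender class, receiver class) -> "*" or ""
    for (_, s_cls, r_cls), receivers in partners.items():
        key = (s_cls, r_cls)
        # Suggest "*" when one object talks to several instances of the same class.
        if len(receivers) > 1 or associations.get(key) == "*":
            associations[key] = "*"
        else:
            associations[key] = ""
    return dict(classes), associations
```

The returned dictionaries are a deliberately flat stand-in for a real model repository, chosen to keep the mapping itself visible.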

3. Transforming state machines into pseudocode

In UML, a use case defines a coherent unit of functionality of a (sub)system without revealing the internal structure of the (sub)system [16]. Sequence diagrams and collaboration diagrams can be used for specifying realizations of use cases. In the previous section, we explained how it is possible to synthesize a state machine (a statechart diagram in UML) for an operation on the basis of sequence diagrams. In this section, we show how a state machine can be transformed into structured pseudocode, which is usually the most natural representation format for operation bodies.

3.1 Pseudocode generation

We define a simple graph grammar that gives the basic production rules used for decomposing a state machine. The production rules define a language that the pseudocode generator can accept. The basic structures that we are looking for are well known and date back to the very first imperative programming languages, such as Algol. Dijkstra [4] suggests three types of decomposition: concatenation, selection, and repetition. Concatenation includes statements that can be executed sequentially. Selection includes basic branching structures: if, if-then-else, and case-of. Repetition includes while and repeat-until. Dijkstra argues that programs should be written using these constructs. We adopt these control structures as the basis of our grammar.

It is obvious that only a subset of all state machines conforms to the language defined by these rules. For example, interleaved control structures cannot be expressed by these rules alone. On the other hand, it would be trivial to generate pseudocode from an arbitrary state machine using switch-case structures or goto instructions. However, we do not consider that kind of approach useful, since our purpose is to give an intuitive, natural overview of an operation body as it might be implemented according to current design information. If such a readable, high-level description cannot be generated, we expect that the state machine representation is likely to be more understandable than a textual form based on, say, a switch-case pattern. Since the state machine will be produced in any case as a side effect in our approach, we can display the description of an operation as a state machine (UML statechart diagram) in such cases.
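To make the contrast concrete, the following sketch shows the kind of switch-case generation that the text considers trivial but unhelpful. It is not part of the described tool; the function name and the input format (a dictionary of state actions and a list of `(source, label, target)` transitions) are assumptions for illustration.

```python
def switch_case_pseudocode(start, actions, transitions):
    """Emit switch-case pseudocode for an arbitrary state machine."""
    lines = [f"state = {start}", "while( true ) {", "    switch( state ) {"]
    for state, action in actions.items():
        lines.append(f"    case {state}:")
        if action:
            lines.append(f"        {action}")
        outgoing = [(label, target) for source, label, target in transitions
                    if source == state]
        if not outgoing:                      # a final state ends the operation
            lines.append("        return")
            continue
        for index, (label, target) in enumerate(outgoing):
            keyword = "if" if index == 0 else "else if"
            condition = label or "true"       # an empty label is always taken
            lines.append(f"        {keyword}( {condition} ) state = {target}")
        lines.append("        break")
    lines += ["    }", "}"]
    return "\n".join(lines)
```

Every state machine yields output of this shape, but the control flow is hidden in the explicit state variable, which is why a structured form or the state machine diagram itself is preferred.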

A central problem in transforming a state machine into pseudocode is the parsing of a state machine. There exist techniques and tools for graphical parsing [13, 15, 19]. Our approach can be seen as an instance of a class of high-level graph parsing algorithms. However, in our case, applying a general graphical parsing algorithm would have been inappropriate, because it is possible to derive a more efficient solution for our simple grammar directly.

The algorithm takes a state machine as its input and generates, when possible, a pseudocode procedure implementing the state machine. We assume that the input state machine is given in a form where possible extensions (like entry and exit actions) have been expanded into ordinary states and transitions. In this normal form, a state machine consists of states with an action and transitions with a label. Both actions and labels may be empty. The state machine synthesis algorithm used here generates state machines of this form.

Actions and transition labels are handled as plain strings. We assume that if there is more than one state transition between any two states, these transitions are combined into a single transition whose label is the disjunction of the original labels. L(t) denotes the label associated with transition t, and P(q) denotes the action associated with state q.
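The combination of parallel transitions can be sketched in a few lines. The `(source, label, target)` tuple format and the textual ` or ` used for the disjunction are assumptions of this example; the source only states that the labels are combined as a disjunction of plain strings.

```python
def merge_parallel_transitions(transitions):
    """Combine transitions that share source and target into a single transition."""
    labels_by_edge = {}
    for source, label, target in transitions:
        labels = labels_by_edge.setdefault((source, target), [])
        if label not in labels:               # keep each label once, in order
            labels.append(label)
    merged = []
    for (source, target), labels in labels_by_edge.items():
        if len(labels) == 1:
            label = labels[0]
        else:
            label = " or ".join(f"({item})" for item in labels)
        merged.append((source, label, target))
    return merged
```

For example, two transitions from `q0` to `q1` labeled `empty` and `done` become one transition labeled `(empty) or (done)`.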

The language of the produced pseudocode consists of the following symbols: begin block ({), end block (}), if, then, else, while, do, and return. In addition, assert is used by default to indicate which triggering conditions must hold for a particular action to take place. All other strings in the pseudocode are either simple statements or expressions (conditions), corresponding to the actions of states and the labels of transitions of the state machine, respectively: a simple statement is always of the form P(q), and an expression is of the form L(t). The control constructs of the language are if statements, while loops, and do-while loops.

The production rules are the following:

1. Start
2. Empty/Terminal state (2.1, 2.2)
3. Sequential blocks
4. Branching (4.1 - 4.3) (if - else if)
5. Do loop
6. While loop

Figure 1. Production rules

The production rules are shown in Figure 1. The non-terminals are marked with labels S, G, and B. S is the start symbol, G stands for a structured subgraph, and B stands for a branch in a conditional structure. Note that a start state is always required. The terminals of the grammar are states and state transitions. Individual state transitions are marked with labels s and t. T stands for a (possible) set of one or more transitions resulting from the repeated application of rule 4.2. Some of the productions contain context vertices that are required to be present but are not modified during transformation. The context vertices are indicated using dashed lines.

With the given production rules, it is assumed that the non-terminal symbol G has unique entry and exit nodes (that is, nodes with incoming or outgoing edges). If G produces empty, the possible single incoming edge is assimilated with the possible single outgoing edge. Otherwise, the incoming and outgoing edges are removed.

Each non-terminal of the graph grammar is associated with a code attribute. The value of the code attribute is the pseudocode representing that particular instance of the non-terminal (that is, a subgraph).

The computation rules for the code attribute are the following, corresponding to the production rules:

1. S.code = G.code
2.1 G.code = empty
2.2 G.code = P(q) R(q)

where R(q) is defined as follows: R(q) = empty, if q has outgoing edges; R(q) = return, otherwise.

3. G.code = G1.code [ assert( L(t) ) ] G2.code
4.1 G.code = P(q) if( L(t) ) { G1.code } B.code G2.code
4.2 B.code = B1.code B2.code
4.3 B.code = else if( L(t) ) { G.code }
5. G.code = do { P(q) [ assert( L(s) ) ] G.code } while ( L(t) ) [ assert( L(t2) ) ]
6. G.code = P(q) while ( L(s) ) { G.code [ assert( L(t) ) ] P(q) } [ assert( L(t2) ) ]

Apart from the set of branching rules, most of the production rules are reasonably intuitive. Rule 4.1 is used for producing a branching structure of two or more branches, beginning with a transition t. Rule 4.2 is used for duplicating an arbitrary number of additional else if branches, which are then processed by rule 4.3. Note that no finishing else branch is generated, since we want to explicitly state the reason for taking each particular branch.

The context-sensitiveness of the production rules places heavy requirements on the parsing algorithm. Rekers and Schürr discuss the problems related to such graph grammar structures in [15]. Fortunately, by having a stable set of production rules instead of an arbitrary one, our parsing procedure is reduced to a graph-matching problem with a fixed set of subgraphs to be searched for and identified.

In practice, it is often important to express the reason for some or all of the transitions explicitly. We define two modes for our algorithm: non-verbose mode and verbose mode. The former is a loose form of the transformation where we only present those conditions that are used together with if, else if, or while instructions. In this form, the exit conditions from iterations and the transitions in the sequential blocks rule are not transformed into pseudocode as assertions (shown in brackets in the computation rules). In verbose mode, these assertions are emitted as well.

The graph grammar is actually ambiguous in the sense that it is possible to produce different derivations for the same state machine. Naturally, the order does not affect the validity of the pseudocode as long as the parsing succeeds. We use a tail-recursive algorithm that parses the state machine, interpreted as a graph, and searches for subgraph structures corresponding to the production rules, forcing a particular precedence order.
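The evaluation of the code attribute can be sketched over an already parsed derivation, leaving the graph parsing itself aside. The class names, the constructor parameters, and the example statements below are hypothetical; each class mirrors one computation rule as far as the rules can be read from the text, and the `verbose` flag controls whether the bracketed assertions are emitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


def _assert(label, verbose):
    # The bracketed assertions of the rules appear only in verbose mode.
    return [f"assert( {label} )"] if verbose and label else []


def _block(lines):
    return ["    " + line for line in lines]


def _stmt(action):
    return [action] if action else []


@dataclass
class Terminal:                         # rules 2.1 / 2.2: P(q) R(q)
    action: str = ""
    has_outgoing: bool = True

    def code(self, verbose=False):
        return _stmt(self.action) + ([] if self.has_outgoing else ["return"])


@dataclass
class Sequence:                         # rule 3
    first: object
    label: str
    second: object

    def code(self, verbose=False):
        return (self.first.code(verbose) + _assert(self.label, verbose)
                + self.second.code(verbose))


@dataclass
class Branching:                        # rules 4.1 - 4.3
    action: str
    branches: List[Tuple[str, object]]  # (guard L(t), branch body)
    rest: object

    def code(self, verbose=False):
        lines = _stmt(self.action)
        for index, (guard, body) in enumerate(self.branches):
            keyword = "if" if index == 0 else "else if"
            lines += [f"{keyword}( {guard} ) {{"] + _block(body.code(verbose)) + ["}"]
        return lines + self.rest.code(verbose)


@dataclass
class DoLoop:                           # rule 5
    action: str
    enter_label: str
    body: object
    guard: str
    exit_label: str

    def code(self, verbose=False):
        inner = (_stmt(self.action) + _assert(self.enter_label, verbose)
                 + self.body.code(verbose))
        return (["do {"] + _block(inner) + [f"}} while( {self.guard} )"]
                + _assert(self.exit_label, verbose))


@dataclass
class WhileLoop:                        # rule 6
    action: str
    guard: str
    body: object
    back_label: str
    exit_label: str

    def code(self, verbose=False):
        inner = (self.body.code(verbose) + _assert(self.back_label, verbose)
                 + _stmt(self.action))
        return (_stmt(self.action) + [f"while( {self.guard} ) {{"] + _block(inner)
                + ["}"] + _assert(self.exit_label, verbose))
```

Calling `code()` on the root of a derivation returns the non-verbose pseudocode as a list of lines, and `code(verbose=True)` adds the assertions for sequential transitions and loop exits.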

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram: guards and iterations. UML does not specify their content, so they are handled as ordinary strings. Note that some preprocessing is done on the sequence diagram. Stereotyped messages for object creation and destruction («create» and «destroy», respectively) are transformed into Java-like expressions using new and delete.
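A minimal sketch of this preprocessing step, assuming that the receiver's object name and class name are known for each message. The function name and the exact output strings (`b = new Element`, `delete a`) are assumptions for illustration, not the tool's actual output format.

```python
CREATE = ("«create»", "<<create>>")
DESTROY = ("«destroy»", "<<destroy>>")


def preprocess_message(message, receiver, receiver_class):
    """Rewrite stereotyped creation and destruction messages as Java-like statements."""
    text = message.strip()
    if text in CREATE:
        return f"{receiver} = new {receiver_class}"
    if text in DESTROY:
        return f"delete {receiver}"
    return message          # ordinary messages are passed through unchanged
```

Both the guillemet form and the ASCII `<<...>>` form of a stereotype are accepted, since either may appear depending on how the diagram was exported.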

Figure 2. Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules, together with the generated pseudocode including assertions, and the right side shows the simplest state machine [1] that does not conform to the production rules described previously. The reason is the two-directional connection between the two subgraphs of the if - else if structure, which violates the single-entry principle of substructures. The handwritten example shows, however, that even though the state machine can be expressed without using explicit state variables or goto instructions, the original visualization is clearer. In cases like this, the structured implementation scheme provided for the designer is the state machine itself, not the pseudocode.
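The single-entry principle can be checked mechanically for a candidate substructure. The sketch below is one possible reading of that condition (at most one edge entering and at most one edge leaving the subgraph); the function names and the `(source, label, target)` transition format are assumptions for illustration.

```python
def boundary_edges(subgraph, transitions):
    """Return the edges entering and leaving the given set of states."""
    incoming = [(s, t) for s, _, t in transitions if s not in subgraph and t in subgraph]
    outgoing = [(s, t) for s, _, t in transitions if s in subgraph and t not in subgraph]
    return incoming, outgoing


def has_single_entry_and_exit(subgraph, transitions):
    incoming, outgoing = boundary_edges(subgraph, transitions)
    return len(incoming) <= 1 and len(outgoing) <= 1
```

With two branches `a` and `b` between states `q0` and `q3`, each branch passes the check on its own; adding transitions between `a` and `b` in both directions gives each branch a second incoming and outgoing edge, so the check fails, which corresponds to the rejected machine.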

4. Exploitation in a UML tool

The UML class diagram notation supports notes that can be used, e.g., to attach code fragments to specific model elements. In this section, we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams. We illustrate the results with a simple example run on an existing tool prototype. In order to use the mechanisms described earlier, we have integrated them into the TED tool [23]. First, we export a set of sequence diagrams from the TED repository, describing different usage scenarios of a particular operation. We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED. We then export this state machine for operation description synthesis. Finally, the resulting pseudocode is imported into TED as a UML note and placed in the appropriate class diagram. When desired, the whole process can be fully automated.

As an example, consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox. When the button is pressed, the state of each element (listed in the listbox) is updated on the basis of the model. If the state has become inconsistent, the element is deleted and removed from the list, and a new replacement element is added. We will use this example throughout the rest of this paper.

Figure 3. Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once. The listbox contains only one element (a), whose state is inconsistent. The element (a) is deleted and replaced by a new, updated element (b). In order to give a more comprehensive example, we construct two other sequence diagrams describing different execution paths. All three sequence diagrams describe the behavior related to a single use case. The second sequence diagram describes a case where the listbox is empty, and the third one describes a situation where the elements in the listbox are all valid.

We then call the state machine synthesizer component, which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBox.update() operation. The view of the resulting state machine is shown next to the example sequence diagram in Figure 3.

Figure 4 shows the operation description for ListBox.update(), synthesized from the state machine in Figure 3, together with the class diagram synthesized directly from the sequence diagrams. The pseudocode shows a while loop with an enclosing if structure, object destruction and creation actions, and one assertion. In short, the listbox object checks whether the element list is empty and, if not, goes through the immutable elements in the element list one by one. If an element is not consistent, it is removed and replaced with a new, updated element. The pseudocode representation shows essentially the same information as the state machine in Figure 3, but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation.

Figure 4. Synthesized operation description

As a final remark, our technique opens up an interesting possibility to automatically obtain slices of operation implementations as required by a certain functionality (represented by a certain set of sequence diagrams). Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams, the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams, and in this way examine the implementation in a limited context. This allows the designer to focus on one functionality at a time, making it easier to understand and correct the behavior. Eventually, the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams, merging the partial implementations. This approach is illustrated in Figure 5.

Figure 5. Example of combining operation descriptions

In Figure 5, we have two sets of sequence diagrams describing the realization of two distinct use cases. We can synthesize a state machine and generate pseudocode for both use cases individually, but we can also incrementally produce a new composite state machine. This state machine can then be used for the generation of a combined implementation scheme.

5. Related research

Biermann and Krishnaswamy present an algorithm for synthesizing programs from their traces [2]. This algorithm has been used as a basis for our state machine synthesis. The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces. A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions. Essentially, the user gives traces of the expected program, and the algorithm produces the smallest program that is capable of executing the given example traces. Moreover, after giving some finite number of example traces taken from a program, the algorithm produces a program that can execute exactly the same set of traces as the original one; that is, the algorithm learns (or infers) an unknown program. The program is represented as a transition graph where the nodes are instructions and transitions are conditions. A sequence diagram corresponds to a trace in the algorithm by Biermann and Krishnaswamy: sent messages are treated like instructions and received messages like conditions. Since their method is intended for producing program code, it is suited, in principle, for pseudocode generation. However, the output of the algorithm by Biermann and Krishnaswamy is a transition graph. In our approach, a state machine is an intermediate step. Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine.

Nørmark introduces an approach to extract static models from a set of scenarios [14]. The proposed technique has been implemented in a tool called DYNAMO. A scenario consists of participating objects and messages sent between them, shown in a UML sequence diagram fashion. In addition, pre- and postconditions can be set at any point in the scenario diagram. Our approach does not require any extensions to the UML sequence diagram notation. On the other hand, the pre- and postconditions enable DYNAMO to produce more complete code compared to our pseudocode generator. The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax, while we aim to visualize the extracted information as annotated UML class diagrams.

6. Future work

In this paper, we have discussed simple UML sequence diagrams consisting only of objects and messages between them. This basic form of a sequence diagram is extended in UML with various other concepts, for instance conditional branching, iteration, recursion, and explicit return messages. In addition, states of objects can be attached to a sequence diagram.

We aim to extend our approach to handle the full UML sequence diagram notation. The state machine synthesizer, for instance, can easily be extended to take the above-mentioned notation concepts into account. For example, a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams, each of them expressing one branch in the structure. From the sender's point of view, the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message.
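The parsing of a guarded message from the sender's point of view can be sketched as follows. The sketch assumes the usual UML textual form `[guard] message` and produces trace items in a hypothetical `("recv" | "sent", text)` format; it describes a proposed extension, not an implemented feature of the tool.

```python
def split_guarded_message(text):
    """Split '[guard] message' into a received guard and a sent message."""
    text = text.strip()
    if text.startswith("["):
        end = text.find("]")
        if end != -1:
            guard = text[1:end].strip()
            message = text[end + 1:].strip()
            # The guard acts like a received message, i.e. a transition label.
            return [("recv", guard), ("sent", message)]
    return [("sent", text)]     # unguarded messages are ordinary sent messages
```

A message without a guard yields a single sent item, so the function can be applied uniformly to every message leaving the object.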

The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines. Such extensions could include the generation of labels and goto, break, and continue instructions.

Although it has not been our primary aim, in principle this approach could be extended to actual code generation: instead of pseudocode, we could easily follow the syntax of an actual language. The problem is, however, the level of information available in sequence diagrams. Usually, sequence diagrams are employed in early stages of design, when detailed information is not available or sensible. If sequence diagrams were augmented with detailed implementation-level constructs like assignments and other primitive expressions, generating executable code would become possible. However, the current UML sequence diagram notation gives poor support for describing these kinds of features.

In order to validate the approach described in this paper, a more comprehensive case study is in order. We expect to be able to experiment with the technique in a larger real-life project in order to determine the strong and weak points of the method.

7. Conclusions

In this paper, we discussed a technique for generating structured implementation schemes, presented as pseudocode, from UML sequence diagrams. The pseudocode generation consists of two steps: synthesizing state machines for operations and generating pseudocode from the state machines. We also discussed how a class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions. The proposed techniques are implemented in a real-life UML design tool, the Nokia TED. We presented an example of how the techniques were used in diagram synthesis. We also presented possible extensions and future research topics.

The proposed techniques provide help for the designer in the early phases of the design process. The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively; they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design. It is our belief that, as such, the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations, as implied by a set of sequence diagrams.

Acknowledgements. The authors wish to thank Erkki Mäkinen and Markku Sakkinen for their valuable comments. This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898), Nokia, Metso Automation, Sensor Software Consulting, Ebsolut, and Plenware.

8. References

[1] Aho, A.V. and Ullman, J.D., The Theory of Parsing, Translation, and Compiling, vol. II, Prentice-Hall, 1973.

[2] Biermann, A.W. and Krishnaswamy, R., "Constructing programs from example computations", IEEE Trans. Softw. Eng. 2(3), 1976, pp. 141-153.

[3] Booch, G., Rumbaugh, J., and Jacobson, I., The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[4] Dahl, O.-J., Dijkstra, E.W., and Hoare, C.A.R., Structured Programming, Academic Press, 1972.

[5] Gamma, E., Helm, R., Johnson, R., and Vlissides, J., Design Patterns - Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.


[6] Harel, D., "Statecharts: A Visual Formalism for Complex Systems", Science of Computer Programming 8, 1987, pp. 231-274.

[7] Harel, D., Lachover, H., Naamad, A., Pnueli, A., Politi, M., Sherman, R., Shtull-Trauring, A., and Trakhtenbrot, M., "STATEMATE: A Working Environment for the Development of Complex Reactive Systems", IEEE Trans. Softw. Eng. 16(4), 1990, pp. 403-414.

[8] I-Logix, 2000. On-line at http://www.ilogix.com.

[9] Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Videira Lopes, C., Loingtier, J.-M., and Irwin, J., "Aspect-Oriented Programming", in Proc. of ECOOP '97, LNCS 1241, Springer-Verlag, 1997, pp. 220-243.

[10] Koskimies, K. and Mäkinen, E., "Automatic Synthesis of State Machines from Trace Diagrams", Softw. Pract. & Exper. 24(7), 1994, pp. 643-658.

[11] Koskimies, K., Männistö, T., Systä, T., and Tuomi, J., "Automated Support for Modeling OO Software", IEEE Software 15(1), January/February 1998, pp. 87-94.

[12] Mäkinen, E. and Systä, T., "MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML", in Proc. of ICSE 2001, Toronto, Canada, 2001, pp. 15-24.

[13] Marriott, K. and Meyer, B. (eds.), Visual Language Theory, Springer, 1998.

[14] Nørmark, K., "Synthesis of Program Outlines from Scenarios in DYNAMO", Aalborg University, 1998. On-line at http://www.cs.auc.dk/~normark/dynamo.html.

[15] Rekers, J. and Schürr, A., "A Graph Grammar Approach to Graphical Parsing", in Proc. of the 11th International IEEE Symposium on Visual Languages, IEEE, 1995.

[16] Rumbaugh, J., Jacobson, I., and Booch, G., The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[17] Selonen, P., Koskimies, K., and Sakkinen, M., "How to Make Apples from Oranges in UML", in Proc. of HICSS-34 (CD-ROM), Maui, Hawaii, 2001.

[18] Systä, T., Static and Dynamic Reverse Engineering Techniques for Java Software Systems, Dept. of Computer and Information Sciences, University of Tampere, Report A-2000-4, PhD Dissertation, 2000.

[19] Taentzer, G., Ermel, C., and Rudolf, M., "The AGG Approach: Language and Tool Environment", in the Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, World Scientific, 1999.

[20] The Unified Modeling Language Notation Guide v1.3, OMG, 1999. On-line at http://www.rational.com.

[21] The Unified Modeling Language Semantics v1.3, OMG, 1999. On-line at http://www.rational.com.

[22] Whittle, J. and Schumann, J., "Generating Statechart Designs From Scenarios", in Proc. of ICSE'00, Limerick, Ireland, 2000, pp. 314-323.

[23] Wikman, J., "Evolution of a Distributed Repository-Based Architecture", in Proc. of NOSA'98, 1998. On-line at http://www.hk-r.se/fou/forskinfo.nsf.


state machines. The state machines, in turn, are used as input for a pseudocode generator, which outputs a sketch of the operation body in a pseudocode format.

The generation of executable code from state machines is a well-known technique and is exploited in many UML-based tools. For example, with Rhapsody [8] and STATEMATE [7], the user can generate code from statecharts [6]. To be able to generate code that compiles and executes, the underlying statechart has to be exact, consistent, and complete. This, however, is not our goal: we are not aiming at executable models or conventional code generation (although in principle our approach could be refined to actual code generation). Instead, we aim at high-level, readable descriptions of operation bodies at early stages of design, generated from sequence diagram models that can be incomplete and imperfect. Note that for most operations of objects appearing in conventional OO systems, the most natural description format is textual. If, however, the dynamic behavior of an operation is such that a structured pseudocode representation cannot be formed, a statechart representation is shown instead.

A class diagram is used to specify the static structure of a system in UML. Class diagrams are often annotated with pseudocode descriptions of operation implementations, attached as notes to class symbols. For example, such descriptions are frequently used in design pattern documentation [5]. This information depends heavily on the dynamic model of the system, as illustrated by the sequence diagrams. We show how the pseudocode generation technique can be conveniently integrated with a technique for generating class diagrams from sequence diagrams, representing operation bodies in pseudocode notes attached to class symbols in a class diagram.

Our implementation platform, the Nokia TED [23], is a multi-user software development environment that has been implemented at the Nokia Research Center. TED supports most of the UML diagram types and a reasonably large subset of the UML metamodel as a component library. The transformation algorithms presented in this paper are implemented as COM components interoperable with TED. However, in principle the proposed techniques can be implemented for any tool supporting UML and providing a reasonable API for accessing the model repository.

2. Model synthesis in UML

Automated support for synthesizing one UML model from another can provide significant help for the designer. Such synthesis operations help the designer to keep the models consistent, to speed up the design process, and to decrease the risk of errors. Normally, every iteration phase increases the information present in the design. The new information must somehow be translated and merged into existing models. Usually this is done manually, which is often tedious and can easily lead to mismatches and inconsistencies due to human errors. Model synthesis is also an efficient way of creating intermediate documentation and can thus be used for enhancing communication between designers and management.

Sequence and class diagrams have central roles in the software development process. While a class diagram is used to view a collection of static model elements and their relationships, a sequence diagram provides an example of the dynamic behavior of the system. Sequence and class diagrams are also useful in various automated model transformation operations [17]. For example, both class and statechart diagrams can be synthesized from a set of sequence diagrams [11,12,17,22]. In what follows, we discuss two model transformations used in our work.

2.1. Generating state machines for operation calls

Koskimies and Mäkinen have demonstrated how a minimal state machine can be synthesized from sequence diagrams automatically [10]. This algorithm has been integrated with TED for synthesizing a state machine for a selected participating object from a set of sequence diagrams. The algorithm first extracts a trace from the sequence diagrams by traversing the lifeline of the object from top to bottom in each sequence diagram. The algorithm then maps items in the message trace to transitions and states in a state machine. Sent messages are regarded as primitive actions associated with states. Each received message is mapped to a transition. A synthesized state machine is deterministic, i.e., there cannot be two similarly labeled leaving transitions in any particular state. If the events causing the transitions are the same but the guards differ, it is legal to have them as leaving transitions of the same state. In other words, the synthesis algorithm does not allow two applicable transitions to be simultaneously satisfied. In addition, the algorithm does not allow a completion transition and a labeled transition to leave the same state unless their guards differ. Taking the conditions of determinism into account, the synthesis algorithm gives the minimal state machine with respect to the number of states.
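To make the mapping from traces to states and transitions concrete, the following sketch builds a deterministic machine by sharing common trace prefixes. It is deliberately naive and is not the minimizing algorithm of [10]: it never merges states beyond shared prefixes, so the result is generally not minimal. The trace encoding (a list of `(label, action)` pairs, where the label is a received message and the action is the sent message that follows it) and all names are assumptions made for this illustration.

```python
def synthesize_prefix_machine(traces):
    """Build a deterministic machine from traces by sharing common prefixes.

    traces: list of traces, each a list of (label, action) pairs.
    Returns (actions, transitions) where actions[state] is the action string
    of a state and transitions[(state, label)] is the successor state.
    """
    actions = {0: ""}      # state 0 is the start state with an empty action
    transitions = {}
    for trace in traces:
        state = 0
        for label, action in trace:
            key = (state, label)
            if key in transitions:
                successor = transitions[key]
                # Determinism: the same label may not lead to different actions.
                if actions[successor] != action:
                    raise ValueError(
                        f"label {label!r} in state {state} leads to both "
                        f"{actions[successor]!r} and {action!r}"
                    )
            else:
                successor = len(actions)
                actions[successor] = action
                transitions[key] = successor
            state = successor
    return actions, transitions
```

Two traces that begin with the same received message and sent action share their first state; they diverge only where their labels differ, which mirrors the determinism condition stated above.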

The synthesis algorithm can also be used for synthesizing a state machine for the behavior of an operation (instead of an object) from sequence diagrams, provided that the messages involved in the execution of an operation can be identified in the sequence diagrams. For simplicity, we assume that the interesting object is a conventional passive object providing services through its operations. In a sequence diagram, an operation call for an object is shown with a received message sent by the caller object. The corresponding return from the operation is shown with a sent message received by the caller object. Assuming that the called object does not receive another call before completing the call (for example, callbacks), all the leaving messages between the call and the return are internal calls of operations of other objects, and all arriving messages are returning counterparts of these calls. If the object does receive calls during the ongoing call, the events associated with such calls are excluded. We assume that this can be done in one way or another. Sufficient information for this is that the returns are fully marked or that the so-called focus of control [16] has been used. All other events associated with the object between the call and the corresponding return are read and given as input for the state machine synthesizer. The general synthesis algorithm is described in more detail in [10,11,18].
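The selection of events belonging to one operation call can be sketched as follows, under the assumption that returns are fully marked. The event encoding `(direction, name, kind)` with `kind` being `"call"` or `"return"`, and the function name `operation_events`, are hypothetical; only the first occurrence of the call on the lifeline is considered.

```python
def operation_events(lifeline, operation):
    """Return the events strictly between the call of `operation` and its return.

    lifeline: ordered list of (direction, name, kind) tuples for one object,
    with direction in {"recv", "send"} and kind in {"call", "return"}.
    Events that belong to nested incoming calls (e.g. callbacks) are excluded.
    """
    collected = []
    depth = 0          # number of nested incoming calls currently open
    active = False     # True once the call of `operation` has been seen
    for direction, name, kind in lifeline:
        if not active:
            if (direction, name, kind) == ("recv", operation, "call"):
                active = True
            continue
        if direction == "recv" and kind == "call":
            depth += 1                 # a nested incoming call starts
        elif direction == "send" and kind == "return":
            if depth == 0:
                break                  # return of the operation itself
            depth -= 1                 # return of a nested incoming call
        elif depth == 0:
            collected.append((direction, name, kind))
    return collected
```

The collected events are the internal calls and their returning counterparts, which is the input the state machine synthesizer expects for a single operation.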

In this research, the state machine synthesis algorithm is used as an intermediate step in the implementation scheme generation. To enable the generation of descriptive and understandable implementation schemes from a state machine, we have modified the original synthesis algorithm to identify receivers for each operation call. Thus, for each operation call, the algorithm appends the name of the receiver to the name of the message. The original synthesis algorithm identifies the receivers only when they cannot be uniquely determined, i.e., when the same message is sent to several objects (e.g., a broadcast).

2.2. Generating class diagrams from sequence diagrams

A class diagram can be synthesized on the basis of sequence diagrams. In Section 4, we combine this transformation technique with the pseudocode synthesis of operations. We divide the transformation operation into two phases. First, we translate elements in a sequence diagram into a class diagram in the following way: we map classifier roles (participants) and messages of a sequence diagram to classes, associations, and operations in a class diagram. This mapping is fairly straightforward: if there is a message between two objects, then there must be an association between the classes of the objects, and the class of the receiving object must have the corresponding operation. Second, we use a collection of heuristic rules to generate interface hierarchies, composition relationships, and multiplicities. The suggestions given by these rules are guaranteed to be consistent, even though they do not necessarily match the intentions of the designer. However, the suggestions give useful hints for the designer and inform her about the appearance of certain types of patterns in the sequence diagrams. The suggestions concern patterns that may have certain implications in a class diagram. As an example, if an object interacts with several instances of the same class, the rules suggest adding a multiplicity symbol (*) to the association generated from this interaction. These rules are discussed in more detail in [17].
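The first phase of the mapping, together with the multiplicity heuristic used as an example above, can be sketched as follows. This is a simplified illustration, not the TED implementation: the input encoding (messages as `(sender, receiver, operation)` triples plus an object-to-class dictionary), the function name, and the use of an empty string for "no multiplicity suggestion" are all assumptions.

```python
from collections import defaultdict


def synthesize_class_model(messages, class_of):
    """Derive classes, operations, and associations from messages.

    messages: iterable of (sender_object, receiver_object, operation).
    class_of: dict mapping each object name to its class name.
    Returns (operations, associations): operations maps a class to its set of
    operation names; associations maps (sender_class, receiver_class) to "*"
    when one sender object talks to several instances of the receiver class,
    and to "" otherwise.
    """
    operations = defaultdict(set)
    targets = defaultdict(set)   # (sender object, receiver class) -> receiver objects
    for sender, receiver, operation in messages:
        operations[class_of[sender]]                     # make sure the sender's class exists
        operations[class_of[receiver]].add(operation)    # receiver class gets the operation
        targets[(sender, class_of[receiver])].add(receiver)

    associations = {}
    for (sender, receiver_class), objects in targets.items():
        key = (class_of[sender], receiver_class)
        if len(objects) > 1:
            associations[key] = "*"          # heuristic: several instances of one class
        else:
            associations.setdefault(key, "")
    return dict(operations), associations
```

With a dialog calling a listbox, and the listbox calling two elements of the same class, the sketch yields an unadorned Dialog-ListBox association and a ListBox-Element association marked with `*`.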

3. Transforming state machines into pseudocode

In UML, a use case defines a coherent unit of functionality of a (sub)system without revealing the internal structure of the (sub)system [16]. Sequence diagrams and collaboration diagrams can be used for specifying realizations of use cases. In the previous section, we explained how it is possible to synthesize a state machine (a statechart diagram in UML) for an operation on the basis of sequence diagrams. In this section, we show how a state machine can be transformed into structured pseudocode, which is usually the most natural representation format for operation bodies.

3.1. Pseudocode generation

We define a simple graph grammar that gives the basic production rules used for decomposing a state machine. The production rules define a language that the pseudocode generator can accept. The basic structures that we are looking for are well known and date back to the very first imperative programming languages, such as Algol. Dijkstra [4] suggests three types of decomposition: concatenation, selection, and repetition. Concatenation includes statements that can be executed sequentially. Selection includes the basic branching structures if, if-then-else, and case-of. Repetition includes while and repeat-until. Dijkstra argues that programs should be written using these constructs. We adopt these control structures as the basis of our grammar.

It is obvious that only a subset of all state machines conform to the language defined by these rules. For example, interleaved control structures cannot be expressed by these rules only. On the other hand, it would be trivial to generate pseudocode from an arbitrary state machine using switch-case structures or goto instructions. However, we do not consider that kind of approach useful, since our purpose is to give an intuitive, natural overview of an operation body as it might be implemented according to current design information. If such a readable, high-level description cannot be generated, we expect that the state machine representation is likely to be more understandable than a textual form based on, say, a switch-case pattern. Since the state machine will be produced in any case as a side effect in our approach, we can display the description of an operation as a state machine (UML statechart diagram) in such cases.
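For contrast, the trivial switch-case rendering mentioned above, which this paper deliberately avoids, might look like the following generator. It is a sketch for illustration only; the dictionary-based state machine encoding, the output layout, and the function name are assumptions.

```python
def switch_case_pseudocode(actions, transitions, start=0):
    """Emit an unstructured switch-case rendering of an arbitrary state machine.

    actions: dict state -> action string (may be empty).
    transitions: dict (state, label) -> successor state.
    """
    lines = [f"state = {start}", "while (true) {", "  switch (state) {"]
    for state in sorted(actions):
        lines.append(f"    case {state}:")
        if actions[state]:
            lines.append(f"      {actions[state]}")
        outgoing = sorted(
            (label, successor)
            for (source, label), successor in transitions.items()
            if source == state
        )
        if not outgoing:
            lines.append("      return")   # terminal state
            continue
        for index, (label, successor) in enumerate(outgoing):
            keyword = "if" if index == 0 else "else if"
            lines.append(f"      {keyword} ({label}) state = {successor}")
        lines.append("      break")
    lines += ["  }", "}"]
    return "\n".join(lines)
```

Every state machine can be rendered this way, but the explicit state variable hides the loops and branches that a structured description would expose, which is the reason given above for preferring the statechart itself as the fallback.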

A central problem in transforming a state machine into pseudocode is the parsing of a state machine. There exist techniques and tools for graphical parsing [13,15,19]. Our approach can be seen as an instance of a class of high-level graph parsing algorithms. However, in our case, applying a general graphical parsing algorithm would have been inappropriate, because it is possible to derive a more efficient solution for our simple grammar directly.

The algorithm takes a state machine as its input and generates, when possible, a pseudocode procedure implementing the state machine. We assume that the input state machine is given in a form where possible extensions (like entry and exit actions) have been expanded into ordinary states and transitions. In this normal form, a state machine consists of states with an action and transitions with a label. Both actions and labels may be empty. The state machine synthesis algorithm used here generates state machines of this form.

Actions and transition labels are handled as plain strings. We assume that if there is more than one state transition between any two states, these transitions are combined into a single transition whose label is the disjunction of the original labels. L(t) denotes the label associated with transition t. P(q) denotes the action associated with state q.
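The normalization of parallel transitions can be illustrated with a short sketch. The triple-based transition encoding, the function name, and the textual `or` used to write the disjunction are assumptions of this example; the paper only states that the labels are combined disjunctively.

```python
def merge_parallel_transitions(transitions):
    """Combine transitions between the same pair of states into one.

    transitions: list of (source, label, target) triples.
    Returns a dict (source, target) -> combined label L(t), where the
    combined label is the disjunction of the original labels.
    """
    merged = {}
    for source, label, target in transitions:
        labels = merged.setdefault((source, target), [])
        if label not in labels:          # avoid repeating identical labels
            labels.append(label)
    return {key: " or ".join(labels) for key, labels in merged.items()}
```

After this step there is at most one transition between any ordered pair of states, which is the form the production rules assume.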

The language of the produced pseudocode consists of the following symbols: begin block ({), end block (}), if, then, else, while, do, and return. In addition, assert is used by default to indicate which triggering conditions must be available for a particular action to take place. All other strings in the pseudocode are either simple statements or expressions (conditions), corresponding to the actions and labels of transitions of the state machine, respectively: a simple statement is always of the form P(q), and an expression is of the form L(t). The control constructs of the language are if-statements, while-loops, and do-while loops.

The production rules are the following:

1. Start
2. Empty/Terminal state (2.1, 2.2)
3. Sequential blocks
4. Branching (4.1 - 4.3) (if - else if)
5. Do loop
6. While loop

Figure 1. Production rules

The production rules are shown in Figure 1. The non-terminals are marked with labels S, G, and B. S is the start symbol, G stands for a structured subgraph, and B stands for a branch in a conditional structure. Note that a start state is always required. The terminals of the grammar are states and state transitions. Individual state transitions are marked with labels s and t. T stands for a (possible) set of one or more transitions resulting from the repeated application of rule 4.2. Some of the productions contain context vertices that are required to be present but are not modified during transformation. The context vertices are indicated using dashed lines.

With the given production rules, it is assumed that the non-terminal symbol G has unique entry and exit nodes (that is, nodes with incoming or outgoing edges). If G produces empty, the possible single incoming edge is assimilated with the possible single outgoing edge. Otherwise, the incoming and outgoing edges are removed.

Each non-terminal of the graph grammar is associated with a code attribute. The value of the code attribute is the pseudocode representing that particular instance of the non-terminal (that is, a subgraph).

The computation rules for the code attribute are the following, corresponding to the production rules:

1. S.code = G.code
2.1 G.code = empty
2.2 G.code = P(q) R(q), where R(q) is defined as follows: R(q) = empty if q has outgoing edges, return otherwise
3. G.code = G1.code [ assert( L(t) ) ] G2.code
4.1 G.code = P(q) if( L(t) ) { G1.code } B.code G2.code
4.2 B.code = B1.code B2.code
4.3 B.code = else if( L(t) ) { G.code }
5. G.code = do { P(q) [ assert( L(s) ) ] G1.code } while ( L(t) ) [ assert( L(t2) ) ]
6. G.code = P(q) while ( L(s) ) { G1.code [ assert( L(t) ) ] P(q) } [ assert( L(t2) ) ]

Apart from the set of branching rules, most of the production rules are reasonably intuitive. Rule 4.1 is used for producing a branching structure of two or more branches, beginning with a transition t. Rule 4.2 is used for duplicating an arbitrary number of additional else if branches, which are then processed by rule 4.3. Note that no finishing else branch is generated, since we want to explicitly state the reason for taking each particular branch.
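The effect of the branching rules on the generated text can be sketched as follows. The sketch assumes that braces delimit blocks, as in the symbol list of the pseudocode language, and that the code of each branch body is already available as a string; the function name and the one-branch-per-line layout are choices of this example rather than of the original generator.

```python
def branching_code(action, branches, rest=""):
    """Render an if / else if chain in the spirit of rules 4.1 to 4.3.

    action: P(q), the action of the branching state (may be empty).
    branches: non-empty list of (label, body) pairs; the first pair becomes
    the `if` branch, the others become `else if` branches.
    rest: code that follows the branching structure (may be empty).
    No closing `else` is emitted, so every branch states its own condition.
    """
    if not branches:
        raise ValueError("a branching structure needs at least one branch")
    parts = [action] if action else []
    for index, (label, body) in enumerate(branches):
        keyword = "if" if index == 0 else "else if"
        parts.append(f"{keyword}( {label} ) {{ {body} }}")
    if rest:
        parts.append(rest)
    return "\n".join(parts)
```

Because each additional branch is introduced by its own `else if` with an explicit label, a reader can see why each path is taken without inferring the negation of earlier conditions.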

The context-sensitiveness of the production rules places heavy requirements on the parsing algorithm. Rekers and Schürr discuss the problems related to such graph grammar structures in [15]. Fortunately, by having a stable set of production rules instead of an arbitrary one, our parsing procedure is reduced to a graph-matching problem with a fixed set of subgraphs to be searched and identified.

In practice, it is often important to express the reason for some or all of the transitions explicitly. We define two modes for our algorithm: non-verbose mode and verbose mode. The former is a loose form of transformation, where we only present those conditions that are used together with if, else if, or while instructions. In this form, the exit conditions from iterations and the transitions in the sequential blocks rule are not transformed into pseudocode as assertions (shown in brackets in the computation rules).
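The difference between the two modes can be illustrated on a while loop. The sketch below assumes the loop layout P(q) while ( L(s) ) { body [assertion] P(q) } [assertion], with the bracketed assertions emitted only in verbose mode; the parameter names and the function name are hypothetical.

```python
def while_code(action, loop_label, body, back_label, exit_label, verbose=False):
    """Render a while loop, with assertions only in verbose mode.

    action: P(q), evaluated before the loop and again at the end of each round.
    loop_label: L(s), the condition for entering the loop body.
    body: already generated code of the loop body.
    back_label: label of the transition leading back to the loop state.
    exit_label: label of the transition leaving the loop.
    """
    back = f" assert( {back_label} )" if verbose and back_label else ""
    tail = f"\nassert( {exit_label} )" if verbose and exit_label else ""
    return f"{action}\nwhile ( {loop_label} ) {{ {body}{back} {action} }}{tail}"
```

In non-verbose mode only the loop condition appears in the output, whereas verbose mode additionally records why control returns to the loop head and why the loop is left.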

The graph grammar is actually ambiguous in the sense that it is possible to produce different derivations for the same state machine. Naturally, the order does not affect the validity of the pseudocode as long as the parsing succeeds. We use a tail-recursive algorithm that parses the state machine, interpreted as a graph, and searches for subgraph structures corresponding to the production rules, forcing a particular precedence order.

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram: guards and iterations. UML does not specify their content, so they are handled as ordinary strings. Note that some preprocessing is done on the sequence diagram. Stereotyped messages for object creation and destruction («create» and «destroy», respectively) are transformed into Java-like expressions using new and delete.
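The preprocessing of stereotyped messages might be sketched as below. The exact output format of the original tool is not specified in the text; the forms `b = new Element` and `delete a` used here are assumptions, as are the function name and its parameters.

```python
def preprocess_message(stereotype, target, class_name=None):
    """Map stereotyped messages to Java-like expressions.

    stereotype: "create", "destroy", or anything else for ordinary messages.
    target: name of the object being created or destroyed.
    class_name: class of the created object (needed for "create").
    Returns the expression string, or None when no rewriting applies.
    """
    if stereotype == "create":
        if class_name is None:
            raise ValueError("a create message needs the class of the new object")
        return f"{target} = new {class_name}"
    if stereotype == "destroy":
        return f"delete {target}"
    return None   # ordinary message: left to the normal message handling
```

Messages without one of the two stereotypes are left untouched, so this step can run before trace extraction without affecting ordinary operation calls.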

Figure 2. Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules, together with the generated pseudocode including assertions, and the right side shows the simplest state machine [1] that does not conform to the production rules described previously. The reason is the two-directional connection between the two subgraphs of the if-else if structure, which violates the single-entry principle of substructures. The handwritten example shows, however, that even though the state machine can be expressed without using explicit state variables or goto instructions, the original visualization is clearer. In cases like this, the structured implementation scheme provided for the designer is the state machine itself, not the pseudocode.

4. Exploitation in a UML tool

The UML class diagram notation supports notes that can be used, e.g., to attach code fragments to specific model elements. In this section, we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams. We illustrate the results with a simple example run on an existing tool prototype. In order to use the mechanisms described earlier, we have integrated them into the TED tool [23]. First, we export a set of sequence diagrams from the TED repository, describing different usage scenarios of a particular operation. We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED. We then further export this state machine for operation description synthesis. Finally, the resulting pseudocode is imported to TED as a UML note and placed in the appropriate class diagram. When desired, the whole process can be fully automated.


As an example, consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox. When the button is pressed, the state of each element (listed in the listbox) is updated on the basis of the model. If the state has become inconsistent, the element is deleted and removed from the list, and a replacing new element is added. We will use this example throughout the rest of this paper.

Figure 3. Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once. The listbox contains only one element (a), whose state is inconsistent. The element (a) is deleted and replaced by a new, updated element (b). In order to give a more comprehensive example, we construct two other sequence diagrams describing different execution paths. All three sequence diagrams describe the behavior related to a single use case. The second sequence diagram describes a case where the listbox is empty, and the third one describes a situation where the elements in the listbox are all valid.

We then call the state machine synthesizer component, which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBox.update() operation. The view of the resulting state machine is shown next to the example sequence diagram in Figure 3.

Figure 4 shows the operation description for ListBox.update() synthesized from the state machine in Figure 3, together with the class diagram synthesized directly from the sequence diagrams. The pseudocode shows a while loop with an enclosing if-structure, object destruction and creation actions, and one assertion. In short, the listbox object checks whether the element list is empty and, if not, goes through the immutable elements in the element list one by one. If an element is not consistent, it is removed and replaced with a new, updated element. The pseudocode representation shows essentially the same information as the state machine in Figure 3, but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation.
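As an indication of how such an operation description could serve as a starting point for an implementation, the behavior described above might be realized roughly as follows. This is a hand-written sketch based only on the prose description, not the synthesized pseudocode itself; the class layout, the attribute `element_list`, the method `is_consistent`, and the empty `Element.update` placeholder are all assumptions.

```python
class Element:
    """Hypothetical list element whose state may become inconsistent."""

    def __init__(self, consistent=True):
        self.consistent = consistent

    def update(self):
        # Placeholder: a real element would refresh its state from the model.
        pass

    def is_consistent(self):
        return self.consistent


class ListBox:
    """Hypothetical listbox holding the elements shown in the dialog."""

    def __init__(self, elements):
        self.element_list = list(elements)

    def update(self):
        if not self.element_list:                 # enclosing if: nothing to do
            return
        for element in list(self.element_list):   # loop over a snapshot of the list
            element.update()
            if not element.is_consistent():
                self.element_list.remove(element)      # remove and "delete" the old element
                self.element_list.append(Element())    # add a new, updated element
```

The loop iterates over a copy of the list so that removing and adding elements during the traversal does not disturb the iteration; that detail is a choice of this sketch and is not stated in the paper.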


Figure 4. Synthesized operation description

As a final remark, our technique opens up an interesting possibility to automatically obtain slices of operation implementations, as required by a certain functionality (represented by a certain set of sequence diagrams). Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams, the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams and, in this way, examine the implementation in a limited context. This allows the designer to focus on one functionality at a time, making it easier to understand and correct the behavior. Eventually, the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams, merging the partial implementations. This approach is illustrated in Figure 5.

Figure 5. Example of combining operation descriptions

In Figure 5, we have two sets of sequence diagrams describing the realization of two distinct use cases. We can synthesize a state machine and generate pseudocode for both use cases individually, but we can also incrementally produce a new composite state machine. This state machine can then be used for the generation of a combined implementation scheme.

5. Related research

Biermann and Krishnaswamy present an algorithm for synthesizing programs from their traces [2]. This algorithm has been used as a basis for our state machine synthesis. The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces. A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions. Essentially, the user gives traces of the expected program, and the algorithm produces the smallest program that is capable of executing the given example traces. Moreover, after being given some finite number of example traces taken from a program, the algorithm produces a program that can execute exactly the same set of traces as the original one; that is, the algorithm learns (or infers) an unknown program. The program is represented as a transition graph, where the nodes are instructions and the transitions are conditions. A sequence diagram corresponds to a trace in the algorithm by Biermann and Krishnaswamy: sent messages are treated like instructions and received messages like conditions. Since their method is intended for producing program code, it is suited in principle for pseudocode generation. However, the output of the algorithm by Biermann and Krishnaswamy is a transition graph. In our approach, a state machine is an intermediate step. Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine.


[IO] Koskimies K and Makinen E Automatic Synthesis of State Machines from Trace Diagrams Softw Pract amp Exper 24(7) 1994 pp 643-658

[ I I ] Koskimies K Mannisto T Systh T and Tuomi J Automated Support for Modeling 00 Software IEEE Software 15( 1 ) JanuaryFebruary 1998 pp 87-94

[ I21 Mikinen E and Systa T MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML in Proc of ICSE 2001 Toronto Canada 2001 pp IS-24

[ 131 Marriot K and Meyer B (eds) Visual Language Theory Springer 1998

1141 n a r k K Synthesis of Program Outlines from Scenarios in DYNAMO Aalborg University 1998 On-line at httpwwwcsaucdW-normarWdynamohtm1

[ 151 Rekers J and Schurr A A Graph Grammar Approach to Graphical Parsing in Proc of fh r I h

Inremntional IEEE Symposium 017 Vimcd Languages IEEE 1995

[ 161 Rumbaugh J Jacobson I and Booch G The Unified Modeling Language Reference Manual Addison- Wesley 1999

[I71 Selonen P Koskimies K and Sakkinen M How to Make Apples from Oranges in UML in Proc of HICSS-34 (CD-ROM) Maui Hawaii 2001

[IS] Systa T Static and Dynamic Reverse Engineering Techniques for Java Software Systems Dept of Computer and Information Sciences University of Tampere Report A-2000-4 PhD Dissertation 2000

[I91 Taentzer G Ermel C and Rudolf M The AGG Approach Language and Tool Environment thc Handbook of Graph Grammars and Computing by Graph Transformation Volume 2 Applications Languages and Tools World Scientific 1999

[20] The Unified Modeling Language Notation Guide v13 OMG 1999 On-line at httpwwwrationalcom

[21] The Unified Modeling Language Semantics V I 3 OMG 1999 On-line at httpwwwrationalcom

[22] Whittle J and Schumann J Generating Statechart Designs From Scenarios in Proc of ICSEOO Limerick Ireland 2000 pp 314-323

[ 2 3 ] Wikman J Evolution of a Distributed Repository-Based Architecture in Proc of NOSA98 1998 On-line at httplwwwhk-rsefouforskinfonsf


2.1 Generating state machines for operation calls

Koskimies and Makinen have demonstrated how a minimal state machine can be synthesized from sequence diagrams automatically [10]. This algorithm has been integrated with TED for synthesizing a state machine for a selected participating object from a set of sequence diagrams. The algorithm first extracts a trace from the sequence diagrams by traversing the lifeline of the object from top to bottom in each sequence diagram. The algorithm then maps items in the message trace to transitions and states in a state machine. Sent messages are regarded as primitive actions associated with states. Each received message is mapped to a transition. A synthesized state machine is deterministic, i.e., there cannot be two similarly labeled leaving transitions in any particular state. If the events causing the transitions are the same but the guards differ, it is legal to have them as leaving transitions of the same state. In other words, the synthesis algorithm does not allow two applicable transitions to be simultaneously satisfied. In addition, the algorithm does not allow a completion transition and a labeled transition to leave the same state unless their guards differ. Taking the conditions of determinism into account, the synthesis algorithm gives the minimal state machine with respect to the number of states.
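The mapping from traces to states and transitions described above can be illustrated with a small Python sketch. It is not the minimizing algorithm of [10]: it only folds the traces into a deterministic prefix tree and reports a conflict when determinism would be violated. The names State, trace_items, and synthesize, and the "recv"/"sent" tags, are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    action: str                                      # sent message performed in this state ("" if none)
    transitions: dict = field(default_factory=dict)  # received message -> next State


def trace_items(lifeline):
    """Pair each sent message with the received message that precedes it.

    lifeline is a list of ("recv", name) / ("sent", name) tuples read from
    top to bottom. An empty event stands for a completion transition.
    """
    items, pending = [], ""
    for kind, name in lifeline:
        if kind == "recv":
            if pending:                   # two receives in a row: state without action
                items.append((pending, ""))
            pending = name
        else:                             # "sent"
            items.append((pending, name))
            pending = ""
    if pending:
        items.append((pending, ""))
    return items


def synthesize(lifelines):
    """Fold all traces into one deterministic state machine (no minimization)."""
    start = State(action="")
    for lifeline in lifelines:
        state = start
        for event, action in trace_items(lifeline):
            nxt = state.transitions.get(event)
            if nxt is None:
                nxt = State(action)
                state.transitions[event] = nxt
            elif nxt.action != action:
                # Same label would lead to two different actions.
                raise ValueError(f"non-deterministic on {event!r}")
            state = nxt
    return start
```

Each received message becomes a transition label and each sent message becomes the action of the state that the transition enters, which mirrors the mapping in the text. Merging equivalent states to reach the minimal machine is deliberately left out.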

The synthesis algorithm can also be used for synthesizing a state machine for the behavior of an operation (instead of an object) from sequence diagrams, provided that the messages involved in the execution of an operation can be identified in the sequence diagrams. For simplicity, we assume that the interesting object is a conventional passive object providing services through its operations. In a sequence diagram, an operation call for an object is shown with a received message sent by the caller object. The corresponding return from the operation is shown with a sent message received by the caller object. Assuming that the called object does not receive another call before completing the call (for example, callbacks), all the leaving messages between the call and the return are internal calls of operations of other objects, and all arriving messages are returning counterparts of these calls. If the object does receive calls during the ongoing call, the events associated with such calls are excluded. We assume that this can be done in one way or another. Sufficient information for this is that the returns are fully marked or that the so-called focus of control [16] has been used. All other events associated with the object between the call and the corresponding return are read and given as input for the state machine synthesizer. The general synthesis algorithm is described in more detail in [10, 11, 18].

In this research, the state machine synthesis algorithm is used as an intermediate step in the implementation scheme generation. To enable the generation of descriptive and understandable implementation schemes from a state machine, we have modified the original synthesis algorithm to identify receivers for each operation call. Thus, for each operation call, the algorithm appends the name of the receiver to the name of the message. The original synthesis algorithm identifies the receivers only when they cannot be uniquely determined, i.e., when the same message is sent to several objects (e.g., a broadcast).

2.2 Generating class diagrams from sequence diagrams

A class diagram can be synthesized on the basis of sequence diagrams. In Section 4, we combine this transformation technique with the pseudocode synthesis of operations. We divide the transformation operation into two phases. First, we translate elements in a sequence diagram into a class diagram in the following way: we map classifier roles (participants) and messages of a sequence diagram to classes, associations, and operations in a class diagram. This mapping is fairly straightforward: if there is a message between two objects, then there must be an association between the classes of the objects, and the class of the receiving object must have the corresponding operation. Second, we use a collection of heuristic rules to generate interface hierarchies, composition relationships, and multiplicities. The suggestions given by these rules are guaranteed to be consistent, even though they do not necessarily match the intentions of the designer. However, the suggestions give useful hints for the designer and inform her about the appearance of certain types of patterns in the sequence diagrams. The suggestions concern patterns that may have certain implications in a class diagram. As an example, if an object interacts with several instances of the same class, the rules suggest adding a multiplicity symbol (*) to the association generated from this interaction. These rules are discussed in more detail in [17].
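The first phase of this transformation can be sketched in a few lines of Python. The sketch assumes that each message is already available as a (sender class, receiver class, operation) triple; the function name and the data layout are hypothetical, and the heuristic rules of the second phase [17] are not covered. Skipping an association when sender and receiver have the same class is a simplification made here, not a rule from the paper.

```python
from collections import defaultdict


def synthesize_class_diagram(messages):
    """Map messages to classes, operations, and associations.

    messages: iterable of (sender_class, receiver_class, operation) triples.
    Returns (operations, associations), where operations maps each class
    name to its set of operations and associations is a set of unordered
    class pairs.
    """
    operations = defaultdict(set)
    associations = set()
    for sender, receiver, operation in messages:
        operations[sender]                      # make sure the sender appears as a class
        operations[receiver].add(operation)     # the receiver must offer the operation
        if sender != receiver:                  # simplification: no self-associations
            associations.add(frozenset((sender, receiver)))
    return dict(operations), associations
```

For the listbox example used later in the paper, a call from a dialog to ListBox.update() would yield classes for both participants, an update operation on ListBox, and one association between the two classes.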

3. Transforming state machines into pseudocode

In UML, a use case defines a coherent unit of functionality of a (sub)system without revealing the internal structure of the (sub)system [16]. Sequence diagrams and collaboration diagrams can be used for specifying realizations of use cases. In the previous section, we explained how it is possible to synthesize a state machine (a statechart diagram in UML) for an operation on the basis of sequence diagrams. In this section, we show how a state machine can be transformed into structured pseudocode, which is usually the most natural representation format for operation bodies.

3.1 Pseudocode generation

We define a simple graph grammar that gives the basic production rules used for decomposing a state machine. The production rules define a language that the pseudocode generator can accept. The basic structures that we are looking for are well known and date back to the very first imperative programming languages, such as Algol. Dijkstra [4] suggests three types of decomposition: concatenation, selection, and repetition. Concatenation includes statements that can be executed sequentially. Selection includes basic branching structures: if, if-then-else, and case-of. Repetition includes while and repeat-until. Dijkstra argues that programs should be written using these constructs. We adopt these control structures as the basis of our grammar.

It is obvious that only a subset of all state machines conforms to the language defined by these rules. For example, interleaved control structures cannot be expressed by these rules only. On the other hand, it would be trivial to generate pseudocode from an arbitrary state machine using switch-case structures or goto instructions. However, we do not consider that kind of approach useful, since our purpose is to give an intuitive, natural overview of an operation body as it might be implemented according to current design information. If such a readable, high-level description cannot be generated, we expect that the state machine representation is likely to be more understandable than a textual form based on, say, a switch-case pattern. Since the state machine will be produced in any case as a side effect in our approach, we can display the description of an operation as a state machine (UML statechart diagram) in such cases.

A central problem in transforming a state machine into pseudocode is the parsing of a state machine. There exist techniques and tools for graphical parsing [13, 15, 19]. Our approach can be seen as an instance of a class of high-level graph parsing algorithms. However, in our case, applying a general graphical parsing algorithm would have been inappropriate, because it is possible to derive a more efficient solution for our simple grammar directly.

The algorithm takes a state machine as its input and generates, when possible, a pseudocode procedure implementing the state machine. We assume that the input state machine is given in a form where possible extensions (like entry and exit actions) have been expanded into ordinary states and transitions. In this normal form, a state machine consists of states with an action and transitions with a label. Both actions and labels may be empty. The state machine synthesis algorithm used here generates state machines of this form.

Actions and transition labels are handled as plain strings. We assume that if there is more than one state transition between any two states, these transitions are combined into a single transition whose label is the disjunction of the original labels. L(t) denotes the label associated with transition t. P(q) denotes the action associated with state q.
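The normalization of parallel transitions can be sketched as follows. The triple representation of a transition and the use of " || " as the textual disjunction operator are assumptions made for this illustration; the paper only states that the labels are combined into a disjunction.

```python
def merge_parallel_transitions(transitions):
    """Combine transitions that share source and target into one.

    transitions: list of (source, target, label) triples with string labels.
    Returns a list with one triple per (source, target) pair, in first-seen
    order, whose label is the disjunction of the original labels.
    """
    merged = {}
    for source, target, label in transitions:
        merged.setdefault((source, target), []).append(label)
    return [
        (source, target, " || ".join(labels))
        for (source, target), labels in merged.items()
    ]
```

Because labels are handled as plain strings, the merged label is simply a longer string; no attempt is made to simplify the resulting condition.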

The language of the produced pseudocode consists of the following symbols: begin block ({), end block (}), if, then, else, while, do, and return. In addition, assert is used by default to indicate which triggering conditions must be available for a particular action to take place. All other strings in the pseudocode are either simple statements or expressions (conditions), corresponding to the actions and labels of transitions of the state machine, respectively: a simple statement is always of the form P(q), and an expression is of the form L(t). The control constructs of the language are if-statements, while-loops, and do-while loops.

The production rules are the following:

1. Start
2. Empty/Terminal state (2.1, 2.2)
3. Sequential blocks
4. Branching (4.1 - 4.3) (if - else if)
5. Do loop
6. While loop

Figure 1. Production rules


The production rules are shown in Figure 1. The non-terminals are marked with labels S, G, and B. S is the start symbol, G stands for a structured subgraph, and B stands for a branch in a conditional structure. Note that a start state is always required. The terminals of the grammar are states and state transitions. Individual state transitions are marked with labels s and t. T stands for a (possible) set of one or more transitions resulting from the repeated application of rule 4.2. Some of the productions contain context vertices that are required to be present but are not modified during transformation. The context vertices are indicated using dashed lines.

With the given production rules, it is assumed that the non-terminal symbol G has unique entry and exit nodes (that is, nodes with incoming or outgoing edges). If G produces empty, the possible single incoming edge is assimilated with the possible single outgoing edge. Otherwise, the incoming and outgoing edges are removed.

Each non-terminal of the graph grammar is associated with a code attribute. The value of the code attribute is the pseudocode representing that particular instance of the non-terminal (that is, a subgraph).

The computation rules for the code attribute are the following, corresponding to the production rules:

1.   S.code = G.code
2.1  G.code = empty
2.2  G.code = P(q) R(q)
     where R(q) is defined as follows:
       R(q) = empty, if q has outgoing edges
       R(q) = return, otherwise
3.   G.code = G1.code [ assert( L(t) ) ] G2.code
4.1  G.code = P(q) if( L(t) ) { G1.code } B.code G2.code
4.2  B.code = B1.code B2.code
4.3  B.code = else if( L(t) ) { G.code }
5.   G.code = do { P(q) [ assert( L(s) ) ] G.code } while ( L(t) ) [ assert( L(t2) ) ]
6.   G.code = P(q) while ( L(s) ) { G.code [ assert( L(t) ) ] P(q) } [ assert( L(t2) ) ]

Apart from the set of branching rules, most of the production rules are reasonably intuitive. Rule 4.1 is used for producing a branching structure of two or more branches beginning with a transition t. Rule 4.2 is used for duplicating an arbitrary number of additional else if branches, which are then processed by rule 4.3. Note that no finishing else branch is generated, since we want to explicitly state the reason for taking each particular branch.

The context-sensitiveness of the production rules places heavy requirements on the parsing algorithm. Rekers and Schurr discuss the problems related to such graph grammar structures in [15]. Fortunately, by having a stable set of production rules instead of an arbitrary one, our parsing procedure is reduced to a graph-matching problem with a fixed set of subgraphs to be searched and identified.

In practice, it is often important to express the reason for some or all of the transitions explicitly. We define two modes for our algorithm: non-verbose mode and verbose mode. The former is a loose form of transformation where we only present those conditions that are used together with if, else if, or while instructions. In this form, the exit conditions from iterations and the transitions in the sequential blocks rule are not transformed into pseudocode as assertions (shown in brackets in the computation rules).

The graph grammar is actually ambiguous in the sense that it is possible to produce different derivations for the same state machine. Naturally, the order does not affect the validity of the pseudocode as long as the parsing succeeds. We use a tail-recursive algorithm that parses the state machine, interpreted as a graph, and searches for subgraph structures corresponding to the production rules, forcing a particular precedence order.
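To make the role of the code attribute and of the two modes more concrete, the following Python sketch evaluates the attribute over an already constructed derivation tree. It does not perform the graph parsing itself. The node classes Simple, Seq, If, and While and the function emit are hypothetical names, the do loop of rule 5 is omitted for brevity, and the continuation after a branching structure is expressed with a Seq node rather than inside the If node, which is a simplification of rule 4.1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Simple:
    """Rule 2.2: a single state q; final means q has no outgoing edges."""
    action: str
    final: bool = False


@dataclass
class Seq:
    """Rule 3: first subgraph, transition label, second subgraph."""
    first: object
    label: str
    second: object


@dataclass
class If:
    """Rules 4.1-4.3: P(q) followed by an if / else if chain."""
    action: str
    branches: List[Tuple[str, object]]   # (transition label, body) pairs


@dataclass
class While:
    """Rule 6: P(q), loop condition L(s), body, back label L(t), exit label L(t2)."""
    action: str
    cond: str
    body: object
    back_label: str = ""
    exit_label: str = ""


def emit(node, verbose=False, depth=0):
    """Return the pseudocode lines for node; assertions only in verbose mode."""
    pad, inner = "    " * depth, "    " * (depth + 1)
    if isinstance(node, Simple):
        lines = [pad + node.action] if node.action else []
        return lines + ([pad + "return"] if node.final else [])
    if isinstance(node, Seq):
        lines = emit(node.first, verbose, depth)
        if verbose and node.label:
            lines.append(f"{pad}assert({node.label})")
        return lines + emit(node.second, verbose, depth)
    if isinstance(node, If):
        lines = [pad + node.action] if node.action else []
        for i, (label, body) in enumerate(node.branches):
            keyword = "if" if i == 0 else "else if"   # no finishing else branch
            lines.append(f"{pad}{keyword} ({label}) {{")
            lines += emit(body, verbose, depth + 1)
            lines.append(pad + "}")
        return lines
    if isinstance(node, While):
        lines = [pad + node.action] if node.action else []
        lines.append(f"{pad}while ({node.cond}) {{")
        lines += emit(node.body, verbose, depth + 1)
        if verbose and node.back_label:
            lines.append(f"{inner}assert({node.back_label})")
        if node.action:
            lines.append(inner + node.action)         # P(q) is repeated inside the loop
        lines.append(pad + "}")
        if verbose and node.exit_label:
            lines.append(f"{pad}assert({node.exit_label})")
        return lines
    raise TypeError(f"unsupported node: {node!r}")
```

Calling emit with verbose=False drops exactly the bracketed assert parts of the computation rules, while the conditions of if, else if, and while are always kept.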

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram: guards and iterations. UML does not specify their content, so they are handled as ordinary strings. Note that some preprocessing is done on the sequence diagram. Stereotyped messages for object creation and destruction (<<create>> and <<destroy>>, respectively) are transformed into Java-like expressions using new and delete.

Figure 2. Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules, together with the generated pseudocode including assertions, and the right side shows the simplest state machine [1] that does not conform to the production rules described previously. The reason is the two-directional connection between the two subgraphs of the if-else if structure, which violates the single-entry principle of substructures. The handwritten example shows, however, that even though the state machine can be expressed without using explicit state variables or goto instructions, the original visualization is clearer. In cases like this, the structured implementation scheme provided for the designer is the state machine itself, not the pseudocode.

4. Exploitation in a UML tool

The UML class diagram notation supports notes that can be used, e.g., to attach code fragments to specific model elements. In this section, we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams. We illustrate the results with a simple example run on an existing tool prototype. In order to use the mechanisms described earlier, we have integrated them into the TED tool [23]. First, we export a set of sequence diagrams from the TED repository, describing different usage scenarios of a particular operation. We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED. We then further export this state machine for operation description synthesis. Finally, the resulting pseudocode is imported to TED as a UML note and placed in the appropriate class diagram. When desired, the whole process can be fully automated.


As an example, consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox. When the button is pressed, the state of each element (listed in the listbox) is updated on the basis of the model. If the state has become inconsistent, the element is deleted and removed from the list, and a new replacement element is added. We will use this example throughout the rest of this paper.

Figure 3. Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once. The listbox contains only one element (a), whose state is inconsistent. The element (a) is deleted and replaced by a new, updated element (b). In order to give a more comprehensive example, we construct two other sequence diagrams describing different execution paths. All three sequence diagrams describe the behavior related to a single use case. The second sequence diagram describes a case where the listbox is empty, and the third one describes a situation where the elements in the listbox are all valid.

We then call the state machine synthesizer component, which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBox.update() operation. The view of the resulting state machine is shown next to the example sequence diagram in Figure 3.

Figure 4 shows the operation description for ListBox.update() synthesized from the state machine in Figure 3, together with the class diagram synthesized directly from the sequence diagrams. The pseudocode shows a while loop with an enclosing if-structure, object destruction and creation actions, and one assertion. In short, the listbox object checks whether the element list is empty and, if not, goes through the immutable elements in the element list one by one. If an element is not consistent, it is removed and replaced with a new, updated element. The pseudocode representation shows essentially the same information as the state machine in Figure 3, but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation.
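As a hedged illustration of how such an operation description might serve as a starting point, the behavior summarized above could be written out in Python roughly as follows. This is not the pseudocode generated by the tool. The class layout, the attribute element_list, the return value of Element.update(), and the naming of the replacement element are all assumptions made for the sketch.

```python
class Element:
    def __init__(self, name, consistent=True):
        self.name = name
        self.consistent = consistent

    def update(self):
        """Hypothetical: refresh from the model and report consistency."""
        return self.consistent


class ListBox:
    def __init__(self, elements=()):
        self.element_list = list(elements)

    def update(self):
        if not self.element_list:             # enclosing if-structure: empty list
            return
        for a in list(self.element_list):     # iterate over a snapshot so removal is safe
            if not a.update():                # element has become inconsistent
                self.element_list.remove(a)   # remove (and thereby drop) element a
                b = Element(a.name + "'")     # create the replacement element b
                self.element_list.append(b)
```

The three scenarios of the example (one inconsistent element, an empty listbox, and only valid elements) correspond to the three paths through this method.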


Figure 4. Synthesized operation description

As a final remark, our technique opens up an interesting possibility to automatically obtain slices of operation implementations as required by a certain functionality (represented by a certain set of sequence diagrams). Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams, the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams, and in this way examine the implementation in a limited context. This allows the designer to focus on one functionality at a time, making it easier to understand and correct the behavior. Eventually, the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams, merging the partial implementations. This approach is illustrated in Figure 5.

Figure 5. Example of combining operation descriptions

In Figure 5, we have two sets of sequence diagrams describing the realization of two distinct use cases. We can synthesize a state machine and generate pseudocode for both use cases individually, but we can also incrementally produce a new composite state machine. This state machine can then be used for the generation of a combined implementation scheme.

5. Related research

Biermann and Krishnaswamy present an algorithm for synthesizing programs from their traces [2]. This algorithm has been used as a basis for our state machine synthesis. The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces. A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions. Essentially, the user gives traces of the expected program, and the algorithm produces the smallest program that is capable of executing the given example traces. Moreover, after being given some finite number of example traces taken from a program, the algorithm produces a program that can execute exactly the same set of traces as the original one; that is, the algorithm learns (or infers) an unknown program. The program is represented as a transition graph where the nodes are instructions and the transitions are conditions. A sequence diagram corresponds to a trace in the algorithm by Biermann and Krishnaswamy: sent messages are treated like instructions and received messages like conditions. Since their method is intended for producing program code, it is suited in principle for pseudocode generation. However, the output of the algorithm by Biermann and Krishnaswamy is a transition graph. In our approach, a state machine is an intermediate step. Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine.

Normark introduces an approach to extract static models from a set of scenarios [14]. The proposed technique has been implemented in a tool called DYNAMO. A scenario consists of participating objects and messages sent between them, shown in a UML sequence diagram fashion. In addition, pre- and postconditions can be set at any point in the scenario diagram. Our approach does not require any extensions to the UML sequence diagram notation. On the other hand, the pre- and postconditions enable DYNAMO to produce more complete code compared to our pseudocode generator. The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax, while we aim to visualize the extracted information as annotated UML class diagrams.

6. Future work

In this paper, we have discussed simple UML sequence diagrams consisting only of objects and messages between them. This basic form of a sequence diagram is extended in UML with various other concepts, for instance conditional branching, iteration, recursion, and explicit return messages. In addition, states of objects can be attached to a sequence diagram.

We aim to extend our approach to handle the full UML sequence diagram notation. The state machine synthesizer, for instance, can easily be extended to take the above-mentioned notation concepts into account. For example, a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams, each of them expressing one branch in the structure. From the sender's point of view, the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message.
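The proposed treatment of guarded messages can be sketched as a small parsing step. The sketch assumes that a guarded message is written as "[guard] name", and it reuses hypothetical "recv" and "sent" tags for trace items; neither the function name nor the tuple format comes from the paper.

```python
import re

# "[guard] message" with optional surrounding whitespace (assumed label syntax).
_GUARDED = re.compile(r"^\s*\[(?P<guard>[^\]]*)\]\s*(?P<name>.+?)\s*$")


def split_guarded_message(label):
    """Split a message label into trace items from the sender's point of view.

    A guarded label yields the guard as a received message followed by the
    message name as a sent message; an unguarded label yields only the sent
    message.
    """
    match = _GUARDED.match(label)
    if match is None:
        return [("recv", "")] [:0] + [("sent", label.strip())]
    return [("recv", match.group("guard").strip()), ("sent", match.group("name"))]
```

With this step in place, the guard ends up as a transition label and the message name as a state action, so the synthesizer itself would not need to distinguish guarded from unguarded messages.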


The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines. Such extensions could include the generation of labels and goto, break, and continue instructions.

Although it has not been our primary aim, in principle this approach could be extended to actual code generation: instead of pseudocode, we could easily follow the syntax of an actual language. The problem is, however, the level of information available in sequence diagrams. Usually, sequence diagrams are employed in early stages of design, when detailed information is not available or sensible. If sequence diagrams were augmented with detailed implementation-level constructs, like assignments and other primitive expressions, generating executable code would become possible. However, the current UML sequence diagram notation gives poor support for describing these kinds of features.

In order to validate the approach described in this paper, a more comprehensive case study is needed. We expect to be able to experiment with the technique in a larger real-life project in order to determine the strong and weak points of the method.

7. Conclusions

In this paper, we discussed a technique for generating structured implementation schemes, presented as pseudocode, from UML sequence diagrams. The pseudocode generation consists of two steps: synthesizing state machines for operations and generating pseudocode from the state machines. We also discussed how a class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions. The proposed techniques are implemented in a real-life UML design tool, the Nokia TED. We presented an example of how the techniques were used in diagram synthesis. We also presented possible extensions and future research topics.

The proposed techniques provide help for the designer in the early phases of the design process. The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively; they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design. It is our belief that, as such, the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations, as implied by a set of sequence diagrams.

Acknowledgements. The authors wish to thank Erkki Makinen and Markku Sakkinen for their valuable comments. This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898), Nokia, Metso Automation, Sensor Software Consulting, Ebsolut, and Plenware.

8. References

[1] Aho, A.V. and Ullman, J.D., The Theory of Parsing, Translation, and Compiling, vol. II, Prentice-Hall, 1973.

[2] Biermann, A.W. and Krishnaswamy, R., Constructing programs from example computations, IEEE Trans. Softw. Eng. 2(3), 1976, pp. 141-153.

[3] Booch, G., Rumbaugh, J., and Jacobson, I., The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[4] Dahl, O.-J., Dijkstra, E.W., and Hoare, C.A.R., Structured Programming, Academic Press, 1972.

[5] Gamma, E., Helm, R., Johnson, R., and Vlissides, J., Design Patterns - Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.

[6] Harel, D., Statecharts: A Visual Formalism for Complex Systems, Science of Computer Programming 8, 1987, pp. 231-274.

[7] Harel, D., Lachover, H., Naamad, A., Pnueli, A., Politi, M., Sherman, R., Shtull-Trauring, A., and Trakhtenbrot, M., STATEMATE: A Working Environment for the Development of Complex Reactive Systems, IEEE Trans. Softw. Eng. 16(4), 1990, pp. 403-414.

[8] I-Logix, 2000. On-line at http://www.ilogix.com

[9] Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Videira Lopes, C., Loingtier, J.-M., and Irwin, J., Aspect-Oriented Programming, in Proc. of ECOOP'97, LNCS 1241, Springer-Verlag, 1997, pp. 220-243.

[10] Koskimies, K. and Makinen, E., Automatic Synthesis of State Machines from Trace Diagrams, Softw. Pract. & Exper. 24(7), 1994, pp. 643-658.

[11] Koskimies, K., Mannisto, T., Systa, T., and Tuomi, J., Automated Support for Modeling OO Software, IEEE Software 15(1), January/February 1998, pp. 87-94.

[12] Makinen, E. and Systa, T., MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML, in Proc. of ICSE 2001, Toronto, Canada, 2001, pp. 15-24.

[13] Marriott, K. and Meyer, B. (eds.), Visual Language Theory, Springer, 1998.

[14] Normark, K., Synthesis of Program Outlines from Scenarios in DYNAMO, Aalborg University, 1998. On-line at http://www.cs.auc.dk/~normark/dynamo.html

[15] Rekers, J. and Schurr, A., A Graph Grammar Approach to Graphical Parsing, in Proc. of the 11th International IEEE Symposium on Visual Languages, IEEE, 1995.

[16] Rumbaugh, J., Jacobson, I., and Booch, G., The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[17] Selonen, P., Koskimies, K., and Sakkinen, M., How to Make Apples from Oranges in UML, in Proc. of HICSS-34 (CD-ROM), Maui, Hawaii, 2001.

[18] Systa, T., Static and Dynamic Reverse Engineering Techniques for Java Software Systems, Dept. of Computer and Information Sciences, University of Tampere, Report A-2000-4, PhD Dissertation, 2000.

[19] Taentzer, G., Ermel, C., and Rudolf, M., The AGG Approach: Language and Tool Environment, in the Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, World Scientific, 1999.

[20] The Unified Modeling Language Notation Guide v1.3, OMG, 1999. On-line at http://www.rational.com

[21] The Unified Modeling Language Semantics v1.3, OMG, 1999. On-line at http://www.rational.com

[22] Whittle, J. and Schumann, J., Generating Statechart Designs From Scenarios, in Proc. of ICSE'00, Limerick, Ireland, 2000, pp. 314-323.

[23] Wikman, J., Evolution of a Distributed Repository-Based Architecture, in Proc. of NOSA'98, 1998. On-line at http://www.hk-r.se/fou/forskinfo.nsf

objects, then there must be an association between the classes of the objects, and the class of the receiving object must have the corresponding operation. Second, we use a collection of heuristic rules to generate interface hierarchies, composition relationships, and multiplicities. The suggestions given by these rules are guaranteed to be consistent, even though they do not necessarily match the intentions of the designer. However, the suggestions give useful hints for the designer and inform her about the appearance of certain types of patterns in the sequence diagrams. The suggestions concern patterns that may have certain implications in a class diagram. As an example, if an object interacts with several instances of the same class, the rules suggest adding a multiplicity symbol (*) to the association generated from this interaction. These rules are discussed in more detail in [17].
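The mandatory rule (associations and operations implied by messages) and the multiplicity heuristic can be illustrated with a minimal Python sketch. The `Message` record, the `derive_class_facts` function, and the sample objects are hypothetical names chosen for illustration; they are not part of the tool described here, and the actual rule set is the one discussed in [17].

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One message of a sequence diagram (hypothetical record)."""
    sender: str          # name of the sending object
    sender_class: str
    receiver: str        # name of the receiving object
    receiver_class: str
    operation: str


def derive_class_facts(messages):
    """Return (operations, associations) implied by the messages.

    operations:   class name -> set of operations the class must offer
    associations: (sender class, receiver class) -> "*" or None
    """
    operations = defaultdict(set)
    partners = defaultdict(set)  # per sending object: receivers of one class
    for m in messages:
        # The class of the receiving object must have the operation.
        operations[m.receiver_class].add(m.operation)
        partners[(m.sender, m.sender_class, m.receiver_class)].add(m.receiver)

    associations = {}
    for (_, s_cls, r_cls), receivers in partners.items():
        key = (s_cls, r_cls)
        # Heuristic: one object talking to several instances of the same
        # class suggests a "*" multiplicity on the association.
        many = len(receivers) > 1
        if many or key not in associations:
            associations[key] = "*" if many else None
    return dict(operations), associations


msgs = [
    Message("lb", "ListBox", "a", "Element", "update"),
    Message("lb", "ListBox", "b", "Element", "update"),
    Message("dlg", "Dialog", "lb", "ListBox", "update"),
]
ops, assocs = derive_class_facts(msgs)
print(ops)
print(assocs)
```

In this sketch a value of `None` simply means that no multiplicity symbol is suggested for the association.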

3. Transforming state machines into pseudocode

In UML, a use case defines a coherent unit of functionality of a (sub)system without revealing the internal structure of the (sub)system [16]. Sequence diagrams and collaboration diagrams can be used for specifying realizations of use cases. In the previous section, we explained how it is possible to synthesize a state machine (a statechart diagram in UML) for an operation on the basis of sequence diagrams. In this section, we show how a state machine can be transformed into structured pseudocode, which is usually the most natural representation format for operation bodies.

3.1. Pseudocode generation

We define a simple graph grammar that gives the basic production rules used for decomposing a state machine. The production rules define a language that the pseudocode generator can accept. The basic structures that we are looking for are well known and date back to the very first imperative programming languages, such as Algol. Dijkstra [4] suggests three types of decomposition: concatenation, selection, and repetition. Concatenation includes statements that can be executed sequentially. Selection includes the basic branching structures if, if-then-else, and case-of. Repetition includes while and repeat-until. Dijkstra argues that programs should be written using these constructs. We adopt these control structures as the basis of our grammar.

It is obvious that only a subset of all state machines conforms to the language defined by these rules. For example, interleaved control structures cannot be expressed by these rules alone. On the other hand, it would be trivial to generate pseudocode from an arbitrary state machine using switch-case structures or goto instructions. However, we do not consider that kind of approach useful, since our purpose is to give an intuitive, natural overview of an operation body as it might be implemented according to current design information. If such a readable, high-level description cannot be generated, we expect that the state machine representation is likely to be more understandable than a textual form based on, say, a switch-case pattern. Since the state machine is produced in any case as a side effect of our approach, we can display the description of an operation as a state machine (UML statechart diagram) in such cases.

A central problem in transforming a state machine into pseudocode is the parsing of the state machine. There exist techniques and tools for graphical parsing [13,15,19]. Our approach can be seen as an instance of a class of high-level graph parsing algorithms. However, in our case, applying a general graphical parsing algorithm would have been inappropriate, because it is possible to derive a more efficient solution for our simple grammar directly.

The algorithm takes a state machine as its input and generates, when possible, a pseudocode procedure implementing the state machine. We assume that the input state machine is given in a form where possible extensions (like entry and exit actions) have been expanded into ordinary states and transitions. In this normal form, a state machine consists of states with an action and transitions with a label. Both actions and labels may be empty. The state machine synthesis algorithm used here generates state machines of this form.

Actions and transition labels are handled as plain strings. We assume that if there is more than one state transition between any two states, these transitions are combined into a single transition whose label is the disjunction of the original labels. L(t) denotes the label associated with transition t, and P(q) denotes the action associated with state q.
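The normal form and the merging of parallel transitions can be sketched in Python as follows. This is only an illustration under stated assumptions: the `StateMachine` class, its field names, and the use of `||` as the textual disjunction operator are hypothetical and do not come from the described tool. The sketch also assumes that an empty label means an unconditional transition, so a disjunction containing an empty label is itself empty.

```python
from dataclasses import dataclass, field


def _disjunction(labels):
    """Join labels with '||'; an empty (unconditional) label absorbs the rest."""
    if "" in labels:
        return ""
    # dict.fromkeys drops repeated labels while keeping their order.
    return " || ".join(dict.fromkeys(labels))


@dataclass
class StateMachine:
    """Normal form: an action P(q) per state, a label L(t) per transition."""
    actions: dict = field(default_factory=dict)      # state -> P(q), may be ""
    transitions: list = field(default_factory=list)  # (source, target, L(t))

    def merge_parallel(self):
        """Combine all transitions between the same two states into one."""
        grouped = {}
        for src, dst, label in self.transitions:
            grouped.setdefault((src, dst), []).append(label)
        self.transitions = [
            (src, dst, _disjunction(labels))
            for (src, dst), labels in grouped.items()
        ]


sm = StateMachine(
    actions={"q0": "", "q1": "a.update()"},
    transitions=[("q0", "q1", "x"), ("q0", "q1", "y"), ("q1", "q0", "")],
)
sm.merge_parallel()
print(sm.transitions)
```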

The language of the produced pseudocode consists of the following symbols: begin block ({), end block (}), if, then, else, while, do, and return. In addition, assert is used by default to indicate which triggering conditions must hold for a particular action to take place. All other strings in the pseudocode are either simple statements or expressions (conditions), corresponding to the actions and the labels of transitions of the state machine, respectively: a simple statement is always of the form P(q), and an expression is of the form L(t). The control constructs of the language are if-statements, while-loops, and do-while loops.

The production rules are the following:

1. Start
2. Empty/Terminal state (2.1, 2.2)
3. Sequential blocks
4. Branching (4.1 - 4.3) (if - else if)
5. Do loop
6. While loop

[Figure: graphical production (reduction) rules 1, 2.1, 2.2, 3, 4.1, 4.2, 5, and 6]

Figure 1. Production rules

The production rules are shown in Figure 1. The non-terminals are marked with labels S, G, and B: S is the start symbol, G stands for a structured subgraph, and B stands for a branch in a conditional structure. Note that a start state is always required. The terminals of the grammar are states and state transitions. Individual state transitions are marked with labels s and t; T stands for a (possible) set of one or more transitions resulting from the repeated application of rule 4.2. Some of the productions contain context vertices that are required to be present but are not modified during transformation. The context vertices are indicated using dashed lines.

With the given production rules, it is assumed that the non-terminal symbol G has unique entry and exit nodes (that is, nodes with incoming or outgoing edges). If G produces empty, the possible single incoming edge is assimilated with the possible single outgoing edge. Otherwise, the incoming and outgoing edges are removed.

Each non-terminal of the graph grammar is associated with a code attribute. The value of the code attribute is the pseudocode representing that particular instance of the non-terminal (that is, a subgraph).

The computation rules for the code attribute are the following, corresponding to the production rules:

1.   S.code = G.code
2.1. G.code = empty
2.2. G.code = P(q) R(q)

where R(q) is defined as follows: R(q) = empty, if q has outgoing edges; return, otherwise.

3.   G.code = G1.code [ assert( L(t) ) ] G2.code
4.1. G.code = P(q) if( L(t) ) { G1.code } B.code G2.code
4.2. B.code = B1.code B2.code
4.3. B.code = else if( L(t) ) G.code
5.   G.code = do { P(q) [ assert( L(s) ) ] G.code } while ( L(t) ) [ assert( L(t2) ) ]
6.   G.code = P(q) while ( L(s) ) { G.code [ assert( L(t) ) ] P(q) } [ assert( L(t2) ) ]

Apart from the set of branching rules, most of the production rules are reasonably intuitive. Rule 4.1 is used for producing a branching structure of two or more branches, beginning with a transition t. Rule 4.2 is used for duplicating an arbitrary number of additional else if branches, which are then processed by rule 4.3. Note that no finishing else branch is generated, since we want to explicitly state the reason for taking each particular branch.

The context-sensitiveness of the production rules places heavy requirements on the parsing algorithm. Rekers and Schürr discuss the problems related to such graph grammar structures in [15]. Fortunately, by having a fixed set of production rules instead of an arbitrary one, our parsing procedure is reduced to a graph-matching problem with a fixed set of subgraphs to be searched and identified.

In practice, it is often important to express the reason for some or all of the transitions explicitly. We define two modes for our algorithm: non-verbose mode and verbose mode. The former is a loose form of the transformation, where we only present those conditions that are used together with if, else if, or while instructions. In this form, the exit conditions from iterations and the transitions in the sequential blocks rule are not transformed into pseudocode as assertions (shown in brackets in the computation rules). In verbose mode, these assertions are included.

The graph grammar is actually ambiguous, in the sense that it is possible to produce different derivations for the same state machine. Naturally, the order does not affect the validity of the pseudocode as long as the parsing succeeds. We use a tail-recursive algorithm that parses the state machine, interpreted as a graph, and searches for subgraph structures corresponding to the production rules, forcing a particular precedence order.
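The computation rules for the code attribute, and the difference between the two output modes, can be sketched in Python as shown below. This is a minimal illustration, not the parser itself: it assumes that the subgraphs have already been recognized, and it only assembles the pseudocode lines. The function names, the `verbose` flag, the indentation, the statement terminators, and the braces around else if bodies are assumptions made for the example.

```python
def assertion(label, verbose):
    """Bracketed parts of the rules: emitted only in verbose mode."""
    return [f"assert( {label} );"] if verbose and label else []


def indent(lines):
    return ["    " + line for line in lines]


def seq(g1, label, g2, verbose=False):
    """Rule 3: G.code = G1.code [ assert( L(t) ) ] G2.code"""
    return g1 + assertion(label, verbose) + g2


def branch(action, branches, rest):
    """Rules 4.1-4.3: an if / else if chain without a closing else branch.

    branches is a list of (label, body lines); rest is the code that follows.
    """
    lines = [action] if action else []
    for i, (label, body) in enumerate(branches):
        keyword = "if" if i == 0 else "else if"
        lines += [f"{keyword}( {label} ) {{"] + indent(body) + ["}"]
    return lines + rest


def do_loop(action, s_label, body, t_label, exit_label, verbose=False):
    """Rule 5: do { P(q) [assert(L(s))] G.code } while ( L(t) ) [assert(L(t2))]"""
    inner = ([action] if action else []) + assertion(s_label, verbose) + body
    return (["do {"] + indent(inner) + [f"}} while ( {t_label} );"]
            + assertion(exit_label, verbose))


def while_loop(action, s_label, body, t_label, exit_label, verbose=False):
    """Rule 6: P(q) while ( L(s) ) { G.code [assert(L(t))] P(q) } [assert(L(t2))]"""
    head = [action] if action else []
    inner = body + assertion(t_label, verbose) + head
    return (head + [f"while ( {s_label} ) {{"] + indent(inner) + ["}"]
            + assertion(exit_label, verbose))


loop = while_loop("e = list.next();", "e != null", ["e.update();"],
                  "updated", "e == null", verbose=True)
print("\n".join(loop))
```

Calling the same functions with `verbose=False` drops every `assert` line, which corresponds to the non-verbose mode described above.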

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram: guards and iterations. UML does not specify their content, so they are handled as ordinary strings. Note that some preprocessing is done on the sequence diagram: stereotyped messages for object creation and destruction (<<create>> and <<destroy>>, respectively) are transformed into Java-like expressions using new and delete.
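The preprocessing of stereotyped messages can be sketched as follows. The `preprocess` function and its parameters are hypothetical, and rendering an ordinary message as a call on the target object is an assumption made for this example only.

```python
import re

# Matches the two stereotypes that receive special treatment.
STEREOTYPE = re.compile(r"^<<(create|destroy)>>")


def preprocess(message, target, target_class):
    """Rewrite a stereotyped message as a Java-like statement.

    Messages without a stereotype are returned as plain calls on the
    target object (an assumption of this sketch).
    """
    text = message.strip()
    match = STEREOTYPE.match(text)
    if match is None:
        return f"{target}.{text}"
    if match.group(1) == "create":
        return f"{target} = new {target_class}"
    return f"delete {target}"


print(preprocess("<<create>>", "b", "Element"))
print(preprocess("<<destroy>>", "a", "Element"))
print(preprocess("update()", "a", "Element"))
```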

Figure 2. Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules, together with the generated pseudocode including assertions, and the right side shows the simplest state machine [1] that does not conform to the production rules described previously. The reason is the two-directional connection between the two subgraphs of the if-else if structure, which violates the single-entry principle of substructures. The handwritten example shows, however, that even though the state machine can be expressed without using explicit state variables or goto instructions, the original visualization is clearer. In cases like this, the structured implementation scheme provided for the designer is the state machine itself, not the pseudocode.

4. Exploitation in a UML tool

The UML class diagram notation supports notes that can be used, e.g., to attach code fragments to specific model elements. In this section, we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams. We illustrate the results with a simple example run on an existing tool prototype. In order to use the mechanisms described earlier, we have integrated them into the TED tool [23]. First, we export a set of sequence diagrams from the TED repository, describing different usage scenarios of a particular operation. We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED. We then export this state machine again for operation description synthesis. Finally, the resulting pseudocode is imported into TED as a UML note and placed in the appropriate class diagram. When desired, the whole process can be fully automated.

As an example, consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox. When the button is pressed, the state of each element (listed in the listbox) is updated on the basis of the model. If the state has become inconsistent, the element is deleted and removed from the list, and a replacing new element is added. We will use this example throughout the rest of this paper.

Figure 3. Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once. The listbox contains only one element (a), whose state is inconsistent. The element (a) is deleted and replaced by a new, updated element (b). In order to give a more comprehensive example, we construct two other sequence diagrams describing different execution paths. All three sequence diagrams describe the behavior related to a single use case. The second sequence diagram describes a case where the listbox is empty, and the third one describes a situation where the elements in the listbox are all valid.

We then call the state machine synthesizer component, which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBox.update() operation. The view of the resulting state machine is shown next to the example sequence diagram in Figure 3.

Figure 4 shows the operation description for ListBox.update(), synthesized from the state machine in Figure 3, together with the class diagram synthesized directly from the sequence diagrams. The pseudocode shows a while loop with an enclosing if-structure, object destruction and creation actions, and one assertion. In short, the listbox object checks whether the element list is empty and, if not, goes through the immutable elements in the element list one by one. If an element is not consistent, it is removed and replaced with a new, updated element. The pseudocode representation shows essentially the same information as the state machine in Figure 3, but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation.
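To make the described operation body concrete, the following Python sketch shows one way a programmer might start from such an operation description. It is an illustration only and not the generated pseudocode: the `Element` and `ListBox` classes, the `consistent` flag, and the choice to insert the replacement at the position of the removed element are all assumptions.

```python
class Element:
    """Hypothetical list element; update() reports whether it is consistent."""

    def __init__(self, consistent=True):
        self.consistent = consistent

    def update(self):
        return self.consistent


class ListBox:
    def __init__(self, elements):
        self.element_list = list(elements)

    def update(self):
        # if-structure enclosing the loop: nothing to do for an empty list
        if not self.element_list:
            return
        i = 0
        while i < len(self.element_list):
            a = self.element_list[i]
            if not a.update():
                # inconsistent element: remove it and add a replacement
                self.element_list.remove(a)
                b = Element(consistent=True)
                self.element_list.insert(i, b)
            i += 1


box = ListBox([Element(True), Element(False)])
before = list(box.element_list)
box.update()
print([e.consistent for e in box.element_list])
```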

[Figure: UML note attached to the synthesized class diagram (CLD), containing the pseudocode for ListBox.update(). Legible fragments of the note: a.update(), a test for an inconsistent state, elementList.remove(...), delete a, b = new Element, elementList.add(...).]

Figure 4. Synthesized operation description

As a final remark, our technique opens up an interesting possibility to automatically obtain slices of operation implementations as required by a certain functionality (represented by a certain set of sequence diagrams). Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams, the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams, and in this way examine the implementation in a limited context. This allows the designer to focus on one functionality at a time, making it easier to understand and correct the behavior. Eventually, the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams, merging the partial implementations. This approach is illustrated in Figure 5.

Figure 5. Example of combining operation descriptions

In Figure 5, we have two sets of sequence diagrams describing the realization of two distinct use cases. We can synthesize a state machine and generate pseudocode for both use cases individually, but we can also incrementally produce a new composite state machine. This state machine can then be used for the generation of a combined implementation scheme.

5. Related research

Biermann and Krishnaswamy present an algorithm for synthesizing programs from their traces [2]. This algorithm has been used as a basis for our state machine synthesis. The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces. A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions. Essentially, the user gives traces of the expected program, and the algorithm produces the smallest program that is capable of executing the given example traces. Moreover, after being given some finite number of example traces taken from a program, the algorithm produces a program that can execute exactly the same set of traces as the original one; that is, the algorithm learns (or infers) an unknown program. The program is represented as a transition graph where the nodes are instructions and the transitions are conditions. A sequence diagram corresponds to a trace in the algorithm by Biermann and Krishnaswamy: sent messages are treated like instructions and received messages like conditions. Since their method is intended for producing program code, it is suited, in principle, for pseudocode generation. However, the output of the algorithm by Biermann and Krishnaswamy is a transition graph. In our approach, a state machine is an intermediate step. Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine.
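The correspondence between a sequence diagram and a trace can be illustrated with a short Python sketch. It only shows the mapping of received messages to conditions and sent messages to instructions for one object; it is not the synthesis algorithm of [2]. The `trace_for` function, the triple format, and the sample messages are hypothetical.

```python
def trace_for(obj, messages):
    """Build a trace for obj from ordered (sender, receiver, name) triples.

    A received message becomes the condition that holds before the next
    instruction; a sent message becomes an instruction. The result is a
    list of (condition, instruction) pairs, where either part may be "".
    """
    trace, condition = [], ""
    for sender, receiver, name in messages:
        if receiver == obj:
            if condition:
                # two conditions in a row: keep the first without an action
                trace.append((condition, ""))
            condition = name
        elif sender == obj:
            trace.append((condition, name))
            condition = ""
    if condition:
        trace.append((condition, ""))
    return trace


diagram = [
    ("user", "lb", "update"),
    ("lb", "a", "update"),
    ("a", "lb", "inconsistent"),
    ("lb", "a", "<<destroy>>"),
    ("lb", "b", "<<create>>"),
]
print(trace_for("lb", diagram))
```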

Nørmark introduces an approach to extract static models from a set of scenarios [14]. The proposed technique has been implemented in a tool called DYNAMO. A scenario consists of participating objects and messages sent between them, shown in a UML sequence diagram fashion. In addition, pre- and postconditions can be set at any point in the scenario diagram. Our approach does not require any extensions to the UML sequence diagram notation. On the other hand, the pre- and postconditions enable DYNAMO to produce more complete code compared to our pseudocode generator. The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax, while we aim to visualize the extracted information as annotated UML class diagrams.

6. Future work

In this paper, we have discussed simple UML sequence diagrams consisting only of objects and messages between them. This basic form of a sequence diagram is extended in UML with various other concepts, for instance conditional branching, iteration, recursion, and explicit return messages. In addition, states of objects can be attached to a sequence diagram.

We aim to extend our approach to handle the full UML sequence diagram notation. The state machine synthesizer, for instance, can easily be extended to take the above-mentioned notation concepts into account. For example, a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams, each of them expressing one branch in the structure. From the sender's point of view, the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message.
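The proposed parsing of guarded messages can be sketched as follows. The sketch assumes the usual UML notation in which a guard is written in square brackets before the message name; the `split_guarded` function and the sample messages are hypothetical and only illustrate the idea.

```python
import re

# "[guard] message": the guard is everything between the first brackets.
GUARDED = re.compile(r"^\s*\[(?P<guard>[^\]]*)\]\s*(?P<name>.+)$")


def split_guarded(message):
    """Split a message label into (guard, name).

    The guard would be treated as a received message (a transition label),
    and the name as a normal sent message (an action). The guard is None
    when the message has no guard.
    """
    match = GUARDED.match(message)
    if match is None:
        return None, message.strip()
    return match.group("guard").strip(), match.group("name").strip()


print(split_guarded("[inconsistent] remove(a)"))
print(split_guarded("update()"))
```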

The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines. Such extensions could include the generation of labels and goto, break, and continue instructions.

Although it has not been our primary aim, in principle this approach could be extended to actual code generation: instead of pseudocode, we could easily follow the syntax of an actual language. The problem is, however, the level of information available in sequence diagrams. Usually, sequence diagrams are employed in early stages of design, when detailed information is not available or sensible. If sequence diagrams were augmented with detailed implementation-level constructs, like assignments and other primitive expressions, generating executable code would become possible. However, the current UML sequence diagram notation gives poor support for describing these kinds of features.

In order to validate the approach described in this paper, a more comprehensive case study is needed. We expect to be able to experiment with the technique in a larger real-life project in order to determine the strong and weak points of the method.

7. Conclusions

In this paper, we discussed a technique for generating structured implementation schemes, presented as pseudocode, from UML sequence diagrams. The pseudocode generation consists of two steps: synthesizing state machines for operations and generating pseudocode from the state machines. We also discussed how a class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions. The proposed techniques are implemented in a real-life UML design tool, the Nokia TED. We presented an example of how the techniques were used in diagram synthesis. We also presented possible extensions and future research topics.

The proposed techniques provide help for the designer in the early phases of the design process. The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively: they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design. It is our belief that, as such, the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations, as implied by a set of sequence diagrams.

Acknowledgements. The authors wish to thank Erkki Mäkinen and Markku Sakkinen for their valuable comments. This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898), Nokia, Metso Automation, Sensor Software Consulting, Ebsolut, and Plenware.

8. References

[1] Aho, A.V. and Ullman, J.D., The Theory of Parsing, Translation, and Compiling, vol. II, Prentice-Hall, 1973.

[2] Biermann, A.W. and Krishnaswamy, R., Constructing programs from example computations, IEEE Trans. Softw. Eng. 2(3), 1976, pp. 141-153.

[3] Booch, G., Rumbaugh, J., and Jacobson, I., The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[4] Dahl, O.-J., Dijkstra, E.W., and Hoare, C.A.R., Structured Programming, Academic Press, 1972.

[5] Gamma, E., Helm, R., Johnson, R., and Vlissides, J., Design Patterns - Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.

[6] Harel, D., Statecharts: A Visual Formalism for Complex Systems, Science of Computer Programming 8, 1987, pp. 231-274.

[7] Harel, D., Lachover, H., Naamad, A., Pnueli, A., Politi, M., Sherman, R., Shtull-Trauring, A., and Trakhtenbrot, M., STATEMATE: A Working Environment for the Development of Complex Reactive Systems, IEEE Trans. Softw. Eng. 16(4), 1990, pp. 403-414.

[8] I-Logix, 2000. On-line at http://www.ilogix.com

[9] Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Videira Lopes, C., Loingtier, J.-M., and Irwin, J., Aspect-Oriented Programming, in Proc. of ECOOP '97, LNCS 1241, Springer-Verlag, 1997, pp. 220-243.

[10] Koskimies, K. and Mäkinen, E., Automatic Synthesis of State Machines from Trace Diagrams, Softw. Pract. & Exper. 24(7), 1994, pp. 643-658.

[11] Koskimies, K., Männistö, T., Systä, T., and Tuomi, J., Automated Support for Modeling OO Software, IEEE Software 15(1), January/February 1998, pp. 87-94.

[12] Mäkinen, E. and Systä, T., MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML, in Proc. of ICSE 2001, Toronto, Canada, 2001, pp. 15-24.

[13] Marriott, K. and Meyer, B. (eds.), Visual Language Theory, Springer, 1998.

[14] Nørmark, K., Synthesis of Program Outlines from Scenarios in DYNAMO, Aalborg University, 1998. On-line at http://www.cs.auc.dk/~normark/dynamo.html

[15] Rekers, J. and Schürr, A., A Graph Grammar Approach to Graphical Parsing, in Proc. of the 11th International IEEE Symposium on Visual Languages, IEEE, 1995.

[16] Rumbaugh, J., Jacobson, I., and Booch, G., The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[17] Selonen, P., Koskimies, K., and Sakkinen, M., How to Make Apples from Oranges in UML, in Proc. of HICSS-34 (CD-ROM), Maui, Hawaii, 2001.

[18] Systä, T., Static and Dynamic Reverse Engineering Techniques for Java Software Systems, Dept. of Computer and Information Sciences, University of Tampere, Report A-2000-4, PhD Dissertation, 2000.

[19] Taentzer, G., Ermel, C., and Rudolf, M., The AGG Approach: Language and Tool Environment, in the Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, World Scientific, 1999.

[20] The Unified Modeling Language Notation Guide v1.3, OMG, 1999. On-line at http://www.rational.com

[21] The Unified Modeling Language Semantics v1.3, OMG, 1999. On-line at http://www.rational.com

[22] Whittle, J. and Schumann, J., Generating Statechart Designs From Scenarios, in Proc. of ICSE'00, Limerick, Ireland, 2000, pp. 314-323.

[23] Wikman, J., Evolution of a Distributed Repository-Based Architecture, in Proc. of NOSA'98, 1998. On-line at http://www.hk-r.se/fou/forskinfo.nsf

Page 5: Generating Structured Implementation Schemes from Sequence ... · A class diagram is used to specify the static structure of a system in UML. Class diagrams are often annotated with

321

pseudocode procedure implementing the state machine We assume that the input state machine is given in a form where possible extensions (like entry and exit actions) have been expanded into ordinary states and transitions In this normal form a state machine consists of states with an action and transitions with a label Both actions and labels may be empty The state machine synthesis algorithm used here generates state machines of this form

Actions and transition labels are handled as plain strings We assume that if there i s more than one state transition between any two states these transitions are combined to a single transition together with the label consisting of the disjunction of the original labels L(t) denotes the label associated with transition t P(q) denotes the action associated with state q

The language of the produced pseudocode consists of the following symbols begin block (I) end block () if then else while d o and return In addition assert i s used by default to indicate which triggering conditions must be available for a particular action to take place All other strings in the pseudocode are either simple statements or expressions (conditions) corresponding to the actions and labels of transitions of the state machine respectively a simple statement i s always of form P(q) and an expression is of form L(r) The control constructs of the language are if-statements while-loops and do-while - loops

The production rules are the following

I Start 2 EmptyTerminal state (2 I 22) 3 Sequential blocks 4 Branching (41 - 43) (if - else if) 5 Do loop 6 While loop

1

21

22

3

41

42

reduction rules a= e t a

a =

a =

a =

5

6

Figure 1 Production rules

322

The production rules are shown in Figure I The non-terminals are marked with labels S G and B S is the start symbol G stands for a structured subgraph and B stands for a branch in a conditional structure Note that a start state is always required The terminals of the grammar are states and state transitions Individual state transitions are marked with labels s and t T stands for a (possible) set of one or more transitions resulting from the repeated application of rule 42 Some of the productions contain context vertices that are required to be present but are not modified during transformation The context vertices are indicated using dashed lines

With the given production rules it is assumed that the non-terminal symbol G has unique entry and exit nodes (that is nodes with incoming or outgoing edges) If G produces empty the possible single incoming edge is assimilated with the possible single outgoing edge Otherwise the incoming and outgoing edges are removed

Each non-terminal of the graph grammar is associated with a code attribute The value of the code attribute is the pseudocode representing that particular instance of the non-terminal (that is a subgraph)

The computation rules for the code attribute are the following corresponding to the production rules

1 Scode = Gcode 21 Gcode = empty 22 Gcode = P(q) R(q)

where R(q) is defined as follows R(q) = empty if q has outgoing edges

3 Gcode = GLcode [ assert( L(r) ) ] ampcode 41 Gcode = P(q) if( L(t) ) ( GLcode Bcode ampcode 42 Bcode = =code =code 43 Bcode = else if( L(t) ) Gcode 5 Gcode = do ( P(q) [ assert( L(s) ) ] Gcode ] while ( L(t) ) [ assert( L(t2) 1 I 6 Gcode = P(q) while ( L(s) ) Gcode [ assert( L(t) ) ] P(q) ] [ assert( L(r2) ) ]

Apart from the set of branching rules most of the production rules are reasonably intuitive Rule 41 is used for producing a branching structure of two or more branches beginning with a transition t Rule 42 is used for duplicating an arbitrary number of additional else if branches which are then processed by rule 43 Note that no finishing else branch is generated since we want to explicitly state the reason for taking each particular branch

The context-sensitiveness of the production rules places heavy requirements for the parsing algorithm Rekers and Schurr discuss the problems related to such graph grammar structures in [15] Fortunately by having a stable set of production rules instead of an arbitrary one our parsing procedure is reduced to a graph-matching problem with a fixed set of subgraphs to be searched and identified

In practice i t is often important to express the reason for some or all of the transitions explicitly We define two modes for our algorithm non-verbose mode and verbose mode The former is a loose form of a transformation where we only present those conditions that are used together with if else if or while instructions In this form the exit conditions from iterations and transitions in the sequential blocks rule are not transformed into pseudocode as assertions (shown in brackets in the computation rules)

The graph grammar is actually ambiguous in the sense that it is possible to produce different derivations for the same state machine Naturally the order does not affect the validness of the pseudocode as long as the parsing succeeds We use a tail-recursive

return otherwise

323

algorithm that parses the state machine interpreted as a graph and searches for subgraph structures corresponding to theproduction rules forcing a particular precedence order

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram guards and iterations UML does not specify their content so they are handled as ordinary strings Note that some preprocessing is done on the sequence diagram Stereotyped messages for object creation and destruction (m-eaten and (ltdestroygt respectively) are transformed into Java-like expressions using new and delete

I

I l l i f i L l d l I I

eo I til

W I Figure 2 Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules together with the generated pseudocode including assertions and the right side shows simplest state machine [ l ] that does not conform to the production rules described previously The reason is the two-directional connection between the two subgraphs of the if-else if -structure which violates the single entry principle of substructures The handwritten example shows however that even though the state machine can be expressed without using explicit state variables or goto instructions the original visualization is clearer In cases like this the structured implementation scheme provided for the designer is the state machine itself not the pseudocode

4 Exploitation in a UML tool

The UML class diagram notation supports notes that can be used eg to attach code fragments to specific model elements In this section we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams We illustrate the results with a simple example run on an existing tool prototype In order to use the mechanisms described earlier we have integrated them into the TED tool [23] First we export a set of sequence diagrams from the TED repository describing different usage scenarios of a particular operation We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED We then further export this state machine for operation description synthesis Finally the resulting pseudocode is imported to TED as a UML note and placed to the appropriate class diagram When desired the whole process can be fully automated

324

As an example consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox When the button is pressed the state of each element (listed in the listbox) is updated on the basis of the model If the state has become inconsistent the element is deleted and removed from the list and a replacing new element is added We will use this example throughout the rest of this paper

Figure 3. Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once. The listbox contains only one element (a), whose state is inconsistent. The element (a) is deleted and replaced by a new, updated element (b). In order to give a more comprehensive example, we construct two other sequence diagrams describing different execution paths. All three sequence diagrams describe the behavior related to a single use case. The second sequence diagram describes a case where the listbox is empty, and the third one describes a situation where the elements in the listbox are all valid.

We then call the state machine synthesizer component, which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBox.update() operation. The view of the resulting state machine is shown next to the example sequence diagram in Figure 3.

Figure 4 shows the operation description for ListBox.update() synthesized from the state machine in Figure 3, together with the class diagram synthesized directly from the sequence diagrams. The pseudocode shows a while loop with an enclosing if-structure, object destruction and creation actions, and one assertion. In short, the listbox object checks whether the element list is empty and, if not, goes through the immutable elements in the element list one by one. If an element is not consistent, it is removed and replaced with a new, updated element. The pseudocode representation shows essentially the same information as the state machine in Figure 3, but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation.
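To make the shape of the described operation concrete, the following Python sketch mirrors the prose: an enclosing if-structure, a while loop over the elements, and a destroy-and-create step for inconsistent elements. It is an illustrative reading only, not the output of the tool. The class names `ListBox` and `Element`, the `consistent` flag, and the naming of the replacement element are assumptions introduced here.

```python
class Element:
    """Hypothetical list element with a consistency flag."""

    def __init__(self, name, consistent=True):
        self.name = name
        self.consistent = consistent


class ListBox:
    """Hypothetical listbox holding the elements shown in the dialog."""

    def __init__(self, elements):
        self.element_list = list(elements)

    def update(self):
        # Enclosing if-structure: an empty list needs no work.
        if self.element_list:
            i = 0
            # While loop: visit the elements one by one.
            while i < len(self.element_list):
                a = self.element_list[i]
                if not a.consistent:
                    # Destruction: drop the stale element from the list.
                    del self.element_list[i]
                    # Creation: put a new, updated element in its place.
                    b = Element(a.name + "_new")
                    self.element_list.insert(i, b)
                i += 1


box = ListBox([Element("a", consistent=False)])
box.update()
print([e.name for e in box.element_list])
```

The three scenarios of the example (one inconsistent element, an empty list, and a list of valid elements) correspond to the three paths through `update()`.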


Figure 4. Synthesized operation description

As a final remark, our technique opens up an interesting possibility to automatically obtain slices of operation implementations as required by a certain functionality (represented by a certain set of sequence diagrams). Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams, the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams, and in this way examine the implementation in a limited context. This allows the designer to focus on one functionality at a time, making it easier to understand and correct the behavior. Eventually, the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams, merging the partial implementations. This approach is illustrated in Figure 5.

Figure 5. Example of combining operation descriptions

In Figure 5, we have two sets of sequence diagrams describing the realization of two distinct use cases. We can synthesize a state machine and generate pseudocode for both use cases individually, but we can also incrementally produce a new composite state machine. This state machine can then be used for the generation of a combined implementation scheme.

5. Related research

Biermann and Krishnaswamy present an algorithm for synthesizing programs from their traces [2]. This algorithm has been used as a basis for our state machine synthesis. The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces. A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions. Essentially, the user gives traces of the expected program, and the algorithm produces the smallest program that is capable of executing the given example traces. Moreover, after giving some finite number of example traces taken from a program, the algorithm produces a program that can execute exactly the same set of traces as the original one; that is, the algorithm learns (or infers) an unknown program. The program is represented as a transition graph where the nodes are instructions and the transitions are conditions. A sequence diagram corresponds to a trace in the algorithm by Biermann and Krishnaswamy: sent messages are treated like instructions and received messages like conditions. Since their method is intended for producing program code, it is suited, in principle, for pseudocode generation. However, the output of the algorithm by Biermann and Krishnaswamy is a transition graph. In our approach, a state machine is an intermediate step. Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine.
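The correspondence between sequence diagrams and traces can be illustrated with a small Python sketch. It is a deliberately naive stand-in, not the algorithm of [2]: states are simply merged whenever they carry the same sent message, and no minimization is attempted. The function name `synthesize`, the trace encoding as `("send", label)` and `("recv", label)` pairs, and the message labels are all assumptions made for this illustration.

```python
def synthesize(traces):
    """Merge traces into a set of (source, condition, target) transitions.

    A sent message becomes a state (an instruction); a received message
    becomes the condition on the transition into the next state.
    """
    machine = set()
    for trace in traces:
        state, condition = "start", None
        for kind, label in trace:
            if kind == "recv":
                condition = label  # remember the pending condition
            elif kind == "send":
                machine.add((state, condition, label))
                state, condition = label, None
            else:
                raise ValueError(f"unknown item kind: {kind!r}")
        # Close the trace with a transition to a final state.
        machine.add((state, condition, "end"))
    return machine


# Hypothetical traces seen from the listbox object's point of view.
inconsistent = [
    ("recv", "update()"),
    ("send", "isConsistent(a)"),
    ("recv", "inconsistent"),
    ("send", "remove(a)"),
    ("send", "delete a"),
    ("send", "b = new Element"),
    ("send", "add(b)"),
]
all_valid = [
    ("recv", "update()"),
    ("send", "isConsistent(a)"),
    ("recv", "consistent"),
]

full = synthesize([inconsistent, all_valid])
partial = synthesize([all_valid])
print(len(full), partial <= full)
```

Because the sketch only takes unions of transitions, the machine built from a subset of the traces is always contained in the machine built from the full set, which loosely reflects the idea of examining one functionality at a time and merging later.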

Normark introduces an approach to extract static models from a set of scenarios [14]. The proposed technique has been implemented in a tool called DYNAMO. A scenario consists of participating objects and messages sent between them, shown in a UML sequence diagram fashion. In addition, pre- and postconditions can be set at any point in the scenario diagram. Our approach does not require any extensions to the UML sequence diagram notation. On the other hand, the pre- and postconditions enable DYNAMO to produce more complete code compared to our pseudocode generator. The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax, while we aim to visualize the extracted information as annotated UML class diagrams.

6. Future work

In this paper, we have discussed simple UML sequence diagrams consisting only of objects and messages between them. This basic form of a sequence diagram is extended in UML with various other concepts, for instance conditional branching, iteration, recursion, and explicit return messages. In addition, states of objects can be attached to a sequence diagram.

We aim to extend our approach to handle the full UML sequence diagram notation. The state machine synthesizer, for instance, can easily be extended to take the above-mentioned notation concepts into account. For example, a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams, each of them expressing one branch in the structure. From the sender's point of view, the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message.
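The parsing step described above could look roughly like the following Python sketch. It assumes that a guarded message is written as `[guard] name`, which is one common way of writing guards in sequence diagrams; the function name `split_guarded_message` and the `("recv", ...)` / `("send", ...)` encoding are hypothetical and not taken from the tool.

```python
import re

# Assumed label syntax: an optional "[guard]" prefix followed by the message name.
_GUARDED = re.compile(r"^\s*\[(?P<guard>[^\]]*)\]\s*(?P<name>.+?)\s*$")


def split_guarded_message(label):
    """Split a message label into trace items from the sender's point of view.

    The guard becomes a received message (later a transition condition),
    and the message name becomes an ordinary sent message.
    """
    match = _GUARDED.match(label)
    if match is None:
        # No guard: the label is just a sent message.
        return [("send", label.strip())]
    return [
        ("recv", match.group("guard").strip()),
        ("send", match.group("name")),
    ]


print(split_guarded_message("[inconsistent] remove(a)"))
print(split_guarded_message("update()"))
```

Treating the guard as a received message means the synthesizer itself needs no special case for conditions; a guarded message is just two consecutive trace items.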


The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines. Such extensions could include the generation of labels and goto, break, and continue instructions.

Although it has not been our primary aim, in principle this approach could be extended to actual code generation: instead of pseudocode, we could easily follow the syntax of an actual language. The problem is, however, the level of information available in sequence diagrams. Usually, sequence diagrams are employed in early stages of design, when detailed information is not available or sensible. If sequence diagrams were augmented with detailed implementation-level constructs, like assignments and other primitive expressions, generating executable code would become possible. However, the current UML sequence diagram notation gives poor support for describing these kinds of features.

In order to validate the approach described in this paper, a more comprehensive case study is needed. We expect to be able to experiment with the technique in a larger real-life project in order to determine the strong and weak points of the method.

7. Conclusions

In this paper, we discussed a technique for generating structured implementation schemes, presented as pseudocode, from UML sequence diagrams. The pseudocode generation consists of two steps: synthesizing state machines for operations and generating pseudocode from the state machines. We also discussed how a class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions. The proposed techniques are implemented in a real-life UML design tool, the Nokia TED. We presented an example of how the techniques were used in diagram synthesis. We also presented possible extensions and future research topics.

The proposed techniques provide help for the designer in the early phases of the design process. The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively; they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design. It is our belief that, as such, the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations, as implied by a set of sequence diagrams.

Acknowledgements. The authors wish to thank Erkki Makinen and Markku Sakkinen for their valuable comments. This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898), Nokia, Metso Automation, Sensor Software Consulting, Ebsolut, and Plenware.

8. References

[1] Aho, A.V. and Ullman, J.D., The Theory of Parsing, Translation, and Compiling, vol. II, Prentice-Hall, 1973.

[2] Biermann, A.W. and Krishnaswamy, R., Constructing programs from example computations, IEEE Trans. Softw. Eng. 2(3), 1976, pp. 141-153.

[3] Booch, G., Rumbaugh, J., and Jacobson, I., The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[4] Dahl, O.-J., Dijkstra, E.W., and Hoare, C.A.R., Structured Programming, Academic Press, 1972.

[5] Gamma, E., Helm, R., Johnson, R., and Vlissides, J., Design Patterns - Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.


[6] Harel, D., Statecharts: A Visual Formalism for Complex Systems, Science of Computer Programming 8, 1987, pp. 231-274.

[7] Harel, D., Lachover, H., Naamad, A., Pnueli, A., Politi, M., Sherman, R., Shtull-Trauring, A., and Trakhtenbrot, M., STATEMATE: A Working Environment for the Development of Complex Reactive Systems, IEEE Trans. Softw. Eng. 16(4), 1990, pp. 403-414.

[8] I-Logix, 2000. On-line at http://www.ilogix.com.

[9] Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Videira Lopes, C., Loingtier, J.-M., and Irwin, J., Aspect-Oriented Programming, in Proc. of ECOOP 97, LNCS 1241, Springer-Verlag, 1997, pp. 220-243.

[10] Koskimies, K. and Makinen, E., Automatic Synthesis of State Machines from Trace Diagrams, Softw. Pract. & Exper. 24(7), 1994, pp. 643-658.

[11] Koskimies, K., Mannisto, T., Systa, T., and Tuomi, J., Automated Support for Modeling OO Software, IEEE Software 15(1), January/February 1998, pp. 87-94.

[12] Makinen, E. and Systa, T., MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML, in Proc. of ICSE 2001, Toronto, Canada, 2001, pp. 15-24.

[13] Marriott, K. and Meyer, B. (eds.), Visual Language Theory, Springer, 1998.

[14] Normark, K., Synthesis of Program Outlines from Scenarios in DYNAMO, Aalborg University, 1998. On-line at http://www.cs.auc.dk/~normark/dynamo.html.

[15] Rekers, J. and Schurr, A., A Graph Grammar Approach to Graphical Parsing, in Proc. of the 11th International IEEE Symposium on Visual Languages, IEEE, 1995.

[16] Rumbaugh, J., Jacobson, I., and Booch, G., The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[17] Selonen, P., Koskimies, K., and Sakkinen, M., How to Make Apples from Oranges in UML, in Proc. of HICSS-34 (CD-ROM), Maui, Hawaii, 2001.

[18] Systa, T., Static and Dynamic Reverse Engineering Techniques for Java Software Systems, Dept. of Computer and Information Sciences, University of Tampere, Report A-2000-4, PhD Dissertation, 2000.

[19] Taentzer, G., Ermel, C., and Rudolf, M., The AGG Approach: Language and Tool Environment, in the Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, World Scientific, 1999.

[20] The Unified Modeling Language Notation Guide v1.3, OMG, 1999. On-line at http://www.rational.com.

[21] The Unified Modeling Language Semantics v1.3, OMG, 1999. On-line at http://www.rational.com.

[22] Whittle, J. and Schumann, J., Generating Statechart Designs From Scenarios, in Proc. of ICSE'00, Limerick, Ireland, 2000, pp. 314-323.

[23] Wikman, J., Evolution of a Distributed Repository-Based Architecture, in Proc. of NOSA'98, 1998. On-line at http://www.hk-r.se/fou/forskinfo.nsf.


The production rules are shown in Figure 1. The non-terminals are marked with labels S, G, and B: S is the start symbol, G stands for a structured subgraph, and B stands for a branch in a conditional structure. Note that a start state is always required. The terminals of the grammar are states and state transitions. Individual state transitions are marked with labels s and t. T stands for a (possible) set of one or more transitions resulting from the repeated application of rule 4.2. Some of the productions contain context vertices that are required to be present but are not modified during transformation. The context vertices are indicated using dashed lines.

With the given production rules, it is assumed that the non-terminal symbol G has unique entry and exit nodes (that is, nodes with incoming or outgoing edges). If G produces empty, the possible single incoming edge is assimilated with the possible single outgoing edge. Otherwise, the incoming and outgoing edges are removed.

Each non-terminal of the graph grammar is associated with a code attribute. The value of the code attribute is the pseudocode representing that particular instance of the non-terminal (that is, a subgraph).

The computation rules for the code attribute are the following, corresponding to the production rules:

1. S.code = G.code
2.1 G.code = empty
2.2 G.code = P(q) R(q)
    where R(q) is defined as follows: R(q) = empty if q has outgoing edges, return otherwise
3. G.code = G1.code [ assert( L(t) ) ] G2.code
4.1 G.code = P(q) if( L(t) ) { G1.code } B.code G2.code
4.2 B.code = B1.code B2.code
4.3 B.code = else if( L(t) ) { G.code }
5. G.code = do { P(q) [ assert( L(s) ) ] G.code } while ( L(t) ) [ assert( L(t2) ) ]
6. G.code = P(q) while ( L(s) ) { G.code [ assert( L(t) ) ] P(q) } [ assert( L(t2) ) ]

Apart from the set of branching rules, most of the production rules are reasonably intuitive. Rule 4.1 is used for producing a branching structure of two or more branches, beginning with a transition t. Rule 4.2 is used for duplicating an arbitrary number of additional else if branches, which are then processed by rule 4.3. Note that no finishing else branch is generated, since we want to explicitly state the reason for taking each particular branch.

The context-sensitivity of the production rules places heavy requirements on the parsing algorithm. Rekers and Schurr discuss the problems related to such graph grammar structures in [15]. Fortunately, by having a stable set of production rules instead of an arbitrary one, our parsing procedure is reduced to a graph-matching problem with a fixed set of subgraphs to be searched and identified.

In practice, it is often important to express the reason for some or all of the transitions explicitly. We define two modes for our algorithm: non-verbose mode and verbose mode. The former is a loose form of a transformation where we only present those conditions that are used together with if, else if, or while instructions. In this form, the exit conditions from iterations and the transitions in the sequential blocks rule are not transformed into pseudocode as assertions (shown in brackets in the computation rules).
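The difference between the two modes can be sketched in Python for two of the computation rules. The sketch assumes one particular reading of the rules: the sequential rule as `G1.code [ assert( L(t) ) ] G2.code` and the while rule as `P(q) while ( L(s) ) { G.code [ assert( L(t) ) ] P(q) } [ assert( L(t2) ) ]`. The function names, the representation of code as lists of lines, and the example labels are assumptions made for illustration.

```python
def indent(lines):
    """Indent a block of pseudocode lines by one level."""
    return ["    " + line for line in lines]


def sequence_code(g1, label, g2, verbose=False):
    """Sequential rule: the transition label appears only in verbose mode."""
    assertion = [f"assert( {label} )"] if verbose else []
    return g1 + assertion + g2


def while_code(p_q, l_s, g, l_t, l_t2, verbose=False):
    """While rule: the loop condition is always shown, assertions are optional."""
    inner = g + ([f"assert( {l_t} )"] if verbose else []) + p_q
    code = p_q + [f"while ( {l_s} ) {{"] + indent(inner) + ["}"]
    if verbose:
        # Exit condition of the iteration, stated explicitly.
        code.append(f"assert( {l_t2} )")
    return code


loop = while_code(
    p_q=["x = next()"],
    l_s="x != null",
    g=["use(x)"],
    l_t="used",
    l_t2="x == null",
    verbose=True,
)
print("\n".join(loop))
```

In non-verbose mode the same call yields only the loop itself, so the designer sees the control structure without the reasons for leaving it.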

The graph grammar is actually ambiguous in the sense that it is possible to produce different derivations for the same state machine. Naturally, the order does not affect the validity of the pseudocode as long as the parsing succeeds. We use a tail-recursive algorithm that parses the state machine, interpreted as a graph, and searches for subgraph structures corresponding to the production rules, forcing a particular precedence order.

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram: guards and iterations. UML does not specify their content, so they are handled as ordinary strings. Note that some preprocessing is done on the sequence diagram. Stereotyped messages for object creation and destruction (<<create>> and <<destroy>>, respectively) are transformed into Java-like expressions using new and delete.
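The preprocessing of stereotyped messages can be sketched as a small Python function. The function name `to_expression`, its parameters, and the exact textual form of the produced expressions are assumptions made for illustration; the paper only states that creation and destruction are expressed with new and delete.

```python
def to_expression(stereotype, target, class_name=None):
    """Turn a stereotyped message into a Java-like expression string.

    Returns None for messages that carry no creation or destruction
    stereotype, so that the caller can keep the original message text.
    """
    if stereotype == "create":
        if class_name is None:
            raise ValueError("a created object needs a class name")
        return f"{target} = new {class_name}"
    if stereotype == "destroy":
        return f"delete {target}"
    return None


print(to_expression("destroy", "a"))
print(to_expression("create", "b", "Element"))
```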

Figure 2. Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules together with the generated pseudocode including assertions and the right side shows simplest state machine [ l ] that does not conform to the production rules described previously The reason is the two-directional connection between the two subgraphs of the if-else if -structure which violates the single entry principle of substructures The handwritten example shows however that even though the state machine can be expressed without using explicit state variables or goto instructions the original visualization is clearer In cases like this the structured implementation scheme provided for the designer is the state machine itself not the pseudocode

4 Exploitation in a UML tool

The UML class diagram notation supports notes that can be used eg to attach code fragments to specific model elements In this section we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams We illustrate the results with a simple example run on an existing tool prototype In order to use the mechanisms described earlier we have integrated them into the TED tool [23] First we export a set of sequence diagrams from the TED repository describing different usage scenarios of a particular operation We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED We then further export this state machine for operation description synthesis Finally the resulting pseudocode is imported to TED as a UML note and placed to the appropriate class diagram When desired the whole process can be fully automated

324

As an example consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox When the button is pressed the state of each element (listed in the listbox) is updated on the basis of the model If the state has become inconsistent the element is deleted and removed from the list and a replacing new element is added We will use this example throughout the rest of this paper

Figure 3 Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once The listbox contains only one element (a) whose state is inconsistent The element (a) is deleted and replaced by a new updated element (b) In order to give a more comprehensive example we construct two other sequence diagrams describing different execution paths All three sequence diagrams describe the behavior related to a single use case The second sequence diagram describes a case where the listbox is empty and the third one describes a situation where the elements in the listbox are all valid

We then call the state machine synthesizer component which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBoxupdate() operation The view of the resulting state machine is shown next to the example sequence diagram in Figure 3

Figure 4 shows the operation description for ListBoxupdate() synthesized from the state machine in Figure 3 together with the class diagram synthesized directly from the sequence diagrams The pseudocode shows a while loop with an enclosing if-structure object destruction and creation actions and one assertion In short the listbox object checks whether the element list is empty and if not goes trough the immutable elements in the element list one by one If an element is not consistent it is removed and replaced with a new updated element The pseudocode representation shows essentially the same information as the state machine in Figure 3 but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation

325

aupdatc0 fl inconsistent) I

clomentLst rcmovco delete a b = new Element

elementLstadd)

S y n l k s m d C L D S y n l k i l n d CLD L i S t B O X

I

S y n t k i m d CLD S y n t k r m d C L D

Figure 4 Synthesized operation description

As a final remark our technique opens up an interesting possibility to obtain automatically slices of operation implementations as required by certain functionality (represented by a certain set of sequence diagrams) Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams and in this way examine the implementation in a limited context This allows the designer to focus on one functionality at a time making i t easier to understand and correct the behavior Eventually the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams merging the partial implementations This approach is illustrated in Figure 5

Figure 5 Example of combining operation descriptions

In Figure 5 we have two sets of sequence diagrams describing the realization of two distinct use cases We can synthesize a state machine and generate pseudocode for both use

326

cases individually but we can also incrementally produce a new composite state machine This state machine can then be used for the generation of a combined implementation scheme

5 Related research

Biermann and Krisnashwamy present an algorithm for synthesizing programs from their traces [2] This algorithm has been used as a basis for our state machine synthesis The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions Essentially the user gives traces of the expected program and the algorithm produces the smallest program that is capable of executing the given example traces Moreover after giving some finite number of example traces taken from a program the algorithm produces a program that can execute exactly the same set of traces as the original one that is the algorithm learns (or infers) an unknown program The program is represented as a transition graph where the nodes are instructions and transitions are conditions A sequence diagram corresponds to a trace in the algorithm by Biermann and Krisnashwamy sent messages are treated like instructions and received messages like conditions Since their method is intended for producing program code it is suited in principle for pseudocode generation However the output of the algorithm by Biermann and Krisnashwamy is a transition graph In our approach a state machine is an intermediate step Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine

Normark introduces an approach to extract static models from a set of scenarios [14] The proposed technique has been implemented in a tool called DYNAMO A scenario consists of participating objects and messages sent between them shown in a UML sequence diagram fashion In addition pre and post conditions can be set at any point in the scenario diagram Our approach does not require any extensions to the UML sequence diagram notation On the other hand the pre and post conditions enable DYNAMO to produce more complete code compared to our pseudocode generator The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax while we aim to visualize the extracted information as annotated UML class diagrams

6 Future work

In this paper we have discussed simple UML sequence diagrams consisting only of objects and messages between them This basic form of a sequence diagram is extended in UML with various other concepts for instance conditional brunching iteration recursion and explicit return messages In addition states of objects can be attached to a sequence diagram

We aim to extend our approach to handle the full UML sequence diagram notation The state machine synthesizer for instance can easily be extended to take the above mentioned notation concepts into account For example a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams each of them expressing one branch in the structure From the senders point of view the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message

327

The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines Such extensions could include the generation of labels and goto break and continue instructions

Although it has not been our primary aim in principle this approach could be extended to actual code generation instead of pseudocode we could easily follow the syntax of an actual language The problem is however the level of information available in sequence diagrams Usually sequence diagrams are employed in early stages of design when detailed information is not available or sensible If sequence diagrams were augmented with detailed implementation-level constructs like assignments and other primitive expressions generating executable code would become possible However the current UML sequence diagram notation gives poor support for describing these kinds of features

In order to validate the approach described in this paper a more comprehensive case study is in order We expect to be able to experiment with the technique with a larger real- life project in order to determine the strong and weak points of the method

7 Conclusions

In this paper we discussed a technique for generating structured implementation schemes presented as pseudocode from UML sequence diagrams The pseudocode generation consists of two steps synthesizing state machines for operations and generating pseudocode from the state machines We also discussed howa class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions The proposed techniques are implemented in a real-life UML design tool the Nokia TED W e presented an example of how the techniques were used in diagram synthesis We also presented possible extensions and future research topics

The proposed techniques provide help for the designer in the earlyphases of the design process The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design It is our belief that as such the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations as implied by a set of sequence diagrams

Acknowledgements The authors wish to thank Erkki Makinen and Markku Sakkinen for their valuable comments This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898) Nokia Metso Automation Sensor Software Consulting Ebsolut and Plenware

8 References

[ I ] Aho AV and Ullinan JD The Theory of Parsing Translation and Compiling vol 11 Prentice-Hall 1973

121 Biermann AW and Kriahnaswamy U Constructing prosrams from example computations IEEE Trans Softw En 2(3) 1976 pp 141-153

[31 Booch G Rumbaugh J and Jacobson 1 The Unified Modeling Language User Guide Addison-Wesley 1999

[4] Dah1 0 -J Dijkstra EW and Hoare CAR Structured Programming Academic Press 1972

[SI Gamma E Helm R Johnson U and Vlissides J Design Patterns - Elements of Reusable Object-Oriented Software Addison-Wesley 1994

328

[6] Harel D Statecharts A Visual Formalism for Complex Systems Science of Computer Programming 8 1987 pp 231-274

[7] Harel D Lachover H Naamad A Pnueli A Politi M Sherman R Shtull-Tauring A and Trakhtenbrot M STATEMATE A Working Environment for the Devehpment of Complex Reactive Systems IEEE Trans Softw Eng 16(4) 1990 pp 403414

[8] I-Logix 2000 On-line at httpww~ilooixcoiri

[9] Kiczales G Lamping J Mendhekar A Maeda C Videira Lopes C Loingtier J-M and Irwin J Aspect- Oriented Programming in Proc of ECOOP 97 LNCS- 1241 Springer-Verlag 1997 pp 220-243

[IO] Koskimies K and Makinen E Automatic Synthesis of State Machines from Trace Diagrams Softw Pract amp Exper 24(7) 1994 pp 643-658

[ I I ] Koskimies K Mannisto T Systh T and Tuomi J Automated Support for Modeling 00 Software IEEE Software 15( 1 ) JanuaryFebruary 1998 pp 87-94

[ I21 Mikinen E and Systa T MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML in Proc of ICSE 2001 Toronto Canada 2001 pp IS-24

[ 131 Marriot K and Meyer B (eds) Visual Language Theory Springer 1998

1141 n a r k K Synthesis of Program Outlines from Scenarios in DYNAMO Aalborg University 1998 On-line at httpwwwcsaucdW-normarWdynamohtm1

[ 151 Rekers J and Schurr A A Graph Grammar Approach to Graphical Parsing in Proc of fh r I h

Inremntional IEEE Symposium 017 Vimcd Languages IEEE 1995

[ 161 Rumbaugh J Jacobson I and Booch G The Unified Modeling Language Reference Manual Addison- Wesley 1999

[I71 Selonen P Koskimies K and Sakkinen M How to Make Apples from Oranges in UML in Proc of HICSS-34 (CD-ROM) Maui Hawaii 2001

[IS] Systa T Static and Dynamic Reverse Engineering Techniques for Java Software Systems Dept of Computer and Information Sciences University of Tampere Report A-2000-4 PhD Dissertation 2000

[I91 Taentzer G Ermel C and Rudolf M The AGG Approach Language and Tool Environment thc Handbook of Graph Grammars and Computing by Graph Transformation Volume 2 Applications Languages and Tools World Scientific 1999

[20] The Unified Modeling Language Notation Guide v13 OMG 1999 On-line at httpwwwrationalcom

[21] The Unified Modeling Language Semantics V I 3 OMG 1999 On-line at httpwwwrationalcom

[22] Whittle J and Schumann J Generating Statechart Designs From Scenarios in Proc of ICSEOO Limerick Ireland 2000 pp 314-323

[ 2 3 ] Wikman J Evolution of a Distributed Repository-Based Architecture in Proc of NOSA98 1998 On-line at httplwwwhk-rsefouforskinfonsf

Page 7: Generating Structured Implementation Schemes from Sequence ... · A class diagram is used to specify the static structure of a system in UML. Class diagrams are often annotated with

323

algorithm that parses the state machine interpreted as a graph and searches for subgraph structures corresponding to theproduction rules forcing a particular precedence order

UML has two fairly weak additional features that can be taken into account when interpreting messages in a sequence diagram guards and iterations UML does not specify their content so they are handled as ordinary strings Note that some preprocessing is done on the sequence diagram Stereotyped messages for object creation and destruction (m-eaten and (ltdestroygt respectively) are transformed into Java-like expressions using new and delete

I

I l l i f i L l d l I I

eo I til

W I Figure 2 Examples of accepted and rejected state machines

The left side of Figure 2 shows an example of a state machine that is accepted by the production rules together with the generated pseudocode including assertions and the right side shows simplest state machine [ l ] that does not conform to the production rules described previously The reason is the two-directional connection between the two subgraphs of the if-else if -structure which violates the single entry principle of substructures The handwritten example shows however that even though the state machine can be expressed without using explicit state variables or goto instructions the original visualization is clearer In cases like this the structured implementation scheme provided for the designer is the state machine itself not the pseudocode

4. Exploitation in a UML tool

The UML class diagram notation supports notes that can be used, e.g., to attach code fragments to specific model elements. In this section we show how pseudocode generation can be exploited in the generation of class diagrams on the basis of sequence diagrams. We illustrate the results with a simple example run on an existing tool prototype. In order to use the mechanisms described earlier, we have integrated them into the TED tool [23]. First, we export a set of sequence diagrams from the TED repository, describing different usage scenarios of a particular operation. We synthesize a state machine from the sequence diagrams for a desired operation and import the resulting state machine back into TED. We then further export this state machine for operation description synthesis. Finally, the resulting pseudocode is imported to TED as a UML note and placed in the appropriate class diagram. When desired, the whole process can be fully automated.

As an example, consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox. When the button is pressed, the state of each element (listed in the listbox) is updated on the basis of the model. If the state has become inconsistent, the element is deleted and removed from the list, and a new replacement element is added. We will use this example throughout the rest of this paper.

Figure 3. Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once. The listbox contains only one element (a), whose state is inconsistent. The element (a) is deleted and replaced by a new, updated element (b). In order to give a more comprehensive example, we construct two other sequence diagrams describing different execution paths. All three sequence diagrams describe the behavior related to a single use case. The second sequence diagram describes a case where the listbox is empty, and the third one describes a situation where the elements in the listbox are all valid.

We then call the state machine synthesizer component, which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBox.update() operation. The resulting state machine is shown next to the example sequence diagram in Figure 3.

Figure 4 shows the operation description for ListBox.update(), synthesized from the state machine in Figure 3, together with the class diagram synthesized directly from the sequence diagrams. The pseudocode shows a while loop with an enclosing if-structure, object destruction and creation actions, and one assertion. In short, the listbox object checks whether the element list is empty and, if not, goes through the immutable elements in the element list one by one. If an element is not consistent, it is removed and replaced with a new, updated element. The pseudocode representation shows essentially the same information as the state machine in Figure 3, but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation.
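To make the described control structure concrete, the following is a minimal Python sketch of an operation with that shape: an enclosing if, a while loop over the elements, replacement of inconsistent elements, and one assertion. It is a reconstruction from the prose only, not the pseudocode of Figure 4. The `Element` fields, the consistency rule, the dictionary-based model, and the placement of the assertion are all assumptions. Python has no `delete`, so destruction is represented by dropping the element from the list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    key: str
    state: int

    def is_consistent(self, model: Dict[str, int]) -> bool:
        # Hypothetical consistency rule: the element mirrors the model's value.
        return model.get(self.key) == self.state


@dataclass
class ListBox:
    elements: List[Element] = field(default_factory=list)

    def update(self, model: Dict[str, int]) -> None:
        # Assumes the model holds a value for every element key.
        if self.elements:                              # enclosing if: empty list
            i = 0
            while i < len(self.elements):              # elements one by one
                a = self.elements[i]
                if not a.is_consistent(model):
                    self.elements.pop(i)               # remove a from the list
                    b = Element(a.key, model[a.key])   # b = new Element
                    self.elements.insert(i, b)         # add the replacement
                i += 1
            # Assumed assertion: every element now agrees with the model.
            assert all(e.is_consistent(model) for e in self.elements)
```

The three scenarios of the example map onto this sketch directly: a single inconsistent element is replaced by a new object, an empty listbox skips the loop entirely, and a listbox whose elements are all valid keeps its original objects.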

Figure 4. Synthesized operation description

As a final remark, our technique opens up an interesting possibility to automatically obtain slices of operation implementations as required by a certain functionality (represented by a certain set of sequence diagrams). Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams, the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams, and in this way examine the implementation in a limited context. This allows the designer to focus on one functionality at a time, making it easier to understand and correct the behavior. Eventually, the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams, merging the partial implementations. This approach is illustrated in Figure 5.

Figure 5. Example of combining operation descriptions

In Figure 5 we have two sets of sequence diagrams describing the realization of two distinct use cases. We can synthesize a state machine and generate pseudocode for both use cases individually, but we can also incrementally produce a new composite state machine. This state machine can then be used for the generation of a combined implementation scheme.

5. Related research

Biermann and Krishnaswamy present an algorithm for synthesizing programs from their traces [2]. This algorithm has been used as a basis for our state machine synthesis. The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces. A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions. Essentially, the user gives traces of the expected program, and the algorithm produces the smallest program that is capable of executing the given example traces. Moreover, after being given some finite number of example traces taken from a program, the algorithm produces a program that can execute exactly the same set of traces as the original one; that is, the algorithm learns (or infers) an unknown program. The program is represented as a transition graph, where the nodes are instructions and the transitions are conditions. A sequence diagram corresponds to a trace in the algorithm by Biermann and Krishnaswamy: sent messages are treated like instructions and received messages like conditions. Since their method is intended for producing program code, it is suited in principle for pseudocode generation. However, the output of the algorithm by Biermann and Krishnaswamy is a transition graph. In our approach, a state machine is an intermediate step. Our goal is a pseudocode presentation that includes algorithmic structures found in the state machine.
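The correspondence between a sequence diagram and a trace can be sketched in a few lines of Python. The `Message` tuple, the function `trace_for`, and the example messages are hypothetical. The sketch covers only the mapping stated above (sent messages become instructions, received messages become conditions) and not the synthesis algorithm itself. A message an object sends to itself is treated as an instruction, which is an assumption of this sketch.

```python
from typing import List, NamedTuple, Tuple


class Message(NamedTuple):
    sender: str
    receiver: str
    name: str


def trace_for(obj: str, diagram: List[Message]) -> List[Tuple[str, str]]:
    """Project a sequence diagram onto one object, in message order."""
    trace = []
    for message in diagram:
        if message.sender == obj:
            trace.append(("instruction", message.name))
        elif message.receiver == obj:
            trace.append(("condition", message.name))
    return trace


# Hypothetical diagram loosely following the listbox example.
diagram = [
    Message("User", "ListBox", "update()"),
    Message("ListBox", "a", "isConsistent()"),
    Message("a", "ListBox", "inconsistent"),
    Message("ListBox", "ElementList", "remove(a)"),
]
```

Projecting `diagram` onto `"ListBox"` gives an alternating list of conditions and instructions that could serve as one input trace for the synthesis.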

Nørmark introduces an approach to extract static models from a set of scenarios [14]. The proposed technique has been implemented in a tool called DYNAMO. A scenario consists of participating objects and messages sent between them, shown in a UML sequence diagram fashion. In addition, pre- and postconditions can be set at any point in the scenario diagram. Our approach does not require any extensions to the UML sequence diagram notation. On the other hand, the pre- and postconditions enable DYNAMO to produce more complete code compared to our pseudocode generator. The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax, while we aim to visualize the extracted information as annotated UML class diagrams.

6. Future work

In this paper we have discussed simple UML sequence diagrams consisting only of objects and messages between them. This basic form of a sequence diagram is extended in UML with various other concepts, for instance conditional branching, iteration, recursion, and explicit return messages. In addition, states of objects can be attached to a sequence diagram.

We aim to extend our approach to handle the full UML sequence diagram notation. The state machine synthesizer, for instance, can easily be extended to take the above-mentioned notation concepts into account. For example, a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams, each of them expressing one branch in the structure. From the sender's point of view, the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message.
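The proposed treatment of guarded messages can be sketched as a small parsing step. In this Python example, the label syntax `[guard] name` and the function name `split_guarded_message` are assumptions; the sketch only shows how one label could be split into a received item (the guard) followed by a sent item (the message name), from the sender's point of view.

```python
import re
from typing import List, Tuple

# Assumed label syntax: an optional "[guard]" prefix followed by the message name.
_GUARDED = re.compile(r"^\s*\[(?P<guard>[^\]]+)\]\s*(?P<name>.+?)\s*$")


def split_guarded_message(label: str) -> List[Tuple[str, str]]:
    """Split a message label into trace items from the sender's point of view."""
    match = _GUARDED.match(label)
    if match is None:
        # Unguarded messages stay ordinary sent messages.
        return [("sent", label.strip())]
    return [
        ("received", match["guard"].strip()),  # becomes a transition
        ("sent", match["name"]),               # becomes a normal action
    ]
```

A label such as `"[inconsistent] remove(a)"` thus contributes two items to the sender's trace, and an unguarded label contributes one.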

The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines. Such extensions could include the generation of labels and goto, break, and continue instructions.

Although it has not been our primary aim, in principle this approach could be extended to actual code generation: instead of pseudocode, we could easily follow the syntax of an actual language. The problem is, however, the level of information available in sequence diagrams. Usually, sequence diagrams are employed in early stages of design, when detailed information is not available or sensible. If sequence diagrams were augmented with detailed implementation-level constructs like assignments and other primitive expressions, generating executable code would become possible. However, the current UML sequence diagram notation gives poor support for describing these kinds of features.

To validate the approach described in this paper, a more comprehensive case study is needed. We expect to be able to experiment with the technique in a larger real-life project in order to determine the strong and weak points of the method.

7. Conclusions

In this paper we discussed a technique for generating structured implementation schemes, presented as pseudocode, from UML sequence diagrams. The pseudocode generation consists of two steps: synthesizing state machines for operations and generating pseudocode from the state machines. We also discussed how a class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions. The proposed techniques are implemented in a real-life UML design tool, the Nokia TED. We presented an example of how the techniques were used in diagram synthesis. We also presented possible extensions and future research topics.

The proposed techniques provide help for the designer in the early phases of the design process. The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively; they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design. It is our belief that, as such, the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations, as implied by a set of sequence diagrams.

Acknowledgements. The authors wish to thank Erkki Mäkinen and Markku Sakkinen for their valuable comments. This research has been financially supported by the National Technology Agency of Finland (TEKES, grant 4090898), Nokia, Metso Automation, Sensor Software Consulting, Ebsolut, and Plenware.

8. References

[1] Aho A.V. and Ullman J.D., The Theory of Parsing, Translation, and Compiling, vol. II, Prentice-Hall, 1973.

[2] Biermann A.W. and Krishnaswamy R., Constructing programs from example computations, IEEE Trans. Softw. Eng. 2(3), 1976, pp. 141-153.

[3] Booch G., Rumbaugh J., and Jacobson I., The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[4] Dahl O.-J., Dijkstra E.W., and Hoare C.A.R., Structured Programming, Academic Press, 1972.

[5] Gamma E., Helm R., Johnson R., and Vlissides J., Design Patterns - Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.

[6] Harel D., Statecharts: A Visual Formalism for Complex Systems, Science of Computer Programming 8, 1987, pp. 231-274.

[7] Harel D., Lachover H., Naamad A., Pnueli A., Politi M., Sherman R., Shtull-Trauring A., and Trakhtenbrot M., STATEMATE: A Working Environment for the Development of Complex Reactive Systems, IEEE Trans. Softw. Eng. 16(4), 1990, pp. 403-414.

[8] I-Logix, 2000. On-line at http://www.ilogix.com.

[9] Kiczales G., Lamping J., Mendhekar A., Maeda C., Videira Lopes C., Loingtier J.-M., and Irwin J., Aspect-Oriented Programming, in Proc. of ECOOP'97, LNCS 1241, Springer-Verlag, 1997, pp. 220-243.

[10] Koskimies K. and Mäkinen E., Automatic Synthesis of State Machines from Trace Diagrams, Softw. Pract. & Exper. 24(7), 1994, pp. 643-658.

[11] Koskimies K., Männistö T., Systä T., and Tuomi J., Automated Support for Modeling OO Software, IEEE Software 15(1), January/February 1998, pp. 87-94.

[12] Mäkinen E. and Systä T., MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML, in Proc. of ICSE 2001, Toronto, Canada, 2001, pp. 15-24.

[13] Marriott K. and Meyer B. (eds.), Visual Language Theory, Springer, 1998.

[14] Nørmark K., Synthesis of Program Outlines from Scenarios in DYNAMO, Aalborg University, 1998. On-line at http://www.cs.auc.dk/~normark/dynamo.html.

[15] Rekers J. and Schürr A., A Graph Grammar Approach to Graphical Parsing, in Proc. of the 11th International IEEE Symposium on Visual Languages, IEEE, 1995.

[16] Rumbaugh J., Jacobson I., and Booch G., The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[17] Selonen P., Koskimies K., and Sakkinen M., How to Make Apples from Oranges in UML, in Proc. of HICSS-34 (CD-ROM), Maui, Hawaii, 2001.

[18] Systä T., Static and Dynamic Reverse Engineering Techniques for Java Software Systems, Dept. of Computer and Information Sciences, University of Tampere, Report A-2000-4, PhD Dissertation, 2000.

[19] Taentzer G., Ermel C., and Rudolf M., The AGG Approach: Language and Tool Environment, in the Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, World Scientific, 1999.

[20] The Unified Modeling Language Notation Guide v1.3, OMG, 1999. On-line at http://www.rational.com.

[21] The Unified Modeling Language Semantics v1.3, OMG, 1999. On-line at http://www.rational.com.

[22] Whittle J. and Schumann J., Generating Statechart Designs From Scenarios, in Proc. of ICSE'00, Limerick, Ireland, 2000, pp. 314-323.

[23] Wikman J., Evolution of a Distributed Repository-Based Architecture, in Proc. of NOSA'98, 1998. On-line at http://www.hk-r.se/fou/forskinfo.nsf.

Page 8: Generating Structured Implementation Schemes from Sequence ... · A class diagram is used to specify the static structure of a system in UML. Class diagrams are often annotated with

324

As an example consider a graphical user interface (GUI) dialog containing a listbox and a button that is used to update the content of the listbox When the button is pressed the state of each element (listed in the listbox) is updated on the basis of the model If the state has become inconsistent the element is deleted and removed from the list and a replacing new element is added We will use this example throughout the rest of this paper

Figure 3 Example sequence diagram and synthesized statechart diagram

Figure 3 shows an example sequence diagram in which the external user presses the update button once The listbox contains only one element (a) whose state is inconsistent The element (a) is deleted and replaced by a new updated element (b) In order to give a more comprehensive example we construct two other sequence diagrams describing different execution paths All three sequence diagrams describe the behavior related to a single use case The second sequence diagram describes a case where the listbox is empty and the third one describes a situation where the elements in the listbox are all valid

We then call the state machine synthesizer component which reads these three sequence diagrams from the TED repository and synthesizes a state machine describing the execution of the ListBoxupdate() operation The view of the resulting state machine is shown next to the example sequence diagram in Figure 3

Figure 4 shows the operation description for ListBoxupdate() synthesized from the state machine in Figure 3 together with the class diagram synthesized directly from the sequence diagrams The pseudocode shows a while loop with an enclosing if-structure object destruction and creation actions and one assertion In short the listbox object checks whether the element list is empty and if not goes trough the immutable elements in the element list one by one If an element is not consistent it is removed and replaced with a new updated element The pseudocode representation shows essentially the same information as the state machine in Figure 3 but in a form that is more readable for programmers and that can be used as a starting point for implementing the operation

325

aupdatc0 fl inconsistent) I

clomentLst rcmovco delete a b = new Element

elementLstadd)

S y n l k s m d C L D S y n l k i l n d CLD L i S t B O X

I

S y n t k i m d CLD S y n t k r m d C L D

Figure 4 Synthesized operation description

As a final remark our technique opens up an interesting possibility to obtain automatically slices of operation implementations as required by certain functionality (represented by a certain set of sequence diagrams) Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams and in this way examine the implementation in a limited context This allows the designer to focus on one functionality at a time making i t easier to understand and correct the behavior Eventually the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams merging the partial implementations This approach is illustrated in Figure 5

Figure 5 Example of combining operation descriptions

In Figure 5 we have two sets of sequence diagrams describing the realization of two distinct use cases We can synthesize a state machine and generate pseudocode for both use

326

cases individually but we can also incrementally produce a new composite state machine This state machine can then be used for the generation of a combined implementation scheme

5 Related research

Biermann and Krisnashwamy present an algorithm for synthesizing programs from their traces [2] This algorithm has been used as a basis for our state machine synthesis The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions Essentially the user gives traces of the expected program and the algorithm produces the smallest program that is capable of executing the given example traces Moreover after giving some finite number of example traces taken from a program the algorithm produces a program that can execute exactly the same set of traces as the original one that is the algorithm learns (or infers) an unknown program The program is represented as a transition graph where the nodes are instructions and transitions are conditions A sequence diagram corresponds to a trace in the algorithm by Biermann and Krisnashwamy sent messages are treated like instructions and received messages like conditions Since their method is intended for producing program code it is suited in principle for pseudocode generation However the output of the algorithm by Biermann and Krisnashwamy is a transition graph In our approach a state machine is an intermediate step Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine

Normark introduces an approach to extract static models from a set of scenarios [14] The proposed technique has been implemented in a tool called DYNAMO A scenario consists of participating objects and messages sent between them shown in a UML sequence diagram fashion In addition pre and post conditions can be set at any point in the scenario diagram Our approach does not require any extensions to the UML sequence diagram notation On the other hand the pre and post conditions enable DYNAMO to produce more complete code compared to our pseudocode generator The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax while we aim to visualize the extracted information as annotated UML class diagrams

6 Future work

In this paper we have discussed simple UML sequence diagrams consisting only of objects and messages between them This basic form of a sequence diagram is extended in UML with various other concepts for instance conditional brunching iteration recursion and explicit return messages In addition states of objects can be attached to a sequence diagram

We aim to extend our approach to handle the full UML sequence diagram notation The state machine synthesizer for instance can easily be extended to take the above mentioned notation concepts into account For example a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams each of them expressing one branch in the structure From the senders point of view the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message

327

The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines Such extensions could include the generation of labels and goto break and continue instructions

Although it has not been our primary aim in principle this approach could be extended to actual code generation instead of pseudocode we could easily follow the syntax of an actual language The problem is however the level of information available in sequence diagrams Usually sequence diagrams are employed in early stages of design when detailed information is not available or sensible If sequence diagrams were augmented with detailed implementation-level constructs like assignments and other primitive expressions generating executable code would become possible However the current UML sequence diagram notation gives poor support for describing these kinds of features

In order to validate the approach described in this paper a more comprehensive case study is in order We expect to be able to experiment with the technique with a larger real- life project in order to determine the strong and weak points of the method

7 Conclusions

In this paper we discussed a technique for generating structured implementation schemes presented as pseudocode from UML sequence diagrams The pseudocode generation consists of two steps synthesizing state machines for operations and generating pseudocode from the state machines We also discussed howa class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions The proposed techniques are implemented in a real-life UML design tool the Nokia TED W e presented an example of how the techniques were used in diagram synthesis We also presented possible extensions and future research topics

The proposed techniques provide help for the designer in the earlyphases of the design process The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design It is our belief that as such the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations as implied by a set of sequence diagrams

Acknowledgements The authors wish to thank Erkki Makinen and Markku Sakkinen for their valuable comments This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898) Nokia Metso Automation Sensor Software Consulting Ebsolut and Plenware

8 References

[ I ] Aho AV and Ullinan JD The Theory of Parsing Translation and Compiling vol 11 Prentice-Hall 1973

121 Biermann AW and Kriahnaswamy U Constructing prosrams from example computations IEEE Trans Softw En 2(3) 1976 pp 141-153

[31 Booch G Rumbaugh J and Jacobson 1 The Unified Modeling Language User Guide Addison-Wesley 1999

[4] Dah1 0 -J Dijkstra EW and Hoare CAR Structured Programming Academic Press 1972

[SI Gamma E Helm R Johnson U and Vlissides J Design Patterns - Elements of Reusable Object-Oriented Software Addison-Wesley 1994

328

[6] Harel D Statecharts A Visual Formalism for Complex Systems Science of Computer Programming 8 1987 pp 231-274

[7] Harel D Lachover H Naamad A Pnueli A Politi M Sherman R Shtull-Tauring A and Trakhtenbrot M STATEMATE A Working Environment for the Devehpment of Complex Reactive Systems IEEE Trans Softw Eng 16(4) 1990 pp 403414

[8] I-Logix 2000 On-line at httpww~ilooixcoiri

[9] Kiczales G Lamping J Mendhekar A Maeda C Videira Lopes C Loingtier J-M and Irwin J Aspect- Oriented Programming in Proc of ECOOP 97 LNCS- 1241 Springer-Verlag 1997 pp 220-243

[IO] Koskimies K and Makinen E Automatic Synthesis of State Machines from Trace Diagrams Softw Pract amp Exper 24(7) 1994 pp 643-658

[ I I ] Koskimies K Mannisto T Systh T and Tuomi J Automated Support for Modeling 00 Software IEEE Software 15( 1 ) JanuaryFebruary 1998 pp 87-94

[ I21 Mikinen E and Systa T MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML in Proc of ICSE 2001 Toronto Canada 2001 pp IS-24

[ 131 Marriot K and Meyer B (eds) Visual Language Theory Springer 1998

1141 n a r k K Synthesis of Program Outlines from Scenarios in DYNAMO Aalborg University 1998 On-line at httpwwwcsaucdW-normarWdynamohtm1

[ 151 Rekers J and Schurr A A Graph Grammar Approach to Graphical Parsing in Proc of fh r I h

Inremntional IEEE Symposium 017 Vimcd Languages IEEE 1995

[ 161 Rumbaugh J Jacobson I and Booch G The Unified Modeling Language Reference Manual Addison- Wesley 1999

[I71 Selonen P Koskimies K and Sakkinen M How to Make Apples from Oranges in UML in Proc of HICSS-34 (CD-ROM) Maui Hawaii 2001

[IS] Systa T Static and Dynamic Reverse Engineering Techniques for Java Software Systems Dept of Computer and Information Sciences University of Tampere Report A-2000-4 PhD Dissertation 2000

[I91 Taentzer G Ermel C and Rudolf M The AGG Approach Language and Tool Environment thc Handbook of Graph Grammars and Computing by Graph Transformation Volume 2 Applications Languages and Tools World Scientific 1999

[20] The Unified Modeling Language Notation Guide v13 OMG 1999 On-line at httpwwwrationalcom

[21] The Unified Modeling Language Semantics V I 3 OMG 1999 On-line at httpwwwrationalcom

[22] Whittle J and Schumann J Generating Statechart Designs From Scenarios in Proc of ICSEOO Limerick Ireland 2000 pp 314-323

[ 2 3 ] Wikman J Evolution of a Distributed Repository-Based Architecture in Proc of NOSA98 1998 On-line at httplwwwhk-rsefouforskinfonsf

Page 9: Generating Structured Implementation Schemes from Sequence ... · A class diagram is used to specify the static structure of a system in UML. Class diagrams are often annotated with

325

aupdatc0 fl inconsistent) I

clomentLst rcmovco delete a b = new Element

elementLstadd)

S y n l k s m d C L D S y n l k i l n d CLD L i S t B O X

I

S y n t k i m d CLD S y n t k r m d C L D

Figure 4 Synthesized operation description

As a final remark our technique opens up an interesting possibility to obtain automatically slices of operation implementations as required by certain functionality (represented by a certain set of sequence diagrams) Since the pseudocode synthesis can be applied to any (sub)set of sequence diagrams the designer can easily see what kind of implementation is required for a certain functionality on the basis of the sequence diagrams and in this way examine the implementation in a limited context This allows the designer to focus on one functionality at a time making i t easier to understand and correct the behavior Eventually the designer can synthesize the complete pseudocode implementation of an operation from the total set of sequence diagrams merging the partial implementations This approach is illustrated in Figure 5

Figure 5 Example of combining operation descriptions

In Figure 5 we have two sets of sequence diagrams describing the realization of two distinct use cases We can synthesize a state machine and generate pseudocode for both use

326

cases individually but we can also incrementally produce a new composite state machine This state machine can then be used for the generation of a combined implementation scheme

5 Related research

Biermann and Krisnashwamy present an algorithm for synthesizing programs from their traces [2] This algorithm has been used as a basis for our state machine synthesis The idea is that the user specifies the data structures of a program and describes (graphically) its expected behavior as example traces A trace consists of primitive instructions (like assignments) and conditions that hold before certain instructions Essentially the user gives traces of the expected program and the algorithm produces the smallest program that is capable of executing the given example traces Moreover after giving some finite number of example traces taken from a program the algorithm produces a program that can execute exactly the same set of traces as the original one that is the algorithm learns (or infers) an unknown program The program is represented as a transition graph where the nodes are instructions and transitions are conditions A sequence diagram corresponds to a trace in the algorithm by Biermann and Krisnashwamy sent messages are treated like instructions and received messages like conditions Since their method is intended for producing program code it is suited in principle for pseudocode generation However the output of the algorithm by Biermann and Krisnashwamy is a transition graph In our approach a state machine is an intermediate step Our goal is a pseudocode presentation that includes algorithmic structures searched from the state machine

Normark introduces an approach to extract static models from a set of scenarios [14] The proposed technique has been implemented in a tool called DYNAMO A scenario consists of participating objects and messages sent between them shown in a UML sequence diagram fashion In addition pre and post conditions can be set at any point in the scenario diagram Our approach does not require any extensions to the UML sequence diagram notation On the other hand the pre and post conditions enable DYNAMO to produce more complete code compared to our pseudocode generator The extracted static models in DYNAMO are presented in terms of an object-oriented programming language syntax while we aim to visualize the extracted information as annotated UML class diagrams

6 Future work

In this paper we have discussed simple UML sequence diagrams consisting only of objects and messages between them This basic form of a sequence diagram is extended in UML with various other concepts for instance conditional brunching iteration recursion and explicit return messages In addition states of objects can be attached to a sequence diagram

We aim to extend our approach to handle the full UML sequence diagram notation The state machine synthesizer for instance can easily be extended to take the above mentioned notation concepts into account For example a conditional branching can be interpreted as a shorthand notation for several merged sequence diagrams each of them expressing one branch in the structure From the senders point of view the name of the message is parsed so that the guard condition itself is considered as a received message (and thus will be mapped to a transition in the resulting state machine) and the message name as a normal sent message

327

The version of the operation description synthesizer described in this paper can be extended to accept a larger subset of state machines Such extensions could include the generation of labels and goto break and continue instructions

Although it has not been our primary aim in principle this approach could be extended to actual code generation instead of pseudocode we could easily follow the syntax of an actual language The problem is however the level of information available in sequence diagrams Usually sequence diagrams are employed in early stages of design when detailed information is not available or sensible If sequence diagrams were augmented with detailed implementation-level constructs like assignments and other primitive expressions generating executable code would become possible However the current UML sequence diagram notation gives poor support for describing these kinds of features

To validate the approach described in this paper, a more comprehensive case study is needed. We expect to be able to experiment with the technique in a larger real-life project in order to determine the strong and weak points of the method.

7. Conclusions

In this paper, we discussed a technique for generating structured implementation schemes, presented as pseudocode, from UML sequence diagrams. The pseudocode generation consists of two steps: synthesizing state machines for operations and generating pseudocode from the state machines. We also discussed how a class diagram synthesized from the same set of sequence diagrams can be annotated with these operation descriptions. The proposed techniques are implemented in a real-life UML design tool, the Nokia TED. We presented an example of how the techniques were used in diagram synthesis. We also presented possible extensions and future research topics.
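
The two-step structure can be summarized with a deliberately simplified Python sketch. It is an illustration under stated assumptions, not the synthesizer implemented in the tool: the state machine is a plain prefix tree with no state merging, loops are not detected, and the names `synthesize` and `pseudocode` as well as the output syntax are hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str]       # ("send" | "recv", message name); assumed encoding
Machine = Dict[Event, dict]   # tree-shaped state machine: {event: child node}


def synthesize(traces: List[List[Event]]) -> Machine:
    """Step 1: merge traces sharing a common prefix into one tree-shaped machine."""
    root: Machine = {}
    for trace in traces:
        node = root
        for event in trace:
            node = node.setdefault(event, {})
    return root


def pseudocode(node: Machine, indent: int = 0) -> List[str]:
    """Step 2: walk the machine and emit indented pseudocode lines."""
    pad = "    " * indent
    lines: List[str] = []
    # A run of unconditional sent messages becomes a statement sequence.
    while len(node) == 1 and next(iter(node))[0] == "send":
        (_, name), node = next(iter(node.items()))
        lines.append(f"{pad}{name};")
    # Remaining outgoing transitions are received messages, used as conditions.
    keyword = "if"
    for (kind, name), child in node.items():
        if kind != "recv":
            raise ValueError("branching on sent messages is not handled in this sketch")
        lines.append(f"{pad}{keyword} {name} then")
        lines.extend(pseudocode(child, indent + 1))
        keyword = "else if"
    if node:
        lines.append(f"{pad}end if")
    return lines


traces = [
    [("send", "insertCard()"), ("recv", "pin ok"), ("send", "showMenu()")],
    [("send", "insertCard()"), ("recv", "pin wrong"), ("send", "ejectCard()")],
]
print("\n".join(pseudocode(synthesize(traces))))
```

Sent messages become statements and received messages become conditions; a fuller treatment would merge equivalent states and search the resulting machine for loop structures.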

The proposed techniques provide help for the designer in the early phases of the design process. The synthesized class diagram and operation descriptions do not aim to specify the system to be designed comprehensively; they are constructed from the incomplete information given as sequence diagrams and thus reflect the current state of the design. It is our belief that, as such, the techniques presented in this paper can support the designer during software development by introducing an automated mechanism for viewing not only static models but also associated dynamic descriptions of operations, as implied by a set of sequence diagrams.

Acknowledgements

The authors wish to thank Erkki Makinen and Markku Sakkinen for their valuable comments. This research has been financially supported by the National Technology Agency of Finland (TEKES grant 4090898), Nokia, Metso Automation, Sensor Software Consulting, Ebsolut, and Plenware.

8. References

[1] Aho A.V. and Ullman J.D., The Theory of Parsing, Translation, and Compiling, vol. II, Prentice-Hall, 1973.

[2] Biermann A.W. and Krishnaswamy R., Constructing programs from example computations, IEEE Trans. Softw. Eng. 2(3), 1976, pp. 141-153.

[3] Booch G., Rumbaugh J., and Jacobson I., The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[4] Dahl O.-J., Dijkstra E.W., and Hoare C.A.R., Structured Programming, Academic Press, 1972.

[5] Gamma E., Helm R., Johnson R., and Vlissides J., Design Patterns - Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.

[6] Harel D., Statecharts: A Visual Formalism for Complex Systems, Science of Computer Programming 8, 1987, pp. 231-274.

[7] Harel D., Lachover H., Naamad A., Pnueli A., Politi M., Sherman R., Shtull-Trauring A., and Trakhtenbrot M., STATEMATE: A Working Environment for the Development of Complex Reactive Systems, IEEE Trans. Softw. Eng. 16(4), 1990, pp. 403-414.

[8] I-Logix, 2000. On-line at http://www.ilogix.com.

[9] Kiczales G., Lamping J., Mendhekar A., Maeda C., Videira Lopes C., Loingtier J.-M., and Irwin J., Aspect-Oriented Programming, in Proc. of ECOOP'97, LNCS 1241, Springer-Verlag, 1997, pp. 220-243.

[10] Koskimies K. and Makinen E., Automatic Synthesis of State Machines from Trace Diagrams, Softw. Pract. & Exper. 24(7), 1994, pp. 643-658.

[11] Koskimies K., Mannisto T., Systa T., and Tuomi J., Automated Support for Modeling OO Software, IEEE Software 15(1), January/February 1998, pp. 87-94.

[12] Makinen E. and Systa T., MAS - An Interactive Synthesizer to Support Behavioral Modeling in UML, in Proc. of ICSE 2001, Toronto, Canada, 2001, pp. 15-24.

[13] Marriott K. and Meyer B. (eds.), Visual Language Theory, Springer, 1998.

[14] Normark K., Synthesis of Program Outlines from Scenarios in DYNAMO, Aalborg University, 1998. On-line at http://www.cs.auc.dk/~normark/dynamo.html.

[15] Rekers J. and Schurr A., A Graph Grammar Approach to Graphical Parsing, in Proc. of the 11th International IEEE Symposium on Visual Languages, IEEE, 1995.

[16] Rumbaugh J., Jacobson I., and Booch G., The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[17] Selonen P., Koskimies K., and Sakkinen M., How to Make Apples from Oranges in UML, in Proc. of HICSS-34 (CD-ROM), Maui, Hawaii, 2001.

[18] Systa T., Static and Dynamic Reverse Engineering Techniques for Java Software Systems, Dept. of Computer and Information Sciences, University of Tampere, Report A-2000-4, PhD Dissertation, 2000.

[19] Taentzer G., Ermel C., and Rudolf M., The AGG Approach: Language and Tool Environment, in the Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, World Scientific, 1999.

[20] The Unified Modeling Language Notation Guide v1.3, OMG, 1999. On-line at http://www.rational.com.

[21] The Unified Modeling Language Semantics v1.3, OMG, 1999. On-line at http://www.rational.com.

[22] Whittle J. and Schumann J., Generating Statechart Designs From Scenarios, in Proc. of ICSE'00, Limerick, Ireland, 2000, pp. 314-323.

[23] Wikman J., Evolution of a Distributed Repository-Based Architecture, in Proc. of NOSA'98, 1998. On-line at http://www.hk-r.se/fou/forskinfo.nsf.

