Slide 1, Presentation: Extraction of State Machine Diagrams of Legacy C code with Cpp2XMI, 4-2-2009 | Dennie van Zeeland
Master's Project
Extraction of State Machine Diagrams of
Legacy C code with Cpp2XMI
Vanderlande Industries B.V. (Veghel)
Dennie van Zeeland
Outline
• Introduction
• Project goal
• Related work
• Tools
• State Machine patterns
• Switch-Case / Switch-Case pattern
• Overview of design of Cpp2XMI
• CPPML
• Case Study - FSC
• Problems encountered
• Conclusions / Future work
Introduction
• Vanderlande Industries (VI)
  - Automated material handling systems:
    • distribution centres
    • baggage handling at airports
    • express parcel sortation facilities
• Flow System Controller (FSC)
  - Controls the product flows in conveyor systems for transport and sortation
  - Controls all motors, photocells and other active components
  - Sends and receives control and status information
  - Real-time control of equipment
  - Separated GUI
  - Standard interfaces with equipment
  - Fully configurable for all VI sorting equipment
Project goal
• Reverse engineering documentation from C/C++ source code
  - Why:
    • Limited documentation available for V5/V6
      - State machine diagrams
    • Identifying possible flaws in current versions
    • V7 is being developed from scratch
    • Extracting dynamic behavior from source code for V7
    • Identifying requirements
  - Building a tool for extracting state machine diagrams from code
    • Preprocessing source code
    • Parsing source code
    • Finding patterns in source code of state machines
    • Creating diagrams
Related work
• ESPaRT [KNOR98]
  - Finding state machine patterns and replacing them by better constructs
• Bandera [CORB00]
  - Extract state machines and produce input for model checkers
• Symbolic Execution [WALK08]
  - Program conditioning
• Abstra [YUAN06]
  - Extracting state machines using unit tests
• Log based state machine extraction [GANN02]
Tools
• Cpp2XMI
  - Developed by Elena Korshunova for LaQuSo
  - C++ to UML diagrams
    • Class diagram
    • Sequence diagram
    • Activity diagram
• Columbus/CAN toolset
  - Pre-processor + Parser + Linker
  - Static analyser (Class diagram extraction)
  - CPPML (C++ Markup Language)
• Use Cpp2XMI as basis and extend it with
  - State machine diagram
State Machine Patterns
• Global types of patterns for implementing state machines
  - Nested Choice Patterns
    • Switch/Case – Switch/Case pattern (mainly used in FSC)
    • Switch/Case – If/Else
    • If/Else – Switch/Case
    • If/Else – If/Else
  - Jump Tables
    • Pointer to a function pointer array represents state variable
    • Function pointer array is the jump table
    • Table contains the event handler functions
  - Object oriented approach
    • States as classes
  - Goto statements
    • Use goto statements to jump through program
    • Each goto statement corresponds to a state transition
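The jump-table variant is only described in words here, so a minimal C sketch may help. It assumes the same three states and two events used in the Switch/Case examples; the names `EVENT_HANDLER`, `jump_table`, `to_B`, `to_C`, `OBJ_init` and `OBJ_current`, and the transitions chosen for STATE_C, are hypothetical and not taken from FSC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states and events, mirroring the Switch/Case examples. */
typedef enum { STATE_A, STATE_B, STATE_C, NUM_STATES } OBJ_STATE;
typedef enum { EVENT1, EVENT2, NUM_EVENTS } OBJ_EVENT;

typedef struct OBJ OBJ;
typedef void (*EVENT_HANDLER)(OBJ *obj);

struct OBJ {
    /* State variable: pointer to the row of handlers of the current state. */
    const EVENT_HANDLER *state;
};

static void to_B(OBJ *obj);
static void to_C(OBJ *obj);

/* The jump table: one row per state, one event handler per event. */
static const EVENT_HANDLER jump_table[NUM_STATES][NUM_EVENTS] = {
    [STATE_A] = { to_B, to_C },
    [STATE_B] = { to_B, to_C },
    [STATE_C] = { to_C, to_C },  /* assumption: STATE_C is never left */
};

/* Each handler performs a transition by re-pointing the state variable. */
static void to_B(OBJ *obj) { obj->state = jump_table[STATE_B]; }
static void to_C(OBJ *obj) { obj->state = jump_table[STATE_C]; }

static void OBJ_init(OBJ *obj) { obj->state = jump_table[STATE_A]; }

/* Dispatch: no switch at all, just an indexed call through the table. */
static void OBJ_control(OBJ *obj, OBJ_EVENT event)
{
    assert(obj != NULL);
    assert(event < NUM_EVENTS);
    obj->state[event](obj);
}

/* Map the state pointer back to the state enumerator. */
static OBJ_STATE OBJ_current(const OBJ *obj)
{
    for (int s = 0; s < NUM_STATES; s++) {
        if (obj->state == jump_table[s]) {
            return (OBJ_STATE)s;
        }
    }
    return NUM_STATES;  /* not reached for an initialised object */
}
```

Because the transitions are hidden in table contents and handler bodies rather than in nested control statements, this style needs a different recognition strategy than the nested choice patterns.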
Switch/Case – Switch/Case pattern (1)
• States
Source Code
    /* State of the object: all possible states */
    enum OBJ_STATE
    {
        STATE_A,
        STATE_B,
        STATE_C
    };
    ...
    /* Determine current state */
    switch (...) {
        case STATE_A:
            ...
        case STATE_B:
            ...
        case STATE_C:
            ...
        default:
            ...
    }
    ...
[Diagram]
Switch/Case – Switch/Case pattern (2)
• Transitions
Source Code
    /* events to react on: all possible events */
    enum OBJ_EVENT
    {
        EVENT1,
        EVENT2,
        EVENT3
    };
    ...
    switch (...) {
        case STATE_A:
            /* Determine arrived event */
            switch (event) {
                case EVENT1:
                    ...
                case EVENT2:
                    ...
            }
            break;
        case STATE_B:
            switch (event) {
                case EVENT1:
                    ...
                case EVENT2:
                    ...
            }
            break;
        ...
    }
[Diagram]
Switch/Case – Switch/Case pattern (3)
• Object with current state variable
Source Code
    /* Object with variable for current state */
    typedef struct {
        bool initialised;
        OBJ_STATE state;
    } OBJ;
    ...
    switch (obj->state) {
        case STATE_A:
            switch (event) {
                case EVENT1:
                    obj->state = STATE_B;  /* Assign new state to state variable */
                    break;
                case EVENT2:
                    obj->state = STATE_C;
                    break;
            }
            break;
        case STATE_B:
            switch (event) {
                case EVENT1:
                    obj->state = STATE_B;
                    break;
                case EVENT2:
                    obj->state = STATE_C;
                    break;
            }
            break;
        ...
    }
[Diagram]
Switch/Case – Switch/Case pattern (4)
• Control Function
Source Code
    /* Control function; 'event' is the event that needs to be handled */
    static void OBJ_control(OBJ *obj, OBJ_EVENT event)
    {
        switch (obj->state) {
            case STATE_A:
                switch (event) {
                    case EVENT1:
                        obj->state = STATE_B;
                        break;
                    case EVENT2:
                        obj->state = STATE_C;
                        break;
                }
                break;
            case STATE_B:
                switch (event) {
                    case EVENT1:
                        obj->state = STATE_B;
                        break;
                    case EVENT2:
                        obj->state = STATE_C;
                        break;
                }
                break;
            ...
        }
    }
[Diagram]
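The fragments built up so far (state enum, event enum, object struct and control function) can be combined into one compilable sketch. The elided parts of the slides stay unknown: the `default` branches below are assumptions (unhandled events are ignored, STATE_C has no outgoing transitions), and `OBJ_run` is a hypothetical driver added only to exercise the state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { STATE_A, STATE_B, STATE_C } OBJ_STATE;  /* all possible states */
typedef enum { EVENT1, EVENT2, EVENT3 } OBJ_EVENT;     /* all possible events */

/* Object with a variable for the current state. */
typedef struct {
    bool initialised;
    OBJ_STATE state;
} OBJ;

/* Control function: outer switch on the state, inner switch on the event. */
static void OBJ_control(OBJ *obj, OBJ_EVENT event)
{
    assert(obj != NULL);
    switch (obj->state) {
    case STATE_A:
        switch (event) {
        case EVENT1: obj->state = STATE_B; break;
        case EVENT2: obj->state = STATE_C; break;
        default:     break;  /* assumption: other events are ignored */
        }
        break;
    case STATE_B:
        switch (event) {
        case EVENT1: obj->state = STATE_B; break;
        case EVENT2: obj->state = STATE_C; break;
        default:     break;  /* assumption: other events are ignored */
        }
        break;
    default:
        /* STATE_C is elided on the slides; assumed to have no transitions. */
        break;
    }
}

/* Hypothetical driver: feed a sequence of events, return the final state. */
static OBJ_STATE OBJ_run(OBJ *obj, const OBJ_EVENT *events, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        OBJ_control(obj, events[i]);
    }
    return obj->state;
}
```

Starting in STATE_A, the event sequence EVENT1, EVENT1, EVENT3 ends in STATE_B under these assumptions: the first event moves to STATE_B, the second is a self-transition, and the third is ignored.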
Switch/Case – Switch/Case pattern (5)
• Multiple events with the same event-handler
Source Code
    static void OBJ_control(OBJ *obj, OBJ_EVENT event)
    {
        switch (obj->state) {
            case STATE_A:
                switch (event) {
                    /* Multiple events are handled the same way */
                    case EVENT1:
                    case EVENT2:
                        obj->state = STATE_C;
                        break;
                }
                break;
            case STATE_B:
                switch (event) {
                    case EVENT1:
                        obj->state = STATE_B;
                        break;
                    case EVENT2:
                        obj->state = STATE_C;
                        break;
                }
                break;
            ...
        }
    }
[Diagram]
Switch/Case – Switch/Case pattern (6)
• State assignment in called function
Source Code
    void do_Function(OBJ *obj)
    {
        ...
        obj->state = STATE_C;  /* State assignment in called function */
        ...
    }
    ...
    static void OBJ_control(OBJ *obj, OBJ_EVENT event)
    {
        switch (obj->state) {
            case STATE_A:
                switch (event) {
                    case EVENT1:
                        do_Function(obj);
                        break;
                    case EVENT2:
                        obj->state = STATE_C;
                        break;
                }
                break;
            ...
        }
    }
    ...
[Diagram]
Switch/Case – Switch/Case pattern (7)
• Conditional transitions
Source Code
    bool y = true;
    switch (obj->state) {
        case STATE_A:
            switch (event) {
                case EVENT1:
                    /* Conditional transitions */
                    if (y) {
                        obj->state = STATE_B;
                    } else {
                        obj->state = STATE_C;
                    }
                    break;
                case EVENT2:
                    obj->state = STATE_C;
                    break;
            }
            break;
        case STATE_B:
            switch (event) {
                case EVENT1:
                    obj->state = STATE_B;
                    break;
                case EVENT2:
                    obj->state = STATE_C;
                    break;
            }
            break;
        ...
    }
[Diagram]
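To make concrete what an extractor has to recover from this pattern, the sketch below lists the transitions readable from the code above as (source, event, guard, target) tuples and prints them as Graphviz DOT edges with UML-style "event [guard]" labels. This is an illustration only: the `TRANSITION` struct, `format_edge` and `write_dot` are hypothetical names, writing the else branch as the guard `!y` is an assumption, and nothing here reflects how Cpp2XMI stores or lays out state machines internally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One extracted transition; guard is NULL when it is unconditional. */
typedef struct {
    const char *source;
    const char *event;
    const char *guard;
    const char *target;
} TRANSITION;

/* Transitions read off the nested switch (STATE_A and STATE_B only). */
static const TRANSITION transitions[] = {
    { "STATE_A", "EVENT1", "y",  "STATE_B" },
    { "STATE_A", "EVENT1", "!y", "STATE_C" },  /* else branch, assumed guard */
    { "STATE_A", "EVENT2", NULL, "STATE_C" },
    { "STATE_B", "EVENT1", NULL, "STATE_B" },
    { "STATE_B", "EVENT2", NULL, "STATE_C" },
};
#define NUM_TRANSITIONS (sizeof transitions / sizeof transitions[0])

/* Format one transition as a DOT edge statement. */
static int format_edge(char *buf, size_t len, const TRANSITION *t)
{
    assert(buf != NULL && t != NULL);
    if (t->guard != NULL) {
        return snprintf(buf, len, "%s -> %s [label=\"%s [%s]\"];",
                        t->source, t->target, t->event, t->guard);
    }
    return snprintf(buf, len, "%s -> %s [label=\"%s\"];",
                    t->source, t->target, t->event);
}

/* Write all transitions as a single DOT digraph. */
static void write_dot(FILE *out, const TRANSITION *ts, size_t n)
{
    char line[128];
    fprintf(out, "digraph OBJ {\n");
    for (size_t i = 0; i < n; i++) {
        format_edge(line, sizeof line, &ts[i]);
        fprintf(out, "    %s\n", line);
    }
    fprintf(out, "}\n");
}
```

Calling `write_dot(stdout, transitions, NUM_TRANSITIONS)` produces a graph in which the two EVENT1 edges leaving STATE_A differ only in their guard, which is exactly the information a conditional transition adds to the diagram.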
Overview of design of Cpp2XMI (1)
[Figure: Cpp2XMI processing pipeline]
C++ Source Code → Pre-processor → Pre-processed C++ Source Code → Columbus Parser & Exporter → CPPML + XMI → Library Filter → Filtered Columbus CPPML + Filtered Columbus XMI → Diagram Extractor → XMI (with layout)
- CPPML: Abstract Syntax Tree in XML format
- Columbus XMI: Class Diagram in XML (XMI) format
- XMI (with layout): Class, Sequence, Activity Diagram in XML (XMI) format
Overview of design of Cpp2XMI (2)
[Figure: Cpp2XMI internal design]
Input: Abstract Syntax Tree (CPPML filtered)
Output: UML model in XMI 1.1 (with layout)
Components:
- Parser for CPPML
- Second Parser for CPPML for Objective C
- Store (internal datastructure)
- CollaborationOrganizer
- ActivityOrganizer
- State Machine Organizer
- Add Layout using Sequence-Layout
- Add Layout using StateDot-Layout
- Add Layout using Dot-Layout
- Add Layout using Class-Layout
- Store (internal datastructure) + Layout
- Form XMI-tags / Write to File
Annotation: Could be improved
Case Study - FSC (1)
• Applied Cpp2XMI to FSC V6
Case Study - FSC (2)
• Example: SPUR
Case Study - FSC (3)
• Example: Triplesorter
Case Study - FSC (4)
• Some results of the Cpp2XMI State Machine Diagram Extractor
  - V6: 27 State Machine Diagrams, 2 false positives
  - Extracted state machine is 70% identical to the original documented state machine (Gappex)
  - Correctness was checked and confirmed by domain experts
Problems encountered
• Preprocessing issues
  - Columbus Parser doesn’t understand anonymous structs/enums
• SAX parser issues
  - SAX parser halts on the escape character (0x1B)
• CPPML should have a tree structure (one-to-one with the AST)
  - However, it is a DAG, which makes pattern recognition tricky
• Cpp2XMI doesn’t extract control statements (do, while, if, switch, break, continue, etc.) correctly (for C)
• Performance issues with CPPML parser
  - JDOM (memory peak level: 1.8 GB)
  - SAX (will not solve memory peaks)
• Bug in XMI part
  - Duplicate IDs, which raise errors when importing XMI into CASE tools (Enterprise Architect)
• Position information not part of XMI
  - Enterprise Architect doesn’t do anything with position information
Conclusions / Future work
• Conclusions:
  - UML state machines are useful for the developers and maintainers
  - They can be derived automatically from source code (by finding patterns)
  - The approach was very successful in the case study and is promising in general
• Future work:
  - Expanding the state machine patterns
  - Combine with logging to get a general state machine
  - Export function for model checkers
  - On Entry / On Exit
  - Metrics / analysis of overlap of state machine diagrams
  - Filter / Zoom
  - Use original CPPML parser
  - XMI conversion
  - Resolve performance issues (due to choice of parser)
Questions?
References (1)
• [KNOR98]
  - Roland Knor, Georg Trausmuth and Johannes Weidl
  - Reengineering C/C++ Source Code by Transforming State Machines
  - Proceedings of the Second International ESPRIT ARES Workshop on Development and Evolution of Software Architectures for Product Families
  - 1998
  - ISBN: 3-540-64916-6, pages 97-105
  - Springer-Verlag
• [CORB00]
  - James C. Corbett et al.
  - Bandera: extracting finite-state models from Java source code
  - Proceedings of the 22nd international conference on Software engineering
  - 2000
  - ISBN: 1-58113-206-9, pages 439-448
  - ACM
• [WALK08]
  - Neil Walkinshaw et al.
  - Automated discovery of state transitions and their functions in source code
  - Journal: Software Testing, Verification and Reliability (Vol. 18, No. 2)
  - 2008
  - ISSN: 0960-0833, pages 99-121
  - John Wiley and Sons Ltd.
References (2)
• [YUAN06]
  - Tao Xie, Evan Martin and Hai Yuan
  - Automatic extraction of abstract-object-state machines from unit-test executions
  - Proceedings of the 28th international conference on Software engineering
  - 2006
  - ISBN: 1-59593-375-1, pages 835-838
  - ACM
• [GANN02]
  - Gerald C. Gannod and Shilpa Murthy
  - Using Log Files to Reconstruct State-Based Software Architectures
  - Workshop on Software Architecture Reconstruction at WCRE 2002
  - 2002
• [FERE02]
  - Rudolf Ferenc et al.
  - Columbus - Reverse Engineering Tool and Schema for C++
  - Proceedings of the International Conference on Software Maintenance
  - 2002
  - ISBN: 0-7695-1819-2, page 172
  - IEEE Computer Society
• [KORS06]
  - E. Korshunova, M. Petkovic, M. G. J. van den Brand and M. R. Mousavi
  - CPP2XMI: Reverse Engineering of UML Class, Sequence, and Activity Diagrams from C++ Source Code
  - WCRE '06: Proceedings of the 13th Working Conference on Reverse Engineering
  - 2006
  - ISBN: 0-7695-2719-1, pages 297-298
  - IEEE Computer Society