UNIVERSITAS SCIENTIARUM SZEGEDIENSIS
UNIVERSITY OF SZEGED
Department of Software Engineering
Source code analysis with Columbus
Quality assessment
Architecture reconstruction
Árpád Beszédes
Department of Software Engineering, University of Szeged, Hungary
2008.04.17 Scalable Program Analysis - Dagstuhl Seminar 2
Who we are
One of the leading SE research groups in Hungary
■ http://www.inf.u-szeged.hu/sed/
Competences
■ Software quality
■ Software testing
■ Embedded systems
■ Networks
■ Open Source
■ .NET
■ Java
Place of source code analysis
Software maintenance / evolution of large systems
Diagram labels:
■ Monitor
■ Source code quality
■ Test management
■ Architecture
■ IT operation performance
■ Processes (e.g. issue m)
References
Telecom, financial, other
Analyzed systems (1-30 MLOC)
■ Graphisoft ArchiCAD
■ Nuance-Scansoft Recognita
■ evoSoft
■ Erste Bank
■ Nokia S60 platform
■ Mozilla Firefox & Thunderbird
■ SUN, OpenOffice.org
■ Eclipse
■ NASA WorldWind
■ Etc.
Source code-based QA methodology architecture
Continuous measurement and monitoring is needed!
Quality decrease of sw systems
From: Roger Pressman, Software Engineering: A Practitioner's Approach, McGraw-Hill
Columbus technology
Analysis of large sw systems
In the scope of regular maintenance
■ C/C++/C#/Java/SQL
■ Quality measurements, auditing
■ Reverse engineering, architecture reconstruction
■ One-shot assessment
■ Continuous monitoring
Some history
1998-2001: Nokia Research Center
FrontEndART Software Ltd.
Further development
■ Industrial projects
■ Grants
50-100 man-years
Main components
Robust source code parsers
Analysis methodologies
Representation metamodels designed for maintenance tasks (language schemas)
Programming interfaces (API)
Extensions: CFG, call graph, DU, support for dynamic analysis (testing), etc.
Back ends
■ Code measurement
■ Reverse engineering
Standalone or SDK integration, command-line, API
■ SourceAudit, SourceDoc (previously Columbus/CAN)
Monitoring subsystem: SourceInventory (was Monitor)
Reverse Engineering
SourceDoc tool
Creation of UML logical views of the analyzed system
■ Class diagrams
■ In standard XMI format
– can be loaded e.g. into Rational Rose
Automatic generation of HTML documentation
■ Class-level
■ Hyperlinked
Design pattern usage detection
Detecting architecture-level dependencies
■ “Superlinking”
■ Among physical components (e.g. exe, dll)
■ E.g. function calls, includes
“Bad smell” detection
SourceAudit tool
Finds code fragments that
■ might be problematic
■ and error-prone,
■ so they need refactoring
■ e.g.: Feature Envy: a function which does not use its own class, but relies on others
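As a rough illustration of the Feature Envy idea, the sketch below flags a method when it touches members of some other class more often than members of its own class. The input format (`MethodAccesses`), the function name `envied_class` and the simple "more foreign than own accesses" criterion are assumptions made for this example; they are not taken from SourceAudit.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MethodAccesses:
    """Member accesses made by one method, listed by owning class name."""
    owner: str                      # class that declares the method
    name: str                       # method name
    accessed_classes: list = field(default_factory=list)


def envied_class(method):
    """Return the foreign class used more often than the owner, or None."""
    counts = Counter(method.accessed_classes)
    own = counts.pop(method.owner, 0)
    if not counts:
        return None
    foreign, foreign_count = counts.most_common(1)[0]
    return foreign if foreign_count > own else None


# Hypothetical access data, as an analyzer front end might report it.
methods = [
    MethodAccesses("Invoice", "total", ["Invoice", "Invoice", "Customer"]),
    MethodAccesses("Invoice", "shipping_label",
                   ["Customer", "Customer", "Customer", "Invoice"]),
]

for m in methods:
    target = envied_class(m)
    if target is not None:
        print(f"{m.owner}.{m.name} may envy {target}")
```

A production detector would typically use weighted thresholds rather than a plain majority, so that small helper methods are not reported.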
Clone detection
Finds duplicated code fragments (copy/paste, clones)
Clones can cause many hard-to-find bugs
The detection can be scaled
■ from exact match
■ to similar code fragments
Uses efficient flat-tree based recognition
Language independent
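The scaling from exact match to similar fragments can be illustrated with a much simpler technique than the flat-tree recognition named above: hashing fixed-size windows of normalized lines. This is a stand-in sketch, not the Columbus algorithm; the `fuzzy` switch, the window size and the tiny keyword set are assumptions for the example.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"if", "else", "for", "while", "return", "int"}


def normalize(line, fuzzy):
    """Tokenize a line; in fuzzy mode replace identifiers and numbers."""
    out = []
    for tok in TOKEN.findall(line):
        if fuzzy and tok not in KEYWORDS:
            if tok[0].isalpha() or tok[0] == "_":
                tok = "ID"
            elif tok[0].isdigit():
                tok = "NUM"
        out.append(tok)
    return " ".join(out)


def find_clones(lines, window=3, fuzzy=False):
    """Group start indices of windows whose normalized text is identical."""
    norm = [normalize(line, fuzzy) for line in lines]
    groups = defaultdict(list)
    for i in range(len(norm) - window + 1):
        groups[tuple(norm[i:i + window])].append(i)
    return sorted(g for g in groups.values() if len(g) > 1)


code = """\
int a = 0;
for (i = 0; i < n; i++)
a += x[i];
print(a);
int b = 0;
for (j = 0; j < m; j++)
b += y[j];
print(b);
""".splitlines()

print(find_clones(code, fuzzy=False))  # exact match only
print(find_clones(code, fuzzy=True))   # renamed variables also match
```

With `fuzzy=False` the two loops are not reported because the variable names differ; with `fuzzy=True` the renamed copy is grouped with the original.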
Code checking
Checking conformance to coding rules
■ Some of them indicate bugs
The rules
■ check the coding style
■ extend the warning capabilities of the compiler
■ check typical implementation errors
■ new rules can be added easily (C++ API)
General good practice rules
Company specific rules
Integration of other checkers’ results
Integration with MS Visual Studio and Eclipse
Command-line operation
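To make the idea of easily added rules concrete, here is a minimal sketch of a pluggable rule registry. It is written in Python and works on raw text lines, whereas the tool described above exposes a C++ API over a parsed model; the `rule` decorator, the rule ids and the regular expressions are all hypothetical.

```python
import re

RULES = []


def rule(rule_id, category):
    """Register a checker function under a rule id and category."""
    def register(func):
        RULES.append((rule_id, category, func))
        return func
    return register


@rule("no-goto", "maintainability")
def no_goto(line):
    return re.search(r"\bgoto\b", line) is not None


@rule("assign-in-condition", "reliability")
def assign_in_condition(line):
    # A single '=' inside if(...) is a typical implementation error.
    return re.search(r"\bif\s*\([^=!<>]*[^=!<>]=[^=]", line) is not None


def check(source):
    """Run every registered rule on every line; findings come in line order."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule_id, category, func in RULES:
            if func(line):
                findings.append((lineno, rule_id, category))
    return findings


sample = (
    "int f(int x) {\n"
    "  if (x = 5) goto end;\n"
    "  return x;\n"
    "end:\n"
    "  return 0;\n"
    "}"
)

for lineno, rule_id, category in check(sample):
    print(f"line {lineno}: {rule_id} ({category})")
```

Adding a company-specific rule then amounts to writing one more decorated function; line-based matching is only a placeholder for checks on a real syntax tree.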
Code measurement – metrics
All popular metrics for procedural, OO, SQL languages
■ CK, size, complexity, coupling, cohesion, OO-ness, etc.
Interpretation of metrics?
■ E.g. different types of cyclomatic complexity
The metrics-based fault predictor model selects C++ classes that are liable to errors [1]
■ successfully tested on Mozilla
[1] Gyimóthy T, Ferenc R and Siket I. Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction. IEEE TSE vol.31, no.10, October 2005
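The interpretation problem can be shown on cyclomatic complexity itself: the value depends on whether each boolean operator in a condition counts as a separate decision. The sketch below uses Python's standard `ast` module on Python source purely for illustration (the tools above target C/C++/C#/Java/SQL); the set of counted node types and the `count_bool_ops` flag are assumptions of this example.

```python
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic(source, count_bool_ops=False):
    """Return 1 + the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif count_bool_ops and isinstance(node, ast.BoolOp):
            # 'a and b and c' holds three values, i.e. two extra conditions.
            complexity += len(node.values) - 1
    return complexity


SOURCE = """
def classify(x, y):
    if x > 0 and y > 0:
        return "both"
    for i in range(x):
        if i == y or i == 0:
            return "hit"
    return "none"
"""

print(cyclomatic(SOURCE))                       # statements only
print(cyclomatic(SOURCE, count_bool_ops=True))  # also 'and' / 'or'
```

The same function yields two different values for the same code, so thresholds and comparisons are only meaningful when the counting variant is stated.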
Connectivity to other tools
ART (RSF)
{CPP|J|CS}ML (XML)
FAMIX XMI
GXL
Maisa (Prolog)
RSF
UML XMI
VCG
Machine readable CSV
Human readable txt
Technical details
Own and commercial front ends
Analysis of complex systems:
■ Compiler wrapping technology
Standard schemas
■ OO-style
■ C++ API
High-level language independent model
Cross-module dependencies: superlinking
Scalability?
Limited low-level analyses
■ Lightweight?
■ Problems of other kind
Common high-level model
Relatively easy system integration
Anything else you wanted to know?
Monitoring subsystem
SourceInventory
The source code of the system is
■ downloaded automatically by the framework from a configuration management system (e.g. CVS)
■ analyzed, and the results are stored in a database
Queries can be run and diagrams can be drawn from a web-based interface, which communicates with the database
Diagram labels:
■ Monitor
■ Source code analysis and measurement
■ SQL database
■ Source code configuration versioning repository
■ Web client, Web client, ...
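The storage and query half of this pipeline can be sketched with Python's standard `sqlite3` module. The checkout and analysis steps are left out: results are passed in as a plain dictionary. The table layout, the function names and the sample revisions and values are invented for the example and do not describe the SourceInventory schema.

```python
import sqlite3


def open_store():
    """Create an in-memory metrics database (a real setup would persist it)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE measurement ("
        " revision TEXT, component TEXT, metric TEXT, value REAL)"
    )
    return db


def store_results(db, revision, results):
    """Store analysis results given as {(component, metric): value}."""
    db.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?)",
        [(revision, comp, metric, value)
         for (comp, metric), value in results.items()],
    )
    db.commit()


def metric_history(db, component, metric):
    """Return (revision, value) pairs in insertion order, e.g. for a chart."""
    rows = db.execute(
        "SELECT revision, value FROM measurement"
        " WHERE component = ? AND metric = ? ORDER BY rowid",
        (component, metric),
    )
    return rows.fetchall()


db = open_store()
store_results(db, "r100", {("parser", "LOC"): 1200, ("parser", "clones"): 4})
store_results(db, "r101", {("parser", "LOC"): 1350, ("parser", "clones"): 9})
print(metric_history(db, "parser", "clones"))
```

Keeping one row per revision, component and metric lets a web front end draw trend diagrams with a single query per series.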
Monitoring subsystem (cont.)
Automatic alerts can be issued when indicators exceed critical thresholds
■ metric baselines
Internally: quality assessment
Customer: continuous measurement
Public databases: Mozilla, maemo, (OpenOffice, Eclipse)
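A threshold check against a metric baseline can be as small as the following sketch. The metric names, the limits and the message format are made up for illustration; how baselines are actually defined in the monitoring subsystem is not specified here.

```python
def alerts(baseline, measured):
    """Return messages for metrics whose value exceeds its baseline limit."""
    messages = []
    for metric, limit in sorted(baseline.items()):
        value = measured.get(metric)
        if value is not None and value > limit:
            messages.append(f"{metric}: {value} exceeds threshold {limit}")
    return messages


# Hypothetical baseline and one measurement run.
baseline = {"clone_coverage_pct": 10.0, "max_cyclomatic": 20}
measured = {"clone_coverage_pct": 12.5, "max_cyclomatic": 18, "LOC": 50000}

for msg in alerts(baseline, measured):
    print(msg)
```

Metrics without a baseline entry (here `LOC`) are ignored, so new measurements can be collected before anyone decides on a limit for them.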
Screenshots (1)
Screenshots (2)
Screenshots (3)
Screenshots (4)
Screenshots (5)
Screenshots (6)
Screenshots (7)
Screenshots (8)
Screenshots (9)
Nokia R&D projects
History
■ 1998 – TED projects: C++ to UML
■ 2005 – ART projects: Symbian platform
Architecture reconstruction of Symbian platform
■ Identification of “architectural erosion”
Quality measurement of Symbian platform
■ Metrics
■ Official Symbian coding guidelines (SourceAudit)
Quality measurement of maemo platform
Technology extensions – 1
SourceAudit coding rules (114)
■ general 'good practice' C++ coding rules
■ Symbian OS-specific rules
■ Nokia recommended rules
viability – the code will not work
reliability – the code may not work
maintainability – the code may be difficult to modify
readability – the code may be difficult to understand
reusability – the code may be less usable in conjunction with other code
convention – the code will be unconventional
Technology extensions – 2
Architecture cross-component dependencies
■ call: inter-component function calls
■ include: the same header file is included by two components
■ resource: the consumer’s serviceClass attribute equals the provider’s interface_uid, the consumer’s contentType attribute equals the provider's default data item, and the consumer's serviceCmd attribute equals the provider's opaque_data
■ publish & subscribe: if a component sets a property (category and key), and another subscribes to this property, then there is a P&S dependency between the two components
Dependency metrics, e.g. xCallIn, xCallOut, xInclude
Visualization
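The call, include and publish & subscribe rules above can be sketched over simple extracted facts. The input shapes, the component names and the reading of xCallIn/xCallOut as counts of incoming and outgoing inter-component calls are assumptions made for this example; the slide does not define the metrics precisely, and the resource rule is omitted for brevity.

```python
from collections import Counter
from itertools import combinations


def call_metrics(calls):
    """Count inter-component calls per component (intra-component ignored)."""
    x_call_in, x_call_out = Counter(), Counter()
    for caller, callee in calls:
        if caller != callee:
            x_call_out[caller] += 1
            x_call_in[callee] += 1
    return x_call_in, x_call_out


def include_deps(includes):
    """Pairs of components that include at least one common header."""
    return {
        (a, b)
        for a, b in combinations(sorted(includes), 2)
        if includes[a] & includes[b]
    }


def pubsub_deps(published, subscribed):
    """(publisher, subscriber) pairs sharing a (category, key) property."""
    return {
        (pub, sub)
        for pub, props in published.items()
        for sub, wanted in subscribed.items()
        if pub != sub and props & wanted
    }


# Hypothetical facts for three components.
calls = [("ui", "engine"), ("ui", "engine"), ("engine", "storage"),
         ("engine", "engine")]
includes = {"ui": {"types.h", "view.h"}, "engine": {"types.h"},
            "storage": {"db.h"}}
published = {"engine": {("status", 1)}}
subscribed = {"ui": {("status", 1)}, "storage": {("power", 7)}}

x_in, x_out = call_metrics(calls)
print(dict(x_in), dict(x_out))
print(include_deps(includes))
print(pubsub_deps(published, subscribed))
```

Each dependency kind produces its own edge set, which keeps them separable in a later visualization.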
Technology extensions – 3
Build process (Symbian SDK) wrapping
■ Hides the original compiler with wrapper programs (e.g. gcc.exe)
After activating the SDKWrapper the project can be built as usual
■ creating abld.bat by ‘bldmake bldfiles’ and
■ building the project by ‘abld build …’
Build scripts provided for linking and inter-component dependency computation
Difficulties: compiling resource files, excluding test components
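The wrapping idea can be illustrated by a small stand-in for the compiler executable: it records how it was invoked and then hands the unchanged arguments to the real compiler. This is only a sketch of the general technique, not the SDKWrapper; the log format, the `wrap` function and the path `/opt/real/gcc` are hypothetical, and the demonstration uses a fake runner so that no compiler is needed.

```python
import json
import os
import tempfile


def wrap(argv, real_compiler, log_path, run):
    """Record one compiler invocation, then delegate to the real compiler."""
    record = {"cwd": os.getcwd(), "args": argv[1:]}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    # The build only sees the real compiler's exit code, so it proceeds as usual.
    return run([real_compiler] + argv[1:])


# Demonstration with a stand-in runner instead of a real compiler.
seen = []


def fake_run(cmd):
    seen.append(cmd)
    return 0


log_path = os.path.join(tempfile.mkdtemp(), "wrapper.log")
status = wrap(["gcc", "-c", "main.cpp", "-o", "main.o"],
              "/opt/real/gcc", log_path, fake_run)
print(status, seen)
```

In an actual wrapper, `run` would be something like `subprocess.call`, and the recorded command lines would later drive the analyzer with the same include paths and macro definitions the build used.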