concern-based cohesion: unveiling a hidden dimension of cohesion measurement
TRANSCRIPT
Concern-based Cohesion: Unveiling
a Hidden Dimension of Cohesion
Measurement
Software Engineering Lab – UFBA
Salvador-Bahia-Brazil - les.dcc.ufba.br
Software Design and Evolution Group
aside.dcc.ufba.br
Bruno C. da Silva (dcc.ufba.br)
Cláudio Sant’Anna (dcc.ufba.br)
Christina (dcc.ufba.br)
Federal University of Bahia (UFBA)
Alessandro Garcia (inf.puc-rio.br)
Cohesion can be defined as:
The degree to which a module represents an abstraction of a single concern of the software.
Structural Cohesion Metrics
E.g. LCOM, LCOM2, etc.
Almost all methods share the same instance variable.
Is it a highly cohesive class?
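The slide does not spell out how a structural metric is computed. As a minimal sketch, assuming the commonly used Chidamber and Kemerer formulation usually labelled LCOM2 (method pairs sharing no instance variable minus pairs sharing at least one, floored at zero), the count could look like this. The class layouts and method names below are hypothetical.

```python
from itertools import combinations


def lcom2(method_attrs):
    """Assumed LCOM2: max(P - Q, 0).

    P = number of method pairs that share no instance variable,
    Q = number of method pairs that share at least one.
    method_attrs maps a method name to the set of instance variables it uses.
    """
    p = q = 0
    for (_, attrs_a), (_, attrs_b) in combinations(method_attrs.items(), 2):
        if attrs_a & attrs_b:
            q += 1
        else:
            p += 1
    return max(p - q, 0)


# Hypothetical class where every method touches the same instance variable.
facade_like = {
    "setHeader": {"response"},
    "addCookie": {"response"},
    "sendRedirect": {"response"},
    "flushBuffer": {"response"},
}

# Hypothetical class whose methods mostly use unrelated variables.
split_class = {"a": {"x"}, "b": {"x"}, "c": {"y"}, "d": {"z"}}

print(lcom2(facade_like))  # every pair shares "response"
print(lcom2(split_class))  # only the pair (a, b) shares a variable
```

Under this assumption, a class whose methods all share one variable scores 0 no matter how many unrelated responsibilities those methods implement.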
Lack of Concern-Based Cohesion (LCbC)
How many concerns does this class address?
• http response header
• http response buffer
• URL encoding
• web cookies
• http redirecting
• Error sending and others…
LCbC = 6
Is it a highly cohesive class?
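Following the slide, LCbC can be read as the number of concerns a class addresses. A minimal sketch of that count, assuming a concern-to-code mapping is already available as a dictionary from concern name to the (class, method) pairs that realize it. The method names and the second class are hypothetical; only the six concern names come from the slide.

```python
def lcbc(class_name, concern_map):
    """Count the concerns with at least one code element inside class_name.

    concern_map maps a concern name to a set of (class, method) pairs.
    """
    return sum(
        1
        for elements in concern_map.values()
        if any(owner == class_name for owner, _ in elements)
    )


# Concern names from the slide; the (class, method) pairs are illustrative.
concern_map = {
    "http response header": {("ResponseFacade", "setHeader")},
    "http response buffer": {("ResponseFacade", "flushBuffer")},
    "URL encoding": {("ResponseFacade", "encodeURL")},
    "web cookies": {("ResponseFacade", "addCookie")},
    "http redirecting": {("ResponseFacade", "sendRedirect")},
    "error sending": {
        ("ResponseFacade", "sendError"),
        ("ErrorPage", "render"),  # hypothetical second class
    },
}

print(lcbc("ResponseFacade", concern_map))
print(lcbc("ErrorPage", concern_map))
```

The count depends only on the mapping, not on which instance variables the methods share.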
Cohesion: Structure-based vs. Concern-based
They capture different dimensions of cohesion:
• Different source of information and counting mechanism;
• Different interpretation of cohesion.
Example – ResponseFacade (Tomcat)
LCOM2 = 0: low lack of cohesion, or high cohesion
LCbC = 6: high lack of cohesion, or low cohesion
Empirical Study – First Goal
Provide empirical evidence about whether the concern-driven nature of a cohesion metric makes it significantly different from structural cohesion metrics.
Moreover…
… the number of concerns a module realizes may positively influence the number of changes it is subject to.
[Figure: the concerns of the class (http response buffer, http response header, URL encoding, web cookies, http redirecting, error sending, …), each associated with changes]
Empirical Study – Second Goal
Investigate whether and how concern-based cohesion is associated with change-proneness.
Research Questions
RQ1: Does LCbC capture a dimension of module cohesion that is not captured by structural cohesion metrics?
RQ2: How strong is the correlation between LCbC and module change-proneness?
RQ3: Does the LCbC metric applied together with structural cohesion metrics enhance the prediction of module changes?
Empirical Study Settings
[Figure: for each system, the change history of Module 1 … Module n yields a Change Count (CC) per module; LCOM2, LCOM3, LCOM4, LCOM5, TCC and LCbC are measured per module]

System       Revisions analyzed
JFreeChart   2,272
Freecol      3,426
jEdit        2,916
Tomcat       3,157
Findbugs     3,765
Rhino        777
Total        16,313
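The slides do not say exactly how Change Count is derived from the change history. One plausible reading, sketched here as an assumption, is the number of revisions in which a module was modified. The revision data below is made up for illustration.

```python
from collections import Counter


def change_counts(revisions):
    """Assumed CC: number of revisions in which each module was changed.

    revisions is an iterable of lists of module names touched per revision.
    A module listed twice in one revision still counts once for it.
    """
    counts = Counter()
    for changed_modules in revisions:
        counts.update(set(changed_modules))
    return counts


# Hypothetical change history with three revisions.
history = [
    ["Module1", "Module2"],
    ["Module1"],
    ["Module1", "Module1", "Module3"],
]

cc = change_counts(history)
print(cc["Module1"], cc["Module2"], cc["Module3"])
```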
Empirical Study Settings
LCbC needs a concern-to-code mapping
[Figure: concern A, concern B, concern C, … mapped onto the code]
Concern-to-code mapping procedure

System       Mapping
JFreeChart   Concerns automatically mapped using the XScan tool
Freecol      Concerns automatically mapped using the XScan tool
jEdit        Concerns automatically mapped using the XScan tool
Tomcat       Concerns automatically mapped using the XScan tool
Findbugs     Concerns automatically mapped using the XScan tool
Rhino        Manual concern mapping provided by Eaddy et al. (2008)
RQ1: Does LCbC capture a dimension of module cohesion that is not captured by structural cohesion metrics?
Principal Component Analysis (PCA)

JFreeChart   PC1     PC2     PC3     PC4
LCOM2        0.94    0.14   -0.11    0.11
LCOM3        0.02    0.72   -0.43    0.37
LCOM4        0.87    0.03   -0.04    0.37
LCOM5        0.14    0.94   -0.12   -0.04
TCC         -0.12   -0.24    0.95   -0.09
LCbC         0.51    0.10   -0.13    0.81

Rhino        PC1     PC2     PC3     PC4
LCOM2        0.96    0.04    0.11    0.07
LCOM3        0.23    0.72    0.43    0.24
LCOM4        0.94    0.19    0.07    0.09
LCOM5        0.11    0.21    0.94    0.19
TCC         -0.08   -0.95   -0.08   -0.02
LCbC         0.11    0.11    0.19    0.97

jEdit        PC1     PC2     PC3     PC4
LCOM2        0.06    0.98    0.08    0.04
LCOM3        0.90    0.12    0.17    0.07
LCOM4        0.18    0.09    0.97    0.08
LCOM5        0.87    0.12   -0.03    0.06
TCC         -0.79    0.12   -0.21    0.06
LCbC         0.04    0.04    0.08    0.99

Tomcat       PC1     PC2     PC3     PC4
LCOM2        0.09    0.08    0.25    0.96
LCOM3        0.89    0.15    0.12    0.08
LCOM4        0.11    0.09    0.95    0.25
LCOM5        0.88    0.04   -0.06    0.13
TCC         -0.85   -0.01   -0.16    0.03
LCbC         0.10    0.99    0.08    0.07

Findbugs     PC1     PC2     PC3     PC4
LCOM2        0.12    0.04    0.14    0.98
LCOM3        0.90    0.12    0.10    0.12
LCOM4        0.12    0.00    0.98    0.14
LCOM5        0.88    0.06   -0.02    0.07
TCC         -0.80    0.06   -0.14   -0.04
LCbC         0.05    0.99    0.00    0.04

Freecol      PC1     PC2     PC3     PC4
LCOM2       -0.12    0.72    0.41    0.34
LCOM3        0.64    0.20    0.53    0.15
LCOM4        0.16    0.09    0.03    0.96
LCOM5        0.27   -0.06    0.89    0.01
TCC         -0.89   -0.05   -0.16   -0.11
LCbC         0.22    0.89   -0.18   -0.03

LCbC was the major metric of at least one PC in all systems, and in most of the systems it was the only metric contributing to that PC.
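To make the reading of the loadings concrete, here is a small sketch that lists which metrics load strongly on each principal component. The numbers are one four-component block taken from the table above (the block where LCOM2 loads 0.06, 0.98, 0.08, 0.04). The 0.5 cut-off for a "strong" loading is an assumption for illustration, not something the slides state.

```python
# One block of loadings from the PCA table: metric -> [PC1, PC2, PC3, PC4].
loadings = {
    "LCOM2": [0.06, 0.98, 0.08, 0.04],
    "LCOM3": [0.90, 0.12, 0.17, 0.07],
    "LCOM4": [0.18, 0.09, 0.97, 0.08],
    "LCOM5": [0.87, 0.12, -0.03, 0.06],
    "TCC": [-0.79, 0.12, -0.21, 0.06],
    "LCbC": [0.04, 0.04, 0.08, 0.99],
}


def strong_metrics(loadings, threshold=0.5):
    """Return, per component, the metrics whose |loading| >= threshold."""
    n_components = len(next(iter(loadings.values())))
    return {
        f"PC{pc + 1}": [
            metric for metric, row in loadings.items() if abs(row[pc]) >= threshold
        ]
        for pc in range(n_components)
    }


for component, metrics in strong_metrics(loadings).items():
    print(component, metrics)
```

In this block LCbC is the only metric with a strong loading on PC4, which is the pattern the slide describes as LCbC contributing exclusively to a component.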
RQ2: How strong is the correlation between LCbC and module change-proneness?
Spearman Correlation: each cohesion metric vs CC

         JFreeChart   Rhino   jEdit   Tomcat   Findbugs   Freecol
LCOM2    0.48         0.69    0.16    0.33     0.48       0.49
LCOM3    0.34         0.48    0.17    0.27     0.38       0.19
LCOM4    0.32         0.46    0.10    0.21     0.23       0.20
LCOM5    0.15         0.30    0.18    0.23     0.34       0.22
TCC      0.24         0.22    0.13    0.16     0.06*      0.30
LCbC     0.66         0.62    0.15    0.35     0.21       0.46
* not statistically significant

• In jEdit and Findbugs, LCbC did not perform well.
• LCbC and LCOM2 were the metrics most correlated with change count.
• In Rhino and Freecol, LCbC was the second most correlated metric (strong and moderate correlation, respectively), preceded by LCOM2.
• LCbC was the metric most correlated with change count in JFreeChart (strong correlation) and Tomcat (moderate correlation).
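For readers who want to reproduce this kind of analysis, the following is a minimal pure-Python sketch of the Spearman rank correlation: rank both series (ties get the average rank), then take the Pearson correlation of the ranks. The per-module values are invented for illustration; they are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-module values: LCbC and change count (CC).
lcbc_values = [1, 2, 3, 4, 5]
cc_values = [5, 6, 7, 8, 7]

print(round(spearman(lcbc_values, cc_values), 2))
```

The sketch raises a division-by-zero error if either series is constant, since a rank correlation is undefined in that case.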
RQ3: Does the LCbC metric applied together with structural cohesion metrics enhance the prediction of module changes?
Linear Regression Analysis

System       R² (adj)
JFreeChart   0.63
Rhino        0.59
Findbugs     0.37
Freecol      0.35
Tomcat       0.32
jEdit        0.26

Metrics in the Final Model with Standardized Coefficients
JFreeChart: (0.47)LCOM2 + (0.11)LCOM3 + (0.59)LCbC + (-0.27)LCOM4
Rhino: (0.63)LCOM2 + (0.37)LCOM3 + (0.18*)TCC
Remaining models, in transcript order:
(0.20)LCOM2 + (0.35)LCOM4 + (0.09*)LCOM5 + (0.17)LCbC
(0.45)LCOM2 + (0.20)LCOM3 + (0.17)LCOM4
(0.44)LCOM2 + (0.21)LCOM3 + (0.11)LCbC
(0.39)LCOM2 + (0.16)LCOM3 + (0.29)LCbC + (-0.07*)LCOM4
* not statistically significant

• LCbC ended up in four regression models.
• LCbC was the most important metric for the JFreeChart regression model.
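A short sketch of how a model with standardized coefficients can be read: each coefficient multiplies the z-score of its metric, and the result is the predicted z-score of the change count. The coefficients are those of the JFreeChart model from the slide; the z-scores of the example class are hypothetical.

```python
# Standardized coefficients of the JFreeChart model, as listed on the slide.
JFREECHART_MODEL = {"LCOM2": 0.47, "LCOM3": 0.11, "LCbC": 0.59, "LCOM4": -0.27}


def predict_z(model, z_scores):
    """Predicted z-score of CC as the weighted sum of the metric z-scores."""
    return sum(coef * z_scores[name] for name, coef in model.items())


def strongest_predictor(model):
    """Metric with the largest absolute standardized coefficient."""
    return max(model, key=lambda name: abs(model[name]))


# Hypothetical class: one standard deviation above the mean on LCOM2 and LCbC,
# exactly average on LCOM3 and LCOM4.
example_class = {"LCOM2": 1.0, "LCOM3": 0.0, "LCbC": 1.0, "LCOM4": 0.0}

print(strongest_predictor(JFREECHART_MODEL))
print(round(predict_z(JFREECHART_MODEL, example_class), 2))
```

Because the coefficients are standardized, their absolute values can be compared directly, which is what supports the statement that LCbC was the most important metric in this model.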
Examples that illustrate the differences in the dimensions of cohesion captured by LCbC and structural cohesion metrics

Class (System)                     LCbC (Rank)    LCOM2 (Rank)    CC (Rank)
ResponseFacade (Tomcat)            10 (top 2%)    0               5 (top 20%)
CombinedRangeXYPlot (JFreeChart)   11 (top 5%)    33 (top 35%)    11 (top 10%)

ResponseFacade: a Facade class usually has methods related to different concerns because it serves as an entry point for different functionalities.
CombinedRangeXYPlot: concerns related to drawing, zooming, axis space, click handling and plotting.
When concern-based cohesion fails in the association with changes
When the concern-to-code mapping fails to identify concerns!

Class (System)                   LCbC (Rank)   LCOM2 (Rank)   CC (Rank)
jEdit (jEdit)                    0             9351 (3rd)     24 (2nd)
JEditBuffer (jEdit)              0             5913 (4th)     17 (5th)
SortedBugCollection (Findbugs)   0             1889 (5th)     76 (4th)
Threats to Validity
• Quality of concern-to-code mapping
• Underlying tool for concern mapping
• Change Count
Conclusions
• LCbC defined a new and orthogonal dimension of module cohesion in the studied systems.
• LCbC performed well in the association with change-proneness in most of the systems.
• Concern-based cohesion has provided indications that it is worth further investigation.
Future Work
• How LCbC performs in comparison with topic-based cohesion metrics such as C3 and MWE
• The association between LCbC and fault-proneness
• Whether or not the type of class would be an interesting factor to be considered
• The application of different regression analysis techniques
• Search for more complete concern mappings