cohesion and coupling optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_paixao.pdf · semantic and...

31
Cohesion and Coupling Optimisation BALANCING IMPROVEMENT AND DISRUPTION MATHEUS PAIXAO, MARK HARMAN, YUANYUAN ZHANG, YIJUN YU

Upload: others

Post on 26-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Cohesion and Coupling OptimisationBALANCING IMPROVEMENT AND DISRUPTION

MATHEUS PAIXAO, MARK HARMAN, YUANYUAN ZHANG, YIJUN YU

Page 2: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

What I have (not) done

The ultimate solution for software modularisation

Full understanding of developers behaviour

Identification of the best metrics for software modularisation

[email protected] 2

Page 3: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

What I have done

Thorough empirical study of structural cohesion and coupling optimisation

Identification of disruption as an important overlooked issue

Multi-objective approach to balance structural improvement and disruption

[email protected] 3

Page 4: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Structural Dependencies

[email protected]

p1

c1 c2 c3

c4 c5

p2

c6 c7

c8

p3

c9

Function Call

Data Access

Inheritance

Interface Implementation

Page 5: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Modularisation drivers

Semantic and Structural cohesion/coupling metrics better describe developers’ implementations [1][2]

[email protected] 5

[2] Candela, I., Bavota, G., Russo, B., & Oliveto, R. (2016). Using Cohesion and Coupling for Software Remodularization : Is It Enough ? ACM Transactions on Software Engineering and Methodology, 25(3), 1–28.

[1] Bavota, G., Dit, B., Oliveto, R., Di Penta, M., Poshyvanyk, D., & De Lucia, A. (2013). An empirical study on the developers’ perception of software coupling. In 2013 35th International Conference on Software Engineering (ICSE) (pp. 692–701). San Francisco: IEEE.

Page 6: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Literature Review

[email protected]

Metrics Validation

LongitudinalEvaluation

DisruptionAnalysis

largest empirical study of automated software

re-modularisation to date

Page 7: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Software Systems Under Study

[email protected] 7

Page 8: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Modularisation Quality - MQ

[email protected] 8

p1

c1 c2 c3

c4 c5

p2

c6 c7

c8

p3

c9

𝑀𝐹(𝑝1) = 0.72

𝑀𝐹(𝑝2) = 0.66

𝑀𝐹(𝑝3) = 0.00

𝑀𝑄 = 1.38

Page 9: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

RQ1: Is there any evidence that open source software systems respect structural measurements of cohesion and coupling?

[email protected] 9

RQ1.1: Purely Random Distribution RQ1.2: k-Random Neighbourhood Search

RQ1.3: Systematic 1-Neighbourhood Search

Cohesion

99.99%

0.1001%

MQ

99.99%

0.0866%

Page 10: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

RQ2: What is the relationship between raw cohesion and the MQ metric?

[email protected] 10

-100.00%

-50.00%

0.00%

50.00%

100.00%

150.00%

200.00%

250.00%

300.00%

350.00%

400.00%

Bunch Solutions

MQ Difference Cohesion Difference

Bunch Search Solutions

Pivot 2.0.2

13

Packages

58

All Systems

493.11%

Page 11: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

RQ2: What is the relationship between raw cohesion and the MQ metric?

[email protected] 11

Package-constrained Search Solutions

0.00%

10.00%

20.00%

30.00%

40.00%

50.00%

60.00%

70.00%

Package-constrained Solutions

MQ Difference Cohesion Difference

Page 13: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

DisMoJo

Disruption metric based on MoJoFM[2]

[email protected] 13

[3] Zhihua Wen, & Tzerpos, V. (2004). An effectiveness measure for software clustering algorithms. In Proceedings. 12th IEEE International Workshop on Program Comprehension, 2004. (pp. 194–203).

Page 14: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Disruption Analysis Workflow

[email protected] 14

Releases

Re-modularisationapproach

ImprovedModularisations

DisMoJo

DisruptionValues

BunchPackage-constrained

30 executions

Page 15: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

RQ3: What is the disruption caused by search based approaches for optimising software modularisation?

[email protected] 15

Bunch Package-constrained

Mean

80.39% 57.82%

Page 16: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Release 1.2

p4

Longitudinal Disruption Analysis

[email protected] 16

Release 1.1

p1

c1

p2

p3

c2 c3

c4 c5

c9

c6 c7

c8

p1

c1

p2

p3

c2 c3

c4 c5

c9

c6 c7

c8

c10 c11 c12

Page 17: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Longitudinal Disruption Analysis

[email protected] 17

Lower Bound

Mean

4.32% 30.99%

Upper Bound

Page 18: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Balancing modularity improvement and disruption

[email protected] 18

Page 19: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

RQ5: What is the modularity improvement provided by the multiobjective search for acceptable disruption levels?

[email protected] 19

Package-free

Lower Bound

1.66% in MQ

0.13% in Cohesion

Upper Bound

150.38% in MQ

2.38% in Cohesion

Package-constrained

Lower Bound

3.36% in MQ

0.72% in Cohesion

Upper Bound

59.25% in MQ

23.49% in Cohesion

Page 20: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Conclusion

Software systems respect modularity measurements

Search based approaches for re-modularisation cause large disruption

Multiobjective search can be used to find clear and constant trade-off between modularity improvement and disruption

Modularity can be improved within lower and upper bounds of acceptable disruption performed by developers

[email protected] 20

Page 21: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – System’s Selection Criteria

At least 10 subsequent official releases

No general libraries and APIs

Java systems

[email protected] 21

Page 22: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup - Software Systems Under Study

[email protected] 22

Page 23: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

it’s an ordinal metric[1]

value has no meaning

it’s not normalised

Backup - Modularisation Metrics

[email protected] 23

avoids god packages

MQ

[4] Stanley Stevens. (1946). On the Theory of Scales of Measurement. American Association for the Advancement of Science, 103(2684), 677–680.

it’s an interval metric[1]

easy to understand

it’s normalised

leads to god packages

Cohesion

inflation effect

Page 24: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – RQ1 table of results

[email protected] 24

Page 25: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – RQ2 table of results

[email protected] 25

Page 26: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – Classes distribution

[email protected] 26

RQ2.3: Package-constrained Search Solutions

30.06% cohesion improvement

34.15%

Biggest package

Smallest package

Original Implementations

0.32%

Biggest package

Smallest package

Package-constrained solutions

1.33%

19.80%

Page 27: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – RQ3 table of results

[email protected] 27

Page 28: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup - RQ3: Best MQ and best cohesion disruption

[email protected] 28

Bunch Package-constrained

Mean

80.39% 57.82%

Best MQ

79.67% 55.30%

Best Cohesion

78.77% 54.33%

Page 29: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – RQ4 Two-Archive GA parameters

Population Size: number of classes (N)

Single point crossover (0.8 if N < 100; 1.0 otherwise)

Swap mutation (0.004 x log2N)

Tournament selection (size 2)

50N Generations

[email protected] 29

Page 30: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – RQ5 natural disruption

[email protected] 30

Page 31: Cohesion and Coupling Optimisationcrest.cs.ucl.ac.uk/cow/49/slides/cow49_Paixao.pdf · Semantic and Structural cohesion/coupling metrics better describe developers’ implementations

Backup – RQ5 modularity improvement within acceptable disruption

[email protected] 31