

Impact Analysis of Granularity Levels on Feature Location Technique

Chakkrit Tantithamthavorn (Ph.D. Student) and Akinori Ihara, Hideaki Hata, Ken-ichi Matsumoto

Software Engineering Laboratory, Graduate School of Information Science, Nara Institute of Science and Technology

Outline

✤ Introduction

✤ Motivation

✤ Study Design

✤ Results

✤ Conclusion

Impact Analysis of Granularity Levels on Feature Location Technique (APRES’2014), Auckland, New Zealand.

Growing complexity makes software difficult to maintain.

The evolution of the software size of the Eclipse Platform project.

Within 12 years, the product size has grown more than tenfold.


Millions of lines of code!


WHERE is a bug?


Identifying WHERE a feature is implemented in the source code, based on a given requirement, is painstaking and time-consuming.

- Implement new features
- Enhance existing features
- Fix bugs


IR-based feature localization helps get it done.

Current research adopts Information Retrieval (IR) models to find source code entities that are textually similar to a given issue report.


[Figure: pipeline of IR-based feature localization.
Indexing phase: version control, data extraction, document collection (file level and method level), data preprocessing, indexing of source code entities.
Searching phase: new bug report (summary and description), query construction, query, similarity function, retrieving and ranking, top N search results.]

Example of top N search results:

Rank  Method    Score
1     foo()     0.98
2     bar()     0.854
3     foobar()  0.321

Overview of Information Retrieval based feature localization
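The pipeline above can be sketched in a few lines. This is a minimal illustration, assuming a TF-IDF weighted Vector Space Model with cosine similarity and a simple camelCase tokenizer. The helper names (`tokenize`, `build_index`, `cosine`, `rank`) and the three toy method bodies are hypothetical and are not taken from the study's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers and lowercase; a stand-in for real preprocessing."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def build_index(documents):
    """Return (tf-idf vector per document, idf table) for a name -> text corpus."""
    term_counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    n_docs = len(documents)
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = {
        name: {term: tf * idf[term] for term, tf in counts.items()}
        for name, counts in term_counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_text, vectors, idf, top_n=3):
    """Turn a bug report into a query vector and return the top N entities."""
    counts = Counter(tokenize(query_text))
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(name, cosine(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Toy method-level corpus (invented for illustration only).
methods = {
    "foo()": "void foo() { openEditor(); saveFile(); }",
    "bar()": "void bar() { closeWindow(); }",
    "foobar()": "void foobar() { printReport(); }",
}
vectors, idf = build_index(methods)
results = rank("Editor crashes when saving a file", vectors, idf)
```

Switching granularity only changes what goes into the corpus: one document per file for the class level, one document per method for the function level.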

New Bug Report: Query

Document Corpus: Source code entities
- Class-Level [Rao et al., 2011]
- Function-Level [Lukins et al., 2010]

An Open Issue: How granularity levels affect the performance and effort of IR-based feature localization is not yet known.


A Motivating Example:

1.) Class-level feature localization might not be practical in reality. There are 14 functions in a class, but only 1 function is buggy.

2.) Class-level feature localization requires a large amount of extra effort to locate bugs. There are about 500 lines of code, but only 1 line needs to be fixed.

[Figure: a bug report and the corresponding source code.]

Study Design: Overview

Research Hypothesis: Function-level feature localization is more practical than class-level feature localization.

To validate this hypothesis, we explore two research questions that compare the performance and effort of IR-based feature localization at the class and function levels.

Research Questions

RQ1: Does function-level feature localization outperform class-level feature localization?

RQ2: How much effort does function-level feature localization save over class-level feature localization?


Study Design: Studied Projects

Reasons:
1.) These projects are large, active, real-world systems.
2.) Each project carefully maintains a bug tracking system and a source code version control repository.


Study Design: IR-based Feature Localization


Source code files or methods


Study Design: Effort-Based Evaluation

[Figure: ranked results for an issue report at the function level (Functions A to F, Rank 1 to Rank 6) and at the class level (Classes 1 to 3, each containing several of those functions, Rank 1 to Rank 3). Entities are marked as related or non-related. The figure shows the LOC required to review suspicious entities, from 0 LOC up to a LOC threshold.]

We used lines of code as a proxy to measure effort required to find the first relevant source code entity.
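As a rough illustration of this measure, the sketch below sums the LOC of every entity in rank order until the first relevant one is reached. It assumes that the relevant entity itself is read in full and therefore counts toward the effort; the slides do not state this detail. The function name `effort_to_first_relevant` and the example rankings are hypothetical.

```python
def effort_to_first_relevant(ranked_entities):
    """LOC inspected until the first relevant entity has been read.

    ranked_entities: list of (name, loc, is_relevant) tuples in rank order.
    Returns None when no relevant entity appears in the ranking.
    """
    inspected = 0
    for _name, loc, is_relevant in ranked_entities:
        inspected += loc  # assumption: the relevant entity is also read in full
        if is_relevant:
            return inspected
    return None


# Invented rankings for one issue report; names and sizes are not study data.
function_ranking = [
    ("Function A", 20, False),
    ("Function B", 35, True),
    ("Function C", 15, False),
]
class_ranking = [
    ("Class 1", 300, False),
    ("Class 2", 450, True),
]

function_effort = effort_to_first_relevant(function_ranking)  # 20 + 35
class_effort = effort_to_first_relevant(class_ranking)        # 300 + 450
```

In this toy case both rankings place the relevant entity second, yet the class-level ranking costs far more LOC because whole classes must be read.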


RQ1: Does function-level feature localization outperform class-level feature localization?

When inspecting 1,000 LOC, function-level feature localization can localize 50% of issue reports, while class-level feature localization can localize 40% of issue reports.

[Figure: LOC-based performance (%, 0 to 80) versus inspected LOC (0 to 5,000) at the method (function) level and the file (class) level. Panels: Eclipse Platform, Eclipse PDE, Eclipse JDT.]

LOC-based Performance: The percentage of successfully localized bug reports at the LOC threshold.
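One way to compute this metric is sketched below, assuming each bug report has already been reduced to the LOC needed to reach its first relevant entity (or `None` when the ranking never reaches one). The function name `loc_based_performance` and the sample efforts are hypothetical and are not data from the study.

```python
def loc_based_performance(efforts, loc_threshold):
    """Percentage of bug reports localized within loc_threshold LOC.

    efforts: per-report LOC needed to reach the first relevant entity,
             or None when the ranking never reaches one.
    """
    if not efforts:
        return 0.0
    localized = sum(1 for e in efforts if e is not None and e <= loc_threshold)
    return 100.0 * localized / len(efforts)


# Invented efforts for five bug reports; not data from the study.
efforts = [120, 980, 1500, None, 4200]
curve = {t: loc_based_performance(efforts, t) for t in (500, 1000, 5000)}
```

Evaluating the function over a range of thresholds yields one curve per granularity level, which is how the two levels can be compared at equal inspection effort.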


RQ2: How much effort does function-level feature localization save over class-level feature localization?

[Figure: distribution of inspected LOC at the method (function) level and the file (class) level for Eclipse Platform, Eclipse PDE, and Eclipse JDT. First set of panels: effort required to find the first buggy location (function level saves 7 times). Second set of panels: effort required to find 80% of buggy locations (function level saves 4.4 times).]


Function-level feature localization requires 113 LOC, while class-level feature localization requires 906 LOC, to locate the first relevant source code entity.

Function-level feature localization requires 1,309 LOC, while class-level feature localization requires 2,744 LOC, to locate 80% of relevant source code entities.

Summary

Goal: To investigate the impact of granularity levels on the performance and effort of IR-based feature localization.

Approach: We used the Vector Space Model (VSM) to localize bugs at the method and file granularity levels. We evaluated it on 1,968 bug reports with 10,959 files and 82,946 methods.

Main findings:
- For the same amount of inspection effort, function-level feature localization outperforms class-level feature localization.
- Function-level feature localization requires 7 times less inspection effort to find the first relevant bug location and 4.4 times less to find 80% of bug locations.


“Feature localization at the function-level is effective in practice.”

Thank you for your attention