Rethinking User Interfaces for Feature Location

Fabian Beck∗, Bogdan Dit†, Jaleo Velasco-Madden‡, Daniel Weiskopf∗ and Denys Poshyvanyk‡

∗VISUS, University of Stuttgart, Germany
†Boise State University, Boise, ID, USA
‡The College of William and Mary, Williamsburg, VA, USA

Email: [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract—Locating features in large software systems is a fundamental maintenance task for developers when fixing bugs and extending software. We introduce In Situ Impact Insight (I3), a novel user interface to support feature location. In addition to a list of search results, I3 provides support for developers during browsing and inspecting the retrieved code entities. In situ visualizations augment results and source code with additional information relevant for further exploration. Developers are able to retrieve details on the textual similarity of a source code entity to the search query and to other entities, as well as information on co-changed entities from a project’s history. Execution traces recorded during program runs can be used as filters to further refine the search results. We implemented I3 as an Eclipse plug-in and tested it in a user study involving 18 students and professional developers that were asked to perform three feature location tasks chosen from the issue tracking system of jEdit. The results of our study suggest that I3’s user interface is intuitive and unobtrusively supports developers with the required information when and where they need it.

I. INTRODUCTION

Feature location [1] and impact analysis [2] are two fundamental activities that need to be performed during a maintenance task (e.g., implementing a new feature, fixing a bug), and involve finding the source code entities that need to be changed for the maintenance task. These activities are difficult and time consuming, especially for large software systems. Therefore, developing techniques to support developers during maintenance tasks is paramount, and a lot of effort has been dedicated towards developing such techniques [3]. The typical output of such techniques is a list of code entities (e.g., methods) that should be considered by the developers. But for developers, a list of methods is not the end but rather the start of the feature location process. Currently, the only further tool support addresses the reformulation of queries [4], [5], [6], [7] and the refinement of the search space [8]. Developers are left without much help to determine which of the methods really need to be changed and why; most previous feature location approaches considered developer interaction as a ‘black box’.

In this paper, we introduce In Situ Impact Insight (I3), a novel user interface to better support developers when performing feature location. The main contribution is that I3 explains the search result to the users, allows for filtering of the results with respect to different execution scenarios, and relates code entities to each other (Figure 1). I3 uses visualization and other advanced user interface elements to present search results.

Fig. 1. Standard feature location process and our proposal of supporting the developer in understanding and navigating the search results.

In situ visualization augments the source code and the list of results with small diagrams. Detail views describe the textual similarities between a search query and a method, or between two methods. Also, developers can gain insights about change impact based on the change history of methods.

Our approach is based on state-of-the-art feature location integrating textual, historical, and dynamic information [9], [10] (Section III). We modeled the iterative feature location process as inter-connected cognitive tasks (Section IV). The user interface, implemented as an Eclipse plug-in, is designed to support these tasks (Section V). In a user study involving 18 participants, we tested the usefulness of I3 on three realistic feature location tasks for the popular text editor jEdit (Section VI). The feedback of the participants indicates that I3’s interface is easy to use and unobtrusively supports developers with the required information from different data sources (Section VII). I3 drastically differs from previous feature location interfaces in the general approach of presenting the results to the developers (Section VIII).

II. RELATED WORK

The feature (or concept) location process is a fundamental first step towards addressing software maintenance tasks, and researchers have approached this problem from different directions. A systematic survey of feature location techniques is described by Dit et al. [3]. These automatic or semi-automatic techniques use textual [11], [12], [13], [14], [15], [16], [17], [18], dynamic [19], [20], historical information [21], [22], or hybrid approaches [23], [24], [12], [13], [4], [20], [25], [26], [27], [28], [29], [30], [31].

Researchers have already addressed the user interfaces and visualization of the results produced by feature location tools. Sando [5], [7] and CONQUER [4], [6] help refine the query by suggesting co-occurring terms and synonyms. MFIE [8] is a faceted search approach that allows for focusing the search result with respect to a variety of facets. Our approach, I3, however, is most closely related to FLAT3 [32] and its extension ImpactMiner [10]: execution traces are used to filter the search results and co-change information to find related code entities. The main difference of I3 to previous approaches is that the goal is not to come up with the perfect list of search results, but to consider the list as only the starting point of feature location. A detailed discussion, pointing out differences in the use of data sources, data representations, and evaluation, is provided as part of Section VIII.

Another avenue pursued consisted of conducting user studies to evaluate how developers perform feature location [33], [34], [35], [36], [37], [38], [39], [40], [41]. For instance, Ko et al. [35] conducted an exploratory study investigating how developers understand unfamiliar code during maintenance tasks, and reported that developers seek, relate, and collect relevant information when performing these tasks. Similar findings are reported by Wang et al. [41], who present a generalized model (consisting of phases, patterns, and actions) of how developers perform feature location. Li et al. [37] investigated help-seeking task structures and strategies, information sources, process models, and information needs of developers during software development tasks.

A sizeable body of work has been devoted to investigating the vocabulary of the source code [42], [43] in order to support developers during the process of formulating and reformulating queries. Haiduc et al. [44] used machine learning to propose a technique that recommends a reformulation strategy for a particular query based on the properties of the query. Stolee et al. [45], [46], [47], [48] proposed an approach that uses an SMT solver and symbolic analysis to identify the program fragments that adhere to a query defined by developers as a set of inputs and expected outputs. Ge et al. [5] enhanced the Sando search tool [7] to recommend queries based on the dictionary of the software system, the term co-occurrence matrix, verb-direct-object pairs, a software engineering thesaurus, and a general English thesaurus. CodeExchange [49] aims at constructing a network of social and technical information relevant to the code. Although I3 was not designed to automatically suggest reformulations, it provides visual feedback to help with the task, for instance, by (i) highlighting the words from the query that were found in the code base and by (ii) showing the frequency of search terms and the terms of a particular method.

Our approach uses in situ software visualizations (i.e., small visualizations embedded into the code and other elements of the IDE providing additional information where required) to preview textual search similarities and change history of methods. In general, small visualizations embedded in text are known as Sparklines [50]. While pretty printing and code highlighting are common visual aids in text editors, recently, in situ visualizations were also used to augment source code, for instance, to encode static software metrics [51], dynamic software metrics [52], runtime consumption [53], or values of numeric variables [54]. However, to the best of our knowledge, in situ visualizations have never been used for feature location.

III. FEATURE LOCATION TECHNIQUE

Our I3 approach implements a feature location technique that uses textual, dynamic, and historical information [9], [10]. Since I3 is integrated into Eclipse as a plug-in, the implementation is based on Java. This section briefly summarizes the technical realization of the feature location engine and data retrieval, while the following sections describe how the data is presented to the users.

Information Retrieval: We rely on Information Retrieval (IR) to provide a list of methods ranked based on their textual similarity to a search query [9], [55]. First, we take advantage of the Eclipse Java Development Tools (JDT) to extract all the methods from a software project. Second, using Lucene [56], we build a corpus at method-level granularity, preprocess it (i.e., remove special characters, split identifiers [57], remove English and Java stop words, and apply the Porter stemmer) and generate the indexed corpus. Third, given a search query, we use Lucene to compute textual similarities between the query and the methods of the software system. We finally retrieve a sorted list of search results.
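
To make this step more concrete, the following is a minimal sketch of method-level indexing and ranking with Lucene. It is not the actual I3 implementation: the class and method names are hypothetical, it assumes a Lucene 7/8-style API, and it uses Lucene's EnglishAnalyzer (stop words and stemming) as a simplified stand-in for the full preprocessing pipeline described above (identifier splitting, Java stop words, Porter stemming).

```java
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

/** Hypothetical sketch: index methods at method-level granularity and rank them for a search query. */
public class MethodSearchSketch {

    private final Directory index = new RAMDirectory();      // in-memory index, sufficient for a sketch
    private final Analyzer analyzer = new EnglishAnalyzer(); // English stop words + stemming (simplification)

    /** methods: method signature -> identifiers and comments of the method (e.g., extracted via Eclipse JDT). */
    public void buildIndex(Map<String, String> methods) throws Exception {
        try (IndexWriter writer = new IndexWriter(index, new IndexWriterConfig(analyzer))) {
            for (Map.Entry<String, String> method : methods.entrySet()) {
                Document doc = new Document();
                doc.add(new StringField("signature", method.getKey(), Store.YES)); // identifies the method
                doc.add(new TextField("body", method.getValue(), Store.NO));       // searchable corpus text
                writer.addDocument(doc);
            }
        }
    }

    /** Prints the top-k methods ranked by their textual similarity to the search query. */
    public void printTopMethods(String queryText, int k) throws Exception {
        try (DirectoryReader reader = DirectoryReader.open(index)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            Query query = new QueryParser("body", analyzer).parse(QueryParser.escape(queryText));
            for (ScoreDoc hit : searcher.search(query, k).scoreDocs) {
                System.out.printf("%.3f  %s%n", hit.score, searcher.doc(hit.doc).get("signature"));
            }
        }
    }
}
```

In the actual tool, the corpus text per method would come from the JDT parse of the project, and the resulting ranked list is what the search view displays (Section V).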

Dynamic Analysis: In order to allow filtering of the results produced by the IR technique [9], [19], we provide developers with a mechanism to use the Eclipse Test and Performance Tools Platform (TPTP) to collect full traces of specific executions of software (i.e., they capture all the methods executed from the time the software starts until it is stopped by the developer). Once the list of methods from an execution trace is extracted, the developer can choose to rank these executed methods based on their similarity to the search query. Note that the use of dynamic information is optional.
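
As a hedged illustration of how such a trace can act as a filter, the sketch below removes all methods from the ranked result list that do not appear in the set of executed methods; the names and types are hypothetical and not taken from the I3 code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: restrict a ranked result list to the methods of one execution trace. */
public final class TraceFilter {

    /**
     * rankedMethods: method signature -> similarity score, iterated in decreasing order of similarity.
     * executedMethods: signatures of all methods recorded in one program run (e.g., collected via TPTP).
     */
    public static Map<String, Float> filterByTrace(Map<String, Float> rankedMethods,
                                                   Set<String> executedMethods) {
        Map<String, Float> filtered = new LinkedHashMap<>();     // preserves the original ranking order
        for (Map.Entry<String, Float> entry : rankedMethods.entrySet()) {
            if (executedMethods.contains(entry.getKey())) {      // keep only methods executed in the scenario
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }
}
```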

Mining Software Repositories: The historical information about a software system, which is used as input for I3, is extracted in a similar way as described in our previous work [9], [58]. First, we retrieve all the commits and metadata from the version control system (in our case, SVN), including the commit log message, author, list of modified files, and the actual changes. Second, for each commit, we identify the list of methods that had their content modified, using a tool based on the Eclipse Abstract Syntax Tree (from Eclipse JDT). Third, the list of methods modified for each revision is used as input to compute which methods frequently changed together (i.e., the co-change relation).
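
The co-change computation can be illustrated with a small sketch that counts, per method, how often other methods were modified in the same commits and derives the confidence measure described later in Section V-D (the percentage of the method's changes in which the other method changed as well). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: derive co-change relations from per-commit sets of modified methods. */
public final class CoChangeMiner {

    /** commits: one entry per revision, holding the signatures of all methods whose bodies were modified. */
    public static List<Map.Entry<String, Double>> coChangedMethods(List<Set<String>> commits, String method) {
        int changesOfMethod = 0;
        Map<String, Integer> jointChanges = new HashMap<>();
        for (Set<String> changedMethods : commits) {
            if (!changedMethods.contains(method)) {
                continue;                                       // only commits that touch the method of interest
            }
            changesOfMethod++;
            for (String other : changedMethods) {
                if (!other.equals(method)) {
                    jointChanges.merge(other, 1, Integer::sum); // count commits changing both methods
                }
            }
        }
        List<Map.Entry<String, Double>> result = new ArrayList<>();
        for (Map.Entry<String, Integer> joint : jointChanges.entrySet()) {
            // confidence = number of joint changes / number of changes of the method of interest
            result.add(Map.entry(joint.getKey(), joint.getValue() / (double) changesOfMethod));
        }
        result.sort((a, b) -> Double.compare(b.getValue(), a.getValue())); // highest confidence first
        return result;
    }
}
```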

IV. COGNITIVE TASK ANALYSIS

Feature location is not an end in itself, but rather a part of a larger sense-making process that enables a developer to make informed decisions on code changes. To better understand the process, we specialize cognitive tasks taken from a more general model [59] for feature location. This Cognitive Task Analysis enables us to understand key activities behind feature location and provides a good theoretical basis for designing a user interface and visualizations supporting these activities, which can be considered as a form of support for distributed cognition [60].

Fig. 2. Foraging loop of a Cognitive Task Analysis of a sensemaking process [59] applied to feature location. Cognitive tasks (search and filter; read and extract; follow relations via calls, textually similar methods, and co-changes) are mapped to I3 task support (textual information retrieval and filter by execution trace; augmented source code; details on demand; [IDE functionality]).

In particular, we fit the foraging loop designed by Pirolli and Card [59] for intelligence analysis to feature location. It describes the process of seeking, understanding, and relating information. In its original version, it consists of the tasks search and filter, read and extract, search for relations, and search for information. In the case of feature location, we identified the following inter-connected tasks (Figure 2):

Search (and Filter): The feature location process starts with a search, which is covered by the IR method in our approach. Optionally, filtering the results may restrict them to a smaller subset, here provided through execution traces.

Read and Extract: Following search and filter, initial information can be extracted from the search results by reading method names. But to really understand whether a method is relevant, developers need to look into the source code. This process could lead to reformulating the search query. The task of understanding the search results is facilitated in our approach by visually augmenting the results and the code.

Follow Relations: Search results are not independent of each other, but connected. While the IDE already allows following relations such as method calls, our approach adds other relations that can be relevant for feature location—textually similar or co-changed methods. Following relations generally leads back to task read and extract.

These tasks largely conform to those originally proposed [59]. Only search for information is merely indirectly covered by the reformulation of a search query, which we see as a variant of the formulation of the original search query being part of search (and filter). Pirolli and Card [59] further describe a sensemaking loop, which follows after the foraging loop. Since it includes building a mental model or theory of the collected information, it, however, reaches beyond what our feature location approach may support. Applying this theory in our context is supported by Lawrance et al. [61], who found in a study that information foraging as formulated by Pirolli and Card [62] much better describes the strategies developers use for debugging, in particular, in the early stages of navigating through the code before starting to fix the bug.

V. I3 VISUAL USER INTERFACE

The main contribution of this paper is proposing a novel user interface for feature location that handles and visualizes several sources of information relevant for the task. Results need to be represented, explained, and made explorable. As a consequence, our implementation of the proposed approach closely merges into an IDE, namely, Eclipse: in addition to a search view, the source code in the editor view becomes visually augmented with information that explains the search result and points to other possibly relevant code. Figure 3 gives an overview of the user interface. On the right side, a search view allows developers to begin the feature location activity by entering a search query (A). Results are presented in the list below (C) and can be filtered by execution traces (B). Search results are also indicated in the editor view by highlighting query terms. The method list and the editor view are enriched with in situ visualizations (D; Figure 4) that provide a preview of additional information related to textual similarity to the search query and change history of the methods. Moreover, details of these visualization techniques can be retrieved on demand in tooltip dialogs (Figures 5 and 6).

A. Search Interface

As indicated in Figure 2, the starting point of each feature location process is a search for a textual description of the task, formulated as a query by the developer. Also, longer descriptions of the tasks can be used (e.g., the description of a bug report, phrases from documentation). For this reason, the search field is a multi-line text field (Figure 3, A). During the feature location process—in particular, as a consequence of the task read and extract (Figure 2)—the search query might be corrected, specified in more detail, or otherwise altered. After running the search, terms excluding stop words are printed in bold to give the users feedback on which terms have really been considered by the search engine.

The results, in the form of a list of methods, are presented below the search field. The methods are ordered by similarity ranking to the search query as provided by the Lucene search engine [56]. Supporting the filter task (Figure 2), they can be restricted to collected execution traces, available in a drop-down combo box (Figure 3, B): methods that were not executed in the respective recorded program run are removed from the list of results. For each method, the name, type of parameters, and return type are provided in the first line of each list entry. The second line further references the class and package containing the method. In front of the method and class names, Eclipse-specific icons are used to encode visibility properties and other modifiers. At the right side of an entry in the result list, a small visualization indicates that more details are available on demand (see Section V-B). By clicking on one of the search results, the corresponding method is opened and centered in the source code editor. All code belonging to methods that are part of the search results is highlighted with a blue background to discern it from other code. Terms in the source code that directly match one of the terms in the query are highlighted with darker blue.

Fig. 3. User interface of I3 enriching Eclipse with an additional view for method search consisting of a search field (A), a filter selection (B), and a list of search results (C); search results are also highlighted and visualized in the editor (D).

Fig. 4. Enlarged and annotated version of an in situ visualization of a method (left: search similarity; right: change history with age zones of 7 days, 4 weeks, 12 months, and years; the darkness of a time step encodes the number of changes).

Further, every method signature is augmented with an in situ visualization equivalent to those used in the search results (Figure 3, D).

B. In Situ Visualization

The goal of the in situ visualizations is to provide a preview of additional information that is available on demand for each method. Hence, they should not replace the detailed information but just allow the developer, when browsing the list of search results or source code, to quickly decide whether retrieving those details might be interesting. Being word-sized graphics, the in situ visualizations can be considered as types of Sparklines [50]. Each visualization is divided into two parts as shown in Figure 4:

Search Similarity: The blue bar on the left encodes the similarity of the method to the search query. Since Lucene rankings are not limited to a specific interval, we map the ranking value to the height of the bar based on a hyperbolic function, where a similarity of 0 produces an (invisible) bar of height 0, and higher values asymptotically approach the maximum height indicated by the surrounding black border line.
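
The paper specifies only the qualitative properties of this mapping (a similarity of 0 yields height 0, large values approach the maximum height); one hyperbolic function with exactly these properties is sketched below, where the scale constant is a hypothetical choice, not the value used in I3.

```java
/** Hypothetical sketch of a hyperbolic mapping from an unbounded similarity score to a bar height. */
static double barHeight(double similarity, double maxHeight) {
    final double scale = 1.0;                             // assumed constant: score at which half the height is reached
    return maxHeight * similarity / (similarity + scale); // 0 maps to 0; large scores approach maxHeight
}
```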

Change History: The other part of the visualization encodes more complex information, namely, the full change history of the method as recorded. Since recent development is often more relevant than changes that occurred a long time ago, the timeline does not use a uniform resolution, but displays the younger history in greater detail. In particular, the timeline is split into four parts, the rightmost one showing the changes of the past 7 days (resolution: 1 day), followed to the left by the changes of the last 4 weeks (resolution: 1 week), the last 12 months (resolution: 1 month), and the last years (resolution: 1 year). The different areas of the timeline are framed by black borders (or a gray border in case the method was not changed in the respective time interval) and drawn with decreasing height. Time steps are visualized as vertical lines color-coding the number of changes (the darker, the more changes; white, no changes).
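
As an illustration of the non-uniform resolution, the sketch below assigns the age of a change to one of the four zones; the exact zone boundaries and the month approximation are assumptions made for the sketch, not taken from the I3 implementation.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/** Hypothetical sketch: map the age of a change to a step of the non-uniform timeline. */
final class TimelineStep {

    /** Label of the timeline step a change falls into, relative to the reference date of the view. */
    static String stepFor(LocalDate changeDate, LocalDate referenceDate) {
        long age = ChronoUnit.DAYS.between(changeDate, referenceDate);
        if (age < 7) {                       // rightmost zone: past 7 days, one step per day
            return "day " + age;
        } else if (age < 7 + 4 * 7) {        // past 4 weeks, one step per week
            return "week " + (age - 7) / 7;
        } else if (age < 7 + 4 * 7 + 365) {  // past 12 months, one step per month (approximated by 30 days)
            return "month " + (age - 7 - 4 * 7) / 30;
        } else {                             // everything older, one step per year
            return "year " + (age - 7 - 4 * 7 - 365) / 365;
        }
    }
}
```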

While the similarity to the search result is displayed for every method, the change history part is only displayed when there is at least one commit stored for the method. Based on the two parts, two different tooltip dialogs provide details on demand when hovering the mouse over the blue bar or the brown timeline. These dialogs contain lists of methods that support the follow relations task (Figure 2).

C. Textual Similarity Details

Details for the blue bar give more background on the vocabulary used in the method, as shown in Figure 5. On the left, the similarity value of the method to the search query can be retrieved next to the blue bar. Below, all keywords contained in the identifiers and comments of the method are listed with decreasing frequency. Keywords matching the search query are highlighted, and the keyword frequency is encoded in the font size and a subscript number. On the right, a list of methods is shown that are textually similar to the current one.

Fig. 5. Tooltip dialog showing textual similarity details of a method, displayed when hovering the search similarity visualization of the method.

Fig. 6. Tooltip dialog showing change details of a method, displayed when hovering the change history visualization of the method.

The keywords used in the method are taken as a query for starting a new search, for which the results are shown in the list. Here, the similarity to this specific search query is encoded in the green horizontal line at the top of the results—the blue bars in the search result still encode the similarity to the initial search, which is still active in the background. Again, methods in the list can be clicked to jump to their source code representation.

D. Co-change Details

In contrast, the tooltip dialog that appears when moving the mouse over the timeline visualization of a method provides details on past changes of the method (Figure 6). The list on the left further explains the data already displayed in the visualization: the changes grouped by periods, starting with the most recent ones at the top. For each change, the date, author, and commit message are displayed. The list on the right relates to other methods that were frequently changed together with the method in the past. The values encoded in the top brown bars and used for sorting refer to the so-called confidence of the co-changes [63]—the percentage of changes of the method where the other method changed as well. Also, this list can be used for navigating to the source code of the related methods.

VI. USER STUDY DESIGN

We performed a user study to test how I3 would be applied in practice by software engineers. The primary goal for the study design was to create a realistic software engineering scenario with non-trivial complexity. Hence, we used a real and sufficiently complex software system as a sample project and took real maintenance tasks from the issue tracking system as sample tasks for the study. Moreover, we did not reduce the evaluation to a single (quantitatively measurable) research question, but aimed at a broader understanding of the feature location process as applicable through our approach. In particular, we designed the evaluation to answer the following research questions (RQs):

• RQ1: What information sources do developers use to identify the methods to change?

• RQ2: Which interaction strategies are applied for the feature location process?

• RQ3: Is the visual interface easy to understand and use? What are the obstacles?

• RQ4: How can I3 be applied in practice? What are the missing features?

The study consisted of (i) an introductory phase familiarizing the participants with I3, (ii) the main part where participants solved maintenance tasks, and (iii) a final questionnaire to collect feedback. During the main part, we collected the participants’ answers and comments to the provided tasks and logged their interactions with I3. The questionnaire asked for quantifiable ratings and qualitative free text statements. The experiment was performed as a distributed user study with participants running the software on their own computers. We invited colleagues, industrial professionals, and current and former graduate students to participate. We designed the experiment to last 90 minutes. However, to reflect a realistic scenario where software engineers could freely schedule their tasks, we did not impose hard time constraints. The materials of the experiment, including the software, tutorial, questionnaire, and aggregated results are available online1.

A. Study Preface

After a short introduction, participants were asked to read a tutorial designed as a set of self-explaining slides. Participants were allowed to use it as a reference throughout the experiment. Then, participants started Eclipse as provided in a software and data bundle. The workspace already included the sample software project and I3 was initialized with a search index, execution traces, and historical data, so that participants did not need to set up anything. To familiarize participants with the tool and to check whether they understood its features, we asked them to conduct a number of simple tasks: to perform a search, to filter methods, to name the most frequent keywords of a method, to find textually similar methods, to retrieve certain changes of a method, and to identify co-changed methods. The participants had to fill in their answers for these tasks in a text field, which we used to check correctness.

1https://sites.google.com/site/i3featurelocation/

B. Main Part

The main part of the user study consisted of performing feature location tasks very close to a realistic software engineering maintenance scenario. We decided to use fixing issues, such as bugs or simple feature improvements, reported in the issue tracking system of the sample project: First, the issue description provides a well-defined starting point for feature location. Second, it is a realistic scenario that the developer fixing the issue is not familiar with the code and, in particular, is not the person who initially wrote the code. Third, for already closed issues, there exists a reference solution of which methods needed to be changed to fix the issue.

As a sample project, we selected jEdit, a text editor written in Java. It is a reasonably large software system, but not too complex for participants to understand the source code. It has already been used in related studies [33], [64], [36], [38], [32], [40]. Code, change information, bug reports, and execution traces are available as a benchmark dataset [58]. From this benchmark, we used version 4.3 pre9 of jEdit with a commit history from 2000-01-15 to 2007-01-20 (SVN revisions 1–8676; we removed duplicate revisions included in the dataset). To simulate that participants continue development of jEdit 4.3 pre9, we artificially set the date to 2007-01-20 in I3 because the timeline visualizations depict the recent history in higher resolution.

We selected three issues from the benchmark as sample tasks for the participants of the study to solve. Criteria for the selection were that the issues cover different features of jEdit, that the issue reports provide comprehensive descriptions of the requested fixes, that the fixes were non-trivial (at least five methods changed in the reference solution), that the fixes did not require adding a considerable amount of new code, and that the issues had been fixed after 2007-01-20, but the related code was already available in version 4.3 pre9. The following issues were finally selected as a set of tasks for the study:

• Task 1 (Issue 1671312): The issue describes that regular expressions used in the search dialog do not find matches of zero length.

• Task 2 (Issue 1999448): It is reported that joining lines is not working efficiently if a folding mode is used for collapsing parts of the text.

• Task 3 (Issue 1730845): The issue is requesting the feature of selecting an entire line or multiple lines through clicking in the gutter (i.e., the area at the left side of the editor usually containing line numbers).

The specific question we asked was “Which methods would you think need to be changed for fixing the above issue?” and participants provided their answers as a list of methods. As additional questions, we further wanted to know which specific sources of information were relevant for solving the task and which Eclipse functionalities were used as relevant additional help. To ensure that participants had understood the procedure, we started with a simpler, yet also realistic warm-up task that follows the same question design (results from this task are not evaluated in the following).

C. Questionnaire

The study closed with a questionnaire for general feedback. First, we asked the participants to formulate what they did and did not like about I3 as free text and wanted to know if they could imagine using an approach like this in their daily development work. Then, we explored the different features of the approach based on—among other questions—rating scales, one for the intuitiveness of the features and one for their usefulness. We also wanted to learn whether the participants considered the tasks as realistic software development scenarios. Finally, we collected some personal background of the participants, in particular, their professional occupation and experience in software development.

D. Participants

Twenty participants took part in and completed the user study. Based on the results from the training tasks, for which we know the correct answers, we excluded 2 participants because they answered fewer than 5 out of the 7 questions correctly. In the following, we only analyze the answers of the remaining 18 participants: The largest group of participants are PhD students (8), followed by professional software developers (3), Master students (3), Bachelor students (2), professors (1), and researchers (1). They judged their software development expertise as high: an average of 3.9 on a scale from 1 (very low) to 5 (very high). Also, their Java experience was not much lower (an average of 3.6 on the same scale).

VII. RESULTS

For analyzing the results of the study and answering the research questions, we applied mixed methods: descriptive statistics for evaluating quantifiable measures and qualitative methods summarizing free text answers. All averages reported in the following are mean values aggregated from rating scales. The scales were all labeled only for the two extreme values. Hence, they can be interpreted, at least in approximation, as interval scales. Additional histograms are provided for the most important rating scales to further clarify the distributions of answers.

The participants agreed that the issue fixing tasks are realistic: on average 4.4 on a scale from 1 (I do not agree) to 5 (I agree). They attested the tasks to be of medium difficulty: 2.7 on a scale from 1 (too difficult) to 5 (too easy). Since there are different ways to implement a fix, it is difficult to judge whether the answers provided as a list of methods are correct. Using the solution by the original developers, we at least consider the methods that are part of this reference as correct, but cannot make any statements about the correctness of other methods (the agreement values provided in the following can be interpreted as estimated lower bounds of precision values). For Task 1, participants answered 3.2 methods on average, of which 43% also appeared in the reference; for Task 2, 3.1 methods and 35% agreement; for Task 3, 3.3 methods and 45% agreement. Hence, the tasks seem to have a comparable level of difficulty. Abstracting from methods to classes shows that the answers not matching the reference are often in the context of the reference answers.

TABLE I
RELEVANCE RATINGS OF DATA SOURCES ON A SCALE FROM 1 (TOTALLY IRRELEVANT) TO 5 (HIGHLY RELEVANT), DISCERNED PER TASK AS WELL AS AVERAGED; HISTOGRAMS SHOW THE DISTRIBUTION OF THE ANSWERS, PERCENTAGES QUANTIFY RATINGS OF AT LEAST 4 (RELEVANT).

                 list of   execution  source  list of    text. similar  changes  co-changed
                 methods   trace      code    keywords   methods                 methods
Task 1            4.0       3.3        4.1     2.4        3.3            2.7      3.2
Task 2            4.6       3.1        3.9     2.7        2.9            1.9      2.9
Task 3            4.1       3.1        4.2     2.6        3.2            2.1      2.8
avg               4.2       3.2        4.1     2.5        3.1            2.2      2.9
relevant (≥4)     81%       44%        74%     26%        44%            17%      39%

A higher agreement is reached if we check whether participants found the same classes as changed in the reference (Task 1: 83%; Task 2: 66%; Task 3: 50%). However, we did not analyze the ‘correctness’ of the answers further because there is not only one correct solution, but multiple ones that are subjective, context-dependent, and do not necessarily have considerable agreement [65], [66].

RQ1: Information Sources

Investigating the importance of different sources of information used in I3, we analyze the ratings of relevance that participants gave. Table I summarizes these relevance ratings per task and data source across all participants applying an average measure. The results show that the list of methods as retrieved through textual search and the source code form the most relevant sources of information (average relevance of 4.2 and 4.1, being considered as relevant in 81% and 74% of the cases); this matches the design decision to put textual search and code highlighting of search results in the main focus of I3. Less important, but still rated as relevant in about a third to half of the cases, were execution traces (44%), textually similar methods (44%), and co-changed methods (39%); again, this matches the role of the information in the user interface because this information is only presented through additional selection. Although not totally irrelevant, the list of keywords and the change history of a method are rated with lower relevance (average relevance of 2.5 and 2.2, rated as relevant in 26% and 17% of the cases). The histograms reveal different kinds of distributions for the answers: while the list of methods and the source code are rated with a high agreement among the participants, the other data sources show a broader distribution. This could indicate that particularly those data sources are more difficult to interpret and helped only some of the participants. Comparing the tasks, we mostly observe consistent results, with only a few exceptions: the list of methods in Task 2 and the list of changes in Task 1 seem to be considerably more relevant than in the other tasks.

Result 1: The relevance ratings of the data sources conform to their role in the user interface of I3. The answers further confirm that not only IR seems to reveal relevant results, but filtering by execution traces, textually similar methods, and co-changed methods are rated as important as well.

Fig. 7. Model from Cognitive Task Analysis (Figure 2) adapted for analysis and augmented with average usage frequencies for the three tasks of the study.

                                                Task 1  Task 2  Task 3  avg
use search view: search                          3.4     4.1     3.4    3.6
use search view: filter                          2.7     3.0     3.6    3.1
open tooltip dialog: textually similar           2.9     5.1     3.3    3.7
open tooltip dialog: co-change                   4.5     4.7     3.5    4.2
click on method in tooltip: textually similar    1.0     1.4     0.8    1.1
click on method in tooltip: co-change            1.9     1.2     1.1    1.4

RQ2: Interaction Strategies

To analyze the applied usage strategies, we map the specific usage interactions recorded in log files to the high-level cognitive tasks as derived from Cognitive Task Analysis (Section IV). Since there are multitudes of ways in Eclipse to navigate through the code, we only focus on the means of interaction provided by I3. As a consequence, we slightly adapt Figure 2 by removing the parts reflecting IDE functionality and specifying the read and extract task in more detail. Figure 7 shows the adapted version enriched with usage frequencies of the related functionality. Please note that ‘textually similar’ and ‘co-change’ refer to opening tooltip dialogs of the respective type for ‘read and extract’, while the same terms are mapped to clicking on a related method in these tooltip dialogs for ‘follow relations’.

The average usage frequencies reveal that a search query was (re-)formulated 3.6 times per task and filters were set 3.1 times (filters needed to be set again when changing the search query). This indicates that feature location is applied as an iterative process. A manual analysis of the applied filters shows that the relatively high frequencies for filtering are only partly an effect of consistent usage of this functionality: some participants gave up on using the feature at all while others tested different filters one after the other (also those for other tasks than the current one). Tooltip dialogs are opened on average about 4 times per task for each type of dialog (3.7 times for textual similarity details and 4.2 times for co-change details).

TABLE II
INTUITIVENESS AND USEFULNESS RATINGS OF USER INTERFACE ELEMENTS ON A SCALE FROM 1 (NOT INTUITIVE/USEFUL AT ALL) TO 5 (VERY INTUITIVE/USEFUL) AS AVERAGES AND HISTOGRAMS.

                 search   filtering  highlighting of  blue   timeline  textual similarity  history
                 result              search terms     bar    vis.      details             details
intuitiveness     4.7       2.6         4.4            4.3     3.7        4.2                4.3
usefulness        4.7       2.8         3.6            3.8     3.1        3.7                4.0

When a tooltip dialog is opened, a relation is followed in the form of clicking on a related method in about a third of the cases (1.1 of 3.7 times for textual similarity details and 1.4 of 4.2 times for co-change details). We, however, can only speculate what participants did in the other two thirds of the cases—reading the information on the left, studying the list of methods on the right, or they might have opened the dialog just by accident. The usage frequencies are in general quite consistent with respect to the three tasks, only that Task 2 tends to require more user interaction with I3, which could point to a higher degree of difficulty.

For each task, we asked for an extra free text answer to retrieve built-in functionality of Eclipse that participants considered as relevant to solve the task. In 69% of the cases, extra features were named, which we manually classified. The usage of Eclipse functionality was quite diverse: no item occurred in more than a third of the cases. Most regularly, participants used local search, opening the declaration of a code entity, and the call hierarchy (answered in 30%, 22%, and 20% of the cases). Other functionality is mentioned only occasionally, such as the outline view (11%), the package explorer (7%), the type hierarchy (4%), finding references (2%), and JavaDoc (2%). Interestingly, no participant named the global search functionality of Eclipse. The answered functionalities, with the exception of JavaDoc, serve for code navigation and can be classified into global and local navigation (global: package explorer, global search; local: call hierarchy, find references, local search, open declaration, outline, type hierarchy). If participants named additional Eclipse functionality (69%), these almost always included local navigation (65%), but only rarely global navigation (7%).

Result 2: Developers use I3 iteratively to locate features as intended, where the first part of the foraging loop is repeated more often than the second part. In the context of issue fixing tasks, the new means of code navigation largely replace the Eclipse functionality for global code navigation and complement local code navigation.

RQ3: Visual Interface Analysis

To investigate specific elements of I3’s user interface, we asked participants to rate their intuitiveness and usefulness on a numeric scale (Table II). With respect to intuitiveness, we intended to find out whether interactions matched the users’ expectations, whether visualizations were easy to understand, and whether I3 in general had a flat learning curve. Most features reach good scores ≥ 4 on a scale from 1 (not intuitive at all) to 5 (very intuitive). Only filtering by execution traces and the timeline visualizations receive lower ratings (2.6 and 3.7); the histograms show a broad spectrum of ratings for these two categories. Analyzing the free text answers describing obstacles of understanding, we find a simple explanation for the low rating of filtering: some participants just did not understand where the execution traces came from (these were recorded before the experiment and we may not have explained this clearly enough in the preface of the user study). In contrast, for the timeline visualization, participants did not state that they had problems understanding it—its complexity might just require additional explanations as provided in the tutorial. As the gapped histogram indicates, participants are divided in their rating of its intuitiveness.

The rated usefulness of the user interface elements, as also provided in Table II, connects the elements to the feature location tasks; it cannot, however, be independent of the data sources discussed in RQ1. As expected, the usefulness of the elements largely agrees with the relevance of the underlying data sources; for instance, the displayed search result containing the list of methods is rated as most useful (4.7). Further, the history details (4.0), the blue bar in the in situ visualization (3.8), textual similarity details (3.7), and the highlighting of search terms in the code (3.6) tend to be useful. Again, the filtering by execution traces and the timeline visualization form an exception with a lower usefulness rating around 3, but with a broad spectrum of answers. For the in situ visualizations, we further explicitly asked whether participants liked them: a manual sentiment analysis of the free text answers revealed that 11 participants were positive about the visualizations, but only one participant was negative (the other participants were indifferent or did not provide any interpretable answer). While participants seem to agree that the visualizations are non-obtrusive and provide a good anchor for retrieving details on demand, in particular for the timeline visualization, they do not agree whether the visualization as such is useful or not (both statements can be found in the answers).

Result 3: The general intuitiveness of I3 is rated high; only filtering by execution traces and the timeline visualization seem to need some explanation. The presentation of the search results and the details on demand provided as tooltip dialogs are considered useful.

RQ4: Practical Applicability

Generally, we convinced participants that I3 is useful in practical application: with an average of 4.2 on a scale from 1 (definitely not) to 5 (definitely), participants would use an approach like ours in their daily work. Summarizing some of the high-level textual feedback, one participant said that already “the IR-based search tool is much better than the state-of-the-art practice” and “I don’t have to go to another view to get the information I want.” Another participant liked that “everything is an single view” and somebody remarked that the tool “is well integrated into Eclipse”. Many participants also explicitly highlighted the way of retrieving relations between methods, for instance, that it is “very nice to see corresponding methods instantly” or “I liked the wide array of ways the tool allows you to draw relationships between methods.” Negative comments often referred to the lacking intuitiveness of filtering by execution traces: one participant further explained that “it is a lot to ask users to execute their program before searching.” Suggested features to further improve the approach were to add a way to filter by a specific package, to integrate the possibility to exclude keywords, to cluster corresponding methods, to visualize method calls as a tree or graph, to select specific time frames for co-changes, to always show the timeline visualization (not only when searching), and to manually mark or hide irrelevant search results. Participants see the main area of application of I3 (among the suggested answers) in adapting existing features (17 of 18 participants), in fixing bugs (14), finding code for reuse (13), implementing a new feature (10), and restructuring/refactoring the program (7). Three participants additionally recommend to use I3 in general to understand a new codebase or project.

Result 4: The answers of the participants suggest that our approach can be leveraged in practice, mainly to adapt existing features, to fix bugs, and to find code for reuse.

VIII. DISCUSSION

To reflect on the results of our work in a broader context, we finally discuss limitations of the user study and compare I3 systematically to related feature location approaches.

A. Limitations of the User Study

The user study was designed to cover a realistic scenario and participants confirmed the realism of the tasks. Nevertheless, certain factors limit the validity of our study. First, we were only able to test three tasks on a single open source software project; for instance, the relevance of data sources, the quality of the search results, or the applied usage strategies could be different for other scenarios. Second, participants were familiar with neither the code nor the project; although this is a realistic scenario, it might not be the standard case. The preloaded setup of the data introduced an artificial element and led to some obstacles (e.g., some participants did not understand how the execution traces were generated). Third, participants were mostly graduate students; more experienced, professional software developers could have different requirements for a feature location tool. Finally, the number of participants is still too small to test numerical differences for statistical significance—the insights of the user study have to be generally considered as preliminary before more authoritative evidence is provided.

Moreover, the user study did not compare I3 to other approaches like the standard search of Eclipse or competing advanced feature location tools. Hence, we cannot claim that our technique is better than others, but only that it can be leveraged for feature location, that it received positive feedback, and that our tool largely replaced the built-in global navigation and search features of the IDE during the experiment. While, initially, we planned to conduct a comparison of tools in a comparative controlled experiment, we finally discarded the plan for several reasons. First, a fair comparison is hard to design due to many differences in the implementation of the tools. Second, the results would be difficult to interpret as the approaches do not differ in one, but a multitude of variables and characteristics. Third, for valid statistical analyses, the study would have required a larger number of participants and a more streamlined experiment setup (i.e., shorter and less realistic). Instead, we performed a non-comparative study that allowed us to inspect the individual characteristics of our approach in more detail. We believe such an experiment—at least as a first evaluation—is likely to provide deeper insights than a comparative study of the same scale.

B. Qualitative Comparison of Tools

Nevertheless, differences of our approach to others can be qualitatively discussed in greater depth. In addition to the review of related work in Section II, we compare the approach to those state-of-the-art feature location techniques that include an advanced user interface. The following discussion is based on the dimensions of software visualization as introduced in the taxonomy by Maletic et al. [67]: task, audience, target, representation, and medium. While task (feature location), audience (software developers), and medium (standard color screen) are the same for all tools, the remaining two dimensions can be used to systematically discern the visual user interfaces of the tools. Additionally, we also discuss the evaluation of the approaches. Table III gives an overview of these dimensions and lists for each of the tools some meta-data and characteristics with respect to the dimensions. The sub-entries for the dimensions were derived, specialized to feature location, as a superset of the characteristics of all tools.

1) Information Sources (Targets): With respect to information sources—or targets as called by Maletic et al. [67]—I3 incorporates co-change information and execution traces in addition to the standard data source of code and comments (analogous to ImpactMiner). Sando, in contrast, focuses on the vocabulary used in the code and comments; thesauri imported from external sources and built by finding co-occurrences of terms are used to help with query (re-)formulation; similarly, CONQUER uses co-occurrences. By taking the parsed code structure into account, MFIE allows filtering parts of the system based on containment and usage. Hence, the tools focus on rather different sources, each of them providing potentially valuable extra information. Comparing the impact of each is still an open research question.

2) Representation of Information: Available in all tools, the standard means to represent the retrieved information is a list of code entities. Highlighting of terms is another standard feature, either done in the editor itself (CONQUER, I3), in code snippets giving a preview of the code (Sando, MFIE), or in abstracted visual thumbnail representations of the code (FLAT3/ImpactMiner).

TABLE III
QUALITATIVE COMPARISON OF INTERACTIVE FEATURE LOCATION TOOLS AND THEIR EVALUATION.

Columns, grouped by dimension:
target: code and comments; code structure; thesaurus; co-changes; execution traces
representation: list of entities; highlighting of terms; term summaries; query suggestions; query history; related entities; change history; in situ visualization
evaluation: field study; experiment; #participants; duration (min); #issues; #projects; baseline comparison; analyze sources; analyze representat.; analyze interaction

FLAT3 [32], ImpactMiner [10] | 2010/14 | Eclipse | × × × × × ×
Sando [7], [5], [34] | 2012/14 | Visual Studio | × × × × × × × × 274 × ×
CONQUER [6], [4] | 2013/14 | Eclipse | × × × × × × × 18 45 8/28 5 × ×
MFIE [8] | 2013 | Web | × × × × × × × × 20 120 4 1 × ×
I3 | 2015 | Eclipse | × × × × × × × × × × 18 90 3 1 × × ×

Term summaries such as lists, trees, or cloud representations are used for representing co-occurring terms (Sando), organizing the findings in categories (CONQUER), or summarizing the vocabulary of a code entity (MFIE, I3). For formulating or rewording a query, a list of suggested changes or extensions of the query can be useful (Sando, CONQUER). A query history helps to keep an overview and jump back to previous queries (Sando, MFIE). Dependencies or similarities between entities enable finding related entities or understanding interactions between the entities, based on structural dependencies (MFIE), textual similarity (I3), or co-changes (ImpactMiner, I3). Finally, I3 is the only tool that leverages in situ visualizations augmenting the code and list of search results and that shows a commit history of the code entities. Although I3 is among the most versatile tools regarding data representation, only a few additional views are required: the representations are tightly integrated in the existing user interface of the IDE. Adding further representations like query suggestions and history will not be difficult.

3) Tool Evaluation: While the user interfaces of FLAT3/ImpactMiner were only evaluated in small case studies, the other tools were tested in user studies. Sando was evaluated in a field study analyzing the log files of 274 participants with respect to representations and interactions (in another study Sando was used, but its user interface was not evaluated [66]). CONQUER, MFIE, and I3 were tested in user experiments under more controlled conditions. The scales of these studies are similar with respect to participants (18–20). The studies of MFIE and I3 let the users investigate in depth (ca. 90–120 min) a small number of feature location tasks (3–4) for a single project; the study of CONQUER covers 8 tasks from a larger selection of 28 tasks from 5 projects in a shorter time (ca. 45 min). CONQUER (compared to a simplified version and Eclipse; no statistically significant difference) and MFIE (compared to Eclipse; statistically significant differences) are evaluated in comparison to baseline approaches. Our study evaluated only I3, but focuses on a detailed qualitative and quantitative analysis of the data sources, data representations, and interactions. The comparison of CONQUER and MFIE to standard search already suggested that advanced feature location approaches could improve on the standard. A comparative user study among the approaches discussed in this section, however, is still lacking.

IX. CONCLUSION AND FUTURE WORK

We presented a novel user interface for feature location and evaluated it in a user study. Feature location was modeled as a cyclic sense-making process. Our approach specifically supports the cognitive tasks of this process: while a standard IR-based approach is used for search, execution traces allow for filtering the results with respect to a specific program run. To support reading the results and extracting information, search results are highlighted in the code editor and augmented with in situ visualizations. These visualizations also provide a starting point for following relations to textually similar or co-changed methods. The user study shows that the IR-based approach is the central source of information for I3, and that the extra information available on demand is relevant for retrieving specific features in the code. The presented user interface is largely intuitive, and the majority of functionalities were clearly useful for solving the provided feature location tasks.
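To illustrate the filtering step described above, the following minimal sketch (in Java, with hypothetical names and toy data; it is not the actual I3 implementation) removes all ranked search results whose methods do not appear in the set of methods recorded during a specific program run. The IR ranking and the trace recording themselves are outside the scope of this sketch.

    import java.util.List;
    import java.util.Set;
    import java.util.stream.Collectors;

    // Minimal sketch: filtering ranked feature location results by an execution trace.
    // Result entries and method names are hypothetical toy data.
    public class TraceFilterSketch {

        record Result(String method, double score) { }

        // Keeps only results whose method was executed in the recorded scenario.
        static List<Result> filterByTrace(List<Result> ranked, Set<String> executedMethods) {
            return ranked.stream()
                    .filter(r -> executedMethods.contains(r.method()))
                    .collect(Collectors.toList());
        }

        public static void main(String[] args) {
            List<Result> ranked = List.of(
                    new Result("Buffer.save", 0.91),
                    new Result("Buffer.autosave", 0.74),
                    new Result("View.repaint", 0.42));

            // Methods observed while executing the scenario of interest.
            Set<String> trace = Set.of("Buffer.autosave", "View.repaint");

            for (Result r : filterByTrace(ranked, trace)) {
                System.out.printf("%s %.2f%n", r.method(), r.score());
            }
        }
    }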

In comparison to other tools, I3 has a specific focus on the exploration of co-change patterns to evaluate the impact of changes. Its use of in situ visualization for explaining and relating search results is a unique characteristic among those tools. Nevertheless, future work should also include integrating additional representations from other feature location tools, such as query suggestions, as well as filtering by other criteria. This would finally make it possible to evaluate the approaches systematically against each other in controlled experiments. Another avenue of future research is to explore whether parts of the introduced user interface are useful in other development scenarios.

ACKNOWLEDGMENT

Fabian Beck is indebted to the Baden-Württemberg Stiftung for the financial support of this research project within the Postdoctoral Fellowship for Leading Early Career Researchers. The authors from W&M are supported in part by the NSF CCF-1218129 and NSF-1253837 grants.


REFERENCES

[1] T. Biggerstaff, B. Mitbander, and D. Webster, "The concept assignment problem in program understanding," in Proceedings of the 15th IEEE/ACM International Conference on Software Engineering (ICSE'93), 1993, pp. 482–498.

[2] S. Bohner and R. Arnold, Software Change Impact Analysis. Los Alamitos, CA: IEEE Computer Society, 1996.

[3] B. Dit, M. Revelle, M. Gethers, and D. Poshyvanyk, "Feature location in source code: a taxonomy and survey," Journal of Software: Evolution and Process, vol. 25, no. 1, pp. 53–95, 2013.

[4] E. Hill, M. Roldan-Vega, J. A. Fails, and G. Mallet, "NL-based query refinement and contextualized code search results: A user study," in Proceedings of the IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering 2014 Software Evolution Week (CSMR-WCRE'14), 2014, pp. 34–43.

[5] X. Ge, D. Shepherd, K. Damevski, and E. Murphy-Hill, "How the Sando search tool recommends queries," in Proceedings of the IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering 2014 Software Evolution Week (CSMR-WCRE'14), 2014, pp. 425–428.

[6] M. Roldan-Vega, G. Mallet, E. Hill, and J. A. Fails, "CONQUER: A tool for NL-based query refinement and contextualizing code search results," in Proceedings of the 29th IEEE International Conference on Software Maintenance (ICSM'13). IEEE, 2013, pp. 512–515.

[7] D. Shepherd, K. Damevski, B. Ropski, and T. Fritz, "Sando: an extensible local code search framework," in Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE'12). ACM, 2012, pp. 15:1–15:2.

[8] J. Wang, X. Peng, Z. Xing, and W. Zhao, "Improving feature location practice with multi-faceted interactive exploration," in Proceedings of the 35th IEEE/ACM International Conference on Software Engineering (ICSE'13), 2013, pp. 762–771.

[9] M. Gethers, B. Dit, H. Kagdi, and D. Poshyvanyk, "Integrated impact analysis for managing software changes," in Proceedings of the 34th IEEE/ACM International Conference on Software Engineering (ICSE'12), 2012, pp. 430–440.

[10] B. Dit, M. Wagner, S. Wen, W. Wang, M. Linares-Vásquez, D. Poshyvanyk, and H. Kagdi, "ImpactMiner: A tool for change impact analysis," in Proceedings of the 36th ACM/IEEE International Conference on Software Engineering (ICSE'14), Formal tool demonstration track, 2014, pp. 540–543.

[11] S. L. Abebe, A. Alicante, A. Corazza, and P. Tonella, "Supporting concept location through identifier parsing and ontology extraction," Journal of Systems and Software, vol. 86, no. 11, pp. 2919–2938, 2013.

[12] E. Hill, L. Pollock, and K. Vijay-Shanker, "Automatically capturing source code context of NL-queries for software maintenance and reuse," in Proceedings of the 31st IEEE/ACM International Conference on Software Engineering (ICSE'09), 2009, pp. 232–242.

[13] ——, "Improving source code search with natural language phrasal representations of method signatures," in Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE'11), 2011, pp. 524–527.

[14] S. W. Thomas, M. Nagappan, D. Blostein, and A. E. Hassan, "The impact of classifier configuration and classifier combination on bug localization," IEEE Transactions on Software Engineering, vol. 39, no. 10, pp. 1427–1443, 2013.

[15] S. Wang, D. Lo, Z. Xing, and L. Jiang, "Concern localization using information retrieval: An empirical study on Linux kernel," in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE'11), 2011, pp. 92–96.

[16] J. Zhou, H. Zhang, and D. Lo, "Where should the bugs be fixed? More accurate information-retrieval-based bug localization based on bug reports," in Proceedings of the 34th IEEE/ACM International Conference on Software Engineering (ICSE'12), 2012, pp. 14–24.

[17] D. Poshyvanyk and D. Marcus, "Combining formal concept analysis with information retrieval for concept location in source code," in Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC'07), 2007, pp. 37–48.

[18] M. Revelle and D. Poshyvanyk, "An exploratory study on assessing feature location techniques," in Proceedings of the 17th International Conference on Program Comprehension (ICPC'09), 2009, pp. 218–222.

[19] D. Liu, A. Marcus, D. Poshyvanyk, and V. Rajlich, "Feature location via information retrieval based filtering of a single scenario execution trace," in Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE'07), 2007, pp. 234–243.

[20] D. Poshyvanyk, Y.-G. Guéhéneuc, A. Marcus, G. Antoniol, and V. Rajlich, "Feature location using probabilistic ranking of methods based on execution scenarios and information retrieval," IEEE Transactions on Software Engineering, vol. 33, no. 6, pp. 420–432, 2007.

[21] A. Chen, E. Chou, J. Wong, A. Yao, Q. Zhang, S. Zhang, and A. Michail, "CVSSearch: searching through source code using CVS comments," in Proceedings of the IEEE International Conference on Software Maintenance (ICSM'01), 2001, pp. 364–373.

[22] D. Cubranic, G. C. Murphy, J. Singer, and K. S. Booth, "Learning from project history: a case study for software development," in Proceedings of the 2004 ACM Conference on Computer Supported Cooperative Work (CSCW'04). ACM, 2004, pp. 82–91.

[23] S. L. Abebe and P. Tonella, "Natural language parsing of program element names for concept extraction," in Proceedings of the 18th IEEE International Conference on Program Comprehension (ICPC'10), 2010, pp. 156–159.

[24] E. Hill, L. Pollock, and K. Vijay-Shanker, "Exploring the neighborhood with Dora to expedite software maintenance," in Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE'07), 2007, pp. 14–23.

[25] S. Reiss, "Semantics-based code search," in Proceedings of the 31st IEEE/ACM International Conference on Software Engineering (ICSE'09), 2009, pp. 243–253.

[26] M. P. Robillard and G. C. Murphy, "FEAT: a tool for locating, describing, and analyzing concerns in source code," in Proceedings of the 25th International Conference on Software Engineering (ICSE'03), 2003, pp. 822–823.

[27] D. Shepherd, Z. Fry, E. Gibson, L. Pollock, and K. Vijay-Shanker, "Using natural language program analysis to locate and understand action-oriented concerns," in Proceedings of the 6th International Conference on Aspect-Oriented Software Development (AOSD'07), 2007, pp. 212–224.

[28] S. Wang, D. Lo, and L. Jiang, "Code search via topic-enriched dependence graph matching," in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE'11), 2011, pp. 119–123.

[29] D. Poshyvanyk, Y.-G. Guéhéneuc, A. Marcus, G. Antoniol, and V. Rajlich, "Combining probabilistic ranking and latent semantic indexing for feature identification," in Proceedings of the 14th International Conference on Program Comprehension (ICPC'06), 2006, pp. 137–148.

[30] M. Revelle, B. Dit, and D. Poshyvanyk, "Using data fusion and web mining to support feature location in software," in Proceedings of the 18th International Conference on Program Comprehension (ICPC'10), 2010, pp. 14–23.

[31] B. Dit, M. Revelle, and D. Poshyvanyk, "Integrating information retrieval, execution and link analysis algorithms to improve feature location in software," Empirical Software Engineering, vol. 18, no. 2, pp. 277–309, 2013.

[32] T. Savage, M. Revelle, and D. Poshyvanyk, "FLAT3: feature location and textual tracing tool," in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering – Volume 2 (ICSE'10), 2010, pp. 255–258.

[33] B. de Alwis, G. C. Murphy, and M. P. Robillard, "A comparative study of three program exploration tools," in Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC'07), 2007, pp. 103–112.

[34] X. Ge, D. Shepherd, K. Damevski, and E. Murphy-Hill, "How developers use multi-recommendation system in local code search," in Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC'14), 2014, to appear.

[35] A. J. Ko, B. A. Myers, M. J. Coblenz, and H. H. Aung, "An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks," IEEE Transactions on Software Engineering, vol. 32, no. 12, pp. 971–987, 2006.

[36] T. D. LaToza, D. Garlan, J. D. Herbsleb, and B. A. Myers, "Program comprehension as fact finding," in Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE'07), 2007, pp. 361–370.

[37] H. Li, Z. Xing, X. Peng, and W. Zhao, "What help do developers seek, when and how?" in Proceedings of the 20th Working Conference on Reverse Engineering (WCRE'13), 2013, pp. 142–151.


[38] M. P. Robillard, W. Coelho, and G. C. Murphy, "How effective developers investigate source code: An exploratory study," IEEE Transactions on Software Engineering, vol. 30, no. 12, pp. 889–903, 2004.

[39] M. P. Robillard and G. C. Murphy, "Representing concerns in source code," ACM Transactions on Software Engineering and Methodology, vol. 16, no. 1, 2007.

[40] T. Savage, B. Dit, M. Gethers, and D. Poshyvanyk, "TopicXP: Exploring topics in source code using Latent Dirichlet Allocation," in Proceedings of the IEEE International Conference on Software Maintenance (ICSM'10), 2010, pp. 1–6.

[41] J. Wang, X. Peng, Z. Xing, and W. Zhao, "How developers perform feature location tasks: a human-centric and process-oriented exploratory study," Journal of Software: Evolution and Process, vol. 25, no. 11, pp. 1193–1224, 2013.

[42] S. L. Abebe, S. Haiduc, P. Tonella, and A. Marcus, "The effect of lexicon bad smells on concept location in source code," in Proceedings of the 11th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM'11), 2011, pp. 125–134.

[43] S. L. Abebe and P. Tonella, "Towards the extraction of domain concepts from the identifiers," in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE'11), 2011, pp. 77–86.

[44] S. Haiduc, G. Bavota, A. Marcus, R. Oliveto, A. D. Lucia, and T. Menzies, "Automatic query reformulations for text retrieval in software engineering," in Proceedings of the 35th IEEE/ACM International Conference on Software Engineering (ICSE'13), 2013, pp. 842–851.

[45] K. T. Stolee, "Finding suitable programs: Semantic search with incomplete and lightweight specifications," in Proceedings of the 34th IEEE/ACM International Conference on Software Engineering (ICSE'12). IEEE Press, 2012, pp. 1571–1574.

[46] K. T. Stolee and S. Elbaum, "Toward semantic search via SMT solver," in Proceedings of the 20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE'12). ACM, 2012, pp. 1–4.

[47] ——, "On the use of input/output queries for code search," in Proceedings of the 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM'13), 2013, pp. 251–254.

[48] K. T. Stolee, S. Elbaum, and D. Dobos, "Solving the search for source code," ACM Transactions on Software Engineering and Methodology, vol. 23, no. 3, pp. 26:1–26:45, 2014.

[49] L. Martie and A. Van der Hoek, "Toward social-technical code search," in Proceedings of the 6th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE'13), 2013, pp. 101–104.

[50] E. R. Tufte, Beautiful Evidence, 1st ed. Graphics Press, 2006.

[51] M. Harward, W. Irwin, and N. Churcher, "In situ software visualisation," in Proceedings of the 21st Australian Software Engineering Conference (ASWEC'10), 2010, pp. 171–180.

[52] D. Röthlisberger, M. Härry, A. Villazón, D. Ansaloni, W. Binder, O. Nierstrasz, and P. Moret, "Augmenting static source views in IDEs with dynamic metrics," in Proceedings of the International Conference on Software Maintenance (ICSM'09), 2009, pp. 253–262.

[53] F. Beck, O. Moseler, S. Diehl, and G. D. Rey, "In situ understanding of performance bottlenecks through visually augmented code," in Proceedings of the 21st IEEE International Conference on Program Comprehension (ICPC'13), 2013, pp. 63–72.

[54] F. Beck, F. Hollerich, S. Diehl, and D. Weiskopf, "Visual monitoring of numeric variables embedded in source code," in Proceedings of the 1st IEEE Working Conference on Software Visualization (VISSOFT'13), 2013, pp. 1–4.

[55] A. Marcus, A. Sergeyev, V. Rajlich, and J. Maletic, "An information retrieval approach to concept location in source code," in Proceedings of the 11th IEEE Working Conference on Reverse Engineering (WCRE'04), 2004, pp. 214–223.

[56] (2008) The Apache Software Foundation - Lucene. [Online]. Available: http://lucene.apache.org

[57] B. Dit, L. Guerrouj, D. Poshyvanyk, and G. Antoniol, "Can better identifier splitting techniques help feature location?" in Proceedings of the 19th International Conference on Program Comprehension (ICPC'11), 2011, pp. 11–20.

[58] B. Dit, A. Holtzhauer, D. Poshyvanyk, and H. Kagdi, "A dataset from change history to support evaluation of software maintenance tasks," in Proceedings of the 10th Working Conference on Mining Software Repositories (MSR'13), Data Track, 2013, pp. 131–134.

[59] P. Pirolli and S. Card, "The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis," in Proceedings of International Conference on Intelligence Analysis, vol. 5, 2005, pp. 2–4.

[60] Z. Liu, N. J. Nersessian, and J. T. Stasko, "Distributed cognition as a theoretical framework for information visualization," IEEE Transactions on Visualization and Computer Graphics, vol. 14, no. 6, pp. 1173–1180, 2008.

[61] J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, and S. D. Fleming, "How programmers debug, revisited: An information foraging theory perspective," IEEE Transactions on Software Engineering, vol. 39, no. 2, pp. 197–215, 2013.

[62] P. Pirolli and S. Card, "Information foraging," Psychological Review, vol. 106, no. 4, pp. 643–675, 1999.

[63] T. Zimmermann, A. Zeller, P. Weissgerber, and S. Diehl, "Mining version histories to guide software changes," IEEE Transactions on Software Engineering, vol. 31, no. 6, pp. 429–445, 2005.

[64] G. Gay, S. Haiduc, A. Marcus, and T. Menzies, "On the use of relevance feedback in IR-based concept location," in Proceedings of the IEEE International Conference on Software Maintenance (ICSM'09), 2009, pp. 351–360.

[65] M. P. Robillard, D. Shepherd, E. Hill, K. Vijay-Shanker, and L. Pollock, "An empirical study of the concept assignment problem," School of Computer Science, McGill University, Tech. Rep., 2007.

[66] K. Damevski, D. Shepherd, and L. Pollock, "A case study of paired interleaving for evaluating code search techniques," in Proceedings of the IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering 2014 Software Evolution Week (CSMR-WCRE'14), 2014, pp. 54–63.

[67] J. I. Maletic, A. Marcus, and M. L. Collard, "A task oriented view of software visualization," in Proceedings of the 1st International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT'02),